Machine Learning is Complicated.
I get it.

I help people like you discover machine learning.

  • Stop asking for permission

    You should never ask a ‘Can I?’ question until you’ve tried the thing you’re asking about. If you’re looking for a job, asking random strangers online if you’ll be successful is a losing proposition. If they say yes, you’ve wasted time; if they say no, you’re defeated before you’ve begun.

  • 3 Real Life Machine Learning Examples

    Whether I’m speaking to a client, a student, or a distant family member, people always ask me for examples of how I’ve applied Machine Learning in the real world. It seems that even though we’re being bombarded by articles and tutorials, some context is missing. In this article I’m going to discuss three (semi-)recent projects of mine so that you can better understand how machine learning and data science work in practice.

  • 5 Essential Machine Learning Ideas

    The last article I wrote urged people to use experimentation in order to learn data science. But for those people who need a little more concrete advice, here’s a list of 5 essential ideas that I feel will help any beginning data scientist gain intuition about what is happening when they build and deploy machine learning models.

  • How to Learn Data Science

    Right now, I’m in a fairly unique position. On the one hand I’m writing a book (The Science of Data Science), which I hope will be as inclusive and as easy to read as possible. On the other, I’m trying to settle on a topic for my PhD thesis, which means going out to the edges of the known and poking around to see where the wall gives.

  • How to Become a Freelance Data Scientist

    While it’s true that the best data science is done by those who know their organisation very well, there’s a lot about data science that lends itself well to consulting-style engagements. I’ve worked as a freelancer in data science (and analytics more generally) for the better part of a decade, and in this post I’ll be showing how you can freelance using your data science skills.

  • Churn Prediction and Prevention in Python

    Churn prediction is difficult. Before you can do anything to prevent customers leaving, you need to know everything from who’s going to leave and when, to how much it will impact your bottom line. In this post I’m going to explain some techniques for churn prediction and prevention using survival analysis.

  • The fuel on the fire

    There’s an idea in machine learning called ‘the unreasonable effectiveness of data’. It describes the tendency of all models to converge on accurate representations, given enough data.

  • How I do data science

    For the past 10 years, I’ve been working with businesses of all shapes and sizes. I’ve worked on problems that ran the gamut from simple to incredibly complex. All that time, I’ve been trying to extract a framework for achieving results in analytics.

  • The inverse Pareto principle

    The Pareto principle is everywhere online. “Don’t be a busy fool - only 20% of what you do matters anyway”.

  • Remember the users

    If you’re a data analyst right now, or anyone working in business intelligence, you might be eager to get your hands on the latest and greatest machine learning algorithms. You might be trying to sell your business on the benefits of predictive analytics and patiently waiting for the go-ahead.

  • The Achilles' heel of online education

    It’s high time that higher education was usurped by a cheaper and more flexible alternative. The replacement of broken, centuries-old institutions was one of the primary promises of the information revolution, but this hasn’t yet come to pass for education.

  • R vs Python: why I'm going back to R

    The R vs Python debate has been around a long time. Choosing between these immensely popular languages has been the source of countless infographics, Twitter-wars, and blog posts.

  • How to hire a data scientist

    I wanted to write a post about getting a job as a data scientist. But my god, there are so many already out there. There’s advice about portfolio building. There are listicles filled with suggested skills. There are template answers to common interview questions. Reading all these got me thinking - what kind of candidates is this community creating?

  • Why I choose to be a freelance data scientist

    Data Scientists are in demand. And while that job title is applied to all sorts of disciplines, the reality is that nearly anybody with some analytical and predictive capabilities can (safely) call themselves a data scientist.

  • Outlier detection with one-class SVMs

    Imbalanced learning problems often stump those new to dealing with them. When the ratio between classes in your data is 1:100 or larger, early attempts to model the problem are rewarded with very high accuracy but very low specificity. You can solve the specificity problem in imbalanced learning in a few different ways:

  • Finding and fixing multicollinearity

    When you undertake feature engineering for a new project, two outcomes are most likely:

  • Meeting my supervisor

    Today I met my PhD supervisor face-to-face for the first time. Obviously, most of what was said relates to my specific research topic, but we did discuss a few general things which may be of use to others (and myself in the future!).

  • Geographic features in SQL

    Something that comes up surprisingly often in my data work is the idea of capturing local (in the geographical sense) patterns, whether it be modelling an individual’s likelihood to make a purchase based on their neighbours’ activity (classic Keeping Up With the Joneses!) or predicting crime risk using local crime history.

  • Slopes as features: making time-sensitive predictions

    A lot of the projects I work on are time-bound in one way or another. My clients need to know the churn rate next week, the risk of fraud next month, their anticipated revenue next quarter. But what features does a model need to do this well?
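
To make the title of that last post concrete, here is a minimal sketch of what a "slope as a feature" could look like. It is not taken from the post itself: it assumes each customer has a short, evenly spaced history of some metric, and it fits an ordinary least-squares line to that history. The `slope` helper, the `weekly_logins` data, and the customer names are all hypothetical and made up for illustration.

```python
from statistics import mean


def slope(values):
    """Least-squares slope of values against their index (0, 1, 2, ...)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations to fit a slope")
    xs = range(n)
    x_bar = mean(xs)
    y_bar = mean(values)
    # Covariance of (index, value) divided by variance of the index.
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den


# Hypothetical weekly login counts per customer (made-up numbers).
weekly_logins = {
    "customer_a": [10, 8, 6, 4],  # engagement falling week on week
    "customer_b": [3, 4, 5, 6],   # engagement rising week on week
}

# One row of features per customer: the latest level plus the recent trend.
features = {
    customer: {"last_value": history[-1], "slope": slope(history)}
    for customer, history in weekly_logins.items()
}

for customer, row in features.items():
    print(customer, row)
```

The point of the extra column is that the trend tells a model something the latest value alone does not: in this made-up data the customer with the higher recent average is the one whose activity is falling.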