I wanted to write a post about getting a job as a data scientist. But my god, there are so many already out there. There’s advice about portfolio building. There are listicles filled with suggested skills. There are template answers to common interview questions. Reading all these got me thinking - what kind of candidates is this community creating?
From where I’m standing, it doesn’t look good.
Granted, I’ve never worked in a large data science team. Plus, I was kind of grandfathered in to data science, having worked in plain old data analysis and BI for a while. But it seems to me that those articles filled with skills and portfolio advice are building average candidates. This wouldn’t matter so much if data science was as commoditised as accounting. But it’s not.
The value that the first or second analytics hire can bring to a business is crazy. It can (and should) be like turning on the lights. There are so many companies in the dark, and they’re all looking for exceptional, interesting, and confident analysts. If they’re not yet, they soon will be.
If you’re a business that’s looking to make an impactful hire, here’s my advice:
Education doesn’t matter. I’m sick of seeing job ads that say a PhD is necessary. Be honest with yourself - if you won’t need these people to do research (and in some cases even if you do), let the PhDs alone.
It may look good on your org-chart but it’s bad for humanity. It takes a lot of work to get a PhD. These people are (or at least were) driven by a passion for discovery, for working at the fringes. If you’re just going to have them write code and nudge your growth chart up and to the right, let science keep them. Please, science needs them.
Of course it’s their choice what they do with their careers. But a PhD data scientist’s salary looks very convincing to someone who has just spent 4 years on a stipend. Ask them if they’d take the job for half the money. If they say yes, pay them more than they asked for - they’re well worth keeping and your mission is obviously important.
You shouldn’t cut data science salaries. That would be silly. Don’t trick yourself, though: the reason these people will work on upping your CTR is the same reason the last generation of STEM PhDs went to work on Wall St.
So how about MOOCs?
The best data scientists that I’ve seen have learnt their skills in online courses. This is totally true but has an important caveat. These people worked in a business, developed domain expertise, took machine learning and statistics courses, and combined those skills with their knowledge to have tremendous impact.
I haven’t seen a lot of people with a MOOC-only background come from outside a company to make big waves within it. I think that has less to do with the quality of the education and more to do with a lack of business-specific (or maybe business in general) experience.
So what should you do?
Well, let me ask you this. Have you ever hired an HR assistant who didn’t study for a BA in Human Resources? Have you ever hired a software developer who studied music or an accountant with an MSc in Geography? Have you ever hired a head of customer service who didn’t go to university at all but has buckets of life experience?
By hiring only those with computer science or statistics degrees we’re building a monoculture. Relying on qualifications is the default when you have no idea what the person will be doing. So learn. And then judge candidates by the same heuristics you would use for any other position.
The success of Kaggle is the failure of curiosity. Do you have any idea how many data problems I want to solve at any given moment? A lot.
The worlds of business, science, sport, social networks, news, finance, natural resources, politics, economics, and medicine are filled to the brim with open problems. Any data scientist who isn’t curious enough to fetch, clean, manage, and analyse data, and tell a story about a problem that they care about isn’t one that you should hire.
If you ask a candidate to describe a recent project and you can’t get them to stop talking for 15 minutes - hire them.
Cleaned datasets and predefined goals are anathema to the scientist. You should ask candidates how they found problems to solve. If the answer is Kaggle, and you like that answer, be prepared to always feed your new hire new problems to work on.
You might think that, because I’m so obviously in support of the scientific endeavour, I’d advise you to look beyond charts and graphs and visualisations. Nope. How data scientists communicate their solution to a problem is one of the most (if not the most) important things they’ll do in their job.
Look at the charts. Ask questions. If you don’t understand what you’re looking at by the end of the interview, don’t hire that person.
That leads us on to soft skills. A few years ago, people viewed PhDs entering industry with suspicion. Employers thought they’d be hard to manage, that they might be a flight risk. Companies protected themselves by paying them a lot of money, and by letting them mostly self-manage.
Now that’s the default.
Silicon Valley companies compete for the best in machine learning talent and in so doing breed divas. Now everybody else is doing the same thing.
The gifted people are always the hardest to get along with, aren’t they?
That’s obviously not true.
Don’t hire someone you don’t like.
Why do you want to work here? It’s a question that hiring organisations ask all the time. But if you’re hiring for a first or second data scientist, the answer you’ll usually get is all about their opportunity to build a team or develop a new technique - the greenfield, blue-sky opportunity of using your healthy company as a host for their ambition. Like a virus.
If how a candidate sees your company is all about them and nothing about your mission, don’t hire them. They’ll leave when the greenfield turns brown.