Many companies are trying to build data science capabilities, for various reasons: either they saw a competitor extracting real value or, more commonly, they simply fear missing out.
Sometimes these initiatives fail. Actually, rather often. But why is that, and how can you prevent it? In my experience, there are three key factors that contribute to success.
1. Strong internal buy-in
Your forthcoming data science team must have a clear sponsor in the organization. This person is ideally someone high enough in the food chain to understand what the real issues and pain points are. Why is that? There needs to be a clear definition of success, and you cannot understand success if you don’t understand where the problems are. The data science team is only as good as the problems it can actually solve.
More importantly, KPIs should be defined: when is a data science initiative declared successful? By the number of lines of code produced? The number of algorithms in production? Bottom-line benefits? New businesses?
2. The right people
You must plan your hiring for the right profile according to the maturity level of your organization. Do you want to run deep learning on GPUs, and have you just bought a Hadoop cluster? Hold on, let’s talk first about your reporting and your data quality.
Popular opinion makes us believe that you need AI PhDs from top-level universities who code the deepest deep learning and must be paid dearly. Actually, more often than not, these people are useless: they don’t know your business and they don’t care to learn it. Their interest is, at best, purely in the algorithms and, at worst, only in getting that juicy salary.
How do you weed out the bad candidates? If you do hire one of those AI PhDs, make sure that he or she is super-passionate about your business. I cannot stress this enough: make sure they love your business! Otherwise you will leave a trail of disappointed people: your brand-new hire and your senior stakeholders.
An even better idea is to train and encourage your in-house experts to develop their own data science skills. These people have been with you, they know the business inside out and are presumably passionate enough to be working on it.
In the very early stages, when your data science initiative is about grabbing the low-hanging fruit, well-trained domain experts can get you up and running with insights faster.
3. Technology in place (and a good work environment!)
It should be somewhat obvious, but I have been on many data science projects where the data takes some time to arrive. Waiting 2-3 weeks to get clearance to access all the necessary data is the rule in many large companies. This is, of course, not ideal, but it does not even stop there: data scientists often face terrible laptops, no permissions to install software updates or packages (Python 3? Forget it) and sometimes not even internet access. Our team was once asked to develop an entity reconciliation engine using only data from the test environment (yes, really) because of confidentiality issues. These obstacles are enough to discourage even the most enthusiastic and well-prepared data science team, and the initiative will end up on the organization's pile of failed IT projects.
As the team grows in size and, more importantly, once value has been proven, some hardware purchases might be in order (a GPU cluster, etc.). But not before.
Last but not least, data science is a highly creative job! Any job with data demands huge concentration. If you cannot avoid the open office, you should provide your team with spaces where they can retreat and dive into the data. Putting your data science team next to the hectic sales team will not do.
Have you built, or are you perhaps building, your own data science team? What are your experiences? Let me know in the comments!