One of the most hotly debated topics in data science forums is whether data scientists should be expected to hold strong credentials (a STEM PhD or above) or whether anyone, given enough data, tenacity and GPU access, can be a data scientist.
I would argue that this is looking at the problem from the wrong side: what matters is how a person reacts to an ever-changing, complex problem, and this has nothing to do with academic credentials. Yes, PhD training used to focus on nurturing curiosity and hard work and on instilling scientific discipline in those pursuing it, but in a “publish or perish” world, a PhD no longer guarantees those qualities.
Instead, there are three qualities that can distinguish a data scientist from the rest.
1. Understanding of the business context
Without this, data science is simply not possible. Data sources need to be properly understood in order to assess whether an unusual distribution (skewed or bimodal) is to be expected in a given feature or whether it points to problems in the data.
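To make the distribution check concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not a definitive method: the helper names `shape_summary` and `unexpected_shapes`, the thresholds and the toy data are all made up for this example. It relies on moment-based skewness and on Sarle's bimodality coefficient, a rough heuristic in which values above 5/9 hint at more than one mode (strongly skewed data can also push it above that limit).

```python
def shape_summary(values):
    """Return (skewness, bimodality coefficient) from population moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    if m2 == 0:
        raise ValueError("constant feature: shape statistics are undefined")
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2  # non-excess kurtosis, 3 for a normal distribution
    return skew, (skew ** 2 + 1) / kurt


def unexpected_shapes(values, expected=(), skew_limit=1.0, bimodality_limit=5 / 9):
    """List the shapes seen in the data that domain knowledge did not predict.

    `expected` is where the business context comes in: it names the shapes
    ("skewed", "bimodal") that are normal for this feature.
    The two limits are illustrative defaults, not established standards.
    """
    skew, bimodality = shape_summary(values)
    observed = set()
    if abs(skew) > skew_limit:
        observed.add("skewed")
    if bimodality > bimodality_limit:
        observed.add("bimodal")
    return sorted(observed - set(expected))


# Toy feature with two clearly separated groups of values (hypothetical data).
two_groups = [0] * 50 + [10] * 50

print(unexpected_shapes(two_groups))                        # ['bimodal']
print(unexpected_shapes(two_groups, expected={"bimodal"}))  # []
```

The statistics are the easy part. Deciding what goes into `expected` is the part that needs someone who knows where the data comes from.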
Understanding the context also requires the data scientist to be aligned with the objectives of the organization. You can use the exact same classification models to detect suspicious regions in tumors and brands on Instagram. Which problem makes you tick? This is a very personal question, and there are no correct answers. But it is important that the data scientist’s answer fits the role they take on.
2. Persevere through obstacles
Being aligned with the organization's objectives and values makes hard times easier to get through. I remember having moral issues with a project I was working on early in my career (pre-GDPR, hyper-predatory marketing). This made it really hard to push through when there were data issues, tight deadlines or extra work.
A data scientist who identifies with their company and believes in the importance of their job will go the extra mile to make sure that the quality of their work is flawless.
3. Be able to answer: “so what?”
Real data scientists know what value they bring to the table, and know how to communicate it to domain experts and senior stakeholders.
No one pays a data scientist for the number of deep neural networks trained or the number of hyperparameters tuned. They are as accountable as any other team member, and even more so given the highly strategic role they play.
A junior data scientist once told me: “if it’s not a code problem, it’s not my problem”. He remains in a junior position after 3+ years, and is likely to stay there for the foreseeable future.