Succeeding With Citizen Data Scientists

Around the world, demand for data scientists is topping the charts. And as more businesses understand the power of data and catch a glimpse of a data-driven future, the pressure to hire experts who are versed in data science and AI will only increase.

Yet it takes years to train data scientists, and it doesn’t help that these highly sought-after professionals switch employers every 1.7 years on average. The logical solution is to equip existing employees with the skills needed to analyze and innovate with data.

Training data scientists

Training citizen data scientists is a topic that invariably comes up at CDOTrends events and roundtable sessions. Organizations are generally keen to do so but are wary of the challenges and unsure whether they can achieve measurable results.

But as we wrote earlier this year, organizations don’t need to start on massive or advanced data science or AI projects to succeed with data. Citizen data scientists working on scores of small projects can add up to a substantial cumulative gain, allowing organizations to win big at data by starting small.

Indeed, even fashion brands are waking up to the advantages of data science. Despite coming from a decidedly non-technical industry, American clothing company Levi’s took things into its own hands this year with a machine learning “boot camp”.

As reported on Vogue Business, the program trains global employees across all disciplines, in areas such as machine learning, coding, design thinking, and product management. Employees are paid their usual salaries during the eight-week virtual course with lectures, team exercises, individual assignments – and homework.

Employees must apply and are selected through a no-code challenge that evaluates their analytical skills, problem-solving abilities, curiosity, and perseverance. So far, 450 employees have applied, exceeding company expectations, and 100 will have completed the training by the end of this year.

A tech stack to help

Citizen data scientists are not as well versed in the most technical aspects of data science as their professional counterparts, observes Ant Phillips, chief technology officer of IT service management company D4t4 Solutions. Though they make up for this with their subject matter expertise, it makes sense to help them succeed.

To help them, businesses can remove the roadblocks that stymie them or slow them down. Specifically, having to involve IT to access data, or having to make frequent code changes, will only hamper and frustrate their efforts.

This is where self-service tools can play a role to help citizen data scientists reach their full potential. With the ability to directly access the data they need, and the tools to seamlessly manipulate and draw inferences from the data, citizen data scientists can leverage their subject matter expertise to craft successful strategies and innovative products that can define the business.

A significant amount of work needs to be done to reach this state, though, such as onboarding disparate data silos, properly documenting them, and setting up reasonable access rights across the organization – all of which require concerted effort and the right tools. But once this groundwork is in place, it helps citizen data scientists leverage their insider knowledge to achieve tangible results.

Mind your data quality

Finally, the quality of data is crucial whether one is deploying cutting-edge machine learning or analytics. At a recent CDOTrends event, one participant commented that the problem of IT not being able to deliver in time can often be attributed to poor quality data – with resultant delays as IT struggles to extract the correct data.

According to Gartner, poor data quality costs organizations an average of USD 12.9 million each year, as poor quality data increases the complexity of data ecosystems and results in poor decision-making.

This is where data wrangling comes into play. Apart from transforming and mapping data from one format into another, this includes cleaning data by identifying problems such as missing values, erroneous entries, or extreme outliers that can distort the picture.
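As a rough illustration of that cleaning step, here is a minimal Python sketch that sorts the entries of a single column into the three problem types mentioned: missing values, erroneous entries, and extreme outliers. Everything in it is an assumption made for illustration rather than something described in the reporting: the function name `profile_column`, the sample `weekly_units` figures, and the choice of the common 1.5 × interquartile-range rule for flagging outliers. Other rules may suit other data better.

```python
import math
import statistics


def profile_column(raw_values, k=1.5):
    """Return the positions of missing, erroneous and outlying entries.

    Hypothetical helper for illustration only. Outliers are flagged with
    the k * IQR rule, which is one common convention among several.
    """
    missing, erroneous, numeric = [], [], {}
    for i, raw in enumerate(raw_values):
        # Treat None and blank strings as missing values.
        if raw is None or str(raw).strip() == "":
            missing.append(i)
            continue
        try:
            value = float(raw)
        except (TypeError, ValueError):
            # Anything that cannot be read as a number is an erroneous entry.
            erroneous.append(i)
            continue
        if math.isfinite(value):
            numeric[i] = value
        else:
            # NaN and infinity parse as floats but are not usable readings.
            erroneous.append(i)

    outliers = []
    if len(numeric) >= 4:
        # Quartiles of the valid readings define the acceptable range.
        q1, _, q3 = statistics.quantiles(list(numeric.values()), n=4)
        spread = k * (q3 - q1)
        low, high = q1 - spread, q3 + spread
        outliers = [i for i, v in numeric.items() if v < low or v > high]
    return {"missing": missing, "erroneous": erroneous, "outliers": outliers}


# Made-up sample data: weekly unit sales as exported from a spreadsheet.
weekly_units = ["120", "135", "", "128", "n/a", "131", "9999", None, "126", "133"]
report = profile_column(weekly_units)
print(report)
```

The function only reports positions and leaves the decision about each flagged entry (fix, drop, or keep) to someone who knows the business context, which is where a citizen data scientist's subject matter expertise comes in.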

Data wrangling is more work than it sounds; the forward-thinking organization would be wise to hire data professionals to focus on data quality. The good news: You don’t need to be a data scientist to do it, though the right combination of data skills, business acumen, and some programming ability is probably required.

For now, it is heartening to know that the program at Levi’s is already paying off. Some employees have already developed tools to automate certain portions of their roles, and other firms have approached Levi’s to share insights.

“You can teach someone in fashion data science, but to teach someone who has a data science background the nuances of fashion? That is really hard and takes time,” summed up Ronald Pritipaul, a design coordinator at Levi’s.

Paul Mah is the editor of DSAITrends. A former system administrator, programmer, and IT lecturer, he enjoys writing both code and prose. You can reach him at [email protected].

Image credit: iStockphoto/imtmphoto