Data science has a problem with scale.
The landscape is littered with data science minimum viable products (MVPs) that each solve a narrow use case yet fail to deliver the same outcome across different parts of a business. In short, there is a lack of repeatability.
In the whitepaper “Bridging the Chasm: From the Data Science MVP to Impact at Scale,” authored by Michael Camarri, head of data science for APAC at Cognizant, these data science MVPs were called “one-hit wonders.”
Creating MVPs that can scale is not a technical problem, Camarri wrote. Rather, the problem lies in the way we approach data science.
What makes data science MVPs different
The root of this scaling difficulty lies in how data science MVPs begin, which differs from how other MVPs start.
“In non-data science MVPs, while it can take some time to determine the end goal, it is often well-defined at the start of the MVP. For example, ‘Build a report that shows fuel usage’ or ‘Build a product that alerts me when I need to buy more fuel’,” Camarri explains.
For data science MVPs, the end goal can be vague. It can, for example, start with the question, “Tell me how I can improve my fuel efficiency?”
“This is good and bad: you can more easily work to time (you keep tweaking the results until you run out of time). However, there is the risk of the never-ending project,” Camarri says, adding that considering new variables can stretch the timeline.
So, Camarri suggests beginning data science MVPs with business endpoints in mind: for example, “give me recommendations to improve my fuel efficiency by 10%” rather than “I want a model with an R² of 0.7.”
“The latter as a goal just leads to pointless churning of models for no additional business benefit,” observes Camarri.
Where our mindset becomes a crutch
The problem with using business endpoints is that it is not how data scientists work.
Data scientists rely heavily on modeling the data sets. Very few of them frame the problem statement and outcomes from a business context. This creates data science MVPs that are very good at solving a single issue but are not repeatable across other parts of a business. In other words, they become “one-hit wonders.”
“A data scientist who is heavily into the details of the modeling can be the wrong person to lead a project. This is why I’d rather do a project in half the time with two data scientists. When you are heavily into the mechanics of running the models, it can be very hard to step back and see the bigger picture,” Camarri says.
Camarri notes that successful projects are initiated by the business teams or subject matter experts. This means the project has two leads: one who is heavily embedded within the business and another from the data science team. The collaboration allows the data science MVP to meet the business goals while “still being valid data science.”
For a data science MVP to scale, you need a variety of mindsets. The whitepaper describes three vital ones: the worker, designer, and management (see whitepaper for further explanations).
The balance of these mindsets is critical, with the designer mindset being the most difficult as it is more visionary and needs to be adaptable.
Finding this balance of mindsets is “incredibly difficult,” admits Camarri. Shifting from one mindset to another takes time and practice. Camarri even explains how he honed his ability by using his commuting time to shift mindsets.
“It is also why I’d prefer to have two people on a project as it really is quite difficult to balance these different mindsets,” says Camarri.
Two people can also course-correct a data science MVP more effectively. “The biggest challenge to the correct use of course correction is getting too close to the modeling. Worse than getting lost in the details is thinking that the model is the end goal rather than a business objective,” Camarri explains.
Taking the CoE highway
One way to drive better mindsets is creating a center of excellence (CoE) to structure data science MVP teams better.
CoEs offer three main benefits. First, they provide technical guidance to the data science teams, allowing the teams to spend more time on the bigger picture and less time figuring out how to make a piece of code work.
Second, CoEs can oversee the project approach. Lastly, they offer a framework for interacting with business users.
Camarri feels the last benefit is especially vital for scaling data science MVPs. “Business engagement is a specialized role, and data scientists who are good at this are very hard to find. By centralizing this engagement, you can provide this to a much larger number of projects.”
CoEs also offer an essential side benefit: ensuring that companies do not view data science MVPs as technology projects.
The rise of cloud computing and the easy availability of powerful (open source) software mean that it is easier than ever to run ‘data science models.’ But they have also resulted in MVPs that are designed to implement a particular technology or approach.
“Ironically, these ‘improvements’ can actually lead to worse business outcomes as the technology focus over-rides the business outcomes,” says Camarri.
The real challenge lies in finding the “old school” data science expert who blends the business and tech pieces. They are a rare breed.
In the meantime, companies can use CoEs to ensure data science MVPs start with a business focus. “Do not start with ‘what can I do with this data?’ or ‘what can I do with this technology?’ but start with ‘What is the business problem? What is the deliverable? What change is required to deliver value?’,” advises Camarri.
Sorry, there are no shortcuts
Companies should not be disheartened if they can’t bridge the mindsets in the first few MVPs. It takes time and effort.
Camarri argues that these early struggles offer valuable lessons the company can use to build a more robust data science practice. Even a project that needs a reboot can add significant value.
The whitepaper offers three valuable takeaways, based on Cognizant’s lessons in running a data science practice for over 21 years:
● Begin with a repeatability mindset, but do not enforce repeatability from the beginning.
● Value is the ultimate goal, not accuracy.
● The MVP team is unlikely to be the team that successfully scales the solution.
All three offer essential guidelines, so you have a better chance of turning your data science MVP “one-hit wonders” into “hit-making machines.”
Winston Thomas is the editor-in-chief of CDOTrends, HR&DigitalTrends and DataOpsTrends. He is always curious about all things digital, including new digital business models, the widening impact of AI/ML, unproven singularity theories, proven data science success stories, lurking cybersecurity dangers, and reimagining the digital experience. You can reach him at [email protected].
Image credit: iStockphoto/semenovp