Help! I need a data science solution: Let’s start project planning
This article is by Moshe Uziel, Chief Data Scientist in Healthcare at Stat-Market, responsible for helping clients build data intelligence into their operational processes.
In this series of blog posts, I want to provide organizations with the insights needed to integrate data science innovation into their organizations. In the previous article, we looked at the key players involved in driving data innovation. Next, we look at how data science ecosystem players can start to work together to produce results.
In my experience in developing data science solutions for leading organizations, I have seen that data science ecosystem teams need a clear plan on how they will collaborate to develop data innovation.
So, what are the first steps for building your data science solution? Let’s explore.
Start by considering: Is the project feasible?
The first stage is to evaluate project feasibility and budget.
At this point, the AI Product Manager and the Chief Data Scientist sit down and evaluate their resources. Through this exercise, they can estimate whether they can build the solution with their current resources and, if not, whether they can meet their goals through alternative solutions.
Key questions to ask in this phase include:
- Does your data science team have the knowledge to address the specific challenges of your new project?
- Is there an SME available who can commit to the project?
- Do you have all the necessary data on hand?
- Some algorithmic developments need extensive compute resources, such as GPUs or complex cloud architecture. Has this been accounted for in your planning?
Kicking it off
Once the project has been evaluated as feasible, it is good to start off with a kickoff meeting with all parties involved. At this meeting, you can ensure that all parties are on the same page, and that they understand two things:
- The value of the project and how it will contribute positively to the organization’s overarching business goals
- The players' roles in the project and how they will contribute to bringing the defined goals to life
Putting pen to paper
Next, it is time to start building your project protocol. This is the most critical point of project development: if you want to increase your odds of getting real value out of your data science project, you need to have a protocol in place.
I have heard different companies call it various names, but in a nutshell, a project protocol is a document that specifies everything about the project. The document serves as a reference for everyone included in the project.
The specifications in the document include:
THE WHY: What is the reason for doing this project, what value will it bring to the organization, and what impact will it have?
THE WHO: Specify who is part of the project and what their role is. As not everyone is working in the same department, it is advisable to add everyone’s contact information on the document.
PROJECT OUTPUTS: Outline exactly what is expected to be delivered at the end of the project. The output could be a model that needs to be integrated into the company’s product or an insight for a specific question. The possibilities for the output are endless, so desired outcomes must be specified from the start.
DATA: What kind of data do you need, and is it available? Here, the SME should outline the broader business concepts, and the Data Figure needs to translate these concepts into data points in the company’s database.
MILESTONES: Break the project down into parts, from data collection to literature review, all the way to delivery. Specify who will work on every aspect of the project. For example, it is sometimes useful for the SME to validate the data before it even reaches the data scientist. That validation step should then be included in the document as a milestone, and it should reference the data engineer and the SME.
Pro tip: The closer the Data Scientist and the SME work together, the better the product. The SME will need to see the analysis of the data if they are to make the right decisions when managing the project; you want this analysis to be part of the milestones. This ensures that the data scientist won’t be left alone to make sense of the data and potentially make less than ideal decisions.
SUCCESS INDICATORS: Outline how you will measure project success. KPIs need to be described in terms of business concepts and key metrics that evaluate output success.
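If your team prefers to keep the protocol next to the code, the sections above can also be captured in a small structure. What follows is only a minimal sketch in Python, not a standard format: the class names, field names, and the `missing_sections` helper are my own illustrative assumptions, and the sample values are made up.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Milestone:
    name: str
    owners: List[str]  # roles responsible for this step (hypothetical labels)


@dataclass
class ProjectProtocol:
    # Field names mirror the protocol sections; they are illustrative only.
    why: str
    who: Dict[str, str]  # role -> contact information
    outputs: List[str]
    data: List[str]
    milestones: List[Milestone] = field(default_factory=list)
    success_indicators: List[str] = field(default_factory=list)

    def missing_sections(self) -> List[str]:
        """Return the names of the sections that are still empty."""
        sections = {
            "why": self.why,
            "who": self.who,
            "outputs": self.outputs,
            "data": self.data,
            "milestones": self.milestones,
            "success_indicators": self.success_indicators,
        }
        return [name for name, value in sections.items() if not value]


# Made-up example values, for illustration only.
protocol = ProjectProtocol(
    why="Reduce missed appointments",
    who={"SME": "sme@example.com", "Data Scientist": "ds@example.com"},
    outputs=["No-show risk model"],
    data=["Appointment history"],
    milestones=[Milestone("SME validates data", ["SME", "Data Engineer"])],
)
print(protocol.missing_sections())
```

In this example the success indicators have not been filled in yet, so the helper reports that section as missing. A check like this can be a quick way to confirm the protocol is complete before the kickoff work begins.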
Surprise is not always a good thing
It is essential to minimize any surprises when delivering the project. Therefore, information about the progress of the project needs to flow between all of the players involved. This is a critical point that should not be underestimated, as a lot of effort is required to ensure that the ecosystem players work collaboratively.
I hope this introduction to data science project planning will give you a clear overview of getting started.
Stay tuned for our next article in the series, and remember: collaboration is king!