Agile methodologies in data science

What problem were we trying to solve?

At ProCogia, our goal is to provide the highest level of data science support we can. With our larger clients, we provide long-term support for several data science initiatives. This type of engagement requires more project management than single-project or short-term engagements. At one of our current client sites, we maintain a staff of five data scientists who support many of the client's engineers. Before long, we identified several inefficiencies that hindered our ability to develop the high-quality data science solutions we pride ourselves on producing. The two main pain points were a lack of clarity in project requirements and a tendency for projects to extend far beyond their initial expected duration. To address these problems, organize our efforts, and maximize our output, we adopted an agile project management process that we customized to meet the needs of our data science team.


What is Agile?

The original Manifesto for Agile Software Development was an effort by 17 veteran software developers who identified a few key features of successful development teams. Their primary values are:

Individuals and interactions over processes and tools
Working software over comprehensive documentation
Customer collaboration over contract negotiation
Responding to change over following a plan

Additionally, 12 principles are identified to further define best practices for software development. As data scientists, we recognized that not all these principles would apply to our work, but we realized that many of them do, and we set out to organize our teams and projects in an agile way.

Our experience

As consultants, we often take on projects from our business clients that are not fully defined. After all, that is part of the fun of data science: the science of it. We form a hypothesis, design and carry out an experiment to test it, evaluate the results, and iterate from there. We use a scrum process to manage our team and ensure that maximum value is delivered to our clients. Our process follows the standard scrum framework shown below.

Within this framework, we develop our projects in two-week cycles called sprints.

By holding ourselves to two weeks and communicating our work to our clients during that time, we provide a much-needed window into our development lifecycle. It is all too easy for clients on the business side to overlook the difficulties we may face in gathering, cleaning, and pre-processing their data. Instead of working in a silo for two months before delivering what we think the client wants, our team delivers smaller pieces more often, giving clients time to digest our work and provide feedback along the way. The greatest benefit we have noticed since implementing this framework is our ability to iterate more quickly and reach a solution that meets our clients' needs.

We are currently focusing on the sprint retrospective phase by gathering input from our team, as well as from our clients, to tweak our process. One important change we have made is to the project estimation we do during sprint planning. We originally set out to do a team-wide level-of-effort estimation at the beginning of each sprint cycle. What we learned after a few tries was that, given our breadth of active projects, usually only one team member understood a project's complexities well enough to accurately estimate the level of effort. The rest of the team provided little input, and the time we spent estimating together was not useful. Now, we let individual contributors estimate their own level of effort and reserve team estimation time for new projects coming into our backlog.

Overall, our implementation of agile management processes has made a positive change in our team's level of communication and productivity. This adoption has streamlined our project intake process and eliminated much of the uncertainty we used to deal with when taking on complex data science projects for our clients. Reducing the unknowns helps us reach a solution faster, and improving communication helps us adapt to changing requirements. The most important lesson we have learned, though, is that there is always room to improve. As ProCogia grows and takes on new and varied clients, the only way to keep up with the complexities is to be agile.

