What's the best way to deal with epics that can't be broken down?

7 replies

01:57 am March 23, 2018

I currently work as a developer on a hybrid waterfall-agile team, and we're slowly but surely adopting more agile practices, namely Scrum and Kanban. We recently tackled a large task, and while we were able to handle it in our pseudo structure, it seems like a fully agile environment would have caused significant problems.

The task we needed to complete was to migrate the underlying system framework from one to another. We could break this down into smaller stories focused on individual sections of the application, each of which could be completed within a sprint. We could also break the tasks into phases. However, since the system had no transition state, this would create a project with large chunks of broken functionality at the end of the sprint; hardly a "potentially releasable increment."

With the total amount of work requiring over one month, and the stories of the epic so heavily co-dependent, what would be the best way to tackle this in a Scrum environment? Do we change our definition of done? Does the sprint goal need to change? Do we commit to failure, planning a one-month sprint we know we won't be able to deliver on? Do we go past 9 members and create a Nexus for a single epic?

None of these options sound great to me.

Jason Jafarian

02:03 am March 23, 2018

To clarify, no matter how we broke up the epic, the smaller stories or components of the application were all co-dependent. We could find no way to deliver just one component of the epic without the others while maintaining a fully functional product.

The problem, of course, is that removing value from a product directly opposes Scrum's core definition of delivering the highest possible value.

(We need an edit button)

Ian Mitchell

06:15 am March 23, 2018

Was the framework migration task a complex one, with significant unknowns or risks to be managed?

Thomas Owens

10:13 am March 23, 2018

It sounds like the problem is in your definition of "potentially releasable increment".

You have a System A that is up and running. I see two good options right off for migrating to a replacement System B. You can either sunset System A and transfer functions to System B as they are completed, perhaps maintaining System A in the meantime. Depending on what the system is, this may be transparent to end users. Or you can build System B and, when it reaches parity with System A, turn off System A and turn on System B (perhaps with some kind of cutover window to migrate data, for example).

Both of these present increments that can be released at the end of a Sprint. Released doesn't mean "active and everyone using it". In one option, released would mean that some functionality is available in System B and when performing a given task, users can or should or must use System B to perform that task. In the second option, System B has value-adding functionality delivered to a test environment and users can check it out and provide feedback, even if they aren't using it to carry out their daily work yet.

I do think the context is important. If you are building something brand new, users have nothing and it's very easy to deliver value. However, if you are replacing a system, you can't actually replace the system until the replacement meets a certain threshold of capability. Depending on what the system is, you don't necessarily need a big-bang replacement; a phased rollout where both systems are in actual use may work instead.
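A minimal sketch of the phased option, assuming the system's functions can be moved one at a time (which will not hold for every migration). The `MigrationRouter` class, the handler dictionaries, and the function names `invoice` and `report` are all hypothetical, chosen only to illustrate the idea of both systems being in use at once:

```python
from typing import Callable, Dict, Set

Handler = Callable[[str], str]


class MigrationRouter:
    """Sends each named function to System A or System B (hypothetical design)."""

    def __init__(self, system_a: Dict[str, Handler], system_b: Dict[str, Handler]) -> None:
        self.system_a = system_a
        self.system_b = system_b
        self.migrated: Set[str] = set()  # functions already cut over to System B

    def migrate(self, function_name: str) -> None:
        # Refuse to cut over a function that System B does not implement yet.
        if function_name not in self.system_b:
            raise KeyError(f"System B has no handler for {function_name!r}")
        self.migrated.add(function_name)

    def handle(self, function_name: str, payload: str) -> str:
        # Migrated functions go to System B; everything else stays on System A.
        target = self.system_b if function_name in self.migrated else self.system_a
        return target[function_name](payload)


# Illustrative handlers only; real systems would do actual work here.
router = MigrationRouter(
    system_a={
        "invoice": lambda p: f"A:invoice:{p}",
        "report": lambda p: f"A:report:{p}",
    },
    system_b={"invoice": lambda p: f"B:invoice:{p}"},
)
router.migrate("invoice")  # one Sprint's increment: invoicing now runs on System B
```

In this sketch, each Sprint could end with one more function cut over, while anything not yet migrated keeps running on System A.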

Jason Jafarian

02:49 pm March 23, 2018

Ian, at the outset we expected the task to be complex, but with only a few remaining unknowns. As the sprint progressed, we discovered more unknowns. Our understanding of the work changed significantly during the sprint.

Thomas, at the system level, this is essentially what we do (and why we aren't fully agile yet). Your comment about needing a threshold of value to justify migrating from one system or version to another is absolutely right, thank you for sharing. My takeaway (feel free to correct me) is that we could have our "potentially releasable increment" just be the current system, with no value added or lost. Once all the co-dependent stories are completed (2-3 sprints later), our PRI becomes the new modernized system. This way, our product's value may not go up every sprint, but it never goes down. I think that would work.

Ian Mitchell

03:22 pm March 23, 2018

at the outset we expected the task to be complex, but with only a few remaining unknowns. As the sprint progressed, we discovered more unknowns. Our understanding of the work changed significantly during the sprint.

Then perhaps that’s how you ought to break the work up - in terms of mitigating the risks and complexities which emerge. It doesn’t have to be user stories. It can be assumptions about the work which need to be tested.

Thomas Owens

03:23 pm March 23, 2018

Thomas, at the system level, this is essentially what we do (and why we aren't fully agile yet). Your comment about needing a threshold of value to justify migrating from one system or version to another is absolutely right, thank you for sharing. My takeaway (feel free to correct me) is that we could have our "potentially releasable increment" just be the current system, with no value added or lost. Once all the co-dependent stories are completed (2-3 sprints later), our PRI becomes the new modernized system. This way, our product's value may not go up every sprint, but it never goes down. I think that would work.

This is not what I'm saying at all.

Your new system is ready to fully replace the old system as soon as switching over would neither add nor lose value. However, that is not a "potentially releasable increment". You can have "potentially releasable increments" with incomplete functionality. You would just choose not to actually release them because doing so would provide negative value.

What you should be doing is either deploying the system regularly to some kind of testing, pre-production, staging, etc. environment where users will be able to see and provide feedback or replacing pieces of the old system with the new system in the live environment. If you do this, you can actually demonstrate completed ("done") work and get feedback on it from users. The visibility also will help the business side see that progress is being made on a regular basis to fully replace the old system.

Jason Jafarian

04:08 pm March 23, 2018

Then perhaps that’s how you ought to break the work up - in terms of mitigating the risks and complexities which emerge.

Ian, that may work. I suppose there are very few singular tasks that, when risks and complexities are properly mitigated, would still take longer than a full sprint.

You can have "potentially releasable increments" with incomplete functionality. You would just choose not to actually release them because doing so would provide negative value.

Thomas, the hangup I have here is that the Scrum Guide says "The purpose of each Sprint is to deliver Increments of potentially releasable functionality that adhere to the Scrum Team’s current definition of 'Done.'" Can an incomplete component be considered "done"?

deploying the system regularly to some kind of testing, pre-production, staging, etc. environment where users will be able to see and provide feedback or replacing pieces of the old system with the new system

I agree with you that this is the best way to handle a migration, when possible. We did actually try this early on, but this turned out not to be an option. The migration ended up being all-or-nothing, with no way to migrate only some pieces but not others.

It's an unusual scenario that wouldn't apply to the vast majority of development tasks, but that unusual scenario is what prompted this question.