Scrum with Kanban story size and throughput
When doing Scrum with Kanban the guide doesn't mention anything about sizing the stories. Should all stories have the same size?
Daniel Vacanti's book, Actionable Agile Metrics for Predictability: An Introduction, says that stories don't have to be the same size, but I don't see how the cycle time calculations won't be affected by that.
Ex: using T-shirt sizing a Large story takes 3 times the time of a Small story. I can finish only 1 Large story in a sprint so my throughput is 1 story/sprint. I use the throughput as an input to sprint planning to forecast how many stories I can deliver per sprint. Now I pick a single Small story. I end up with a mostly empty sprint so I pick up 2 more Small stories.
My average throughput is now (1+3)/2 = 2 stories/sprint.
What if I pick two Large stories next? I'll end up with spillover. In addition, my cycle time scatterplot looks all over the place because stories of different sizes take different amounts of time.
What is the correct action? Same sizing of all stories?
What is the correct action? Same sizing of all stories?
That could be a viable workflow policy. If the rate of throughput is high enough, variation in story size might no longer be statistically significant. It isn't a perfect solution, because the value of each broken-down story may become too attenuated to be useful.
Alternatively, you could have large stories triaged into a different workflow before your own workflow even starts. My first Scrum Master job was on a Small-Change Scrum Team, which handled large tickets rejected by Kanban teams. Those tickets were sometimes sneaked into an operational Kanban to avoid drawing on capitalized Scrum budgets.
You are mixing up the terms "throughput" and "cycle time". They are not the same. They are related, and most useful when used together. I'm going to quote multiple paragraphs from his book. These are from Chapter 1 (page 13 in my digital copy):
As I just mentioned, the direct consequence of large buildup of work is that all of that queued work itself takes longer to complete. The flow metric that represents how long it takes for work to complete is called Cycle Time. Cycle Time ultimately answers the question of “When will it be done?” A process with elongated Cycle Times makes it harder to answer that question.
The direct consequence of elongated Cycle Times is a decrease in Throughput. Throughput is the metric that represents how much work completes per unit of time. A decrease in Throughput therefore means that less work is getting done. The less work that gets done, the less value we deliver.
To manage flow we are going to need to closely monitor those three metrics:
1. Work In Progress (the number of items that we are working on at any given time),
2. Cycle Time (how long it takes each of those items to get through our process), and
3. Throughput (how many of those items complete per unit of time).
The rest of this book will explain that if your process is not predictable, or is veering away from predictability, these metrics will suggest specific interventions that you can make to get back on track. In a word, these metrics are actionable.
Each metric tells you something different and taken together they form a basis of actionable metrics.
The size of an item does not matter in cycle time calculations, because cycle time measures how long you actually take to finish work. Items can be sized differently but take the same amount of time to complete once actual work has started. The sizing is based upon the knowledge you have at the time of the estimate. Cycle time is based on the actual duration it took to finish the work. So using your estimate as a judge of your cycle time means you are basing your results on unproven guesses.
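A minimal Python sketch of that distinction, using made-up data (the item IDs, sizes, and dates are all hypothetical). It assumes the common convention of counting both the start day and the finish day, so an item started and finished on the same day has a cycle time of 1 day; your tool may count differently.

```python
from datetime import date

# Hypothetical completed items: (id, T-shirt estimate, started, finished).
items = [
    ("A", "S", date(2024, 1, 1), date(2024, 1, 4)),
    ("B", "L", date(2024, 1, 2), date(2024, 1, 5)),
    ("C", "S", date(2024, 1, 3), date(2024, 1, 12)),
]

def cycle_time_days(started, finished):
    # Assumed convention: count both the start day and the finish day.
    return (finished - started).days + 1

# Cycle time comes from actual dates; the estimate is never consulted.
cycle_times = {item_id: cycle_time_days(s, f) for item_id, _size, s, f in items}

# Throughput is a count of finished items per unit of time (here: per week,
# over an assumed two-week observation window).
window_weeks = 2
throughput_per_week = len(items) / window_weeks

print(cycle_times)          # {'A': 4, 'B': 4, 'C': 10}
print(throughput_per_week)  # 1.5
```

In this made-up data the Small item A and the Large item B both took 4 days, while the Small item C took 10, which is something the estimate alone could not have told you.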
The size of an item could impact your throughput as you point out. But it alone is not a total picture.
Actionable Agile is intended to help you make your process more predictable and consistent. There are many factors that play into that end goal. Using or focusing on just one factor will not yield the same amount of success.
Over time, the actual data will provide much better ability to forecast new work because it is actual. Estimates can change over time as people get more familiar with a domain. What was a XL 6 months ago probably isn't the same as an XL today. You would expect that your ability to guess gets better. Your ability to complete work in the domain may also improve over time but again you are measuring actuals and not using assumptions. You need the whole data set not just a single point.
Keep reading the book. I think it is full of very useful information and techniques. If you use Jira to manage your Product Backlog I suggest you look into the Actionable Agile addon. It is easy to use, does all of the calculations for you based on the data in Jira and provides the ability to segment, drill down and get deeper information from the data.
Thanks everyone for your input.
I guess I didn't explain my issue properly. In traditional Scrum we use the team velocity and the individual story estimates to come up with the number of stories that we can pick up.
If we use Scrum with Kanban we have a metric called throughput which describes how many stories we completed in a duration. If we use the sprint as the duration then we have for example 10 stories per sprint.
Now how can I use this metric in sprint planning ?! Unless all stories are the same size it sounds like a useless metric if the team decides to pick up significantly large stories.
I'm having some trouble finding some of the references now, but there are a few things to consider:
- Nothing is stopping you from measuring throughput in Story Points, if you really want to. However, most people don't because...
- There's some evidence that counting Done Story Points isn't that different than just counting Done Stories, especially over a long enough time. Unless your work is radically different sizes and you experience these sizes in batches (for example, a series of Sprints with only very, very small Stories followed by a series of Sprints with only very, very large Stories), it all averages out.
- The Law of Large Numbers.
In traditional Scrum we use the team velocity and the individual story estimates to come up with the number of stories that we can pick up.
I disagree with that statement. Nowhere in any revision of the Scrum Guide I have ever read does it say anything about story estimates as an indicator of velocity. In fact, the 2020 version of the Scrum Guide doesn't even have the word "estimate" in it. In the 2017 version, variations of the word "estimate" are used 9 times, but none of them are used in a manner that indicates the estimates are used to come up with the number of stories that can be included in a Sprint Backlog. Estimates are a means of categorization. They can also be used to form an opinion on whether work is worth doing.
I will also point out that the word velocity does not appear in either the 2020 or 2017 versions of the Scrum Guide.
Both guides say that Sprint Backlog creation should be driven by the Sprint Goal, and that items are picked in a manner that will allow the team to satisfy that goal.
To be complete, throughput and cycle time are not mentioned anywhere in the guides either. However, they have been part of Kanban from the beginning, because Kanban is focused on improving and mastering the flow of items through your workflow, whereas Scrum is focused on incrementally delivering valuable increments to stakeholders. The two can work together because they focus on different aspects, and those aspects supplement each other.
Now how can I use this metric in sprint planning ?! Unless all stories are the same size it sounds like a useless metric if the team decides to pick up significantly large stories.
You use the metrics of throughput, cycle time, and work in progress to establish a flow that can be maintained over time. The period of time is not timeboxed as a Sprint would be. You discard the Sprint boundaries and look at how long it takes to get something through your workflow. If, for example, you are consistently taking 9 days to complete work, you may determine that it is not feasible for your team to pull in additional work towards the end of a Sprint and should instead find other ways to use that time. Or you see that you are completing most work in 7 days and decide to change your Sprint length to 1 week. It can also be used by the Product Owner to forecast delivery. If an item on average is taking 8 days to complete and your team consistently has 2 items in progress at the same time, they can look at the Product Backlog and forecast that it could take up to 40 days (2 in progress now + 4 to complete before work can begin + time to complete the remaining two) to complete the 6th item in the Product Backlog order. That could drive a decision to change that item's order in the backlog.
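As a rough illustration of that kind of forecast, here is a small Python sketch. The function name and the "waves" model are my own simplification, not anything from the book: it assumes the team always works on exactly `wip` items at once and that every wave of items takes the average cycle time.

```python
import math

def forecast_days(backlog_position, in_progress_now, wip, avg_cycle_time_days):
    """Rough days until the item at `backlog_position` is done."""
    # Everything that must finish, up to and including the target item.
    items_to_finish = in_progress_now + backlog_position
    # Assumption: the team works in waves of `wip` items, each wave taking
    # the average cycle time.
    waves = math.ceil(items_to_finish / wip)
    return waves * avg_cycle_time_days

# 6th backlog item, 2 items already in progress, WIP of 2, 8-day average.
print(forecast_days(backlog_position=6, in_progress_now=2, wip=2, avg_cycle_time_days=8))  # 32
```

Under these simplified assumptions the 6th item lands at about 32 days, which sits inside the "up to 40 days" figure. Real items vary around the average, so treat either number as a rough forecast and not a commitment.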
There are many ways to use Kanban practices to supplement Scrum practices. And as with all things agile, each team's benefits can vary as can their implementation of the practices.
I hope this helps.
Let me quote Daniel Vacanti from his book:
...the assumptions about our process that are necessary to make Little’s Law work are:
- The average input or Arrival Rate should equal the average output or Departure Rate (Throughput).
- All work that is started will eventually be completed and exit the system.
- The amount of WIP should be roughly the same at the beginning and at the end of the time interval chosen for the calculation.
- The average age of the WIP is neither increasing nor decreasing.
- Cycle Time, WIP, and Throughput must all be measured using consistent units.
Point 1 is the key here. If we pick more stories to work on in a sprint and by the end of the sprint we fail to complete them all then we don't have a predictable process according to him.
So we should somehow use Throughput to forecast how many stories we can deliver in a sprint. How exactly is my question!
Little's Law is based on a continuous flow. Scrum isn't always a continuous flow. You may go days before moving work into a Sprint because your to-do work for the Sprint is at its limit. If you really want to track throughput, consider the key life events of a Product Backlog Item: refined and "ready" for a Sprint, selected for a Sprint, one or more states during development, and Done. So consider the word "average".
At any arbitrary point in time, what are the average cycle time and the average number of in-progress items? The point-in-time value could be 0. For example, during Sprint Planning, you may or may not consider "selected for a Sprint" to be "in progress". You can still use this to determine the average. If you start work in the middle of a Sprint, you can also opt to keep it in progress and track how long it takes to get to Done, even if it crosses Sprint boundaries. Again, you can get to the average cycle time from start to Done. Knowing these two averages, you can figure out how much work you can have in your system.
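To make the relationship between those averages concrete, here is a minimal Python sketch of Little's Law (average Cycle Time = average WIP / average Throughput) rearranged both ways. The function names and the numbers are only illustrative, and the result is only meaningful over a period where the assumptions behind Little's Law roughly hold.

```python
def throughput_per_day(avg_wip, avg_cycle_time_days):
    # Little's Law rearranged: Throughput = WIP / Cycle Time.
    return avg_wip / avg_cycle_time_days

def wip_for_target(target_throughput_per_day, avg_cycle_time_days):
    # Little's Law rearranged: WIP = Throughput * Cycle Time.
    return target_throughput_per_day * avg_cycle_time_days

rate = throughput_per_day(avg_wip=2, avg_cycle_time_days=8)
print(rate)                    # 0.25 items per day
print(rate * 10)               # 2.5 items in an assumed 10-working-day Sprint
print(wip_for_target(0.5, 8))  # 4.0 items in progress on average
```

These are long-run averages, so 2.5 items per Sprint means "about 2 or 3 in a typical Sprint", not a promise for any single Sprint.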
Point 1 is the key here. If we pick more stories to work on in a sprint and by the end of the sprint we fail to complete them all then we don't have a predictable process according to him.
Not necessarily. It probably depends on what decisions are taken about the WIP at the end of the sprint.
In many cases, such items remain a good choice for further investment, and the only likely disruption to their flow will be the period of time taken for the Scrum Events, before flow resumes. This might not be long enough to significantly compromise predictability.
The disruption to flow might be more noticeable on a single item if cycle time is very short and there's a 1 or 2 day wait for Scrum Events; but as cycle times shrink and throughput increases (and perhaps WIP is greatly reduced too), the proportion of items unfinished at the end of a sprint might shrink too, further reducing the impact of a Sprint on predictability.
If we use the sprint as the duration then we have for example 10 stories per sprint.
Now how can I use this metric in sprint planning ?! Unless all stories are the same size it sounds like a useless metric if the team decides to pick up significantly large stories.
Same sizing isn't usually necessary, but right-sizing probably is. My approach to "right-sizing" is to agree on a consistent policy, such that items are treated in a consistent way.
If you imagine a bag of potato crisps (or chips, depending where you live), the machine in the factory slices in a consistent way. This produces a range of sizes, but they're within a predictable range, and if you were to measure the diameter of each one, you'd be able to say something like 85% have a diameter of 5cm or less.
If you take another bag, you would almost certainly find that the 85th percentile comes out at around the same size.
The weight of each bag's contents is a consistent constraint (just like the timebox of the sprint). Each crisp is right-sized by the way they're sliced.
If you were to count the number of crisps in each bag, they would be fairly consistent from one bag to the next, within a given range. Just like throughput or velocity will fluctuate from sprint to sprint, the exact number of crisps will fluctuate somewhat, but will prove predictable overall.
You will never be able to know exactly how many crisps you will get, or what can be done in a sprint, but you will be able to forecast with a degree of confidence.
If cycle time is expressed as a service level expectation (e.g. 85% of items go from started to done in 7 days), the Scrum Team can apply a simple right-sizing technique, by the Developers asking themselves "do we believe this item is small enough to be done in 7 days, according to our normal workflow?"; if the answer is no, the item needs to be split.
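A service level expectation like that can be read straight off your historical cycle times. The Python sketch below uses the nearest-rank method for the percentile, which is one of several reasonable definitions; the cycle times are invented and `pct` is assumed to be a whole number.

```python
def percentile_nearest_rank(values, pct):
    """Smallest value such that at least pct% of values are <= it."""
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100, avoiding floating-point rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]

# Hypothetical cycle times, in days, for the last 20 finished items.
cycle_times = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 6, 6, 7, 7, 9, 11, 14]

sle_days = percentile_nearest_rank(cycle_times, 85)
over_sle = [days for days in cycle_times if days > sle_days]

print(sle_days)  # 7
print(over_sle)  # [9, 11, 14]
```

With this data, 17 of the 20 items (85%) finished in 7 days or less, so "7 days" becomes the number to use in the right-sizing question, and the three slower items are the ones worth a closer look.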
To ensure continuous improvement, once items are done, the Scrum Team can look at items that took longer than the expected 7 days, and identify why this was the case. They can take actions to prevent such occurrences in the future.
The cause might be sizing, and so the Scrum Team may need to learn how to refine more effectively; or it might be something else, such as dependencies, or favouring newer items over older work in progress.
Once the causes of slow cycle time are eliminated, and items taking longer than 7 days are a rarity, the service level expectation should change to 85% of items going from started to done in 6 days or 5 days.
The question asked during right-sizing would be adjusted accordingly, and the whole process repeated, so that further improvements are sought; gradually eliminating the causes of the slowest cycle times.