Normalization of story points between Scrum Teams
I'd like to gauge opinion on the normalization of story points between Scrum Teams. Just to be clear, by "normalization" I mean that each Scrum team is expected to have the same view as to what constitutes a story point. The idea is that a 1, 3, 5, 8, 13 etc. must have a similar meaning across teams, and that similarly sized requirements must therefore be in some sense comparable.
Ken Schwaber told me a few months back that, in his view, such normalization is a bad idea because story points cannot be "commoditized". I agree with him on the basis that the measure of productivity in Scrum should be the delivery of working increments. I also agree that normalization does indeed imply commoditization and this can skew people's ideas of what constitutes value.
Nevertheless, I'm finding that more and more clients are trying to normalize estimates across teams with a view to comparing them in some way. I'd observe that often it is IT departments who drive this normalization process rather than Business, usually in an attempt to retain departmental control of teams and their delivery. I should also point out that normalization is an accepted practice in the Scaled Agile Framework (SAFe) and so it is something we are likely to encounter more often.
I asked James Coplien for his opinion about this. He concurs that normalization makes no sense. He said: "Managers feel they can compare normalized points but, of course, you can't. It's fundamental to relative estimation".
Let's note, however, that I've assumed the case where the teams are engaged on different projects and have different Product Backlogs. If multiple teams share the *same* Product Backlog, then some sort of normalization will be essential, otherwise the sizes given to PBIs will be inconsistent. This is tricky because different teams will have a tendency to diverge unless they are occasionally recalibrated.
James suggested calibrating teams within high and low baselines. He said: "Teams working together on one product under a single product owner should use the same scale when estimating the Product Backlog (since it's the same product backlog). I recommend calibrating a high and a low baseline that is shared across teams — that calibrates the spread". This seems reasonable, but are there any other opinions? Moreover, since the Scrum Guide expressly allows multiple teams to work on the same product, how and when should such calibration be performed?
Hi Ian,
what value do story points have? They help the dev team to make a forecast during sprint planning, and they help the Product Owner to plan his releases. That's it, or did I miss something?
If multiple teams work on one product backlog, I think they should use the same scale, which means the same reference stories for their estimations. This way, they can make a forecast based on each team's velocity. As reference they should use stories that each team is able to estimate (and implement).
If they work on different product backlogs, in my opinion there is no value in a normalization.
If someone tries to normalize there, for me that smells like controlling, which is one of a Project Manager's jobs that wasn't assigned to PO, SM or Dev Team when creating Scrum, because it is waste.
I don't see a contradiction to the fact that normalization is an accepted practice in SAFe, but I understand your point that we have to be prepared for discussions with those guys.
Is it not compulsory to "normalize" when you have a large Product Backlog and several Dev Teams working on it?
How can the Product Owner order the Product Backlog if the estimates are not congruent?
> Is it not compulsory to "normalize" when you have a large Product Backlog
> and several Dev Teams working on it?
> How can the Product Owner order the Product Backlog if the estimates are
> not congruent?
I wouldn't say it's compulsory. However I do think it's a best practice to do so, and for the reason you suggest. The issue is when you have separate Product Backlogs, each of which is subject to independent product ownership and which have separate Development Teams working on them. The only common link in such cases may be the organization the teams work for. Yet I'm encountering more and more clients who have wildly different projects in their portfolios and who want to normalize points across them, thereby establishing a currency which they can subsequently control.
Mike Cohn has written about normalizing points when multiple teams are working on a given project (see his book Agile Estimating and Planning). However, he was not talking about separate projects. I asked him about this yesterday just to clarify his position. He told me "If two teams are on separate projects I would not standardize the points". So in this, he concurs with Ken Schwaber and James Coplien.
Here's my other question: if we have multiple teams drawing work from a common Product Backlog, how can we make sure that normalized points stay aligned? What Scrum event is used for this purpose? There's nothing in the Scrum Guide about this, even though it acknowledges that such multiple-team scenarios are viable.
> Posted By Ian Mitchell on 12 Dec 2013 05:15 AM
> Here's my other question: if we have multiple teams drawing work from a common Product Backlog, how can we make sure that normalized points stay aligned? What Scrum event is used for this purpose? There's nothing in the Scrum Guide about this, even though it acknowledges that such multiple-team scenarios are viable.
This is what product backlog refinement (aka grooming) is used for, which is however not a Scrum event.
How can you make sure that points stay aligned, if you have one team? You need a "scale" of estimated reference stories and compare new stories to them. I would do the same for multiple teams and use the same scale for all of them.
> Here's my other question: if we have multiple teams drawing work from a common Product Backlog, how can we make sure that normalized points stay aligned? What Scrum event is used for this purpose? There's nothing in the Scrum Guide about this,
I don't care for the term normalization because it has two meanings that are vastly different in our context:
1. Quantile normalization, statistical technique for making two distributions identical in statistical properties
2. Normalization (statistics), removing statistical error from measured data
I think we mean #1, but some naive managers think you mean #2.
This story point scale question for multiple teams is a very complex topic, as there is no one "best practice". It is well covered in Larman's books, including how to keep the teams in sync with their velocity and planning. See #2 and #3 under "Scaling Scrum" here:
http://www.scrumcrazy.com/resources
IMO, if one has not at least heavily skimmed those two books, one shouldn't be scaling Scrum. Otherwise, you're giving a loaded gun to a 5 year old (which is my opinion about SAFe). Just my opinion.
> Otherwise, you're giving a loaded gun to a 5 year old(which is my opinion about SAFe). Just my opinion.
Here's my opinion: SAFe is like giving a badly loaded gun to the sheriff. Five rounds are blank and the only one that releases anything is so overloaded it blows the barrel.
> This story point scale thing on multiple teams is a very complex topic as
> there is no one "best practice". It is well covered in Larman's books,
> including how to keep the teams in synch with their velocity and planning
I think part of the problem is that neither Cohn, nor Larman & Vodde, make it absolutely clear that (quantile) normalization is inappropriate for unrelated project teams. They merely express their ideas within a single project context where methods must be found to size a common backlog and to express a multi-team product burndown. Unfortunately, upon reading such material, some managers extrapolate the remedy out of context, and use it to buttress a general case for normalization across unrelated projects with a diverse product ownership. This is how story points subsequently become commoditized to the detriment of actual delivery. Product Owners are not well served by these shenanigans and I wish I could put a lid on it.
Good point, Ian. I would agree, except that I would word it as follows:
Change this:
> (quantile) normalization is inappropriate for unrelated project teams
to this:
> (quantile) normalization is inappropriate for unrelated products.
I would probably just change all instances of the word "project" to "product".
http://urbanturtle.com/blog/2013/08/26/you-need-an-agile-product-manage…
Related or not, this becomes an issue when the performance of individuals in the same discipline has to be gauged on a relative basis. Without a normalized scale, it's really hard for a manager to know which of their people is doing more complex work than their peers (obviously with all individuals working on completely different projects and teams), and apparently there doesn't seem to be a way to normalize story points among different teams. Does anyone have any suggestions?
Story points aren't for comparing people, even if they are on different teams, and they should reflect projected effort rather than complexity.
I'd be concerned about why a "manager" would want to compare estimated measures. Estimates are a matter for the team (or teams) working on a single Product Backlog. Other stakeholders should care about the delivery of value, for which the Product Owner should be their point of contact.
Normalized story points are useful when you want to estimate a roadmap of features (big user stories) at a program level in SAFe.
The issue I see when we try to apply that principle to the teams is that they immediately conclude that 1 point equals 1 day.
As the aim of this normalization is to be able to build a roadmap of features, my proposal is:
1/ to let the teams use their own, true story points for user stories, to benefit from the full potential of autonomous and committed teams
2/ to sum up a normalized version of these points per user story to get a normalized size for the Feature, so that new Features can be evaluated and planned using this normalized experience.
The normalized ratio per team is:
ratio = Normalized Story Points (a function of the size of the team) / Velocity of the team
This ratio is only used at a program level for the only objective of building roadmaps.
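To make the proposal above concrete, here is a minimal sketch in Python of how the per-team ratio could be applied. It is only an illustration under stated assumptions: the `Team` class, the function names, and especially the `points_per_member` constant (normalized points per team member per sprint) are hypothetical and not taken from the post or from SAFe documentation. The post only says that normalized points are "a function of the size of the team"; a simple linear function is assumed here.

```python
from dataclasses import dataclass


@dataclass
class Team:
    name: str
    size: int        # number of team members
    velocity: float  # the team's own story points completed per sprint


def normalized_ratio(team: Team, points_per_member: float = 8.0) -> float:
    """Normalized points per sprint divided by the team's own velocity.

    Assumption: normalized capacity grows linearly with team size.
    """
    if team.velocity <= 0:
        raise ValueError("velocity must be positive")
    return (team.size * points_per_member) / team.velocity


def feature_size(stories, points_per_member: float = 8.0) -> float:
    """Sum the normalized size of every (team, team_points) story in a Feature."""
    return sum(
        points * normalized_ratio(team, points_per_member)
        for team, points in stories
    )


# Hypothetical data: two teams of equal size that use different point scales.
team_a = Team("A", size=5, velocity=40)  # ratio = 40 / 40 = 1.0
team_b = Team("B", size=5, velocity=20)  # ratio = 40 / 20 = 2.0

feature = [(team_a, 8), (team_b, 3), (team_b, 5)]
print(feature_size(feature))  # 8*1.0 + 3*2.0 + 5*2.0
```

The teams keep estimating in their own points; the conversion happens only when a Feature is sized for the roadmap, which matches the intent that the ratio is used at program level only.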
For the sake of the examples, let's assume we're working with perfect Scrum teams.
Example 1:
Four Scrum Teams working on four products, or four different Product Backlogs.
I see no value in trying to normalize Story Points, not to mention the difficulty involved in that.
Example 2:
Four Scrum Teams working on one product, or one Product Backlog, working as a Nexus.
I see no value in trying to normalize Story Points, however, there would be value in applying, at least at the Nexus level, consistent units of measurement for relative estimation.
The purposes behind adding estimates to items on a Backlog are, one, to help the team estimate the amount of work for the Sprint Backlog, and, two, to help the Product Owner gauge how much work remains towards some objective or goal.
If Team A burns through an average of 40 points a sprint, B through 20, C through 100, and D through 5, your total average per Sprint is about 165 points.
If, in a Nexus, you try to identify which team might work on each PBI as early as possible, then this average can still be used, with some flexibility understanding that a particular team may take more or less on any given Sprint (as is the case with any Product Backlog forecast).
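The arithmetic above can be sketched in a few lines of Python. The velocities are the ones given in the example; the remaining backlog size of 1000 points and the function names are hypothetical, and the sum is only meaningful if all four teams estimate in the same units, as suggested for the Nexus case.

```python
import math

# Average velocity per Sprint for each team, from the example above.
velocities = {"A": 40, "B": 20, "C": 100, "D": 5}


def combined_velocity(team_velocities: dict) -> float:
    """Total points the teams burn through together in an average Sprint."""
    return sum(team_velocities.values())


def sprints_remaining(remaining_points: float, team_velocities: dict) -> int:
    """Rough forecast: Sprints needed to burn down the remaining points."""
    total = combined_velocity(team_velocities)
    if total <= 0:
        raise ValueError("combined velocity must be positive")
    return math.ceil(remaining_points / total)


print(combined_velocity(velocities))        # 165
print(sprints_remaining(1000, velocities))  # 7 (1000 is a hypothetical backlog size)
```

As with any Product Backlog forecast, the result is an average, not a commitment: a particular team may take more or less in any given Sprint.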
-
The danger, in my opinion, with trying to normalize Story Points is that you start assigning absolutes to estimation, which is the trouble with time-based estimation.
Story Points are just a numerical way of comparing the size of two or more items; the second you try to influence that view, you'll start to get skewed results.
I imagine that, for the Scrum Teams within a Nexus, normalization might occur naturally over time through the influence of the Nexus Integration Team members and of the representatives of each Scrum Team who work closely with the other Scrum Teams.
I would argue that normalizing story points across different teams makes sense only when describing the "size" of the projects. Even though productivity might be completely different, the size estimates should not vary that much if common low and high baseline stories are used across the teams.
I posted a separate question about that, but I'm unsure how long moderation takes here.
This discussion is predicated on a premise that story points are a meaningful capacity measure.
An alternative view, which has existed for a very long time and may be coming into view now that PSK is available, is to use time-based measures and probability distributions. TOC and Critical Chain have used this for years.
With story points, you have no real means of knowing the status of a multi-team endeavour, or whether to help out or intervene.
There is no need to normalize at all. Use story points for what they were intended for: relative sizing. Use management methods from TOC and Lean to forecast and plan delivery.