What Are (Story) Points Worth?

A closer look at one of the most misunderstood parts of Agile planning

Published on Monday, 27 January 2020

If you've ever worked in an Agile environment, you'll have heard of story points - and at some point, it's likely you'll have seen (or been involved in) a disagreement over what they actually mean. Story points are a great tool for estimating the effort involved in completing a piece of work, but they're very commonly misunderstood or misused in ways that can make them at least as bad as the problems they're intended to solve.

What Story Points Aren't

The phrase "story points" may mean different things to different people, so it helps to first understand what they are not - particularly if you're coming from a planning or management background. If your professional experience has involved phrases like "man-hours" or frequent use of Gantt charts, then you may be tempted to use story points in ways they're not meant for.

Story points are not a proxy for time...

...they are a measure of effort.

A story point cannot be expressed as a number of hours, or days, or any other unit of time. Two stories with the same story-point value may take different amounts of time to complete. This is because story points take uncertainty and difficulty into account, as well as raw time. An 8-point story might take all day, or it might be something you're unsure about that turns out to take an hour once you look into it. It might be something simple but laborious that takes you hours to finish, or it might be six lines of really complicated code that take you a few hours of scribbling on a whiteboard to work out.

Measuring effort, rather than time, means that story points can do something that hour estimates cannot: they account for developer wellbeing and burnout. An 8-point story might take a whole day of solid work to complete, or it might take two hours... but a high point-value story that takes less time is likely a more difficult piece of work, meaning that the developer working on it needs to take breaks more often, or is mentally drained and takes longer to do their next piece of work. In this way, a velocity measured in story points will usually be more consistent than one measured in hours spent typing code.

Story points are not absolute...

...they are a relative measure.

Story points incorporate difficulty and uncertainty, and these things have no standardised measure. This means that there is no set standard for what one story point is worth. There is no formula into which you can feed some details about a story and get its point value out. Instead, story points are assigned by comparison with other stories. If the story is similar to one you did a few weeks ago, and that one was a 5, then this one is also a 5. If this one feels like it'll be a bit tougher, or take slightly more work, or you're just a bit less certain about what's involved, then maybe it's an 8; if you think it'll be a little easier or it's a shade clearer than the previous story was, then perhaps it's a 3.
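The comparison described above can be sketched in a few lines of Python. This is only an illustration: the scale, the `estimate_relative` helper, and the "easier" / "similar" / "harder" labels are all assumptions made for this sketch, and in practice the judgement comes from a team conversation rather than a function call.

```python
# Hypothetical Fibonacci-style scale; each team chooses its own.
SCALE = [1, 2, 3, 5, 8, 13, 21]


def estimate_relative(reference_points, comparison):
    """Pick a point value by stepping along the scale from a reference story.

    reference_points: the value given to a similar, already-completed story.
    comparison: "easier", "similar", or "harder" (labels assumed for this sketch).
    """
    step = {"easier": -1, "similar": 0, "harder": 1}[comparison]
    index = SCALE.index(reference_points) + step
    # Clamp so the result never falls off either end of the scale.
    index = max(0, min(index, len(SCALE) - 1))
    return SCALE[index]


# A story like the earlier 5-pointer, but a bit tougher or less certain.
print(estimate_relative(5, "harder"))
```

Note that nothing in the sketch tries to compute a value from the story itself; the only input is how the new story feels next to one the team has already done.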

Over time, a team builds up a history of stories to compare new work to, and therefore tends to settle on a roughly consistent idea of what each point value looks like, which in turn helps to smooth out inaccuracies in estimating a team's velocity.

Story points are not universal...

...they can differ from one team to another.

Story points are based on things that differ across developers. What is hard for one developer may be easy for another; where one developer feels confident, another may be more cautious. Each team has a different history, and therefore a different set of past stories to use as comparisons and baselines. This means that two different teams may come up with totally different numerical values for the exact same story. This is entirely normal, and an intended part of the function of story points.

Having story points differ from team to team makes it absolutely clear that any estimate given by a team other than the one who will actually be doing the work is largely worthless. It discourages attempts to hold a team to an estimate or deadline that was provided by another team, and it encourages teams to have their own look at any work assigned to them, regardless of how thoroughly another team may have examined it. Crucially, it also takes the competition out of estimation, because there is no way to compare one team's estimate of 8 points to another team's estimate of 5 points. Competition over velocity drives unhealthy practices like inter-team rivalry, resentment, and the incentive to push developers to work at an unsustainable pace. The meaning of story point velocities is individual to each team, and therefore not useful for comparison.

Story points are not fixed...

...the value of a point can vary over time.

It's been said that you can never jump into the same river twice, because the water keeps flowing, and you keep aging, so the second time round it's no longer the same river and you're not the same person. The same is true of story points: what was difficult for you a year ago may be easier now, and you may have learned things that tell you to be wary of a task that you once would have assumed to be easy. As a result, a story you estimated a while ago might get a different story point value if you estimated it today. This is entirely normal, and an intended part of the function of story points.

The varying nature of story points means that they adapt to a team's changing circumstances. Gaining or losing a team member, moving to work on a different area of the codebase, adopting different technologies or tools, and a myriad other things will all affect the way that a team views their work and will therefore cause their story point values and velocity to shift. This also means that as time passes, older estimates become less reliable, because the value of a story point drifts from what it was when those estimates were made. This encourages teams to take a fresh look at older stories that have been brought back to the top of their backlog, and re-estimate them in the context of their current skills, knowledge, and outlook.

Story points are not precise...

...there is an inherent fuzziness in their measurements.

Story points are generally not used as a continuous scale. You'll often see story point values chosen off the Fibonacci sequence (1, 2, 3, 5, 8, 13, 21, 34...) or an adjusted version of it where higher numbers are rounded to multiples of 10 for simplicity (commonly 20, 40, and 100). There's a simple reason for this: with all the ways that story points can vary, it makes no sense to use them as if they were a precise measure. This is especially apparent at high values, which makes sense when you consider what they represent. If you think a task will take a few minutes, you're probably right, but if you think it might take a month, then it's probably a large task, and there is a much higher chance of something unexpected happening during it, or of your estimate being a little inaccurate. Forcing you to pick between 21 and 34 stops you from quibbling over the difference between a 27 and a 28 when it really makes little difference at that scale.
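As a rough illustration of how a coarse scale removes false precision, the sketch below snaps a gut-feel number onto a Fibonacci-style scale. The `snap_to_scale` helper is hypothetical, and rounding up (rather than to the nearest value) is an assumption made here to err on the side of caution; a team might equally settle the question by discussion.

```python
# Hypothetical scale taken from the Fibonacci values listed above.
FIBONACCI_SCALE = [1, 2, 3, 5, 8, 13, 21, 34]


def snap_to_scale(raw_estimate, scale=FIBONACCI_SCALE):
    """Round a gut-feel number up to the next value on the scale."""
    for value in scale:
        if raw_estimate <= value:
            return value
    # Anything beyond the scale gets the largest value; in practice
    # a story this big is a candidate for splitting.
    return scale[-1]


# A "27" and a "28" land on the same value, so there is nothing to argue about.
print(snap_to_scale(27), snap_to_scale(28))
```

The gaps between adjacent values grow as the numbers get larger, which mirrors the growing uncertainty of large estimates.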

This uncertainty in large numbers discourages the team from actually working on stories with those large values, because the actual work involved can vary more widely across stories with the same numerical value. This drives the team to split large stories down into smaller ones that can be assigned values at the lower, more precise end of the scale, which is beneficial because it encourages the team to structure their work into smaller, more manageable tasks. Remembering our earlier discussion on user stories (part one, part two), we see that splitting a story in this way allows us to start giving the customer something useful sooner than if we tried to do it all as one larger block of work, as well as giving the team more opportunities to pivot and avoid wasting work if priorities change or the task turns out to be more difficult than expected.

So What's The Point?

Hopefully now you'll have a clearer picture of what story points are meant to do, and what they aren't. It should also be clear that they're not for everyone.

If you need to be able to predict precisely what will be delivered, and when, at timescales longer than one or two sprints ahead, then story points aren't really the right tool. In theory, you can use your velocity and the total number of story points in your backlog to estimate how long it will take to reach a certain point, but when doing so, you should remember that both the story points and the things they're measuring will change as time goes by, so any estimation conducted in this way will be increasingly inaccurate the further into the future you look.
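The velocity-based projection mentioned above might look something like the following sketch. The `sprints_remaining` helper and the sample numbers are invented for illustration. Reporting a range from the best and worst recent sprints, rather than a single figure, is one assumed way of keeping the uncertainty visible; it does not account for the drift in point values over time, so the further out the projection reaches, the less it should be trusted.

```python
import math


def sprints_remaining(backlog_points, recent_velocities):
    """Return (optimistic, pessimistic) sprint counts for a backlog.

    backlog_points: total story points left in the backlog.
    recent_velocities: points completed in each of the last few sprints.
    """
    if not recent_velocities or min(recent_velocities) <= 0:
        raise ValueError("need at least one positive velocity")
    # Best case uses the fastest recent sprint, worst case the slowest.
    optimistic = math.ceil(backlog_points / max(recent_velocities))
    pessimistic = math.ceil(backlog_points / min(recent_velocities))
    return optimistic, pessimistic


# Hypothetical figures: 120 points left, last three sprints at 20, 24 and 30.
best, worst = sprints_remaining(120, [20, 24, 30])
print(f"Somewhere between {best} and {worst} sprints")
```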

If you need to be able to move work (or workers) around, reassigning tasks to different teams, forming teams on a more ad-hoc basis, then story points aren't going to work very well for you. Each time a team changes composition, that team's idea of what points mean will change, and each time a task is given to a new team, that team will have a different idea of how many points it's worth. This will make your estimates very inaccurate, and lead to frequent occurrences of teams taking on more work than they can handle, or finding themselves idle and having to pick up additional work, making it hard to predict when things will be finished.

However, if you have relatively stable teams, who tend to own one or more areas of your system or spend most of their time working on generally similar things, then properly using story points to estimate your story size will give you a more reliable measure than hours and will give you a process that has built-in measures to keep your teams happy and focused on delivering value to the customer.
