Here are some useful things for agile software development teams to measure in order to understand how effective they are being (and how well their current project or initiative is tracking).
In terms of specific measures pertinent to the delivery of high-quality working software that is valuable to the customer and the business (which, at the end of the day, should be the goal of any software development team, agile or otherwise), it is useful to differentiate between “process” metrics and “output” metrics.
Process metrics are those which the team is directly in control of, i.e. they form part of the team’s process (how they work), rather than an outcome of their process. In other words, a team’s process, and its corresponding metrics, can be used as leading indicators of delivery outcomes.
Here are some examples:
If production code (i.e. fully tested, integrated, working software) is not committed and deployed frequently, there is a higher risk of conflicts, bugs and deployment issues.
If the evolving product is not put in real users’ hands frequently, there is a higher risk of building the wrong thing due to reduced feedback, big changes/surprises for users and lots of big bang rework.
If there is a loose (or no) agreement on when a story is ready to be worked on by the development team, there is a higher risk that there is not a shared understanding with the customer about what should be built, and thus the wrong thing might be built, or crucial business rules/acceptance criteria missed.
If there is a loose (or no) agreement on the steps in a team’s process through to when a story is considered “done”, there is a higher risk that the scope of the story will be wrong, or expand during development, or crucial quality checkpoints (such as code review, refactoring or documentation) will be missed.
If the backlog is treated as a queue rather than a list of options, a large backlog is a leading indicator of slow delivery (long lead times) for new ideas and overall objectives.
An increase in WIP is a leading indicator for a possible slowdown in cycle time and/or reduction in quality (if the WIP puts the team over their sustainable capacity). High WIP causes costly context switching and poor flow. See Little’s Law.
If developers are not doing TDD, and thus not writing tests before production code, there is a risk that not all production code will be covered by tests, which increases the chance of developers unwittingly committing breaking changes and of defects escaping to customers.
If the team is not writing executable specifications collaboratively with business/customer reps, there is a higher risk that the delivered solution will not do what the customer wants or needs it to do (see the sketch below).
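To make those last two points concrete, here is a minimal sketch in Python with pytest (the `order_total` function and the discount rule are made up for illustration) of an acceptance criterion agreed with the customer and captured as an executable test before the production code exists:

```python
# A hypothetical acceptance criterion ("orders of $100 or more get a 10% discount")
# expressed as a test that is written before the production code.
import pytest

from pricing import order_total  # hypothetical module under test


@pytest.mark.parametrize(
    "subtotal, expected",
    [
        (99.99, 99.99),    # below the threshold: no discount
        (100.00, 90.00),   # at the threshold: 10% off
        (250.00, 225.00),  # above the threshold: 10% off
    ],
)
def test_bulk_discount(subtotal, expected):
    # Given a customer's order subtotal
    # When the order total is calculated
    # Then the agreed discount rule is applied
    assert order_total(subtotal) == pytest.approx(expected)
```

Because the test exists first, any code committed without satisfying the agreed behaviour fails fast in the build rather than surprising the customer later.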
Output metrics are those which are outputs, or outcomes, of the team’s process, i.e. the team cannot directly control them, only try to influence them in a desired direction via changes to their process.
As such, output metrics are lagging indicators, which is one of the reasons why releasing working software to customers frequently is so important (it reduces the lag, giving the team a far better understanding of whether they are delivering the right things, and building them in the right way).
Arguably the most important indicator of how successful a development team is being is that customers are frequently using the delivered product or service and deriving value from it (i.e. it is meeting their needs). So teams need to talk to customers regularly and measure usage and the value customers are getting in order to understand whether they are serving them well.
A large number of defects, particularly high severity ones found by/impacting customers, often points to poor quality, as well as reduced capacity of the team to deliver value.
Measuring the total number of defects, along with defect rates (e.g. how many new ones are being raised, how much time the team is spending on fixing them, etc.) helps the team to understand if there is a general problem with quality in the system, and whether the problem is improving.
Firefighting activity, like defects, often points to poor quality, as well as reduced capacity of the team to deliver value.
Teams should seek to understand the amount of capacity spent on these incidents, and address the root causes of the fires so they can prevent them happening again (reliability / resilience), or reduce the impact if they do happen again (recoverability).
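As a rough illustration, assuming the team records each defect or incident along with the time spent on it (the figures below are made up), a sketch like this surfaces the arrival rate of new defects and the share of capacity lost to firefighting:

```python
from datetime import date

# Hypothetical log of defect/incident work: (date raised, hours spent fixing)
defect_log = [
    (date(2024, 3, 4), 6.0),
    (date(2024, 3, 6), 2.5),
    (date(2024, 3, 11), 8.0),
]

weeks_elapsed = 2
team_capacity_hours = weeks_elapsed * 5 * 6 * 4  # e.g. 4 people, ~6 focused hours/day

defect_arrival_rate = len(defect_log) / weeks_elapsed         # new defects per week
hours_on_defects = sum(hours for _, hours in defect_log)
capacity_on_defects = hours_on_defects / team_capacity_hours  # share of capacity lost

print(f"{defect_arrival_rate:.1f} new defects/week, "
      f"{capacity_on_defects:.0%} of capacity spent on defects/firefighting")
```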
Throughput is a useful measure for the team to understand how well deliverables are flowing through their process into production, and (in conjunction with variance) to forecast (and share with stakeholders) possible release outcomes.
Throughput (stories per week) = Stories “Done” / Weeks elapsed
It should be borne in mind that throughput should represent the delivery of actual business value as closely as possible. Until working capabilities are in the customer’s hands, any throughput measure is only a proxy for real throughput.
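For illustration, a minimal throughput calculation might look like this, assuming the team records a “Done” date per story (the dates are made up):

```python
from datetime import date

# Hypothetical "Done" dates for stories delivered to production
done_dates = [date(2024, 3, 1), date(2024, 3, 5), date(2024, 3, 6),
              date(2024, 3, 12), date(2024, 3, 14)]

weeks_elapsed = 3
throughput_per_week = len(done_dates) / weeks_elapsed  # stories "Done" per week
print(f"Throughput: {throughput_per_week:.1f} stories/week")
```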
Throughput variance is a useful measure for the team to understand:
Is work being sliced into small enough deliverables to reduce risk and deliver value early and often? Is the time it takes to deliver stories predictable (within an acceptable range)?
It also makes uncertainty transparent, and reminds folks that forecasting is about identifying multiple possible outcomes, not only one.
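One simple way to surface both the variance and the range of possible outcomes is to resample the team’s recent weekly throughput, as in this sketch (the weekly figures, and the 30 stories assumed to remain, are made up):

```python
import random
import statistics

# Hypothetical stories "Done" per week over recent weeks
weekly_throughput = [3, 5, 2, 4, 6, 3, 4, 5]

mean = statistics.mean(weekly_throughput)
stdev = statistics.stdev(weekly_throughput)
print(f"Throughput: mean {mean:.1f}, stdev {stdev:.1f} stories/week")

# Simple Monte Carlo forecast: how many weeks might 30 remaining stories take?
def weeks_to_finish(remaining=30):
    weeks = 0
    while remaining > 0:
        remaining -= random.choice(weekly_throughput)  # resample past weeks
        weeks += 1
    return weeks

runs = sorted(weeks_to_finish() for _ in range(10_000))
print(f"50th percentile: {runs[len(runs) // 2]} weeks, "
      f"85th percentile: {runs[int(len(runs) * 0.85)]} weeks")
```

Sharing a range (e.g. the 50th and 85th percentiles) rather than a single date keeps the uncertainty visible to stakeholders.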
Cycle time is helpful for understanding how long a story typically takes to deliver from start to finish.
It is useful to measure real cycle time per story (easily done using sticky dots at daily stand-up) or calculate averages using WIP and throughput (Little’s Law).
Unlike effort, cycle time incorporates wait times, making it handy for finding unnecessary delays in the process.
Cycle time (days) = “Done” date - Start date
Average cycle time (days) = Average WIP / Daily throughput (per Little’s Law), i.e. Average WIP * (Weeks elapsed * 5) / Stories “Done”
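A small sketch of both approaches (measuring real cycle time per story, and estimating the average via Little’s Law), using made-up dates and an assumed average WIP:

```python
from datetime import date

# Hypothetical start and "Done" dates per story
stories = [
    (date(2024, 3, 4), date(2024, 3, 8)),
    (date(2024, 3, 5), date(2024, 3, 11)),
    (date(2024, 3, 7), date(2024, 3, 13)),
    (date(2024, 3, 11), date(2024, 3, 15)),
]

# Cycle time per story: "Done" date minus start date
cycle_times = [(done - start).days for start, done in stories]
print("Cycle times (days):", cycle_times)  # [4, 6, 6, 4]

# Average cycle time via Little's Law: average WIP / daily throughput
average_wip = 2                                         # observed from the board (assumed)
weeks_elapsed = 2
daily_throughput = len(stories) / (weeks_elapsed * 5)   # 0.4 stories per working day
print("Average cycle time (days):", average_wip / daily_throughput)  # 5.0
```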
A slow lead time means that the overall speed to market (concept to cash) is slow, even if the team’s delivery rate (cycle time) is fast. This is typically caused by long queues (large backlogs).
Average lead time (days) = (WIP + Queue size) / Daily throughput
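For example, applying that formula with made-up numbers:

```python
# Average lead time via Little's Law, treating the backlog as a queue in front of WIP.
queue_size = 40          # stories waiting in the backlog (hypothetical)
wip = 4                  # stories currently in progress (hypothetical)
daily_throughput = 0.4   # stories "Done" per working day, i.e. 2 per week (hypothetical)

average_lead_time_days = (wip + queue_size) / daily_throughput
print(f"A new idea added today would take roughly {average_lead_time_days:.0f} "
      f"working days to reach production")
```

Even with a fast cycle time, a long queue in front of the team dominates the lead time, which is why large backlogs slow the delivery of new ideas.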
If the time between stories being delivered is long or volatile, it is an indicator that the team has poor flow.
Takt time = Date/time story delivered - Date/time previous story delivered
Average takt time (days) = Number of days (sprint length) / Stories “Done”
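A small sketch of both calculations, using made-up delivery dates and a ten-working-day sprint:

```python
from datetime import date

# Hypothetical delivery dates, in the order stories reached production
delivery_dates = [date(2024, 3, 1), date(2024, 3, 5),
                  date(2024, 3, 6), date(2024, 3, 14)]

# Takt time per story: gap since the previous delivery
takt_times = [(b - a).days for a, b in zip(delivery_dates, delivery_dates[1:])]
print("Takt times (days):", takt_times)  # [4, 1, 8] - volatile gaps suggest poor flow

# Average takt time over a sprint
sprint_length_days = 10
stories_done = len(delivery_dates)
print("Average takt time (days):", sprint_length_days / stories_done)  # 2.5
```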
Missing or defective requirements cause disruption, re-work and delay in development. Measure how often stories have acceptance criteria changed, or added to, during development.
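One rough way to track this, assuming the team logs each acceptance-criteria change made after development on a story has started (the story IDs and counts below are made up):

```python
from collections import Counter

# Hypothetical event log: one entry per acceptance-criteria change
# made to a story after development on it started.
ac_change_events = ["STORY-12", "STORY-12", "STORY-15", "STORY-21", "STORY-12"]

stories_in_period = 8
changes_per_story = Counter(ac_change_events)
churn_rate = len(changes_per_story) / stories_in_period  # share of stories with AC churn

print("Acceptance-criteria changes per story:", dict(changes_per_story))
print(f"{churn_rate:.0%} of stories had criteria changed or added during development")
```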
What are some other helpful metrics you have used in your agile software development teams?
Thanks for reading! If you are looking for help with your software or product delivery, I provide agile coaching, public training (both theory and practical) up to executive management level, and more. As well as public events, I can also run training internally in your organisation for a massively reduced cost, so please ✍ get in touch.