How can we improve DevOps and strengthen our operations?

Posted by D2i Team on March 29, 2022

There is a famous saying: “You can’t improve what you don’t measure.”

So, it all starts with measurement. We all need to understand where we’re starting from and where we want to go.

Teams need to make data-driven decisions in order to continuously improve their practices, understand their strengths and weaknesses, deliver software faster, and improve their DevOps capabilities.

That is exactly why, after years of research, Google’s DevOps Research and Assessment (DORA) team identified four key metrics.

In DevOps, DORA metrics have become the standard for teams aspiring to optimize their performance and achieve the DevOps ideals of speed and stability.

Let’s see who DORA is, what the four DORA metrics are, and the pros and cons of using them.

What is DORA?

The DevOps Research and Assessment (DORA) team is a research program that was acquired by Google in 2018. Over a seven-year program, this research group analyzed DevOps practices and capabilities and identified four key metrics for measuring software development and delivery performance. DORA uses a data-driven approach to deliver best practices in DevOps, with an emphasis on helping organizations develop and deliver software better and faster. These metrics revolutionized the way DevOps teams operate: they create visibility and deliver actual data that can be used as a basis for improvements and decision-making.

Let’s learn more about the four DORA metrics and why they are so useful in value stream management.

Deployment Frequency

Deployment Frequency measures how often a company successfully deploys or releases a particular application to production.

Every team defines success differently, so deployment frequency can measure a range of things, such as how frequently code is deployed to production or how frequently it is released to end users. Regardless of what this metric measures for a team, the aim should be continuous deployment, with multiple deployments per day.
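As a rough illustration, deployment frequency can be computed from a list of deployment timestamps. The sketch below assumes you can export the times of successful production deployments from your CI/CD tool; the function names and the sample data are hypothetical, not part of any DORA tooling.

```python
from collections import Counter
from datetime import datetime


def deployments_per_day(deploy_times):
    """Count successful production deployments per calendar day."""
    return Counter(t.date() for t in deploy_times)


def average_daily_frequency(deploy_times, period_days):
    """Average number of deployments per day over a reporting period."""
    if period_days <= 0:
        raise ValueError("period_days must be positive")
    return len(deploy_times) / period_days


# Hypothetical sample data: three successful deployments over two days.
deploys = [
    datetime(2022, 3, 28, 9, 15),
    datetime(2022, 3, 28, 16, 40),
    datetime(2022, 3, 29, 11, 5),
]

per_day = deployments_per_day(deploys)
avg = average_daily_frequency(deploys, period_days=2)
print(dict(per_day), avg)
```

Whether a "deployment" means a push to production or a release to end users is a team decision; the calculation stays the same as long as the definition is applied consistently.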

Change Lead Time

Change lead time measures the time it takes for a change to be deployed to production and thus delivered to the customer. Lead time helps you understand the efficiency of your development process. Long lead times indicate an inefficiency in the process or a bottleneck somewhere in the development or deployment pipeline.

The most common way of measuring lead time is the total time between the first commit of code for a given issue and the time of deployment. A more comprehensive method is to measure from the time an issue is selected for development to the time of deployment.
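The commit-to-deployment measurement described above can be sketched as follows. This is a minimal example that assumes each change is recorded as a pair of timestamps (first commit, deployment); the data is made up, and using the median rather than the mean is a choice made here so that a single slow change does not dominate the result.

```python
from datetime import datetime
from statistics import median


def lead_time_hours(first_commit_at, deployed_at):
    """Lead time for one change, in hours, from first commit to deployment."""
    return (deployed_at - first_commit_at).total_seconds() / 3600


# Hypothetical changes: (first commit time, production deployment time).
changes = [
    (datetime(2022, 3, 21, 10, 0), datetime(2022, 3, 21, 14, 0)),
    (datetime(2022, 3, 22, 9, 0), datetime(2022, 3, 23, 11, 0)),
    (datetime(2022, 3, 24, 8, 0), datetime(2022, 3, 26, 10, 0)),
]

lead_times = [lead_time_hours(commit, deploy) for commit, deploy in changes]
median_lead_time = median(lead_times)
print(lead_times, median_lead_time)
```

To use the more comprehensive method instead, replace the first timestamp in each pair with the time the issue was selected for development.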

Change Failure Rate

A team’s change failure rate refers to how frequently their changes lead to failures in production.

It measures the rate at which production changes result in rollbacks, failed deployments, and incidents that need quick fixes.

This helps you gauge the quality of the code you are pushing to production. The lower the change failure rate, the better the team’s performance: higher-performing teams have a change failure rate of 0-15%. The team’s aim should be to decrease the change failure rate over time as skills and processes improve.
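Expressed as arithmetic, the change failure rate is the number of failed changes divided by the total number of changes deployed. The sketch below assumes you already count deployments and classify some of them as failures (rollbacks, failed deployments, or incidents that needed a fix); the function name and the counts are illustrative only.

```python
def change_failure_rate(total_deployments, failed_deployments):
    """Fraction of production deployments that led to a failure."""
    if total_deployments <= 0:
        raise ValueError("total_deployments must be positive")
    if not 0 <= failed_deployments <= total_deployments:
        raise ValueError("failed_deployments must be between 0 and total")
    return failed_deployments / total_deployments


# Hypothetical month: 40 deployments, 3 of which caused a failure.
rate = change_failure_rate(total_deployments=40, failed_deployments=3)

# The 0-15% band is the range the article cites for higher-performing teams.
in_high_performer_band = rate <= 0.15
print(f"{rate:.1%}", in_high_performer_band)
```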

Mean Time to Recovery (MTTR)

MTTR is the average amount of time it takes your team to resolve incidents and failures in production when they do happen. This metric helps you check the stability of your software, as well as the agility of your team in the face of a challenge. The goal of optimizing MTTR is to restore service as quickly as possible (with a low mean time to recover) and, over time, build out the systems to detect, diagnose, and correct problems when they inevitably occur.
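As a simple illustration, MTTR is the average of the recovery durations across incidents. The sketch below assumes each incident is recorded as a pair of timestamps (when the failure started, when service was restored); the names and sample data are hypothetical.

```python
from datetime import datetime, timedelta


def mean_time_to_recovery(incidents):
    """Average time from failure start to service restoration."""
    if not incidents:
        raise ValueError("at least one incident is required")
    durations = [restored - started for started, restored in incidents]
    # sum() needs a timedelta start value to add timedeltas together.
    return sum(durations, timedelta()) / len(durations)


# Hypothetical incidents: (failure started, service restored).
incidents = [
    (datetime(2022, 3, 10, 14, 0), datetime(2022, 3, 10, 14, 30)),
    (datetime(2022, 3, 18, 2, 0), datetime(2022, 3, 18, 3, 30)),
]

mttr = mean_time_to_recovery(incidents)
print(mttr)
```

A mean is sensitive to one long outage, so it can be worth reviewing the individual durations alongside the average.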