Opinion

Setting Up DevOps Teams for Success

June 23, 2022 | 6 min read

Summary:

The gap between top and low-performing engineering teams is dramatic, whatever angle you look at it from. Whether you analyze their tech stacks and architecture choices, performance metrics, or cultural and team elements, the delta tends to be quite impressive. Despite the wide range of approaches, a few indicators paint a clear picture of the typical characteristics of a well-oiled setup.

The root cause becomes visible when we zoom in on the cultural aspects. How do teams manage DevOps? How is ownership distributed and thought about? This is where the data is most definitive and where teams that want to drive change can actually start. 100 of 100 developers in high-performing teams report that they run a true “you build it, you run it” setup, one so streamlined and optimized that it doesn’t get in their way through overly complex design or restrictive abstractions. In comparison, only 3.1% of low-performing teams report this. In 52% of cases, low performers are still stuck in a “throw over the fence” operations model. In the remaining 44.9% of cases, developers are left alone and a “shadow operations” model emerges, where senior developers take over the role of operations to help less experienced developers, creating all sorts of inefficiencies.

Tech Stack & Architecture:

Application architecture

Top performers run loosely coupled architectures in about 95% of their apps.
Low performers run almost twice as many monolithic applications.

Infrastructure

Top performers have chosen public cloud as their dominant approach.
Low performers rely more heavily on on-prem infrastructure.

Containerization

Across all teams, 82% are either already fully on containers, currently migrating, or currently starting a migration.
18% are not planning to ever start a migration. More than twice as many low performers report that they are never going to migrate; aging ecosystems and serverless are the remaining options.
Kubernetes is the new legacy and solution of choice for more than 62% of all teams.

Configuration as Code

The 40% of low performers that do not use configuration as code will see skyrocketing amounts of work trying to roll back, follow good governance, meet audit requirements, etc.
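
To illustrate why configuration as code makes rollbacks and audits cheaper, here is a minimal sketch in Python. It is not taken from the report: the `ConfigHistory` class, its method names, and the example settings are all hypothetical, and a real setup would more likely keep configuration files in a version control system. The point is only that once every configuration change is a recorded revision, rolling back means re-applying an earlier revision, and the audit trail comes for free.

```python
import hashlib
import json


class ConfigHistory:
    """Hypothetical in-memory stand-in for a version-controlled config repo."""

    def __init__(self):
        self._revisions = []  # list of (revision_id, config dict), oldest first

    def commit(self, config):
        """Record a new revision; the ID is derived from the config content."""
        serialized = json.dumps(config, sort_keys=True)
        revision_id = hashlib.sha256(serialized.encode()).hexdigest()[:8]
        # Store a copy so later mutation of `config` cannot alter history.
        self._revisions.append((revision_id, json.loads(serialized)))
        return revision_id

    def current(self):
        """Return the most recently committed configuration."""
        return self._revisions[-1][1]

    def rollback(self):
        """Re-commit the previous revision, so the rollback itself is recorded."""
        if len(self._revisions) < 2:
            raise ValueError("no earlier revision to roll back to")
        _, previous = self._revisions[-2]
        return self.commit(previous)

    def audit_log(self):
        """Return every revision ID in the order it was applied."""
        return [revision_id for revision_id, _ in self._revisions]


# Example values are made up for illustration.
history = ConfigHistory()
first = history.commit({"image": "app:1.0", "replicas": 2})
second = history.commit({"image": "app:1.1", "replicas": 2})
rolled_back = history.rollback()
```

After the rollback, `history.current()` is the first configuration again, and `history.audit_log()` holds three entries, the last of which has the same ID as the first because the content is identical.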

Infrastructure as Code – article

Performance Metrics

Deployment Frequency: if your environment management isn’t dynamic, you’re on a monolith, or your team composition and process flow is buggy, deploying and shipping software fast gets very hard.

High performers enable their developers to deploy to production in more than 50% of cases. Over 80% of top performers deploy at least several times per day.
22% of low performers say they deploy only “a few times per year.”

Lead Time: The time it takes to implement, test, and deliver code.

For more than 20% of low performers, it takes longer than one month to deliver their code through all stages.
It takes minutes for over 50% of top performers, and almost none take more than a week. This means the individual code changes applied with any given deployment are likely to be significantly smaller, which makes them easier for colleagues to review and in turn lowers Change Failure Rate.

Mean Time to Recovery (MTTR): how long it takes you to get everything fully operational again after a product or system failure.

Low performers not running on IaC or Config as Code take much longer to fix things if they go sideways.

Change Failure Rate: signals how much “faulty” code makes its way to production with any given deployment
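
The four metrics above can be computed from a simple log of deployments. The following Python sketch is an illustration under stated assumptions, not the report's methodology: the `Deployment` record and its field names are hypothetical, lead time is summarized with a median (a mean would also be defensible), and recovery time is measured from the failing deployment to the restore.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did it cause a production failure?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def deployment_frequency(deploys, days):
    """Average number of production deployments per day over the window."""
    return len(deploys) / days


def median_lead_time(deploys):
    """Median time from commit to production."""
    return median(d.deployed_at - d.committed_at for d in deploys)


def change_failure_rate(deploys):
    """Share of deployments that caused a failure in production."""
    return sum(d.failed for d in deploys) / len(deploys)


def mean_time_to_recovery(deploys):
    """Mean time from a failing deployment to restored service, or None."""
    recoveries = [
        d.restored_at - d.deployed_at
        for d in deploys
        if d.failed and d.restored_at is not None
    ]
    if not recoveries:
        return None
    return sum(recoveries, timedelta()) / len(recoveries)


# Made-up sample: four deployments over a two-day window, one of which failed.
t0 = datetime(2022, 6, 1, 9, 0)
day = timedelta(days=1)
deploys = [
    Deployment(t0, t0 + timedelta(minutes=10)),
    Deployment(t0 + timedelta(hours=1), t0 + timedelta(hours=1, minutes=20)),
    Deployment(
        t0 + day,
        t0 + day + timedelta(minutes=30),
        failed=True,
        restored_at=t0 + day + timedelta(minutes=75),
    ),
    Deployment(t0 + day + timedelta(hours=3), t0 + day + timedelta(hours=3, minutes=40)),
]
```

For this sample, the deployment frequency is 2 per day, the median lead time is 25 minutes, the change failure rate is 0.25, and the mean time to recovery is 45 minutes.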

Team Setup and Culture

Only 21.2% of teams report that they can do all DevOps tasks on their own.
In 44.6% of cases, they are supposed to, but the setup is so complicated that in reality only experienced developers can work on DevOps tasks, and they become a bottleneck for the team.
In 34.2% of cases, the reality is a “throw over the fence” split like 20 years ago.
96.6% of top performers report investing heavily in improving their developer experience and consider it a top priority, including investing in self-service capabilities.

Aaron Erickson, who built Salesforce’s Internal Developer Platform: “Service ownership is a good idea in theory, but in practice people get confused. If developers have to run all the ops for their services, you do not have any economies of scale. To run 1,000 different services around Kubernetes, you shouldn’t need 1,000 Kubernetes experts to do that.”

Optimizing for Cognitive Load

Cognitive load refers to the amount of working memory resources in use (here, by developers).
While developers are becoming IaC wizards, they fall back on their area of specialization.
Overcommunication is the only way to find the balance between giving dev teams self-service capabilities and giving them abstraction from their dev-based roles.

Golden Paths over Golden Cages

Golden path – about abstracting without abstracting

The most commonly used term for Golden Path-style self-service setups is Internal Developer Platform. An Internal Developer Platform, or IDP, is a self-service layer that allows developers to interact independently with their organization’s delivery setup, enabling them to self-serve environments, deployments, databases, logs, and anything else they need to run their applications.

You have to treat developers as users, iterate with them, and explain why the golden path makes sense and why they should use it for the good of everybody (“win the hearts and minds of developers”).

Winning DevOps with a Self-Service Setup Built by Platform Teams

The mission: to build the tools that enable developers to ship scalable applications with high speed, quality, and performance

The platform team is not to be seen as some sort of extension of the SRE or Ops teams, but rather as its own product team, serving customers (app developers) within your organization.

Internal balance

Successful Internal Platform teams manage to put in place strong guardrails and standards for their development teams without taking away too much of their autonomy.

Key areas top-performing Internal Platform teams focus on:

Treat your platform as a product: platform teams need to be driven by a product mindset and focus on what provides real value for their internal customers, the app developers, based on their feedback.
Optimize iteration speed: Your developers will be able to consistently ship more features and products to your customers while being confident that things won’t break.
Solve common problems: start by understanding developer pain points and friction areas that cause slowdowns in development.
Be glue, my friend: Platform teams need to define a golden path for their developers: a reduced set of sane, proven choices of tools that get the job done and allow you to build, deploy, and operate your services. The main value you create as an Internal Platform team is to be the sticky glue that brings all the tools together and ensures a smooth development and deployment experience for your engineers.
Educate and empower your teams: Foster regular architectural design reviews. Share knowledge, experiences, and collectively define best practices. Ensure engineers have the right tools in place to validate and check for common pitfalls. Organize a hackathon.
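
One way to picture a golden path as "a reduced set of sane, proven choices" is as a small catalog that service definitions are checked against. The Python sketch below is purely illustrative: the `GOLDEN_PATH` catalog, its fields, and the listed tools are hypothetical examples, not recommendations from the report. Note that the check reports deviations rather than blocking them, which is one possible way to keep a golden path from turning into a golden cage.

```python
# Hypothetical catalog of supported choices; the entries are examples only.
GOLDEN_PATH = {
    "runtime": {"python3.10", "node18", "go1.19"},
    "database": {"postgres", "none"},
    "deploy_target": {"kubernetes"},
}


def check_golden_path(spec):
    """Return a list of deviations; an empty list means the spec is on the path."""
    deviations = []
    for field, allowed in GOLDEN_PATH.items():
        value = spec.get(field)
        if value not in allowed:
            deviations.append(
                f"{field}: {value!r} is not one of {sorted(allowed)}"
            )
    return deviations


on_path = {"runtime": "python3.10", "database": "postgres", "deploy_target": "kubernetes"}
off_path = {"runtime": "python3.10", "database": "mongodb", "deploy_target": "kubernetes"}

on_path_result = check_golden_path(on_path)
off_path_result = check_golden_path(off_path)
```

Here `on_path_result` is an empty list, while `off_path_result` contains a single message about the `database` field. A platform team could surface such messages as hints in a review or a self-service UI and use them as feedback on which choices the path is missing.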

