3 Models for DevOps. Which one is best?
Uptime doesn't just happen; it has to be engineered. Uptime is the product of a properly functioning partnership between engineering and I.T. operations. What good is great ping, power, and pipe if the software stack crashes? And what good is great software if the infrastructure isn't reliable?
I'm not the only one who recognizes this. The industry is buzzing about DevOps. Like many of our industry buzzwords, it has a different definition depending on who you are talking to. Here are the three main models I have encountered.
The Silo Model
In the Silo Model you see an I.T. operations team built out to deal with the operations for the engineering group. They are a distinct team from the engineers themselves, have their own management structure, and usually end up reporting into an engineering executive. The organization has its own set of metrics, resources, and political constraints that it operates by.
The Embedded Model
Another approach is to embed I.T. ops people with an engineering team. They sit with the engineers, attend their meetings, and eat lunch or drink beer with them. They are generally considered part of the engineering team. Their goals are aligned because they are all part of the same team with the same manager.
The One-Man-Band Model
At the bottom end of the DevOps spectrum is the One-Man-Band Model. There are a handful of engineers and they are responsible for an application: both for the code and the infrastructure to run it. This used to be a difficult model to staff since you needed a mix of hardware and software expertise. But in the cloud the hardware layer is disappearing and everything is a programmable resource. There isn't much distinction between programming your app logic and your infrastructure.
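To make "everything is a programmable resource" a little more concrete, here is a minimal sketch in Python of what treating infrastructure like app logic can look like. It is illustrative only: ServerSpec, plan_changes, and the role names are hypothetical names I made up for this example, and no real cloud provider API is called. The idea is simply that the desired infrastructure is declared as data, and ordinary code works out what needs to change.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerSpec:
    """Hypothetical description of one class of server the app needs."""
    role: str
    count: int


def plan_changes(desired, running):
    """Return servers to add (positive) or remove (negative) per role.

    `desired` is a list of ServerSpec; `running` maps role -> current count.
    Roles that already match are left out of the result.
    """
    changes = {}
    for spec in desired:
        delta = spec.count - running.get(spec.role, 0)
        if delta != 0:
            changes[spec.role] = delta
    # Anything running that is no longer declared should be removed.
    desired_roles = {spec.role for spec in desired}
    for role, count in running.items():
        if role not in desired_roles and count > 0:
            changes[role] = -count
    return changes


# Assumed example data: what the app wants versus what is running now.
desired = [ServerSpec("web", 3), ServerSpec("worker", 2)]
running = {"web": 1, "worker": 2, "cache": 1}
plan = plan_changes(desired, running)
print(plan)
```

In a real setup the plan would be handed to whatever provisioning tool or cloud API the team uses; the point here is only that the same engineers, using the same skills, can reason about both the application and the servers it runs on.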
So which model is right for you?
Having had the opportunity to lead engineering teams in all three models, I have developed some favorites depending on the organization type.
Small organizations and startups often have no choice but to go with the One-Man-Band Model. It typically works really well for them. With a handful of people responsible for engineering uptime, they can usually achieve what is necessary for the organization to grow. The one main downside to this model is that the organization is deeply reliant on a few people.
As the organization grows, the One-Man-Band Model becomes a bottleneck and a risk, so it is time to scale up. That is when you move to the Embedded Model. The I.T. Ops people become domain experts on the applications that their engineering team is creating.
As the organization continues to scale beyond a few I.T. Ops people, the temptation will be to create a dedicated I.T. Ops team to handle operations across all engineering teams. When this happens, though, it is easy to end up with generalists who don't really understand the applications they are supporting. But I have an alternate model for you to consider that helps combat this effect.
Create a virtual I.T. operations team and leave the team members embedded with the engineering teams. Establish an I.T. Ops management structure as necessary, and have your I.T. Ops people report to both an engineering manager and the I.T. Ops manager for goal alignment. As necessary, staff dedicated I.T. Ops positions, like architects and system engineers, on the I.T. Ops team. The role of these dedicated I.T. Ops positions is to create the frameworks and standards that the embedded I.T. Ops people need to support their applications, and to create the uniformity that ensures smooth operations.