As a Chief Engineer at Odecee, part of my role is to define and nurture our technology roadmap. A significant component of my research involves simply spending time with our team on client engagements. We have a strong team of passionate individuals who provide me with many new ideas to consider; by spending time with them, I aim to understand what they are working on, what challenges they are facing, and the tools and techniques they would like to use in future.
One of the key outcomes of this exercise has been the recognition that the time between idea conception and deployment to production is often too long. It is common to see systems released to production several times a year at most. Rather than one release every three months, teams should be striving for a release every week, every day, or simply as soon as new features are ready. The goal here is to build a continuous pipeline of delivery into production that minimises the time between idea conception and production implementation.
The benefits this can deliver are numerous:
- Faster time to market leads to a greater ability to compete in the marketplace.
- Fast turnaround on change allows for a more scientific approach, where marketing and UX hypotheses are formed, implemented, tested in the market, and refined in a highly iterative process.
- Releasing small changes to production results in a significantly reduced chance of major failure.
- With limited scope of change and regular releases, broken updates can be quickly rolled back, or resolved in subsequent deployments.
- Business units will understand that features don’t have to be maximised in every release, because the next release is no more than hours or days away.
- Trust between business and technology is increased due to the continual demonstration of progress.
- The lines between technology and business disappear when interaction becomes continuous instead of periodic in a highly formalised delivery process.
- Team morale is dramatically improved because there is constant positive feedback for effort performed, and there is a steady level of effort required at all times instead of facing end-of-release panic periods.
So, why don’t more organisations do this? There is no single answer, but there are a number of factors that contribute. Implementing such a system of processes requires a greater degree of change than I can influence through the technology roadmap alone. We can continue to optimise the technology and techniques we use in building software, but the largest gains will only be realised if we take a broader approach. I have chosen several areas that are topical today to explore this further. These are ideas I believe we should be utilising on client engagements, but that are difficult for our clients to introduce due to the level of organisational change required to accommodate them. I will describe each of these, the benefits it aims to deliver, the constraints currently in place, and what we might do to resolve them.
Traditional IT utilises siloed teams, each performing a specific function. These functions include architecture, business analysis, development, testing and operations. None of these teams has direct accountability for business outcomes; instead, accountability is assumed by a cross-team management layer that must carefully coordinate communication and work sequencing. In such a system, there is minimal team empowerment; with highly centralised decision making, a significant amount of time is spent in queues where one team is waiting for another team to perform a task. Many activities fall through the cracks between official team responsibilities, and trust between teams is low.
A better approach is to use cross-functional teams, where a team is assembled with a small number of individuals from each discipline working together to deliver business features. Teams should be divided along business boundaries, not technical boundaries. This maximises their knowledge of the domain they’re operating in, making them more effective in achieving business outcomes. These teams operate with a high degree of autonomy and are empowered to make their own decisions, helping to achieve higher productivity and motivation. This is not a radical idea; it can be seen in the Lean Manufacturing and Kaizen techniques that originated at Toyota in the mid-twentieth century. The concept of ‘feature teams’ has existed for some time, but it doesn’t go far enough. Feature teams typically have skills in business analysis, development and possibly testing, but have minimal business, infrastructure and operations capability. Teams need to be able to deliver a new feature from concept to production.
Several changes need to be made:
- Business needs to treat technology as an extension of their team, rather than a provider of services.
- Testing and operations should not be treated as repetitive tasks that can be performed by the lowest cost provider without any ability to constantly improve or innovate.
- Testing and operations functions need to be treated as an extension of development so that a feature is only truly complete after it has been verified and deployed to production.
- Teams must be as co-located as possible. A feature team is not a team if its members are physically divided according to their skill set. Physical co-location is preferred, but if this is impossible the team must embrace collaboration tools such as video conferencing and make themselves available at all times for ad hoc one-on-one and group discussions.
Releasing features without integration testing
Release schedules are dictated to a large extent by testing schedules. It is common for a full testing schedule to run for up to two months. A significant portion of this schedule is dedicated to regression testing system interactions. Test automation can help alleviate this to some extent, but complete automated test coverage of fully integrated systems is very difficult and rarely achieved. Even a release making trivial changes with minimal development time will still incur most of the overhead of the testing phases, because a one-size-fits-all approach is mandated for all projects. To maximise testing return on investment, many changes are bundled together; this leads to large, complicated releases with long lead times on delivering change to production.
Testing schedules are also organised around an organisation-wide release calendar to allow multiple systems to test and deploy changes together. This means that the overall release cycle is as slow as the slowest system. The amount of coordination is extraordinary in order to ensure that infrastructure, software versions, test data, and delivery teams are all deployed, configured and available at the same time. Huge amounts of time are wasted when dependencies are missed.
If we can stop doing integration testing, we can break this gridlock. Systems can then release on a cadence that suits them, and nobody has to wait for the slowest system.
How do we do this? There are a few techniques to employ.
- Keep releases small. If releases are small and incremental, the impact of breakages can be mitigated and addressed quickly.
- Maximise test automation. Tests should be developed as part of the development cycle, and run as part of every build. The team should strive to ensure the system is in a releasable state at all times.
- Use feature flags in cases where dependencies aren’t available. Build out the functionality, release it in a disabled state, and then enable it when the other system is available. Further tweaks may be required when the dependent system becomes available, but if they can also be rolled out quickly the impact is reduced.
- Use testing pacts. These are effectively time-shifted integration testing. They provide a way for test messages from one system to be played against another system, and the responses validated. These tests are run as part of the automated test suite of each application. In this way we still ensure that systems remain compatible, but project schedules are completely de-coupled.
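The idea of a testing pact as time-shifted integration testing can be sketched in a few lines. This is a minimal illustration, not the API of any particular contract-testing tool: the `Interaction` record, the `ORDER_PACT` contents, the `order_service` stand-in and the `verify_pact` helper are all hypothetical names invented for the example. The consumer records the requests it makes and the response fields it depends on; the provider replays them in its own build without the consumer being present.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interaction:
    """One recorded request and the parts of the response the consumer relies on."""
    description: str
    request: dict
    expected_status: int
    required_fields: frozenset


# Hypothetical pact, as it might be recorded by the consumer's own test suite.
ORDER_PACT = [
    Interaction(
        description="fetch an existing order",
        request={"method": "GET", "path": "/orders/42"},
        expected_status=200,
        required_fields=frozenset({"id", "status"}),
    ),
    Interaction(
        description="fetch a missing order",
        request={"method": "GET", "path": "/orders/999"},
        expected_status=404,
        required_fields=frozenset({"error"}),
    ),
]


def order_service(request):
    """Stand-in for the provider; a real suite would call the running service."""
    orders = {"/orders/42": {"id": 42, "status": "shipped", "total": 18.5}}
    body = orders.get(request["path"])
    if request["method"] == "GET" and body is not None:
        return 200, body
    return 404, {"error": "order not found"}


def verify_pact(pact, provider):
    """Replay each interaction against the provider and collect any mismatches."""
    failures = []
    for interaction in pact:
        status, body = provider(interaction.request)
        if status != interaction.expected_status:
            failures.append(f"{interaction.description}: status {status}")
        missing = interaction.required_fields - set(body)
        if missing:
            failures.append(f"{interaction.description}: missing {sorted(missing)}")
    return failures


if __name__ == "__main__":
    # An empty list means the provider still honours what the consumer expects.
    print(verify_pact(ORDER_PACT, order_service))
```

Because the provider is only checked against the fields the consumer actually uses, it remains free to add new fields (such as `total` here) without breaking the pact.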
Feature team owns production
Traditional IT makes a clear distinction between a delivery team and an operational team. They are separate teams, who interact only for a short period as a project transitions from delivery mode to business as usual. This approach has a number of downsides. An operations team is expected to manage an application of which they have very little understanding. A delivery team is not motivated or even equipped to deliver a system that can be effectively managed in production. Problems or inefficiencies occurring in production are difficult to feed into the next release.
A more modern approach is for the feature team itself to manage production. This requires a significant organisational shift in responsibilities, and may be difficult to achieve in an organisation structured around skill specialisation. Infrastructure must be available on demand for teams to configure and use as required. The management of physical infrastructure can be separated from the feature teams via a vendor service or in-house team, so long as an API is exposed to fully automate the provisioning of infrastructure resources such as compute, storage and networking. The feature team must incorporate the capability to automate provisioning of the infrastructure resources and deployment of the application stack. This capability must provide automated tooling that can quickly and reliably deploy changes to production with appropriate security controls and auditing. A team that is empowered to deploy their own changes will also be accountable for the outcomes. Rigorous deployment gates become less critical because the feature team will be strongly motivated to ensure their changes are fully verified before releasing them. If issues do occur, a feature team will be well placed to react quickly and address them. Fixing production issues must be prioritised above building new features.
This relies on several significant organisational changes:
- The capability for building the application software stack (traditionally handled by a middleware team) must be incorporated into the feature team.
- Responsibility for responding to incidents needs to be assumed by feature teams. This requires rotation rosters being established so that everybody (including developers) participates from time to time and the burden is not carried by a small sub-group within the team.
- The feature team must remain active. This can be a challenge in many organisations where delivery teams are created for the duration of a project and then dispersed. If a business wants the ability to consistently deliver over time, the feature team must be continuously funded.
A reason that is often cited for formally separating delivery and operations teams is separation of duties. This can be achieved without splitting the team into multiple silos. Any change – regardless of methodology – should be developed, peer reviewed and deployed. No individual should be able to perform all three steps for a single change, but there is no reason they can’t perform each of these steps across different changes. This can be achieved through strict enforcement of peer reviews and comprehensive auditing.
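The separation-of-duties rule described above is simple enough to check automatically as part of auditing. The following is a minimal sketch under assumed names: the `Change` record and its fields are hypothetical, standing in for whatever a team's version control and deployment tooling actually records.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """Audit record for one change; the field names are illustrative only."""
    change_id: str
    developer: str
    reviewer: str
    deployer: str


def violates_separation(change):
    """True if a single person developed, reviewed and deployed this change."""
    return change.developer == change.reviewer == change.deployer


def audit(changes):
    """Return the IDs of changes where one person performed all three steps."""
    return [c.change_id for c in changes if violates_separation(c)]


if __name__ == "__main__":
    history = [
        Change("C1", developer="alice", reviewer="bob", deployer="alice"),
        Change("C2", developer="bob", reviewer="alice", deployer="carol"),
        Change("C3", developer="carol", reviewer="carol", deployer="carol"),
    ]
    print(audit(history))  # only C3 was handled end to end by one person
```

Note that alice develops, reviews and deploys across different changes in this history, which the rule permits; only a single change handled entirely by one person is flagged. Many teams would go further and also require that the developer and reviewer differ, which would be a stricter check than the one stated here.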
Retiring the app server
Application servers, in particular those of the JEE (Java Enterprise Edition) variety, are the backbone of most enterprise applications we build and deploy today. In many ways, they are the PaaS (Platform as a Service) of the enterprise world. They provide a way for applications to be deployed and managed in a common way, regardless of their internal implementation. In addition to providing applications with a consistent set of APIs (e.g. Servlet API, JNDI, JPA, etc), they give administrators a single platform that handles everything from managing deployments and monitoring to horizontal scaling. This was all great in the days before infrastructure automation, but now many of the functions of application servers are redundant, and their additional complexity gets in the way. This complexity slows everything down – from infrastructure provisioning and application deployment to feature development.
There are a number of techniques that, when used in combination, can eliminate the need for an application server.
- Containerisation as the new PaaS. Container solutions such as Docker provide a new way of managing disparate applications using common tooling. Containers interact with the outside world by mounting volumes and exposing ports, and the internal implementation can be treated as a black box.
- Automation facilitates scaled deployments. Through automation, we can deploy applications to a large number of servers at once.
- Service discovery and load balancers provide horizontal scaling. We can deploy many instances of an application and expose them to the world via a common load balancer, or advertise the endpoints via a service registry.
- JEE container features such as servlets, persistence and transaction management can all be provided by standalone Java applications. Frameworks such as Spring Boot make it easy to build a complete application that leverages all of the APIs traditionally provided by JEE servers.
- Applications that adhere to twelve-factor design principles lend themselves to efficient automation.
- Elastic scaling provides efficient use of resources that can be scaled up and down based on demand.
- Pick the right tool for the job. We shouldn’t, for example, have to write everything in Java simply because that’s all the platform supports.
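One twelve-factor principle that makes applications easier to automate is reading configuration from the environment rather than from files baked into the deployment. As a rough sketch, the same container image can then be promoted unchanged between environments. The `Settings` fields and the variable names `DATABASE_URL`, `PORT` and `LOG_LEVEL` are assumptions chosen for illustration.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Runtime configuration; the fields are illustrative."""
    port: int
    database_url: str
    log_level: str


def load_settings(env=os.environ):
    """Build settings from environment variables, failing fast if one is missing."""
    try:
        database_url = env["DATABASE_URL"]
    except KeyError:
        raise RuntimeError("DATABASE_URL must be set") from None
    return Settings(
        port=int(env.get("PORT", "8080")),  # default port is an assumption
        database_url=database_url,
        log_level=env.get("LOG_LEVEL", "INFO"),
    )


if __name__ == "__main__":
    # Passing a plain dict instead of os.environ keeps the example self-contained.
    print(load_settings({"DATABASE_URL": "postgres://db/app", "PORT": "9000"}))
```

Failing at start-up when a required value is absent means a misconfigured deployment is caught by the automation immediately, rather than at the first request that needs the database.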
Replacing application servers with a containerised platform managing smaller, independently scalable twelve-factor applications will be difficult for existing operations teams to accept. Significant processes and skills have been built around the management of application servers. Many of the concepts will be transferable, but the tools will change. The end result will be a simpler, more flexible platform that lets feature teams spend more time building functionality instead of wrestling with application server middleware.
Service-oriented architecture (SOA) appeared in the early 2000s, and promised to deliver loosely coupled systems communicating via services. However, the implementation of this rapidly turned towards building ESBs (Enterprise Service Buses) – central messaging layers that incorporated routing, orchestration and transformation capabilities. Underlying systems could be built using heterogeneous platforms, but they had to communicate via a central messaging system. This was done for a number of reasons, including a desire to centralise service governance. In practice, this central point often became a bottleneck for change, forced release synchronisation, and became its own monolithic furball that was difficult to maintain. In addition, because they were managing an organisation-wide system, the ESB team had little-to-no business domain understanding of the services they were exposing. This is a typical outcome when systems are divided along technical boundaries instead of business boundaries.
The recent trend towards microservices is a refreshing approach to integration, by promoting dumb pipes and smart endpoints. It is seen by some as ‘SOA done right’, but this time the intelligence is explicitly pushed to the edges, and systems are bounded based on the business functionality they provide. A key enabler for the realisation of microservices is full stack automation. A system can be decomposed into smaller independent units without incurring prohibitive deployment and administrative overhead.
There are a number of key benefits associated with this decentralised approach to integration:
- There is significantly reduced coupling between independent programs of work. Business solutions can be delivered without dependence on integration teams, who have many business units competing for their capacity.
- Systems are built by technology teams who have a stronger understanding of the business domain.
- Technology choices can be made based on the specific problems they need to address. For example, no longer do all systems have to utilise a single messaging transport because that’s what’s supported by the ESB platform; instead, technologies can be chosen that are most efficient for the task at hand.
- Without the single point of failure, the resilience of the entire system is increased.
- Smaller systems are easier for developers to comprehend, enhance and maintain.
- It is easier to trial new designs and technologies on low-risk self-contained systems without impacting the stability of an entire platform.
The ability to pick technology that suits the problem is hugely empowering to a feature team; however, it cannot be done in a chaotic manner. This is where design governance comes into play…
Governance, as applied in many organisations, utilises a process where designs are created by project teams and then reviewed by a design forum consisting of senior technology stakeholders. Often, these senior stakeholders are not directly involved in the program of work and are only engaged at the time of review; this tends to mean that issues identified in these forums can require significant rework. The outcome is usually one nobody is happy with – projects are delayed and strategic visions are compromised or ignored. If issues are only detected during formal reviews, the system is broken. Senior technology stakeholders are tasked with building strategic technology visions and obtaining support to achieve them from senior executives, but are rarely given the opportunity to engage in the delivery process. Visions will not be realised without a greater level of involvement in the delivery process.
Effective governance is achieved by leading from the front and assisting in accelerating delivery. It is much more than defining a vision and enforcing it; it is establishing the common techniques, patterns and tooling that allow feature teams to deliver business outcomes more effectively. It should not be considered an independent function within the organisation; it should be an integral part of the way software is delivered.
Here are several ideas that can be applied when defining the role of governance:
- Project backlogs must be developed with input from senior technology stakeholders, and capacity must be allocated to building out the architectural runway used to deliver subsequent business features.
- The right to act in a review capacity is dependent on being engaged on the project throughout. Important decisions cannot be made without a strong contextual understanding of the project to date.
- Key artefacts from an effective governance capability are a suite of techniques, common design patterns, re-usable modules and template components for feature teams to leverage.
- Re-use artefacts should be leveraged from working solutions that have been battle-tested in production. This helps avoid building over-complicated solutions for problems that don’t exist in the real world.
- Previous learnings and decisions should be used as guidance to help build better solutions more efficiently, not as rules that must be applied. Today’s state-of-the-art is tomorrow’s legacy. Effective governance requires involvement in the early phases of solution design where previous learnings can be applied if appropriate, and exceptions made if required.
A clear measure of an effective governance process is when senior technology stakeholders and delivery leads regularly seek each other’s counsel in building both solution designs and strategic roadmaps. Close collaboration without formal process is the best indicator that good working relationships have been established.
Trialling new technology
It is difficult to introduce significant new technologies into an enterprise. Change typically requires the sponsorship of a senior executive; this means that smaller but effective changes are ignored because the effort of gaining approval outweighs the reward. Failure to introduce new technologies can lead to missed productivity improvements and functionality benefits that are necessary to remain competitive.
Feature teams need to be empowered to trial new technology in order to gain experience with it, and determine whether it is worth pursuing more broadly or should be discarded. This should be possible if several pre-requisites are met:
- There is a clear understanding of the relative importance of systems, and what level of risk each system can tolerate. It is great that topics such as pace layering are receiving broad recognition.
- There is an acceptance from senior stakeholders that new technologies will be trialled from time to time. There is a cost associated with trialling technology, and this must be seen as an important activity with a long-term return on investment.
- Feature teams are empowered to test new technologies, but also accountable for the outcomes they produce.
- Systems are built in a modular fashion that supports experimentation on small isolated components.
This is an exciting time to work in technology, and there have been many positive steps made in recent years towards maximising efficiency and harnessing creativity. Solutions increasingly incorporate open source software, allowing a tremendous amount of innovation and collaboration. The agile movement has led to the recognition that continuous delivery of business features in small increments is preferable to waterfall delivery. The ability to obtain infrastructure via an API has enabled the DevOps revolution to dramatically lower the hurdle that sits between an application running on a development workstation and its release into a production environment.
The next challenge is making our feature teams directly accountable for our production environments, empowering them to make their own decisions to achieve the best outcomes, and providing governance that guides and assists rather than inhibiting them.
Achieving these outcomes will not be trivial. They will take a significant amount of effort over a long period of time. I don’t have all the answers, but I’ve included a number of my ideas in an attempt to start the conversation. I want to obtain general agreement that these are objectives worth pursuing, and start discussing how to achieve them via a constant stream of small, iterative improvements. I hope that in 10 years’ time we can look back at these outcomes and consider them so obvious and intuitive that we wonder how we ever did things differently.
Read more about these key concepts:
- Lean manufacturing
- Kaizen technique
- Testing pacts
- Twelve-factor design principles
- Pace layering
This post was written by Brett Henderson