Dominant Designs in Software Engineering: Cloud Native Design Patterns

Azmat
9 min read · Mar 5, 2020

This is part four of Dominant Designs, a series of articles taking an in-depth look at the emergence of 4 different dominant designs in software engineering and how to recognize them in an acquisition.

___________________________________________________________________

“Technology infrastructure is at a fascinating point in its history. Due to requirements for operating at tremendous scale, it has gone through rapid disruptive change. The pace of innovation in infrastructure has been unrivaled except for the early days of computing and the internet. These innovations make infrastructure faster, more reliable, and more valuable.”

  • Cloud Native Infrastructure, Justin Garrison and Kris Nova

Over the past few weeks, you and I have discussed the microservices architecture, serverless computing, and DevOps as the current dominant designs in software engineering, and now it's time for the final dominant design: cloud-native design patterns. But before we dive into the patterns, it's important to brush up on some cloud-native fundamentals.

What is Cloud Native?

In essence, when we talk of software, infrastructure, and processes as cloud-native, we mean that they're all based on the cloud. But cloud-native isn't one thing; it's a combination of things. Most people mean cloud-native applications when they speak of cloud-native, but there are other pieces as well (which we'll discuss next).

But if I really had to define cloud-native as one thing, I would define it as an approach that emerged as the culmination of microservices applications, serverless architecture, and DevOps practices. Everything that we've learned so far in this series is tied to cloud-native. In fact, in this article, you'll repeatedly see me mention key topics from previous articles.

In other words, cloud-native is the modern approach that leverages containerization, faster deployment methods, on-demand computing power, and automation to develop fast, scalable, and reliable customer-centric software.

Cloud-Native Infrastructure and Applications

Cloud-native infrastructure is the hardware and software upon which your entire cloud-native ecosystem is built. Traditionally, companies needed to build their infrastructure from scratch, investing in data centers, operating systems, and more. However, in recent times and with the popularity of cloud computing, we've shifted from hardware-as-a-service to infrastructure-as-a-service. In the latter, companies can buy or rent raw networking, storage, and compute from cloud service providers and vendors (like AWS). These vendors also bundle additional services like identity and access management (IAM), provisioning, and inventory systems, all of which help create a self-sustained ecosystem.

This shift was monumental in cloud-native emerging as a dominant design. Before it, companies often saw infrastructure as an unnecessary expense that racked up bills even when it wasn't being used. With virtualization (going from HaaS to IaaS), companies gained access to virtually unlimited computing power and storage on demand. More importantly, they now pay only for the resources they actually use, significantly cutting down the cost, time, and effort related to maintenance.

But in order to build a truly cloud-native infrastructure that can realize the full potential of cloud-native applications, companies also need to incorporate automation, APIs, policy-driven resource allocation, and useful abstractions.

Related: Dominant Designs in Software Engineering: Serverless Computing

Now let’s talk about cloud-native applications or cloud-native software. What makes an application cloud-native?

Is it just an application that’s built to run on the cloud? Nope, that isn’t it.

Two core characteristics of cloud-native applications are rapid development and rapid deployment; to a large extent, these are what define cloud-native applications.

Cloud-Native Applications

Another concept that goes hand in hand with cloud-native applications is containerization. Think of containers as packages of isolated code that can run independently as applications on their own. However, containers are almost always used in batches to create one feature-rich application (with each container handling one function or feature of the app).

Now I’ve already discussed the benefits of this practice previously but in short, companies can increase scalability, have greater control over their software, and avoid catastrophic software failures by isolating errors to independent containers.

Related: Dominant Designs in Software Engineering: Microservices Architecture

Microservices are often equated with cloud-native applications, and while it's true that containerized microservices are a key aspect of cloud-native, they aren't the entire thing.

Every cloud-native application must also be built with the following characteristics in mind:

  • Resiliency
  • Agility
  • Operability
  • Observability

Benefits of Adopting a Cloud-Native Infrastructure

The fact that some of the biggest tech companies (and now non-tech giants) have adopted a cloud-native infrastructure for their operations should be a good indicator of the numerous benefits it has to offer. So before I start discussing some of the cloud-native design patterns currently in use, I want to talk about the benefits of adopting a cloud-native infrastructure and whether it's actually worth it.

(spoiler alert: it is)

Creating Value Faster

One of the key issues faced by tech companies is the time taken to produce apps, updates, and patches. Most companies have a few teams (or a single team) of developers working on different functions, with limited communication with an operations team that takes a reactive approach to bugs and failures (instead of a proactive one). Not to mention the severe lack of automation. All in all, it's a very inefficient way of developing software.

By following cloud-native patterns, companies overcome most, if not all, of these issues that increase time to market. Even better, the entire process is customer-centric, which improves the odds of customer satisfaction.

Building What is Needed

Spending time, money, and other resources on tasks that don't generate revenue or lead to any value creation is not a great way of doing business. However, this is exactly what most companies with a hardware-as-a-service approach are doing. For instance, they pay large sums for local servers and a team of engineers to maintain them, paying for the entire ecosystem even when it isn't being used. Quite wasteful.

Cloud-native, on the other hand, allows businesses to build what is needed and only that. Partnering with cloud service providers on a pay-per-use scheme means capital doesn't go underutilized. Storage and compute on demand also mean that scaling up and down is automatic, and the company doesn't have to worry about resource allocation to cope with increased traffic or load.

Achieving New Levels of Efficiency

One of the major differences between the "elite" performers in DevOps and the "worst" performers was that the former used cloud computing while the latter didn't have access to the public cloud. Companies on the public cloud had fewer things to worry about, access to more tools and services, and saved more money; the combination of all of these benefits meant that some companies performed exponentially better than others. In fact, microservices, DevOps, serverless: all of these technologies are founded on scalability, cost-effectiveness, and efficiency.

Cloud-Native Design Patterns

Microservices are part of the foundation of a cloud-native ecosystem and loosely-coupled services are crucial in scalable and efficient software development. However, it’s possible for a microservices architecture to become very similar to a monolith by either coupling services too tightly or too early. And when this happens, all the benefits of a microservices architecture vanish and the limitations of a monolith architecture begin to appear. To avoid this situation, developers follow a number of patterns, called cloud-native design patterns.

Over the years, a lot of design patterns have emerged from different companies and institutions. But the most famous of these is the Twelve-Factor App, a set of best practices that help developers build cloud-native applications.


Microsoft has its own set of cloud-native design patterns, comprising 37 different patterns that solve problems related to:

  • Availability
  • Data Management
  • Design and Implementation
  • Messaging
  • Management and Monitoring
  • Performance and Scalability
  • Resiliency
  • Security

Similarly, Cornelia Davis has come up with 12 main design patterns in her book “Cloud Native Patterns: Designing Change-tolerant Software”.

But from a broader perspective, most of the design patterns can trace their roots back to a few key design patterns.

Here are 6 major cloud-native design patterns:

1. Event-driven

One way of making microservices communicate with each other is through a request/response protocol. This model is very common in serverless applications, where the app is served when a client makes a request. Requests come in a number of different types, but the most common travel over HTTP. However, the request/response model becomes very limiting as the app grows to an increasing number of microservices.

To overcome this limitation, the event-driven design pattern is used. The premise behind event-driven systems is that instead of being limited to client-side requests, code can be executed in response to any number of events or triggers. These events may in turn trigger other events that execute different pieces of code. The benefit of the event-driven design pattern is that it lets a large number of microservices communicate without being directly connected to each other, so the software doesn't become monolithic over time.
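To make the mechanics concrete, here's a minimal sketch of the publish/subscribe idea behind event-driven systems, in Python. The event name and handlers are hypothetical, and the in-memory bus stands in for a real message broker such as Kafka, RabbitMQ, or AWS SNS/SQS:

```python
from collections import defaultdict
from typing import Callable

class EventBus:
    """Toy in-process event bus; a real system would use a message broker."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Callable[[dict], None]) -> None:
        # A service registers interest in an event without knowing who emits it.
        self._subscribers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> None:
        # The publisher never calls another service directly.
        for handler in self._subscribers[event_type]:
            handler(payload)

bus = EventBus()
bus.subscribe("order_placed", lambda e: print(f"billing: charge {e['total']}"))
bus.subscribe("order_placed", lambda e: print(f"shipping: pack order {e['id']}"))
bus.publish("order_placed", {"id": 42, "total": 99.90})
```

Note how billing and shipping never reference each other or the publisher; new subscribers can be added without touching existing services.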

2. CQRS (Command Query Responsibility Segregation)

In essence, Command Query Responsibility Segregation (CQRS) means separating the write logic (commands) from the read logic (queries). The main benefit of having separate interfaces for operations that read data and operations that write data is that it simplifies the code, making it easier to maintain. Other benefits of CQRS include more flexibility in future updates, increased performance and scalability, and fewer merge conflicts at the domain level.
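Here's a minimal sketch of that separation in Python, assuming a hypothetical inventory domain. In a full CQRS system the read side would typically be a separate, denormalized model kept in sync through events, possibly in its own data store:

```python
class InventoryWriteModel:
    """Handles commands: operations that change state."""

    def __init__(self) -> None:
        self._stock: dict[str, int] = {}

    def add_stock(self, sku: str, qty: int) -> None:
        # Command: mutates state, returns nothing.
        self._stock[sku] = self._stock.get(sku, 0) + qty

class InventoryReadModel:
    """Handles queries: operations that read state, with no side effects."""

    def __init__(self, write_model: InventoryWriteModel) -> None:
        # For brevity this view reads the write model's store directly;
        # a real read model would be its own projection.
        self._source = write_model

    def stock_level(self, sku: str) -> int:
        # Query: never mutates anything.
        return self._source._stock.get(sku, 0)

writes = InventoryWriteModel()
reads = InventoryReadModel(writes)
writes.add_stock("widget-1", 10)
print(reads.stock_level("widget-1"))  # 10
```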

3. Saga Patterns

When a client makes a request, there is no guarantee that the request will reach the target service, and partial failures are especially likely in business transactions spanning multiple services. However, this problem can be overcome with a compensating mechanism called the saga pattern.

When business transactions are implemented as sagas, they act as a failsafe mechanism by breaking the work into a sequence of local transactions. Each local transaction updates the data within one service and publishes a message or event that triggers the next local transaction, and so on. However, if one local transaction fails, the saga takes compensating action, executing transactions that undo the changes made by the previous transactions. This ensures data consistency across services without using distributed transactions.
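A minimal orchestrated saga might look like the following sketch, where each step is paired with a compensating action; the order/payment steps are hypothetical:

```python
from typing import Callable

Step = tuple[Callable[[], None], Callable[[], None]]  # (action, compensation)

def run_saga(steps: list[Step]) -> bool:
    completed: list[Callable[[], None]] = []
    for action, compensate in steps:
        try:
            action()                      # local transaction in one service
            completed.append(compensate)
        except Exception:
            # Undo every completed local transaction, most recent first.
            for undo in reversed(completed):
                undo()
            return False
    return True

def reserve_inventory() -> None: print("inventory reserved")
def release_inventory() -> None: print("inventory released (compensation)")
def charge_card() -> None: raise RuntimeError("payment declined")
def refund_card() -> None: print("card refunded (compensation)")

ok = run_saga([(reserve_inventory, release_inventory),
               (charge_card, refund_card)])
print("saga committed" if ok else "saga rolled back")
```

When `charge_card` fails, the saga runs `release_inventory` to undo the first step, leaving every service consistent.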

4. Multiple service instances

In cloud computing, you can either scale vertically or horizontally. Scaling vertically means increasing or decreasing the capacity of a single instance, whereas scaling horizontally means adding or removing application instances. The differences between these two are major, and one is clearly superior.

For instance, say you launch an application on AWS on a machine with 16 GB of memory, but after 6 months the number of users increases and your app now demands 20 GB of memory. You cannot simply add 4 GB of memory to the machine running your app (neither Amazon nor Google allows changing the specs of a running machine), so the only option is to move everything to a 32 GB machine as efficiently as possible. This is vertical scaling.

However, if you had provisioned four instances of your app with 4 GB each instead of one 16 GB instance, you could simply create another 4 GB instance when the number of requests increases, for a total of 20 GB. This is horizontal scaling.

Apart from more efficient scalability, multiple service instances also offer greater resilience (when one instance fails, the rest are still operable), high availability, and control.
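To illustrate, here's a sketch of client-side round-robin across horizontally scaled instances; the instance URLs are hypothetical, and in practice a load balancer or service mesh would do this routing while an autoscaler adds and removes instances:

```python
import itertools

class RoundRobinPool:
    """Rotates requests across identical service instances."""

    def __init__(self, instances: list[str]) -> None:
        self._instances = list(instances)
        self._cycle = itertools.cycle(self._instances)

    def next_instance(self) -> str:
        # Each request goes to the next instance in rotation.
        return next(self._cycle)

    def scale_out(self, instance: str) -> None:
        # Adding capacity means adding an instance, not resizing a machine.
        self._instances.append(instance)
        self._cycle = itertools.cycle(self._instances)

pool = RoundRobinPool(["http://app-1:8080", "http://app-2:8080"])
for _ in range(4):
    print(pool.next_instance())      # alternates app-1, app-2, app-1, app-2
pool.scale_out("http://app-3:8080")  # demand grew: add another small instance
```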

5. Canary Deployments

The term "canary test" comes from miners using canary birds to know when to evacuate the mines: as long as the birds sang, the mine was safe; when they stopped singing (died), the miners were in danger. An interesting, albeit grim, origin for canary deployments.

Canary deployments, also known as canary releases, are a technique for testing a new version of software before the version is rolled out to the entire production environment. The premise behind canary deployments is simple: make the new release available to a small group of users and test it for bugs, glitches, or any other problems.

Canary deployments typically also include automated tests, called canary tests.

Canary deployments can be implemented through load balancers, AWS's Route 53, Kubernetes, and so on. Companies like Facebook, Google, and Netflix all use canary deployments to reduce the risks that come with introducing a new update.
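Conceptually, the routing layer just splits traffic by weight. Here's a minimal sketch of that idea in Python; the version handlers and the 5% weight are hypothetical, and real deployments do this at the load balancer, DNS, or service-mesh layer rather than in application code:

```python
import random

def stable_version(request: dict) -> str:
    return f"v1 handled {request['path']}"

def canary_version(request: dict) -> str:
    return f"v2 (canary) handled {request['path']}"

def route(request: dict, canary_weight: float = 0.05) -> str:
    # Send ~5% of traffic to the canary. Roll back by setting the weight
    # to 0; promote by ramping it toward 1.0 while metrics stay healthy.
    if random.random() < canary_weight:
        return canary_version(request)
    return stable_version(request)

random.seed(7)  # deterministic output for the example
for i in range(5):
    print(route({"path": f"/orders/{i}"}))
```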

6. Stateless services

First off, stateless doesn't mean the software doesn't store any information at all, because it does. It's the way it stores information that differs from stateful services: a stateless service keeps no state in the process itself, instead taking snapshots of the state and keeping them in persistence stores. When a client makes a request, the system retrieves the state from the persistence store, makes the necessary changes to it, writes the state back to the persistence store, and then forgets that anything ever happened.

There is nothing inherently wrong with stateful services; it's just that, by their nature, stateless services suit a cloud-native architecture far better. The biggest and most obvious benefit of stateless services is that the cost of maintaining state inside the service is reduced or avoided. But a proper implementation of stateless services also yields increased scalability, resilience against failures, and better recovery options.
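Here's a minimal sketch of a stateless request handler in Python; the dict-backed store and session IDs are hypothetical stand-ins for an external persistence store such as Redis or DynamoDB:

```python
class PersistenceStore:
    """Stand-in for an external store; the service itself holds no state."""

    def __init__(self) -> None:
        self._data: dict[str, dict] = {}

    def load(self, key: str) -> dict:
        return dict(self._data.get(key, {"items": []}))

    def save(self, key: str, state: dict) -> None:
        self._data[key] = state

store = PersistenceStore()

def handle_add_to_cart(session_id: str, item: str) -> dict:
    # Retrieve state, mutate it, write it back; keep nothing in memory
    # between requests, so any instance can serve the next request.
    state = store.load(session_id)
    state["items"].append(item)
    store.save(session_id, state)
    return state

print(handle_add_to_cart("sess-123", "book"))  # {'items': ['book']}
print(handle_add_to_cart("sess-123", "pen"))   # {'items': ['book', 'pen']}
```

Because the handler keeps nothing between requests, instances can be added, removed, or replaced freely, which is exactly what horizontal scaling and automated recovery require.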


Azmat

Seasoned technologist with experience in software architecture, product engineering, strategy, commodities trading, and other geeky tech.