Rising Cloud Technologies: Site Reliability Engineering

New technologies help companies transform into digital organizations. Identifying emerging cloud technologies and understanding their impact on the existing cloud landscape can help companies become more successful.

While some companies do not have a formal cloud strategy in place, most are using at least one cloud technology such as SaaS, IaaS or PaaS, whether in a private, public or hybrid cloud.

Other companies follow a multi-cloud strategy, which lets them select cloud services from different providers, since some providers are better suited to certain tasks than others. For example, some cloud platforms specialize in large data transfers or have integrated machine learning capabilities.

Today, the most popular cloud models are hybrid and multi-cloud. Having seen the first benefits of cost savings and increased efficiency, companies now focus more on agility, speed and time to market to enable digital business success.

The new cloud capabilities increase the deployment options. Companies want the benefits of the cloud in all of their IT systems. With the growing range of offerings from cloud service providers, customers can now decide on the technology, services, providers, locations, form factors and level of control.

Since the digitalization journey raises new considerations and expectations, companies are now looking into technical areas that can improve their cloud landscape, such as distributed cloud, API-centric SaaS, cloudlets, blockchain PaaS, cloud native, Site Reliability Engineering, containers, edge computing and service mesh.

Site Reliability Engineering

How closely should software development and operations be interconnected, and which control processes are required? From this question, and from putting the answers into practice, Site Reliability Engineering (SRE) emerged as a new service management model.

Site Reliability Engineering is a structured approach to software development that incorporates aspects of software engineering and applies them to infrastructure and operations problems. The main goals are to create scalable and highly reliable software systems.

In general, an SRE team is responsible for availability, latency, performance, efficiency, change management, monitoring, emergency response and capacity planning. They split their time between operations/on-call duties and developing systems and software that help increase site reliability and performance.
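To make two of these responsibilities concrete, the following minimal sketch shows one way availability and latency could be measured from a batch of request records. The `Request` type, the field names and the sample values are hypothetical and chosen only for illustration; a real SRE team would normally read these figures from its monitoring system.

```python
from __future__ import annotations

import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """Hypothetical record of one served request."""
    latency_ms: float  # time taken to serve the request
    ok: bool           # True if the request was served successfully


def availability(requests: list[Request]) -> float:
    """Fraction of requests served successfully (1.0 if there was no traffic)."""
    if not requests:
        return 1.0
    return sum(r.ok for r in requests) / len(requests)


def latency_percentile(requests: list[Request], pct: float) -> float:
    """Nearest-rank percentile of latency, e.g. pct=95 for the p95 value."""
    if not requests:
        raise ValueError("no requests to measure")
    ordered = sorted(r.latency_ms for r in requests)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# Illustrative sample data, not real measurements.
sample = [
    Request(120.0, True),
    Request(80.0, True),
    Request(950.0, False),
    Request(100.0, True),
]

print(f"availability: {availability(sample):.2%}")
print(f"p50 latency:  {latency_percentile(sample, 50)} ms")
```

Percentiles are used here rather than an average because a single slow request, like the 950 ms one in the sample, can hide behind an otherwise healthy mean.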

What is the difference between DevOps and Site Reliability Engineering?

The difference between DevOps and SRE is that, while DevOps teams raise problems and dispatch them to development to solve, SRE teams find problems and solve some of them themselves. The ideal SRE team includes developers with different specialties so that each developer can provide beneficial insight.

SRE is designed to give developers more freedom to create innovative and automated software solutions. By establishing reliable software systems with redundancy and safeguards in place, developers are not limited by traditional operations protocols. For example, in a DevOps team, the operations manager may need to approve each software update before it is published. In SRE, developers may be allowed to release updates as needed.
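One way to picture such a safeguard is an automated check that stands in for the manual approval step. The sketch below is an assumption about how this might look, not a description of any particular tool: the `ServiceHealth` type, the function name and both target values are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceHealth:
    """Hypothetical snapshot of how the service is currently behaving."""
    availability: float    # fraction of successful requests, 0.0 to 1.0
    p95_latency_ms: float  # 95th-percentile latency in milliseconds


# Hypothetical targets; real values would be agreed per service.
TARGET_AVAILABILITY = 0.999
TARGET_P95_LATENCY_MS = 300.0


def release_allowed(health: ServiceHealth) -> tuple[bool, list[str]]:
    """Return whether a release may proceed, plus the reasons for any refusal."""
    reasons: list[str] = []
    if health.availability < TARGET_AVAILABILITY:
        reasons.append(
            f"availability {health.availability:.4f} is below target {TARGET_AVAILABILITY}"
        )
    if health.p95_latency_ms > TARGET_P95_LATENCY_MS:
        reasons.append(
            f"p95 latency {health.p95_latency_ms} ms is above target {TARGET_P95_LATENCY_MS} ms"
        )
    return (not reasons, reasons)


# Illustrative values: a healthy service and a degraded one.
for health in (ServiceHealth(0.9995, 250.0), ServiceHealth(0.99, 400.0)):
    allowed, reasons = release_allowed(health)
    print("release allowed" if allowed else f"release blocked: {'; '.join(reasons)}")
```

The design choice is that the decision depends only on measured service health, so developers can release as needed while the service meets its targets and are blocked automatically, with a stated reason, when it does not.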

Since SRE is developer-focused, the manager of an SRE team must have development experience, not just operations knowledge. An SRE manager may actively help with software development instead of merely overseeing it.

SRE focuses on stability rather than agility, and on proactive engineering rather than reactive development. It creates a bridge between development and operations by applying a software engineering mindset to system administration topics, which also helps deliver services faster.

The ultimate goal for SREs is to establish service quality as seen from the perspective of the end customer. By continuously optimizing control processes and automation, human error should be kept to a minimum. Automated control processes are indispensable for maintaining quality standards. One way to achieve this is to build self-service tools for the user groups that rely on SRE services, such as automatic provisioning of test environments, logs, and statistics visualization. Doing so reduces work in progress for all parties, allows developers to focus exclusively on feature development, and lets SREs move on to the next task to automate.

How to speed up the Software Development Life Cycle?

Software development projects and standard application implementation projects tend to fail when they are run without a methodology. For example, there are methodologies based on international standards such as ISO/IEC 12207 for the Software Development Life Cycle (SDLC), and more specific ones such as the Oracle Unified Method (OUM) for implementing Oracle Applications.

Experiences with these methodologies were positive: project teams were able to complete their projects and produce high-quality software that met customer expectations. However, as IT landscapes and regulations grew more complex, projects increasingly overran their schedules and budgets.

Companies started to move to shorter development cycles and to become more agile in order to respond to a faster-changing environment. They introduced methodologies such as Scrum, which breaks deliverables down into shorter cycles (sprints) to enable continuous improvement. However, Scrum projects often ended up in chaos because of a lack of leadership, teamwork and discipline. Strong and professional change management is needed to establish commitment, courage, focus, openness and respect in those projects, and it is often difficult to get brilliant developers with prima donna tendencies to adopt those values.

Can new technology help?

Nowadays, companies rely on DevOps and agile methodologies in the cloud to speed up the software development process. Traditional and DevOps companies are visibly cooperating more on common standards that enable collaboration. In the financial industry, for example, the Fintech Open Source Foundation (FINOS) is a community that promotes open source solutions for financial services by providing an independent setting in which to deliver software and standards that address common industry challenges and drive innovation.

A foundation in which developers, IT experts and industry leaders agree on standards and collaborate on open source projects lets financial services companies take full advantage of DevOps cloud platforms (e.g. GitLab). It helps them move from the traditional SDLC to a modern, cloud-based and service-oriented software development life cycle, with the aim of developing software faster and more efficiently while maintaining high regulatory and quality standards.

Choosing a single DevOps platform may look risky. However, it helps to redefine development and engineering work: product owners from the business, software developers, operators, test engineers, project managers and others have access to the same information, use a common standard, and have to work closely together to benefit from faster software development at a time when maintenance budgets are constantly being cut.
