Kubernetes at 10: Building stateful app storage and data protection

Kubernetes is 10 years old. Mid-2024 sees the 10th birthday of the market-leading container orchestration platform.

That decade started as containers emerged as a brand new way to virtualise applications, but storage and data protection functionality was nearly non-existent. Now, Kubernetes offers a mature container platform for cloud-native applications, with all that’s required for the storage of stateful data.

We mark the first decade of Kubernetes with a series of interviews with engineers who helped develop Kubernetes and tackle challenges in storage and data protection – including the use of Kubernetes Operators – as we look forward to a future characterised by artificial intelligence (AI) workloads.

Here, Michelle Au, a software engineer at Google specialising in Kubernetes storage development, talks about getting involved in work such as adding support for snapshots and operators, which add functionality and complexity beyond Kubernetes’ core for advanced services such as storage and data protection.

What was the market like when Kubernetes first launched?

Kubernetes was an early incumbent in the space. Containers were just becoming popular and companies were beginning to explore the space. There were many alternative workload orchestration projects like Mesosphere, Docker Swarm, Cloud Foundry and Nomad.

How did you get involved in storage for Kubernetes?

In 2017, I joined the Kubernetes team at Google and started working on projects as part of the SIG [special interest group] storage community. Before that, I had only heard about Kubernetes through the grapevine but never actually used it. But I thought it would be a great opportunity to be able to take part in an open source project that was reshaping the industry.

When did you realise that Kubernetes was in the leading position in the market?

Initially that wasn’t the case. But seeing the exponential growth of workloads year after year, and hearing all the success stories, especially with stateful workloads, is something that makes me very proud of what we’ve built.

When you looked at Kubernetes, how did you approach data and storage?

When I started learning about Kubernetes, the biggest strengths that immediately popped out at me were the declarative API [application programming interface] and reconciliation design paradigm, workload portability across environments, and standardisation of workload deployment best practices. All these strengths of Kubernetes were also goals that guided me when designing Kubernetes storage features.

What issues first came up around data and storage with Kubernetes for you?

When I joined the team, running stateful workloads was mostly limited to small-scale deployments in the cloud. There were still many friction points in terms of storage ecosystem support as well as scheduling, maintenance and disruption management. Operating these workloads successfully required many custom-built processes, especially around day two challenges.

One of my first projects was to add support for local storage. While working on that, I realised there was a broader problem of data locality awareness and decided to tackle the problem more broadly by introducing the concept of storage topology to the Kubernetes scheduler. This simplified the process of provisioning stateful workloads to be fault tolerant across failure domains, whether you’re running in the cloud or on-premise.
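As an editorial illustration of the data locality idea described here, the following is a minimal Python sketch, not the Kubernetes scheduler's implementation. It assumes a volume can only be reached from nodes whose labels match the volume's topology constraints. The label keys (`zone`, `hostname`), the node names and the function `feasible_nodes` are all hypothetical.

```python
def feasible_nodes(nodes, volume_topology):
    """Return the names of nodes from which a volume is reachable.

    `nodes` maps a node name to a dict of labels, and `volume_topology`
    is a dict of label constraints the node must match. Both shapes are
    simplified assumptions made for this sketch.
    """
    return [
        name
        for name, labels in sorted(nodes.items())
        if all(labels.get(key) == value for key, value in volume_topology.items())
    ]


# Hypothetical three-node cluster spread over two zones.
nodes = {
    "node-a": {"zone": "zone-1", "hostname": "node-a"},
    "node-b": {"zone": "zone-1", "hostname": "node-b"},
    "node-c": {"zone": "zone-2", "hostname": "node-c"},
}

# A zonal volume is reachable from any node in its zone.
zonal = feasible_nodes(nodes, {"zone": "zone-1"})

# A local volume is pinned to the single node that holds the disk.
local = feasible_nodes(nodes, {"hostname": "node-c"})
```

In this toy model, a workload that uses the local volume can only be placed on `node-c`, while one that uses the zonal volume can run on either node in `zone-1`.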

Beyond that, we started moving from day one problems to day two problems. Snapshot support was a major gap in Kubernetes, yet it is critical to any disaster recovery strategy.

What had to change?

Implementing snapshots was surprisingly non-trivial. Mapping an imperative operation to a declarative API required many rounds of brainstorming, storage vendors had slightly different semantics, and there was also the question of how to handle application consistency. But in the end we were able to reach consensus and deliver this critical functionality that further strengthened Kubernetes’ stateful workload readiness.
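One way to picture the difficulty of mapping an imperative operation to a declarative API is the Python sketch below. It is an editorial illustration under stated assumptions, not the design the Kubernetes project adopted: `SnapshotObject`, `FakeDriver` and `reconcile` are hypothetical names. The point it shows is that the declared object records that the one-off "take a snapshot" call has already happened, so reconciling the same object again does not take a second snapshot.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SnapshotObject:
    """Declared intent plus observed status (hypothetical shape)."""
    name: str
    source_volume: str
    handle: Optional[str] = None  # filled in once the snapshot exists
    ready: bool = False


class FakeDriver:
    """Stand-in for a storage backend with an imperative snapshot call."""

    def __init__(self):
        self.calls = 0

    def create_snapshot(self, volume):
        self.calls += 1
        return f"snap-of-{volume}-{self.calls}"


def reconcile(obj, driver):
    """Drive the backend until the declared snapshot exists, exactly once."""
    if obj.handle is None:
        # The imperative call is made only when no snapshot is recorded yet.
        obj.handle = driver.create_snapshot(obj.source_volume)
    obj.ready = True
    return obj


driver = FakeDriver()
snapshot = SnapshotObject(name="nightly", source_volume="pvc-data")
reconcile(snapshot, driver)
reconcile(snapshot, driver)  # a repeated reconcile is a no-op
```

The sketch leaves out the harder questions raised in the answer, such as vendor-specific semantics and application consistency.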

How did you get involved around Kubernetes operators?

As part of the Google Kubernetes Engine (GKE) team, I have worked closely with customers and partners that are looking to run operators on our platform. I also joined the Data on Kubernetes community as a representative of the Kubernetes project and GKE to better understand the pain points that users face today and help relay those issues back to the Kubernetes project.

What happened around operators that made them successful for data and storage?

The introduction of Custom Resources really changed the game for operators. It made it possible to develop a Kubernetes-style declarative API for your specific use case. Many stateful workloads have a lot of intricate and complex operational processes that cannot be easily abstracted in Kubernetes’ core, such as configuring HA [high availability] and replication, and managing disruption and upgrade processes. Operators are now able to orchestrate this complexity for end-users.
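To make the operator pattern more concrete, here is a minimal Python sketch of a reconcile loop for a hypothetical database custom resource. It is an editorial illustration only: the spec fields (`version`, `replicas`) and the function names are assumptions, and a real operator would watch the Kubernetes API rather than loop over an in-memory list. The sketch changes one instance per step, which is a simple way to limit disruption during an upgrade.

```python
def next_action(spec, observed):
    """Return a single step that moves `observed` closer to `spec`.

    `observed` is a list holding the running version of each instance.
    """
    if len(observed) < spec["replicas"]:
        return ("create", len(observed))
    if len(observed) > spec["replicas"]:
        return ("delete", len(observed) - 1)
    for index, version in enumerate(observed):
        if version != spec["version"]:
            return ("upgrade", index)
    return ("noop", None)


def apply_action(action, spec, observed):
    """Apply one action to a copy of the observed state."""
    kind, index = action
    state = list(observed)
    if kind == "create":
        state.append(spec["version"])
    elif kind == "delete":
        state.pop(index)
    elif kind == "upgrade":
        state[index] = spec["version"]
    return state


def reconcile_until_stable(spec, observed, max_steps=100):
    """Repeat single steps until the observed state matches the spec."""
    state, steps = list(observed), []
    for _ in range(max_steps):
        action = next_action(spec, state)
        if action[0] == "noop":
            break
        steps.append(action)
        state = apply_action(action, spec, state)
    return state, steps


# Desired state as a user might declare it in a custom resource.
spec = {"version": "2.0", "replicas": 3}
final_state, steps = reconcile_until_stable(spec, ["1.0", "1.0"])
```

The user only edits `spec`; the ordering of the scale-up and the one-at-a-time upgrades is decided by the reconcile logic.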

How did this support more cloud-native approaches? What were the consequences?

Declarative APIs can easily enable GitOps or config-as-code paradigms. Operators for stateful workloads allow you to automate provisioning, upgrades and other maintenance operations, with the added benefit of being compatible with how organisations manage all their other Kubernetes workloads.

Kubernetes is now 10. How do you think about it today?

Kubernetes has come a long way from “no way I would run a database on Kubernetes” to “I’m running databases at petabyte scale with automated rolling upgrades”. Kubernetes has become a mature platform to run some of the most demanding workloads, and that has shown as the project has shifted focus from a lot of workload-enabling features to efforts that improve reliability and stability over the last few years.

What problems still exist around Kubernetes when it comes to data and storage?

Ecosystem discovery is a challenge. There are many vendors in the space, and as an end-user, it’s time-consuming and difficult to evaluate all the options. I think this is where the Data on Kubernetes community can help. They have hosted talks and blogs on a wide variety of stateful workloads that range from introductory to advanced topics.

The growth of AI/ML has also introduced new challenges. AI/ML workloads have very different data patterns and requirements than traditional databases, and often use object storage and file storage instead of block storage.

Also, multi-cloud or hybrid environments are a reality, which adds further requirements for cross-environment portability. I have seen our Kubernetes storage abstractions continue to extend to those new use cases, although I think there is still room for improvements, especially around data lifecycle and management.

Any other anecdotes or information to share?

We’re always looking for more contributors and participants in Kubernetes SIG-storage as well as Data on Kubernetes. Please join these communities if you are interested in sharing your stories or contributing to the project.
