What Is the Cloud Native Computing Foundation?

Open-Source Free Software, What Is That?

Before we can explore why the Cloud Native Computing Foundation was needed, we have to get a basic understanding about what open-source free software is, and how it's maintained.

Let's take Linux as an example. Linus Torvalds wanted an operating system of his own, so he developed the heart of the operating system, the kernel. This kernel would later be named Linux. At first, this was a personal passion project, and it was private, since only Linus had access to it. But Linus wanted to share it with the whole world, so he uploaded the source code of Linux to the Internet. Technically, this is what it means to open-source something: to allow anyone in the world to see the entire source code of a piece of software. And with open source, something cool happens. People who care about the problem the project is solving start to join in. They look at the code, fix bugs, add new features, make the program faster, and so on. These contributions arrive as code suggestions, which the project owner can accept or decline.

But Linus went even further. He didn't just open-source Linux; he also made it "free software". This is where there's usually some confusion. People tend to believe that the word "free" in "free software" comes from the fact that we usually don't have to pay to use it. But, actually, the word "free" refers to freedom. It's similar to how we say "free human being", meaning a person who is free to do what they want. The same applies to free software. Whether we are a big company like Apple or Microsoft, a small local business, or just an individual, we can use Linux for whatever we need. We are free to use it without signing contracts, buying licenses, asking for permission, and so on.

This is how Linux got that big. First of all, by open sourcing, Linus wasn't working alone anymore but got help from thousands and thousands of developers around the world. And by making it free software, this attracted companies that needed to move fast, without worrying about contracts and legal stuff. They could just download Linux and immediately use it without even asking Linus for permission.

How the CNCF Was Born

Now let's go back in time, to the year 2014. At that point, Google was already using containers and container automation tools to serve billions of users. Inside the company, they had created a tool called Borg and used it for container orchestration. But Borg was private. Only Google had the source code for this tool and only Google was allowed to use it. In 2014 that changed. Drawing on its experience with Borg, Google built a new container orchestrator, Kubernetes, and, just like Linus Torvalds did with Linux, released it as open-source free software.

We saw that open-source free software is a cool concept. Anyone is free to use it however they want and anyone can suggest code changes that might improve the project. But there's also a problem with this model. There still needs to be some sort of leadership: people that decide how this open-source project should be developed, how it should evolve, what code changes get accepted, and which get rejected. Now let's say that Apple or Microsoft is this leader. They oversee everything that happens with Kubernetes. They might be tempted to develop the software in a direction that only benefits them, not the entire community. They might implement things that only help them have an advantage while leaving their competitors at a disadvantage.

To avoid these kinds of problems, Kubernetes needed a neutral home. So Google talked with the Linux Foundation about how to solve this. This foundation already had a long history of being neutral and working for the benefit of the entire community, not just a chosen few, so a lot of companies already trusted it. In 2015, the Linux Foundation created a sort of sub-organization called the CNCF, the Cloud Native Computing Foundation. Finally, Google donated Kubernetes to the CNCF, which became its neutral home.

What Is the Cloud Native Computing Foundation Today?

Ok, so the CNCF started out as a home for Kubernetes. But, as years went by, this became much more. First of all, the CNCF now has many more projects under its umbrella. Here are some of the most popular ones:

  • Of course, Kubernetes for orchestrating containers.
  • Helm, which is a sort of package manager for Kubernetes. For example, we can install WordPress with just one command instead of defining multiple complicated objects in Kubernetes.
  • containerd, a container manager. This takes care of things such as pulling images from a server on the Internet, starting a container, stopping it, and so on.
  • etcd, which is almost like a database. But instead of storing its data on a single server, etcd makes it easy to spread out this data to many computers, so that it can be quickly accessed from multiple locations. Plus, it makes data more resilient because even if one point fails, the data is still available at many other points, so it's still alive and well.
  • Open Policy Agent, or OPA, which lets us define policies in our Kubernetes clusters: what should be allowed and what should be disallowed.
  • Fluentd, which makes it easier to collect logs about what is going on in our cloud or Kubernetes infrastructure.
  • Prometheus, which can help us monitor important things that happen in our cloud or Kubernetes infrastructure. This happens in real time, so we can immediately get alerts when important events take place, for example, catastrophic errors.
  • Linkerd, a so-called service mesh that makes it easier to interconnect all the small pieces we have in our infrastructure. Just like a router in our home connects many devices to our network and to the Internet, a service mesh does a somewhat similar thing for small components running in our clouds.

We can see that the additional projects landing under the CNCF were things that played well with Kubernetes and added new functionality on top of it. But this is not a rule. Nowadays, a lot of projects under the CNCF aren't necessarily built for Kubernetes; rather, they work well with cloud infrastructure. In other words, they are projects that are "cloud-native", and we'll see later on what that means.

With many more projects adopted, the CNCF also has many more responsibilities. Here are some examples:

  • They promote the projects that they adopt. Basically, the projects get some free marketing to help them get a good start and gather some community around them. And by community this doesn't only mean developers that can contribute code. It also means people or companies that use these software tools to solve some real-world problems.
  • They organize events such as the KubeCon + CloudNativeCon conferences. These serve both as a way to market CNCF projects and as a way to help more people learn how to use these tools effectively.
  • They secure funding for projects that need it.
  • They try to make sure that all the projects they adopt can work reasonably well with each other. That means that these tools should be able to interconnect if they're used in a cloud infrastructure. So they may suggest ways in which the developers can make these tools compatible with each other.

Why Contribute a Project to the CNCF?

But why would we want a project to become part of the CNCF? Let's say we worked on some cool open-source tool. We have the project on GitHub, people are downloading it, using it on their servers, and we even have around 50 developers regularly contributing better code for this project. Our software tool is awesome, but the community around it is still small. This could be a reason to contribute our project to the CNCF. If they see that our tool indeed has a lot of potential, they can help us get a lot more visibility on the Internet. And with more visibility, our community will grow. More people will test our software, more people will contribute code. So our project could evolve much faster with CNCF's help. In this case, we can consider it an accelerator for projects with great potential but small communities.

This is not the only reason why someone would want to contribute something to the CNCF. It's just one example. Others might have different reasons, like we saw with Google wanting to find a neutral home for Kubernetes. For example, if Apple open-sourced a project they use internally but kept control of it, Microsoft, as a competitor, would probably be reluctant to touch it. But if Apple donated that project to the CNCF, a neutral party, then Microsoft could use the project and contribute to it without worrying.

What Is "Cloud Native"?

Now the CNCF cannot adopt any project under the sun. So there are a lot of conditions to get accepted. The first one we can guess from the name of the foundation. For a project to get accepted, it has to somehow be "cloud native". But what does "cloud native" mean? To understand this, we should go back in time.

In the initial stages of the Internet, it was quite common for one server to host many services. For example, we could have a website selling t-shirts, and everything needed to make the website work was installed on that one server. We had one service that received email for an address on our domain. Another service that let us send mail. Another service that hosted our website. Yet another service that hosted our database. And so on. It kind of worked at that point. There weren't many people online, and servers weren't under as much pressure as they are these days. But this had an inherent problem. It was basically a house of cards. If the server had a problem, we lost everything: no more emails sent or received, no database, no website; nothing worked anymore.

With time, people moved to another model where each server had its own isolated purpose. For example, one server only dealt with emails, another with the database, another hosted the website, and so on. But even that wasn't ideal. Sure, if we lost the email server, our website would still work. But we couldn't send or receive emails, and that's still not good for our business. Imagine people placing an order for a t-shirt and getting no email confirmation. They might think their order didn't register.

In the modern days of the Internet we have yet another problem. We need to be able to quickly scale up and down. Imagine we just launched an awesome service like Netflix. In January we had 5,000 clients. In February we had 20,000. In March we had 300,000. We are growing at an incredible rate. With the old model, where everything runs on one, two, or three servers, scaling up means moving everything to bigger, more powerful servers. Migrating everything takes time and might force us to temporarily pause our service: no more movies for our users. They wouldn't like that. But with the cloud, we don't need to work like that anymore.

This might sound weird, but today's cloud infrastructures, in a way, resemble how the human body works. We have billions and billions of cells. Each of them is very small and specialized in a single task. Each works independently. But all of those billions of independent cells make up a SINGLE human body. The cloud is similar. We launch and configure lots and lots of small pieces in the cloud. Each piece is independent. But when we interconnect all of them, we create a single organism, our cloud infrastructure. This is the infrastructure that lets us run our operation and offer a service to our customers.

Going further with our analogy, in the human body new cells are born all the time and old cells die all the time. But this goes unnoticed; the body still works normally. In the cloud, a similar thing happens. We don't have one big thing like a single server storing a database, as we did in the old model. Instead, we have a lot of small things storing the database or parts of it. They are like little cells that specialize in database operations. Since we now have hundreds or thousands of such cells, nothing bad happens if one or two of them die; many others remain functional. So our database keeps working even if some parts malfunction. Furthermore, we don't need to interrupt our service if our business grows. We don't need to migrate from one small server to one big server. We just add more cells to our infrastructure.

This is what we would call "cloud native" tools: small services, small components, or small pieces of software that we can launch independently, then interconnect to create a larger structure. So for a tool to have a chance to be adopted by the CNCF, it has to be "cloud native". In other words, it has to be able to work this way, so it can find a natural home running in the cloud.
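
To make the "cells" idea a bit more concrete, here is a minimal Python sketch. It is only an illustration: the `Replica` and `Service` classes are invented for this example and do not correspond to any real cloud API. The point is simply that a request can be answered by any healthy piece, and that growing means adding pieces.

```python
import random


class Replica:
    """One small, independent piece (a "cell") that can answer requests."""

    def __init__(self, name):
        self.name = name
        self.healthy = True

    def handle(self, request):
        if not self.healthy:
            raise RuntimeError(f"{self.name} is down")
        return f"{self.name} served {request}"


class Service:
    """A group of interchangeable replicas behind a single entry point."""

    def __init__(self, replica_count):
        self.replicas = [Replica(f"replica-{i}") for i in range(replica_count)]

    def scale_up(self, extra):
        # Growing means adding more small pieces, not migrating to a bigger server.
        start = len(self.replicas)
        self.replicas += [Replica(f"replica-{start + i}") for i in range(extra)]

    def handle(self, request):
        # Route the request to any replica that is still healthy.
        healthy = [r for r in self.replicas if r.healthy]
        if not healthy:
            raise RuntimeError("no healthy replicas left")
        return random.choice(healthy).handle(request)


service = Service(replica_count=3)
service.replicas[0].healthy = False     # one "cell" dies
print(service.handle("GET /movies"))    # still answered by another replica
service.scale_up(2)                     # traffic grows: add two more cells
print(len(service.replicas))            # 5
```

Real orchestrators such as Kubernetes do far more than this (health checks, scheduling, networking), but the basic shape is the same: many small replicas, none of them irreplaceable.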

CNCF Maturity Levels and Project Proposal Process

A project under the CNCF umbrella can be placed in one of three categories: Sandbox, Incubating, or Graduated. These are the so-called maturity levels. And it's easier to get a project to be accepted in Sandbox, a lot harder to get it accepted as Incubating and very hard to get it into the Graduated group.

Sandbox

Usually, if someone wants to donate their software tool or tools to the CNCF, they'll start by filling out a form asking for it to be included in the initial Sandbox level. The CNCF has a few members that are part of what is called the TOC, the Technical Oversight Committee. Among other things, they are responsible for deciding what can become part of the CNCF. In this case, TOC members take a look at the tool asking for inclusion, and if the majority votes "yes", the tool becomes part of the CNCF's Sandbox.

If a certain tool becomes part of the CNCF the tool's trademark and logos have to be donated to this organization. That's so the CNCF can legally use these when promoting these tools on their websites, blogs, at live events, and so on. Basically, a way to avoid legal trouble down the road.

For a software tool to be accepted into Sandbox it has to have these properties:

  • It's very good at solving a particular problem in this cloud-native space.
  • The tool is still very young though, still in its early experimental stages. Some people use it in the real-world, but not enough. It's still unproven, it is not yet guaranteed that this tool will work correctly in most situations. It needs more time to be tested by more and more people until we can be sure that it's reliable.
  • Since this tool is rather young, there might not be a lot of people contributing code, suggesting changes, improving it, working on it in one way or another. So it has some community around it, but not a very large one.

Incubating

Now let's see how a tool can become part of the Incubating category. Usually, a project that's been sitting in Sandbox can be upgraded to Incubating if it proves it has evolved and matured enough over time. But, interestingly, if a software tool is already going strong, with a lot of people and companies using it and a lot of code contributors, its owners can request that it join the CNCF's Incubating category directly, skipping Sandbox. Either way, getting into Incubating is much harder. Instead of filling out a simple form, the tool's owners have to write a technical document and send it to the CNCF's GitHub repository. In this document they must provide proof that their tool meets all the requirements of the Incubating category. This can include details such as community size, social channels, examples of companies using the tool, and so on.

Here, you can see what such documents look like: https://github.com/cncf/toc/tree/main/proposals/incubation

If a Technical Oversight Committee member is impressed, and has the time, they can step in and become what is called the "Incubation sponsor". They become a sort of a bridge, between the tool's owners and the CNCF organization, other TOC members included. They lead and guide the discussions between all people involved. They help other TOC members get a technical understanding about this tool and why it's a good candidate as an Incubating project. Finally, after a long process and technical analyses, the TOC members vote. For a project to be accepted as Incubating, two thirds of the members have to vote in favor.
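
The two-thirds rule is simple arithmetic, and a small Python sketch makes the boundary explicit. This is an illustration only: the function name `vote_passes` and the committee size of 9 are assumptions made for the example, and it assumes that exactly two thirds in favor is enough to pass.

```python
from fractions import Fraction


def vote_passes(members_in_favor, total_members, threshold=Fraction(2, 3)):
    """Return True when the share of members in favor reaches the threshold."""
    if total_members <= 0:
        raise ValueError("total_members must be positive")
    if not 0 <= members_in_favor <= total_members:
        raise ValueError("members_in_favor must be between 0 and total_members")
    # Fraction avoids floating-point rounding right at the two-thirds boundary.
    return Fraction(members_in_favor, total_members) >= threshold


# Hypothetical committee of 9 members (the size is an assumption for the example).
print(vote_passes(6, 9))  # True: 6/9 is exactly two thirds
print(vote_passes(5, 9))  # False: 5/9 is below two thirds
```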

To get a tool accepted in the Incubating category it needs to have these properties:

  • It solves a problem in the cloud-native space.
  • It has a lot more community around it. Enough developers contribute code. Enough people in the team can review that code to decide what should get accepted, adjusted slightly, or rejected.
  • There needs to be proof that this tool is used by at least three end users, and that it works reliably for them. But what is an end user? Someone that uses a cloud-native tool, but DOES NOT sell cloud-native services. For example, Apple can be considered an end user. They don't sell services based on cloud-native tools: they don't let you run containers, host websites or databases, run code, and so on. But Google Cloud or Amazon Web Services are NOT end users, since they use cloud-native tools but then resell cloud-native solutions to others. In other words, the use of those cloud-native tools does not end with them, so they are not end users.
  • For a tool to be used in the real world, security is very important. So to be accepted into Incubating, a project must clearly document how it handles security issues. It must have a publicly accessible document that tells users how to report security issues. And it must specify how it handles those issues and how it updates its code or releases new software versions to fix them.
  • They must have a clear versioning model. For example, they go from version 1.0.6 of their tool to version 1.0.7. It must be clearly documented how they make the decision to make such a version jump. Or how they go from 1.0.8 to 1.1.0, under what circumstances, what are the conditions, etc.
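
As an illustration of what a documented versioning model can look like, here is a minimal Python sketch. It assumes a semantic-versioning style rule set (fix, feature, breaking change); the CNCF criteria described above do not mandate this particular scheme, and the `bump` function and its change labels are invented for the example.

```python
def bump(version, change):
    """Return the next version string for a given kind of change.

    Assumed rules (semantic-versioning style, not a CNCF requirement):
      "fix"      -> increase the last number      (1.0.6 -> 1.0.7)
      "feature"  -> increase the middle number    (1.0.8 -> 1.1.0)
      "breaking" -> increase the first number     (1.4.2 -> 2.0.0)
    """
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "fix":
        return f"{major}.{minor}.{patch + 1}"
    if change == "feature":
        return f"{major}.{minor + 1}.0"
    if change == "breaking":
        return f"{major + 1}.0.0"
    raise ValueError(f"unknown change type: {change}")


print(bump("1.0.6", "fix"))      # 1.0.7
print(bump("1.0.8", "feature"))  # 1.1.0
```

Whatever rules a project picks, the requirement is that they are written down, so users know what to expect from each kind of version jump.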

Graduated

Finally, let's take a look at the highest maturity level, Graduated. For a tool to get to this stage, it has to currently be in the Incubating category. The project maintainers must submit a graduation proposal document to the CNCF's GitHub repository. In this document they should provide proof that the project meets all requirements for this category. Here are some examples of what these documents look like: https://github.com/cncf/toc/tree/main/proposals/graduation

If and when a TOC member steps in as the sponsor, discussions begin. After a long process and careful analysis, TOC members vote. Two thirds of the members need to vote in favor, for the tool to be promoted from Incubating to the Graduated category.

For example, here is Linkerd's graduation proposal: https://github.com/cncf/toc/blob/main/proposals/graduation/linkerd.md

These conditions must be met to get a tool accepted into the Graduated category:

  • As usual, the tool must solve a problem in the cloud-native space.
  • All conditions from the Incubating category must be met.
  • The project must have committers from at least two organizations. What's a committer? Well, imagine I send a new piece of code to your project. I can only suggest a change; I cannot modify your project myself. You see my suggestion, agree that it's a good one, and want to include it. When you accept the suggestion and it changes your code, the technical term is that you committed that change. Since you have the right to make commits, you are said to be a committer for this project.
  • The project has earned and maintained a so-called "Core Infrastructure Initiative Best Practices Badge". This badge is earned when the project follows a long list of best practices for an open source free software project. Some conditions to get this badge are pretty straightforward, like the project's website must list what this tool does or how people can contribute new code or code improvements. Other conditions are a bit harder to achieve, for example, one developer in the team must know how to design secure software. The list of conditions for this badge is pretty long: https://bestpractices.coreinfrastructure.org/en/criteria/0
  • The project needs to ask an independent company to do what is called a security audit, where they look for security vulnerabilities in that software tool. After the audit, all of the critical vulnerabilities need to be fixed.
  • A document on the project's website needs to detail the so-called project governance and committer process. For example, this document could explain all the steps required to take a code suggestion and actually commit the change into the project. Or it could explain how to become a maintainer and what the responsibilities are. In a nutshell, it explains how the project is governed, how things should be done if you want to contribute to this project.
  • Another document must list the organizations that adopted this project, otherwise said, the ones which are actively using it in their infrastructure.
  • Finally, for the project to move into this Graduated category that adopts only the most mature projects, the Technical Oversight Committee (TOC) must vote in favor of this. Two thirds of the votes must be in favor to pass.
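
The conditions above can be read as a checklist. The Python sketch below shows one way to express them; it is purely illustrative. The `ProjectStatus` fields and the `unmet_graduation_criteria` function are hypothetical names invented here, and the real evaluation is done by people on the TOC, not by a script. The judgment calls (solving a cloud-native problem, the TOC vote itself) are left out on purpose.

```python
from dataclasses import dataclass


@dataclass
class ProjectStatus:
    """Hypothetical record of where a project stands; field names are illustrative."""
    maturity_level: str
    committer_organizations: int
    has_best_practices_badge: bool
    audit_completed: bool
    open_critical_vulnerabilities: int
    governance_documented: bool
    adopters_listed: bool


def unmet_graduation_criteria(project):
    """Return descriptions of the listed conditions the project does not meet yet."""
    checks = {
        "currently Incubating": project.maturity_level == "Incubating",
        "committers from at least two organizations": project.committer_organizations >= 2,
        "best practices badge earned": project.has_best_practices_badge,
        "independent security audit completed": project.audit_completed,
        "critical vulnerabilities fixed": project.open_critical_vulnerabilities == 0,
        "governance and committer process documented": project.governance_documented,
        "adopters listed": project.adopters_listed,
    }
    return [name for name, met in checks.items() if not met]


example = ProjectStatus("Incubating", 1, True, True, 0, True, True)
print(unmet_graduation_criteria(example))
# ['committers from at least two organizations']
```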