What is Puppet in DevOps?

Let’s talk about software deployment.

Before we can deploy an application and make it available to its users, we must complete two steps:

First, we provision the infrastructure, which is the set of hardware and software components that support the application’s development, testing, and deployment. Provisioning means setting up servers, network equipment, and other infrastructure.

Second, we configure the infrastructure, which involves customizing the provisioned resources. Some example tasks include:

  • Installing software packages on a server
  • Updating to a specific Linux distribution
  • Setting up logging
  • Creating database configuration files

Configuring a handful of servers can be done manually or by using a script, without much trouble. But what if we have a complex infrastructure set up with hundreds or even thousands of servers?

In such a scenario, manual configuration is time-consuming and, more often than not, leads to errors that are costly and difficult to troubleshoot. Over time, human error also makes configurations across servers inconsistent, a problem known as “configuration drift”.

To avoid such issues, we must adopt automation, in line with the DevOps practice of “Automate Everything”. That includes even the infrastructure that applications run on. This is where Puppet comes into the picture.

What is Puppet?

Puppet is an open-source configuration management tool used to manage and automate the configuration of servers at scale. Some examples of daily tasks that Puppet can automate include:

  • Installing software
  • Applying security patches
  • Modifying database settings

In Puppet, we use Puppet-specific code to write configuration files known as manifests, in which we declare how we want our infrastructure to be configured. For example, the following Puppet code ensures that Nginx is installed on our server(s).

package { 'nginx':
  ensure => installed,
}
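
As a slightly fuller sketch, the same manifest could also declare that the Nginx service should be running. This is a standalone version of the example above, not an addition to it, and it assumes the service is also named nginx, which is common but not guaranteed on every platform.

```puppet
# Sketch only: the nginx package example extended with a service resource.
package { 'nginx':
  ensure => installed,
}

# Keep the nginx service running and enabled at boot.
# 'require' makes Puppet install the package before managing the service.
service { 'nginx':
  ensure  => running,
  enable  => true,
  require => Package['nginx'],
}
```

Saved under a hypothetical file name such as nginx.pp, a manifest like this can be tried on a single machine with `puppet apply nginx.pp`.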

When we define and manage infrastructure through configuration files, we rely on a core principle of DevOps: Infrastructure as Code.

Infrastructure as Code

The core idea behind infrastructure as code (henceforth referred to as “IaC”) is that we manage infrastructure configuration through code (configuration files) rather than through manual processes.

These configuration files are then stored and tracked in a version control system (such as Git, with repositories hosted on a platform like GitHub). This way, the entire history of the infrastructure is captured in the commit log, which becomes a powerful tool for debugging. Anytime a problem pops up, we can check the commit log to find out what changed in our infrastructure, and we might resolve the problem simply by rolling back to a previous version until a fix is implemented.

As we can see, IaC brings many benefits:

  • Improved productivity: System administrators and operators no longer have to carry out manual configuration.
  • Improved reliability: As infrastructure configuration information is stored in configuration files, there is less chance of human error.

Now that we understand what Puppet is and what problems it solves, let’s understand how Puppet works.

How does Puppet work?

Puppet uses an agent/server model to configure systems. The agent is referred to as the Puppet Agent, and the server is referred to as the Puppet Server.

Puppet Agent needs to be installed on each system we want to manage/configure with Puppet. Each agent is responsible for:

  • Connecting securely to the Puppet Server to get the series of instructions in a file referred to as the Catalog File.
  • Performing operations from the Catalog File to get to the desired state.
  • Sending back the status to the Puppet Server.

The Puppet Server is responsible for:

  • Compiling the Catalog File for each host. The Puppet Server processes the manifest files (Puppet programs that describe how the systems running the Puppet Agent should be configured) together with information about the host, and prepares a Catalog File tailored to the target platform.
  • Sending the Catalog File to Agents when they query the Server.
  • Storing information about the entire environment, such as host information and metadata like authentication keys.
  • Gathering reports from each Agent and then preparing the overall report.

When using this agent/server model, the agent connects to the server and sends a set of facts that describe the machine it runs on, such as its operating system and hostname. The server processes this information, generates the list of rules that need to be applied to that machine, and sends this list back to the agent. The agent is then in charge of making any necessary changes to the machine.
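
To make the role of facts concrete, here is a minimal sketch of a manifest whose result depends on what the agent reports. The hostname web01.example.com is a made-up example, and the choice of web server packages is only an illustration. The `os.family` fact itself is a standard fact reported by the agent.

```puppet
# Sketch only: hypothetical node name, illustrative package choices.
node 'web01.example.com' {
  # Pick a package name based on the OS family fact sent by the agent.
  case $facts['os']['family'] {
    'Debian': { $web_package = 'apache2' }
    'RedHat': { $web_package = 'httpd' }
    default:  { fail("Unsupported OS family: ${facts['os']['family']}") }
  }

  # The compiled catalog for this node contains one concrete package name.
  package { $web_package:
    ensure => installed,
  }
}
```

The same manifest therefore compiles into different catalogs for a Debian-family host and a Red Hat-family host, which is what preparing the catalog "based on the target platform" amounts to in practice.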

Puppet Design Philosophy

Puppet has three important characteristics:

1. Declarative

Puppet takes a declarative approach to configuration files that describe the desired state of infrastructure. Puppet then configures the infrastructure based on this defined state.

The Puppet Domain Specific Language (DSL) is a declarative language. In a declarative language, we declare the state we want to achieve rather than the steps to get there. With the Puppet DSL, we describe the desired state of our systems, and Puppet takes responsibility for making sure that each system conforms to this desired state.
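
A short sketch of what "declaring the state" looks like in practice. The user name deploy and the path /opt/app are hypothetical and used only for illustration.

```puppet
# Desired state: a 'deploy' user exists with a bash shell and a home directory.
user { 'deploy':
  ensure     => present,
  shell      => '/bin/bash',
  managehome => true,
}

# Desired state: a directory owned by that user, with fixed permissions.
file { '/opt/app':
  ensure => directory,
  owner  => 'deploy',
  mode   => '0755',
}
```

Nothing here says how to create the user or the directory, for example which commands to run. The manifest only states what should be true, and Puppet works out the steps for the platform it runs on.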

2. Idempotent

Puppet as a language is designed to be inherently idempotent. An idempotent action can be performed over and over again and always leaves the system in the same state. If the first run was successful, reapplying the same action a second or third time won't change the system; there will be no unintended side effects.

Furthermore, if a script is idempotent, it can fail halfway through its task and be run again without problematic consequences. For example, if for some reason Puppet fails part way through a configuration run, re-invoking Puppet will complete the run and repair any configurations that were left in an inconsistent state by the previous run.

Most Puppet resources provide idempotent actions and we can rest assured that two runs of the same set of rules will lead to the same end result.
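
The word "most" matters here. The sketch below contrasts a built-in resource with the `exec` resource, which runs an arbitrary command and is only idempotent if we guard it ourselves. The script and marker file paths are hypothetical.

```puppet
# Built-in resource types check the current state before acting, so
# applying this twice changes nothing the second time.
file { '/etc/motd':
  ensure  => file,
  content => "Managed by Puppet\n",
}

# 'exec' runs an arbitrary command, so idempotence is up to the author.
# The 'creates' guard skips the command once the marker file exists.
# Hypothetical script and marker path, for illustration only.
exec { 'initialise-app-data':
  command => '/usr/local/bin/init-app-data',
  creates => '/var/lib/app/.initialised',
}
```

This sketch assumes the script itself creates the marker file when it succeeds. Without the `creates` guard, the command would run again on every Puppet run.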

3. Stateless

Puppet’s agent/server API is stateless. This means that there is no state being kept between runs of the agent.

Each Puppet run is independent of the previous one and the next one. Each time the Puppet Agent runs, it collects the current facts. The Puppet Server then generates the rules based only on those facts, and the agent applies them as necessary.

This stateless model has several advantages:

  • There is no need to synchronize data or resolve conflicts between Puppet Servers. This allows Puppet Servers to scale horizontally, which means that we can add more server instances to share the load.
  • Catalogs can be compared and cached locally, so that agents and servers don’t need to exchange data about the current state all the time, reducing network traffic and server load.

Open source Puppet vs Puppet Enterprise

Puppet comes in two flavors: open-source Puppet and Puppet Enterprise.

Open-source Puppet is great for individuals managing a small set of servers. Puppet Enterprise is the commercial version. It builds on the core open-source projects, adding a set of capabilities for managing complex workflows and automating enterprise-scale infrastructure.

Final Thoughts

Puppet is a cross-platform tool that has been around for a long time. It is mature and, more importantly, it is one of the most popular infrastructure automation tools in the industry today. This makes it a must-have tool in a DevOps engineer’s toolkit. You can start your Puppet journey with an easy-to-understand course: Puppet for the Absolute Beginners.