Why automation is so helpful. Part 1: Initial Linux configuration using Ansible

Almost every experienced engineer uses automation tools on a weekly basis. It starts with scripting and can go on endlessly. Some of the most popular areas are configuration automation, continuous integration, and continuous deployment.

Today we will talk about configuration. Let’s say we need to create 20 VMs. They can serve almost any role: Kubernetes nodes, VPN servers, databases, or even an old on-premises web application that is not containerized but needs to be scaled right now. Regardless of its role, every OS should be initially configured to meet basic organizational needs: security, predictable configuration, and the ability to integrate with other systems.

As a newbie engineer, I used to do it manually by copy-pasting commands into every terminal window. After the fifth time, you’ll probably try to write a script that opens SSH connections in a loop and runs these commands. The next time the configuration standard changes, you’ll run this script on almost every server. 40 VMs will still be manageable, I guess. And the next time, you’ll need to install updates on 60 servers.

And then, as your infrastructure grows, you’ll meet a cybersecurity guy who needs to be sure that your infrastructure meets configuration standards and asks you to apply some new tweaks of his.

And here Ansible comes into play.

Instead of writing scripts for applying and checking configuration, you can write an Ansible playbook that covers both tasks at once, since checking is built-in functionality.
Another feature is the ability to upgrade a group of servers at the same time without linear logic (the fifth server moves on to the next task while the third is still applying updates).
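
Here is a minimal sketch of what such a playbook can look like; the inventory group linux_servers and the single task are illustrative, not taken from the repo:

# playbooks/base.yml - a minimal illustrative playbook
- name: Base configuration
  hosts: linux_servers      # hypothetical inventory group
  become: true              # run tasks with sudo
  strategy: free            # faster hosts move on without waiting for slower ones
  tasks:
    - name: Make sure vim is installed
      ansible.builtin.apt:
        name: vim
        state: present

Running the same playbook with --check reports what would change without changing anything, so one playbook covers both applying and auditing configuration.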

Let’s take a tour of basic Linux configuration steps after a fresh install.

Take a look at the code attachments, as they’re the main reason for describing the tasks below :slightly_smiling_face:.

Set repos, install updates and packages

Ansible Role link
(You need to add Ansible’s --ask-become-pass flag to use sudo on Ubuntu until you allow it without a password).

Linux-based OSes use repositories to download software. Repositories can be public (enabled by default) or private (for easier distribution and version control within organizations).

Ubuntu takes its repository list from the sources.list file, which contains archive.ubuntu.com as the default source. You can change it to a local mirror in your country to get the fastest response:

Ping statistics for archive.ubuntu.com:
    Minimum = 48ms, Maximum = 59ms, Average = 52ms
Ping statistics for ru.archive.ubuntu.com:
    Minimum = 13ms, Maximum = 23ms, Average = 16ms

To check whether there is a mirror in your country, and how frequently it syncs, look at this page:
You can automate choosing the fastest mirror with Ansible or install apt-transport-mirror.
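
A simple way to switch the mirror with Ansible is a regexp replace over sources.list; the ru. mirror below is just an example, pick one close to you:

- name: Point apt at a local country mirror (example mirror)
  ansible.builtin.replace:
    path: /etc/apt/sources.list
    regexp: 'http://archive\.ubuntu\.com'
    replace: 'http://ru.archive.ubuntu.com'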

To download information about available packages, run apt update; to install the newer package versions that are available, run apt upgrade. Use the -y flag in scripts. Another available option is full-upgrade, which can also remove outdated packages to resolve dependencies (only for old systems).
The next step is to install the packages that you use on a daily basis, like vim, htop, iotop:
apt install vim htop iotop.
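
The same steps in Ansible form, roughly (upgrade: safe mirrors apt upgrade; use full for full-upgrade behavior):

- name: Update the apt cache and upgrade packages
  ansible.builtin.apt:
    update_cache: true
    upgrade: safe

- name: Install daily-use packages
  ansible.builtin.apt:
    name:
      - vim
      - htop
      - iotop
    state: present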

Rocky Linux (an rpm-based system) stores its repos, grouped by category, in files under /etc/yum.repos.d/. Another difference is that its package manager, dnf, has a built-in feature for finding the fastest mirror. Add the fastestmirror=True parameter to /etc/dnf/dnf.conf to enable it.
A package upgrade can be accomplished with a single command, dnf upgrade - you don’t need to refresh the cache manually because dnf checks it on every installation command.
I’d also recommend installing the EPEL repo on Rocky: dnf install epel-release.
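
The Rocky side can be sketched the same way with the lineinfile and dnf modules:

- name: Enable dnf fastest-mirror selection
  ansible.builtin.lineinfile:
    path: /etc/dnf/dnf.conf
    line: fastestmirror=True

- name: Install the EPEL repo
  ansible.builtin.dnf:
    name: epel-release
    state: present

- name: Upgrade all packages (dnf upgrade)
  ansible.builtin.dnf:
    name: "*"
    state: latest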

I guess you’ve already noticed that Ubuntu uses http to access repos. After a package is downloaded, its signature is checked before installation. You’ll run into this later when adding a new repo. The public GPG key of every repo should also be added (like a root CA certificate in https) on both systems. Additionally, you can use https, but it’s not supported by every repo and can add a little latency to package downloads.

:warning: Be careful running upgrades on production servers, as they can cause service restarts or a VM reboot to start a new kernel. Major updates can crash your service due to configuration issues or feature deprecation.

Set timezone and enable NTP

Ansible Role link

By default we keep time in UTC (Coordinated Universal Time) and then adjust it to the local timezone. Protocols that validate certificates or tickets (like https or Kerberos) rely on UTC for those checks. So it’s important to use the correct timezone instead of just setting local time by your watch.
You can easily configure it with timedatectl set-timezone. Use timedatectl list-timezones to get the list of available timezones.
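
With Ansible this is one task, assuming the community.general collection is installed (the timezone value is an example):

- name: Set the timezone
  community.general.timezone:
    name: Europe/Moscow    # pick yours from timedatectl list-timezones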

Virtual machines take time from their hypervisor, but they cannot duplicate it exactly, and after a while your VM’s clock can fall behind actual global time. This is why we should enable an NTP (Network Time Protocol) client, which keeps the clock synchronized. In enterprise infrastructure, you’ll probably get time from a domain controller, which in turn takes it from a global public source.
systemd-timesyncd is enabled by default on the latest Ubuntu versions. You can check its status to be sure it’s working properly: systemctl status systemd-timesyncd and journalctl -u systemd-timesyncd --reverse. Custom NTP servers can be configured in /etc/systemd/timesyncd.conf.

Rocky has chronyd, which is also enabled by default. It loads its configuration from the /etc/chrony.conf file.
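
Pointing timesyncd at your own servers can be sketched like this; the ntp1/ntp2 host names are hypothetical, and /etc/chrony.conf on Rocky is edited analogously:

- name: Configure custom NTP servers for systemd-timesyncd
  ansible.builtin.lineinfile:
    path: /etc/systemd/timesyncd.conf
    regexp: '^#?NTP='
    line: NTP=ntp1.example.internal ntp2.example.internal

- name: Restart timesyncd to pick up the change
  ansible.builtin.systemd:
    name: systemd-timesyncd
    state: restarted
    enabled: true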

You may also be interested in configuring an NTP server with symmetric key authentication (not supported on Windows).

Add users

When you’re working in a team, other engineers should be able to run privileged commands on a server, to work together or in case you’re unavailable. To access the command line over SSH, you can use password or key authentication. Preferably you should use key-based auth, but why?

Almost every engineer has seen phishing messages that aim to steal your data (passwords, credit card info) or even make you pay for something that doesn’t exist. To protect against these attacks, we should check the website address we’re browsing. The next step up is the MITM (man-in-the-middle) attack: the attacker controls your network equipment and redirects queries to his phishing server. But nowadays almost every site uses https, and the browser will warn you about an insecure certificate (one issued by a non-trusted CA or for another web address). Check this link for more details about PKI (Public Key Infrastructure).

SSH works in a similar way, but its host keys are effectively self-signed, so you can’t check an issuing CA. When connecting for the first time, you’ll see the message below, after which the public key is saved as trusted for this address and any other key will be rejected.

ssh 192.168.120.128
The authenticity of host '192.168.120.128 (192.168.120.128)' can't be established.
ECDSA key fingerprint is SHA256:zF+uuW9hVXdfzs49ZwR5NuOrDxgfEAKwylkGjmcU7tU.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '192.168.120.128' (ECDSA) to the list of known hosts.
logout
Connection to 192.168.120.128 closed.

# And now I've changed cached public key in known_hosts file

ssh 192.168.120.128
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.

But what if you’re connecting for the first time from another device, or the attacker was there from the very first step? Your password hash can be stolen and brute-forced.

With key-based auth, your private key should never leave its storage. It’s used in a challenge-response with the public key, which lets the server verify that you have the private key without ever seeing it. Technically it’s possible to use a stolen challenge request as part of a MITM attack, so I would recommend using different keys for different tiers. The most secure way is to store servers’ public keys in DNS SSHFP (Secure Shell fingerprint) records, resolved through DNSSEC or DNS over https.

Ansible Role link

This command creates a new user with a home folder and sets the proper command interpreter:
useradd <username> --create-home --shell /bin/bash.

The next step is to push the user’s public key. By default, you can store it in the home directory at ~/.ssh/authorized_keys. Another way is to store keys in /etc/ssh/authorized_keys, or even better in a separate file per user, /etc/ssh/authorized_keys/%u (additional configuration required; you can combine it with the next step). Don’t forget to configure permissions on the keys file - it will be rejected if it is world-writable: chmod 600 ~/.ssh/authorized_keys.
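
Both steps map onto built-in modules. A sketch, assuming the public key ships alongside the playbook as files/alice.pub (the user name alice is made up):

- name: Create the user with a home directory and bash shell
  ansible.builtin.user:
    name: alice
    create_home: true
    shell: /bin/bash

- name: Push the public key to ~/.ssh/authorized_keys
  ansible.posix.authorized_key:
    user: alice
    key: "{{ lookup('file', 'files/alice.pub') }}"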

You shouldn’t allow root logins through SSH or locally. Log in with your personal account and run sudo instead.
Sudo permissions are granted by default to the admin or sudo groups in Ubuntu and to the wheel group in Rocky, but they require a password. You can configure this with visudo, which checks the syntax after editing. I’ve written the role with additional per-user configuration in /etc/sudoers.d/, in case you’re interested in configuring different sudo permissions for different users.
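
Dropping a per-user file into /etc/sudoers.d/ can look like this; the validate option gives you the same syntax check visudo does, and passwordless sudo is just an example policy:

- name: Allow passwordless sudo for a user (example policy)
  ansible.builtin.copy:
    dest: /etc/sudoers.d/alice
    content: "alice ALL=(ALL) NOPASSWD:ALL\n"
    mode: "0440"
    validate: visudo -cf %s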

Configure OpenSSH, Enable 2FA

Ansible Role link

By OpenSSH configuration I mean disabling features to reduce the attack surface. In other words, you should disable the features that you don’t use.
The configuration file is available at /etc/ssh/sshd_config on both systems. Check it for available options to decide which are useful in your environment. I’ll include my latest config in this Ansible role.
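
As a sketch, a few commonly disabled options can be managed like this; the values are examples, and validate makes sshd test the file before it is saved:

- name: Apply common sshd hardening options (example values)
  ansible.builtin.lineinfile:
    path: /etc/ssh/sshd_config
    regexp: '^#?{{ item.key }} '
    line: "{{ item.key }} {{ item.value }}"
    validate: sshd -t -f %s
  loop:
    - { key: PermitRootLogin, value: "no" }
    - { key: PasswordAuthentication, value: "no" }
    - { key: X11Forwarding, value: "no" }

Remember to restart sshd afterwards for the changes to take effect.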

Also, I would recommend enabling two-factor authentication, which should save you in case your credentials are stolen. Basically, it comes in two types: OTP and push.

  • The first is free but not that comfortable - you need to enter a 6-digit code every time you log in to the server. The secret key for generating codes should be saved only on your phone and on the endpoint server, but you’ll need to store it securely to place it on every server, or scan a different QR code every time, which is crazy. It can be good practice to use it only on jump/bastion servers that get you inside your cloud network perimeter. Also, you need accurate time synchronization on the server and your phone, because the digit code is regenerated every 30 or 60 seconds, based on the current time.
  • The second is much more convenient and allows you to configure security policies for devices and get detailed statistics. It sends you a push request with allow and deny buttons, but it works only when both the VM and the smartphone have internet access (SMS or calls are also available in special cases). I’ve added Duo support in this role because we used it for free (up to 9 engineers). There are also other solutions with great functionality and useful free plans.

Duo installation is described in its documentation. Let’s walk through the steps briefly:

  1. Add the repository and import its GPG key
  2. Install duo_unix
  3. Configure it in the /etc/duo/pam_duo.conf file. You need to set three parameters from the Admin console, plus its behavior in case the internet connection is lost. It is important to store the secret key in secure storage - I’ve added it to vars only for demo purposes. We used Vault to fetch these keys, identified by the integration key.
  4. Replace the classic PAM configuration for sshd in /etc/pam.d/sshd
  5. Enable the keyboard-interactive auth method in /etc/ssh/sshd_config
  6. Add a user for Ansible and restrict it to the publickey auth method in sshd_config, as shown below (Ansible doesn’t support interactive login; this is also helpful for running automated pipelines). We chose the admin username because it’s the default user in Ubuntu images on some public clouds.
  7. Add your users in the Admin Console and send them enrollment links. Don’t forget to allow 2FA bypass for the Ansible user
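
Step 6 boils down to a Match block at the end of sshd_config. A sketch; the admin username comes from the list above, the rest is standard sshd syntax:

- name: Require key + Duo for everyone, key only for the automation user
  ansible.builtin.blockinfile:
    path: /etc/ssh/sshd_config
    # assumes no earlier Match blocks; run sshd -t and restart sshd afterwards
    block: |
      AuthenticationMethods publickey,keyboard-interactive
      Match User admin
          AuthenticationMethods publickey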

It may look weird to enable 2FA while still having a private key for a user account that can bypass it. The point is that, once your infrastructure grows to business scale, you shouldn’t store a key with direct access to every server on your laptop. You’ll probably write pipelines that run long Ansible jobs across the whole server park.

Set FQDN (Fully Qualified Domain Name)

Using the correct hostname is important for an engineer because it’s the main way to identify the currently opened console (you can also use some bash tweaks to show other useful information in the command line). Use hostnamectl set-hostname <name> to configure the hostname.

Some applications (like Apache) use the FQDN for proper configuration.
By default, Ubuntu takes the domain name from the DHCP response, and sometimes the cloud provider can’t change this name at your request. You can hardcode the correct domain name in /etc/cloud/templates/hosts.debian.tmpl. Rocky’s /etc/hosts file is static, so you can just write it there.
As it’s an optional step for a specific case, I didn’t add a role for it.
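
If you do want it in a playbook, the built-in hostname module wraps hostnamectl; the name below is hypothetical:

- name: Set the hostname
  ansible.builtin.hostname:
    name: web01.example.internal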

Set message of the day

Another optional step, to customize the experience for yourself. MOTD runs scripts that print useful information when you log in.
Ubuntu’s messages are quite large, with unnecessary news, so you can delete them.
You can configure scripts for both systems in /etc/update-motd.d/, but Rocky will not execute them - it only prints their contents. The easiest solution is to start the script from the default profile settings.
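
For Rocky, that workaround can be sketched as a small profile.d hook; the zz-motd.sh name and approach are my own example, not taken from the role:

- name: Run update-motd.d scripts at login on Rocky
  ansible.builtin.copy:
    dest: /etc/profile.d/zz-motd.sh
    mode: "0644"
    content: |
      # execute the scripts that Ubuntu's pam_motd would run for us
      for s in /etc/update-motd.d/*; do [ -x "$s" ] && "$s"; done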

Ansible Role link

Add internal CA certificates

Nowadays we issue Let’s Encrypt certificates even for internal services - it guarantees that a service will work from BYOD (Bring Your Own Device) employees’ home laptops. But if you have internal services like monitoring and Active Directory, or even a cloud-managed database, there is a high probability you’ll need to access these services from Linux VMs.

On Ubuntu, copy the certificate file to /usr/local/share/ca-certificates and run update-ca-certificates --fresh to add it to the system cache. The --fresh flag recreates this cache. Certificates should be placed in separate files, not as a bundle.
Rocky stores additional certificates in /etc/pki/ca-trust/source/anchors. Run update-ca-trust after placing them.
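
The Ubuntu half as Ansible tasks, assuming the certificate ships with the role as files/internal-ca.crt; Rocky differs only in the destination path and the command:

- name: Copy the internal CA certificate
  ansible.builtin.copy:
    src: files/internal-ca.crt
    dest: /usr/local/share/ca-certificates/internal-ca.crt
    mode: "0644"

- name: Rebuild the certificate cache
  ansible.builtin.command: update-ca-certificates --fresh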

Ansible Role link

Configure firewall

A firewall is very important for a VM’s security. In a large enterprise network it gives your service an additional layer of protection. After deploying a new service, it also helps you understand the application’s network requirements, which can otherwise be forgotten over the years.

Ubuntu uses iptables, which is managed by ufw.

  • Some applications automatically add rules to /etc/ufw/applications.d/, so you can easily allow their default ports: ufw allow OpenSSH
  • or quickly add a classic rule: ufw allow 443/tcp.
  • Run ufw enable to start filtering.
  • ufw status verbose will give you an overview of the current default action and rules.

Rocky includes firewalld by default. It uses zones, which are assigned to interfaces.

  • Use firewall-cmd --get-zones to get the list
  • and firewall-cmd --zone=<zone name> --list-all to list rules (if you don’t specify a zone, public is shown).
  • You can run firewall-cmd --add-service=ssh
  • or firewall-cmd --add-port=60001/udp to create allow rules in the public zone (use --zone=<zone name> to specify another).
  • To apply new rules, run firewall-cmd --reload.
  • If you want to keep the changes after the next reboot, run firewall-cmd --runtime-to-permanent.
  • Run systemctl enable --now firewalld to start it and enable it at system startup.

Also, pay attention when removing rules: you need to set the absent state in the playbook, because simply deleting the task will not remove the rule.
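
A sketch of both sides with the community modules; note how removal is expressed as an explicit state rather than a deleted task (the port and service are examples):

- name: Allow https with ufw (Ubuntu)
  community.general.ufw:
    rule: allow
    port: "443"
    proto: tcp

- name: Allow ssh with firewalld (Rocky); set state=disabled to remove it
  ansible.posix.firewalld:
    service: ssh
    permanent: true
    immediate: true
    state: enabled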

Ansible Role link

What’s next?

Now that you have a playbook with the prepared base roles, you can run ansible-playbook playbooks/base.yml -v to apply it. To validate configuration compliance, just add the --check option.
Clone the repo and adapt it for your study environment - I’ve checked it all before pushing :slightly_smiling_face:

In conclusion, using Ansible to automate your configuration saves time and makes it easier to ensure that all servers are configured properly to your current standards. You can easily apply and check the configuration, even across multiple groups of servers. This is especially helpful as your infrastructure grows and you need to ensure that everything is secure and up to date.

Feel free to reach me in the KodeKloud community Slack or write comments here.
I hope you enjoyed it, and I’d like to thank you for spending your time with me.
