Linux - Create and Manage Hard Links

The Linux file system is a hierarchical structure that organizes files and directories on a computer. It starts with the root directory, which is the top-level directory in the file system. From there, directories and subdirectories can be created to organize files into groups. Each file and directory on the file system is represented by an inode, which contains information about the file's ownership, permissions, and location on the disk. One of the features that Linux uses to manage data efficiently is hard links.

This blog post explains what hard links are and how to create and manage them in Linux.

Linux File System

To understand hard links and soft links, we first have to learn some very basic things about filesystems.

Let’s imagine a Linux computer is shared by two users: alex and jane. Alex logs in with his own username and password, and Jane logs in with hers. This lets them use the same computer but have different desktops, different program settings, and so on. Now Alex takes a picture of the family dog and saves it to /home/alex/Pictures/family_dog.jpg.

Let’s simulate a file like this.

echo "Picture of Milo the dog" > Pictures/family_dog.jpg

With this, we created a file at Pictures/family_dog.jpg and stored the text “Picture of Milo the dog” inside. There’s a command on Linux that lets us see some interesting things about files and directories.

stat Pictures/family_dog.jpg

We’ll notice an Inode number. What is this?

Filesystems like xfs, ext4, and others keep track of data with the help of inodes. Our picture might have blocks of data scattered all over the disk, but the inode remembers where all the pieces are stored. It also keeps track of metadata: things like permissions, when this data was last modified, last accessed, and so on. But it would be pretty inconvenient to tell your computer, “Hey, show me inode 52946177”. So we work with filenames instead - the one called family_dog.jpg in this case. The filename points to the inode, and the inode points to all the blocks of data that we require.
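As a minimal sketch of how to look up the inode behind a filename, the commands below create the sample file in a throwaway directory and read its inode number in two ways. The sketch assumes GNU coreutils (the `stat -c` format option differs on other systems), and the `workdir` variable is only an illustrative name.

```shell
# Work in a throwaway directory so no real files are touched
workdir=$(mktemp -d)
mkdir "$workdir/Pictures"
echo "Picture of Milo the dog" > "$workdir/Pictures/family_dog.jpg"

# %i is the inode number in GNU stat's format syntax
inode_from_stat=$(stat -c '%i' "$workdir/Pictures/family_dog.jpg")

# ls -i prints the inode number in front of the filename
inode_from_ls=$(ls -i "$workdir/Pictures/family_dog.jpg" | awk '{print $1}')

echo "stat reports inode $inode_from_stat, ls -i reports inode $inode_from_ls"

# Clean up the throwaway directory
rm -rf "$workdir"
```

Both commands report the same number, because they read it from the same inode.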

And we finally get to what interests us here.

A hard link is a powerful feature of the Linux file system that lets several filenames refer to a single file. Unlike a symbolic link, which only stores the path of another filename, a hard link points directly to the file's inode.

Back to our example: the output of our stat command includes a Links field, and it already shows 1. There’s already one link to our inode? Yes, there is. When we create a file, something like this happens:

We tell Linux, “Hey, save this data under this filename: family_dog.jpg.”

Linux says: “OK, I will group all this file’s data under inode 51221169. Data blocks and inode created. I will hard link the filename ‘family_dog.jpg’ to inode 51221169.”

Now when we want to read the file:

“Hey Linux, give me data for family_dog.jpg file”

“Ok, let me see what inode this links to. Here’s all data you requested for inode 51221169”

family_dog.jpg -> Inode 51221169
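The mapping above, including the link count of a freshly created file, can be checked with a short sketch. It assumes GNU stat, and the temporary directory is only a stand-in for Alex's Pictures folder.

```shell
workdir=$(mktemp -d)
echo "Picture of Milo the dog" > "$workdir/family_dog.jpg"

# %i is the inode number, %h is the number of hard links to that inode
inode=$(stat -c '%i' "$workdir/family_dog.jpg")
links=$(stat -c '%h' "$workdir/family_dog.jpg")

echo "family_dog.jpg -> inode $inode (links: $links)"

rm -rf "$workdir"
```

A new regular file starts with a link count of 1: the filename it was created under.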

Easy to understand. But why would we need more than one hard link for this data?

Well, Jane has her own folder of pictures, at /home/jane/Pictures. How could Alex share this picture with Jane? The easy answer is to copy /home/alex/Pictures/family_dog.jpg to /home/jane/Pictures/family_dog.jpg. No problem, right? But now imagine we have to do this for 5000 pictures that take up 20GB in total. We would have to store those 20GB twice. Why use 40GB of disk space when we could use just 20GB? So how can we do that?

Instead of copying /home/alex/Pictures/family_dog.jpg to /home/jane/Pictures/family_dog.jpg, we could hard link it to /home/jane/Pictures/family_dog.jpg.

The syntax of the command is:

ln path_to_target_file path_to_link_file

The target_file is the existing file you want to link to. The link_file is simply the name of the new hard link we create. Technically, the hard link created at the destination is a filename like any other. The only special thing about it is that instead of pointing to a new inode, it points to the same inode as the target_file.

In our imaginary scenario, we would use a command like:

ln /home/alex/Pictures/family_dog.jpg /home/jane/Pictures/family_dog.jpg

Or, if we’re already inside the /home/alex directory (that’s our current/working directory) we can use a relative path to our target file:

ln Pictures/family_dog.jpg /home/jane/Pictures/family_dog.jpg

Our picture is only stored once, but the same data can be accessed at different locations using different filenames. This is the first benefit of hard links.
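To make the difference between copying and hard linking concrete, here is a small sketch that does both and compares inode numbers. It assumes GNU stat; the `alex` and `jane` directories live under a temporary directory rather than /home, and `copy.jpg` is a hypothetical name used only for the comparison.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/alex/Pictures" "$workdir/jane/Pictures"
echo "Picture of Milo the dog" > "$workdir/alex/Pictures/family_dog.jpg"

# A copy gets its own inode and its own data blocks
cp "$workdir/alex/Pictures/family_dog.jpg" "$workdir/jane/Pictures/copy.jpg"

# A hard link is a second name for the existing inode
ln "$workdir/alex/Pictures/family_dog.jpg" "$workdir/jane/Pictures/family_dog.jpg"

original_inode=$(stat -c '%i' "$workdir/alex/Pictures/family_dog.jpg")
copy_inode=$(stat -c '%i' "$workdir/jane/Pictures/copy.jpg")
link_inode=$(stat -c '%i' "$workdir/jane/Pictures/family_dog.jpg")
link_count=$(stat -c '%h' "$workdir/alex/Pictures/family_dog.jpg")

echo "original: $original_inode  copy: $copy_inode  hard link: $link_inode"
echo "names pointing at the original inode: $link_count"

rm -rf "$workdir"
```

The hard link reports the same inode number as the original, while the copy reports a different one.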

Another beautiful thing about hard links is this: Alex and Jane share the same 5000 pictures through hard links. But maybe Alex decides to delete his hard link, /home/alex/Pictures/family_dog.jpg. What will happen to Jane’s picture? Nothing, she’ll still have access to that data. Why? Because the inode still has 1 hard link to it (it had 2, now it has 1). But if Jane also decides to delete her hard link, /home/jane/Pictures/family_dog.jpg, the inode will have 0 links to it. When there are 0 links (and no running program still has the file open), the filesystem frees the inode and its data blocks.

The beauty of this approach is that people who share hard links can freely delete what they want, without having a negative impact on other users who still need that data. But once everyone deletes their hard links to that data, the space it occupied is released. So data is “intelligently removed” only when EVERYONE involved decides they don’t need it anymore.
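The deletion behavior described above can be observed with a short sketch. It assumes GNU stat, and the filenames `alex_dog.jpg` and `jane_dog.jpg` are illustrative stand-ins for the two users' links.

```shell
workdir=$(mktemp -d)
echo "Picture of Milo the dog" > "$workdir/alex_dog.jpg"
ln "$workdir/alex_dog.jpg" "$workdir/jane_dog.jpg"

# Two names point at the inode
links_before=$(stat -c '%h' "$workdir/jane_dog.jpg")

# Alex deletes his name; only the directory entry goes away
rm "$workdir/alex_dog.jpg"

# Jane's name still leads to the same data
links_after=$(stat -c '%h' "$workdir/jane_dog.jpg")
content=$(cat "$workdir/jane_dog.jpg")

echo "links before: $links_before, after: $links_after"
echo "Jane still reads: $content"

rm -rf "$workdir"
```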

Below are the limitations of using hard links:

  • You can only hard link to files, not directories.
  • You can only hard link to files on the same filesystem. If you had an external drive mounted at /mnt/Backups, you would not be able to hard link a file on your SSD, such as /home/alex/file, to a new name under /mnt/Backups, since that’s a different filesystem.
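Both limitations can be probed from the shell. The sketch below assumes GNU coreutils; it tries to hard link a directory and records whether ln refuses, then compares device numbers as a rough pre-check for the same-filesystem rule. The variable names are illustrative, and matching device numbers are only a hint, not a guarantee that ln will succeed.

```shell
workdir=$(mktemp -d)
mkdir "$workdir/Pictures"
echo "some data" > "$workdir/file"

# Limitation 1: ln refuses to hard link a directory
if ln "$workdir/Pictures" "$workdir/Pictures_link" 2>/dev/null; then
    dir_link="allowed"
else
    dir_link="refused"
fi
echo "hard link to a directory: $dir_link"

# Limitation 2: both paths must be on the same filesystem.
# %d is the device number; differing numbers mean different filesystems.
src_dev=$(stat -c '%d' "$workdir/file")
dst_dev=$(stat -c '%d' "$workdir/Pictures")
if [ "$src_dev" = "$dst_dev" ]; then
    same_fs="yes"
else
    same_fs="no"
fi
echo "source and destination on the same filesystem: $same_fs"

rm -rf "$workdir"
```

In this sketch both paths sit in the same temporary directory, so the device numbers match. Comparing a path under /home with one under a separately mounted drive would show different numbers.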

To learn more about Linux, check out our hands-on Linux course.

Learning Linux Basics Course & Labs | KodeKloud

Best Practices When Hard Linking

When working with hard links in Linux, there are a few best practices that you should keep in mind to ensure that you are using them effectively and safely.

First, it's important to understand that when you create a hard link, you are creating a second name for the same file, not a second copy. This means that any changes you make to the file's contents using one filename will be visible through every other linked filename. As a result, be careful when editing hard-linked files to avoid accidentally changing data that someone else relies on. Note that this applies to in-place modifications: a program that saves by writing a new file and renaming it over the old name (as sed -i does) gives that name a new inode and silently breaks the link.
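A quick way to see that an edit made through one filename shows up under the other is sketched below. The filenames are illustrative, and the edit is a plain append, which modifies the file in place.

```shell
workdir=$(mktemp -d)
echo "Picture of Milo the dog" > "$workdir/alex_dog.jpg"
ln "$workdir/alex_dog.jpg" "$workdir/jane_dog.jpg"

# Append a line through Alex's filename (an in-place modification)
echo "Edited by Alex" >> "$workdir/alex_dog.jpg"

# Read the last line through Jane's filename
last_line=$(tail -n 1 "$workdir/jane_dog.jpg")
echo "Jane sees: $last_line"

rm -rf "$workdir"
```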

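Second, make sure that all users who need a hard-linked file have the necessary permissions. Ownership and permissions are stored in the inode, not in the filename, so every hard link to a file shares the same owner, group, and mode. Giving Jane access to Alex's picture may therefore require adding both users to the same group and changing the group permissions on the file. A permission change made through one filename applies to all of them.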
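Because permissions belong to the inode, a chmod through one filename is visible through the other. The sketch below assumes GNU stat (`%a` prints the mode in octal); the mode 640 and the filenames are arbitrary choices for illustration.

```shell
workdir=$(mktemp -d)
echo "Picture of Milo the dog" > "$workdir/alex_dog.jpg"
ln "$workdir/alex_dog.jpg" "$workdir/jane_dog.jpg"

# Change permissions through Alex's filename
chmod 640 "$workdir/alex_dog.jpg"

# Read the mode back through Jane's filename
mode_via_jane=$(stat -c '%a' "$workdir/jane_dog.jpg")
echo "mode seen through Jane's link: $mode_via_jane"

rm -rf "$workdir"
```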

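Finally, keep track of which filenames share an inode. Hard links to regular files cannot create loops in the directory tree, because Linux does not allow hard links to directories, and linking across directories on the same filesystem is perfectly fine (that is exactly what Alex and Jane do above). What is easy to lose is the overview: nothing in a plain directory listing marks a file as one of several links. The link count in the stat output tells you how many names an inode has, and find's -samefile test can locate them.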
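One way to list every filename that shares an inode is sketched here, assuming GNU find and its `-samefile` test. The directory layout and the unrelated file `other.jpg` are made up for the example.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/alex" "$workdir/jane"
echo "Picture of Milo the dog" > "$workdir/alex/family_dog.jpg"
ln "$workdir/alex/family_dog.jpg" "$workdir/jane/family_dog.jpg"
echo "unrelated" > "$workdir/jane/other.jpg"

# List every path under workdir that refers to the same inode
names=$(find "$workdir" -samefile "$workdir/alex/family_dog.jpg" | sort)
name_count=$(printf '%s\n' "$names" | wc -l)

echo "filenames sharing the inode:"
printf '%s\n' "$names"

rm -rf "$workdir"
```

The search only covers the directory tree it is given, so links that live elsewhere on the same filesystem will not appear unless the search starts high enough.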
