The Linux file system is a hierarchical structure that organizes files and directories on a computer. It starts with the root directory, which is the top-level directory in the file system. From there, directories and subdirectories can be created to organize files into groups. Each file and directory on the file system is represented by an inode, which contains information about the file's ownership, permissions, and location on the disk. One of the features that Linux uses to manage data efficiently is hard links.
This blog will explore what hard links are and how to create and manage them in Linux.
To understand hard links and soft links, we first have to learn some very basic things about filesystems.
Let’s imagine a Linux computer is shared by two users: alex and jane. Alex logs in with his own username and password, and Jane logs in with hers. This lets them use the same computer but have different desktops, different program settings, and so on. Now Alex takes a picture of the family dog and saves it into /home/alex/Pictures/family_dog.jpg.
Let’s simulate a file like this.
echo "Picture of Milo the dog" > Pictures/family_dog.jpg
With this, we created a file at Pictures/family_dog.jpg and stored the text “Picture of Milo the dog” inside. The stat command on Linux lets us see some interesting things about files and directories, for example when we run stat Pictures/family_dog.jpg.
We’ll notice an Inode number. What is this?
Filesystems like xfs, ext4, and others keep track of data with the help of inodes. Our picture might have blocks of data scattered all over the disk, but the inode remembers where all the pieces are stored. It also keeps track of metadata: things like permissions, when this data was last modified, last accessed, and so on. But it would be pretty inconvenient to tell your computer, “Hey, show me inode 51221169”. So we work with filenames instead: family_dog.jpg in this case. The filename points to the inode, and the inode points to all the blocks of data that we require.
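To make the filename-to-inode relationship concrete, here is a minimal sketch. It assumes GNU coreutils (the -c format option of stat) and works in a temporary directory created with mktemp, which is not part of the original scenario. The inode number printed on your system will differ from the one used in this article.

```shell
# Throwaway directory so the example does not touch real files (illustrative only).
workdir=$(mktemp -d)
mkdir -p "$workdir/Pictures"
echo "Picture of Milo the dog" > "$workdir/Pictures/family_dog.jpg"

# ls -i prints the inode number in front of the filename.
ls -i "$workdir/Pictures/family_dog.jpg"

# GNU stat: %i is the inode number the filename points to.
inode=$(stat -c '%i' "$workdir/Pictures/family_dog.jpg")
echo "family_dog.jpg -> inode $inode"

# Clean up the temporary directory.
rm -r "$workdir"
```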
And we finally get to what interests us here.
What is a Hard Link in Linux?
A hard link is a powerful feature of the Linux file system that lets us create multiple names for a single file. Unlike a symbolic link, which is a small separate file that stores the path of another file, a hard link is a directory entry that points directly to the file’s inode.
Back to our example: the output of our stat command shows a Links count of 1. There’s already one link to our inode? Yes, there is. When we create a file, something like this happens:
We tell Linux, “Hey, save this data under this filename: family_dog.jpg.”
Linux says: “Ok, I will group all this file’s data under inode 51221169. Data blocks and inode created. I will hard link the filename family_dog.jpg to inode 51221169.”
Now when we want to read the file:
“Hey Linux, give me the data for the family_dog.jpg file.”
“Ok, let me see what inode this links to. Here’s all the data you requested for inode 51221169.”
family_dog.jpg -> Inode 51221169
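The “one link already exists” point can be checked with a short sketch, again assuming GNU stat and a temporary directory that is only there for illustration:

```shell
# Create a fresh file in a throwaway directory (illustrative only).
workdir=$(mktemp -d)
echo "Picture of Milo the dog" > "$workdir/family_dog.jpg"

# GNU stat: %h is the number of hard links pointing to the file's inode.
# A newly created regular file starts with exactly one: its own filename.
links=$(stat -c '%h' "$workdir/family_dog.jpg")
echo "hard links: $links"   # prints "hard links: 1"

rm -r "$workdir"
```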
Easy to understand. But why would we need more than one hard link for this data?
How to Create a Hard Link in Linux
Well, Jane has her own folder of pictures, at /home/jane/Pictures. How could Alex share this picture with Jane? The easy answer: just copy /home/alex/Pictures/family_dog.jpg to /home/jane/Pictures/family_dog.jpg. No problem, right? But now imagine we have to do this for 5000 pictures that take up 20GB. We would have to store those 20GB of data twice. Why use 40GB of disk space when we could use just 20GB? So how can we do that?
Instead of copying /home/alex/Pictures/family_dog.jpg to /home/jane/Pictures/family_dog.jpg, we could hard link it to /home/jane/Pictures/family_dog.jpg.
The syntax of the command is:
ln path_to_target_file path_to_link_file
The target_file is the file you want to link with. The link_file is simply the name of this new hard link we create. Technically, the hard link created at the destination is a file like any other. The only special thing about it is that instead of pointing to a new inode, it points to the same inode as the target_file.
In our imaginary scenario, we would use a command like:
ln /home/alex/Pictures/family_dog.jpg /home/jane/Pictures/family_dog.jpg
Or, if we’re already inside the /home/alex directory (that’s our current/working directory) we can use a relative path to our target file:
ln Pictures/family_dog.jpg /home/jane/Pictures/family_dog.jpg
Now our picture is only stored once, but the same data can be accessed at different locations using different filenames.
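As a rough check that both names really share one inode, the scenario can be replayed in a temporary directory. The alex and jane folders below are stand-ins created under mktemp, not the real home directories, and the stat format options assume GNU coreutils.

```shell
# Stand-in folders for the two users (illustrative only).
workdir=$(mktemp -d)
mkdir -p "$workdir/alex/Pictures" "$workdir/jane/Pictures"
echo "Picture of Milo the dog" > "$workdir/alex/Pictures/family_dog.jpg"

# Create the hard link: ln path_to_target_file path_to_link_file
ln "$workdir/alex/Pictures/family_dog.jpg" "$workdir/jane/Pictures/family_dog.jpg"

# Both filenames should report the same inode number (%i) ...
inode_alex=$(stat -c '%i' "$workdir/alex/Pictures/family_dog.jpg")
inode_jane=$(stat -c '%i' "$workdir/jane/Pictures/family_dog.jpg")
# ... and the link count (%h) should now be 2.
links=$(stat -c '%h' "$workdir/jane/Pictures/family_dog.jpg")
echo "alex: inode $inode_alex, jane: inode $inode_jane, links: $links"

# Reading through Jane's name returns the data written through Alex's name.
jane_content=$(cat "$workdir/jane/Pictures/family_dog.jpg")
echo "$jane_content"

rm -r "$workdir"
```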
Benefits of Using Hard Links
First, hard links allow for data stored once to be accessed at different locations using different filenames.
Another beautiful thing about hard links is this: Alex and Jane share the same 5000 pictures through hard links. But maybe Alex decides to delete his hard link at /home/alex/Pictures/family_dog.jpg. What will happen with Jane’s picture? Nothing, she’ll still have access to that data. Why? Because the inode still has 1 hard link to it (it had 2, now it has 1). But if Jane also decides to delete her hard link at /home/jane/Pictures/family_dog.jpg, the inode will have 0 links to it. When there are 0 links, and no running program still has the file open, the filesystem frees the inode and its data blocks so the space can be reused.
The beauty of this approach is that people who share hard links can freely delete what they want, without having a negative impact on other users who still need that data. But once everyone deletes their hard links to that data, the space it occupied is released. So data is “intelligently removed” only when EVERYONE involved decides they don’t need it anymore.
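The deletion behavior can be sketched the same way. This assumes GNU stat and uses stand-in alex and jane folders inside a temporary directory rather than real home directories.

```shell
# Stand-in folders for the two users (illustrative only).
workdir=$(mktemp -d)
mkdir -p "$workdir/alex" "$workdir/jane"
echo "Picture of Milo the dog" > "$workdir/alex/family_dog.jpg"
ln "$workdir/alex/family_dog.jpg" "$workdir/jane/family_dog.jpg"

# Alex deletes his name for the data.
rm "$workdir/alex/family_dog.jpg"

# Jane's name still works, and the link count has dropped from 2 to 1.
links_after=$(stat -c '%h' "$workdir/jane/family_dog.jpg")
content=$(cat "$workdir/jane/family_dog.jpg")
echo "links left: $links_after, content: $content"

# Once Jane deletes her name too, no filename refers to the inode anymore
# and the filesystem can reclaim the space.
rm "$workdir/jane/family_dog.jpg"

rm -r "$workdir"
```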
Limitations of Hard Links
Below are the limitations of using hard links:
- You can only hard link to files, not directories.
- You can only hard link to files on the same filesystem. If you had an external drive mounted at /mnt/Backups, you would not be able to hard link a file on your SSD, such as /home/alex/file, to a new name on /mnt/Backups, since that’s a different filesystem.
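Both limitations can be probed with a small sketch. It assumes GNU stat, and same_filesystem is a hypothetical helper name introduced here. Comparing device numbers is only a rough pre-check: some setups (bind mounts, for example) can still refuse a hard link even when the numbers match, so treat it as a hint rather than a guarantee.

```shell
workdir=$(mktemp -d)
mkdir "$workdir/albums"

# Limitation 1: ln refuses to hard link a directory.
if ln "$workdir/albums" "$workdir/albums_link" 2>/dev/null; then
  dir_link="created"
else
  dir_link="refused"
fi
echo "hard link to a directory: $dir_link"

# Limitation 2: hypothetical helper that compares device numbers (%d).
# Different numbers mean different filesystems, so ln would fail there.
same_filesystem() {
  [ "$(stat -c '%d' "$1")" = "$(stat -c '%d' "$2")" ]
}

if same_filesystem "$workdir" "$workdir/albums"; then
  same_fs="yes"
else
  same_fs="no"
fi
echo "same filesystem: $same_fs"

rm -r "$workdir"
```

When a file does have to be shared across filesystems, a hard link is not an option and the file has to be copied or referenced some other way.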
Best Practices When Hard Linking
First, make sure that you have the proper permissions to create the link file at the destination. In our case, we need write permission on the /home/jane/Pictures/ directory.
Second, when you hard link a file, make sure that all users involved have the required permissions to access that file. For Alex and Jane, this might mean adding both their usernames to the same group, for example, “family”. Then we’d use a command to let the group called “family” read and write to this file. You only need to change permissions on one of the hard links. That’s because you are actually changing permissions stored in the inode. So once you change permissions at /home/alex/Pictures/family_dog.jpg, /home/jane/Pictures/family_dog.jpg and all other hard links will show the same new set of permissions.
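Here is a minimal sketch of the shared-permissions behavior, assuming GNU stat and stand-in folders in a temporary directory. The chgrp line is left as a comment because the “family” group is only an example from this article and may not exist on your system.

```shell
# Stand-in folders for the two users (illustrative only).
workdir=$(mktemp -d)
mkdir -p "$workdir/alex" "$workdir/jane"
echo "Picture of Milo the dog" > "$workdir/alex/family_dog.jpg"
ln "$workdir/alex/family_dog.jpg" "$workdir/jane/family_dog.jpg"

# Assumption: a group named "family" containing both users already exists.
# chgrp family "$workdir/alex/family_dog.jpg"

# Give the owner and the group read and write access, through Alex's name only.
chmod 660 "$workdir/alex/family_dog.jpg"

# %a prints the permissions in octal; both names report the same value
# because the permissions live in the shared inode.
mode_alex=$(stat -c '%a' "$workdir/alex/family_dog.jpg")
mode_jane=$(stat -c '%a' "$workdir/jane/family_dog.jpg")
echo "alex: $mode_alex, jane: $mode_jane"   # prints "alex: 660, jane: 660"

rm -r "$workdir"
```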