Episode Details
Back to Episodes
295: Fun with funlinkat()
Description
Introducing funlinkat(), an OpenBSD Router with AT&T U-Verse, using NetBSD on a raspberry pi, ZFS encryption is still under development, Rump kernel servers and clients tutorial, Snort on OpenBSD 6.4, and more.
Headlines
Introducing funlinkat
- It turns out, every file you have ever deleted on a unix machine was probably susceptible to a race condition
One of the first syscalls which was created in Unix-like systems is unlink. In FreeBSD this syscall is number 10 (source) and in Linux, the number is dependent on the architecture but for most of them is also the tenth syscall (source). This indicated that this is one of the primary syscalls. The unlink syscall is very simple and we provide one single path to the file that we want to remove. The “removing file” process itself is very interesting so let’s spend a moment to understand the it. First, by removing the file we are removing a link from the directory to it. In Unix-like systems we can have many links to a single file (hard links). When we remove all links to the file, the file system will mark the blocks used by the file as free (a different file system will behave differently but let’s not jump into a second digression). This is why the process is called unlinking and not “removing file”. While we unlink the file two or three things will happen:
- We will remove an entry in the directory with the filename.
- We will decrease a file reference count (in inode).
- If links go to zero - the file will be removed from the disk (again this doesn't mean that the blocks from the disk will be filled with zeros, though this may happen depending on the file system and configuration. However, in most cases this means that the file system will mark those blocks to as free and use them to write new data later This mostly means that “removing file” from a directory is an operation on the directory and not on the file (inode) itself. Another interesting subject is what happens if our system will perform only first or second step from the list. This depends on the file system and this is also something we will leave for another time. The problem with the unlink and even unlinkat function is that we don’t have any guarantee of which file we really are unlinking.
- When you delete a file using its name, you have no guarantee that someone has not already deleted the file, or renamed it, and created a new file with the name you are about to delete. We have some stats about the file that we want to unlink. We performed some tests. In the same time another process removed our file and recreated it. When we finally try to remove our file it is no longer the same file. It’s a classic race condition.
- Many programs will perform checks before trying to remove a file, to make sure it is the correct file, that you have the correct permissions etc. However this exposes the ‘Time-of-Check / Time-of-Use’ class of bugs. I check if the file I am about to remove is the one I created yesterday, it is, so I call unlink() on it. However, between when I checked the date on the file, and when I call unlink, I, some program I am running, might have updated the file. Or a malicious user might have put some other file at that name, so I would be the one who deleted it. In Unix-like operating systems we can get a handle for our file called file - a descriptor. File descriptors guarantee us that all the operations that we will be performing on it are done on the same file (inode). Even if someone was to unlink a number of directories entries, the operating system will not free the structures behind the file descriptor, and we can detect the file that was removed by someone and recreated (or just unlinked). So, for example, we have an alternative func