The Starter Guide introduced the concepts of file ownership and access permissions, but really understanding the UNIX filesystem (and this also applies to Linux filesystems) requires that we redefine the concept of “What is a file?”
On UNIX, everything is a file, and here “everything” really means everything. A hard disk, a partition on a hard disk, a parallel port, a connection to a web site, an Ethernet card: all these are files. Even directories are files. Linux recognizes many types of files in addition to the standard files and directories. Note that by file type here, we do not mean the type of the contents of a file: for GNU/Linux and any UNIX system, a file, whether it is a PNG image, a binary file or anything else, is just a stream of bytes. Differentiating files according to their contents is left to applications.
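To make the “stream of bytes” point concrete, here is a minimal sketch of how an application, rather than the system, might recognise a PNG image by reading its first eight bytes. The helper name looks_like_png and the file names are made up for this illustration; the eight-byte signature is the one defined by the PNG format.

```shell
#!/bin/sh
# Sketch: the system stores bytes; recognising a PNG is the application's job.
# looks_like_png is a hypothetical helper name used only for this example.

looks_like_png() {
    # Read the first 8 bytes and print them as one lowercase hex string.
    sig=$(head -c 8 -- "$1" | od -An -tx1 | tr -d ' \n')
    # The PNG signature is 89 50 4E 47 0D 0A 1A 0A.
    [ "$sig" = "89504e470d0a1a0a" ]
}

demo_dir=$(mktemp -d)

# A file with a misleading name that starts with the PNG signature...
printf '\211PNG\r\n\032\n' > "$demo_dir/notes.txt"
# ...and a plain text file that merely claims to be an image.
printf 'hello\n' > "$demo_dir/picture.png"

for f in "$demo_dir/notes.txt" "$demo_dir/picture.png"; do
    if looks_like_png "$f"; then
        echo "$(basename "$f"): starts with the PNG signature"
    else
        echo "$(basename "$f"): does not start with the PNG signature"
    fi
done

rm -r "$demo_dir"
```

The file name plays no part in the decision: only the bytes read from the file do.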
As you may remember, when you run ls -l, the character before the access rights identifies the type of a file. We have already seen two types of files: regular files (-) and directories (d). You can also find other types if you wander through the file tree and list the contents of directories:
Character mode files: these files are either special system files (such as /dev/null, which we have already discussed), or peripherals (serial or parallel ports), which share the trait that their contents (if they have any) are not buffered (meaning they are not kept in memory). Such files are identified by the letter c.
Block mode files: these files are peripherals, and unlike character files, their contents are buffered. Files in this category include hard disks, partitions on a hard disk, floppy drives, CD-ROM drives and so on. The files /dev/hda and /dev/sda5 are examples of block mode files. In ls -l output, these are identified by the letter b.
Symbolic links: these files are very common, and heavily used in the Mandrake Linux system startup procedure (see Chapter 11, The Start-Up Files: init sysv). As their name implies, their purpose is to link files in a symbolic way, which means that they are files whose content is the path of a different file. The file they point to does not have to exist. They are very frequently called “soft links”, and are identified by the letter l.
Named pipes: in case you were wondering, yes, these are very similar to pipes used in shell commands, but with the difference that these actually have names. They are very rare, however, and it is not likely that you will see one during your journey into the file tree. Just in case you do, the letter identifying them is p. To learn more, have a look at the section called “Anonymous Pipes and Named Pipes”.
Sockets: this is the file type for all network connections, but only a few of them have names. What's more, there are different types of sockets and only one type (the UNIX domain socket) can be linked to a name in the file tree, but this is well beyond the scope of this book. Such files are identified by the letter s.
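The description of symbolic links above can be tried out safely in a scratch directory. The sketch below assumes the usual ln, readlink and mktemp commands are available; is_dangling is a hypothetical helper name and all file names are invented for the example.

```shell
#!/bin/sh
# Sketch: a symbolic link is a small file whose content is a path.
# is_dangling is a hypothetical helper; all names are made up for the demo.

is_dangling() {
    # -L tests the link itself; -e follows the link and tests its target.
    [ -L "$1" ] && [ ! -e "$1" ]
}

work=$(mktemp -d)
echo "some data" > "$work/target.txt"

# A relative target is resolved from the directory that holds the link.
ln -s target.txt  "$work/good_link"     # points to an existing file
ln -s missing.txt "$work/broken_link"   # ln -s does not check the target

# readlink prints the path stored in the link, without following it.
echo "good_link   -> $(readlink "$work/good_link")"
echo "broken_link -> $(readlink "$work/broken_link")"

for l in "$work/good_link" "$work/broken_link"; do
    if is_dangling "$l"; then
        echo "$(basename "$l") is a dangling symbolic link"
    else
        echo "$(basename "$l") resolves to an existing file"
    fi
done

rm -r "$work"
```

Both links are reported with the type letter l by ls -l, whether or not their target exists.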
Here is a sample of each file type:
$ ls -l /dev/null /dev/sda /etc/rc.d/rc3.d/S20random /proc/554/maps \
    /tmp/ssh-queen/ssh-510-agent
crw-rw-rw-    1 root     root       1,   3 May  5  1998 /dev/null
brw-rw----    1 root     disk       8,   0 May  5  1998 /dev/sda
lrwxrwxrwx    1 root     root           16 Dec  9 19:12 /etc/rc.d/rc3.d/S20random -> ../init.d/random*
pr--r--r--    1 queen    queen           0 Dec 10 20:23 /proc/554/maps|
srwx------    1 queen    queen           0 Dec 10 20:08 /tmp/ssh-queen/ssh-510-agent=
$
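The type character can also be extracted programmatically. The following sketch creates a few kinds of files in a temporary directory and prints the first character of the ls -ld line for each of them. The helper name type_char is hypothetical, and the mkfifo command is assumed to be installed. Block devices and sockets are left out because creating them needs root privileges or a helper program.

```shell
#!/bin/sh
# Sketch: print the file type character that ls -l shows before the permissions.
# type_char is a hypothetical helper, not a standard command.

type_char() {
    # -d lists a directory itself rather than its contents;
    # cut -c1 keeps only the first character of the line.
    ls -ld -- "$1" | cut -c1
}

scratch=$(mktemp -d)
echo "text" > "$scratch/regular"
mkdir "$scratch/directory"
ln -s regular "$scratch/symlink"
mkfifo "$scratch/fifo"

for f in "$scratch/regular" "$scratch/directory" "$scratch/symlink" \
         "$scratch/fifo" /dev/null; do
    printf '%s  %s\n' "$(type_char "$f")" "$f"
done

rm -r "$scratch"
```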
Inodes are, along with the “Everything Is a File” paradigm, a fundamental part of any UNIX file system. The word “inode” is most often explained as short for “index node”, although it is sometimes read as “information node”.
Inodes are stored on disk in an inode table. They exist for all types of files which may be stored on a filesystem, including directories, named pipes, character mode files and so on. This leads to another famous sentence: “The inode is the file”. Inodes are how UNIX identifies a file in a unique way.
No, you did not misread that: on UNIX, you do not identify a file by its name, but by its inode number. [21] The reason for this is that the same file can have several names, or even no name. A file name, in UNIX, is just an entry in a directory inode. Such an entry is called a link. Let's look at links in more detail.
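As a first look at links, the sketch below gives one file two names and checks that both names lead to the same inode. It assumes GNU stat (the -c format option, with %i for the inode number and %h for the link count); the helper names inode_of and links_of and the file names are invented for the example.

```shell
#!/bin/sh
# Sketch: two directory entries (links) for one inode.
# Assumes GNU coreutils stat; helper and file names are made up.

inode_of() { stat -c '%i' -- "$1"; }   # inode number
links_of() { stat -c '%h' -- "$1"; }   # number of names (hard links)

dir=$(mktemp -d)
echo "hello" > "$dir/first_name"
ln "$dir/first_name" "$dir/second_name"   # add a second name; no data is copied

echo "first_name : inode $(inode_of "$dir/first_name"), links $(links_of "$dir/first_name")"
echo "second_name: inode $(inode_of "$dir/second_name"), links $(links_of "$dir/second_name")"

# Removing one name only removes a directory entry. The inode and its data
# survive as long as at least one name (or an open file descriptor) remains.
rm "$dir/first_name"
echo "after rm   : links $(links_of "$dir/second_name"), content: $(cat "$dir/second_name")"

rm -r "$dir"
```

Both names report the same inode number and a link count of 2; after one name is removed, the remaining name still gives access to the same data.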
[21] Important: inode numbers are unique only within one filesystem, which means that an inode with the same number can exist on another filesystem. This leads to the difference between on-disk inodes and in-memory inodes. While two on-disk inodes can have the same number if they are on two different filesystems, in-memory inodes have a number that is unique across the whole system. One way to obtain this uniqueness, for example, is to hash the on-disk inode number together with the block device identifier.
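The same idea is visible from user space: because an inode number is only meaningful within one filesystem, a file is identified by the pair formed by the device identifier and the inode number. The sketch below is only an illustration of that pairing, not of how the kernel builds its in-memory inodes. It assumes GNU stat (%d for the device number, %i for the inode number); file_id and same_file are hypothetical helper names.

```shell
#!/bin/sh
# Sketch: identify a file by (device, inode) rather than by inode alone.
# Assumes GNU coreutils stat; file_id and same_file are hypothetical helpers.

file_id() {
    # -L follows symbolic links so that a link and its target compare equal.
    stat -L -c '%d:%i' -- "$1"
}

same_file() {
    # Fail if either file cannot be examined, then compare the two pairs.
    a=$(file_id "$1") && b=$(file_id "$2") && [ "$a" = "$b" ]
}

tmp=$(mktemp -d)
echo "data" > "$tmp/original"
ln "$tmp/original" "$tmp/hardlink"
cp "$tmp/original" "$tmp/copy"

echo "original: $(file_id "$tmp/original")"
echo "hardlink: $(file_id "$tmp/hardlink")"
echo "copy    : $(file_id "$tmp/copy")"

same_file "$tmp/original" "$tmp/hardlink" && echo "original and hardlink are the same file"
same_file "$tmp/original" "$tmp/copy"     || echo "original and copy are different files"

rm -r "$tmp"
```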