The User Guide introduced the concepts of file ownership and access permissions, but to really understand the Unix file system (and this also applies to GNU/Linux's ext2fs) we need to redefine the concept of a file itself.
Here, "everything" really means everything. A hard disk, a partition on a hard disk, a parallel port, a connection to a web site, an Ethernet card, all these are files. Even directories are files. GNU/Linux recognizes many types of files in addition to the standard files and directories. Note that by file type here, we do not mean the type of the contents of a file: for GNU/Linux and any Unix system, a file, whether it be a PNG image, a binary file or whatever, is just a stream of bytes. Differentiating files according to their contents is left to applications.
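To illustrate that telling files apart by their contents is left to applications, here is a minimal Python sketch in which a program decides for itself whether a file is a PNG image by looking at its first bytes. The function name looks_like_png and the sample file names are ours, chosen for illustration; only the 8-byte PNG signature is a fixed fact.

```python
import os
import tempfile

# The 8-byte signature that begins every PNG file.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def looks_like_png(path):
    """Return True if the file at `path` starts with the PNG signature.

    The system only hands back bytes; interpreting them is up to us.
    """
    with open(path, "rb") as f:
        return f.read(len(PNG_SIGNATURE)) == PNG_SIGNATURE


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # The names are deliberately misleading: only the bytes matter.
        image = os.path.join(tmp, "notes.txt")
        text = os.path.join(tmp, "picture.png")
        with open(image, "wb") as f:
            f.write(PNG_SIGNATURE + b"rest of the image data")
        with open(text, "wb") as f:
            f.write(b"just some text\n")
        print(looks_like_png(image))  # True
        print(looks_like_png(text))   # False
```

Note that the file name plays no part in the decision: as far as the system is concerned, both files are simply streams of bytes.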
As you may remember, when you run ls -l, the character before the access rights identifies the type of a file. We have already seen two types of files: regular files (-) and directories (d). You may also come across these other types as you wander through the file tree and list the contents of directories:
Character mode files: these files are either special system files (such as /dev/null, which we already discussed), or peripherals (serial or parallel ports), which share the particularity that their contents (if they have any) are not buffered (meaning they are not kept in memory). Such files are identified by the letter c.
Block mode files: these files are peripherals and, as opposed to character mode files, their contents are buffered. Files in this category include, for example, hard disks, partitions on a hard disk, floppy drives, CD-ROM drives and so on. The files /dev/hda and /dev/sda5 are examples of block mode files. In the output of ls -l, these are identified by the letter b.
Symbolic links: these files are very common, and heavily used in the Mandrake Linux system startup procedure (see chapter "The startup files: init sysv"). As their name implies, their purpose is to link files in a symbolic way, which means that such files may or may not point to an existing file. This will be explained later in this chapter. They are very frequently (and wrongly, as we will see later) called "soft links", and are identified by an 'l'.
Named pipes: in case you were wondering, yes, these are very similar to the pipes used in shell commands, with the difference that these ones actually have names. They are very rare, however, and it is very unlikely that you will see one during your journey through the file tree. Just in case you do, the letter identifying them is 'p'. To learn more about them, have a look at "Anonymous" Pipes and Named Pipes.
Sockets: this is the file type for all network connections. Only a few of them have names, though. What's more, there are different types of sockets and only one can be linked, but this is way beyond the scope of this book. Such files are identified by the letter 's'.
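The remark above that a symbolic link may or may not point to an existing file can be checked with a short Python sketch. The names target and link are hypothetical, and the helper dangling_link_demo is ours. It relies on the standard functions os.symlink, os.path.lexists (which looks at the link itself) and os.path.exists (which follows the link).

```python
import os
import tempfile


def dangling_link_demo(directory):
    """Create a link to a missing file, then create the file afterwards."""
    target = os.path.join(directory, "target")  # hypothetical, not created yet
    link = os.path.join(directory, "link")

    # Creating the link succeeds even though the target does not exist.
    os.symlink(target, link)

    link_exists = os.path.lexists(link)       # examines the link itself
    resolves_before = os.path.exists(link)    # follows the link
    stored_path = os.readlink(link)           # the path recorded in the link

    # Once the target is created, the very same link resolves.
    with open(target, "w") as f:
        f.write("now I exist\n")
    resolves_after = os.path.exists(link)

    return link_exists, resolves_before, resolves_after, stored_path == target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(dangling_link_demo(tmp))  # (True, False, True, True)
```

The link only stores a path; whether that path leads anywhere is checked each time the link is followed.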
Here is a sample of each file:
$ ls -l /dev/null /dev/sda /etc/rc.d/rc3.d/S20random /proc/554/maps \
  /tmp/ssh-pingusa/ssh-510-agent
crw-rw-rw- 1 root root 1, 3 May 5 1998 /dev/null
brw-rw---- 1 root disk 8, 0 May 5 1998 /dev/sda
lrwxrwxrwx 1 root root 16 Dec 9 19:12 /etc/rc.d/rc3.d/S20random -> ../init.d/random*
pr--r--r-- 1 pingusa pingusa 0 Dec 10 20:23 /proc/554/maps|
srwx------ 1 pingusa pingusa 0 Dec 10 20:08 /tmp/ssh-pingusa/ssh-510-agent=
$
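The type letter that ls -l prints can also be obtained from a program. The following Python sketch is only an illustration, not how ls itself is implemented: it reads the file mode with os.lstat (so that symbolic links are not followed) and lets the standard stat.filemode function build the familiar permission string, of which we keep the first character. The helpers type_letter and make_samples, and the sample file names, are ours.

```python
import os
import stat
import tempfile


def type_letter(path):
    """Return the character `ls -l` would print first for `path`."""
    mode = os.lstat(path).st_mode   # lstat: do not follow symbolic links
    return stat.filemode(mode)[0]


def make_samples(directory):
    """Create a regular file, a directory, a symbolic link and a named pipe."""
    regular = os.path.join(directory, "regular")
    with open(regular, "w") as f:
        f.write("hello\n")

    subdir = os.path.join(directory, "subdir")
    os.mkdir(subdir)

    link = os.path.join(directory, "link")
    os.symlink(regular, link)

    fifo = os.path.join(directory, "fifo")
    os.mkfifo(fifo)                 # available on Unix systems only

    return [regular, subdir, link, fifo]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # /dev/null is added as an example of a character mode file.
        for path in make_samples(tmp) + ["/dev/null"]:
            print(type_letter(path), path)
```

Run on a GNU/Linux system, this should print the letters -, d, l, p and c in that order, each followed by the corresponding path.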
Together with the "Everything Is a File" paradigm, inodes are a fundamental part of any Unix file system. The word "inode" is usually explained as short for "index node", although it is sometimes glossed as "information node".
Inodes are stored on disk in an inode table. They exist for all types of files which may be stored on a file system, including directories, named pipes, character mode files and so on. This leads to another famous sentence: "The inode is the file". Inodes are also how Unix uniquely identifies a file.
Yes, you read that correctly: on Unix, you do not identify a file by its name, but by its inode number. [1] The reason is that the same file can have several names, or even no name at all. A file name, in Unix, is just an entry in a directory inode. Such an entry is called a link. Let's look at links in more detail.
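The idea that one file can have several names, or none, can be observed with a short Python sketch. The file names and the helper link_demo are ours. The fields st_ino (inode number) and st_nlink (number of links) are those returned by the standard os.stat and os.fstat calls.

```python
import os
import tempfile


def link_demo(directory):
    """Give one file two names, then remove both while it is still open."""
    first = os.path.join(directory, "first-name")
    second = os.path.join(directory, "second-name")

    with open(first, "w") as f:
        f.write("same data\n")
    os.link(first, second)            # a second directory entry, same inode

    a, b = os.stat(first), os.stat(second)
    same_inode = a.st_ino == b.st_ino
    link_count = a.st_nlink           # 2: the file now has two names

    # Remove every name while the file is open: the inode lives on
    # for as long as some process still uses it.
    handle = open(first)
    os.unlink(first)
    os.unlink(second)
    remaining = os.fstat(handle.fileno()).st_nlink   # typically 0
    data = handle.read()              # the contents are still readable
    handle.close()

    return same_inode, link_count, remaining, data


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(link_demo(tmp))
```

Both names report the same inode number, and the data remains readable even after the last name has been removed.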
[1] Important: note that inode numbers are unique only within a file system, which means that an inode with the same number can exist on another file system. This leads to the difference between on-disk inodes and in-memory inodes. While two on-disk inodes can have the same number if they are on two different file systems, in-memory inodes have a unique number across the whole system. One way to obtain uniqueness is, for example, to hash the on-disk inode number against the block device identifier.
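From a user program, the same concern shows up when comparing files: because inode numbers are unique only per file system, an inode number must be paired with the identifier of the device holding the file system. The Python sketch below is a user-space analogue of this idea, not the kernel's in-memory inode mechanism. The helpers file_identity and same_file are ours, while st_dev and st_ino are the fields returned by os.stat.

```python
import os
import tempfile


def file_identity(path):
    """Return a (device, inode) pair identifying the file behind `path`."""
    info = os.stat(path)
    return (info.st_dev, info.st_ino)


def same_file(path_a, path_b):
    """Return True if both paths are names for one and the same file."""
    return file_identity(path_a) == file_identity(path_b)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "original")
        alias = os.path.join(tmp, "alias")
        other = os.path.join(tmp, "other")

        with open(original, "w") as f:
            f.write("data\n")
        os.link(original, alias)      # another name for the same inode
        with open(other, "w") as f:
            f.write("data\n")         # same contents, different file

        print(same_file(original, alias))  # True
        print(same_file(original, other))  # False
```

The standard library function os.path.samefile performs an equivalent comparison, so in practice there is no need to write this by hand.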