The Starter Guide introduced the concepts of file ownership and access permissions, but really understanding the UNIX® file system (and this also applies to Linux file systems) requires that we redefine the concept of “What is a file?”
In UNIX®, everything is a file, and here “everything” really means everything. A hard disk, a partition on a hard disk, a parallel port, a connection to a web site, an Ethernet card: all these are files. Even directories are files. Linux recognizes many types of files in addition to the standard files and directories. Note that by file type here, we do not mean the type of content of a file: for GNU/Linux and any UNIX® system, a file, whether it be a PNG image, a binary file or whatever, is just a stream of bytes. Differentiating files according to their contents is left to applications.
When you issue ls -l, the character before the access rights identifies the file type. We have already seen two types of files: regular files (-) and directories (d). You can also find other types if you wander through the file tree and list the contents of directories:
Character mode files: they are either special system files (such as /dev/null, which we have already discussed), or peripherals (serial or parallel ports), which share the trait that their contents (if they have any) are not buffered (meaning they are not kept in memory). Such files are identified by the letter c.
Block mode files: these files are peripherals, and unlike character files, their contents are buffered. Hard disks, partitions on a hard disk, floppy drives and CD-ROM drives all fall into this category. /dev/hda and /dev/sda5 are examples of block-mode files. Such files are identified by the letter b.
Symbolic links: these files are very common and heavily used in the Mandrakelinux system start-up procedure (see Chapter 11, The Start-Up Files: init sysv). As their name implies, their purpose is to link files in a symbolic way, which means that they are files whose content is the path to a different file. The file they point to does not have to exist. They are very frequently called soft links, and such files are identified by the letter l.
Named pipes: in case you were wondering, yes, these are very similar to the pipes used in shell commands, with the difference that these actually have names. However, they are very rare and it is not likely that you will see one during your journey into the file tree. Such files are identified by the letter p. See Section 4, ““Anonymous” Pipes and Named Pipes”.
Sockets: this is the file type for all network connections, but only a few of them have names. What's more, there are different types of sockets and only one can be linked, but this is way beyond the scope of this book. Such files are identified by the letter s.
Here is a sample of each file:
$ ls -l /dev/null /dev/sda /etc/rc.d/rc3.d/S20random /proc/554/maps \
      /tmp/ssh-queen/ssh-510-agent
crw-rw-rw-    1 root     root       1,   3 May  5  1998 /dev/null
brw-rw----    1 root     disk       8,   0 May  5  1998 /dev/sda
lrwxrwxrwx    1 root     root           16 Dec  9 19:12 /etc/rc.d/rc3.d/S20random -> ../init.d/random*
pr--r--r--    1 queen    queen           0 Dec 10 20:23 /proc/554/maps|
srwx------    1 queen    queen           0 Dec 10 20:08 /tmp/ssh-queen/ssh-510-agent=
$
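The type letters described above can also be checked from a script rather than read off an ls -l listing. The following is a minimal sketch, assuming a POSIX-style shell and the common mktemp, mkfifo and ln utilities. The function name file_type_letter and every path under the scratch directory are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# file_type_letter (hypothetical helper): print the letter that ls -l
# shows in the first column for the file given as first argument.
file_type_letter() {
    # -L must be tested first: all the other tests follow symbolic links.
    if   [ -L "$1" ]; then printf '%s\n' l
    elif [ -c "$1" ]; then printf '%s\n' c
    elif [ -b "$1" ]; then printf '%s\n' b
    elif [ -p "$1" ]; then printf '%s\n' p
    elif [ -S "$1" ]; then printf '%s\n' s
    elif [ -d "$1" ]; then printf '%s\n' d
    elif [ -f "$1" ]; then printf '%s\n' '-'
    else printf '%s\n' '?'
    fi
}

# Build a few sample files in a scratch directory (assumes mktemp -d exists).
demo_dir=$(mktemp -d)
: > "$demo_dir/regular"                               # empty regular file
mkfifo "$demo_dir/pipe"                               # named pipe
ln -s "$demo_dir/does-not-exist" "$demo_dir/dangling" # link to a missing target

for f in /dev/null "$demo_dir" "$demo_dir/regular" \
         "$demo_dir/pipe" "$demo_dir/dangling"; do
    printf '%s %s\n' "$(file_type_letter "$f")" "$f"
done

rm -rf "$demo_dir"
```

The dangling link is reported as l even though its target is missing, which illustrates that a symbolic link only stores a path and does not need an existing file behind it.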
Inodes are, along with the “Everything Is a File” paradigm, a fundamental part of any UNIX® file system. The word inode is commonly explained as short for “index node” (“information node” is also seen); its exact origin is uncertain.
Inodes are stored on disk in an inode table. They exist for all types of files which may be stored on a file system, including directories, named pipes, character-mode files and so on. This leads to another famous saying: “The inode is the file.” Inodes are how UNIX® identifies a file in a unique way.
No, you didn't misread that: in UNIX®, you do not identify a file by its name, but by its inode number[25]. The reason for this is that the same file may have several names, or even no name. In UNIX®, a file name is just an entry in a directory inode. Such an entry is called a link. Let us look at links in more detail.
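The idea that a name is only a link to an inode can be observed directly. The sketch below assumes a POSIX-style shell with ls, ln, awk and mktemp available, and a scratch file system that supports hard links. The helper inode_of and the file names are hypothetical.

```shell
#!/bin/sh
# Sketch: two names, one inode. All paths are hypothetical scratch files.

# inode_of (hypothetical helper): print the inode number reported by ls -i.
inode_of() {
    ls -di "$1" | awk '{ print $1 }'
}

demo_dir=$(mktemp -d)
echo "hello" > "$demo_dir/first-name"

# ln without -s adds a second directory entry (a link) for the same inode.
ln "$demo_dir/first-name" "$demo_dir/second-name"

echo "first-name : inode $(inode_of "$demo_dir/first-name")"
echo "second-name: inode $(inode_of "$demo_dir/second-name")"

# Removing one name only removes that link; the content is still
# reachable through the remaining name.
rm "$demo_dir/first-name"
cat "$demo_dir/second-name"

rm -rf "$demo_dir"
```

Both echo lines should report the same inode number, and the final cat still works after the first name has been removed, because the inode and its data remain as long as at least one link refers to them.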
[25] Important: note that inode numbers are unique per file system, which means that an inode with the same number can exist on another file system. This leads to the difference between on-disk inodes and in-memory inodes. While two on-disk inodes may have the same number if they are on two different file systems, in-memory inodes have a unique number right across the system. One solution to obtain uniqueness, for example, is to hash the on-disk inode number against the block device identifier.
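Because an inode number is only unique within one file system, a reliable “same file” check has to compare the device as well as the inode number. The sketch below relies on the -ef operator of test, which is documented in bash as true when both paths refer to the same device and inode numbers. It is an assumption that the shell in use supports -ef (bash, dash and ksh do). The helper same_file and the file names are hypothetical.

```shell
#!/bin/sh
# same_file (hypothetical helper): succeed when both paths resolve to the
# same file, that is, the same device and the same inode number.
same_file() {
    [ "$1" -ef "$2" ]
}

demo_dir=$(mktemp -d)
echo "data" > "$demo_dir/original"
ln "$demo_dir/original" "$demo_dir/hard-link"   # second name, same inode
cp "$demo_dir/original" "$demo_dir/copy"        # same content, new inode

for candidate in hard-link copy; do
    if same_file "$demo_dir/original" "$demo_dir/$candidate"; then
        echo "$candidate: same file as original"
    else
        echo "$candidate: different file"
    fi
done

rm -rf "$demo_dir"
```

The copy has identical content but is reported as a different file, since identity is decided by the device and inode pair and not by name or content.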