Introduction

The difference between free software and proprietary software lies in access to the source code[29]. Free software is therefore distributed as archives of source code files. This may be unfamiliar to beginners, because users of free software must compile the source code themselves before they can use the software.

Compiled versions exist for most free software, and a user in a hurry only has to install these pre-compiled binaries. However, some free software is not distributed in this form, or its latest versions are not yet available in binary form. Furthermore, if you use an exotic operating system or an exotic architecture, a lot of software will not have been compiled for you. More importantly, compiling software yourself allows you to enable only the options that interest you, or to extend the software with add-ons, in order to obtain a program that fits your needs exactly.

Requirements

To build software, you need:

  • a computer with a working operating system,

  • general knowledge of the operating system you use,

  • some space on your disk,

  • a compiler (usually for the C language) and an archiver (tar),

  • some food (in difficult cases, the build may take a long time). A real hacker eats pizza, not quiche.

  • something to drink (for the same reason). A real hacker drinks soda, for the caffeine.

  • the phone number of your techie friend who recompiles his kernel each week,

  • above all, patience, and a lot of it!

Compiling from source does not generally present many problems, but if you are not used to it, the smallest snag can throw you off. The aim of this document is to show you how to get out of such situations.

Compilation

Principle

To translate source code into a binary file, the code must be compiled (usually from C or C++ sources, which are the most widespread languages in the (UNIX) free software community). Some free software is written in languages that do not require compilation (for instance Perl or the shell), but it still requires some configuration.

C compilation is, logically enough, done by a C compiler, usually gcc, the free compiler written by the GNU project. Compiling a complete software package is a complex task that involves compiling many different source files in succession (for various reasons, it is easier for programmers to put the different parts of their work in separate files). To make this easier for you, these repetitive operations are handled by a utility named make.

The four steps of compilation

To understand how compilation works (and so be able to solve possible problems), you have to know the steps involved. The objective is to convert, step by step, a text file written in a language comprehensible to a trained human being (for example, C) into a language comprehensible to a machine (or, in a few cases, to a very well trained human being). gcc executes four programs one after the other, each of which handles one step:

  1. cpp: The first step consists of replacing preprocessor directives with pure C code. Typically, this means inserting a header (#include) or expanding a macro defined with #define. At the end of this stage, pure C code is generated.

  2. cc1: This step consists of converting C into assembly language. The generated code depends on the target architecture.

  3. as: This step consists of generating object code (or binary code) from the assembly language. At the end of this stage, a .o file is generated.

  4. ld: The last step (linking) combines all the object files (.o) and the associated libraries, and produces an executable file.

Structure of a distribution

A correctly structured free software distribution generally follows the same organization:

  • An INSTALL file, which describes the installation procedure.

  • A README file, which contains general information about the program (short description, author, URL from which to fetch it, related documentation, useful links, etc). If the INSTALL file is missing, the README file usually contains a brief installation procedure.

  • A COPYING file, which contains the license or describes the distribution conditions of the software. Sometimes a LICENSE file is used instead, with the same contents.

  • A CONTRIB or CREDITS file, which contains a list of people related to the software (active participation, pertinent comments, third-party programs, etc).

  • A CHANGES file (or less frequently, a NEWS file), which contains recent improvements and bug fixes.

  • A Makefile file (see the section called “Make”), which allows compilation of the software (make needs this file in order to work). If this file does not exist at the beginning, it is generated by a pre-compilation configuration process.

  • Quite often, a configure or Imakefile file, which allows one to generate a new Makefile file customized for a particular system (see the section called “Configuration”).

  • A directory that contains the sources, and where the binary file is usually stored at the end of the compilation. Its name is often src.

  • A directory that contains the documentation related to the program (usually in man or Texinfo format), whose name is often doc.

  • Sometimes, a directory that contains data specific to the software (typically configuration files, examples of produced data, or resource files).



[29] This is not completely true, since some proprietary software also provides its source code. But unlike with free software, the end user is not allowed to use or modify the code as they wish.