
The Compiler
============

To use the compiler, require "compile.ss".  It defines the following
functions (plus a few signatures).  Options that control the compiler
are documented in the next section.

 Single-file extension compilation
 ---------------------------------

   ((compile-extensions expr) scheme-file-list dest-dir)

      `(compile-extensions expr)' returns a compiler that is initialized with
      the elaboration-time expression `expr'.  The compiler takes a list
      of Scheme files and compiles each of them to an extension, placing
      the resulting extensions in the directory specified by `dest-dir'.
      If `dest-dir' is #f, each extension is placed in the same directory
      as its source file.

   ((compile-extensions-to-c expr) scheme-file-list dest-dir)

     Like `compile-extensions', but only .c files are produced, not
     extensions.

   (compile-c-extensions c-file-list dest-dir)

     Compiles each .c file (usually produced with `compile-extensions-to-c') 
     in c-file-list to an extension.  `dest-dir' is handled as in
     `compile-extensions'.
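
   For example, the following sketch compiles two files to extensions
   placed next to their sources.  It assumes that "compile.ss" is
   loaded from the "compiler" collection with `require-library', that
   "a.ss" and "b.ss" are placeholder file names, and that '(void) is an
   acceptable do-nothing elaboration-time expression:

     (require-library "compile.ss" "compiler")

     ;; Create a compiler initialized with a do-nothing
     ;; elaboration-time expression (an assumption; substitute
     ;; whatever prefix the files actually need).
     (define compile-to-extension (compile-extensions '(void)))

     ;; Compile each file; #f places every extension in the same
     ;; directory as its source file.
     (compile-to-extension (list "a.ss" "b.ss") #f)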

 Multi-file extension compilation
 --------------------------------

   ((compile-extension-parts expr) scheme-file-list dest-dir)

      `(compile-extension-parts expr)' returns a compiler that is initialized with
      the elaboration-time expression `expr'.  The compiler takes a list
      of Scheme files and compiles each of them to a linkable object and
      a .kp (constant pool) file, placing the resulting objects and .kp files
      in the directory specified by `dest-dir'.  If `dest-dir' is #f, each object 
      and .kp file is placed in the same directory as its source file.
 
   ((compile-extension-parts-to-c expr)  scheme-file-list dest-dir)

     Like `compile-extension-parts', but only .c and .kp files are produced, 
     not compiled objects.

   (compile-c-extension-parts c-file-list dest-dir)

     Compiles each .c file (produced with `compile-extension-parts-to-c')
     in c-file-list to a linkable object.  `dest-dir' is handled as in
     `compile-extension-parts'.

   (link-extension-parts obj-and-kp-file-list dest-dir)

     Links objects for a multi-object extension together, using .kp
     files to generate and link pooled constants.  The objects and
     .kp files in `obj-and-kp-file-list' can be in any order.  The resulting
     extension "_loader" is placed in the directory specified by `dest-dir'.

   (glue-extension-parts obj-and-kp-file-list dest-dir)

     Like `link-extension-parts', but only a "_loader" object file
     is generated; this object file must then be linked with all the
     other object files to produce the "_loader" extension.
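
   As a sketch of the whole multi-file process, the following compiles
   two files to parts and then links them.  It assumes that
   "compile.ss" is already loaded.  The file names, the "build"
   directory (assumed to exist), and the ".o" object suffix are
   assumptions (the suffix depends on the platform), as is the
   do-nothing prefix '(void):

     ;; Produce an object and a .kp file for each source in "build".
     ((compile-extension-parts '(void)) (list "a.ss" "b.ss") "build")

     ;; Link the parts; the list can be in any order.  The resulting
     ;; "_loader" extension is placed in "build".
     (link-extension-parts
      (list (build-path "build" "a.o") (build-path "build" "a.kp")
            (build-path "build" "b.o") (build-path "build" "b.kp"))
      "build")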

 zo compilation
 --------------

   ((compile-zos expr) scheme-file-list dest-dir)

      `(compile-zos expr)' returns a compiler that is initialized with
      the elaboration-time expression `expr'.  The compiler takes a list
      of Scheme files and compiles each of them to a .zo file, placing
      the resulting .zo files in the directory specified by `dest-dir'.
      If `dest-dir' is #f, each .zo file is placed in the same directory
      as its source file.
 
 Collection compilation
 ----------------------

   (compile-collection-extension collection sub-collection ...)

      Compiles the specified (sub-)collection to an extension
      "_loader", putting intermediate .c and .kp files in the
      collection's "compiled/native" directory, and object files and
      the resulting "_loader" extension in the collection's
      "compiled/native/PLATFORM" directory (where `PLATFORM' is the
      system name for the current platform).

      The collection compiler reads the collection's "info.ss"
      file to obtain information about compiling the collection.
      The following fields are used:

       `name' - the name of the collection as a string.

       `compile-prefix' - an S-expression to use as the elaboration-time
         prefix expression for compiling all files in the collection.

       `compile-omit-files' - a list of filenames (without paths); all
         Scheme files in the collection are compiled except for the
         files in this list.  If a file contains elaboration-time
         expressions (e.g., macros) that are not intended to be local
         to the file, then the file should be included in this list.

       `compile-subcollections' - a list of collection paths, where each
         path is a list of strings. `compile-collection-extension' is 
         applied to each of the collections.

      Only the `name' field is required from info.ss, but the 
      `compile-prefix' field should also be provided, because the
      Setup PLT uses this field as an indication that the collection 
      should be compiled (see "Setup PLT" below).

      The compilation process is driven by the `make-collection'
      function in the "collection.ss" library of the "make"
      collection.
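
      For illustration, an "info.ss" for a hypothetical collection
      "mycoll" might look like the following.  The procedure form (a
      symbol and a failure thunk) is taken from the description of
      info.ss-like procedures in the Setup PLT section; all field
      values are made up:

        (lambda (request failure)
          (case request
            [(name) "My Collection"]
            ;; Do-nothing prefix; its presence tells Setup PLT to
            ;; compile the collection.
            [(compile-prefix) '(void)]
            ;; "macros.ss" defines non-local macros, so skip it.
            [(compile-omit-files) (list "macros.ss")]
            ;; Each collection path is a list of strings.
            [(compile-subcollections) (list (list "mycoll" "private"))]
            [else (failure)]))

      With such a file in place, the collection could be compiled with

        (compile-collection-extension "mycoll")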

   (compile-collection-zos collection sub-collection ...)

      Compiles the specified (sub-)collection files to .zo files.
      The .zo files are placed into the collection's "compiled"
      directory.

      The "info.ss" file is used as in `compile-collection-extension'.
      In addition, the following two fields are used:

       `compile-elaboration-zos' - a list of filenames (without paths);
         the Scheme files in this list are compiled with elaborations 
         preserved (see MzLib's compile-file for information about 
         preserving elaborations). The files are compiled from left to
         right in the list, all in the same namespace. If a file does 
         not need to be compiled, it is nevertheless loaded before 
         subsequent files in the list are compiled.

       `compile-elaboration-zos-prefix' - an S-expression to use
         as an elaboration-time prefix expression for compiling
         the files listed in `compile-elaboration-zos'.

      The main compilation process is driven by the `make-collection'
      function in the "collection.ss" library of the "make"
      collection. The elaboration-preserving compilation does not
      use make, but it only recompiles a file when the source file
      is newer.

---------------------------------------------------------------------------

Options for the Compiler
========================

To set options for the "compile.ss" extension compiler, load the
library "option.ss".  (This library is automatically loaded by
"compile.ss", but it can also be loaded earlier.) Options are set by
the following parameters:

  compiler:option:verbose - #t causes the compiler to print
     verbose messages about its operations.  Default = #f.

  compiler:option:setup-prefix - a string to embed in public names.
     This is used mainly for compiling extensions with the collection
     name so that cross-extension conflicts are less likely in
     architectures that expose the public names of loaded extensions.
     Note that `compile-collection-extension' handles prefixing automatically
     (by setting this option).  Default = "".

  compiler:option:clean-intermediate-files - #t keeps intermediate
     .c/.o files.  Default = #f.

  compiler:option:compile-subcollections - #t uses info.ss's
     `compile-subcollections' field for compiling collections.
     Default = #t.

  compiler:option:compile-for-embedded - #t creates .c files and
     object files to be linked directly with an embedded MzScheme
     run-time system, instead of .c files and object files to
     be dynamically loaded into MzScheme as an extension.
     Default = #f.

  compiler:option:propagate-constants - #t improves the code by
     propagating constants.  Default = #t.

  compiler:option:assume-primitives - #t equates X with #%X when
     #%X exists.  This is useful only with non-unitized code.
     Default = #f.

  compiler:option:stupid - #t allows obvious non-syntactic errors; e.g.:
    ((lambda () 0) 1 2 3).  Default = #f.

  compiler:option:vehicles - Controls how closures are compiled.  The
    possible values are:
       'vehicles:automatic - auto-groups
       'vehicles:functions - groups by procedure
       'vehicles:units - groups by unit
       'vehicles:monolithic - groups randomly
    Default = 'vehicles:automatic.

  compiler:option:vehicles:monoliths - Sets the number of random
    groups for 'vehicles:monolithic.

  compiler:option:seed - Sets the randomizer seed for
    'vehicles:monolithic.

  compiler:option:max-exprs-per-top-level-set - Sets the number of
    top-level Scheme expressions crammed into one C function.  Default
    = 25.

  compiler:option:unpack-environments - #f might help for
    register-poor architectures.  Default = #t.

  compiler:option:debug - #t creates debug.txt debugging file.  Default
    = #f.

  compiler:option:test - #t ignores top-level expressions with syntax
   errors.  Default = #f.
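
For example, the options can be set as in the following sketch.  It
assumes that the options behave as ordinary MzScheme parameters (called
with one argument to set the value, and usable with `parameterize'),
that "compile.ss" is loaded from the "compiler" collection with
`require-library', and that "a.ss" is a placeholder file name:

  ;; Loading "compile.ss" also loads "option.ss".
  (require-library "compile.ss" "compiler")

  ;; Print verbose messages for all later compilations.
  (compiler:option:verbose #t)

  ;; Turn off constant propagation for one compilation only.
  (parameterize ([compiler:option:propagate-constants #f])
    ((compile-extensions '(void)) (list "a.ss") #f))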

More options are defined by "mzscheme" "dynext"'s "compile.ss" and
"link.ss" libraries.  Those options control the actual C compiler and
linker that are used.  See "doc.txt" in the "mzscheme" "dynext"
collection for more information about those options.

The "optionr.ss" library is a unit/sig matching the signature
`compiler:option^' containing these options.  (The unitized option
names do not have the `compiler:option:' prefix.) The "sig.ss" library
defines the `compiler:option^' signature.

---------------------------------------------------------------------------

The Compiler as a Unit
======================

The "compiler.ss" library is a unit/sig matching `compiler^' that
provides the "compile.ss" functions.  This signature and all auxiliary
signatures needed by "compiler.ss" are defined by the "sig.ss"
library.

The "compiler.ss" signed unit requires the following imports:

   compiler:option^ - from "optionr.ss" or "option.ss" (the latter
                      uses a `compiler:option:' prefix)
   mzlib:function^
   mzlib:pretty-print^
   mzlib:file^
   mzlib:compile^
   dynext:compile^ - From the collection "mzscheme" "dynext"
   dynext:link^
   dynext:file^

---------------------------------------------------------------------------

Low-level Extension Compiler and Linker
=======================================

The high-level "compile.ss" interface relies on low-level
implementations of the extension compiler and linker.  (Of course, .zo
compilation is handled by MzLib's "compile.ss" library.)

The "loadr.ss" and "ldr.ss" libraries define unit/sigs for the
low-level extension compiler and multi-file linker.  The "load.ss"
library opens these units into the current namespace with the prefix
`mzc:'.

The low-level functions are:

  (mzc:eval-compile-prefix expr) - Evaluates an elaboration-time
    S-expression `expr'.  Future calls to mzc:compile-XXX will see the
    effects of the elaboration expression.

  (mzc:compile-extension scheme-source dest-dir) - Compiles a
    single Scheme file to an extension.

  (mzc:compile-extension-to-c scheme-source dest-dir) - Compiles
    a single Scheme file to a .c file.

  (mzc:compile-c-extension c-source dest-dir) - Compiles a single .c
    file to an extension.

  (mzc:compile-extension-part scheme-source dest-dir) - Compiles a
    single Scheme file to a compiled object and .kp file toward a
    multi-file extension.

  (mzc:compile-extension-part-to-c scheme-source dest-dir) - Compiles
    a single Scheme file to .c and .kp files towards a multi-file
    extension.

  (mzc:compile-c-extension-part c-source dest-dir) - Compiles a single
    .c file to a compiled object towards a multi-file extension.

  (mzc:link-extension object-and-kp-file-list dest-dir) - Links
    compiled object and .kp files into a multi-file extension.

All but the last function are from the "loadr.ss" unit, and the
last one is from "ldr.ss".  The "loadr.ss" unit/sig requires the
following imports:
   mzlib:function^
   mzlib:pretty-print^
   mzlib:file^
   dynext:compile^ - From the collection "mzscheme" "dynext"
   dynext:link^
   dynext:file^
   compiler:option^ - from "optionr.ss" or "option.ss" (the latter
                      uses a `compiler:option:' prefix)
and the "ldr.ss" unit/sig requires the following imports:
   dynext:compile^ - From the collection "mzscheme" "dynext"
   dynext:link^
   dynext:file^
   mzlib:function^
   compiler:option^ - from "optionr.ss" or "option.ss" (the latter
                      uses a `compiler:option:' prefix)
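
For example, the following sketch uses the low-level interface to
compile one file.  The `require-library' call, the file name "a.ss",
the existing "build" directory, and the do-nothing prefix '(void) are
assumptions:

  ;; Opens the low-level units with the `mzc:' prefix.
  (require-library "load.ss" "compiler")

  ;; Evaluate an elaboration-time expression; later mzc:compile-XXX
  ;; calls see its effects.
  (mzc:eval-compile-prefix '(void))

  ;; Compile "a.ss" to an extension in the "build" directory.
  (mzc:compile-extension "a.ss" "build")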

---------------------------------------------------------------------------

Setup PLT: Collection Setup and Unpacking
=========================================

The Setup PLT executable (bin/setup-plt for Unix) performs two
services:

 * Compiling and setting up all collections: When Setup PLT is run
   without any arguments, it finds all of the current collections
   (using the PLTHOME and PLTCOLLECTS environment variables)
   and compiles all collections with an info.ss library that
   indicates how the collection is compiled (see the
   --collection-zos flag for mzc).

   The --clean (or -c) flag to Setup PLT causes it to delete
   all existing .zo and extension files, thus ensuring a clean
   build from the source files.

   The -l flag takes one or more collection names and restricts 
   Setup PLT's action to those collections.

   In addition to compilation, a collection's info.ss library
   can specify executables to be installed in the plt directory
   (plt/bin under Unix) or other installation actions.

 * Unpacking .plt files: A .plt file is a platform-independent
   distribution archive for MzScheme- and MrEd-based software.
   When one or more file names are provided as the command line
   arguments to Setup PLT, the files contained in the .plt
   archive are unpacked (according to specifications embedded in
   the .plt file; see below) and only the collections specified
   by the .plt file are compiled and set up.

 Compiling and Setting Up Collections
 ------------------------------------

Setup PLT attempts to compile and set up any collection that:

 * has an info.ss library;
 
 * is a top-level collection (not a sub-collection; top-level
   collections can specify subcollections to be compiled and
   set up with the `compile-subcollections' info.ss field);
   and

 * has the `name' and `compile-prefix' info.ss fields.

Collections meeting these criteria are compiled using the
`compile-collection-zos' procedure described above. If the -e or
--extension flag is specified, then the collections are also compiled
using the `compile-collection-extension' procedure described above.

Additional info.ss fields trigger additional setup actions:

 * `mzscheme-launcher-names' - a list of executable names to be
   installed in plt (or plt/bin) to run MzScheme programs implemented
   by the collection. A parallel list of library names must be
   provided by `mzscheme-launcher-libraries'. For each name, a
   launching executable is set up using the launcher collection's
   `install-mzscheme-program-launcher'. If the executable already
   exists, no action is taken.

 * `mzscheme-launcher-libraries' - a list of library names in
   parallel to `mzscheme-launcher-names'.

 * `mred-launcher-name' - an executable to be installed in plt
   (or plt/bin) to run a MrEd program implemented by the collection.
   The executable is installed using the launcher collection's
   `install-mred-program-launcher'. If the executable already
   exists, no action is taken.

 * `install-collection' - a procedure that accepts a directory path
   argument and performs collection-specific installation work.
   This procedure should avoid unnecessary work in the case that
   it is called multiple times for the same installation.
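
For illustration, an "info.ss" procedure that uses these fields for a
hypothetical collection might be written as follows (all names and
values are made up, and the body of the `install-collection' procedure
is left empty):

  (lambda (request failure)
    (case request
      [(name) "My Tool"]
      [(compile-prefix) '(void)]
      ;; Parallel lists: launcher "mytool" runs library "mytool.ss".
      [(mzscheme-launcher-names) (list "mytool")]
      [(mzscheme-launcher-libraries) (list "mytool.ss")]
      ;; Called with a directory path; should be cheap to repeat.
      [(install-collection)
       (lambda (dir)
         (void))]
      [else (failure)]))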

 Unpacking .plt Distribution Archives
 ------------------------------------

The extension ".plt" is not required for a distribution archive; this
convention merely helps users identify the purpose of a distribution
file.

The raw format of a distribution file is described below. This format
is uncompressed and sensitive to communication modes (text
vs. binary), so the distribution format is derived from the raw format
by first compressing the file using gzip, then encoding the gzipped
file with the MIME base64 standard (which relies only on the characters
A-Z, a-z, 0-9, +, /, and =; all other characters are ignored when
a base64-encoded file is decoded).

The raw format is

 * "PLT" is the first three characters.

 * An info.ss-like procedure that takes a symbol and a failure thunk
   and returns information about the archive for recognized symbols. The
   two required info fields are:

     + `name' - a human-readable string describing the archive's
       contents. This name is used only for printing messages to the
       user during unpacking.

     + `unpacker' - a symbol indicating the expected unpacking
       environment. Currently, the only allowed value is 'mzscheme.

   The procedure is extracted from the archive using MzScheme's
   `read' and `eval' procedures.

 * An unsigned unit that drives the unpacking process. The unit 
   accepts two imports: a path string for the plt directory and
   an `unmztar' procedure. The remainder of the unpacking process
   consists of invoking this unit. It is expected that the unit will
   call the `unmztar' procedure to unpack directories and files that
   are defined in the input archive after this unit. The result of
   invoking the unit must be a list of collection paths (where each
   collection path is a list of strings); once the archive is unpacked,
   Setup PLT will compile and set up the specified collections.

   The `unmztar' procedure takes one argument: a filter
   procedure. The filter procedure is called for each directory and
   file to be unpacked. It is called with three arguments:

      + 'dir or 'file - indicates whether the item to be unpacked
        is a directory or a file;

      + a relative path string - the pathname of the directory or file
        to be unpacked, relative to the plt directory; and

      + a path string for the plt directory.

   If the filter procedure returns #f for a directory or file, the
   directory or file is not unpacked. If the filter procedure returns
   #t and the file or directory already exists, it is not created.

   When a directory is unpacked, intermediate directories are created
   as necessary to create the specified directory. When a file is
   unpacked, the directory must already exist.

   The unit is extracted from the archive using MzScheme's `read'
   and `eval' procedures.
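
   For example, an unpacking unit that unpacks everything and asks
   Setup PLT to set up a hypothetical collection "mycoll" could be
   sketched as follows (the import names `plthome' and `unmztar' are
   arbitrary, and the filter's printing is only for illustration):

     (unit
       (import plthome unmztar)
       (export)
       ;; Unpack every directory and file, reporting each one.
       (unmztar (lambda (kind rel-path plt-dir)
                  (printf "unpacking ~a ~a~n" kind rel-path)
                  #t))
       ;; The result of the unit: collections to compile and set up.
       (list (list "mycoll")))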

Assuming that the unpacking unit calls the `unmztar' procedure, the
archive should continue with unpackables. Unpackables are extracted
until the end-of-file is found (as indicated by an `=' in the
base64-encoded input archive).

An unpackable is one of the following:

 * The symbol 'dir followed by a list. The `build-path' procedure
   will be applied to the list to obtain a relative path for the
   directory (and the relative path is combined with the plt directory
   path to get a complete path).

   The 'dir symbol and list are extracted from the archive using
   MzScheme's `read' (and the result is *not* `eval'uated).

 * The symbol 'file, a list, a number, an asterisk, and the file
   data. The list specifies the file's relative path, just as for
   directories. The number indicates the size of the file to be
   unpacked in bytes. The asterisk indicates the start of the file
   data; the next n bytes are written to the file, where n is the
   specified size of the file.

   The symbol, list, and number are all extracted from the archive
   using MzScheme's `read' (and the result is *not* `eval'uated).
   After the number is read, input characters are discarded until
   an asterisk is found. The file data must follow this asterisk
   immediately.
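
For illustration, two unpackables written by hand according to the
description above (not the output of any tool) could look like this in
the raw format.  The sketch assumes that the symbols are written
unquoted, since they are read but not evaluated; the paths and the file
content are made up.  The first item creates a directory, and the
second writes a five-byte file into it:

  dir ("collects" "mycoll")
  file ("collects" "mycoll" "hello.txt") 5 *hello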

 Making .plt archives
 --------------------

The compiler collection's pack.ss library provides functions to help
make .plt archives, especially under Unix:

 (pack dest name dirs collections filter encode?) - Creates the .plt
  file specified by the pathname `dest', using the string `name' as
  the name reported to Setup PLT as the archive's description, and 
  `collections' as the list of collection paths returned by the
  unpacking unit. The `dirs' argument must be a list of relative
  paths for directories; the contents of these directories will be
  packed into the archive. The `filter' procedure is called with
  the relative path of each candidate for packing; if it returns #f
  for some path, then that file or directory is omitted from the
  archive. If `encode?' is #f, then the output archive is in raw
  form, and still must be gzipped and mime-encoded. If `encode?'
  is #t, then gzip and mmencode must be in the shell's path for
  executables.

  The `filter' and `encode?' arguments are optional, defaulting
  to std-filter and #t, respectively.

 (std-filter p) - returns #t unless `p' matches "CVS$" or "compiled$".

 (mztar dir output filter) - called by `pack' to write one directory
  `dir' to the output port `output', using the filter procedure
  `filter'.
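
For example, the following sketch packs a hypothetical collection
"mycoll", additionally omitting editor backup files.  It assumes that
"pack.ss" is loaded with `require-library', that the current directory
is the plt directory, and that the filter receives each path as a
string:

  (require-library "pack.ss" "compiler")

  ;; Keep what std-filter keeps, except names ending in "~".
  (define (my-filter p)
    (and (std-filter p)
         (not (regexp-match "~$" p))))

  ;; #f for `encode?' leaves the archive in raw form, so gzip and
  ;; mmencode are not needed; the result must still be gzipped and
  ;; base64-encoded before distribution.
  (pack "mycoll.plt" "My Collection"
        (list (build-path "collects" "mycoll"))
        (list (list "mycoll"))
        my-filter
        #f)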
