| Site Installation | Information | GASNet/GM spawning command |
|---|---|---|
| MPICH/gm | Distributed as mpirun.ch_gm.pl with the MPICH distribution. Note that GASNet/GM only supports the script from the MPICH 1.2.5..10 distribution. | mpirun -np NP |
| MPICH/gm or LAM | The mpiexec utility provides the best portability and keeps track of Myricom's changes to the spawner interface; many sites have already adopted it, and it is available from http://www.osc.edu/~pw/mpiexec/. | mpiexec -n NP |
| gexec | The gexec utility is another cluster remote execution system, often used in conjunction with the gstat and Ganglia cluster management software (available from http://www.theether.org/gexec/). The GEXEC_SVRS environment variable can be set to target a specific list of nodes for the job. | gasnetrun_gm -n NP --gexec |
| Others / Berkeley UPC default | As a fallback mechanism, and the one installed by default in upcrun.conf, the GASNet/GM contrib directory contains a gasnetrun_gm.pl perl script. It is provided as a convenience for clusters that lack the site-specific spawners above, or for administrators who want to tailor a spawning script. Passing '-v' enables verbose output, which may help debug the script. | gasnetrun_gm -n NP |
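For illustration, assuming a GM-conduit UPC executable named 'hello' (a hypothetical name) and a four-process job, the spawners above would be invoked roughly as follows; in practice upcrun normally invokes the appropriate spawner on your behalf:
mpirun -np 4 hello
mpiexec -n 4 hello
gasnetrun_gm -n 4 hello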
The login for the machine is alvarez.nersc.gov. The homepage (including information on how to get accounts) is at
http://www.nersc.gov/alvarez/
To compile UPC programs on alvarez
Add the following lines to your .bashrc.ext file:
source /usr/local/pkg/Modules/init/bash
module delete pgi
module add upc gcc pbs maui
Note: the alvarez convention seems to be for you to put any customizations in your .bashrc.ext file (or .profile.ext, .login.ext, etc.), rather than in .bashrc itself.
GM_INCLUDE = /usr/gm/include
GM_LIB = /usr/gm/lib
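As a brief sketch, once the modules above are loaded, a UPC program is compiled with upcc; the file names here are hypothetical, and the second form additionally requests a fixed (static) count of 4 UPC threads:
upcc -o hello hello.upc
upcc -T 4 -o hello hello.upc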
To run UPC programs on alvarez
To run programs, you must use the PBS batch system. Generally you'll want to do this interactively:
qsub -I -l nodes=2,walltime=5:00:00
This shouldn't take more than a minute or so (you can use qstat to examine the job queue), and you can remain in the session for up to 5 hours. You will now be logged onto a compute node on the system, but the file system is largely the same, so you generally won't notice the difference.
mpirun -machinefile $PBS_NODEFILE -np 2 executable_name
where '2' is the number of nodes and 'executable_name' is your program. To save a little typing, the upc module also provides a program called mpirun-batch, which can be invoked simply as
mpirun-batch executable_name
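If you would rather submit a non-interactive batch job than use 'qsub -I', a minimal PBS script along these lines should also work (the walltime and program name are only placeholders); submit it with 'qsub scriptname':
#!/bin/sh
#PBS -l nodes=2,walltime=0:30:00
cd $PBS_O_WORKDIR
mpirun -machinefile $PBS_NODEFILE -np 2 ./executable_name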
The only conduit supported on the FTG cluster at present is MPI.
To compile UPC programs on n2001
Add the following lines to your shell's startup script, to ensure you have the correct MPI libraries, PBS, and our UPC compiler in your PATH:
source /usr/local/pkg/Modules/init/bash
module add mpich pbs upc
Multiple versions of our UPC compiler are installed on this system: use 'module avail' to see a list of them. The version that a plain 'module add upc' will load is marked '(default)'. This is generally the right version to use, but other versions can be loaded via 'module load upc/1.0-debug', etc. You can also switch between versions, for instance via 'module switch upc upc/stable-opt' (you can always use plain 'upc' as the first argument, no matter which upc module you currently have loaded).
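For example, a typical sequence for inspecting and switching compiler versions might look like the following (the version names are taken from the examples above; the actual list on the system may differ):
module avail upc
module load upc/1.0-debug
module switch upc upc/stable-opt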
To run UPC programs on n2001
To run programs, you must use the PBS batch system. Generally you'll want to do this interactively:
qsub -I -l nodes=2,walltime=5:00:00
This shouldn't take more than a minute or so, and you can remain in the session for up to 5 hours. You will now be logged onto a compute node on the system, but the file system is largely mapped to n2001's, so you generally won't notice the difference.
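The steps above stop at the qsub step; since the FTG cluster only supports the MPI conduit, launching from inside the interactive session presumably looks much like the alvarez case (the executable name is a placeholder):
mpirun -machinefile $PBS_NODEFILE -np 2 executable_name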
More information on using the PC cluster is available at http://www.nersc.gov/research/FTG/pcp/user/x86cluster.html.
YMCA uses LAM/MPI instead of MPICH, which makes a difference for booting GM jobs. LAM uses a per-user daemon for process and environment control, and the 'lamboot' utility must be run prior to invoking upcrun. The connection to the daemon is per-user and thus remains active even if new terminals are opened to run more UPC jobs.
The 'lamboot' utility should only be run once -- any attempt to run it more than once will kill all previous connections to the LAM daemon. The LAM/MPI environment on YMCA can be "booted" as follows:
wwnodes --mpi > lamhosts.out
lamboot -v lamhosts.out
upcrun -np 2 testprogram1
upcrun -np 2 testprogram2
upcrun -np 2 testprogram3
If a different set of nodes is required, run 'lamhalt' and then repeat the boot process.
YMCA does not share the GM library across all slave nodes. As a result, you must always pass upcc '-Wl,-static' so that the GM library is linked statically into the binary (note: this may change in the near future). You can do this by putting the flag in 'default_options' in your $HOME/.upccrc file.
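As a sketch of what that might look like, assuming $HOME/.upccrc uses the usual 'key = value' syntax of upcc configuration files:
default_options = -Wl,-static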
To compile UPC programs on citrus
CC = /usr/bin/gcc-3.3
GM_INCLUDE = /usr/gm/include
GM_LIB = /usr/gm/lib
To run UPC programs on citrus
The system currently has no batch scheduler, and there is no node reservation mechanism other than gexec. It is therefore not recommended for performance runs, since stray processes may already be running on the selected nodes. gexec either runs on a list of nodes specified by the user, or falls back to gstat, which keeps load-balancing information and uses it to select nodes. For example, to target a specific set of nodes:
GEXEC_SVRS = "c17 c18 c19 c20"
gasnetrun_gm -np NP --gexec
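Putting the two lines above together in a bash session, a run on four specific nodes might look like this (the node names come from the example above, and the program name is a placeholder):
export GEXEC_SVRS="c17 c18 c19 c20"
gasnetrun_gm -np 4 --gexec ./executable_name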