Instructions for compiling and running MPI programs

Note that this does not cover using the PBS system to submit jobs to the cluster. It only aims to help with using mpicc/mpiCC/mpirun to compile and run programs, either on workstations or on the cluster.

Before you start

If you understand .rhosts files, add a line containing your current machine name followed by your username. If you don't, simply execute the following line (note that it overwrites any existing ~/.rhosts):

echo `hostname` $LOGNAME > ~/.rhosts
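
The one-liner above replaces whatever was already in ~/.rhosts. As a more cautious alternative, the following sketch appends the entry only if it is missing and then tightens the file's permissions, since rsh typically ignores a .rhosts file that other users can write to. The function name add_rhosts_entry is made up for illustration and is not part of any MPI distribution.

```shell
# add_rhosts_entry is a hypothetical helper, not part of mpich.
# Usage: add_rhosts_entry [file]   (file defaults to ~/.rhosts)
add_rhosts_entry() {
    rhosts="${1:-$HOME/.rhosts}"
    entry="$(hostname) ${LOGNAME:-$(id -un)}"

    # Create the file if it does not exist yet.
    touch "$rhosts"

    # -x matches whole lines only, -F treats the entry as a literal string.
    grep -qxF "$entry" "$rhosts" || echo "$entry" >> "$rhosts"

    # Remove group/other access; only the owner may read and write.
    chmod 600 "$rhosts"
}
```

Calling add_rhosts_entry with no arguments updates ~/.rhosts, and calling it a second time leaves a single entry rather than a duplicate.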

In addition, you need a file listing the hosts that you are going to use, one machine per line. A logical place to save this is ~/machines. If I were using cslin205, I might choose the following:

   cslin206
   cslin207
   cslin208
   cslin209

It's assumed that you've made this file, and it is located at ~/machines.
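
If you would rather generate the file than type it, a small shell function can write the list and skip the machine you are logged in to (the reason for leaving it out is given under Running). This is only a sketch, and write_machines is a hypothetical name.

```shell
# write_machines is a hypothetical helper.
# Usage: write_machines FILE HOST...
write_machines() {
    out="$1"
    shift
    self="$(hostname)"

    # Start from an empty file, then add one host per line.
    : > "$out"
    for h in "$@"; do
        # Leave out the machine we are currently logged in to.
        [ "$h" = "$self" ] || echo "$h" >> "$out"
    done
}
```

For example, write_machines ~/machines cslin206 cslin207 cslin208 cslin209 reproduces the file shown above.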

That should be all that needs setting up.

Compiling

If you are writing programs in C, the behaviour of the beowulf machine and the linux machines is the same. The program test.c is compiled into an executable called test as follows:

mpicc test.c -o test
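
If you have no MPI program at hand, a minimal one is enough to check that mpicc works. The sketch below writes such a program from the shell and compiles it when mpicc is available. The file name hello.c and the function names write_hello and build_hello are placeholders chosen for this example. The program itself uses only the standard MPI calls MPI_Init, MPI_Comm_rank, MPI_Comm_size and MPI_Finalize.

```shell
# write_hello is a hypothetical helper: it writes a small MPI test
# program called hello.c into the given directory (default: current one).
write_hello() {
    dir="${1:-.}"
    cat > "$dir/hello.c" <<'EOF'
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* this process's id */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* total number of processes */
    printf("process %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
EOF
}

# build_hello is a hypothetical helper: it compiles hello.c, but only
# if mpicc can be found on this machine.
build_hello() {
    dir="${1:-.}"
    if command -v mpicc >/dev/null 2>&1; then
        mpicc "$dir/hello.c" -o "$dir/hello"
    else
        echo "mpicc not found on PATH" >&2
        return 1
    fi
}
```

Running write_hello followed by build_hello in an empty directory should leave an executable called hello there, provided mpicc is installed.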

If you are writing programs in C++, then unfortunately it's slightly different. The mpicc on the linux machines will not compile C++ programs, so there is a separate script called mpiCC. Predictably enough, it's used as:

mpiCC test.cc -o test

On the beowulf machine, mpicc seems perfectly happy compiling C++, although this has not been thoroughly tested.

Running

Once your program is compiled, use the command mpirun to run it on the linux machines:

mpirun -machinefile ~/machines -np 4 test

This runs the program test with 4 processes (np = number of processes). mpirun simply works its way down the ~/machines file. From memory of mpich's behaviour, I think the host you are currently on automatically becomes the first node, and only then does mpirun start looking in the machinefile. This is why I left the current host out of ~/machines. On the linux machines, the value given to -np can be any number from 1 up.
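
If the behaviour described above holds (the current host first, then the hosts in the machinefile), then running one process per machine needs -np set to the number of listed hosts plus one. A rough sketch for working that out, where count_hosts is a hypothetical helper:

```shell
# count_hosts is a hypothetical helper.
# Prints the number of non-blank lines in a machine file.
count_hosts() {
    grep -c '[^[:space:]]' "$1"
}
```

With the four-host file from earlier, mpirun -machinefile ~/machines -np $(( $(count_hosts ~/machines) + 1 )) test would start 5 processes.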

If you are sending a job to the cluster and your program expects input from the keyboard, you can store what you would have typed in a file and pass it to the program with a command like:

mpirun test < inputfile

Only the process with rank 0 will receive the contents of this file.

Troubleshooting

If any of these steps fail, make sure that the following commands produce the expected output (with your own machine name and username in place of cslin102 and johnh):

  cslin102$ cat ~/.rhosts
  cslin102 johnh

  cslin102$ which mpicc
  /usr/local/mpich/bin/mpicc

  cslin102$ which mpirun
  /usr/local/mpich/bin/mpirun

Any "permission denied" messages are likely to be the result of setting your .rhosts file up incorrectly.
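
The checks above can also be bundled into a couple of shell functions. This is only a sketch: check_tool and check_rhosts are made-up names, and the install path on your machines may differ from /usr/local/mpich/bin.

```shell
# check_tool is a hypothetical helper: report where a command is found.
check_tool() {
    if path=$(command -v "$1" 2>/dev/null); then
        echo "ok: $1 -> $path"
    else
        echo "MISSING: $1 is not on your PATH"
        return 1
    fi
}

# check_rhosts is a hypothetical helper: check that an rhosts file
# (default ~/.rhosts) has a line for this host and user.
check_rhosts() {
    rhosts="${1:-$HOME/.rhosts}"
    entry="$(hostname) ${LOGNAME:-$(id -un)}"
    if grep -qxF "$entry" "$rhosts" 2>/dev/null; then
        echo "ok: $rhosts contains '$entry'"
    else
        echo "MISSING: '$entry' not found in $rhosts"
        return 1
    fi
}
```

Running check_rhosts, check_tool mpicc and check_tool mpirun in turn prints one line per check, and any line starting with MISSING points at the step to revisit.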