Raven Cluster Tutorial

From Uv-wiki

WARNING: Your /home space is currently not backed up, so be sure to make regular backups!
If you have further questions, please check the FAQ. If the question is still not answered, email raven 'at' cs 'dot' utah 'dot' edu.

This page covers everything from setting up your Raven account to submitting a job and reading the output files.

Request a Raven Account

  • Email raven 'at' cs 'dot' utah 'dot' edu requesting an account.

Please include:

  • Name of professor authorizing new account (e.g. Professor Kirby for cs6230, Spring 2011)
  • Your full name
  • Desired login name (may be adjusted to system standards)
  • Email address to use for contact (if different from the originating address)

When your account has been created, all information regarding said account will be mailed to the address the request came from.

Log Into Your Raven Account

  • Start by connecting to <user_name>@raven-srv.cs.utah.edu with your ssh client (your username and password were given to you in the email mentioned above).

NOTE: Please do not SSH into the Raven compute nodes (raven1 - raven32) to submit jobs or do any other kind of general work. You cannot submit jobs from the compute nodes. SSH into a compute node only when you need to run a tool such as gdb there.

How to Change Your Password

  • When logging in for the first time, the system will require you to change your password (and will prompt you)
  • If you need to change your password later, use yppasswd from raven-srv

Set Up RSA Key Pair

  • You need to set up a public/private key pair, since MPICH2 requires you to be able to ssh to every cluster node without a password.
  • At the prompt, type ssh-keygen -t rsa.
    • For Enter file in which to save the key (/home/user/.ssh/id_rsa): press Enter to accept the default value.
    • For Enter passphrase (empty for no passphrase): press Enter so you don't need a password to connect to the other cluster nodes.
    • For Enter same passphrase again: press Enter again to confirm the empty passphrase.

You should get something like the following output:
(please notice there is one more step below the output box)

Your identification has been saved in /home/user/.ssh/id_rsa.
Your public key has been saved in /home/user/.ssh/id_rsa.pub.
The key fingerprint is:
23:05:7a:ae:b1:9a:70:d1:60:91:18:d8:fd:56:e9:94 ahumphre@raven-srv
The key's randomart image is:
+--[ RSA 2048]----+
|o+.o  .  o       |
|o o... .E        |
|  o ...+.        |
| . o oo..        |
|  . o.o S        |
|   . + . .       |
|. . o            |
| o o             |
|  o              |
+-----------------+
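  • At the prompt, type mv ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys. This authorizes your new public key for login, eliminating the need for a password when using ssh between cluster nodes. Note that mv replaces any existing authorized_keys file; if you already have one, append the key instead with cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys.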
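
The key setup steps above can also be run non-interactively. The following is a minimal sketch rather than part of the official Raven instructions: the function name setup_cluster_key and its optional directory argument are hypothetical, and it appends the public key to authorized_keys instead of moving it, so that any keys already authorized are preserved.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Raven tooling): create a passphrase-less
# RSA key pair and authorize it for password-free SSH between cluster nodes.
# Usage: setup_cluster_key [ssh_dir]   (ssh_dir defaults to ~/.ssh)
setup_cluster_key() {
    key_dir="${1:-$HOME/.ssh}"
    mkdir -p "$key_dir"
    chmod 700 "$key_dir"

    # -N "" gives an empty passphrase; -f names the private key file.
    # Skip generation when a key already exists so it is never overwritten.
    if [ ! -f "$key_dir/id_rsa" ]; then
        ssh-keygen -q -t rsa -N "" -f "$key_dir/id_rsa"
    fi

    # Append the public key only if it is not already listed, so existing
    # entries in authorized_keys survive and reruns add no duplicates.
    pub_key=$(cat "$key_dir/id_rsa.pub")
    touch "$key_dir/authorized_keys"
    grep -qxF -e "$pub_key" "$key_dir/authorized_keys" ||
        printf '%s\n' "$pub_key" >> "$key_dir/authorized_keys"
    chmod 600 "$key_dir/authorized_keys"
}
```

Calling setup_cluster_key with no argument acts on ~/.ssh; passing a scratch directory lets you try the steps without touching your real keys.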

known_hosts and hosts Files

NOTE: By default, these mandatory files should reside in your account on raven-srv in the following places:

  • /home/<username>/hosts
  • /home/<username>/.ssh/known_hosts

If they do not (if your account was created before 1/15/10), you can follow the instructions below to create them:

hosts File

This is the list of nodes raven-srv will use for jobs.

  • Download the file: hosts and put it in your raven-srv home directory - /home/<user_name>

known_hosts File

This is the RSA key fingerprint for all Raven nodes (raven1 - raven32) listed in the hosts file.

  • Download the file: known_hosts and put it in /home/<user_name>/.ssh on raven-srv


NOTE: If your browser wants to save the files with an extension, override it (i.e. save each file with no extension).
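
To confirm that both files are where they are expected, a quick check along the following lines may help. This is only a sketch: the function name check_raven_files and its optional home-directory argument are hypothetical, and it tests only that the two files named above exist and are non-empty, not that their contents are correct.

```shell
#!/bin/sh
# Hypothetical helper: report whether the hosts and known_hosts files
# described above are present (and non-empty) under a home directory.
# Usage: check_raven_files [home_dir]   (home_dir defaults to $HOME)
check_raven_files() {
    home_dir="${1:-$HOME}"
    missing=0
    for required in "$home_dir/hosts" "$home_dir/.ssh/known_hosts"; do
        if [ -s "$required" ]; then
            echo "ok:      $required"
        else
            echo "missing: $required"
            missing=1
        fi
    done
    # Exit status 0 when both files are present, 1 otherwise.
    return "$missing"
}
```

Run check_raven_files on raven-srv; any line starting with "missing:" names a file that still needs to be downloaded as described above.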

Create and Compile a Simple MPI Program

  • We can now start experimenting with MPI programs. To compile C programs use mpicc. To compile C++ programs use mpicxx. An example of each will be given.

C

Use a text editor to create the greetings.c file below somewhere in your home directory.

greetings.c

#include <stdio.h>
#include <string.h>
#include "mpi.h"
 
int main(int argc, char* argv[])
{
    int         my_rank;       /* rank of process */
    int         p;             /* number of processes */
    int         src;           /* rank of sender */
    char        message[100];  /* storage for message */
    MPI_Status  status;        /* return status for receive */
 
    /* Start up MPI */
    MPI_Init(&argc, &argv);
 
    /* Find our process rank  */
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
 
    /* Find out number of processes */
    MPI_Comm_size(MPI_COMM_WORLD, &p);
 
    if (my_rank != 0)
    {
        /* Create message */
        sprintf(message, "Greetings from process %d!", my_rank);
 
        /* Use strlen+1 so that '\0' gets transmitted */
        MPI_Send(message, strlen(message) + 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
    }
    else /* my_rank == 0 */
    {
        for (src = 1; src < p; src++)
        {
            MPI_Recv(message, 100, MPI_CHAR, src, 0, MPI_COMM_WORLD, &status);
            printf("%s\n", message);
        }
    }
 
    /* Shut down MPI */
    MPI_Finalize();

    return 0;
}

Compile the program with mpicc greetings.c. mpicc is not itself a compiler; it is a wrapper that invokes gcc with the arguments needed to build an MPI program (notably, linking against the MPICH library). To see the underlying command without running it, use mpicc greetings.c -show. The plain mpicc greetings.c command generates an executable named a.out in the present working directory.

To name it something else use mpicc -o greetings greetings.c.
Here an executable named greetings will be created in the present working directory.

C++

Use a text editor to create the greetings.cpp file below somewhere in your home directory.

greetings.cpp

#include <iostream>
#include <stdio.h>
#include <string.h>
#include "mpi.h"

int main(int argc, char* argv[])
{
    int         my_rank;      /* rank of process */
    int         p;            /* number of processes */
    int         src;          /* rank of sender */
    char        message[100]; /* storage for message */
    MPI_Status  status;       /* return status for receive */

    /* Start up MPI */
    MPI_Init(&argc, &argv);

    /* Find our process rank  */
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

    /* Find out number of processes */
    MPI_Comm_size(MPI_COMM_WORLD, &p);

    if (my_rank != 0)
    {
        /* Create message */
        sprintf(message, "Greetings from process %d!", my_rank);

        /* Use strlen+1 so that '\0' gets transmitted */
        MPI_Send(message, strlen(message) + 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
    }
    else /* my_rank == 0 */
    {
        for (src = 1; src < p; src++)
        {
            MPI_Recv(message, 100, MPI_CHAR, src, 0, MPI_COMM_WORLD, &status);
            std::cout << message << std::endl;
        }
    }

    /* Shut down MPI */
    MPI_Finalize();

    return 0;
}

Compile the program with mpicxx greetings.cpp. Like mpicc, mpicxx is just a wrapper, in this case around g++. This will generate an a.out file in the present working directory, like the C version above.
To name it something else use mpicxx -o greetings greetings.cpp. Here an executable named greetings will be created in the present working directory.
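
The two compile commands can be wrapped in a small helper that picks mpicc or mpicxx from the source file's extension. This is a sketch under stated assumptions: the function names mpi_wrapper_for and build_mpi are hypothetical, and only mpicc, mpicxx, and the -o option are taken from the text above.

```shell
#!/bin/sh
# Hypothetical helpers; only mpicc, mpicxx, and -o come from the tutorial text.

# Print the MPI compiler wrapper that matches a source file's extension.
mpi_wrapper_for() {
    case "$1" in
        *.c)               echo mpicc ;;
        *.cpp|*.cc|*.cxx)  echo mpicxx ;;
        *) echo "unsupported source file: $1" >&2; return 1 ;;
    esac
}

# Compile a source file into an executable named after it
# (for example, greetings.c or greetings.cpp becomes ./greetings).
build_mpi() {
    src_file="$1"
    wrapper=$(mpi_wrapper_for "$src_file") || return 1
    "$wrapper" -o "${src_file%.*}" "$src_file"
}
```

For example, build_mpi greetings.cpp runs mpicxx -o greetings greetings.cpp, which avoids overwriting a.out when you keep several programs in one directory.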

Submitting a Job to the Raven Cluster (psub)

  • The final step is submitting the program to run on the cluster, which is done with the qsub program. Rather than calling qsub directly, it is easier to submit with the command psub <executable> <number_of_processes> <queue_name> [optional_PBS_parameters].
  • psub creates a PBS script and passes it to qsub for the job submission. To submit a job that runs the above program with 16 processes on the cs6230 queue, type
$> psub ./a.out 16 '-q cs6230'
  • Queues available:
    • cs6230 (for Professor Mike Kirby's Parallel Computing HPC class, Spring 2011)
    • gauss (for Gauss Group members)
    • batch (default queue)
  • Some useful "optional PBS parameters" for the last psub argument are -m ae -M <email_address>. This will make the scheduler send an email to <email_address> once the job has finished or has been aborted.
  • Use the command qstat -a to check the status of the submitted job. It shows whether your job is queued or running (a completed job no longer appears in the list). The above job might take a few seconds to change from queued to running.
  • After the job has run, there will be four or five new files in the current working directory:
    • psub.<username> - This is the PBS script created by psub that is submitted automatically to qsub.
    • psub.<pbs_jobid>.out - This is the standard output of the MPI program. It will have the printf/cout output from the MPI program, for instance.
    • psub.<pbs_jobid>.err - This is the standard error of the MPI program. NOTE: This file may not exist.
    • psub.<username>.o<job_number> - This is the standard output related to job submission.
    • psub.<username>.e<job_number> - This is the standard error related to job submission.
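
Once a job has finished, the program's own output can be collected from the files listed above. The helper below is a minimal sketch: the function name show_job_output is hypothetical, and it relies only on the psub.<pbs_jobid>.out and psub.<pbs_jobid>.err naming described in this section.

```shell
#!/bin/sh
# Hypothetical helper: print the MPI program's stdout/stderr files that a
# finished psub job leaves behind, each preceded by a header line.
# Usage: show_job_output [job_dir]   (job_dir defaults to the current directory)
show_job_output() {
    job_dir="${1:-.}"
    for job_file in "$job_dir"/psub.*.out "$job_dir"/psub.*.err; do
        # Skip unmatched patterns; the .err file may legitimately not exist.
        [ -f "$job_file" ] || continue
        echo "==> $job_file <=="
        cat "$job_file"
    done
}
```

Run show_job_output from the directory where you called psub. The job-submission logs (psub.<username>.o<job_number> and psub.<username>.e<job_number>) do not match these patterns and are left out on purpose.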