Running programs (batch and interactive)

This section gives a short introduction to running programs on Murska.

Login node jobs

On a login node (murska.csc.fi) a user can run only small test programs without the scheduler (LSF-SLURM). For example, a small serial program and a small parallel program can be run, respectively, with:

./my_serial_test_executable
/opt/hpmpi/bin/mpirun -np 2 ./my_parallel_mpi_test_executable

Some programs might need a purpose-built environment, i.e., a module file. Typically, a module file contains instructions that alter or set shell environment variables, such as PATH to enable access to various installed software or libraries.

To view the modulefiles that are currently loaded in your environment:

module list

To see all the available modulefiles:

module avail

If the program needs shared libraries that are defined in a modulefile, load the modulefile before running the executable. This is not necessary if the modulefile is already loaded, either by you or by default. For example:

module load modulefilename
module load mpi (this example loads the default MPI environment)
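
Because a modulefile typically works by extending PATH, one quick way to see whether the right modulefile is loaded is to check that the command can be found. The following is a minimal sketch for a POSIX shell (sh or bash, not csh); need_cmd is a hypothetical helper name used for illustration, not a command installed on Murska.

```shell
# need_cmd is a hypothetical helper (an assumption, not a Murska command).
# It succeeds if the named command is found in PATH and prints a hint otherwise.
need_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "$1 not found in PATH; a 'module load modulefilename' may be missing" >&2
    return 1
}
```

For example, need_cmd mpirun && mpirun -np 2 ./my_parallel_mpi_test_executable starts the test only when mpirun is found.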


Compute node jobs

On the login node, a job is submitted to LSF-SLURM, which places the job in a queue and lets it run when the necessary resources become available on the compute nodes.
In short, LSF allocates the resources and SLURM provides an execution layer that launches tasks on all nodes in the allocation.

IMPORTANT: All files needed by a job, for example the program and the input/output files, must be copied to $WRKDIR. Remember to give module load modulefilename commands if needed. The limits (maximum number of cores, maximum runtime) of interactive sessions are listed at the bottom of this page.
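
The copying step can be sketched as a small shell function. This is only an illustration for a POSIX shell (sh or bash, not csh): $WRKDIR comes from this guide, while stage_job, the directory name my_test_run and the file name input.dat are hypothetical.

```shell
# stage_job is a hypothetical helper (an assumption, not part of Murska).
# Usage: stage_job <job_directory_name> <file>...
# It creates $WRKDIR/<job_directory_name>, copies the given files there
# and prints the path of the new directory.
stage_job() {
    job_dir="$WRKDIR/$1"
    shift
    mkdir -p "$job_dir" || return 1
    cp -- "$@" "$job_dir"/ || return 1
    echo "$job_dir"
}
```

A possible use is cd "$(stage_job my_test_run ./my_serial_executable ./input.dat)", after which the job can be started from the new directory.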

Interactive serial job without GUI (Graphical User Interface).

bsub -n 1 -W 02:00 -Ip $SHELL -i   (allocates the resources, LSF)
srun ./my_serial_executable (launch the job, SLURM)
exit (exit the allocation)

Options:
-n number of processes (number of cores)
-W running time, wallclock, format hh:mm (hours:minutes)
-Ip interactive job

SEE ALSO
man bsub, man srun
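
Since -W expects the runtime as hh:mm, a runtime known in minutes has to be converted first. A minimal sketch for a POSIX shell (sh or bash, not csh) follows; minutes_to_hhmm is a hypothetical helper name.

```shell
# minutes_to_hhmm is a hypothetical helper (an assumption, not a Murska command).
# It formats a runtime given in whole minutes as hh:mm for bsub -W.
minutes_to_hhmm() {
    printf '%02d:%02d\n' $(( $1 / 60 )) $(( $1 % 60 ))
}

minutes_to_hhmm 120   # prints 02:00
minutes_to_hhmm 90    # prints 01:30
```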

The shell started by bsub shows a command prompt only if the parameter -i is passed to it. Even then, some advanced control keys might not work; in that case, try using xterm instead of $SHELL (see below).

Interactive non-MPI parallel job without GUI

bsub -n 4 -M 1048576 -W 01:30 -Ip $SHELL -i
srun ./my_executable
exit

Options:
-n number of processes (number of cores)
-W running time, wallclock, format hh:mm (hours:minutes)
-Ip interactive job
-M per process memory limit (KB) (example 1GB = 1048576 KB)
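
The -M value is given in KB, so a limit expressed in gigabytes has to be multiplied by 1024 twice. A minimal sketch for a POSIX shell (sh or bash, not csh) follows; gb_to_kb is a hypothetical helper name.

```shell
# gb_to_kb is a hypothetical helper (an assumption, not a Murska command).
# It converts whole gigabytes to the KB value expected by bsub -M.
gb_to_kb() {
    echo $(( $1 * 1024 * 1024 ))
}

gb_to_kb 1   # prints 1048576
gb_to_kb 2   # prints 2097152
```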

Alternatively (LSF and SLURM on the same command line):

bsub -n 4 -M 1048576 -W 01:30 -Ip srun ./my_executable

Interactive MPI-parallel job without GUI

bsub -n 4 -M 1048576 -W 00:30 -Ip $SHELL -i
mpirun -srun ./my_MPI_executable
exit

bsub options:
-n number of processes (number of cores)
-W running time, wallclock, format hh:mm (hours:minutes)
-Ip interactive job
-M per process memory limit (KB)

Alternatively (LSF and SLURM on the same command line):

bsub -n 4 -M 1048576 -W 00:30 -Ip mpirun -srun ./my_MPI_executable

Serial or parallel interactive session where a graphical user interface (GUI) is necessary.

Serial session (one core). Remember to give all necessary bsub options (memory and runtime requirements).

bsub -Ip xterm

Multi-core session (4 cores in this example). Remember to give all necessary bsub options (number of cores, memory and runtime requirements).

bsub -n 4 -Ip xterm

These commands open an X terminal window in which you can launch a serial or parallel application and in which the prompt also supports all control characters. After the xterm session has started, commands can be entered normally:

srun ./my_serial_executable
mpirun -srun ./my_MPI_executable


How to submit a serial or parallel batch job.

bsub < my_job_script

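Do not forget the '<'; it is essential, because bsub reads the job script, including its #BSUB lines, from standard input.
All files needed by a job, for example the program and the input/output files, must be in $WRKDIR.
$WRKDIR is available on all nodes and always refers to the same directory.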

Serial batch job. 

#!/bin/csh
###
### serial job script example
###
# execution shell environment
#BSUB -L /bin/csh

## name of your job, %J will show as your jobID
#BSUB -J my_jobname%J

## system error message output file
#BSUB -e my_output_err_%J

## system message output file
#BSUB -o my_output_%J

## send email notification when the job is finished
#BSUB -N

## a per-process (soft) memory limit
## limit is specified in KB
## example: 1 GB is 1048576 KB; the 524288 below is 512 MB
#BSUB -M 524288

## how long a job takes, wallclock time hh:mm
#BSUB -W 01:01

## number of processes
#BSUB -n 1

## run my executable
srun my_serial_program

## bjobs will save some information about my job
bjobs -l $LSB_JOBID

Parallel batch job.

#!/bin/csh
###
### parallel job script example
###
# Initializes the execution environment
#BSUB -L /bin/csh

## name of your job, %J will show as your jobID
#BSUB -J my_jobname%J

## system error message output file
#BSUB -e my_output_err_%J

## system message output file
#BSUB -o my_output_%J

## send email notification when the job is finished
#BSUB -N

## a per-process (soft) memory limit
## limit is specified in KB
## example: 1 GB is 1048576
#BSUB -M 1048576

## how long a job takes, wallclock time hh:mm
#BSUB -W 11:01

## the number of processes (number of cores)
#BSUB -n 4

## run my MPI executable
/opt/hpmpi/bin/mpirun -srun my_mpi_program

## bjobs will save some information about my job
bjobs -l $LSB_JOBID

Remember to include module load modulefilename commands in a script if needed. Loading all the needed modulefiles in the script ensures that the environment is always correct. Note that since July 2008 the modules environment is initialized automatically for all shells.
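
Before submitting with bsub < my_job_script, it can be useful to check that the script sets the options used in the examples above (-n, -W and -M). The following is a minimal sketch for a POSIX shell (sh or bash, not csh); check_job_script is a hypothetical helper, and the choice of these three options as "required" is an assumption made for illustration.

```shell
# check_job_script is a hypothetical helper (an assumption, not an LSF command).
# It reports which of the #BSUB options -n, -W and -M are missing from a script.
check_job_script() {
    script=$1
    missing=""
    for opt in -n -W -M; do
        # Look for a line such as "#BSUB -n 4" at the start of a line.
        if ! grep -q -e "^#BSUB[[:space:]]\{1,\}$opt[[:space:]]" "$script"; then
            missing="$missing $opt"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing #BSUB options:$missing"
        return 1
    fi
    echo "ok"
}
```

For example, check_job_script my_job_script && bsub < my_job_script submits the job only when all three options are present.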

Available queues

The command bqueues displays the available queues and some of their properties. These may change from time to time. The following queues were available for customers when this chapter was written:


queue        max cores  default / max runtime  interactive
serial               1  4h / 7d                no
parallel           256  4h / 2d                no
interactive         32  1h / 4h                yes
longrun            128  8h / 21d               no

NB! In the longrun queue you run at your own risk. If a batch job in that queue stops prematurely, no compensation is given for the lost CPU time!
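
The queue limits listed above can be read as a simple decision rule for batch jobs. The sketch below, for a POSIX shell (sh or bash, not csh), hard-codes the limits as they were when this chapter was written, so treat the output of bqueues as authoritative. pick_batch_queue is a hypothetical helper, and the order in which the queues are tried is an assumption.

```shell
# pick_batch_queue is a hypothetical helper (an assumption, not an LSF command).
# Usage: pick_batch_queue <cores> <runtime_in_hours>
# The limits are copied from the queue list above: 7d = 168h, 2d = 48h, 21d = 504h.
pick_batch_queue() {
    cores=$1
    hours=$2
    if [ "$cores" -eq 1 ] && [ "$hours" -le 168 ]; then
        echo serial
    elif [ "$cores" -le 256 ] && [ "$hours" -le 48 ]; then
        echo parallel
    elif [ "$cores" -le 128 ] && [ "$hours" -le 504 ]; then
        echo longrun
    else
        echo "no listed batch queue fits $cores cores for $hours hours" >&2
        return 1
    fi
}

pick_batch_queue 1 1     # prints serial
pick_batch_queue 4 11    # prints parallel
pick_batch_queue 64 100  # prints longrun
```

LSF's bsub accepts a queue name with the -q option; whether a queue must be named explicitly on Murska is not stated in this section.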