Login node jobs
On a login node (vuori.csc.fi), only small test programs may be run without the scheduler (SLURM). For example, a small serial program can be run with:
./my_serial_test_executable
Some programs might need a purpose-built environment, i.e., a modulefile. Typically, a modulefile contains instructions that alter or set shell environment variables, such as PATH, to enable access to various installed software or libraries. The currently loaded modulefiles can be listed with:
module list
All available modulefiles can be listed with:
module avail
If the program needs access to specific shared libraries that are defined in a modulefile, load the modulefile before running the executable. This is not necessary if the modulefile has already been loaded, either by you or by default. For example:
module load modulefilename
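To illustrate what loading a modulefile typically does to the environment, the following minimal sketch prepends a directory to PATH by hand in a POSIX shell. The directory /opt/example/bin is hypothetical and is not a real path on Vuori, and an actual modulefile may set several other variables as well.

```shell
#!/bin/sh
# Hypothetical install directory; NOT a real path on Vuori.
example_bin=/opt/example/bin

# Prepend the directory so that programs in it are found first.
PATH="$example_bin:$PATH"
export PATH

# Extract and show the first entry of the search path.
first_entry=${PATH%%:*}
echo "first PATH entry: $first_entry"
```

In csh, which the job scripts on this page use, the equivalent assignment would be setenv PATH /opt/example/bin:${PATH}.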
Compute node jobs
On the login node, a job is submitted to SLURM, which places it in a queue and runs it when the necessary resources become available on the compute nodes.
IMPORTANT: All files needed by a job, for example the program and its input and output files, must be copied to $WRKDIR. Remember to give the necessary module load modulefilename commands. The limits (maximum number of cores, maximum run time) for interactive sessions are listed at the bottom of this page.
Interactive serial job
salloc -p interactive -n 1 -t 02:00:00 (allocates the resources)
srun ./my_serial_executable (launches the job)
exit (exits the allocation)
Options:
-n number of processes (number of cores)
-t running time, wallclock, format hh:mm:ss (hours:minutes:seconds)
Alternatively, as a one-liner:
salloc -p interactive -n 1 -t 02:00:00 srun ./my_serial_executable
If an application has a command-line interface (such as the gdb debugger), the following example starts a pseudo-terminal on a compute node. Once the resources have been allocated, the debugging session can be launched normally:
salloc -p interactive -n 1 -t 02:00:00 srun --pty $SHELL
gdb ./my_program
exit
Alternatively, as a one-liner:
salloc -p interactive -n 1 -t 02:00:00 srun --pty gdb ./my_program
SEE ALSO
man salloc, man srun
Interactive Non-MPI parallel job
salloc -n 4 --mem-per-cpu=1000 -t 01:30:00 -p interactive
srun ./my_executable
exit
Options:
-n number of processes (number of cores)
-t running time, wallclock, format hh:mm:ss (hours:minutes:seconds)
--mem-per-cpu per-process memory limit in MB (for example, 1 GB = 1000 MB)
Alternatively, as a one-liner:
salloc -n 4 --mem-per-cpu=1000 -t 01:30:00 -p interactive srun ./my_executable
SEE ALSO
man salloc, man srun
Interactive MPI-parallel job
salloc -n 24 --ntasks-per-node=12 --mem-per-cpu=1000 -t 00:30:00 -p parallel
srun ./my_MPI_executable
exit
Options:
-n number of processes (number of cores)
--ntasks-per-node number of processes per node. Vuori has 12 cores per node, so the value 12 distributes the job over the minimum number of nodes
-t running time, wallclock, format hh:mm:ss (hours:minutes:seconds)
--mem-per-cpu per-process memory limit in MB
Alternatively, as a one-liner:
salloc -n 24 --ntasks-per-node=12 --mem-per-cpu=1000 -t 00:30:00 -p parallel srun ./my_MPI_executable
SEE ALSO
man salloc, man srun
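The relationship between -n, --ntasks-per-node and --mem-per-cpu can be checked with a little shell arithmetic. The sketch below is written for a POSIX shell and uses the values from the interactive MPI example; the variable names are illustrative only and are not SLURM variables.

```shell
#!/bin/sh
# Values taken from the interactive MPI example.
ntasks=24            # -n
tasks_per_node=12    # --ntasks-per-node (Vuori has 12 cores per node)
mem_per_cpu=1000     # --mem-per-cpu, in MB

# Round up: a partially filled node still counts as a whole node.
nodes=$(( (ntasks + tasks_per_node - 1) / tasks_per_node ))

# Memory requested on one full node and for the whole job, in MB.
mem_per_node=$(( tasks_per_node * mem_per_cpu ))
mem_total=$(( ntasks * mem_per_cpu ))

echo "nodes: $nodes, memory per node: $mem_per_node MB, total: $mem_total MB"
```

With these values the request spans 2 nodes, which is within the 12-node limit of the parallel partition.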
Submitting serial or parallel batch jobs
A serial or parallel batch job is submitted using sbatch:
sbatch my_job_script
All files needed by a job, for example the program and its input and output files, must be in $WRKDIR. $WRKDIR is available on all nodes and always points to the same location within the cluster.
SEE ALSO
man sbatch
Remember to include the necessary module load modulefilename commands in the script. Loading all needed modulefiles in the script ensures that the job always runs in the correct environment.
Serial batch job
#!/bin/csh
###
### serial job script example
###
## name of your job
#SBATCH -J my_jobname
## system error message output file
#SBATCH -e my_output_err_%j
## system message output file
#SBATCH -o my_output_%j
## a per-process (soft) memory limit
## limit is specified in MB
## example: 1 GB is 1000
#SBATCH --mem-per-cpu=1000
## how long a job takes, wallclock time hh:mm:ss
#SBATCH -t 01:01:00
## number of processes
#SBATCH -n 1
## run my executable
srun ./my_serial_program
Parallel batch job
#!/bin/csh
###
### parallel job script example
###
## name of your job
#SBATCH -J my_jobname
## system error message output file
#SBATCH -e my_output_err_%j
## system message output file
#SBATCH -o my_output_%j
## a per-process (soft) memory limit
## limit is specified in MB
## example: 1 GB is 1000
#SBATCH --mem-per-cpu=1000
## how long a job takes, wallclock time hh:mm:ss
#SBATCH -t 11:01:00
## the number of processes (number of cores)
#SBATCH -n 24
## parallel queue
#SBATCH -p parallel
## run my MPI executable
srun ./my_mpi_program
OpenMP and hybrid OpenMP/MPI jobs
Use the option --cpus-per-task (or -c for short) with the commands salloc and sbatch to allocate cores for threads. The following commands both reserve four cores for the program:
salloc --cpus-per-task=4 srun ./my_openmp_app
salloc -c 4 srun ./my_openmp_app
The environment variable OMP_NUM_THREADS specifies the number of OpenMP threads. By default, there is one thread per core, which on Vuori means 12 threads per node. To match the allocation above, set:
setenv OMP_NUM_THREADS 4
The following commands will both run a hybrid OpenMP/MPI job. They allocate four tasks and six cores per task for the program:
salloc --ntasks=4 --cpus-per-task=6 srun ./my_hybrid_app
salloc -n 4 -c 6 srun ./my_hybrid_app
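The resources requested by the hybrid example can be worked out as follows. This is a minimal sketch for a POSIX shell with illustrative variable names; it assumes that nodes are filled completely, which the scheduler is not guaranteed to do.

```shell
#!/bin/sh
ntasks=4            # --ntasks / -n
cpus_per_task=6     # --cpus-per-task / -c
cores_per_node=12   # Vuori, as stated on this page

# Total cores requested and how many MPI tasks fit on one node.
total_cores=$(( ntasks * cpus_per_task ))
tasks_per_node=$(( cores_per_node / cpus_per_task ))

# Assumption: nodes are packed full, so this is the minimum node count.
nodes=$(( (total_cores + cores_per_node - 1) / cores_per_node ))

# One OpenMP thread for each core allocated to a task.
OMP_NUM_THREADS=$cpus_per_task
export OMP_NUM_THREADS

echo "cores: $total_cores, nodes: $nodes, tasks per node: $tasks_per_node, threads per task: $OMP_NUM_THREADS"
```

In a csh session or job script, the thread count would be set with setenv OMP_NUM_THREADS 6 instead of the export shown here.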
Binding threads to cores
The runtime libraries on Vuori support core affinity, which binds a thread to particular cores. In general, this improves performance. The binding is controlled with compiler-specific environment variables as follows:
PGI
setenv MP_BIND yes
The value of MP_BIND must be set to yes, because the default is no. Otherwise all threads on a node run on a single core.
PathScale
setenv PSC_OMP_AFFINITY TRUE
setenv PSC_OMP_AFFINITY_GLOBAL TRUE
The default of PSC_OMP_AFFINITY is TRUE, so it is not necessary to set it again.
GCC
setenv GOMP_CPU_AFFINITY "0-11"
Setting the variable is necessary. Otherwise all threads on a node run on a single core.
For more information, see Chapter Shared memory parallelization.
Queue/partition limits
- serial (default queue), max nodes=1, max CPUs=12, run time limit 7 days
- parallel, max nodes=12, max CPUs=144, run time limit 7 days
- longrun, max nodes=12, max CPUs=144, run time limit 21 days
- interactive, max nodes=1, max CPUs=12, run time limit 4 hours
- gpu, run time limit 24 hours
- gpu6g, run time limit 12 hours
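Before submitting, a request can be checked against these limits. The following sketch for a POSIX shell encodes only the core limits listed here; the function names max_cores and fits are hypothetical helpers, not SLURM commands, and the gpu partitions are left out because no core limit is given for them.

```shell
#!/bin/sh
# Maximum core counts copied from the partition list on this page.
max_cores() {
    case "$1" in
        serial|interactive) echo 12 ;;
        parallel|longrun)   echo 144 ;;
        *)                  echo 0 ;;   # unknown or unlisted partition
    esac
}

# Hypothetical helper: succeeds if the request fits into the partition.
# $1 = partition name, $2 = requested number of cores
fits() {
    limit=$(max_cores "$1")
    [ "$limit" -gt 0 ] && [ "$2" -le "$limit" ]
}

if fits parallel 24; then
    echo "24 cores fit into parallel"
fi
if ! fits interactive 24; then
    echo "24 cores do not fit into interactive"
fi
```

The same pattern could be extended to run time limits, which would first have to be converted to a common unit such as hours.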