The batch job queuing and scheduling system used on Vuori is the Simple Linux Utility for Resource Management (SLURM).
This chapter describes how to use SLURM: its most important commands and their options, how to submit jobs, how to get information about hosts, queues and jobs, and how to manage jobs and remove them from the queue. Batch job script examples are also given.
All batch jobs must be submitted to the compute nodes through the SLURM queuing mechanism using the command sbatch. The use of the job submission and launch commands sbatch, salloc and srun is described in the chapter Running programs (batch and interactive).
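As a rough illustration of submitting through sbatch, the sketch below writes a minimal batch script and submits it. The file name hello_job.sh, the job name and the requested limits are assumptions chosen for the example, not values prescribed for Vuori; the #SBATCH options themselves are standard sbatch options.

```shell
# Write a minimal batch script. The file name and job name are hypothetical.
cat > hello_job.sh <<'EOF'
#!/bin/sh
#SBATCH --job-name=hello
#SBATCH --partition=test
#SBATCH --ntasks=1
#SBATCH --time=00:05:00
#SBATCH --output=hello_%j.out

# Launch one task that prints the name of the compute node it runs on.
srun hostname
EOF

# Submit the script only where SLURM is actually installed.
if command -v sbatch >/dev/null 2>&1; then
    sbatch hello_job.sh
else
    echo "sbatch not found; script written to hello_job.sh"
fi
```

In the output file name, %j is replaced by the job ID, so repeated submissions do not overwrite each other's output.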
Available queues / partitions
The command sinfo displays the available partitions and some of their properties. In SLURM, a partition is the equivalent of a queue. The set of partitions and their limits may change from time to time. The following queues were available to customers when this chapter was written:
serial : 1-12 cores / 1d/7d def/max runtime
parallel : 12-144 cores / 1d/7d def/max runtime
longrun : 1-144 cores / 2d/21d def/max runtime
test : 1-24 cores / 5min/15min def/max runtime
interactive : 1-12 cores / 1h/4h def/max runtime
gpu : 7 nodes (14 GPUs) / 10h/24h def/max runtime
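Because the list above may be out of date, the current partitions can be checked with sinfo. The following is a small sketch; the wrapper function show_partitions is a hypothetical helper introduced here, while the sinfo options and format fields are standard.

```shell
# show_partitions: print partition name, availability, time limit and node count.
# Hypothetical convenience wrapper around sinfo.
show_partitions() {
    if ! command -v sinfo >/dev/null 2>&1; then
        echo "sinfo not found (not on a SLURM system?)" >&2
        return 1
    fi
    # %P = partition name, %a = availability,
    # %l = maximum time limit, %D = number of nodes
    sinfo -o "%P %a %l %D"
}

show_partitions || true

# Other commonly used forms:
#   sinfo -s        summary with one line per partition
#   sinfo -p test   restrict the output to the partition named test
```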
NB! In the longrun queue you run at your own risk. If a batch job in that queue stops prematurely, no compensation is given for the lost CPU time!
The test queue is intended for prompt execution of short test jobs (less than 15 min). For longer tests, use the interactive queue.
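The core and runtime limits listed above can be turned into a simple selection rule. The sketch below is a hypothetical helper, pick_queue, that suggests a batch queue for a given core count and runtime in minutes. The limits are hard-coded from the list in this chapter and may no longer match the system. Preferring the test queue for any job of at most 15 minutes, and serial over parallel for exactly 12 cores, are assumptions made for the example. The interactive and gpu queues are left out.

```shell
# pick_queue CORES MINUTES
# Prints a suggested partition name, or "none" if no listed queue fits.
# Hypothetical helper; the limits below are copied from this chapter.
pick_queue() {
    cores=$1
    minutes=$2
    if [ "$cores" -lt 1 ] || [ "$cores" -gt 144 ] || [ "$minutes" -lt 1 ]; then
        echo "none"
        return 1
    fi
    if [ "$minutes" -le 15 ] && [ "$cores" -le 24 ]; then
        echo "test"          # 1-24 cores, at most 15 min
    elif [ "$minutes" -le 10080 ] && [ "$cores" -le 12 ]; then
        echo "serial"        # 1-12 cores, at most 7 days
    elif [ "$minutes" -le 10080 ]; then
        echo "parallel"      # 13-144 cores here, at most 7 days
    elif [ "$minutes" -le 30240 ]; then
        echo "longrun"       # 1-144 cores, at most 21 days
    else
        echo "none"
        return 1
    fi
}

pick_queue 4 10      # prints: test
pick_queue 48 600    # prints: parallel
```

The result could be passed to sbatch with the -p option, for example sbatch -p "$(pick_queue 48 600)" myjob.sh, where myjob.sh stands for any batch script.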
The chapter is divided into the following subsections:
1 General information
2 Commands (submitting and deleting jobs)
3 Monitoring and displaying jobs and system status