Running Materials Studio array jobs in Murska

The "array job" functionality can be used to submit a set of predefined jobs and to control how many resources they use. This is important for Accelrys software jobs because all jobs share a limited pool of license tokens.

The LSF queuing system can run an array job, which is essentially a single job script that submits several jobs in a controlled fashion.

For example, if you want to run 20 DMol3 jobs, you could write a single array job script instead of 20 separate submission scripts and let it run the jobs. This way you can control how many jobs run at the same time and thus prevent the license tokens from running out.

The relevant line in the LSF-script would be

#BSUB -J dmol_arrayjob[1-20]%3

which would run twenty jobs but allow only three of them to execute at the same time.

For more details on the array job, look at the bsub man page.

The complete job file could look like this:

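#!/bin/tcsh
#BSUB -L /bin/tcsh
# 20 array elements, at most 3 running at the same time
#BSUB -J dmol_arrayjob[1-20]%3
# %J expands to the job ID; adding %I (the array index) gives one file per subjob
#BSUB -o dmol_out%J
#BSUB -e dmol_err%J
#BSUB -M 1048576
#BSUB -W 12:00
#BSUB -n 1

cd /to/the/right/directory

# Pick the line of inputlist.txt whose number matches this subjob's index
set job_name=(`sed -n "$LSB_JOBINDEX"p inputlist.txt`)

set DMOL3PATH=/fs/local/linux26_x86_64/appl/chem/accelrys/MS/MS_4.2/DMol3/bin

# Run DMol3 for the selected job
$DMOL3PATH/RunDMol3.sh $job_name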

$LSB_JOBINDEX is a variable that contains the index number of each subjob. sed then reads the line specified by that index from the list of jobs to be done (inputlist.txt). In this example the list should have a single column containing the names of the jobs to be run (the basename of each job, as explained on the standalone page).
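
The line-selection step can be tried outside the queue. The following is a minimal sketch in POSIX sh (not tcsh, so that it runs in most shells), assuming a hypothetical inputlist.txt with three made-up job names. LSB_JOBINDEX is set by hand here, whereas in a real array job LSF sets it for each subjob.

```shell
#!/bin/sh
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical job list: one job basename per line.
printf '%s\n' benzene water_dimer si_bulk > inputlist.txt

# Simulate the index LSF would assign to each array element.
for LSB_JOBINDEX in 1 2 3; do
    # sed -n suppresses normal output; "Np" prints only line N.
    job_name=$(sed -n "${LSB_JOBINDEX}p" inputlist.txt)
    echo "index $LSB_JOBINDEX -> $job_name"
done
```

Each index picks exactly one line, so the number of lines in the list should match the upper bound of the array range given with -J.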

For more background on array jobs, see the old @CSC article on array jobs in the SGE batch job system (@CSC 4/2005, page 9, pdf). To use that example on Murska (i.e. with LSF), you must replace:

#$ -t 1-x    -> #BSUB -J jobname[1-x]
$SGE_TASK_ID -> $LSB_JOBINDEX

Limiting the number of simultaneous jobs with the % definition is the only thing that needs to be added to what is discussed in the old @CSC article. (And of course all SGE-specific lines should be converted to LSF format.)
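
The two replacements, plus the % limit, can be applied mechanically with sed. This is only a sketch in POSIX sh: the SGE fragment, the job name dmol_arrayjob and the limit of 3 are illustrative assumptions rather than content from the @CSC article, and other SGE-specific lines would still need converting by hand.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical fragment of an SGE array job script.
cat > sge_job.sh <<'EOF'
#$ -t 1-20
set job_name=(`sed -n "$SGE_TASK_ID"p inputlist.txt`)
EOF

# 1) "#$ -t 1-x" becomes "#BSUB -J jobname[1-x]" with a %3 limit appended.
# 2) $SGE_TASK_ID becomes $LSB_JOBINDEX.
sed -e 's/^#\$ -t \([0-9][0-9]*-[0-9][0-9]*\)/#BSUB -J dmol_arrayjob[\1]%3/' \
    -e 's/\$SGE_TASK_ID/$LSB_JOBINDEX/g' \
    sge_job.sh > lsf_job.sh

cat lsf_job.sh
```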

Total usage of Accelrys licenses

We ask users of Accelrys software not to hog all the license tokens but to be considerate of other users. Please don't use more than 16 license tokens at a time, and if more than 50 tokens are in use, try to stay below 10. Thank you.
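
These limits translate directly into the % value of an array job. The sketch below (POSIX sh) shows the arithmetic. The number of tokens a single job consumes is a hypothetical value chosen for illustration, so check the real figure for your own jobs before relying on it.

```shell
#!/bin/sh
# Hypothetical value: license tokens consumed by one running job.
tokens_per_job=4

# Limits taken from the guideline above.
normal_max=16   # "not more than 16 tokens at a time"
busy_max=9      # "stay below 10" when more than 50 tokens are in use

# Integer division rounds down, so the limit is never exceeded.
normal_limit=$((normal_max / tokens_per_job))
busy_limit=$((busy_max / tokens_per_job))

echo "array limit in normal use: %$normal_limit"
echo "array limit when over 50 tokens are in use: %$busy_limit"
```

With these assumed numbers the job name line would end in %4 in normal use and %2 when the license pool is busy.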