Run jobs on a supercomputer

There are two ways to run a job on a supercomputer. One way is to enter commands in real time (interactive mode), as if you were sitting in front of your personal computer, except that it now has much greater computing power. This way is quite convenient for short jobs. But as soon as you are disconnected from the supercomputer or your session expires, your running job is terminated, so interactive mode is not suitable for long jobs. The other way is to submit a batch job. Imagine you turn in a job application and then go about your normal life. You don’t need to sit in front of the supercomputer to keep the connection alive. The supercomputer will silently do the job that you submitted and will notify you when it finishes. This is the batch mode.

To run in interactive mode with the SLURM scheduler, you can use either srun or salloc. They function differently but are similar in many ways. This article, printed from the website of the Computer Science Department of Colorado State University, nicely explains their differences. On an SGE scheduler system, you run in interactive mode with qrsh or qlogin.
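
As a rough sketch (default time limits, partition names, and any extra flags your site requires will differ between clusters), an interactive session can be requested like this:

# SLURM: request 1 node with 2 tasks for 30 minutes and start a shell within the allocation
salloc --nodes=1 --ntasks=2 --time=00:30:00
# or launch an interactive shell directly on a compute node
srun --nodes=1 --ntasks=1 --time=00:30:00 --pty bash -i

# SGE: request an interactive session with a 30-minute run-time limit
qrsh -l h_rt=00:30:00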

Whichever way you use, there will be a limit on how many resources you can request, because a supercomputer is designed for multiple concurrent users. There may be a cap on the number of jobs one can submit per day or week, and there may be a cap on the time, memory, or number of cores that a job can use. To find out what those limits are on SLURM, use the following command:

sacctmgr show qos format=Name,Priority,Flags,MaxWall,MaxJobsPU,MaxSubmitPU,MaxTres
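
To see the limits attached to your own account rather than to a QOS, one option (the exact fields that are configured vary from cluster to cluster) is:

sacctmgr show associations user=$USER format=User,Account,Partition,MaxJobs,MaxSubmit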

To run in batch mode, you can use sbatch. The command in the bash shell is quite simple: sbatch batchscriptname.sh. In the following example, we write a batch script called “sample-batchscript.sh” to allocate resources for running a Python program called “sample-py.py”. The job name is a string carrying the values of three variables a, b, c. The returned value is the sum a+b+c. The batch script file sample-batchscript.sh is as follows:

#!/bin/bash
#
# Scheduler specific section
# --------------------------
#SBATCH --job-name="a=11,b=21,c=3" # a name for your job
#SBATCH --nodes=1                  # node count, default is 1
#SBATCH --ntasks=2                 # total number of tasks across all nodes
#SBATCH --cpus-per-task=1          # cpu-cores per task (>1 if multi-threaded tasks)
#SBATCH --mem-per-cpu=1G           # memory per cpu-core
#SBATCH --time=00:00:05            # total run time limit (HH:MM:SS)
#SBATCH --output=%j,%x.out 	       # name of output log file, default is slurm-%j.out
#SBATCH --mail-type=BEGIN,END,FAIL # send email when job begins, ends, or fails
#SBATCH --mail-user=youremailaddress
#
# Job specific section
# -----------------------
ulimit -s unlimited
printf "Job started at %s on %s \n" "$(date '+%H:%M:%S')" "$(date '+%m/%d/%Y')" 
echo ----------------Submitted batch script-----------
scontrol write batch_script $SLURM_JOB_ID - | sed '18,21d' | head -n -2
echo ------------End of submitted batch script-----------
mpirun -n $SLURM_NTASKS python3 sample-py.py $SLURM_JOB_NAME #mpiexec can replace mpirun
echo ------------Job summary-----------
scontrol show job $SLURM_JOB_ID

The first line, starting with #!, tells the system to run this script with the bash shell. The lines starting with #SBATCH are for SLURM to allocate resources. All other lines starting with # are comments and are ignored by the bash shell. After you submit the batch script with sbatch, SLURM makes a copy of the script, so the version that SLURM works on is not affected by any changes you later make to your script file. An analogy is that after you submit your job application, you cannot alter it. You can alter the version you have in hand, but you cannot alter the version you just sent away. The commands between the two echo lines print the version of the batch script at the time it was submitted. The parts sed '18,21d' and head -n -2 skip printing lines 18-21 and the last two lines of the batch script file.
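
After writing the script, a typical submit-and-monitor sequence looks like this (12345 stands in for the job ID that sbatch reports):

sbatch sample-batchscript.sh   # prints: Submitted batch job 12345
squeue -u $USER                # list your pending and running jobs
scancel 12345                  # cancel the job if it is no longer needed
sacct -j 12345                 # accounting summary once the job has finished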

The Python program sample-py.py is as follows:

import sys

def maincode(params):
    # params is a string such as "a=11,b=21,c=3"
    ass_params = params.split(",")
    # run each assignment (e.g. "a=11") so that a, b, c become global variables
    for i in ass_params:
        exec(i, globals())
    return a + b + c

if __name__ == "__main__":
    # the job name is passed in as the first command-line argument
    params = sys.argv[1]
    print('The sum is: ', maincode(params))
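
You can test the Python program by itself on a login node before submitting the batch job; with the values from the batch script above, the sum is 11 + 21 + 3 = 35:

python3 sample-py.py "a=11,b=21,c=3"
# prints: The sum is:  35

Under mpirun -n $SLURM_NTASKS, every MPI rank runs the whole script independently, so with --ntasks=2 this line is printed twice.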

On an SGE scheduler system, you submit the batch job with qsub: qsub batchscriptname.sh. An example of the SGE batch script is as follows:

#!/bin/bash
#
# give the job a name (environment variable $JOB_NAME)
#$ -N a=6,b=7,c=4
# set the shell
#$ -S /bin/bash
# join the output and error into one file (y means yes)
#$ -j y
# name the output file
#$ -o $JOB_ID-$JOB_NAME.out
# work in the current directory
#$ -cwd
#$ -M youremailaddress
# Email when job begins (b), ends (e), or aborted (a)
#$ -m bea
# parallel environment with requested number of cores (environment variable $NSLOTS)
#$ -pe orte 2
# request running time
#$ -l h_rt=00:00:05
# request virtual memory per MPI process (the total memory for the job is then 2G)
#$ -l h_vmem=1G

ulimit -s unlimited
printf "Job started at %s on %s \n" "$(date '+%H:%M:%S')" "$(date '+%m/%d/%Y')" 
echo ----------------Submitted batch script-----------
cat $JOB_SCRIPT | sed '24,27d' | head -n -1
echo ------------End of submitted batch script-----------
conda activate dedalus2
mpirun -n $NSLOTS python3 sample-py.py $JOB_NAME
qstat -j $JOB_ID
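
Submission and monitoring on SGE are analogous (12345 again stands in for the job ID that qsub reports):

qsub batchscriptname.sh   # reports the assigned job ID
qstat -u $USER            # list your pending and running jobs
qdel 12345                # delete the job if it is no longer needed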

I find the online documentation of SGE quite confusing compared to that of SLURM. However, this page, printed from the website of the College of Engineering Information Technology at Boston University, explains well how to request memory for a job.