4 SLURM: HPC scheduler
If you have written some scripts and want to execute them, it is advisable to send them to the scheduler. The scheduler (SLURM) distributes the jobs across the cluster (6+ machines) and makes sure that there are no CPU or memory conflicts when multiple people submit jobs to the cluster. This is the essential job of a scheduler.
4.1 First steps
Sending jobs to SLURM in R is supported via the R package clustermq.
The R interpreters and packages are not shared with RSW. Therefore, all R packages your script needs must be reinstalled on the HPC for the respective R version.
Rather than calling an R script directly, you need to wrap your code into a function and invoke it using clustermq::Q().
Instead of using clustermq directly, you can make use of R packages like {targets} or {drake} to automatically wrap your whole analysis so that all of its layers are executed on the HPC.
There is no other way to submit your R jobs to the compute nodes of the cluster than by using one of the tools mentioned above.
Also, it is essential to load all required system libraries (e.g. GDAL, PROJ) via environment modules so that they are available on all nodes.
Note that most likely the versions of these libraries will differ from the ones used in the RSW container. For reproducibility it might be worth not deviating too much, or even using the same versions on the HPC and within RSW.
4.2 SLURM commands
While the execution of jobs is explained in more detail in Chapter 4, the following section aims to familiarize you with the usage of the scheduler.
The scheduler is queried via the terminal, i.e. you need to ssh into the server or switch to the “Terminal” tab in RStudio.
The most important SLURM commands are:
- sinfo: An overview of the current state of the nodes
sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
all* up infinite 4 alloc c[0-2],edi
all* up infinite 2 idle c[3-4]
frontend up infinite 1 alloc edi
threadripper up infinite 4 alloc c[0-2],edi
opteron up infinite 2 idle c[3-4]
- squeue: An overview of the current jobs that are queued, including information about running jobs
squeue
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
129_[2-5] threadripper cmq7381 patrick PD 0:00 1 (Resources)
121_2 threadripper cmq7094 patrick R 6:24:17 1 c1
121_3 threadripper cmq7094 patrick R 6:24:17 1 c2
129_1 threadripper cmq7381 patrick R 5:40:44 1 c0
- sacct: An overview of jobs that were submitted in the past, including their end state
sacct
JobID JobName Partition Account AllocCPUS State ExitCode
122 cmq7094 threadripper (null) 0 COMPLETED 0:0
123 cmq7094 threadripper (null) 0 PENDING 0:0
121 cmq7094 threadripper (null) 0 PENDING 0:0
125 cmq6623 threadripper (null) 0 FAILED 1:0
126 cmq6623 threadripper (null) 0 FAILED 1:0
127 cmq6623 threadripper (null) 0 FAILED 1:0
128 cmq6623 threadripper (null) 0 FAILED 1:0
124 cmq6623 threadripper (null) 0 FAILED 1:0
130 cmq7381 threadripper (null) 0 PENDING 0:0
- scancel: Cancel running jobs using the job ID. If you want to cancel all jobs of your specific user, you can call scancel -u <username>.
4.3 Submitting jobs
clustermq setup
Every job submission is done via clustermq::Q() (either directly or via drake).
See the setup instructions in the {clustermq} package on how to set up the package.
First, you need to set some options in your .Rprofile (on the master node, or in your project root when you use {renv} or {packrat}):
options(
  clustermq.scheduler = "slurm",
  clustermq.template = "/path/to/file"
)
See the package vignette on how to set up the file.
Note that you can have multiple .Rprofile files on your system:
- Your default R interpreter will use the .Rprofile found in the home directory (~/).
- But you can also save an .Rprofile file in the root directory of an (RStudio) project, which will be preferred over the one in $HOME.
This way you can use customized .Rprofile files tailored to a project.
At this stage you should be able to run the example at the top of the README of the {clustermq} package.
It is a very simple example which finishes in a few seconds.
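As a quick reference, a minimal submission (closely following the example from the {clustermq} README; fx and its inputs are just placeholders) looks roughly like this:
library(clustermq)
fx <- function(x) x * 2      # trivial function executed on a worker
Q(fx, x = 1:3, n_jobs = 1)   # submits one SLURM job and returns list(2, 4, 6)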
If it does not work, you either did something wrong or the nodes are busy.
Check with sinfo and squeue. Otherwise, see the troubleshooting chapter.
Be aware of setting n_cpus in the template argument of clustermq::Q() if your submitted job is parallelized! If you submit a parallelized job without telling the scheduler, the scheduler will reserve 1 core for it (because it assumes the job is sequential) while in fact multiple processes will spawn. This can affect all running processes on the server, since the scheduler will accept more work than the node can actually handle.
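For illustration (a sketch; the function and the resource values are made up), a job that parallelizes internally over four cores should also request four CPUs via the template argument:
fx <- function(x) {
  # internal parallelization over 4 cores
  parallel::mclapply(1:100, function(i) i * x, mc.cores = 4)
}
clustermq::Q(fx, x = 1:10, n_jobs = 2,
             template = list(n_cpus = 4, memory = 4096))  # 4 CPUs and 4096 MB per job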
4.3.2 The scheduler template
To successfully submit jobs to the scheduler, you need to set the .Rprofile options given above.
Note that you can add any bash commands to the script between the SBATCH section and the final R call.
For example, a template could look as follows:
#!/bin/sh
#SBATCH --job-name={{ job_name }}
#SBATCH --partition=all
#SBATCH --output={{ log_file | /dev/null }} # you can add .%a for array index
#SBATCH --error={{ log_file | /dev/null }}
#SBATCH --cpus-per-task={{ n_cpus }}
#SBATCH --mem={{ memory }}
#SBATCH --array=1-{{ n_jobs }}
source ~/.bashrc
cd /full/path/to/project
# load desired R version via an env module
module load r-3.5.2-gcc-9.2.0-4syrmqv
CMQ_AUTH={{ auth }} R --no-save --no-restore -e 'clustermq:::worker("{{ master }}")'
Note: The # signs are not mistakes here; they are not “comment” signs in this context. The SBATCH commands will be executed.
You can simply copy it and adjust it to your needs. You only need to set the right path to your project and specify the R version you want to use.
4.3.3 Allocating resources
There are two approaches/packages you can use:
- drake/targets
- clustermq
The drake approach is only valid if you have set up your project as a drake or targets project.
drake::make(parallelism = "clustermq", n_jobs = 1,
template = list(n_cpus = <X>, log_file = <Y>, memory = <Z>))
(The individual components of these calls are explained in more detail below.)
Note that drake uses clustermq under the hood.
Notations like <X> are meant to be read as placeholders, i.e. they need to be replaced with valid content.
When submitting jobs via clustermq::Q(), it is very important to tell the scheduler how many cores and how much memory should be reserved for you.
If you specify fewer cores than you actually use in your script (e.g. by internal parallelization), the scheduler will plan with X cores although your submitted code will spawn Y processes in the background. This might overload the node and eventually cause your script and, more importantly, the processes of others to crash.
There are two ways to specify these settings, depending on which approach you use:
- via clustermq::Q() directly
Pass the values via the template argument, e.g. template = list(n_cpus = <X>, memory = <Y>).
They will then be filled into the clustermq.template file (frequently named slurm_clustermq.tmpl), which contains the following lines:
#SBATCH --cpus-per-task={{ n_cpus }}
#SBATCH --mem={{ memory }}
This tells the scheduler how many resources (here CPUs and memory) your job needs.
- via drake::make()
Again, set the options via the argument template = list(n_cpus = <X>, memory = <Y>).
See section “The resources column for transient workers” in the drake manual.
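If you use {targets} instead of {drake}, the resources are set via tar_option_set() in _targets.R and the pipeline is run with tar_make_clustermq(). A rough sketch (assuming a recent {targets} version; the target and the resource values are placeholders):
# _targets.R
library(targets)
tar_option_set(
  resources = tar_resources(
    clustermq = tar_resources_clustermq(template = list(n_cpus = 4, memory = 4096))
  )
)
list(
  tar_target(result, some_heavy_function())  # some_heavy_function() is a placeholder
)
# run the pipeline on the cluster (from the R console):
# tar_make_clustermq(workers = 2)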
Please think upfront how many CPUs and how much memory your task requires. The following two examples show the implications of wrong specifications.
mclapply(mc.cores = 20) (in your script) > n_cpus = 16
In this case, four workers will always be in “waiting mode” since only 16 CPUs are covered by your resource request. This slows down your parallelization but does no harm to other users.
mclapply(mc.cores = 11) (in your script) < n_cpus = 16
In this case, you reserve 16 CPUs on the machine but use 11 at most. This blocks five CPUs of the machine for no reason, potentially causing other people's jobs to be queued rather than processed immediately.
Furthermore, if you want to use all resources of a node and run into memory problems, try reducing the number of CPUs (assuming you have already increased the memory to its maximum). With fewer CPUs, there is more memory per CPU available.
4.3.4 Monitoring progress
When submitting jobs, you can track their progress by specifying a log_file in the clustermq::Q() call, e.g. clustermq::Q(..., template = list(log_file = "/path/to/file")).
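A more complete call could look like this (function, paths and values are placeholders); all workers write to the given log file, and adding .%a to the --output/--error lines of the template appends the SLURM array index so workers do not overwrite each other:
clustermq::Q(fx, x = 1:10, n_jobs = 2,
             template = list(n_cpus = 1, memory = 2048,
                             log_file = "/full/path/to/logs/worker.log"))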
For drake, the equivalent is to specify the console_log_file argument in either make() or drake_config().
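For example (a sketch; plan is a placeholder, and newer drake versions may name this argument differently):
drake::make(plan, parallelism = "clustermq", n_jobs = 2,
            console_log_file = "drake.log",
            template = list(n_cpus = 1, memory = 2048))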
If your jobs are running on a node, you can SSH into the node, e.g. ssh c0. There you can take a look at the current load using htop.
Note that you can only log in if you have a running process on that specific node.
4.3.5 renv specifics
If {renv} is used and jobs should be sent from within RSW, Slurm tries to load {clustermq} and {renv} from the following library:
<your/project>/renv/library/linux-centos-7/R-4.0/x86_64-pc-linux-gnu/
This library is not used by default, only in this very special case (Slurm + RSW).
The reason for this is that Slurm thinks it is on CentOS when invoking the CMQ_AUTH={{ auth }} R --no-save --no-restore -e 'clustermq:::worker("{{ master }}")' call and tries to find {clustermq} in this specific library.
When working directly on the HPC via a terminal, the {renv} library path is renv/library/R-4.0/x86_64-pc-linux-gnu/.
Simply copying {clustermq} and {renv} to this location is enough:
mkdir renv/library/linux-centos-7/R-4.0/x86_64-pc-linux-gnu
cp -R renv/library/R-4.0/x86_64-pc-linux-gnu/clustermq renv/library/linux-centos-7/R-4.0/x86_64-pc-linux-gnu/
cp -R renv/library/R-4.0/x86_64-pc-linux-gnu/renv renv/library/linux-centos-7/R-4.0/x86_64-pc-linux-gnu/
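To check that the workers now find these packages, you can submit a trivial job that just reports the library paths seen on the node (a quick sanity check, nothing more):
clustermq::Q(function(x) .libPaths(), x = 1, n_jobs = 1)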
4.3.6 RStudio Slurm Job Launcher Plugin
While using the Launcher GUI in RStudio would simplify some things, the problem is that it requires the R versions to be shared across all nodes. Since the RSW container uses its own R versions and is decoupled from the R environment modules used on the HPC, adding these would duplicate the R versions in the container and create confusion.
Also, it seems the RStudio GUI does not allow loading additional env modules, which is a requirement for loading certain R packages.
4.4 Summary
- Set up your .Rprofile with options(clustermq.template = "/path/to/file"). The clustermq.template option should point to a SLURM template file in your $HOME or project directory.
- Decide which approach you want to use: drake/targets or clustermq.
- A Slurm template file is required. This template needs to be linked in your .Rprofile with options(clustermq.template = "/path/to/file").