Slurm Help
Slurm Documentation
The HPC Team cannot possibly keep up with the constant slew of updates to Slurm and its documentation. General information about Slurm commands, operation, and use can be found on the pages listed below. Always defer to SchedMD’s website first for the most up-to-date and relevant documentation.
Official Documentation
Official Slurm Documentation Homepage.
Command Summary (2-page sheet).
Slurm NEWS (changes in recent versions of Slurm)
Subscribe to Slurm Mailing Lists.
Slurm Publications and presentations.
YouTube videos about Slurm.
Slurm Man Pages: overview of man pages, configuration files, and daemons.
Slurm Bugs tracking system.
Slurm Quick Start administrator guide.
Slurm Elastic Computing (Cloud Bursting) (Google Cloud, Amazon EC2 etc.)
Non-official Documentation
Most Popular User Commands
Slurm commands have many different parameters and options. The SchedMD-provided Command Summary (2-page sheet) is a handy reference for most commands and their options. Below is a list of commands the HPC Team believes are most relevant to DEAC Cluster users:
- Queue List: squeue [-u USERNAME]
- Job Submission: sbatch SCRIPT
- Cluster Status: sinfo [-Nel]
- Job deletion: scancel JOBID
- Job status: squeue
- Job information: scontrol show job JOBID
- Temp job info: sstat -j JOBID.batch
- Job Completion: sacct -j JOBID -l
- Node List: sinfo -N
- Show Reservation: scontrol show reservation=NAME
- Fairshare Info: sshare [-A ACCOUNT] [-u USERNAME]
- Queue Priority: sprio
DEAC Cluster Slurm Specifics
The DEAC Cluster has a few configuration specifics that set it apart from a default Slurm install. They are listed below.
Example
Below is an example job that can run on the DEAC Cluster. Normally an --account= directive would be included; this example omits it, so the user's default account is used. Including an account specification is highly recommended, especially for users who belong to multiple research groups.
#!/bin/bash
#SBATCH --job-name="Example_Submission"
#SBATCH --partition=small
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --mem=5GB
#SBATCH --time=00-00:05:00
#SBATCH --mail-user=%u@wfu.edu
#SBATCH --mail-type=BEGIN,END,FAIL
#SBATCH --output=slurm-%x-%j.o
echo "Running Job $SLURM_JOB_ID" #Print Slurm Job ID
pwd #Print current working directory
cd /home/$USER #Go to homedir and print so you see change
pwd
cd /scratch/$SLURM_JOB_ID #Change to temp scratch dir
pwd
which python3 #Show default python3 path
python3 -V #Show default python3 version
module load apps/python/3.11.8 #Load python3 modulefile
which python3 #Show updated python3 path
python3 -V #Show updated python3 version
module list #Show loaded modules
hostname #Print compute node hostname where job ran
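Once the example above is saved to a file, it can be submitted and followed with the commands from the list of popular user commands. The sketch below assumes the script was saved as example.sh (a hypothetical file name) and that sbatch prints its usual "Submitted batch job" line; the extract_job_id helper is illustrative and not part of Slurm.

```shell
#!/bin/bash
# Sketch: submit a batch script and capture its job ID for follow-up commands.
# "example.sh" is a hypothetical file name for the script shown above.

# sbatch normally prints "Submitted batch job <id>"; keep only the last word.
extract_job_id() {
    echo "${1##* }"
}

# Only attempt a submission where Slurm and the script are actually present.
if command -v sbatch >/dev/null 2>&1 && [ -f example.sh ]; then
    job_id=$(extract_job_id "$(sbatch example.sh)")
    if [ -n "$job_id" ]; then
        squeue -j "$job_id"            # status while pending or running
        scontrol show job "$job_id"    # full job information
        sacct -j "$job_id" -l          # accounting record, most useful after completion
    fi
fi
```

With the --output pattern used in the example, the job's output should land in a file named after the job name and job ID in the directory the job was submitted from.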
Accounts
Each research group corresponds to a shared Slurm account for tracking utilization. For example, Engineering Professor Adam Carlson would have a “carlsonGrp” Slurm account that he and all of his sponsored researchers would use when submitting jobs to Slurm. The account is specified using the Slurm accounts directive (--account=) in a batch job submission.
Each Slurm account inherits its priority from the parent department. In this case, carlsonGrp would inherit its priority from the “egr” Slurm parent account. This is important to know because all child accounts under egr affect each other's overall priority. The same goes for all other departments.
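As a minimal sketch of how the account directive fits into a job, the snippet below writes a small batch script that charges the job to the carlsonGrp example account from the text. The file name account_demo.sh and the job name are hypothetical and chosen only for illustration.

```shell
#!/bin/bash
# Sketch: write a minimal batch script that charges the job to one account.
# "carlsonGrp" is the example account from the text; "account_demo.sh" and
# the job name are hypothetical.
account="carlsonGrp"
script="account_demo.sh"

cat > "$script" <<EOF
#!/bin/bash
#SBATCH --job-name="Account_Demo"
#SBATCH --account=${account}
#SBATCH --partition=small
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --time=00-00:05:00
hostname
EOF

grep -- '--account' "$script"   # confirm the directive that was written
```

The same option can also be given on the command line (sbatch --account=carlsonGrp account_demo.sh), which takes precedence over the directive inside the script.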
Partitions
The DEAC Cluster has 4 primary partitions:
large - Jobs > 1 node, <180 days; the default partition.
small - Jobs = 1 node, <1 day; receives double the partition priority of large.
gpu - Jobs <= 2 nodes, <28 days; only partition with GPU resources.
interactive - Jobs = 1 node, <1 day; all interactive jobs run here.
The small, large, and interactive partitions share the same nodes. They differ only in the limits placed on running jobs and in the priority assigned to each job upon submission. The gpu partition consists of GPU nodes, which can also be found in the interactive partition.
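The limits above can be turned into a quick check before choosing a --partition= value. The sketch below reads the listed limits literally, so a non-GPU single-node job of a day or longer matches no rule and is reported as "none"; that is a limitation of this sketch, not a statement of cluster policy. The suggest_partition function and its arguments are hypothetical.

```shell
#!/bin/bash
# Sketch: suggest a partition from the limits listed above, read literally.
# Arguments: node count, walltime in whole hours, "yes" if GPUs are needed.
suggest_partition() {
    local nodes=$1 hours=$2 gpu=${3:-no}
    if [ "$gpu" = "yes" ]; then
        # gpu: at most 2 nodes, under 28 days (672 hours)
        if [ "$nodes" -le 2 ] && [ "$hours" -lt 672 ]; then
            echo gpu
            return 0
        fi
    elif [ "$nodes" -eq 1 ] && [ "$hours" -lt 24 ]; then
        echo small      # small: exactly 1 node, under 1 day
        return 0
    elif [ "$nodes" -gt 1 ] && [ "$hours" -lt 4320 ]; then
        echo large      # large: more than 1 node, under 180 days (4320 hours)
        return 0
    fi
    echo none           # no listed limit matches; ask the HPC Team
    return 1
}

suggest_partition 1 2 no      # single-node, 2-hour job
suggest_partition 4 48 no     # four-node, 2-day job
suggest_partition 2 100 yes   # two-node GPU job
```

Interactive jobs are not covered here, since the text states they all run in the interactive partition.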
Node Features
Because the DEAC Cluster is heterogeneous, we use node Features to identify differences between node types. Features can be referenced using the Slurm constraints directive (--constraint=) in a batch job submission. Valid features and constraint options are as follows:
login: These nodes are used to submit jobs and are not assigned to any partition to execute jobs.
amd : These nodes contain AMD cores (64-core)
zen# : This designates the revision of AMD core architecture (the higher the number, the newer the architecture).
intel : These nodes contain Intel cores
skylake : These nodes have Intel’s Xeon E5 Skylake based processors (44-core UCS nodes)
cascade : These nodes have Intel’s Xeon Gold Cascade Lake based processors (44 and 48-core UCS nodes)
rocky9 : Designates the operating system installed on the node.
44cores : Designates 44-cores available on the node.
48cores : Designates 48-cores available on the node.
64cores : Designates 64-cores available on the node.
highmem : Designates high memory limit (currently 2.3TB) on the node.
gpu : Designates GPU available (suboptions are: a100_80, a100_40, v100_32).
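Several features can be combined in one constraint. Slurm's constraint syntax treats "&" as AND and "|" as OR between feature names. The sketch below builds an AND expression from a list of features; the require_all helper is hypothetical, and the intel, cascade, and 48cores combination is only an illustration taken from the list above.

```shell
#!/bin/bash
# Sketch: combine node features into one --constraint expression.
# Slurm treats "&" as AND and "|" as OR between feature names.

# Join all arguments with "&" so every listed feature is required.
require_all() {
    local IFS='&'
    echo "$*"
}

constraint=$(require_all intel cascade 48cores)
echo "#SBATCH --constraint=\"${constraint}\""

# On a system with Slurm, list each node with its features to see valid names.
if command -v sinfo >/dev/null 2>&1; then
    sinfo -N -o "%N %f"
fi
```

When the expression is passed on the command line instead of in a directive, it needs quoting so the shell does not interpret the "&" itself.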
Priority Calculation
The Priority Calculation equation used by the DEAC Cluster for each job is as follows:
The Priority Weights are determined as follows:
Fairshare = Based upon a leveled Department Fairshare (\(\mathbf{F_{\mathrm{Dept}}}\)) starting value, and adjusted by Slurm based on monthly utilization compared to expected baseline.
Age = Slurm-assigned value based on wait time (up to a 7-day maximum; up to 100 jobs per group simultaneously)
Partition = DEAC partition values as follows: small=20; large=10; gpu=40; (all others=10)
QOS = 0 for normal QOS (default), and 10 for any high QOS (only available for contributors).
Nice_Factor = A way to manually adjust job importance by a weight of +/-2147483645 (via the --nice directive). A positive value lowers priority; only admins can assign a negative value to increase priority.
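Because regular users can only lower a job's priority, a nice value can be checked before it is passed to sbatch. The sketch below accepts only whole numbers from 0 to 2147483645; the valid_user_nice helper, the value 100, and the example.sh file name are hypothetical.

```shell
#!/bin/bash
# Sketch: check a nice value before passing it to sbatch --nice.
# Regular users may only lower priority, so only 0..2147483645 is accepted.
valid_user_nice() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit such as "-"
    esac
    [ "$1" -le 2147483645 ]
}

nice_value=100   # hypothetical value chosen for illustration
if valid_user_nice "$nice_value"; then
    # Print the command rather than run it; example.sh is a placeholder.
    echo "sbatch --nice=${nice_value} example.sh"
else
    echo "nice value ${nice_value} is not allowed for regular users" >&2
fi
```

After submission, sprio can be used to see how the individual factors contribute to a pending job's priority.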
The higher the overall calculated value, the higher the priority. The most complicated aspect of this calculation is called “leveled fairshare”, where Slurm takes the standard assigned integer value and levels it on a scale of 0 to 1. The following example uses a new user (leveled fairshare of 1). If a user submits a job via their normal QOS to the large partition, the calculation is as follows:
If the user has made a contribution, and submits a job via their high QOS to the large partition, the calculation is as follows:
This highlights how a contributing group receives a threefold increase in priority via their high QOS from the same starting point for a job submission.
If a non-contributing user has waited 7 days for their job to start (the maximum time factor), then their job's priority will have increased to match that of the high QOS:
This time-based increase helps ensure a level of balance so that non-contributing users can still have jobs run after a certain amount of wait time.