Working on a computing cluster#
The Slurm Queuing System#
Many people have access to Euler. If everyone ran whatever program they liked, whenever they liked, the system would soon grind to a halt as it tried to share its limited resources among all the users. To prevent this, and to ensure fair usage of the cluster, a queuing system automatically manages which jobs run when. Any program that uses more than 1 CPU (sometimes referred to as a core or thread, though there are minor technical differences between these terms), needs more than a few MB of RAM, or runs for longer than a few minutes should be submitted to the queue.
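To get an idea of what resources are available, the standard Slurm command sinfo lists the partitions and nodes that the scheduler manages; the exact partitions and node names you see depend on how the cluster is configured.
# List partitions, node states and time limits (output depends on the cluster's configuration)
sinfo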
The queue on Euler is managed by the Slurm queuing system, which our server Cousteau also uses. To submit a job correctly, it is usually easiest to write a short shell script based on a template like the one below:
#! /bin/bash
#SBATCH --job-name example # Job name
#SBATCH --output example_out.log # Output log file path
#SBATCH --error example_error.log # Error log file path
#SBATCH --ntasks 8 # Number of CPUs
#SBATCH --mem-per-cpu=2G # Memory per CPU
#SBATCH --time=1:00:00 # Approximate time needed
# Insert your commands here
echo This job ran with $SLURM_NTASKS threads on $SLURM_JOB_NODELIST
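Inside the job script, Slurm sets environment variables such as $SLURM_NTASKS (the number of CPUs you requested) and $SLURM_JOB_NODELIST (the node(s) the job runs on). A common pattern is to pass these to your program rather than hard-coding values; here is a minimal sketch, where my_tool and its --threads option are hypothetical stand-ins for whatever program you actually run:
# Hypothetical example: give a multi-threaded program exactly the CPUs Slurm allocated
my_tool --threads "$SLURM_NTASKS" input.fasta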
You then interact with the queue using the following commands:
# Submit the job to the queue
sbatch my_jobscript.sh
# Check the status of your jobs
squeue
# Remove a job from the queue
scancel jobid
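On a busy cluster, squeue on its own lists everyone's jobs. A couple of commonly useful variants (standard Slurm commands, though the columns shown depend on the cluster's configuration):
# Show only your own jobs
squeue -u $USER
# Show detailed information about one job (replace 1234567 with a real job ID from squeue)
scontrol show job 1234567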
Exercise 0.9
Copy the script /nfs/teaching/551-0132-00L/2_Good_practices/submit_slurm.sh to your home directory.
# Copy the submit script to your home directory
cp /nfs/teaching/551-0132-00L/2_Good_practices/submit_slurm.sh ~/
Submit the script to the job queue with sbatch and look at the output files.
# Submit the script
sbatch submit_slurm.sh
# Check whether it is in the queue (it may finish too quickly for you to catch it)
squeue
# Check the output files
less example*error.txt # Note: "*" stands for a number
# Should be empty
less example*out.txt # Note: "*" stands for a number
# Should tell you that it ran with 8 threads on localhost
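The number in the file names is filled in by Slurm when the job runs (most likely the job ID), so the easiest way to find the exact files is to list them:
# List the output files your job actually produced
ls example*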
Now edit the script:
Remove the existing echo command.
Put a command to run the script you wrote for Exercise 2.5 on one of the fasta files in /nfs/teaching/551-0132-00L/1_Unix/genomes. You should only use 1 CPU instead of 8; the other parameters can stay the same unless you want to rename the job and the log files.
# Modify the submit script (submit_slurm.sh) to look something like this:
#! /bin/bash
#SBATCH --job-name fastacount # Job name
#SBATCH --output out.log # Output log file path
#SBATCH --error error.log # Error log file path
#SBATCH --ntasks 1 # Number of CPUs
#SBATCH --mem-per-cpu=2G # Memory per CPU
#SBATCH --time=1:00:00 # Approximate time needed
./fastacount.sh /nfs/teaching/551-0132-00L/1_Unix/genomes/bacteria/escherichia/GCF_000005845.2_ASM584v2/GCF_000005845.2_ASM584v2_cds_from_genomic.fna
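For reference, if your Exercise 2.5 script simply counts the sequences in a fasta file, it might look something like the minimal sketch below; fastacount.sh is assumed here, and your own script may well do more or be structured differently.
#! /bin/bash
# Minimal sketch of fastacount.sh: count the header lines (">") in the fasta file given as the first argument
grep -c ">" "$1"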
Submit the job. When the job is finished, look at the output files for yourself.
# Then you submit it like this:
sbatch submit_slurm.sh
# Check the output
less error.log # Should be empty
less out.log # Should have the output of your script, e.g. 4302
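If job accounting is enabled on the cluster (usually the case with Slurm, but this is an assumption about the setup), you can also check how long the job ran and how much memory it used once it has finished:
# Show runtime, peak memory and final state of a finished job (replace 1234567 with your job ID)
sacct -j 1234567 --format=JobID,JobName,Elapsed,MaxRSS,State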