Great Lakes
Transferring files
One option is to use Cyberduck, which works fine but can sometimes be a bit annoying/slow (and requires Duo authentication). Another option is to use the scp command in the terminal. For example, to transfer a file called test.txt from my desktop to my home directory on Great Lakes:
cd Desktop
scp test.txt gl-xfer:/home/asyadav/test.txt
To transfer an entire directory, use the -r option; e.g.
scp -r localdir gl-xfer:/home/asyadav
The command can be reversed (i.e. swap the source and destination) to transfer from Great Lakes to my laptop; e.g.
scp gl-xfer:/home/asyadav/test.txt ~/Desktop
See the Great Lakes user guide for more info.
A couple of notes on using scp:
- The great lakes user guide says that you’ll need to authenticate via Duo to complete the transfer, but I’ve only had to enter my password so far…
- I modified my ~/.ssh/config file so that I can use the shorthand gl-xfer rather than typing out the entire host name uniqname@greatlakes-xfer.arc-ts.umich.edu; see this linuxize post.
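For reference, the entry in ~/.ssh/config looks something like this (a sketch; the alias and uniqname are of course specific to my setup):

```
# ~/.ssh/config
Host gl-xfer
    HostName greatlakes-xfer.arc-ts.umich.edu
    User asyadav
```

With this in place, scp and ssh both accept gl-xfer in place of the full user@host string.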
Batch jobs
The main way of submitting jobs to Great Lakes is via the sbatch command. The command is designed to reject the job at submission time if there are requests or constraints that Slurm cannot fulfill as specified, giving users a chance to modify their job specifications.
Submitting a job
To submit a batch job you first need to create a simple batch job script which tells Slurm the job specifications (e.g. how many nodes, processors, memory, etc.) and the program to execute.
Here is an example batch script that I have used in the past, named calibration_SAMIN_mktclearing.sh. The fields are pretty self-explanatory.
#!/bin/bash
# The interpreter used to execute the script
# "#SBATCH" directives that convey submission options:
#SBATCH --job-name=calib_mktclear
#SBATCH --mail-user=asyadav@umich.edu
#SBATCH --mail-type=BEGIN,END,FAIL
#SBATCH --cpus-per-task=10
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --mem-per-cpu=1g
#SBATCH --time=48:10:00
#SBATCH --account=lsa3
#SBATCH --partition=standard
#SBATCH --output=/home/%u/%x-%j.log
# The application(s) to execute along with its input arguments and options:
julia -p 10 calibration_SAMIN_mktclearing.jl
To submit the job, simply navigate to the directory where the batch job script is located and use the sbatch command:
sbatch calibration_SAMIN_mktclearing.sh
Note that you need to specify a Slurm account for the job to run. To view which accounts you can submit to, use the command:
sacctmgr show assoc user=$USER
UPDATE: an email received from HPC support on April 21 says that I should be using lsa1!
Common errors
Often when running a new job you'll find errors in your code/script, so some iteration is involved. Once the code/script is OK, you may encounter an "out of memory" error. The usual solution is simply to increase the memory allocated to the job via the --mem-per-cpu option. Great Lakes defaults to 1G per CPU, but if this isn't sufficient, double it until the job goes through.
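For example, doubling the default would mean changing the corresponding line in the batch script to (2g is just the first doubling; keep doubling as needed):

```
#SBATCH --mem-per-cpu=2g
```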
Useful job commands
List queued and running jobs
squeue -u$USER
Cancel a queued job or kill a running job
scancel <job_id>
Cancel all jobs
scancel -u$USER
To monitor CPU/memory usage of running jobs you can ssh into the compute node the job is running on. To find the nodes your jobs are running on, use the squeue -u$USER command above. Then ssh into the node:
ssh c13n03
Once you're on the compute node, use the ps command to get an instantaneous snapshot of your CPU/memory usage:
ps -u$USER -o %cpu,rss,args
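Note that the rss column is reported in kibibytes, so large jobs produce big numbers. A quick sketch (piping ps into awk, both standard tools) to total resident memory across all of your processes in MiB:

```shell
# Sum the RSS (resident set size, in KiB) of all processes owned by $USER
# and print the total in MiB
ps -u"$USER" -o rss= | awk '{sum += $1} END {printf "%.1f MiB\n", sum/1024}'
```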
You can also use the top command here.
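If you prefer a non-interactive snapshot from top (handy for logging), batch mode works too; a sketch, assuming the node runs the usual procps top:

```shell
# One batch-mode iteration of top, restricted to your own processes
top -b -n 1 -u "$USER"
```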