Daru Lab guide to using Stanford Sherlock HPC cluster


Sherlock HPC Resources

We have access to the Sherlock cluster. The official Sherlock documentation is at https://www.sherlock.stanford.edu/docs/.

As a member of the Daru lab, you have access to shared lab resources on Sherlock, including the "hns" partition and the group scratch space described below.

Connecting by SSH

Log in over SSH from a terminal, using your SUNet ID as the username.

# Connect to Sherlock from your local computer
ssh <user>@login.sherlock.stanford.edu
    
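If you connect often, you can optionally add a host alias to your local ~/.ssh/config so that "ssh sherlock" works on its own; the ControlMaster settings reuse one authenticated connection so you only complete the login once per session. This is a sketch, not lab policy; substitute your own SUNet ID.

# On your local computer, in ~/.ssh/config
Host sherlock
    HostName login.sherlock.stanford.edu
    User <user>
    # Reuse one authenticated connection for subsequent sessions
    ControlMaster auto
    ControlPath ~/.ssh/%r@%h:%p
    ControlPersist yes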

Set Up Your Scratch Directory

On Sherlock, you can access several partitions, but the ones commonly used in the Daru lab are the "hns" and "normal" partitions. Use your user-specific scratch directory to run your jobs. This is also where you can temporarily store large data files, but remember that files on SCRATCH and GROUP_SCRATCH are automatically purged 90 days after their last content modification.
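Sherlock sets the $SCRATCH and $GROUP_SCRATCH environment variables to your personal and group scratch directories, so you can use them instead of typing full paths. A quick sketch:

# On the cluster
echo $SCRATCH          # your user scratch directory, e.g. /scratch/users/<user>
echo $GROUP_SCRATCH    # the shared lab scratch directory
cd $SCRATCH            # work from scratch when running jobs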

To transfer files from your local computer to the cluster, you can use sftp, or you can download data directly on the cluster if it is hosted online somewhere.

# On your local computer
# Transfer files or dirs from your local computer to the scratch space
sftp <user>@dtn.sherlock.stanford.edu
cd /scratch/users/<user>
put /pathtofile/on/your/computer/file.csv
    
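If your data is hosted online, you can also download it directly on the cluster instead of routing it through your laptop. A minimal sketch (the URL is a placeholder):

# On a Sherlock login node
cd $SCRATCH
wget https://example.org/path/to/dataset.csv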

Submit Jobs to the Cluster Using SLURM

The Sherlock cluster uses the SLURM job scheduler to manage shared resources. When you log in, you land on a login ("head") node, which is like a waiting area; do not run any jobs there. Instead, submit your jobs from your SCRATCH directory using a "job script" that reserves resources for the job and sends it to run on a "compute node".

To keep things organized, create a set of directories named Batch1 through Batch10 for your job submissions, one per job. This way, you can easily keep track of your tasks.

# In your SCRATCH directory
cd $SCRATCH
mkdir Batch{1..10}
    
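Each Batch* directory needs its own copy of the R script that the job will run (calibration.R in the example in the next section). Assuming you have already uploaded that script to your scratch space, you can copy it into every directory in one loop:

# In your SCRATCH directory
for DIR in Batch*; do cp calibration.R "$DIR"/; done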

Example Job Submission

The bash script below tells the scheduler which resources we need and which partition to use ("hns"). It reserves 16 cores on a single node and runs the R CMD BATCH command on the R script calibration.R inside whichever Batch* directory is passed to it as an argument, writing the R output to output.out. Save the script as script.sbatch one level above the Batch* directories (i.e., next to them, not inside them).

# Open file with vi text editor on the head node
vi script.sbatch
    
#!/usr/bin/bash
# Request 2 days of wall time on the "hns" partition,
# 16 cores on a single node, and 90 GB of memory
#SBATCH --time=2-00
#SBATCH -p hns
#SBATCH --ntasks-per-node=16
#SBATCH --nodes=1
#SBATCH --mem=90GB

# Move into the Batch* directory passed as the first argument to sbatch
cd "${1}"

# Load R first if it is not already in your environment, e.g. "ml R"
R CMD BATCH --no-save --no-restore calibration.R output.out
    

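With this setup, your scratch space should look roughly like this (one calibration.R per Batch* directory, with script.sbatch next to them):

$SCRATCH/
    script.sbatch
    Batch1/calibration.R
    Batch2/calibration.R
    ...
    Batch10/calibration.R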
Submit the job to the scheduling queue.

# In your SCRATCH directory
for DIR in Batch*; do sbatch script.sbatch "$DIR"; done
    

Check whether your jobs have started.

# In your SCRATCH directory
sacct
    
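You can also check the queue directly with squeue, which lists your pending and running jobs along with the reason a job is still waiting:

# Show only your own jobs in the queue
squeue -u $USER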

Interactive Mode

If you only need to do a small amount of work, it is better to start an interactive session rather than submit a batch job that requests many resources. Interactive sessions usually start quickly.

To request a short interactive session (one hour by default), use the following command:

sdev

This starts an interactive session on a compute node, which is handy for quick tests and short tasks.
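Once the session starts, you are on a compute node and can test your R script before submitting the full batch of jobs. A small sketch, assuming the R module on Sherlock is simply named "R" (adjust to the version your analysis needs):

# Inside the sdev session, on a compute node
ml R                      # load the default R module
cd $SCRATCH/Batch1
Rscript calibration.R     # or start R interactively with "R"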