HPC Cluster Live Demo Guide
1. Setting Up Conda Environment
Install Miniconda
- Download the Miniconda installer:
  wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
- Make the installer executable:
  chmod +x Miniconda3-latest-Linux-x86_64.sh
- Run the installer:
  ./Miniconda3-latest-Linux-x86_64.sh
- Follow the on-screen prompts to complete the installation.
- Add Miniconda to your PATH:
  export PATH=~/miniconda3/bin:$PATH
  source ~/.bashrc
- Verify the installation:
  conda --version
Create and Activate a Conda Environment
- Create a new environment (e.g., myenv) with Python 3.8:
  conda create -n myenv python=3.8
- Activate the environment:
  conda activate myenv
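Before installing anything into it, you can quickly confirm the environment is active (a small sanity check; the paths below assume the default ~/miniconda3 install location and the myenv name used above):
conda env list      # the active environment is marked with an asterisk
python --version    # should report Python 3.8.x
which python        # should point inside ~/miniconda3/envs/myenv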
2. Setting Up Jupyter Notebook
Option 1: Using PowerShell or Tabby for SSH Tunneling
- Install Jupyter Notebook within the Conda environment:
  pip install notebook
- Start Jupyter Notebook on the HPC cluster without a browser:
  jupyter notebook --no-browser --port=8888
- On your local machine, open PowerShell or Tabby and set up an SSH tunnel (an optional ~/.ssh/config shortcut is shown after this list):
  ssh -L 8888:localhost:8888 <your-username>@headnode01.hpc.gla.ac.uk
- Accept the fingerprint prompt and enter your password.
- Open the Jupyter Notebook link in your local browser:
  http://127.0.0.1:8888/tree?token=<your-token>
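If you set up this tunnel regularly, the port forwarding can be kept in ~/.ssh/config on your local machine so a single short ssh command opens it. This is only a minimal sketch, and the hpc-jupyter alias is an arbitrary example name:
Host hpc-jupyter
    HostName headnode01.hpc.gla.ac.uk
    User <your-username>
    LocalForward 8888 localhost:8888
With this in place, ssh hpc-jupyter establishes the same tunnel as the command above.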
Option 2: Using Visual Studio Code
- Open VS Code and install the Remote - SSH extension.
- Use the Command Palette (Ctrl+Shift+P) to connect to the HPC cluster:
  Remote-SSH: Connect to Host
- Enter the hostname:
  headnode01.hpc.gla.ac.uk
- Open a terminal in VS Code and follow the steps from Option 1 to start the Jupyter Notebook server (summarised below).
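For reference, the commands to run in the VS Code integrated terminal mirror Option 1 (assuming the myenv environment created in Section 1):
conda activate myenv                        # environment from Section 1
pip install notebook                        # only needed the first time
jupyter notebook --no-browser --port=8888   # copy the printed URL into your local browser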
3. Basic Linux Commands
These basic Linux commands will help you navigate and manage files in your cluster environment:
- Check Current Directory:
  pwd
- List Files and Directories:
  ls
- Change Directory:
  cd <directory-name>
- Create a Directory:
  mkdir <directory-name>
- Move or Rename Files:
  mv <source-file> <destination>
- Delete Files and Directories:
  rm <file-name>
  rm -rf <folder-name>
- View File Contents:
  cat <file-name>
- Check Disk Usage:
  du -sh
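Put together, a short session on the cluster might look like this (the directory and file names are purely illustrative):
pwd                        # e.g. /users/<your-username>
mkdir demo-project         # create a working directory
cd demo-project
mv ~/example_script.py .   # move a script into it (assumes the file exists)
cat example_script.py      # inspect its contents
du -sh                     # total size of the current directory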
4. Submitting a Job on the HPC Cluster
Create a simple Slurm job script (e.g., job_example.slurm) for submitting a job:
#!/bin/bash
#SBATCH --job-name=example-task
#SBATCH --partition=gpu
#SBATCH --gres=gpu:1
#SBATCH --mem=64GB
#SBATCH --cpus-per-task=8
#SBATCH --time=00:10:00
module load python/3.8
python example_script.py
Submit the job:
sbatch job_example.slurm
Monitor the job:
squeue -u <your-username>
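By default, Slurm writes the job's output and errors to a file named slurm-<jobid>.out in the directory you submitted from, so you can inspect the results once the job finishes:
cat slurm-<jobid>.out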
5. Starting an Interactive Session
If you want to run commands interactively:
srun --partition=gpu --gres=gpu:1 --mem=64GB --cpus-per-task=8 --pty bash
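Once the prompt appears you are on a compute node. A quick sanity check before starting work might look like the following (nvidia-smi assumes the node has NVIDIA GPUs and drivers, as on the gpu partition):
hostname      # confirms you are on a compute node rather than the head node
nvidia-smi    # shows the GPU allocated to the session
exit          # leave the session and release the resources when finished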
This document should guide you through the demo and help your audience follow along with each step.
6. Slurm Commands and Job Scheduling Basics
Common Slurm Parameters
Slurm parameters allow you to specify your resource requests. Below are the most commonly used ones:
| Example Parameter | Description |
|---|---|
| --job-name="Test-Job-1" | Specify a name for the job. |
| --nodes=2 | Number of servers to run the job on. Use only for parallel processing. |
| --cpus-per-task=8 | Number of CPUs per process. Useful for multithreaded jobs. |
| --gres=gpu:1 | Number of GPUs to request. |
| --mem=16G | Amount of memory per node. Default is megabytes unless a suffix is used. |
| --partition=gpu | Specify the partition to use (e.g., cpu, gpu). |
| --time=00-01:00:00 | Total runtime limit for the job, in days-hours:minutes:seconds format. |
| --mail-user=<email> | Email address for notifications (e.g., job start, end, failure). |
| --mail-type=BEGIN,END,FAIL | Types of notifications to send. |
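Any of these parameters can also be given directly on the sbatch or srun command line, where they take precedence over the #SBATCH directives inside the script. For example, to rerun the batch script shown below with a longer time limit and more memory:
sbatch --time=00-02:00:00 --mem=32G job_example.sh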
Interactive Jobs
Interactive jobs allow you to debug submission scripts, install software, or prepare environments. Use interactive jobs sparingly and for short durations.
Example command:
srun --job-name="Interactive-Job" --partition=gpu --gres=gpu:1 --mem=16G --time=02:00:00 --pty bash
Batch Jobs
Batch jobs are the recommended way to run jobs on the HPC cluster. Below is an example of a submission script (job_example.sh):
#!/bin/bash
#SBATCH --job-name="Test-Job-1"
#SBATCH --partition=gpu
#SBATCH --nodes=1
#SBATCH --cpus-per-task=8
#SBATCH --mem=16G
#SBATCH --time=00-05:00:00
#SBATCH --mail-user=firstname.lastname@glasgow.ac.uk
#SBATCH --mail-type=BEGIN,END,FAIL
module load python
cd ~/work/calculations/pythonscripts
python myPythonCalcs.py
Submit the job:
sbatch job_example.sh
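sbatch responds with the ID assigned to the job, which is what the monitoring and cancellation commands below expect (the number shown here is only an example):
Submitted batch job 123456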
Monitoring Jobs
Once your job is submitted, you can monitor it using the following commands:
- View All Jobs:
  squeue
- View Your Jobs:
  squeue -u <your-username>
- Cancel Your Jobs:
  scancel <your-jobid>
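To keep an eye on a queued or running job without re-typing the command, squeue can be wrapped in watch (assuming watch is installed on the login node, as it is on most Linux systems):
watch -n 30 squeue -u <your-username>   # refresh the view every 30 seconds; Ctrl+C to stop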