Requesting Resources

6. Requesting Resources

Overview

Tutorial: 30 min

Objectives:

Learn how to write a PBS job script for Gadi.
Learn how to launch a job in Gadi.

Recall the HPC usage flow diagram from the previous section. This section shows how to submit your code through the scheduler to the compute nodes.

6.1. Planning Jobs

To run compute tasks such as simulations, weather models, and sequence assemblies on Gadi, users need to submit them as ‘jobs’ to ‘queues’. Each queue has different hardware capabilities and limits to run different types of jobs.

Ideally, we should know the answers to the following questions before running a job:

Which project will you use for this job?
Which modules must be loaded for your software?
What command or script will run your main program?
What compute resources (CPU cores/GPUs, memory, walltime) does your task require?

However, this is not always possible, especially for new users or when running a new workflow for the first time. In that case, follow these steps:

Gather information: Find out the available queues, hardware capabilities and software modules for your program.
Run test jobs: Use a small sample of the data, a small number of cores and a short walltime to test the script/program.
Adjust resources: Gradually increase the resources and improve code efficiency to find the performance sweet spot.
Accounting: Check your project allocation and estimate the cost (time, RAM, compute cores etc.) of the full job.
Full run: Run the job with the full dataset and monitor the job status.

Note

The sweet spot is where the job can take advantage of parallelism and achieve a shorter execution time, while utilising at least 80% of the resources requested.

Searching for this sweet spot can take time and experimentation; some code will need several iterations before that efficiency is found.
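As a rough way to put a number on the 80% guideline, the sketch below computes CPU efficiency as CPU time used divided by (walltime × cores requested). This definition of utilisation and the helper names to_seconds and cpu_efficiency are assumptions made for illustration; they are not part of Gadi's tooling. The inputs are HH:MM:SS strings, such as the CPU time and walltime reported for a finished job.

```shell
#!/bin/bash

# Convert an HH:MM:SS string to a number of seconds.
# The 10# prefix forces base 10 so fields like "08" are not read as octal.
to_seconds() {
    local h m s
    IFS=: read -r h m s <<< "$1"
    echo $(( 10#$h * 3600 + 10#$m * 60 + 10#$s ))
}

# cpu_efficiency CPU_TIME WALLTIME NCPUS   (hypothetical helper)
# Prints the integer percentage of the reserved core time that did work.
cpu_efficiency() {
    local cput wall
    cput=$(to_seconds "$1")
    wall=$(to_seconds "$2")
    # Avoid dividing by zero for empty or malformed inputs.
    if [ "$wall" -eq 0 ] || [ "$3" -eq 0 ]; then
        echo 0
        return
    fi
    echo $(( 100 * cput / (wall * $3) ))
}

# Hypothetical job: 4 cores, 1 hour of walltime, 3 h 12 min of CPU time in total.
cpu_efficiency 03:12:00 01:00:00 4   # prints 80
```

A result well below 80 for a test job suggests requesting fewer cores or improving how the code uses them before scaling up.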

6.1.1. Compute Resources

Compute resources are allocated to projects, not individual users, through various allocation schemes.

Service Units (SU) measure Gadi’s compute hours.
Projects must have a compute allocation to run jobs.
Compute allocations are assigned quarterly.
Allocations can be transferred, adjusted, or reallocated by the project’s Chief Investigator (CI) or scheme manager.
Unused allocations can roll over to the next quarter if requested within the first two weeks of the current quarter.

Service Units (SUs) are charged based on the resources reserved for a job and the walltime. The resources reserved are determined by the greater of the requested CPUs or the proportion of memory.

Job Cost (SU) = Queue Charge Rate × Max(NCPUs, Memory Proportion) × Walltime Used (Hours)

Queue Charge Rate: The charge rate for the queue as listed in the Queue Limits. Note that using express queues increases job priority but also raises the job’s cost.
NCPUs: The number of CPUs requested for the job using the PBS -l ncpus option.
Memory Proportion: Memory requested ÷ memory per core, where memory per core is the queue's memory per node divided by its number of CPUs per node.

Examples of Job Costs

Example 1: A job using 1 CPU for 30 minutes in the normal queue will incur a charge of 1 Service Unit (SU).

Calculation: 1 CPU × 0.5 hours × 2 SU per hour = 1 SU.

Example 2: A job using 4 CPUs with 16 GiB of memory for 5 hours of walltime in the normal queue will be charged 40 SUs.

Calculation: 4 CPUs × 5 hours × 2 SU per hour = 40 SU.

The situation can become more complex:

Some jobs may require fewer CPUs but need more memory.
Others may require GPUs.
Some jobs might need to use the express queue.
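The charging formula can be sketched as a small shell function that also covers the memory-heavy and express cases. The function name job_cost and its argument order are hypothetical. The charge rate of 2 SU per hour comes from the calculation in Example 2; the memory per core of 4 GB and the express rate of 6 are assumptions used only to make the arithmetic concrete, so check the Queue Limits page for the real values.

```shell
#!/bin/bash

# job_cost RATE NCPUS MEM_GB MEM_PER_CORE_GB WALLTIME_HOURS   (hypothetical helper)
# Prints: rate * max(ncpus, mem / mem_per_core) * hours
job_cost() {
    awk -v rate="$1" -v ncpus="$2" -v mem="$3" -v mpc="$4" -v hours="$5" 'BEGIN {
        mem_prop = mem / mpc                          # memory expressed in "cores worth"
        units = (ncpus > mem_prop) ? ncpus : mem_prop # charge for whichever is larger
        printf "%g\n", rate * units * hours
    }'
}

# Memory per core of 4 GB is an assumption for illustration.
job_cost 2 1 4 4 0.5    # Example 1: 1 CPU for 30 minutes
job_cost 2 4 16 4 5     # Example 2: 4 CPUs, 16 GB, 5 hours
job_cost 2 1 64 4 5     # 1 CPU but 64 GB: memory dominates the charge
job_cost 6 4 16 4 5     # Example 2 at an assumed express rate of 6
```

With these assumed values the third job is charged as if it had reserved 16 cores, which is why a low CPU count does not by itself make a job cheap.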

6.1.2. Queue Structure

Jobs on Gadi are submitted to different queues, which determine resource availability, limits, and cost (service unit charge rates). Each queue is optimized for different workloads:

Gadi Queue Types
Queue Name	Description
normal	For standard jobs, with moderate walltime and resource limits. Most projects use this queue by default.
express	For short, urgent work with higher queue priority but increased SU charges. Limited walltime.
gpuvolta/gpua100	Specifically for jobs requiring access to NVIDIA Volta or A100 GPUs, respectively.
hugemem	For jobs requiring nodes with large memory (up to several terabytes per node).
copyq	For high-speed data transfer jobs; not for general compute workloads.

You select a queue using the -q option in your PBS batch script or interactive job request, for example:

#PBS -q normal

Each queue has limits, such as:

Maximum CPUs or GPUs per job
Maximum walltime
Maximum number of running jobs per project or user

Up-to-date queue characteristics and charge rates are listed at: Queue Limits

Note

Submitting to an express or GPU queue will affect both the job’s priority and cost (SU consumption).

6.2. Running Batch Jobs

The overall procedure to run a job on Gadi takes a few steps:

Steps for Running a Batch Job on Gadi

Write a job script (specifying the queue, duration, and resources needed).
Submit the job script to the queue.
Monitor the job status (If the job uses more resources than requested, it will be terminated immediately).
View the job output and error files (by default, saved in the directory from which you submitted the job).
Cancel the job if needed.

Batch jobs

A batch job is a non-interactive job submitted to the scheduler (like PBS or SLURM) to run at a later time. It executes your code or script without requiring your direct involvement during execution.

To run a batch job on Gadi, users need to create a PBS script. The script is a text file made up of two sections: resource requests and job steps. Resource requests specify the required number of CPUs/GPUs, the expected job duration, the amount of RAM, disk space, and so on. Job steps are the shell commands that do the work (e.g. loading environment/software modules, running the compute steps, and setting parameters).

Here is an example of a PBS script:

#!/bin/bash

#PBS -P vp91
#PBS -q normal
#PBS -l ncpus=48
#PBS -l mem=10GB
#PBS -l storage=gdata/vp91+scratch/vp91
#PBS -l walltime=00:02:00
#PBS -N testScript
#PBS -l wd

module load python3/3.11.0
module load papi/7.0.1

. /g/data/vp91/Training-Venvs/intro-to-numba/bin/activate
which python

python3 main.py $PBS_NCPUS > /g/data/vp91/$USER/job_logs/$PBS_JOBID.log

#!/bin/bash - Specifies which shell to use
-P - Gadi project (sometimes called account) to use
-q - Gadi queue to use
-l ncpus - Total number of cores requested
-l ngpus - Total number of GPUs requested
-l mem - Total memory requested
-l storage - Storage in /g/data and /scratch to come from project vp91
-l walltime - Total wall time for which the resources are provisioned (in hours:minutes:seconds)
-N - Name of the job
-l wd - Start the job in the working directory from which it was submitted.

Note

For more PBS directives, see the PBS directive list.

Check the Queue Limits page for core count, memory and walltime limits.

For different Gadi queues, see the queue structure page.
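When the same job has to be run with different resource requests, for instance during test runs, it can help to generate the script rather than edit it by hand. Below is a minimal sketch of that idea. make_job_script is a hypothetical helper, not a Gadi command, and it only emits directives that appear in the example above.

```shell
#!/bin/bash

# make_job_script PROJECT QUEUE NCPUS MEM WALLTIME COMMAND   (hypothetical helper)
# Prints a PBS job script to standard output.
make_job_script() {
    # Unquoted EOF: the positional parameters are expanded into the script text.
    cat <<EOF
#!/bin/bash

#PBS -P $1
#PBS -q $2
#PBS -l ncpus=$3
#PBS -l mem=$4
#PBS -l walltime=$5
#PBS -l wd

$6
EOF
}

# Print a small test job; redirect to a file to save it for qsub.
make_job_script vp91 normal 4 16GB 00:10:00 "mpirun hello_mpi"
```

Redirecting the output to a file, for example with > job_script.sh, gives a script that can be submitted in the usual way.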

6.3. Practice: Write a PBS job script

Exercise

To run our compiled hello_mpi program on Gadi, we need to wrap it in a PBS job script:

Go to /scratch/vp91/$USER/first_job.
Inside first_job, create a new file named job_script.sh, paste the template below, change <Project code> to vp91, and save.

#!/bin/bash

#PBS -P <Project code>
#PBS -q normal
#PBS -l ncpus=4
#PBS -l mem=16gb
#PBS -l walltime=00:10:00
#PBS -l wd

module load openmpi/4.1.5
mpirun hello_mpi

6.4. Submit and Monitor Jobs

Once you have saved this script as a ‘.sh’ file, you can submit it with the ‘qsub’ command followed by the file name.

To submit a job, use the command:

qsub <jobscript.sh>

After your job has been successfully submitted, you will be given a jobID (e.g. 12345678.gadi-pbs). You can use this jobID to monitor the status of your job.

To check the status of your job, use the command:

qstat <jobid>

To get more details about the job, use the command:

qstat -swx <jobid>

We recommend checking the Job Monitor regularly to ensure that your jobs are running efficiently, especially if you are running a new workflow for the first time.

When the job is completed, it will produce two new text files in the working directory, named script.sh.o<JobID> and script.sh.e<JobID>.

The first file, with ‘o’ in the file name, is the output of the job, which should have run successfully on Gadi.

The second file, with ‘e’ in the file name, is the error stream. This will document any errors that occurred while the job was running. In this case, the error stream file should be empty.
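For scripted workflows, the job ID printed by qsub can be captured in a variable and reused. The sketch below does that and predicts the names of the two output files. The helper expected_outputs is hypothetical, and it assumes the default naming in which only the numeric part of the job ID is appended and no -N job name is set. The qsub and qstat calls run only where those commands exist.

```shell
#!/bin/bash

# expected_outputs SCRIPT JOBID   (hypothetical helper)
# Prints the assumed default names of the output and error files.
expected_outputs() {
    local name seq
    name=$(basename "$1")
    seq=${2%%.*}              # "12345678.gadi-pbs" -> "12345678"
    echo "${name}.o${seq}"
    echo "${name}.e${seq}"
}

if command -v qsub >/dev/null 2>&1; then
    # On a system with PBS: submit, keep the job ID, and check the status once.
    jobid=$(qsub job_script.sh)
    qstat "$jobid"
else
    # Elsewhere: fall back to the sample job ID used in the text.
    jobid=12345678.gadi-pbs
fi

expected_outputs job_script.sh "$jobid"
```

Keeping the job ID in a variable avoids copying it by hand into later qstat or qdel commands.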

6.5. Practice: Submit and Monitor Your First Job

Exercise

Submit the job script you created in the previous exercise.
Check the status of the job using the qstat command.
Check the output and error stream of the job using the less command.
Use other commands to check the job status and utilisation of resources.

Command	Description
man qstat	View the manual for qstat and a range of helpful commands
qdel <jobid>	Delete the job with jobID <jobid>
nqstat_anu <jobID>	See how much CPU and memory your job has actually been using
qstat -swx <jobid>	Display the job status in the queue with comment
qstat -fx <jobid>	Display full job status information
qps <jobid>	Take a snapshot of the process status of all current processes in the running job
qcat [-s/-o/-e] <jobid>	Display [submission script/STDOUT/STDERR] of the running job
qls <jobid>	List contents in the folder $PBS_JOBFS
qcp <jobid> <dst>	Copy files and directories from the folder $PBS_JOBFS to the destination folder <dst>

6.6. Interactive Jobs

An interactive job lets you work directly on a compute node while the job is running: you get a command-line shell on the node and can run commands in real time. NCI recommends using interactive jobs to debug large parallel jobs or to install applications that must be built on a node where GPUs are available.

qsub -I is the command to request an interactive job.

qsub -I -q normal -P vp91 -l walltime=00:10:00,ncpus=4,mem=10GB

Hint

Check your shell prompt before and after running the command. Why did it change?

Once you are finished with the job, run the command

exit

to terminate the job.
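A script can also check whether it is running inside a job. The sketch below relies on the PBS_JOBID variable used in the batch example and on PBS_ENVIRONMENT, which PBS sets to PBS_INTERACTIVE for interactive jobs. job_context is a hypothetical helper name.

```shell
#!/bin/bash

# job_context   (hypothetical helper)
# Reports whether the current shell is inside a PBS job, and of which kind.
job_context() {
    if [ -z "${PBS_JOBID:-}" ]; then
        echo "not inside a PBS job"
    elif [ "${PBS_ENVIRONMENT:-}" = "PBS_INTERACTIVE" ]; then
        echo "interactive job ${PBS_JOBID}"
    else
        echo "batch job ${PBS_JOBID}"
    fi
}

job_context
```

Running it on a login node and again inside the interactive session shows the difference between the two environments.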

Key Points

Multiple PBS directives are available to request resources for a job.
Gadi uses some custom directives.
There are two modes to request a job: batch and interactive. You can also use interactive jobs to test your script/program.
