A job submission helper on Hercules: ssubmit

Warning

The ssubmit tool is only available on the Hercules2 cluster.

Overview

For users who are not yet very experienced in using the SLURM batch system, we provide a tool, ssubmit, that simplifies job submission. It generates the sbatch script and submits the job.

Basic Usage

A batch job can be submitted using the ssubmit command:

ssubmit program arg1 arg2 ...

where:

  • program is the application you want to run
  • arg1 arg2 ... are the program arguments (if any)

If your program comes from a globally installed application, you need to load its module before using ssubmit:

module load application
ssubmit program arg1 arg2 ...

ssubmit then asks you for the resources to reserve for your job: time, memory and number of cores. It creates a submission script and submits your job. The submission script is stored in a slurm-JOB_ID.sh file and the job output in a slurm-JOB_ID.out file, where JOB_ID is the job ID assigned by SLURM.

The following example submits a job that executes the echo command with 'hello from ...' as its argument on a compute node, using 1 core and 1 GB of memory for 1 hour.

ssubmit echo ‘hello from job id $SLURM_JOB_ID running on compute node $(hostname)’

====== Time ====================================================================

  [1]  1 hour (default)
  [2]  1 day
  [3]  5 days
  [4]  15 days (max)

  Your choice or your time (DD-HH:MM or HH:MM): 1

  Using serial paradigm 1 process


====== Memory ==================================================================

  Memory per process in GB (defaul 1GB  max 2000GB): 1

====== Summary =================================================================

  You are about to submit a job to the cluster.
  Please check if everything is correct.

                Job name: echo
              Executable: echo
               Arguments: ‘hello from job id running on compute node cecicluster’
    Number of processors: 1
    Memory per processor: 1024
                    Time: 00-01:00:00
               Partition: batch

  Would you like to continue [Y/n] ?y

====== Submitted job informations ==============================================

                  Job id: 201403967
              Job output: slurm-201403967.out
     Get job status with: squeue -j 201403967


             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
         201403967     batch     echo ceciuser PD       0:00      1 (None)

In this case, the job output is located in slurm-201403967.out and the submission script is stored in slurm-201403967.sh.
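The summary above shows the values in normalized form: the 1 hour choice appears as 00-01:00:00 and 1 GB appears as 1024 (MB). The conversion can be sketched in Python as follows, assuming only the two input formats named in the prompt (DD-HH:MM and HH:MM). The helper names normalize_time and gb_to_mb are hypothetical and are not part of ssubmit.

```python
import re


def normalize_time(value):
    """Convert 'DD-HH:MM' or 'HH:MM' into 'DD-HH:MM:SS' (hypothetical helper)."""
    match = re.fullmatch(r"(?:(\d+)-)?(\d+):(\d{2})", value.strip())
    if match is None:
        raise ValueError("expected DD-HH:MM or HH:MM, got %r" % value)
    days = int(match.group(1) or 0)
    hours = int(match.group(2))
    minutes = int(match.group(3))
    # Fold everything into minutes so that values such as 36:30 carry over
    # into the day field.
    total_minutes = (days * 24 + hours) * 60 + minutes
    days, remainder = divmod(total_minutes, 24 * 60)
    hours, minutes = divmod(remainder, 60)
    return "%02d-%02d:%02d:00" % (days, hours, minutes)


def gb_to_mb(gigabytes):
    """Convert a memory request in GB into MB (1 GB = 1024 MB)."""
    return int(gigabytes * 1024)


print(normalize_time("01:00"), gb_to_mb(1))
```

With the inputs of the example session, this prints the same time and memory values that appear in the summary.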

Ssubmit arguments

You can set some job parameters with ssubmit's arguments. The arguments must be given before the program name. They are:

  • -y: automatically confirm the submission values
  • -t, --time TIME: allocation time in DD-HH:MM format
  • -m, --mem-per-cpu MEMORY: memory per process in MB
  • -n, --ntasks NTASKS: number of processes
  • -J, --job-name NAME: name for the job allocation
  • -p, --template-path PATH: path to a directory of template files

For example:

ssubmit --time DD-HH:MM --mem-per-cpu MB --ntasks N --job-name NAME program arg1 arg2 ...

The -y option allows you to skip the confirmation step:

ssubmit --time 00-01:00 --mem-per-cpu 1024 -y echo 'hello world'
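When many jobs are submitted from a script, the command line can be assembled programmatically. The following is a minimal Python sketch that only uses the long options listed above and places them before the program name, as ssubmit requires. The function build_ssubmit_command is a hypothetical helper, not something shipped with ssubmit.

```python
import shlex


def build_ssubmit_command(program, args=(), time=None, mem_mb=None,
                          ntasks=None, job_name=None, confirm=False):
    """Return the ssubmit argument list; options go before the program name."""
    command = ["ssubmit"]
    if time is not None:
        command += ["--time", time]
    if mem_mb is not None:
        command += ["--mem-per-cpu", str(mem_mb)]
    if ntasks is not None:
        command += ["--ntasks", str(ntasks)]
    if job_name is not None:
        command += ["--job-name", job_name]
    if confirm:
        # -y skips the interactive confirmation step
        command.append("-y")
    command.append(program)
    command.extend(args)
    return command


command = build_ssubmit_command("echo", ["hello world"], time="00-01:00",
                                mem_mb=1024, confirm=True)
# shlex.join quotes arguments containing spaces for display in a shell
print(shlex.join(command))
```

The resulting list can be passed to subprocess.run on a login node where ssubmit is available.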

Advanced usage

You can change the way ssubmit generates the submission script by adding your own template files to a folder specified with the -p argument.

In the templates directory, you can add three kinds of files that will override the default ones.

template.js

This is a jinja2 file with the content that ssubmit writes after the #SBATCH header in the submission script. In the jinja2 template you can access these variables:

  • path: the directory path where your application is located
  • cmd: the program name
  • argv: an array with the command and the arguments.
  • sbatch: a dict with the sbatch parameters
  • inc: a dict with user defined variables (see inc.py module)
  • paradigm: the selected paradigm.
  • config: a dict with the data you set in config.yaml file

The default template is

{# minimal template #}
{{ path }}/{{ cmd }} {% for i in argv[2:] %}{{ i }} {% endfor %}
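To make the expansion of the default template concrete, here is a plain Python sketch of the command line it produces. It assumes that argv[0] is ssubmit and argv[1] is the executable, so the application arguments start at argv[2], as described in the inc.py example docstring. The function render_default_template is a hypothetical stand-in and does not use jinja2 itself.

```python
def render_default_template(path, cmd, argv):
    """Mimic the default template: '<path>/<cmd> ' followed by each
    application argument and a trailing space."""
    arguments = "".join("%s " % item for item in argv[2:])
    return "%s/%s %s" % (path, cmd, arguments)


line = render_default_template("/usr/bin", "echo",
                               ["ssubmit", "echo", "hello", "world"])
print(repr(line))
```

Note that, like the jinja2 loop, this leaves a trailing space after the last argument.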

An example using inc and sbatch variables:

echo "inc variables example p1={{ inc.p1 }}, p2={{ inc.p2 }}"
hostname

mpirun -np {{ sbatch['ntasks'] }} {{ cmd }}

config.yaml

By default, the job paradigm is set to serial so your job will only use one core.

For preinstalled applications, the supported paradigms are predefined.

If you want to override the predefined paradigms, or set the paradigms for applications you have installed in your home directory, you need a YAML file, config.yaml, that lists the paradigms your application is able to use.

For example, the config.yaml file for an application that supports all allowed paradigms is:

---

# List of paradigms this program supports
paradigms:
  - serial
  - smp
  - mpi
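Before using a custom config.yaml, it can help to check that it only names known paradigms. The sketch below assumes that the three names in the example (serial, smp, mpi) are the complete set of allowed paradigms, and that the list has already been read from the file. ALLOWED_PARADIGMS and unknown_paradigms are hypothetical names.

```python
# Assumption: these three names are the full set of allowed paradigms
ALLOWED_PARADIGMS = ("serial", "smp", "mpi")


def unknown_paradigms(paradigms):
    """Return the entries that are not in the assumed allowed set."""
    return [name for name in paradigms if name not in ALLOWED_PARADIGMS]


print(unknown_paradigms(["serial", "smp", "mpi"]))
print(unknown_paradigms(["serial", "openmp"]))
```

An empty list means every entry was recognized.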

inc.py

inc.py is a Python file or module that lets you implement your own argument checks and add new variables to the generated submission script. The functions you can define in this file are:

Functions and their descriptions:

def check_args(argv)

Add the checks and modifications you need to your program arguments.

  • input: array with the arguments
  • output: array with the checked and/or modified arguments

def inc(argv)

Include new variables needed by your application.

  • input: array with the arguments
  • output: dict or list inc with new variables

def check_sbatch(sbatch)

Used to check the sbatch parameters. Must return True if the parameters are valid and False if there is an error.

  • input: sbatch dict
  • output: boolean True or False

def get_sbatch_options(argv)

Append new sbatch parameters.

  • input: array with the arguments
  • output: a dict with {'sbatch_variable': 'Value', ...}

An example of inc.py could be:

import os
import sys


def check_args(argv):
    """ check specific arguments related to each application
        and do modifications on it if needed
        # argv[0] = ssubmit
        # argv[1] = application executable
        # argv[2...] = application arguments
    """

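    # the first application argument must be an existing input file
    if len(argv) < 3:
        print("ERROR: Missing input file argument")
        sys.exit(1)
    if not os.path.isfile(argv[2]):
        print("ERROR: Input file %s doesn't exist or is not readable" % (argv[2]))
        sys.exit(1)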

    return argv


def inc(argv):
    """ do any action related to the application and return variables as dict
        or list to be used in the sbatch template
    """

    return {'app_version': "20210630-R1"}


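def check_sbatch(sbatch):
    """ Check sbatch: return True if the parameters are valid """

    if sbatch['paradigm'] == 'mpi':
        # MPI jobs must use an even number of tasks
        if int(sbatch['ntasks']) % 2 != 0:
            print('Number of MPI processes must be an EVEN number')
            return False

    # every other case is accepted
    return True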

def get_sbatch_options(argv):
    """ set more sbatch options if application needs it return a dict with
        { option: value, ... }
    """

    # Set log file name as input file name
    sbatch = {'output': argv[2].rsplit('.', 1)[0] + '.log'}

    return sbatch
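A hook can be sanity-checked before it is used with ssubmit by calling it directly with a fake argument vector. The sketch below repeats the get_sbatch_options logic from the example so that it runs on its own. The names my_app and run.inp are hypothetical.

```python
def get_sbatch_options(argv):
    """Same logic as the example hook: name the log file after the input file."""
    return {'output': argv[2].rsplit('.', 1)[0] + '.log'}


# Fake argument vector:
# argv[0] = ssubmit, argv[1] = executable, argv[2] = input file
fake_argv = ['ssubmit', 'my_app', 'run.inp']
options = get_sbatch_options(fake_argv)
print(options)
```

Only the last extension is replaced, so an input file named run.v2.inp would give run.v2.log.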