A high-performance cluster is a collection of servers, called “nodes” in this context, interconnected by a fast network. A typical cluster consists of one or two management nodes, one or two login nodes (“frontends”), several storage nodes, and a large collection of compute nodes. Users of an HPC cluster must submit jobs to the managing software, called the resource manager or job scheduler, which decides when and where their computations run.
A server in a cluster dedicated to performing the computations. Compute nodes host compute units (processors), volatile memory (RAM), and local persistent storage (disks).
A compute node can host one or multiple processors, fitted in the sockets of the motherboard. These are the chips that perform the actual computation. They are made of multiple cores and caches.
One of the subsystems of a processor that can independently run instructions that are part of a software process. It comprises arithmetic and logic units (ALUs), floating-point units (FPUs), and some cache memory, and is often split into two hardware threads.
Independent instruction processing channel inside a core. All threads in a core share some of the computation units of the core. See SMT and Hyper-Threading for more information.
Small and very fast working memory sitting on the processor, used to hold data transferred to or from the main volatile memory (RAM).
Acronym for “Random Access Memory”, this is the main working memory of a node, holding the data produced or consumed by the computation process during its lifetime.
Persistent storage attached to a node. Some disks hold the operating system (OS), others hold the user data. Multiple technologies exist: HDD, SSD, NVMe, with different capacity/performance/cost tradeoffs.
The piece of software that runs the base services on a node and interfaces the hardware resources with the software processes.
Execution context (flow) of a sequence of instructions that can be run independently from the rest of the program (the other threads) while sharing the same address space. Threads in a program are created through mechanisms such as `pthreads <https://en.wikipedia.org/wiki/Pthreads>`_ or OpenMP.
Sequence of instructions in either human-readable format (source code) or machine-readable, compiled format (binary). Instructions can refer to mathematical operations, memory transfers, disk accesses, network connections, etc.
Program that produces a binary executable from human-readable source code written in a programming language, by contrast with an interpreter.
Program that executes human-readable source code written in a programming language directly by calling the corresponding functionalities in its own code, by contrast with a compiler.
Sequence of instructions treated as a single and distinct unit. More specifically, a job is a sequence of steps, each step consisting of multiple parallel tasks. A job is described by a job submission script and submitted to the job scheduler.
An invocation of a program along with its arguments through the srun Slurm command. The srun command will spawn as many tasks as requested, monitor them, report resource usage, and forward incoming UNIX signals. Each step of a job has a distinct entry in the accounting for that job.
A running instance of a program started by srun (or mpirun). Multiple tasks of the same step run in parallel, on possibly distinct nodes, and each task can itself use multiple CPUs in parallel.
Central Processing Unit. In general, CPU is synonymous with processor, but in this context, it must be understood as a single allocatable unit for computation. Depending on the node configuration and on the parametrisation of the job scheduler, it will most often correspond to a core or a hardware thread.
Job submission script
Shell script (text file containing shell commands) that describes a job (what commands to run, which program to start, etc.) along with its resource requirements and additional parameters (such as a job name, an email address for notifications, etc.)
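As a sketch, a minimal Slurm job submission script could look as follows; the program name, resource values, and email address are all hypothetical, and the exact options available depend on the cluster's configuration:

```shell
#!/bin/bash
# Minimal Slurm job submission script (all values are illustrative)
#SBATCH --job-name=demo            # job name shown in the queue
#SBATCH --ntasks=4                 # number of parallel tasks
#SBATCH --cpus-per-task=1          # CPUs allocated to each task
#SBATCH --mem-per-cpu=1024M        # RAM per allocated CPU
#SBATCH --time=01:00:00            # wall time limit (1 hour)
#SBATCH --mail-user=me@example.org # hypothetical notification address
#SBATCH --mail-type=END

srun ./my_program                  # one job step made of 4 parallel tasks
```

Such a script is submitted with ``sbatch``, and can contain several srun invocations, each one becoming a distinct step of the job.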
A list of resources that are needed by a job, such as a number of CPUs, an amount of RAM, possibly GPUs, software licences, etc., and a maximum duration for the usage of those resources (wall time limit).
Action of deciding which job can use which resources and when, based on resource availability, job priorities, and backfill opportunities. Jobs that can currently use resources are said to be running; the others are pending.
Number associated with a job that is used to decide the order in which jobs are considered for resource allocation. The priority can be computed based on multiple criteria such as fairshare, job size, queue time, quality of service, etc.
Scheduling policy by which a job with a lower priority can start before a job with higher priority, provided it does not delay that higher-priority job’s start time. This can be either because it requires a completely distinct set of resources, or because it will free the resources before the predicted start time of the higher-priority job.
Fairshare is a measure of how far away from a fair usage of the resources a user or an account is, based on its cluster share. Users who used the cluster a lot in the recent past (i.e. have consumed many CPU.hours) have a lower fairshare than users who did not. Fairshare is adapted in real time as the running jobs consume resources.
Unit of computing resource usage corresponding to using a full CPU for one hour, or, equivalently, two CPUs for half an hour, etc. The concept is similar for node.hour, node.day, GPU.hour, etc.
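For instance, a job using 4 CPUs for 30 minutes consumes 2 CPU.hours; a small sketch of that computation in the shell:

```shell
# CPU.hours = number of CPUs x wall time in hours
cpus=4
minutes=30
awk -v c="$cpus" -v m="$minutes" 'BEGIN { printf "%.1f\n", c * m / 60 }'  # prints 2.0
```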
A Slurm account is an administrative concept that allows tracking and organising resource usage among user organisation levels (departments, units, etc.)
Portion of the total available resource a cluster can offer that is “promised” to a user or an account, often based on administrative concerns.
Quality of service
Functionality by which a user can request particular privileges for a job, such as for instance a priority boost.
Job whose parallel threads or processes can all address the same memory, or at least a common portion of memory. Such jobs can only run on a single node.
Job whose parallel tasks each have their own memory space. Such jobs can spread across multiple nodes provided they rely on a mechanism for data communication between tasks, which can be through the network or via the disks.
Parallel programming paradigm where all threads or processes are able to read and/or write in the same memory space, for instance with threads spawned from the same process, or processes using a shared memory segment. Shared-memory programming is typically done in a shared-memory job, but can also happen in a distributed-memory job provided a `PGAS <https://en.wikipedia.org/wiki/Partitioned_global_address_space>`_ library such as Unified Parallel C, Coarray Fortran, or `OpenSHMEM <http://openshmem.org/site/>`_ is used.
Parallel programming paradigm where processes exchange messages rather than sharing memory. If the exchange mechanism is able to travel through networks, as is the case with MPI, the processes can live on distinct nodes; if it is not, as is the case with named pipes or UNIX sockets, all processes must run on the same node.
Message Passing Interface. De facto standard for distributed-memory programming. Multiple implementations exist for various languages; some are open source, some are sold by hardware manufacturers or compiler vendors.
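As an illustration, compiling and running an MPI program typically looks as follows (hello.c is a hypothetical source file, and this assumes an MPI implementation such as Open MPI is installed):

```shell
mpicc hello.c -o hello   # compiler wrapper provided by the MPI implementation
mpirun -np 4 ./hello     # start 4 processes that communicate through MPI
# inside a Slurm job, srun is generally used instead, taking the
# task count from the job's resource request:
# srun ./hello
```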
Graphics Processing Unit. Hardware component initially designed to drive the computer screen, later evolved to off-load heavy computations from the processor. The compute power of a GPU is much larger than that of a processor, but the way it is programmed is very different. Specific frameworks such as CUDA, ROCm, `OpenACC <https://en.wikipedia.org/wiki/OpenACC>`_, or OpenCL must be used.
Software library that can be linked into a program to offer access to optimised functionalities such as linear algebra, signal processing, or statistics. Open-source examples are OpenBLAS, BLIS, or ATLAS. Commercial libraries include for instance MKL. Like compilers and other software, they are most often organised in environment modules.
Mechanism that allows modifying the environment (environment variables, aliases, etc.) in which programs are started by the shell. By modifying variables such as $PATH or $LD_LIBRARY_PATH, users can choose which of the installed software they want to use. Modules are often organised into releases.
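A rough sketch of what loading a module effectively does: prepend the directory containing the chosen software to $PATH so the shell finds that version first. The directory layout and program name below are entirely hypothetical:

```shell
# Simulate a module's software tree with a demo directory and a fake binary
demo=$(mktemp -d)
mkdir -p "$demo/gcc-13.2/bin"
printf '#!/bin/sh\necho "gcc 13.2 demo"\n' > "$demo/gcc-13.2/bin/mygcc"
chmod +x "$demo/gcc-13.2/bin/mygcc"
# This is essentially what `module load` does behind the scenes:
export PATH="$demo/gcc-13.2/bin:$PATH"
command -v mygcc     # the shell now resolves mygcc inside the demo directory
rm -rf "$demo"
```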
In the context of modules, a release is a set of modules related to software whose versions and toolchain have been carefully chosen so as to be compatible.
Command-line interface to the operating system that interprets user commands such as starting a program, copying a file, etc. Examples include Bash and Zsh.
Command-line interface. User interface that involves the keyboard to enter commands in response to a prompt in a terminal window, by contrast with a TUI (text user interface), which lets users interact with menu items or widgets in a terminal through the keyboard, or a GUI (graphical user interface), which allows users to click on buttons and menu items with the mouse.
Piece of software that manages the interactions between the shell and the console of the user. The shell can be local to the machine or running on a remote server and connected to through SSH.
Computer equipment consisting of a keyboard and a screen.
Secure Shell. Protocol by which users can connect to remote servers (for instance a login node of a cluster) using their login and a combination of `authentication factors <https://en.wikipedia.org/wiki/Multi-factor_authentication>`_ (password, key, hardware token, etc.).
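Typical usage, with a hypothetical username and hostname:

```shell
ssh myuser@login.cluster.example.org                        # password or default key
ssh -i ~/.ssh/id_ed25519 myuser@login.cluster.example.org   # explicit private key
```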
A piece of data (text, number sequence, image, etc.) stored in a filesystem and primarily identified by its name. Along with the name, other metadata information is stored by the filesystem, such as ownership, permissions, or creation date. Files are organised in a hierarchy of directories.
Cataloguing structure that contains files or other directories. Also called a folder in some contexts. In an HPC cluster, each user has a home directory where they can write files and create sub-directories. It is by default their current working directory when they connect to the cluster.
Data structure and assorted mechanisms to store and retrieve files. A filesystem local to a (compute) node can only be accessed from that node. By contrast, a network filesystem (e.g. NFS) is hosted on a (storage) node and exported to all other nodes, and a parallel filesystem (e.g. BeeGFS) is hosted on multiple (storage) nodes and exported to all other nodes.
Metadata associated with a file that defines what type of access (read, write, execute) the owner, the group, and the others have on the file.
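Permissions can be inspected and changed from the shell; for instance, octal mode 640 grants read and write to the owner, read to the group, and nothing to the others:

```shell
tmp=$(mktemp)                 # create a scratch file
chmod 640 "$tmp"              # owner: rw, group: r, others: none
ls -l "$tmp" | cut -c1-10     # prints -rw-r-----
rm -f "$tmp"
```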
Software that analyses other software to find out which functionality or operation takes the most time.
Software that helps analyse other software to find bugs and problems.
A shell variable holding a value that can be set by the user, by the environment module system, or by Slurm, and that can alter the way software behaves.
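For instance, OpenMP programs read the OMP_NUM_THREADS variable to decide how many threads to spawn; an exported variable is inherited by every child process started from the shell:

```shell
export OMP_NUM_THREADS=4
# any program started from this shell now sees the variable:
sh -c 'echo "threads: $OMP_NUM_THREADS"'   # prints: threads: 4
```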
How well a job or program can use increasing amounts of computing resources. A program that strongly scales takes less and less time as more computing power is used. A program that weakly scales takes roughly the same time when more data and more computing power are used in the same proportions.
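Strong scaling is usually quantified by the speedup T(1)/T(n) and the parallel efficiency T(1)/(n × T(n)); a small sketch with made-up timings:

```shell
t1=100   # seconds on 1 CPU (made-up figure)
tn=16    # seconds on 8 CPUs (made-up figure)
n=8
awk -v t1="$t1" -v tn="$tn" -v n="$n" \
  'BEGIN { s = t1 / tn; printf "speedup: %.2f  efficiency: %.2f\n", s, s / n }'
# prints: speedup: 6.25  efficiency: 0.78
```

An efficiency close to 1 means the extra CPUs are well used; here some time is lost to parallel overhead.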