Disk space¶
Danger
There is no backup of the data stored on any cluster. Any removed file is lost forever. It is the user’s responsibility to keep a copy of the contents of their home directory in a safe place.
Each cluster is equipped with several file systems that can be used to store files. These disk spaces have different properties, and each of them is designed for a different usage. They are listed in the table below.
Disk space | Scope | Environment variables (depends on cluster) | Lifetime |
---|---|---|---|
Home | cluster | $HOME | cluster or account lifetime + grace period |
Workdir | cluster | $GLOBALSCRATCH | one year |
Local scratch | node | $TMPDIR,$LOCALSCRATCH | job lifetime |
Global Home | CÉCI | $CECIHOME | funding or account lifetime + grace period |
Transfer | CÉCI | $CECITRSF | as short as needed |
Long-term | external | n/a | depends on the institution |
The ‘Scope’ column indicates from where the disk space is accessible. A scope of ‘cluster’ means all compute nodes and frontends in the cluster share the same filesystem. By contrast, a scope of ‘node’ refers to storage space that is distinct for each node; it is local to that node and cannot be accessed from outside the node. The ‘CÉCI’ scope means the filesystem is accessible from all compute nodes and all frontends of all CÉCI clusters. Finally, a scope of ‘external’ refers to machines that are outside the perimeter of the CÉCI consortium and are managed by the universities.
The ‘Lifetime’ column indicates the duration for which the data will be kept. Data residing in the home directories is kept as long as the cluster is in production and the account owning the data has been renewed within the last couple of years (grace period). When a date for decommissioning is decided, users are warned, and instructions are offered to migrate the data either to a newer cluster or to another storage.
Warning
Data belonging to accounts that have been expired for more than 500 days is eligible for deletion, after notification of the data owner, or of their supervisor should the owner be unreachable.
When the date for purging old data is reached, the past owner, or their supervisor, is contacted when possible to warn them about the data removal. Data in Workdirs/Global scratches is often cleaned during maintenance operations, while data in Local scratches is cleaned after each job. Data in the Global home will remain as long as there is sufficient funding to maintain and replace the infrastructure when needed. Finally, data in Transfer should be removed as soon as transfers are completed.
Note
You can also check the presentation given at our yearly training sessions about using the different storage solutions on the clusters. The final slides indicate where to find some Slurm submission script examples.
Home file system¶
Upon login on a login node or front-end, you will end up in your home directory. This directory is available on the front-end and all compute nodes of a cluster. Its full path can be shown with echo $HOME and you can return there with a simple cd command.
The home filesystem is dedicated to source code (programs, scripts), configuration files, and small datasets (like input files).
Do not use this area for your main working activities; use your workdir directory instead (see next section).
Note
Quotas and time limits are enforced on the use of space on the HOME directories. See section “Quota” for details.
Workdir file system¶
The workdir or globalscratch is a high-performance shared disk space common to all compute nodes and to the front-end of a cluster. Its full path can be shown with echo $GLOBALSCRATCH. The path will differ from one cluster to another. The globalscratch is often built using a fast parallel filesystem such as Lustre, GPFS, or FraunhoferFS.
The workdir should be used to store the files generated by your batch jobs. You can also copy large input files there if required. It is customary in the job script to create a subdirectory named after the job ID, where all temporary data will be written, and to clean that directory when the job finishes, after having copied the results to another location: either on the same filesystem, if the results are to be consumed by a later job, or on another filesystem such as the home or a remote long-term storage.
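As an illustration, here is a minimal sketch of such a job script; the program $HOME/bin/simulate, its input file and the Slurm options are hypothetical placeholders to adapt to your own case:
#!/bin/bash
#SBATCH --job-name=workdir-example
#SBATCH --time=01:00:00
# Create a per-job subdirectory in the global scratch and work inside it
JOBDIR=$GLOBALSCRATCH/$SLURM_JOB_ID
mkdir -p "$JOBDIR"
cd "$JOBDIR"
# Copy the (hypothetical) input file and run the (hypothetical) program
cp $HOME/inputs/input.dat .
$HOME/bin/simulate input.dat > output.dat
# Copy the results to a safe place, then clean the per-job directory
mkdir -p $HOME/results
cp output.dat $HOME/results/
cd $GLOBALSCRATCH && rm -rf "$JOBDIR"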
To use the globalscratch, you might first need to create a directory yourself; it is common to name it after your login. On some clusters, it might already be created for you. In any case, it is always safe to try to create it on any CÉCI cluster with the following command:
mkdir -p $GLOBALSCRATCH
Note
Some clusters have a quota on the use of the workdir directories. See section “Quota” for details.
Warning
The data in the globalscratch directory can be removed at any time, especially during maintenance periods.
Local Scratch file system¶
The local scratch is the temporary disk space available on all compute nodes and is only visible from within the compute node it belongs to. On the CÉCI clusters, it is available through the $LOCALSCRATCH or $TMPDIR environment variable. It is often built on top of a fast, but redundancy-lacking, RAID-0 system.
There you can write/read temporary results during your job, copy the results of interest back to the home directory, and then delete the temporary files at the end of the job script. Example commands can be found in the F.A.Q.-Q11 section.
Note
Files stored in the scratch directory on each node are removed immediately after the job terminates. You will not be able to access files in the scratch directory after your job has completed. Furthermore, files in the scratch directory are not accessible from any other node (compute or login). Therefore, all files you want to save must be copied from the scratch directory to your home as part of your job.
Using the scratch directory when running a batch job is often more efficient than using the home or workdir. If your job performs a lot of disk I/O on files that do not need to be shared between nodes, then please use this directory. This relieves the load on the central disk servers, and most of the time also makes your job run faster.
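A minimal sketch of that staging pattern follows; the program $HOME/bin/analyse and the file data.in are hypothetical placeholders:
#!/bin/bash
#SBATCH --job-name=localscratch-example
#SBATCH --time=01:00:00
# Stage the input on the node-local scratch and work there
cd $LOCALSCRATCH            # or $TMPDIR, depending on the cluster
cp $HOME/data.in .
$HOME/bin/analyse data.in > data.out
# Copy the results back before the job ends: the local scratch is
# cleaned after the job and is not reachable from the login nodes
cp data.out $HOME/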
Warning
There is no quota limit on the local scratch. The user has to be careful not to fill the space, otherwise the job will probably crash.
The scratch size depends on the node type. For example, on Hercules:
Node type | Size |
---|---|
Dell M610 | 600 GB |
Dell M610x | 600 GB |
HP DL360 | 600 GB |
HP SL230 | 1.2 TB |
CÉCI’s common filesystem¶
Detailed information about the Global Home and Transfer filesystems is available in the common filesystem section.
Long-term storage file system¶
Long-term storage, or archive, is also built on stable technologies, but quotas there are less restrictive. The downside is that the long-term storage is often not directly connected to the cluster, so it can be quite slow. It may also not be free of charge. CÉCI does not provide long-term storage. Ask the systems administrators at your institution which kind of long-term storage they offer. Then you can transfer your data as explained in the file transfer section.
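As an illustration only, such a transfer could look like the command below; the host name and destination path are placeholders to be replaced with the values provided by your institution, and the file transfer section describes the supported methods in detail:
rsync -avz $GLOBALSCRATCH/results/ username@archive.example.org:/path/to/long-term/storage/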
Final remarks¶
As scratch spaces are not meant to store data in the long term, you should expect them to be cleaned automatically after some time. This does not mean that you should not do it yourself though: always clean up after your job. This is especially important after a job crashes before the cleaning operations in the submission script had a chance to run.
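One way to make that cleanup more robust, sketched here under the assumption that the job works in a per-job directory as described above, is to register the removal with a shell trap so it also runs when the script exits early (it cannot help if the node itself crashes):
# In the job script, right after creating the per-job directory
JOBDIR=$GLOBALSCRATCH/$SLURM_JOB_ID
mkdir -p "$JOBDIR"
trap 'rm -rf "$JOBDIR"' EXIT   # executed when the script exits, even on error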
You need to be careful about file naming when using a parallel job, to avoid having two different parts (threads or processes) of the program overwrite the same file, especially on global filesystems.
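For instance, including the identifiers that Slurm sets in each task’s environment in the output file names (a simple convention, not a requirement; ./my_program is a hypothetical executable) keeps them unique:
# Each task writes to its own file, named after the job ID and the task rank;
# the single quotes delay variable expansion until inside each task
srun bash -c './my_program > result_${SLURM_JOB_ID}_${SLURM_PROCID}.dat'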
Even if your data is small and the quotas allow you to work in your home directory, you should consider using the scratch spaces as they are in general much faster. The global filesystems are faster because they handle requests and store data in parallel, while the local filesystems are faster because they do not need to access the network and are used by far fewer jobs at a time.
Quota¶
Quotas are set up on your home directory and, on some clusters, on your workdir space.
The quota system is based on a soft quota, a hard quota and a grace period.
A soft quota is the disk space level below which nothing happens for the user. When you reach the soft quota, a warning email is sent to you on some systems; on others, you get a message when you connect. This does not prevent you from writing more data until the grace period expires.
The grace period is the amount of time during which you are allowed to go over the soft limit before the system blocks your ability to write more data.
The hard quota is the disk space level above which the system immediately blocks your ability to write more data. However, you will still be able to read your data. If your hard quota is exceeded, or if you exceed the grace period for your soft quota, you will be blocked from submitting jobs until you clean your space.
The table below presents the storage quotas for CÉCI user accounts on each cluster. Use the command ceci-quota to get all your current quota usage:
username@nic5-login1 07:55:08 ~ 1 > ceci-quota
Diskquotas for user username
Filesystem used limit files limit
$HOME 14.9 GiB 100.0 GiB 43648 200000
$GLOBALSCRATCH 2.3 TiB 5.0 TiB 70038 500000
$CECIHOME 6.7 GiB 100.0 GiB 60222 100000
$CECITRSF 0.0 B 1.0 TiB 1 unlimited
Cluster | File system | Env. variable | Soft quota | Hard quota | Grace |
---|---|---|---|---|---|
NIC5 | home / workdir | $HOME / $GLOBALSCRATCH | 100 GB / 5 TB | 110 GB / 5 TB | |
Dragon2 | home / workdir | $HOME / $GLOBALSCRATCH | 40 GB | 44 GB | |
Dragon1 [1] | home / workdir | $HOME / $GLOBALSCRATCH | 20 GB / 20 GB | 23 GB / 23 GB | |
Hercules | home / workdir | $HOME / $GLOBALSCRATCH | 200 GB / 400 GB | 1 TB / 4 TB | 9 weeks / 1 week |
Lemaitre4 | home / workdir | $HOME / $GLOBALSCRATCH | 100 GB / none | 100 GB / none | n/a |
Nic4 | home / workdir | $HOME / $GLOBALSCRATCH | 20 GB / none | 25 GB / none | |
All Clusters | Global Home / Transfer | $CECIHOME / $CECITRSF | 100 GB / 1 TB | 120 GB / 10 TB | 10 days |
[1] Clusters where $HOME and $GLOBALSCRATCH are the same filesystem, so the quota applies to both.