Using pre-installed software
The software most commonly used on clusters is already pre-installed on the CÉCI clusters: Fortran and C/C++ compilers, popular interpreters like Python or R, numerical libraries such as BLAS and LAPACK, and message passing libraries (e.g. OpenMPI). In addition, several versions of each piece of software are often installed.
If you are already familiar with an environment modules tool and you are only interested in the list of software available on the CÉCI clusters, go to the Software installed in the clusters section. Otherwise, please keep reading to learn how to enable the software you want.
Loading software with the module command
The installed software can be enabled with the module command. Its main options are:
module avail | av             list available software (modules)
module load | add [module]    set up the environment to use the software
module list                   list currently loaded software
module purge                  clear the environment
module spider                 list all possible modules
module show                   show the commands in the module file
module help                   get help
For instance, module avail openmpi will list all available OpenMPI modules. Assuming it returns something like
OpenMPI/2.1.1-GCC-6.4.0-2.28
OpenMPI/2.1.1-iccifort-2017.4.196-GCC-6.4.0-2.28
OpenMPI/3.1.1-GCC-7.3.0-2.30
then with the command
module load OpenMPI/2.1.1-GCC-6.4.0-2.28
you will enable OpenMPI version 2.1.1 compiled with GCC version 6.4.0.
After doing this, when you run e.g. mpicc or mpirun without specifying the full path, you will be running that specific version of the OpenMPI compiler wrapper or launch script.
The naming convention for the available modules is of the form
software/version-toolchain
(more on the toolchain part below).
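As a minimal sketch of the software/version-toolchain naming convention, the following shell snippet splits a module name into its three parts using parameter expansion. It assumes that the version itself contains no hyphen, which holds for the OpenMPI example above but is not guaranteed for every module. The variable names are illustrative only.

```shell
#!/usr/bin/env bash
# Split a module name of the form software/version-toolchain.
# Assumption: the version part contains no "-" character.
full="OpenMPI/2.1.1-GCC-6.4.0-2.28"

software="${full%%/*}"      # text before the first "/"
rest="${full#*/}"           # text after the first "/"
version="${rest%%-*}"       # text before the first "-"
toolchain="${rest#*-}"      # text after the first "-"

echo "software:  $software"
echo "version:   $version"
echo "toolchain: $toolchain"
```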
Note
The module load command can be abbreviated as ml
Enabling or loading a given module actually sets a group of environment variables such as $LD_LIBRARY_PATH, $CPATH, $PATH, ... to point to the custom installation paths of the software.
You can verify how those variables change when loading/unloading a given module. As we provide several versions of the same software and libraries, each one of them must be installed in a different custom location. It’s not possible to rely only on the standard unique locations on Unix systems at /usr/lib, /usr/include, /usr/bin, ... used for typical installations with only a single version of each software.
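The effect of prepending an installation directory to $PATH can be reproduced without any module, as a rough sketch. The snippet below creates a throwaway directory containing a dummy executable named hello-tool (a hypothetical name, not software installed on the clusters) and prepends it to $PATH, which is in essence what loading a module does for its own installation path.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a custom installation prefix.
prefix="$(mktemp -d)"
mkdir -p "$prefix/bin"

# Dummy executable; "hello-tool" is an illustrative name only.
printf '#!/bin/sh\necho "hello from custom prefix"\n' > "$prefix/bin/hello-tool"
chmod +x "$prefix/bin/hello-tool"

# Keep the old value so the change can be undone, as unloading a module would.
old_path="$PATH"
PATH="$prefix/bin:$PATH"

# The shell now resolves the command inside the custom prefix.
resolved="$(command -v hello-tool)"
echo "hello-tool resolves to: $resolved"

# Restore the previous environment.
PATH="$old_path"
```

With a real module, comparing the output of echo $PATH before and after module load shows the same kind of change.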
Most software requires several dependencies to be built; in the example above, at least GCC v6.4.0 was needed. Loading the module also loads all the dependencies that were required at build time. You can check which other dependencies were pulled in with
module list
If you want to reset the environment (unload all the modules) use the command
module purge
You will have to do so if you want to load another version of the software, or try compiling your own software with different compiler and library versions.
As the module tool is in charge of dynamically modifying your environment variables, it is recommended not to modify the variables $LD_LIBRARY_PATH and $CPATH in your .bashrc, unless you know precisely what you are doing.
If you are told to do so in the installation instructions of a software you need, please contact us for support; that might save you a lot of time in some cases.
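To check whether a startup file already overrides these variables, a simple search is usually enough. The sketch below defines a helper, find_path_overrides (a hypothetical name), that prints any line assigning LD_LIBRARY_PATH or CPATH in a given file. The pattern only catches plain assignments and export statements, so treat it as a first pass rather than an exhaustive check.

```shell
#!/usr/bin/env bash
# Print lines of a shell startup file that assign LD_LIBRARY_PATH or CPATH.
# find_path_overrides is an illustrative helper, not part of the module tool.
find_path_overrides() {
    local file="$1"
    # Match optional leading whitespace, an optional "export", then the assignment.
    grep -nE '^[[:space:]]*(export[[:space:]]+)?(LD_LIBRARY_PATH|CPATH)=' "$file"
}

# Demonstration on a temporary file rather than the real ~/.bashrc.
demo="$(mktemp)"
cat > "$demo" <<'EOF'
alias ll='ls -l'
export LD_LIBRARY_PATH=$HOME/lib:$LD_LIBRARY_PATH
# CPATH is mentioned in this comment only
EOF

matches="$(find_path_overrides "$demo")"
echo "$matches"
rm -f "$demo"
```

On a cluster, the same helper would be called as find_path_overrides ~/.bashrc.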
For advanced users, the exact behaviour of a module (setting paths, environment variables, aliases, etc.) and its dependencies can be discovered with the command
module show
For instance:
[you@lemaitre3 ~]$ module show HDF5
------------------------------------------------------------------------------------------------------------------------
/opt/cecisw/arch/easybuild/2018b/modules/all/HDF5/1.10.2-intel-2018b.lua:
------------------------------------------------------------------------------------------------------------------------
help([[
Description
===========
HDF5 is a data model, library, and file format for storing and managing data.
It supports an unlimited variety of datatypes, and is designed for flexible
and efficient I/O and for high volume and complex data.
More information
================
- Homepage: https://portal.hdfgroup.org/display/support
]])
whatis("Description: HDF5 is a data model, library, and file format for storing and managing data.
It supports an unlimited variety of datatypes, and is designed for flexible
and efficient I/O and for high volume and complex data.")
whatis("Homepage: https://portal.hdfgroup.org/display/support")
conflict("HDF5")
load("intel/2018b")
load("zlib/1.2.11-GCCcore-7.3.0")
load("Szip/2.1.1-GCCcore-7.3.0")
prepend_path("CPATH","/opt/cecisw/arch/easybuild/2018b/software/HDF5/1.10.2-intel-2018b/include")
prepend_path("LD_LIBRARY_PATH","/opt/cecisw/arch/easybuild/2018b/software/HDF5/1.10.2-intel-2018b/lib")
prepend_path("LIBRARY_PATH","/opt/cecisw/arch/easybuild/2018b/software/HDF5/1.10.2-intel-2018b/lib")
prepend_path("PATH","/opt/cecisw/arch/easybuild/2018b/software/HDF5/1.10.2-intel-2018b/bin")
prepend_path("PKG_CONFIG_PATH","/opt/cecisw/arch/easybuild/2018b/software/HDF5/1.10.2-intel-2018b/lib/pkgconfig")
setenv("EBROOTHDF5","/opt/cecisw/arch/easybuild/2018b/software/HDF5/1.10.2-intel-2018b")
setenv("EBVERSIONHDF5","1.10.2")
setenv("EBDEVELHDF5","/opt/cecisw/arch/easybuild/2018b/software/HDF5/1.10.2-intel-2018b/easybuild/HDF5-1.10.2-intel-2018b-easybuild-devel")
setenv("HDF5_DIR","/opt/cecisw/arch/easybuild/2018b/software/HDF5/1.10.2-intel-2018b")
The details of the compilation options for a module can be found in the installation log file. Once the module is loaded, that log file can be found at the location
$EBROOT<module name>/easybuild/*log
For instance:
[you@lemaitre3 ~]$ ml HDF5
[you@lemaitre3 ~]$ ls $EBROOTHDF5/easybuild/*log
/opt/cecisw/arch/easybuild/2018b/software/HDF5/1.10.2-intel-2018b/easybuild/easybuild-HDF5-1.10.2-20190430.140559.log
[you@lemaitre3 ~]$ ml FFTW
[you@lemaitre3 ~]$ ls $EBROOTFFTW/easybuild/*log
/opt/cecisw/arch/easybuild/2018b/software/FFTW/3.3.8-gompi-2018b/easybuild/easybuild-FFTW-3.3.8-20181011.025049.log
The full configure command used for the compilation is saved in the log file:
[you@lemaitre3 ~]$ grep 'INFO cmd " ./configure' $EBROOTHDF5/easybuild/*log
== 2019-04-30 14:00:55,033 run.py:506 INFO cmd " ./configure --prefix=/opt/cecisw/arch/easybuild/2018b/software/HDF5/1.10.2-intel-2018b --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --with-szlib=/opt/cecisw/arch/easybuild/2018b/software/Szip/2.1.1-GCCcore-7.3.0 --with-zlib=/opt/cecisw/arch/easybuild/2018b/software/zlib/1.2.11-GCCcore-7.3.0 --with-pic --with-pthread --enable-shared --enable-cxx --enable-fortran FC="mpiifort" --enable-unsupported --enable-parallel " exited with exit code 0 and output:
Toolchains and software organisation
The modules for the available software are organised around the notion of toolchains.
A toolchain is a collection of compilers and libraries that are often used together and known to interoperate correctly. For instance, the foss (free open source software) toolchain comprises a bundle of
- GCC (including gfortran and g++)
- OpenMPI
- OpenBLAS
- ScaLAPACK
- FFTW
Toolchain bundles are further organised into releases designated by a year followed by the letter a or the letter b. Each release corresponds to a specific set of versions for each of the modules that are part of the toolchain. The versions are chosen so as to minimize the risk of bugs and incompatibilities.
For instance, release 2017b of the foss toolchain contains:
- GCC/6.4.0
- OpenBLAS/0.2.20
- OpenMPI/2.1.1
- ScaLAPACK/2.0.2
- FFTW/3.3.6
Most of the CÉCI clusters also provide some releases of the intel toolchain, which comprises a bundle of
- Intel compilers (icc, ifort, icpc)
- Intel MPI
- MKL
Note
For the pre-installed software we provide that requires at least the set of tools bundled in a toolchain, the convention is to append a suffix -toolchain-YYYYx to the module name to identify the toolchain used at build time (which becomes a dependency to run this software).
Warning
Mixing toolchains (e.g. foss and intel) and releases (e.g. 2021a and 2022b) is not a good idea as it can introduce linking issues and more generally compatibility issues.
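A quick way to spot mixed releases is to look at the YYYYx suffix of each module name. The following sketch defines check_releases (a hypothetical helper, not a command provided on the clusters) that takes module names as arguments and reports whether more than one release appears among them. It only inspects names ending in a year followed by a or b, so modules without such a suffix are ignored, and it does not detect a mix of foss and intel within the same release.

```shell
#!/usr/bin/env bash
# Report whether the given module names refer to more than one toolchain release.
# check_releases is an illustrative helper, not a command provided on the clusters.
check_releases() {
    local name releases
    # Keep only a trailing release tag such as 2018b, one per line, de-duplicated.
    releases="$(for name in "$@"; do
        printf '%s\n' "$name" | grep -oE '20[0-9]{2}[ab]$'
    done | sort -u)"

    if [ "$(printf '%s\n' "$releases" | grep -c .)" -gt 1 ]; then
        echo "mixed releases: $(echo $releases)"
        return 1
    fi
    echo "single release: ${releases:-none found}"
}

# Module names taken from the examples on this page.
check_releases HDF5/1.10.2-intel-2018b FFTW/3.3.8-gompi-2018b
# Hypothetical combination mixing two releases.
check_releases HDF5/1.10.2-intel-2018b FFTW/3.3.6-gompi-2017b || true
```

If your module tool exports the colon-separated LOADEDMODULES variable (an assumption worth checking on your cluster), the currently loaded modules could be passed with check_releases $(echo "$LOADEDMODULES" | tr ':' ' ').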
Extension packages/modules for interpreted languages
Many extensions are installed by default for the languages that are pre-installed on the clusters. First load the module corresponding to the version of the language that you want, then run the corresponding command below.
For Python, you can get the list with:
$ pip freeze
ansible==2.4.2.0
argcomplete==1.7.0
Babel==0.9.6
backports.ssl-match-hostname==3.4.0.2
bzr==2.5.1
cffi==1.6.0
chardet==2.2.1
[...]
Note
If you work with Python virtual environments, make sure to create the virtual env with --system-site-packages to make the system packages visible from the virtual env.
For R, run this in an R shell:
> ip <- installed.packages(.Library)
> ip[, c(1,3)]
Package Version
abc "abc" "2.1"
abc.data "abc.data" "1.0"
abind "abind" "1.4-5"
acepack "acepack" "1.4.1"
adabag "adabag" "4.1"
ade4 "ade4" "1.7-8"
adegenet "adegenet" "2.1.0"
adephylo "adephylo" "1.1-10"
ADGofTest "ADGofTest" "0.3"
akima "akima" "0.6-2"
[...]
If you do not find the extension you need, you can try installing it yourself by following the procedures described in: Installing languages extensions