Installing software by yourself¶
Installing language extensions¶
With scripting languages, even if the interpreter is installed globally, you have the possibility to install additional packages locally in your home directory.
First of all, we discourage the use of Conda on the CÉCI clusters. See Conda.
For Python, add the --user option to pip install, e.g.:
$ pip install --user mymodule
will install the fictitious mymodule package in your home directory.
Once that is done, you will need to make sure that:
$HOME/.local/bin is in your PATH
$HOME/.local/lib/pythonx.y/site-packages/ is in your PYTHONPATH
Make sure to replace the x.y part with the actual version of Python you are using. For instance:
$ export PATH=$HOME/.local/bin:$PATH
$ export PYTHONPATH=$HOME/.local/lib/python2.7/site-packages/:$PYTHONPATH
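Rather than guessing the x.y part, you can ask the interpreter itself where pip install --user places files. The following is a minimal sketch that relies only on the standard library site module; the helper name user_install_dirs is hypothetical, and the printed export lines are merely suggestions to copy into your shell after loading the Python module you intend to use.

```python
import os
import site
import sys


def user_install_dirs():
    """Return (bin_dir, site_packages_dir) used by `pip install --user`.

    Hypothetical helper for illustration; it only reads interpreter settings.
    """
    user_base = site.getuserbase()          # typically ~/.local on Linux
    user_site = site.getusersitepackages()  # .../lib/pythonX.Y/site-packages
    # Scripts installed with --user go to the "bin" directory on Linux.
    return os.path.join(user_base, "bin"), user_site


if __name__ == "__main__":
    bin_dir, site_dir = user_install_dirs()
    print("Python version: %d.%d" % sys.version_info[:2])
    print("export PATH=%s:$PATH" % bin_dir)
    print("export PYTHONPATH=%s:$PYTHONPATH" % site_dir)
```

Because the paths depend on the interpreter that runs the script, run it again whenever you switch to another Python version.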
When you install a package, it is important that you load the correct Python module and use the pip option --no-binary :all: to recompile from source rather than install pre-compiled binaries whenever possible. See the pip documentation for more information. You can use GCC optimisation flags when doing so. Example:
CFLAGS='-O2 -pipe -march=sandybridge' pip install --no-binary :all: PACKAGE
The above example builds the PACKAGE with optimisation options that are compatible with most clusters, and hence suboptimal on recent ones. See With GCC for more information.
If you are already used to creating Python virtualenvs to manage your custom
module installations (if you are not, it is a good idea to learn about them),
keep in mind that, besides several core Python versions, the clusters also
provide bundles of Python modules compatible with each of them. If you need a
specific Python module that is not available in the environment after, e.g.,
module load Python/3.6.6-foss-2018b, always check with the module avail
command or in the list of installed software whether a specific installation
is provided for that Python installation.
If you need mymodule and it is not available, check carefully what its
dependencies are and verify whether some of them are available. Let’s say
mymodule requires numpy, matplotlib, h5py and Keras. On some clusters there
are specific installations for those, so you can do:
module load Python/3.6.6-foss-2018b
module load matplotlib/3.0.0-foss-2018b-Python-3.6.6
module load h5py/2.8.0-foss-2018b-Python-3.6.6
module load Keras/2.2.4-foss-2018b-Python-3.6.6
numpy is already included in the main Python module. Then you can proceed to
create a virtualenv and install mymodule by doing:
mkdir ~/my_venv
virtualenv --system-site-packages ~/my_venv
source ~/my_venv/bin/activate
pip install mymodule
The --system-site-packages flag makes the Python modules that are already
loaded available inside the virtualenv. Then, when you pip install mymodule,
you avoid pulling the wheels for all of those and you will be using the
optimally compiled versions we provide.
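To confirm that the virtualenv picks up the provided builds rather than freshly downloaded wheels, you can check where each module would be imported from. The sketch below uses only the standard library; the function names in_virtualenv and module_origin are hypothetical, and the module names in the loop are simply the dependencies from the example above. It is meant to be run with the modules loaded and the virtualenv activated.

```python
import importlib.util
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a virtual environment."""
    # venv and recent virtualenv releases set sys.base_prefix to the parent
    # Python; older virtualenv releases set sys.real_prefix instead.
    base = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or base != sys.prefix


def module_origin(name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    print("inside a virtualenv:", in_virtualenv())
    for name in ("numpy", "matplotlib", "h5py", "keras"):
        print(name, "->", module_origin(name))
```

A path under ~/my_venv means pip installed a separate copy into the virtualenv; any other location means the module comes from outside the virtualenv, for example from a loaded cluster module.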
For R, if you run a script that depends on specific libraries/packages, those libraries need to be installed. A command like library(doParallel) will load the doParallel library if it is installed. If that library is not installed, then R will fail with the following explicit error message:
Error in library(doParallel) : there is no package called 'doParallel'
If you call install.packages('doParallel') beforehand, then R will try to install the doParallel library before loading it.
On an HPC cluster, you will find that some libraries are installed beforehand by the system administrators. Note that libraries depend on the version of R, so it could be that, on the same cluster, a library is available for one version and not for another. For instance, on hmem the doParallel library is not installed in R 2.13.1:
dfr@hmem00:~ $ R
R version 2.13.1 (2011-07-08)
[...]
> library(doParallel)
Error in library(doParallel) : there is no package called 'doParallel'
>
but it is in R 3.3.1:
dfr@hmem00:~ $ R
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
[...]
> library(doParallel)
Loading required package: foreach
foreach: simple, scalable parallel programming from Revolution Analytics
Use Revolution R for scalability, fault tolerance and more.
http://www.revolutionanalytics.com
Loading required package: iterators
Loading required package: parallel
>
Remember that you can choose the R version through environment modules.
If the library you need is not installed, you can either ask the system administrators to install it globally or install it by yourself for your account.
If you run the install.packages() command by yourself, R will note that you are not the administrator and ask whether it should create a private library where additional packages will be installed:
> install.packages('doParallel')
Warning in install.packages("doParallel") :
  'lib = "/opt/cecisw/arch/easybuild/2018b/software/R/3.5.1-foss-2018b/lib64/R/library"' is not writable
# This is because you do not have administrator access to the cluster
Would you like to use a personal library instead? (yes/No/cancel) yes
Would you like to create a personal library
‘~/R/x86_64-pc-linux-gnu-library/3.5’
# So R will create a directory in your home directory, one per R version.
to install packages into? (yes/No/cancel) yes
--- Please select a CRAN mirror for use in this session ---
Secure CRAN mirrors
1: 0-Cloud [https]
2: Algeria [https]
3: Australia (Canberra) [https]
4: Australia (Melbourne 1) [https]
5: Australia (Melbourne 2) [https]
6: Australia (Perth) [https]
7: Austria [https]
8: Belgium (Ghent) [https]
[...]
** testing if installed package can be loaded
* DONE (doParallel)
Once this is done, you will be able to load the library on all compute nodes of that cluster for the same R version as the one that was used to install the package.
Note that this process is interactive, as R asks the user some questions, and it might therefore fail when running inside a batch job. You should run it interactively once on the frontend (the login node) before you submit your jobs.
If you want to include the install.packages() command in your scripts, you should specify at least the target directory with the lib parameter and the mirror to use with the repos parameter:
install.packages('doParallel', lib='~/R/x86_64-pc-linux-gnu-library/3.5', repos='https://lib.ugent.be/CRAN')
This way, R already knows the answers to those questions.
As a final note, please be aware that on heterogeneous clusters (which have compute nodes with different generations of processors), the R installation is performed once per generation (to be optimised for the processors of that generation), and it could be that one library is missing for one R version on one generation of CPU.
For Perl, you can use the local::lib module and the cpanm tool to install
modules locally. You first need to set up the environment according to the
output of the perl -Mlocal::lib command. You can set it for the current
session with
eval $(perl -Mlocal::lib)
and/or set it once and for all with
perl -Mlocal::lib >> ~/.bash_profile
Then you can simply run cpanm followed by the name of a module to install that module.
By default, this will install the modules in the ~/perl5 directory. Should
you want to install them to another place, give that path as argument to the
local::lib module. For instance:
perl -Mlocal::lib=mylibs/perl >> ~/.bash_profile
to install in the mylibs/perl directory.
Installing with Yum or Aptitude¶
Installing with Yum (RedHat, Fedora, etc.), Aptitude (Ubuntu, Debian, etc.)
or any other package manager is not possible for users. Commands like
apt-get install <name of package> will fail because all clusters run the
CentOS distribution, which does not use the Aptitude package manager, and
because users are not allowed to use sudo (see below).
If your program can only be installed with
apt-get, then you will need to
Use of the sudo command¶
Do not try to use the
sudo command; it will fail. Only local system
administrators are able to gain root-level privileges. Regular users are not
allowed to, simply because they would continuously break each other’s
configuration, or potentially destroy the whole system. There is therefore no
way root-level privileges will ever be granted to users.