Installing software by yourself¶
Installing language extensions¶
With scripting languages, even if the interpreter is installed globally, you can install additional packages locally in your home directory.
Python¶
First of all, we discourage the use of Conda on the CÉCI clusters. See Conda.
For Python, pass the --user option to pip install, e.g.:
$ pip install --user mymodule
will install the fictitious mymodule package in your home directory, under $HOME/.local/.
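The libraries themselves go to $HOME/.local/lib/pythonx.y/site-packages, where x.y is the actual version of Python you are using.
Once that is done, you will need to make sure $HOME/.local/bin is in your PATH variable so that the executables installed by the package are found. For instance: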
$ export PATH=$HOME/.local/bin:$PATH
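The export above only lasts for the current shell session. As a minimal sketch, assuming your login shell is bash and reads ~/.bash_profile, a guarded version can be added to that file so that the directory is not prepended again each time the file is sourced. The name path_prepend is a hypothetical helper, not a standard command.

```shell
# path_prepend is a hypothetical helper: it puts a directory at the front
# of PATH only if PATH does not already contain it.
path_prepend() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present, nothing to do
        *) PATH="$1:$PATH" ;;        # otherwise put it first
    esac
}

path_prepend "$HOME/.local/bin"
export PATH
```

Calling the helper several times with the same directory leaves PATH unchanged after the first call.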
It is important, when you install a package, to load the correct Python module first and to use the pip option --no-binary :all: to recompile from source rather than install pre-compiled binaries whenever possible. See the pip documentation for more information. You can use GCC optimisation flags when doing so. Example:
CFLAGS='-O2 -pipe -march=sandybridge' pip install --no-binary :all: PACKAGE
The above example builds PACKAGE with optimisation options that are compatible with most clusters, and hence suboptimal on recent ones. See With GCC for more information.
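Whether a more recent -march value is worth trying depends on the processors that will run the code. The sketch below is one way to check, assuming a Linux x86 node that exposes /proc/cpuinfo. It tests for the AVX2 instruction set, which Sandy Bridge processors do not have. The name cpu_has_flag is a hypothetical helper.

```shell
# cpu_has_flag FLAG [FILE]: succeed if FLAG is listed on the first "flags"
# line of FILE (default /proc/cpuinfo, Linux on x86 only).
cpu_has_flag() {
    grep -m1 '^flags' "${2:-/proc/cpuinfo}" 2>/dev/null \
        | tr ' ' '\n' | grep -qx "$1"
}

if cpu_has_flag avx2; then
    echo "AVX2 is reported: a newer -march value than sandybridge may be worth trying."
else
    echo "AVX2 is not reported: keep a conservative -march value."
fi
```

Run the check on the compute nodes where the package will be used, since the login node may have a different processor generation.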
Virtualenvs¶
If you are already used to creating Python virtualenvs to manage your custom
module installations (if you are not, it is a good idea to learn about them),
take into account that on the clusters we provide, besides different core
Python versions, bundles of Python modules compatible with them. If you need
a specific Python module that is not available in the environment after you
do e.g. module load Python/3.6.6-foss-2018b, always check with the
module avail command or in the list of installed software whether there is a
specific installation provided for that Python installation.
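As a small sketch of that check, the listing printed by module avail can be filtered on the -Python-x.y.z suffix that appears in the module names used in this section. The name for_python is a hypothetical helper, and the snippet assumes the module command writes its listing to standard error, which is the usual behaviour.

```shell
# for_python VERSION: keep only the lines of standard input that carry the
# -Python-VERSION suffix. for_python is a hypothetical helper name.
for_python() {
    grep -F -- "-Python-$1"
}

# "module avail" usually writes to standard error, hence the 2>&1.
# The guard keeps the snippet harmless where "module" is not defined.
if command -v module >/dev/null 2>&1; then
    module avail 2>&1 | for_python 3.6.6 || echo "no matching module found"
fi
```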
If you need mymodule and it is not available, check carefully what its
dependencies are and verify whether some of them are available. Let’s
imagine that mymodule requires numpy, matplotlib, h5py and
Keras. On some clusters there are specific installations for those, so you can
load them:
module load Python/3.6.6-foss-2018b
module load matplotlib/3.0.0-foss-2018b-Python-3.6.6
module load h5py/2.8.0-foss-2018b-Python-3.6.6
module load Keras/2.2.4-foss-2018b-Python-3.6.6
numpy is already included in the main Python module. Then you can
proceed to create a virtualenv and install mymodule by doing:
mkdir ~/my_venv
virtualenv --system-site-packages ~/my_venv
source ~/my_venv/bin/activate
pip install mymodule
The --system-site-packages flag makes the Python modules that are already
loaded available inside the virtualenv. Then, when you pip install mymodule,
you avoid pulling the wheels for all of those and you use the optimally
compiled versions we provide.
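To confirm which copy of each module is actually used, you can ask Python where it imports it from. The following is a minimal sketch, to be run with the virtualenv activated and the modules loaded. PYTHON and module_origin are hypothetical names introduced for the example, and mymodule is the fictitious package used above.

```shell
# Interpreter to query; inside an activated virtualenv, "python" is the
# virtualenv's own interpreter. PYTHON and module_origin are hypothetical names.
PYTHON="${PYTHON:-python}"

# module_origin NAME: print the file a Python module is imported from.
module_origin() {
    "$PYTHON" -c 'import importlib, sys
m = importlib.import_module(sys.argv[1])
print(getattr(m, "__file__", "built-in"))' "$1"
}

for mod in numpy h5py mymodule; do
    printf '%s: ' "$mod"
    module_origin "$mod" 2>/dev/null || echo "not importable"
done
```

Modules provided by the cluster should resolve to a path inside the software tree of the loaded modules, while packages installed with pip inside the virtualenv should resolve to a path under ~/my_venv.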
R¶
If you run a script that depends on specific libraries/packages, those libraries need to be installed. A command like
library(doParallel)
will load the doParallel library if it is installed. If that library is not installed, then R will fail with the following explicit error message: Error in library(doParallel) : there is no package called 'doParallel'
If you write:
install.packages("doParallel")
library(doParallel)
then R will try to install the doParallel library before loading it.
On an HPC cluster, you will find that some libraries are installed beforehand by the system administrators. Note that libraries depend on the version of R, so it could be that, on the same cluster, a library is available for one version and not for another. For instance, on Hmem, the doParallel library is not installed in R 2.13.1:
ceciuser@cecicluster:~ $ R
R version 2.13.1 (2011-07-08)
[...]
> library(doParallel)
Error in library(doParallel) : there is no package called 'doParallel'
>
but it is in R 3.3.1:
ceciuser@cecicluster:~ $ R
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
[...]
> library(doParallel)
Loading required package: foreach
foreach: simple, scalable parallel programming from Revolution Analytics
Use Revolution R for scalability, fault tolerance and more.
http://www.revolutionanalytics.com
Loading required package: iterators
Loading required package: parallel
>
Remember that you can choose the R version through environment modules and the module command.
If the library you need is not installed, you can either ask the system administrators to install it globally or install it by yourself for your account.
If you run the install.packages() command by yourself, R will note that you are not the administrator and ask whether it should create a private library where additional packages will be installed:
> install.packages('doParallel')
Warning in install.packages("doParallel") :
'lib = "/opt/cecisw/arch/easybuild/2018b/software/R/3.5.1-foss-2018b/lib64/R/library"' is not writable # This is because you do not have administrator access to the cluster
Would you like to use a personal library instead? (yes/No/cancel) yes
Would you like to create a personal library
‘~/R/x86_64-pc-linux-gnu-library/3.5’ # So R will create a directory in your home directory, one per R version.
to install packages into? (yes/No/cancel) yes
--- Please select a CRAN mirror for use in this session ---
Secure CRAN mirrors
1: 0-Cloud [https] 2: Algeria [https]
3: Australia (Canberra) [https] 4: Australia (Melbourne 1) [https]
5: Australia (Melbourne 2) [https] 6: Australia (Perth) [https]
7: Austria [https] 8: Belgium (Ghent) [https]
[...]
** testing if installed package can be loaded
* DONE (doParallel)
Once this is done, you will be able to load the library on all compute nodes of that cluster for the same R version as the one that was used to install the package.
Note that this process is interactive, as R asks the user some questions, and it might therefore fail when running inside a batch job. You can run it interactively once on the frontend (the login node) before you submit your jobs.
If you want to include the install.packages() command in your scripts, you should specify at least the target directory with the lib parameter and the mirror to use with the repos parameter:
install.packages('doParallel', lib='~/R/x86_64-pc-linux-gnu-library/3.5', repos='https://lib.ugent.be/CRAN')
so that R already knows the answers to those questions.
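The same installation can also be launched from the shell without opening an R session. The sketch below is one possible way to do it, assuming the Rscript command is available once the R module is loaded. install_r_package is a hypothetical helper name, and the library path and mirror are the ones used in the example above.

```shell
# Personal library used in the example above; adapt the version suffix to the
# R version you load. R_LIBS_USER is the environment variable R reads to
# locate the personal library.
R_LIBS_USER="$HOME/R/x86_64-pc-linux-gnu-library/3.5"
export R_LIBS_USER

# Create the directory beforehand: in a non-interactive session R cannot ask
# whether it should create it.
mkdir -p "$R_LIBS_USER"

# install_r_package NAME...: hypothetical helper wrapping install.packages().
install_r_package() {
    Rscript -e 'install.packages(commandArgs(trailingOnly = TRUE), lib = Sys.getenv("R_LIBS_USER"), repos = "https://lib.ugent.be/CRAN")' "$@"
}
```

With the R module loaded, running install_r_package doParallel on the login node would then install the package into the personal library.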
As a final note, please be aware that on heterogeneous clusters (which have compute nodes with different generations of processors), the R installation is performed once per generation (to be optimised for the processors of that generation), so a library could be missing for one R version on one generation of CPU.
Octave¶
With Octave you can use the pkg command to install additional packages, and the pkg prefix command to decide where to install them. Here is an example of how to install a package by yourself on the cluster. The objective of this mini tutorial is to have the azimuth function available on a specific version of Octave available as a system module. We will assume that the relevant system module is loaded.
First you need to find the name of the package which offers that function. According to its documentation, https://octave.sourceforge.io/mapping/function/azimuth.html, it is named “mapping”. The easy way would be to install it from Octave Forge from the Octave command line with
octave:1> pkg install -forge "mapping"
but unfortunately, in this example, the version available from Octave Forge requires a newer version of Octave than the one currently loaded:
octave:1> pkg install -forge "mapping"
error: the following dependencies were unsatisfied:
mapping needs octave >= 5.2.0
mapping needs geometry >= 4.0.0
octave:2> version
ans = 4.4.1
We will then install the package and its dependencies manually. As the latest version does not work with this version of Octave, you will need to download a previous version from the project page on SourceForge: https://sourceforge.net/projects/octave/files/Octave%20Forge%20Packages/Individual%20Package%20Releases/
Download the file mapping-1.4.0.tar.gz and copy it to your home directory.
Note
Whenever installing packages in Octave, load the texinfo module before starting Octave to be able to generate the documentation inside Octave.
As stated in the documentation, mapping requires some dependencies. When you try to install the package with pkg install <path to file>, it complains:
octave:1> pkg install mapping-1.4.0.tar.gz
error: the following dependencies were unsatisfied:
mapping needs geometry >= 4.0.0
octave:2> pkg install geometry-4.0.0.tar.gz
error: the following dependencies were unsatisfied:
geometry needs matgeom >= 1.0.0
Therefore, download the following files:
geometry-4.0.0.tar.gz
matgeom-1.2.3.tar.gz
from the same location and proceed to install matgeom:
octave:1> pkg install matgeom-1.2.3.tar.gz
For information about changes from previous versions of the matgeom package, run 'news matgeom'.
and then work back up the dependency chain:
octave:2> pkg install geometry-4.0.0.tar.gz
warning: doc_cache_create: unusable help text found in file 'clipper'
For information about changes from previous versions of the geometry package, run 'news geometry'.
octave:3> pkg install mapping-1.4.0.tar.gz
configure: WARNING: GDAL library not found. Reading of raster files will be disabled.
For information about changes from previous versions of the mapping package, run 'news mapping'.
Here you notice the warning about the GDAL library not being found, which leads to reduced functionality. You can remove the package and install it again with the GDAL module loaded:
octave:4> pkg uninstall mapping
octave:5> exit
[dfr@lemaitre3 ~]$ ml GDAL
[dfr@lemaitre3 ~]$ octave
octave: X11 DISPLAY environment variable not set
[...]
octave:1> pkg install mapping-1.4.0.tar.gz
For information about changes from previous versions of the mapping package, run 'news mapping'.
The warning has disappeared and the package is available.
octave:2> pkg list
Package Name | Version | Installation directory
--------------+---------+-----------------------
control | 3.1.0 | .../easybuild/2018b/software/Octave/4.4.1-foss-2018b/share/octave/packages/control-3.1.0
geometry | 4.0.0 | /home/ucl/pan/dfr/octave/geometry-4.0.0
io | 2.4.12 | .../arch/easybuild/2018b/software/Octave/4.4.1-foss-2018b/share/octave/packages/io-2.4.12
mapping | 1.4.0 | /home/ucl/pan/dfr/octave/mapping-1.4.0
matgeom | 1.2.3 | /home/ucl/pan/dfr/octave/matgeom-1.2.3
signal | 1.4.0 | .../easybuild/2018b/software/Octave/4.4.1-foss-2018b/share/octave/packages/signal-1.4.0
statistics | 1.4.0 | .../2018b/software/Octave/4.4.1-foss-2018b/share/octave/packages/statistics-1.4.0
Once the package is loaded, the azimuth function is found and behaves as in the examples of its documentation.
octave:3> pkg load mapping
octave:4> which azimuth
'azimuth' is a function from the file /home/ucl/pan/dfr/octave/mapping-1.4.0/azimuth.m
octave:5> azimuth([10,10], [10,40])
ans = 87.336
octave:6> azimuth([0,10], [0,40])
ans = 90
octave:7> azimuth(pi/4,0,pi/4,-pi/2,"radians")
ans = 5.3279
octave:8>
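The manual installation above can also be scripted from the shell. The following is a minimal sketch, assuming the three tarballs of this example are in the current directory and that the Octave, texinfo and GDAL modules are already loaded. install_octave_packages is a hypothetical helper name.

```shell
# Tarballs in dependency order: each package only needs the ones before it.
packages="matgeom-1.2.3.tar.gz geometry-4.0.0.tar.gz mapping-1.4.0.tar.gz"

# install_octave_packages: hypothetical helper that installs the tarballs
# one by one and stops at the first problem.
install_octave_packages() {
    for tarball in $packages; do
        if [ ! -f "$tarball" ]; then
            echo "missing $tarball, download it first" >&2
            return 1
        fi
        octave --no-gui --eval "pkg install $tarball" || return 1
    done
}
```

Calling install_octave_packages from the directory that holds the tarballs runs the same pkg install commands as in the interactive session, in the order required by the dependencies.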
Perl¶
For Perl, you can use the local::lib module and the cpanm tool.
For cpanm to install modules locally, you need to set up the environment
according to the output of the perl -Mlocal::lib command. You can set it
interactively with
eval $(perl -Mlocal::lib)
and/or set it once and for all with
perl -Mlocal::lib >> ~/.bash_profile
Then you can simply run
cpanm Algorithm::Numerical::Shuffle
to install the Algorithm::Numerical::Shuffle module.
By default, this will install the modules in the ~/perl5 directory. Should
you want to install them to another place, give that path as argument to the
local::lib module. For instance:
perl -Mlocal::lib=mylibs/perl >> ~/.bash_profile
to install in the mylibs/perl directory.
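Before running cpanm, it can help to confirm that the local::lib environment is really in place in the current shell. The sketch below checks the variables that recent versions of local::lib export. check_local_lib is a hypothetical helper name, and the exact list of variables is an assumption that may differ with older versions of the module.

```shell
# check_local_lib: hypothetical helper that reports which of the variables
# exported by local::lib are missing from the current environment.
check_local_lib() {
    missing=0
    for var in PERL5LIB PERL_LOCAL_LIB_ROOT PERL_MB_OPT PERL_MM_OPT; do
        eval "value=\${$var:-}"
        if [ -z "$value" ]; then
            echo "$var is not set" >&2
            missing=1
        fi
    done
    return $missing
}

check_local_lib || echo "local::lib environment incomplete: run eval \$(perl -Mlocal::lib) first"
```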
Installing with Yum or Aptitude¶
Installing with Yum (RedHat, Fedora, etc.) or Aptitude (Ubuntu, Debian, etc.)
or any other package manager is not possible for users. Commands like
sudo apt-get install <name of package> will fail because all clusters run the
CentOS distribution, which does not use the Aptitude package manager, and
because users are not allowed to use sudo (see below).
If your program can only be installed with apt-get, then you will need to
use Singularity.
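If you are unsure which distribution family a machine runs, and hence which package manager its installation instructions refer to, the os-release file can tell you. This is a minimal sketch, assuming a Linux system that provides /etc/os-release. distro_id is a hypothetical helper name.

```shell
# distro_id [FILE]: print the ID field of an os-release file
# (default /etc/os-release). distro_id is a hypothetical helper name.
distro_id() {
    ( . "${1:-/etc/os-release}" && printf '%s\n' "$ID" )
}

case "$(distro_id 2>/dev/null)" in
    debian|ubuntu) echo "Debian family: packages are managed with apt" ;;
    centos|rhel|fedora) echo "Red Hat family: packages are managed with yum or dnf" ;;
    *) echo "Other or unknown distribution" ;;
esac
```

Whatever the answer, installing system packages requires root privileges, which users do not have on the clusters.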
Use of the sudo command¶
Do not try to use the sudo command; it will fail. Only local system
administrators are able to gain root-level privileges. Regular users are not
allowed to, simply because they would continuously break each other’s
configuration, or potentially destroy the whole system. There is therefore no
way root-level privileges will ever be granted to users.