Jupyter¶
For heavy computations, you can run a Jupyterhub service in a Slurm job allocation on any cluster. Among other things, Jupyterhub gives you access to a Python notebook. To achieve this, follow these steps:
- Find the module to load
- Submit a Jupyter job on a cluster
- Connect your browser to the jupyterhub
Find the module to load¶
Connect to a cluster using ssh or MobaXterm.
Search for IPython or JupyterLab modules in the documentation, or run one of these commands on a cluster:
$ module spider IPython
$ module spider JupyterLab
Choose one of the available modules and run module spider again to get the module dependencies. Example:
$ module spider IPython/7.18.1-GCCcore-10.2.0
...
You will need to load all module(s) on any one of the lines below before the "IPython/7.18.1-GCCcore-10.2.0" module is available to load.
releases/2020b
...
In this case you will need to load releases/2020b before loading IPython.
Submit a job to start jupyter¶
Once you know which module to load, submit a job with a 30-minute time limit using the command:
$ srun -t 0:30:00 --pty bash -c 'ml releases/2020b; ml JupyterLab; jupyter notebook --ip $(hostname -i)'
The following have been reloaded with a version change:
1) releases/2019b => releases/2020b
[I 11:09:07.956 NotebookApp] Serving notebooks from local directory: /home/users/-/-/ceciuser
[I 11:09:07.957 NotebookApp] Jupyter Notebook 6.1.4 is running at:
[I 11:09:07.957 NotebookApp] http://10.252.2.1:8888/?token=6392d08d83e5826e27f044a40973273e682282dcbc49894a
[I 11:09:07.957 NotebookApp] or http://127.0.0.1:8888/?token=6392d08d83e5826e27f044a40973273e682282dcbc49894a
[I 11:09:07.957 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[W 11:09:07.990 NotebookApp] No web browser found: could not locate runnable browser.
[C 11:09:07.990 NotebookApp]
You need to keep the compute node IP address (10.252.2.1), the port (8888) and
the connection token for the next step. Leave the terminal open until you have
finished. To end the job, press CTRL-C.
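As a minimal sketch, the three values can be pulled out of the URL printed by Jupyter with plain shell parameter expansion. The function name parse_jupyter_url and the JUPYTER_* variable names are hypothetical, chosen only for this illustration; the URL is the one from the example output above.

```shell
# Hypothetical helper: split a URL of the form
# http://IP:PORT/?token=TOKEN into its three parts.
parse_jupyter_url() {
    local url="$1"
    local hostport="${url#http://}"   # strip the scheme
    hostport="${hostport%%/*}"        # keep only IP:PORT
    JUPYTER_IP="${hostport%%:*}"      # text before the first colon
    JUPYTER_PORT="${hostport##*:}"    # text after the last colon
    JUPYTER_TOKEN="${url##*token=}"   # text after "token="
}

parse_jupyter_url 'http://10.252.2.1:8888/?token=6392d08d83e5826e27f044a40973273e682282dcbc49894a'
echo "IP:    $JUPYTER_IP"
echo "Port:  $JUPYTER_PORT"
echo "Token: $JUPYTER_TOKEN"
```

Copy the http://10. line from your own job output in place of the example URL, since the address and token change with every job.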
Note
For dragon2 you need to add the option --cluster=$CLUSTER_NAME. Example:
srun -t 0:30:00 --cluster=$CLUSTER_NAME --pty bash -c 'ml load releases/2019b; ml load IPython/7.9.0-fosscuda-2019b-Python-3.7.4; jupyter notebook --ip $(hostname -i)'
Connect to the jupyterhub interface¶
Linux or MacOS¶
If you are using Linux or MacOS, you can use a small tool named sshuttle to access it.
Warning
sshuttle is currently not supported directly on Microsoft Windows.
Sshuttle is a program that, according to its documentation, acts as a
“Transparent proxy server that works as a poor man’s VPN. Forwards over ssh.
Doesn’t require admin. Works with Linux and MacOS.” Installation on a Linux or
MacOS laptop can be done with apt, pacman or dnf, depending on your Linux
distribution, or with brew or MacPorts on MacOS. You can also install it with
pip or with git clone from GitHub directly. See the GitHub page for details.
After you have installed sshuttle, make sure you can access the cluster
using an SSH gateway or VPN if necessary. In the following, we will assume
the connection to nic5 through one of the gwceci gateways is properly
configured in .ssh/config as nic5.
Then, create a tunnel with sshuttle. Open a terminal window dedicated to the tunnel and run the following in it.
$ sshuttle -r nic5 10.252.1.0/16
[local sudo] Password:
Warning: No xauth data; using fake authentication data for X11 forwarding.
client: Connected.
It will ask for your password in order to elevate privileges (sudo). That
is needed because it temporarily modifies low-level network routing. As long
as it runs, the private network of the cluster will be accessible from your
laptop, wherever you are connected. Leave that terminal session open for as
long as you need access to the Jupyter service in your job.
If your terminal supports it, you can then click on the URL starting with http://10. that was printed when you submitted the job.
Otherwise, simply copy that URL into your browser and you should see the Jupyterhub interface.
Once you are finished, hit CTRL-C in both terminals to stop everything.
The IP range for sshuttle depends on the cluster.
- nic5 10.252.1.0/16
- lemaitre3 10.7.1.1/24
- dragon2 10.102.169.0/24
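The list above can be wrapped in a small lookup so that the right range is picked from the cluster name. This is only a sketch: the function name sshuttle_range is hypothetical, and it assumes your .ssh/config defines a host alias named after each cluster, as was assumed for nic5. The command is printed rather than executed so you can check it first.

```shell
# Hypothetical helper: map a cluster name to its sshuttle IP range.
# The ranges are copied from the list above.
sshuttle_range() {
    case "$1" in
        nic5)      echo "10.252.1.0/16" ;;
        lemaitre3) echo "10.7.1.1/24" ;;
        dragon2)   echo "10.102.169.0/24" ;;
        *)         echo "unknown cluster: $1" >&2; return 1 ;;
    esac
}

cluster=nic5
# Print the sshuttle command for the chosen cluster instead of running it.
range=$(sshuttle_range "$cluster") && echo "sshuttle -r $cluster $range"
```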
Windows¶
If you are using Windows, install and configure MobaXterm.
Use the compute node IP address and port you get when submitting the job.
To reach the IP address of the compute node from your computer (localhost), two port forwards must be created. The first one forwards an intermediate localhost port (1111) to the front-end (nic5, lemaitre3 ...) through the gateway. The second one forwards local port 2222, via localhost port 1111, to port 8888 on the compute node.
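To make the chain of forwards easier to picture, here is how the same two tunnels might look when written as OpenSSH commands. This is an illustration only, not a replacement for the MobaXterm steps: the gateway and front-end host names and the logins are placeholders, and the front-end SSH port is assumed to be 22. The commands are printed, not executed.

```shell
# All host names and logins below are placeholders, not real addresses.
GATEWAY="login@gateway.example.org"   # hypothetical university gateway
FRONTEND="frontend.example.org"       # hypothetical cluster front-end
CECI_USER="ceciuser"                  # your CECI login name
NODE_IP="10.252.2.1"                  # compute node IP printed by Jupyter
NODE_PORT="8888"                      # port printed by Jupyter

# Tunnel 1: local port 1111 -> SSH port (assumed 22) of the front-end,
# going through the gateway.
tunnel1="ssh -N -L 1111:${FRONTEND}:22 ${GATEWAY}"

# Tunnel 2: local port 2222 -> Jupyter port on the compute node,
# connecting to the front-end through tunnel 1 (localhost port 1111).
tunnel2="ssh -N -p 1111 -L 2222:${NODE_IP}:${NODE_PORT} ${CECI_USER}@localhost"

echo "$tunnel1"
echo "$tunnel2"
```

Each command corresponds to one MobaXterm tunnel: the -L option carries the local port and the remote server, and the final argument is the "ssh server" field.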
Create the first tunnel between your computer and the cluster frontend:
- Open MobaXterm, click on “tunnelling”, then click on “new ssh tunnel” to create the first tunnel.
- Fill the local port with 1111.
- Fill “ssh server” with the login and gateway address of your university.
- Fill the remote server with the frontend cluster address
- Click Save and set the name of the new tunnel with the cluster name
- Click on the key and add your CÉCI key id_rsa.ceci
You can create a tunnel for each cluster (nic5, lemaitre3 ...)
Create the second tunnel between your computer and the compute node through the first tunnel:
- Create a “new ssh tunnel” .
- Fill the local port with 2222.
- Fill “ssh server” with the localhost forwarded port 1111 and your CÉCI user login name
- Fill the remote server with the compute node IP and port you get when submitting the job.
- Click Save and set the name of the new tunnel as node.
- Click on the key and add your CÉCI key id_rsa.ceci
You should now have two tunnels configured in MobaXterm: one named after the cluster and one named node.
Start the tunnel for the cluster where you submitted the job and the tunnel for the node. Then open your browser at http://localhost:2222/?token=XXXXXXXXXX, replacing XXXXXXXXXX with the token you received when submitting the job.
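If you have a shell available (for example the local terminal in MobaXterm), the local URL can be derived from the one printed by Jupyter instead of being retyped. The sketch below assumes the local port 2222 used above; the function name local_jupyter_url is hypothetical.

```shell
# Hypothetical helper: turn the compute-node URL printed by Jupyter
# into the URL to open locally through the tunnel.
local_jupyter_url() {
    local url="$1"
    local local_port="${2:-2222}"     # local tunnel port, 2222 by default
    local token="${url##*token=}"     # keep only the text after "token="
    echo "http://localhost:${local_port}/?token=${token}"
}

local_jupyter_url 'http://10.252.2.1:8888/?token=6392d08d83e5826e27f044a40973273e682282dcbc49894a'
```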
See a video with a connection example on dragon2.
Note
Each time you submit a new Jupyterhub job, you may receive a different IP address and port. Update the IP address and port in the node tunnel before starting it.