As a member of the THOTH team, you’ll have access to several machines, including our desktops, our GPU cluster, and the CPU cluster shared across the Inria Grenoble centre. These machines are managed either by THOTH or by the DSI (Direction des Systèmes d’Information).



Upon your arrival, a desktop should be assigned to you.

All desktop machines run a Linux Ubuntu distribution (16.04 LTS or 18.04 LTS).
You have basic user access to all of the desktop machines. The credentials (username and password) are the same as those of your INRIA account.

Administrator privileges are given only to the system administrators. The list of desktop machines is available here:

Under certain circumstances it is acceptable to run experiments on desktop machines.
There are, however, best practices to follow:
  • Almost every machine has a designated user. If you want to run a heavy experiment on someone’s machine, ask for their permission first.

  • Only run code that you *know* is stable.

  • Don’t crash a colleague’s machine by allocating too much memory.


Desktops stay online 24/7: you shouldn’t shut down your computer when you leave work. Data is stored on every desktop, and this data should remain accessible to other people at all times. If a reboot is necessary, talk to a system administrator first.

GPU nodes

At the time of writing (September 2021), our GPU cluster comprises 27 GPU nodes. You cannot ssh directly to the GPU nodes, but you can launch interactive jobs on them. The frontend machine for the GPU cluster is edgar. More information can be found in the dedicated OAR tutorial.
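The OAR tutorial covers the details, but as a rough sketch (the exact resource and property flags are site-specific, so treat these as assumptions to check against the tutorial), an interactive GPU job looks like this:

```shell
# Log in to the GPU cluster frontend, then request an interactive
# shell on a node via OAR (resource flags are illustrative only).
ssh edgar
oarsub -I -l nodes=1,walltime=2:00:00
```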

CPU nodes

The CPU cluster was previously managed by THOTH, but is now administered by the DSI. We have priority access to a subset of 54 nodes. As with the GPU nodes, you cannot ssh directly to the CPU nodes, but you can launch interactive jobs on them. The frontend machine for the CPU cluster is access2-cp. More information can be found in the dedicated OAR tutorial.


Do not launch Python commands directly on the cluster frontends (edgar or access2-cp). If your command crashes the frontend, the whole cluster goes down.
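Instead, wrap the command in a batch job so it runs on a compute node rather than on the frontend. A hedged sketch (train.py is a hypothetical script, and the exact OAR flags depend on our site configuration):

```shell
# Submit the Python command as a batch job; it will execute on a
# cluster node, not on the frontend ("train.py" is hypothetical).
oarsub -l nodes=1,walltime=4:00:00 "python train.py"
```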


Some machines, such as clear or prospero, are neither desktops nor cluster nodes: they serve other functions, such as storing the team’s data or running monitoring servers.


Home directory

Your home directory is accessible from all machines at this path:


It is designed to store low volume, highly critical data, such as your code.

The space available is only ~10 GB. It is far too small for videos, descriptors, or experimental results, and it shouldn’t be used to store them.
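To see whether you are close to the limit, a quick check with standard tools (nothing team-specific assumed here):

```shell
#!/bin/sh
# Show the total size of your home directory, then its five largest
# top-level entries -- good candidates to move to a scratch space.
du -sh "$HOME"
du -sh "$HOME"/* 2>/dev/null | sort -rh | head -n 5
```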

Your home directory data is relatively safe: it is stored by the centre’s IT. Still, if you have important data, regularly make backups yourself on a laptop or an external hard drive, or upload your code somewhere (see online storage for details).


Home directories are backed up. If you do something wrong (delete a file, etc.), old versions of any directory in your home directory can be accessed in /home/yourUsername/.snapshot. This hidden directory stores different versions of the files from your home directory (covering the last 4 hours, the last 14 days, and up to the last 4 weeks). A file modified more than 4 weeks ago cannot be retrieved.
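For example, to restore an accidentally deleted file (the snapshot directory names are site-specific, so list them first; &lt;some-snapshot&gt; and thesis.tex are placeholders):

```shell
# List the available snapshots, then copy the old version back.
ls /home/yourUsername/.snapshot/
cp /home/yourUsername/.snapshot/<some-snapshot>/thesis.tex ~/thesis.tex
```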

Scratch spaces


There are numerous “scratch” spaces in the THOTH team. They are data volumes exported via NFS to *all* machines.

Scratch spaces are designed to store large volumes of data with some redundancy; however, they should not be considered reliable.

Moreover, the data on scratch spaces is not backed up: if you remove the wrong file, you cannot recover it.

Some examples of scratch space paths:


To see all the available scratch spaces, check the disk usage table. To use a scratch space, ask the system administrators for access. They will create a directory /scratch/machine/yourUserName to which you have write access.
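Before writing large data, you can also check how full a scratch volume is from any machine (the path below follows the /scratch/machine pattern above and is purely illustrative):

```shell
# Show size, usage, and free space of a scratch mount.
df -h /scratch/machine
```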

Inria Grenoble

Some scratch spaces are not physically located on our machines and are managed by the Grenoble DSI. They are accessible from all machines:


This scratch space can also be useful if you want to collaborate with other Inria teams located in Grenoble: you can replace thoth in the path with the name of any other team.


Coming soon.


The following directories are accessible from all machines on the network:


Online storage

  • Versioning and saving code: it is highly recommended to use development tools that let you version and back up your code.

To version your code, you can use git or svn (git is now recommended).

Inria provides services to manage git (and svn) projects online: the Inria GitLab (where you can log in with your Inria credentials).
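As a minimal sketch of that git workflow (the project path, file name, and identity below are placeholders, and the GitLab remote commands are left commented out since they need a real project):

```shell
#!/bin/sh
# Create a toy repository, commit a file, and show the history.
set -e
rm -rf /tmp/demo-project && mkdir /tmp/demo-project && cd /tmp/demo-project
git init -q
git config user.email "you@inria.fr"   # placeholder identity
git config user.name "Your Name"       # placeholder identity
echo "print('hello')" > train.py
git add train.py
git commit -q -m "Initial commit"
git log --oneline                      # one line per commit
# To back the repository up on the Inria GitLab (placeholder path):
# git remote add origin git@gitlab.inria.fr:yourUserName/demo-project.git
# git push -u origin master
```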

  • Sharing and backing up files: Inria provides a SeaFile sync service that works like any other file-syncing service (Dropbox, Google Drive, etc.). You have a 10 GB storage space in which you can create libraries, and you can install a client on your machine for automatic synchronisation. You log in with your Inria credentials. Collaborators from outside Inria can create an account and contribute to your libraries (they will NOT get their own 10 GB storage space).

  • Third-party services: as a member of Inria, it is highly recommended that you use the aforementioned tools to store and share code and data online. In particular, for privacy reasons, you should avoid services that are not hosted by Inria or its institutional partners: do not use Dropbox, Google Drive, or equivalent services to save work-related data. The use of the Inria GitLab is also encouraged (over GitHub, for instance).