Slurm Memory Usage
We had to slightly decrease the maximum available memory for all partitions. Please consult our documentation for details.
GPU resources
In order to unify the resource allocation process and to avoid confusion, we will apply the following changes on October 2nd, 2024:
- If you apply for GPU resources, you have to state them in "k GPU-h" (i.e., thousand GPU hours) instead of "Mio. core-h" (i.e., million core hours). This applies to the detailed project description as well as the JARDS online form. For convenience, the JARDS system will still calculate the core-hour equivalent. Until the end of the year, we will of course still accept the old metric in your detailed project descriptions.
- The views in JARDS.project (RWTH projects, NHR projects) will show the used contingents in "k GPU-h" instead of "Mio. core-h" for all GPU resources.
- The command line tool r_wlm_usage will show the used contingents in "k GPU-h" instead of "Mio. core-h" for all GPU resources (i.e., the "ML partition").
- If you act as a scientific reviewer, you should recommend GPU resources in "k GPU-h" instead of "Mio. core-h".
There will be no changes for the HPC partition (i.e., CPU resources). Please note: the internal billing mechanisms will not change at all. Our Slurm configuration will still use a billing scheme that takes the memory usage and core equivalents of a node into account. Essentially, a factor of 24 between GPU-h and core-h will be used.
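For illustration, here is a minimal Python sketch of that conversion, assuming only the factor of 24 stated above; the function names and the example request are made up for this sketch and are not part of any RWTH tooling.

```python
# Convert between the new GPU-hour unit and the old core-hour unit using the
# factor of 24 stated in this announcement. Purely illustrative.

CORE_H_PER_GPU_H = 24  # billing equivalent between one GPU-h and core-h


def gpu_h_to_core_h(gpu_hours: float) -> float:
    """Return the core-hour equivalent of a number of GPU hours."""
    return gpu_hours * CORE_H_PER_GPU_H


def k_gpu_h_to_mio_core_h(k_gpu_hours: float) -> float:
    """Convert "k GPU-h" (thousand GPU hours) to "Mio. core-h" (million core hours)."""
    return gpu_h_to_core_h(k_gpu_hours * 1_000) / 1_000_000


if __name__ == "__main__":
    requested = 50  # hypothetical request of 50 k GPU-h
    print(f"{requested} k GPU-h correspond to {k_gpu_h_to_mio_core_h(requested):.2f} Mio. core-h")
```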
You can find all limits for the different project categories on our website.
You can track any disruptions or security advisories that may occur due to the aforementioned change in the Email category on our status reporting portal.
Default Anaconda Repositories have been blocked on the HPC Cluster
Dear CLAIX Users,
As you may have noticed, access to the "default" Anaconda repositories (repo.anaconda.com and others hosted at anaconda.com) has been blocked by the firewall on the HPC cluster. This action is necessary because RWTH is not permitted to use these repositories due to licensing issues.
We understand that blocking the Anaconda domain may disrupt your current workflows. To mitigate this, we set conda-forge as the default channel in /etc/conda/condarc. If you still encounter issues, please check the .condarc file in your home directory and make sure to remove "defaults", "r", and "main" from the channel list.
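If you prefer to check this programmatically, the following Python sketch inspects a personal .condarc for the blocked channels. It is only a convenience sketch: it assumes PyYAML is installed and that your configuration lives at ~/.condarc, and it reports problems without modifying the file.

```python
# Report blocked "default" Anaconda channels found in ~/.condarc.
# Requires PyYAML; adjust the path if your conda configuration lives elsewhere.

from pathlib import Path

import yaml

BLOCKED_CHANNELS = {"defaults", "r", "main"}

condarc = Path.home() / ".condarc"
if not condarc.exists():
    print("No personal .condarc found; the system-wide conda-forge default applies.")
else:
    config = yaml.safe_load(condarc.read_text()) or {}
    channels = set(config.get("channels") or [])
    hits = sorted(channels & BLOCKED_CHANNELS)
    if hits:
        print(f"Please remove these channels from {condarc}: {', '.join(hits)}")
    else:
        print(f"{condarc} does not reference the blocked default channels.")
```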
For new Conda users, we suggest using Miniforge, as this distribution uses conda-forge as its default repository.
We apologize for any inconvenience this may cause. If you require assistance, please reach out to servicedesk@itc.rwth-aachen.de.
PS: For a brief time, we accidentally blocked anaconda.org. We have corrected this issue.
HPCWORK Now Offers Increased File Quotas
Dear CLAIX Users,
As you might remember, in April we transitioned the HPCWORK directories to a new Lustre system. This new system provides significantly higher file quotas. We are pleased to announce that you now have a default quota of 1 million files.
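If you would like a rough idea of how close you are to this limit, the following Python sketch counts the entries below your HPCWORK directory. It assumes the HPCWORK environment variable is set on the cluster, and it is only an approximation: walking a large directory tree can take a while, and the authoritative numbers come from the file system's quota accounting.

```python
# Approximate the number of file system entries in HPCWORK and compare it to
# the new default quota of 1 million files. Only a rough sanity check; the
# file system's own quota accounting remains authoritative.

import os

QUOTA_FILES = 1_000_000  # new default file quota

hpcwork = os.environ.get("HPCWORK")
if hpcwork is None:
    raise SystemExit("HPCWORK is not set; please run this on the cluster.")

count = 0
for _root, dirs, files in os.walk(hpcwork):
    count += len(dirs) + len(files)

print(f"{count} entries in {hpcwork} ({count / QUOTA_FILES:.1%} of the default quota)")
```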
We hope this enhancement makes your day-to-day workflows easier.
Happy computing!
Your HPC Admins
[CLAIX-2023] Update to Rocky Linux 8.10
The RWTH Compute Cluster CLAIX-2023 was updated to Rocky Linux 8.10 because Rocky Linux 8.9 has reached its end of life, ensuring the continued availability of security and bugfix updates for the system.
Detailed changes from Rocky 8.9 to Rocky 8.10 can be found in the Rocky Linux release notes.
Note: Please keep in mind that, because the update changes the library and application versions included in the system, measured performance may differ from that of previous compute jobs.
You can track any disruptions or security advisories that may occur due to the aforementioned change in the Email category on our status reporting portal.
Final decommissioning of CLAIX-2018
All remaining CLAIX-2018 nodes (login and backend) have now been decommissioned, with only the following exceptions:
* login18-4 and copy18-1 stay online for integrative hosting customers. All other users should use the new CLAIX-2023 dialog systems.
* login18-g-1 stays online until login23-g-1 is available again (see the maintenance page).
You can track any disruptions or security advisories that may occur due to the aforementioned change in the Email category on our status reporting portal.
Change of default partition to CLAIX-2023
The default partition for all projects was changed from CLAIX-2018 to the corresponding CLAIX-2023 partition (e.g., c23ms for CPU jobs, c23g for GPU jobs).
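If your workflow depends on a particular partition, you may want to request it explicitly rather than rely on the default. As an illustration, the following Python sketch wraps sbatch to pin a job to one of the partitions mentioned above; the job script names and resource values are placeholders, not recommendations.

```python
# Submit a job script to an explicitly chosen CLAIX-2023 partition via sbatch.
# Partition names c23ms/c23g are taken from the announcement above; the job
# scripts and GPU request below are placeholders.

import subprocess


def submit(job_script: str, partition: str, *extra_args: str) -> str:
    """Submit job_script with sbatch, pinning it to the given partition."""
    cmd = ["sbatch", f"--partition={partition}", *extra_args, job_script]
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout.strip()  # e.g. "Submitted batch job 123456"


if __name__ == "__main__":
    # CPU job on the CLAIX-2023 standard-memory partition.
    print(submit("cpu_job.sh", "c23ms"))
    # GPU job on the CLAIX-2023 GPU partition, requesting one GPU.
    print(submit("gpu_job.sh", "c23g", "--gres=gpu:1"))
```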
You can track any disruptions or security advisories that may occur due to the aforementioned change in the Email category on our status reporting portal.
Decommissioning of further CLAIX-2018 nodes
Today, a further 287 CLAIX-2018 MPI nodes have been decommissioned.
You can track any disruptions or security advisories that may occur due to the aforementioned change in the Email category on our status reporting portal.
Decommissioning of CLAIX-2018
CLAIX-2018 has reached end of life. The first 516 (empty) nodes have been switched off today (May 3rd, 2024). Over the next few weeks, further systems will gradually be taken out of service. The final decommissioning of the remaining CLAIX-2018 nodes will take place on May 31st, 2024.
You can track any disruptions or security advisories that may occur due to the aforementioned change in the Email category on our status reporting portal.
Decommissioning of the first CLAIX-2018 GPU nodes
We have decommissioned 25 CLAIX-2018 GPU nodes. We strongly recommend migrating to the CLAIX-2023 ML nodes as soon as possible.
You can track any disruptions or security advisories that may occur due to the aforementioned change in the Email category on our status reporting portal.