
IT Center Changes

Category: ‘RWTH-HPC’

[CLAIX-2023] Update to Rocky Linux 8.10

20 June 2024 | by

The RWTH Compute Cluster CLAIX-2023 has been updated to Rocky Linux 8.10, as Rocky Linux 8.9 has reached end of life. The update ensures the continued availability of security and bugfix updates for the system.

Detailed changes from Rocky 8.9 to Rocky 8.10 can be found in the Rocky Linux release notes.

Note: Due to the changed library and application versions included in the updated system, measured performance may differ from that of previous compute jobs.


You can track any disruptions or security advisories that may occur due to the aforementioned change in the Email category on our status reporting portal.

Final decommissioning of CLAIX-2018

3 June 2024 | by

All remaining CLAIX-2018 nodes (login and backend) have now been decommissioned, with the following exceptions:

* login18-4 and copy18-1 remain online for integrative hosting customers. All other users should use the new CLAIX-2023 dialog systems.

* login18-g-1 remains online until login23-g-1 is available again (see the maintenance page).


Change of default partition to CLAIX-2023

15 May 2024 | by

The default partition for all projects was changed from CLAIX-2018 to the corresponding CLAIX-2023 partition (e.g., c23ms for CPU jobs, c23g for GPU jobs).
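Job scripts that still request a CLAIX-2018 partition explicitly need to be pointed at the new partitions. A minimal sketch of a batch script targeting the new default CPU partition (the project ID `rwthXXXX`, the resource values, and the payload are placeholders, not an official template):

```shell
#!/usr/bin/env zsh
# Hypothetical example job script -- project ID and resources are placeholders.
#SBATCH --partition=c23ms      # CLAIX-2023 CPU partition (the new default for CPU jobs)
#SBATCH --account=rwthXXXX     # replace with your own HPC project ID
#SBATCH --ntasks=4
#SBATCH --time=00:30:00

srun hostname
```

Jobs submitted without an explicit partition now land on the corresponding CLAIX-2023 partition automatically.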


Decommissioning of further CLAIX-2018 nodes

8 May 2024 | by

Today, a further 287 CLAIX-2018 MPI nodes were decommissioned.


Decommissioning of CLAIX-2018

3 May 2024 | by

CLAIX-2018 has reached end of life. The first 516 (empty) nodes were switched off today (May 3rd, 2024). Over the next few weeks, further systems will gradually be taken out of service. The final decommissioning of the remaining CLAIX-2018 nodes will take place on May 31st, 2024.


Decommissioning of the first CLAIX-2018 GPU nodes

10 April 2024 | by

We have decommissioned 25 CLAIX-2018 GPU nodes. We strongly recommend migrating to the CLAIX-2023 ML nodes as soon as possible.


System maintenance on 17 April 2024

10 April 2024 | by

Dear cluster users,

on 17 April 2024, we will carry out full maintenance of the cluster. The following points will be addressed during the maintenance:

* new kernel so that the user namespaces can be reactivated, see also [1]
* update of the InfiniBand stack of CLAIX23 to stabilise and improve performance
* migration of the HPCWORK directory from lustre18 to the new storage lustre22, see also [2]. Over the last few weeks we have been migrating all HPCWORK data to the new file system; during this maintenance we will perform the final step of the migration. HPCWORK will not be available during the maintenance.
* migration of the old JupyterHub system to a new one

During this maintenance work, the login systems and the batch system will not be available. It is expected that the login systems will reopen in the early morning.

We do not expect the maintenance to last all day and expect the cluster to reopen earlier. However, HPCWORK will most likely not be available at that point, as the migration must be completed first. Jobs that rely on HPCWORK will fail because they cannot find their files; please cancel such jobs and resubmit them at a later date.
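To deal with affected jobs ahead of the maintenance, the standard Slurm commands can be used; a rough sketch (the job ID and script name are placeholders):

```shell
# List your own pending and running jobs to spot those that use HPCWORK
squeue -u "$USER" -o "%i %j %T"

# Cancel a job that reads from or writes to HPCWORK (12345 is a placeholder ID)
scancel 12345

# Resubmit the job once the migration has finished
sbatch my_hpcwork_job.sh
```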

[1] https://maintenance.itc.rwth-aachen.de/ticket/status/messages/14/show_ticket/8929
[2] https://maintenance.itc.rwth-aachen.de/ticket/status/messages/14/show_ticket/8960

With kind regards,

Your HPC Team


CLAIX-2023 in operation

5 April 2024 | by

After a successful pilot phase, CLAIX-2023 is now in regular operation and available to all projects. To submit jobs to the CLAIX-2023 HPC segment, please use one of the new partitions in your scripts. The resources used will be accounted against your project contingent (e.g., "#SBATCH -A rwthXXXX").
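As an illustration, a job can also be assigned to a CLAIX-2023 partition and charged to a project contingent directly on the command line (the partition name `c23g` and the script name `job.sh` are illustrative):

```shell
# Submit job.sh to a CLAIX-2023 GPU partition, charged to project rwthXXXX
sbatch -A rwthXXXX -p c23g job.sh
```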

Please also note: the end of life of CLAIX-2018 is imminent, so we recommend switching to CLAIX-2023 as soon as possible.

Please contact servicedesk@itc.rwth-aachen.de for any further questions.


Update RegApp

3 April 2024 | by

We updated the RegApp to the latest version (including internal code updates and an upgrade to Java 17).


HPC Slurm privacy mode has been disabled

23 February 2024 | by

UPDATE: This feature has been temporarily disabled due to incompatibilities with https://perfmon.hpc.itc.rwth-aachen.de/

Slurm commands within the ITC HPC Cluster have been changed to hide personal Slurm information from other users.

  • Users are prevented from viewing jobs or job steps belonging to other users.
  • Users are prevented from viewing reservations which they cannot use.
  • Users are prevented from viewing usage of any other user with Slurm.
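In practice, queries are scoped to your own account while privacy mode is active. For example (standard Slurm commands; the exact output depends on your jobs and reservations):

```shell
# Lists only your own jobs; other users' jobs are hidden
squeue -u "$USER"

# Shows only reservations that you are permitted to use
scontrol show reservation

# Accounting queries are likewise restricted to your own usage
sacct -u "$USER"
```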

If you experience any problems, please contact us as usual via servicedesk@itc.rwth-aachen.de with a precise description of the features you are using and your problem.