Category: 'General'
System Maintenance 17 April 2024
Dear cluster users,
on 17 April 2024, we will carry out full maintenance of the cluster. The following points will be addressed during the maintenance:
* new kernel so that user namespaces can be reactivated, see also [1]
* update of the InfiniBand stack of CLAIX-2023 to stabilise it and improve performance
* migration of the HPCWORK directory from lustre18 to the new storage system lustre22, see also [2]. Over the last few weeks we have been migrating all HPCWORK data to the new file system; during this maintenance we will perform the final step of the migration. HPCWORK will not be available during the maintenance.
* migration of the old JupyterHub system to a new one
During this maintenance work, the login systems and the batch system will not be available. It is expected that the login systems will reopen in the early morning.
We do not expect the maintenance to last all day and expect the cluster to reopen earlier. However, HPCWORK will most likely not be available at that point, as the migration must be completed first. Jobs that rely on HPCWORK will fail because they cannot find their files. You must therefore cancel such jobs and resubmit them at a later time.
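As a sketch, affected jobs can be cancelled and later resubmitted with the standard Slurm commands (the job ID and script name are illustrative):

squeue -u $USER           # list your own jobs
scancel 12345             # cancel a job that depends on HPCWORK
sbatch my_job_script.sh   # resubmit once HPCWORK is available again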
[1] https://maintenance.itc.rwth-aachen.de/ticket/status/messages/14/show_ticket/8929
[2] https://maintenance.itc.rwth-aachen.de/ticket/status/messages/14/show_ticket/8960
With kind regards,
Your HPC Team
You can track any disruptions or security advisories that may occur due to the aforementioned change in the Email category on our status reporting portal.
CLAIX-2023 in operation
After a successful pilot phase, CLAIX-2023 is now in operation and available to all projects. To submit jobs to the CLAIX-2023 HPC segment, please use one of the new partitions in your scripts. The resources used will be accounted against your project quotas (e.g., "#SBATCH -A rwthXXXX").
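A minimal job-script sketch follows; the partition name c23ms and the program name are illustrative placeholders, so please check our documentation for the actual CLAIX-2023 partition names:

#!/bin/bash
#SBATCH --account=rwthXXXX    # your project account, as described above
#SBATCH --partition=c23ms     # illustrative CLAIX-2023 partition name
#SBATCH --ntasks=4
#SBATCH --time=01:00:00
srun ./my_program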
Please also note: the end of life of CLAIX-2018 is imminent. We therefore recommend starting to use CLAIX-2023 as soon as possible.
Please contact servicedesk@itc.rwth-aachen.de for any further questions.
You can track any disruptions or security advisories that may occur due to the aforementioned change in the Email category on our status reporting portal.
HPC Slurm privacy mode has been disabled
UPDATE: This feature has been temporarily disabled due to incompatibilities with https://perfmon.hpc.itc.rwth-aachen.de/
Slurm commands within the ITC HPC Cluster have been changed to hide personal Slurm information from other users.
- Users are prevented from viewing jobs or job steps belonging to other users.
- Users are prevented from viewing reservations which they cannot use.
- Users are prevented from viewing the usage of any other user via Slurm.
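As an illustrative sketch of the effect (user names are placeholders):

squeue                 # now lists only your own jobs
squeue -u otheruser    # another user's jobs are no longer shown
sacct -u otheruser     # accounting/usage data of other users is hidden as well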
If you experience any problems, please contact us as usual via servicedesk@itc.rwth-aachen.de with a precise description of the features you are using and your problem.
Temporary Deactivation of User Namespaces
Update 08.02.24:
We have installed a bugfix release for the affected software component and enabled user namespaces again.
Dear users,
due to an open security issue we are required to disable the feature of so-called user namespaces on the cluster. This feature is mainly used by containerization software and affects the way apptainer containers will behave. The changes are effective immediately. Most users should not experience any interruptions. If you experience any problems, please contact us as usual via servicedesk@itc.rwth-aachen.de with a precise description of the features you are using. We will reactivate user namespaces as soon as we can install the necessary fixes for the aforementioned vulnerability.
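As a quick sketch (assuming standard Linux interfaces), you can check whether user namespaces are currently available on a node:

cat /proc/sys/user/max_user_namespaces   # 0 means user namespaces are disabled
unshare --user --map-root-user whoami    # only succeeds while the feature is enabled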
Terrapin Attack Counter Measures (SSH)
A recently discovered flaw in the implementation of the Secure Shell (SSH) protocol, known as the "Terrapin Attack", enables an attacker to break the integrity of the "secure shell" connection in order to weaken its overall security. TL;DR: To implement an effective countermeasure against the attack, we have disabled the affected methods in the HPC cluster's SSH configuration. Consequently, these methods cannot be used until further notice:
- Ciphers: ChaCha20-Poly1305
- MACs: Any etm method (e.g. hmac-sha2-512-etm@openssh.com)
Please adapt your configuration accordingly if it relies on the methods mentioned above.
The attack is only feasible when either the ChaCha20-Poly1305 cipher or a Cipher Block Chaining (CBC) cipher (or, in theory, a Counter Mode (CTR) cipher) is used in combination with an encrypt-then-MAC (etm) message authentication code (MAC) method, and the attacker is able to act as a man-in-the-middle. (Example: a security suite on your client machine may perform deep packet inspection, which is by definition a (hopefully "good") man-in-the-middle, to protect you from other threats.)
The Galois Counter Mode (GCM) AES ciphers are not affected.
We encourage you to employ strong encryption ciphers such as aes256-gcm@openssh.com together with a sufficiently strong MAC method (e.g. hmac-sha2-256 or hmac-sha2-512) that is immune to the attack.
Note:
Due to a bug in the Windows OpenSSH client, which employs the umac-128@openssh.com MAC by default, we have disabled this problematic method in the SSH server configuration as well to minimize issues when connecting to the HPC cluster. Until further notice, only hmac-sha2-512 and hmac-sha2-256 can be employed as MACs. Please adapt your configuration accordingly, if required, e.g.:
Ciphers aes256-gcm@openssh.com,aes256-ctr
MACs hmac-sha2-512,hmac-sha2-256
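For instance, a per-host entry in your ~/.ssh/config could look like the following sketch (the host alias and hostname are illustrative):

Host rwth-hpc
    HostName login23-1.hpc.itc.rwth-aachen.de
    Ciphers aes256-gcm@openssh.com,aes256-ctr
    MACs hmac-sha2-512,hmac-sha2-256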
You can track any disruptions or security advisories that may occur due to the aforementioned change in the Email category on our status reporting portal.
Multi-Factor Authentication Mandatory starting 15 January 2024
OS Upgraded to Rocky 8.9
During the last cluster maintenance, the OS of the HPC cluster was upgraded to Rocky Linux 8.9 due to the EOL of Rocky 8.8, to ensure continuous update support for the systems.
The upgrade provides a modernized system base and security enhancements. The user view, usage, and expected performance of the cluster remain unchanged.
You can track any disruptions or security advisories that may occur due to the aforementioned change in the Email category on our status reporting portal.
CLAIX System Maintenance on 27.11.2023
Dear users of the RWTH Compute Cluster,
on 27.11.2023, the entire cluster will be unavailable from 8:00 to 12:00 due to system maintenance.
With kind regards,
Your HPC Team
You can track any disruptions or security advisories that may occur due to the aforementioned change in the RWTH-HPC category on our status reporting portal.
CLAIX-2018 dialog systems
Due to the high load on the login/dialog nodes affecting their usability, we have decided to reduce the maximum number of usable cores on each login node to four per user. Please note: these login nodes should be used for programming, preparation, and minimal post-processing of batch jobs. They are not intended for production runs or performance tests. For longer tests (max. 25 minutes), parallel debugging, compiling, etc., you can use our "devel" partition by adding "#SBATCH --partition=devel" to batch jobs or interactively with "salloc -p devel".
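For example (the resource values are illustrative):

salloc -p devel -n 4 -t 00:25:00    # interactive test session on the devel partition

or, within a batch script for a short test run:

#SBATCH --partition=devel
#SBATCH --time=00:25:00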
For all production jobs, please use our batch system **without** "#SBATCH --partition=devel". If you want to learn more about the batch system, we invite you to our Slurm introduction.
You can track any disruptions or security advisories that may occur due to the aforementioned change in the RWTH-HPC category on our status reporting portal.
FastX Server Component Upgraded to Version 3.3.39
The FastX server component installed on the HPC frontend nodes was upgraded to version 3.3.39.
The update contains security enhancements and several bug fixes that benefit all users of FastX.
Please make sure to use the latest desktop client when accessing the cluster via FastX.
For more information on how to access the RWTH Aachen Compute Cluster via FastX, please refer to the ITC Help Page.
You can track any disruptions or security advisories that may occur due to the aforementioned change in the Email category on our [status reporting](https://maintenance.rz.rwth-aachen.de/ticket/status/messages/14-rechner-cluster) portal.