Category: ‘Allgemein’ (General)
HPC Cluster: Linux Kernel Upgrade
The Linux kernel on the CLAIX18 compute nodes is being upgraded to kernel version 4.18.0-477.21.1. To maximise the availability of the compute cluster, the mandatory reboot of the nodes is scheduled as a reboot job, which allows all jobs that are already submitted or running to complete before the upgrade takes place.
Please note that the reboot is prioritised over other jobs, and some nodes may be temporarily unavailable after the reboot.
Best regards,
Your HPC-Team@RWTH
You can track any disruptions or security advisories that may occur due to the aforementioned change in the Email category on our status reporting portal.
Change in SSH Configuration: Deprecation of Insecure Methods, Addition of New Methods
As a result of a recent security evaluation, we have decided to disable several key exchange methods, message authentication codes and encryption ciphers that are classified as insecure or weak. This makes the methods and method groups listed below obsolete. In general, we have disabled SHA-1-based methods, since SHA-1 has been broken since early 2017 (cf. Stevens et al.: “The first collision for Full SHA-1”).
We kindly ask you to update your client configuration accordingly, since these methods can no longer be used to access the RWTH Aachen HPC Cluster until further notice.
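To get a first impression of which methods in a client configuration might be affected, the following minimal Python sketch flags algorithm names that mention SHA-1. It is a name-based heuristic only and does not reproduce the list of disabled methods from the announcement; the function name `sha1_based` and the sample list are illustrative assumptions.

```python
# Hypothetical helper: flags algorithm names that mention SHA-1.
# This is a heuristic on the name only; it is NOT the official list of
# methods disabled on the cluster.

def sha1_based(names):
    """Return the names that contain "sha1" (case-insensitive)."""
    return [name for name in names if "sha1" in name.lower()]


# Sample input (an assumption for illustration): a mix of key exchange,
# MAC and cipher names as an OpenSSH client might list them.
SAMPLE_METHODS = [
    "curve25519-sha256",
    "diffie-hellman-group14-sha1",
    "diffie-hellman-group-exchange-sha256",
    "hmac-sha1",
    "hmac-sha2-256",
    "aes256-ctr",
]

if __name__ == "__main__":
    for name in sha1_based(SAMPLE_METHODS):
        print("SHA-1-based, likely affected:", name)
```

The names supported by a local OpenSSH client can be listed with `ssh -Q kex`, `ssh -Q mac` and `ssh -Q cipher` and passed to the helper instead of the sample list. Methods that rely on SHA-1 without naming it are not caught by this check.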
Resource limits on HPC dialog systems changed
We have reduced the per-user resource limits for main memory on the HPC dialog systems (login18-1.hpc.itc.rwth-aachen.de etc.). A single user can now use only about 25% of the available main memory, i.e. 96 GB on most of our servers. On login18-x-1 and login18-x-2, as before, only 16 GB are available to each user.
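The relation between the 25% share and the 96 GB limit can be checked with a short calculation. The sketch below is only an illustration: the function names are hypothetical, and the total of 384 GB is inferred from the two figures above rather than stated in the announcement (which says "about 25%").

```python
# Illustrative arithmetic only; function names are hypothetical.

def per_user_limit_gb(total_memory_gb, share=0.25):
    """Memory one user may use if limited to `share` of the total."""
    return total_memory_gb * share


def implied_total_gb(limit_gb, share=0.25):
    """Total memory implied by a per-user limit and a share."""
    return limit_gb / share


if __name__ == "__main__":
    # 96 GB at a 25% share implies roughly 384 GB of main memory
    # (an inference, not a figure taken from the announcement).
    total = implied_total_gb(96)
    print(f"implied total: {total:.0f} GB")
    print(f"per-user limit: {per_user_limit_gb(total):.0f} GB")
```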
OS Upgrade to Rocky 8.8
Dear users of the cluster,
on **July 17, 2023, from 7:00 a.m. to 5:00 p.m.**, there will be a maintenance window during which we will update the current operating system from Rocky Linux 8.7 to Rocky Linux 8.8. The front ends will also be updated, so you will not be able to log into the cluster or access your data.
There is one exception: the MFA test machine login18-4 will remain accessible, but you will only be able to log in there with a second factor [1]. However, $HPCWORK will temporarily be unreachable there as well, because the Lustre file system is also undergoing maintenance.
We do not expect that you will have to recompile your software or change your job scripts. So your jobs should start normally after the end of the maintenance.
With best regards
Your HPC Team @ RWTH
CLAIX-2016 EOL
CLAIX-2016 reached its end of life some time ago. For convenience, we still operate the following systems:
- CLAIX-2016 dialog (“login”) nodes:
login.hpc.itc.rwth-aachen.de
login-g.hpc.itc.rwth-aachen.de
login-t.hpc.itc.rwth-aachen.de
- Data Transfer node:
copy.hpc.itc.rwth-aachen.de
- CLAIX-2016-SMP nodes (144 cores, 2 TB main memory):
lns02.hpc.itc.rwth-aachen.de
lns03.hpc.itc.rwth-aachen.de
We will switch off all remaining nodes on **July 10th, 2023**. Please use the CLAIX-2018 login / transfer nodes in the future:
New Terms of Use and Data Privacy Agreement
We updated our data privacy agreement and the terms of use for the service “RWTH High Performance Computing”:
One major change is that we have two separate data privacy agreements now:
- One for the RWTH Compute Cluster and the RWTH JARDS online portal.
- One for the NHR JARDS online portal.
This change is necessary because the NHR JARDS online portal will be used by all national HPC centers (NHR) in the future.
We believe all documents are in the interest of our users and enable a fair, productive and secure usage of our HPC resources. Thus, your consent is assumed. Otherwise, you can delete your HPC account at any time:
The changes come into force on June 1st, 2023.
If you have any questions or problems, the colleagues at the IT-ServiceDesk (servicedesk@itc.rwth-aachen.de) will be happy to help you.
You can track any disruptions or security advisories that may occur due to the aforementioned change in the RWTH-HPC category on our status reporting portal.
EOL CentOS Software Environment
As announced, the old CentOS software stack will reach its end of life on April 30th. Beginning with May 2nd, this means the following:
- No submission to CentOS nodes will be possible anymore.
- All CLAIX-2018 login / dialog nodes will be migrated to Rocky 8 Linux, lmod and the new software stack.
- Jobs submitted to CentOS nodes before May 1st will be scheduled to the remaining CentOS batch nodes on a best-effort basis, with no guarantee that they start or complete within the remaining lifetime of those nodes. We strongly recommend submitting all new jobs to the new Rocky 8 environment from now on.
Please find the overview of changes with Rocky Linux 8 here.
EOL login2.hpc.itc.RWTH-Aachen.de
Due to a hardware failure, the dialog system login2.hpc.itc.rwth-aachen.de is no longer usable. Please use the dialog systems from CLAIX18 (Login Nodes) in the future.
Increase maximum compute quota RWTH-S
Starting on April 18th, 2023, the maximum annual CPU quota for RWTH Small (RWTH-S) projects will be increased from 0.24 million core-h to 0.36 million core-h. Find further information about the application process on our website.
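To put the new quota into perspective, the following small Python sketch computes the relative increase and converts core-hours into node-hours. The function names are hypothetical, and the figure of 48 cores per node is an assumption made for illustration; it is not taken from this announcement.

```python
# Illustrative arithmetic only; names and CORES_PER_NODE are assumptions.

OLD_QUOTA_CORE_H = 240_000   # 0.24 million core-h
NEW_QUOTA_CORE_H = 360_000   # 0.36 million core-h
CORES_PER_NODE = 48          # assumed node size, not from the announcement


def relative_increase(old, new):
    """Fractional increase from old to new (0.5 means +50%)."""
    return (new - old) / old


def node_hours(core_hours, cores_per_node):
    """Hours on one fully used node that a core-hour budget buys."""
    return core_hours / cores_per_node


if __name__ == "__main__":
    inc = relative_increase(OLD_QUOTA_CORE_H, NEW_QUOTA_CORE_H)
    print(f"increase: {inc:.0%}")
    print(f"node-hours at {CORES_PER_NODE} cores/node: "
          f"{node_hours(NEW_QUOTA_CORE_H, CORES_PER_NODE):.0f}")
```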
New software environment and operating system for CLAIX
Since CentOS 7, the Linux operating system of CLAIX, is outdated, and since CentOS Linux 8 has reached End Of Life (EOL), we have to move to a new Linux distribution. We have selected Rocky Linux 8.7, which is also Red Hat Enterprise Linux compatible.
Going hand in hand with this change, we also prepared the shift to a new software environment based on EasyBuild and Lmod. These tools are also widely used at other HPC centers. We expect them to improve the user experience and the maintainability from the administrator’s perspective.
However, for you as a user, both changes require that you
(1.) learn how to use the new module system (e.g., changed module names, toolchains, etc.),
(2.) recompile your software, and
(3.) revise and, where necessary, modify your batch scripts accordingly.
Understanding that you need time to prepare for these changes, we will offer support for the old (CentOS 7) and the new (Rocky Linux 8) environment in parallel for a transition period of approx. 7 weeks, ending on April 30th, 2023. During this period, more and more compute nodes will be migrated to the new environment. Consequently, the waiting times for jobs scheduled to run in the old environment might increase over time.
How to proceed?
During a maintenance on March 8th, 2023 the following login nodes were migrated to the new environment:
- login18-2.hpc.itc.rwth-aachen.de
- login18-3.hpc.itc.rwth-aachen.de
- login18-x-2.hpc.itc.rwth-aachen.de
- login18-g-2.hpc.itc.rwth-aachen.de
- copy18-2.hpc.itc.rwth-aachen.de
Jobs submitted from these login nodes will automatically be submitted to compute nodes running Rocky 8 and the new module environment. Any batch job submitted from non-migrated login nodes (i.e., not in the list above) will be scheduled to the old environment.
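The routing rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated for the transition period, not the scheduler's actual implementation; the function name `target_environment` and the returned labels are hypothetical.

```python
# Hypothetical sketch of the submission rule during the transition period.
# The host list is copied from the announcement; everything else is assumed.

MIGRATED_LOGIN_NODES = {
    "login18-2.hpc.itc.rwth-aachen.de",
    "login18-3.hpc.itc.rwth-aachen.de",
    "login18-x-2.hpc.itc.rwth-aachen.de",
    "login18-g-2.hpc.itc.rwth-aachen.de",
    "copy18-2.hpc.itc.rwth-aachen.de",
}


def target_environment(login_host):
    """Return the environment a job submitted from login_host would run in."""
    host = login_host.strip().lower()
    if host in MIGRATED_LOGIN_NODES:
        return "Rocky Linux 8"   # new module environment
    return "CentOS 7"            # old environment


if __name__ == "__main__":
    for host in ("login18-2.hpc.itc.rwth-aachen.de",
                 "login18-1.hpc.itc.rwth-aachen.de"):
        print(host, "->", target_environment(host))
```

On a login node, the fully qualified host name could be obtained with `socket.getfqdn()` and passed to the function, assuming it returns the name in the form listed above.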
You can use these login nodes to test your new workflows, make modifications and become familiar with the new software stack. For an example of how to find software easily with “module spider” in the new module system, please refer here.
Please note: If you would like to use a graphical remote desktop session, ensure you are using FastX 3 (newer version).
We provide an overview of the changes that come with the transition, and a separate branch here about the new module system (including example scripts for different software packages).