
IT Center Changes

Category: ‘RWTH-HPC’

CLAIX System Maintenance on 2023-11-27

November 17th, 2023 | by

Dear users of the RWTH compute cluster,

On 2023-11-27, the entire cluster will be unavailable from 8 a.m. to 12 p.m. (noon) due to system maintenance.

Kind regards,
Your HPC team


You can track any disruptions or security advisories that may occur due to the aforementioned change in the RWTH-HPC category on our status reporting portal.

End of Apptainer Pilot Phase

October 11th, 2023 | by

We are happy to announce that, after a long pilot phase, we are granting all users full access to use Apptainer containers on the cluster. Containers are virtual environments that allow running an identical software configuration across several systems, e.g., two different HPC systems, and simplify the setup of software that only runs well on other Linux distributions. Apptainer also supports the conversion of Docker images and can thus run a vast variety of existing images with little to no extra effort.

Previously, we only allowed curated container images as part of our software stack and individual images on a per-case basis. Starting today, users can build and run their own container images anywhere on the cluster.

If you are interested in using Apptainer, please take a look at our documentation[1] and read the “Best Practices” section to get started and avoid common problems. As part of our efforts to support containerized workloads in HPC, we will also grow our collection of container images in the module system and provide a set of CLAIX-specific base images for various scenarios that can be used as a foundation for your own container images.
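As a starting point, the sketch below shows how an existing Docker image can be pulled, converted to the SIF format, and run with Apptainer; the image and file names are only examples and not part of our curated collection.

```bash
# Pull a Docker image from Docker Hub and convert it to a SIF file
# (the image "ubuntu:22.04" is only an example).
apptainer pull ubuntu_22.04.sif docker://ubuntu:22.04

# Run a command inside the container; your home directory is bind-mounted by default.
apptainer exec ubuntu_22.04.sif cat /etc/os-release

# Build a custom image from your own definition file.
apptainer build my_image.sif my_image.def
```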

Kind regards,
Your HPC Team

[1] https://help.itc.rwth-aachen.de/service/rhr4fjjutttf/article/e6f146d0d9c04d35aeb98da8d261e38b/

CLAIX-2018 dialog systems

September 7th, 2023 | by

Due to the high load on the login/dialog nodes affecting their usability, we have decided to limit each user to a maximum of four cores on each login node. Please note: these login nodes are intended for programming, preparation, and light post-processing of batch jobs. They are not intended for production runs or performance tests. For longer tests (max. 25 minutes), parallel debugging, compiling, etc., you can use our “devel” partition, either by adding “#SBATCH --partition=devel” to your batch jobs or interactively with “salloc -p devel”.

For all production jobs, please use our batch system **without** “#SBATCH --partition=devel”. If you want to learn more about the batch system, we invite you to our Slurm introduction.
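For reference, a minimal batch script for a short test on the “devel” partition could look like the sketch below; the shell, job name, and resource values are illustrative only and should be adapted to your test case.

```bash
#!/usr/bin/env bash
#SBATCH --job-name=devel_test
#SBATCH --partition=devel     # short tests only (max. 25 minutes)
#SBATCH --time=00:20:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4

# Replace with your own test program
./my_test_program
```

The same resources can be requested interactively, e.g. with “salloc -p devel -n 1 -c 4 -t 00:20:00”.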

 


You can track any disruptions or security advisories that may occur due to the aforementioned change in the RWTH-HPC category on our status reporting portal.

FastX Server Component Upgraded to Version 3.3.39

September 7th, 2023 | by

The FastX server component installed on the HPC frontend nodes was upgraded to version 3.3.39.
The update contains security enhancements and several bugfixes from which all users benefit when using FastX.

Please make sure to use the latest desktop client when accessing the cluster via FastX.

For more information on how to access the RWTH Aachen compute cluster via FastX, please refer to the ITC Help Page.


You can track any disruptions or security advisories that may occur due to the aforementioned change in the RWTH-HPC category on our [status reporting](https://maintenance.rz.rwth-aachen.de/ticket/status/messages/14-rechner-cluster) portal.

ARMForge becomes LinaroForge

August 31st, 2023 | by

With Linaro’s acquisition of the Forge toolset from ARM, the popular parallel debugger DDT and the performance analysis tool Performance Reports are now available as Linaro Forge.

The newest version of the toolset is available on CLAIX via module load LinaroForge/23.0.2.
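As an illustration (the program name, process count, and MPI launcher are placeholders for your own setup), the tools can be used along these lines:

```bash
module load LinaroForge/23.0.2

# Debug an MPI program with DDT, reverse-connecting to a Forge GUI started on a login node
ddt --connect mpirun -n 4 ./my_app

# Generate a performance summary with Performance Reports
perf-report mpirun -n 4 ./my_app
```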


You can track any disruptions or security advisories that may occur due to the aforementioned change in the RWTH-HPC category on our status reporting portal.

HPC Cluster: Linux Kernel Upgrade

August 25th, 2023 | by

The Linux kernel on the CLAIX-2018 compute nodes is being upgraded to version 4.18.0-477.21.1. To maximise the availability of the compute cluster, the mandatory reboot of the nodes is scheduled as a reboot job, allowing all already submitted and running jobs to complete before the upgrade takes place.

Please note that the reboot is prioritised over other jobs, and some nodes may be temporarily unavailable after the reboot.
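If you want to verify which kernel a node is running, e.g. on a login node or at the beginning of a job script, a simple check is:

```bash
# Print the running kernel version; after the upgrade it should start with 4.18.0-477.21.1
uname -r
```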

Best regards,
Your HPC-Team@RWTH


You can track any disruptions or security advisories that may occur due to the aforementioned change in the RWTH-HPC category on our status reporting portal.

Change in SSH Configuration: Deprecation of Insecure Methods, Addition of New Methods

August 4th, 2023 | by

As a result of a recent security evaluation, we have disabled several key exchange methods, message authentication codes, and encryption ciphers that are classified as insecure or weak; the affected methods and method groups are listed in the full announcement. In general, we have disabled SHA-1-based methods, since SHA-1 has been considered broken since early 2017 (cf. Stevens et al.: “The first collision for Full SHA-1”).

We kindly ask you to update your client configuration accordingly, since these methods can no longer be used to access the RWTH Aachen HPC Cluster until further notice.
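Since the exact list of accepted methods is given in the full announcement rather than in this excerpt, the following ~/.ssh/config snippet only illustrates how modern, SHA-2/Curve25519-based methods can be pinned for the cluster hosts; the algorithm selection here is an assumption, so please check the announcement for the authoritative lists.

```
# Illustrative only -- see the full announcement for the methods actually accepted.
Host login18-*.hpc.itc.rwth-aachen.de
    KexAlgorithms curve25519-sha256,curve25519-sha256@libssh.org,diffie-hellman-group16-sha512
    MACs hmac-sha2-256-etm@openssh.com,hmac-sha2-512-etm@openssh.com
    Ciphers aes256-gcm@openssh.com,aes128-gcm@openssh.com,aes256-ctr
```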

Resource limits on HPC dialog systems changed

July 31st, 2023 | by

We have reduced the per-user resource limits for main memory on the HPC dialog systems (login18-1.hpc.itc.rwth-aachen.de etc.). A single user can now only use about 25% of the available main memory, i.e., 96 GB on most of our servers. On login18-x-1 and login18-x-2, as before, only 16 GB are available per user.

 

Changes to Abaqus Batch Jobs

July 14th, 2023 | by

Today, we have made several major changes to the Abaqus configuration on the RWTH-HPC systems. These changes include automatic configuration of many batch job parameters that previously had to be generated on the fly as part of the example batch script described in our Abaqus documentation.
With this new setup, users should experience fewer problems when starting Abaqus batch jobs and receive descriptive error messages, including suggested solutions, when something goes wrong during configuration. These changes only affect Abaqus 2020 and newer.

Do I need to change my batch scripts?

No. The previous example batch scripts still work. However, you may omit the part that generates a local abaqus_v6.env file. If such a file exists in the job’s working directory, its settings will override the system settings, but as of now these are identical in most cases.
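As a rough sketch under these assumptions (the module name, version, input file, and resource values are placeholders; check “module avail” for the actual module names), a batch script without the abaqus_v6.env generation could look like this:

```bash
#!/usr/bin/env bash
#SBATCH --job-name=abaqus_example
#SBATCH --ntasks=12
#SBATCH --time=02:00:00

# Module name and version are placeholders; check "module avail" on the cluster.
module load abaqus/2022

# No abaqus_v6.env is generated here; the system-wide configuration is used.
# The number of CPUs is passed explicitly from the Slurm allocation.
abaqus job=my_job input=my_model.inp cpus=${SLURM_NTASKS} mp_mode=mpi interactive
```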

What to do if I experience problems?

If you have trouble running the Abaqus GUI on a frontend node, please make sure to delete the abaqus_v6.env file from the working directory. Alternatively, you can try starting the GUI from another directory. If you still experience problems or if your batch jobs behave in an unexpected fashion, please report the issue to servicedesk@itc.rwth-aachen.de.

OS Upgrade to Rocky 8.8

July 11th, 2023 | by

Dear users of the cluster,

on

**July 17, 2023, from 7:00 a.m. to 5:00 p.m.**

there will be a maintenance window during which we will update the operating system from Rocky Linux 8.7 to Rocky Linux 8.8. The frontends will also be updated, so you will not be able to log into the cluster or access your data.

However, there is one exception: the MFA test system login18-4 will remain accessible, but you will only be able to log in there with a second factor [1]. $HPCWORK will temporarily be unreachable there as well, since the Lustre file system is also undergoing maintenance.

We do not expect that you will have to recompile your software or change your job scripts, so your jobs should start normally after the maintenance ends.

 

With best regards
Your HPC Team @ RWTH