Hello Discovery users,
The annual shutdown at MGHPCC that would normally occur in June was rescheduled for a date later in the year due to concerns around COVID-19. We will send out further information as details and a date are worked out for this new maintenance window.
We will be using this time instead to perform monthly maintenance on Monday, June 1 from 8 a.m. to 5 p.m. The cluster will be unavailable for jobs during this time. In addition, the General Parallel File System (GPFS) will be undergoing an upgrade, which will take up to two days. The /scratch file system will be unavailable for that time period. The upgrade will not affect your /home directory or any other storage system, such as /work, if you have a directory on that system. You can still access those directories after the maintenance period is completed.
The June maintenance tasks will include the following:
* Upgrading CUDA to 10.2
* Upgrading GPFS
* Completing the InfiniBand (IB) network cabling on GPFS
To ensure that your jobs account for this shutdown period, we have a script ( t2sd ) whose output you should pass to the --time option when you submit jobs. Setting --time this way overrides any time parameters you have in your sbatch script with the time remaining before the cluster is unavailable.
For example:
srun --time=$( t2sd ) <your usual srun options>
sbatch --time=$( t2sd ) script.sbatch
NOTE: If you usually run your jobs on a partition with short time limits, such as debug or express, you don’t need to use ( t2sd ) until it is closer to the start of the maintenance window. You only need to use ( t2sd ) if the time left before the start of the maintenance period is less than the default time of the partition. For example, the default time of the express partition is 60 minutes. If you wanted to run a job on the express partition at 5 a.m. on June 1, you would not need to use ( t2sd ), because three hours remain before the 8 a.m. start. However, if you wanted to run a job at 7:30 a.m. on June 1 on the express partition, you would need to use ( t2sd ), because only 30 minutes remain.
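The rule in the note above can be sketched as a small shell function. This is only an illustration: needs_t2sd is a hypothetical helper, not part of t2sd or the cluster, and it assumes the job is submitted on June 1 before the 8 a.m. start of maintenance, with the submit time given as HH:MM in 24-hour form.

```shell
# Hypothetical helper (not part of t2sd): report whether --time=$( t2sd )
# is needed for a job submitted on the morning of the maintenance day.
# Arguments: submit time as HH:MM (24-hour), partition default time in minutes.
needs_t2sd() {
    nt_hh="${1%%:*}"          # hours part of the submit time
    nt_mm="${1##*:}"          # minutes part of the submit time
    # Strip one leading zero so values such as "08" are not read as octal.
    nt_hh="${nt_hh#0}"; nt_hh="${nt_hh:-0}"
    nt_mm="${nt_mm#0}"; nt_mm="${nt_mm:-0}"
    nt_start=$((8 * 60))      # maintenance starts at 8 a.m.
    nt_remaining=$((nt_start - (nt_hh * 60 + nt_mm)))
    # t2sd is needed only when less time remains than the partition default.
    if [ "$nt_remaining" -lt "$2" ]; then
        echo "yes"
    else
        echo "no"
    fi
}

needs_t2sd 05:00 60   # prints "no": 180 minutes remain
needs_t2sd 07:30 60   # prints "yes": 30 minutes remain
```

The two calls mirror the express partition examples above, using its 60-minute default time.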
If you have any questions or issues, please contact us at [log in to unmask], visit our website rc.northeastern.edu, or view our documentation at rc-docs.northeastern.edu.
The Research Computing Team
#### IMPORTANT ####
You are receiving this email because you have an account on the Discovery cluster. Membership on this list is mandatory for all Discovery account holders, as it is our primary way of communicating important updates and announcements to you about Discovery. If you no longer need a Discovery account and want to be unsubscribed from this list as well, you need to submit an unsubscribe from mailing list request in Service Now.