DISCOVERY Archives

DISCOVERY@LISTSERV.NEU.EDU


DISCOVERY July 2014

Subject: Re: Connection to MGHPCC is down - Update 5
From: Nilay Roy <[log in to unmask]>
Reply-To: Discovery Cluster <[log in to unmask]>
Date: Fri, 11 Jul 2014 14:37:58 -0400
Content-Type: text/plain

Dear Discovery Cluster Users,

While we wait for the connection to be restored by the projected ETR of 18:00 EDT today, we have an alternative route to MGHPCC. This has been established for emergency use only. This will enable users to move any critical files they need to their local drives.

I have extensively checked the following using this connection:

1) All running jobs are completing.
2) All storage is working OK and there is no data loss or corruption. All nodes (administrative, login, and compute) are up and running.
3) There is NO connection to our Windows Active Directory (AD) servers, so we cannot authenticate users. For this reason I have closed the LSF queues and removed all pending jobs. There is a danger of corrupting storage and user configurations if we try to run the cluster in this state. As soon as the connection is restored, the queues will be reopened. Running jobs are not affected, as authentication information is cached for these jobs.

This connection is very slow and will not allow more than 8 simultaneous users. To give each user access to their files, I will have to set up a temporary account on the login node "discovery4.neu.edu". Users will first log in to the temporary location and then, using the temporary account, log in to the login node discovery4.neu.edu. From there they can access their files in /home or /scratch and "rsync" (preferred) or "sftp" them to their local machine. They will need the user account name, password, and IP address or fully qualified domain name of the local machine to which they want to move their critical files.
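
As a rough illustration of the "rsync" step, here is a minimal Python sketch that only assembles the command line; it does not run anything. It assumes the transfer is started on the login node and pushes to the local machine over ssh. The user name, host name, and directories are hypothetical placeholders, and the flags are ordinary rsync options chosen with a slow link in mind, not settings prescribed for this connection.

```python
import shlex


def build_rsync_command(source_dir, local_user, local_host, dest_dir):
    """Return an rsync argument list that pushes source_dir to another machine.

    All four arguments are placeholders supplied by the caller; nothing here
    is specific to the Discovery Cluster.
    """
    # A trailing slash tells rsync to copy the directory's contents
    # rather than the directory itself.
    source = source_dir.rstrip("/") + "/"
    target = f"{local_user}@{local_host}:{dest_dir}"
    return [
        "rsync",
        "-a",          # archive mode: keep permissions, times, symlinks
        "-z",          # compress data in transit (helps on a slow link)
        "--partial",   # keep partial files so an interrupted copy can resume
        "-e", "ssh",   # use ssh as the transport
        source,
        target,
    ]


if __name__ == "__main__":
    # Hypothetical example values; replace with your own account and machine.
    cmd = build_rsync_command(
        "/home/jdoe/critical", "jdoe", "mylaptop.example.edu", "/Users/jdoe/backup"
    )
    # Print a shell-safe command line to review before running it by hand.
    print(shlex.join(cmd))
```

Printing the command instead of executing it lets you check the source and destination before using one of the limited connection slots.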

Please email me if you require emergency access to the cluster and I will set up a local temporary account on "discovery4.neu.edu" and link it to your regular account on the Discovery Cluster so that you can access your files and transfer them to your local machine.

There is no X11 forwarding on this connection, so you will not be able to use any GUIs or windows.

I will do this on a first-come, first-served basis.

We apologize for the disruption and inconvenience this has caused our users, and we are doing everything in our capacity to provide alternatives until the issue is resolved.

Thank you for your patience in this matter.

Best
Nilay

-----Original Message-----
From: Nilay Roy [mailto:[log in to unmask]] 
Sent: Friday, July 11, 2014 1:28 PM
To: Roy, Nilay
Subject: RE: Connection to MGHPCC is down - Update 4

Dear Discovery Cluster Users,

Networking services has provided us with another update:

Update 4: "Type II advised splicing of the 2 (288) count cable are 90% complete. At this time RCN is preparing a workaround for the 432 using spare fibers on one of the existing 288's. NCC checking the systems to verify the state of impacted services. Updates will continue accordingly. ETTR: 18:00 EDT"

We are also working with Networking to deploy and test an alternate route to MGHPCC for users who need urgent access to the cluster before connectivity is restored, which is expected by 18:00 EDT today (the current ETTR).

We will continue to update you as we receive more information.

Thank you for your patience in this matter.

Best
Nilay
========================================================================== 
Nilay Roy, PhD Computational Physics, MS Computer Science
Assistant Director - Research Computing, Information Technology Services
Northeastern University, 221-177, 360 Huntington Avenue, Boston, MA 02115
Email: [log in to unmask] (C) 508.226.2261 (Preferred) / (O) 617.373.6048
Northeastern Research Computing Website: http://www.northeastern.edu/rc 
========================================================================== 

Subject: RE: Connection to MGHPCC is down - Update 3

Dear Discovery Cluster Users,
Networking services has provided us with another update:
Update 3: "We still have no ETR.  There are 2 - 88-strands and one 492 strand cable that are cut.  It sounds like the duct work that the cable goes through is badly damaged in multiple locations."
I believe there is redundancy in the 10G trunks to MGHPCC in Holyoke, MA, where our Discovery Cluster is located along with HPC clusters from Harvard University, MIT, BU, and UMass and equipment from the Commonwealth of Massachusetts.
This time, however, the fiber cables are extensively damaged because the ductwork is damaged in multiple places and flooded.
Many other users who rely on this fiber for connectivity, not only to MGHPCC in Holyoke, MA, but also to other data centres, are affected as well. Please bear with us.
We will continue to update you as we receive more information.
Thank you for your patience in this matter.
Best
Nilay
=================================================
Nilay Roy, PhD Computational Physics, MS Computer Science
Assistant Director - Research Computing, Information Technology Services
Northeastern University, 221-177, 360 Huntington Avenue, Boston, MA 02115
Email: [log in to unmask] Tel: 508.226.2261 (Preferred) / 617.373.6048
=================================================

Subject: RE: Connection to MGHPCC is down - Update 2 
 
Dear Discovery Cluster Users,
Networking services has provided us with another update:
Update 2: "The vendor has located the fiber cuts. There are multiple cuts where I beams were driven through the fiber. Also the manholes that need to be used are flooded. There is no eta at this time."
We will continue to update you as we receive more information.
Best
Nilay

Subject: RE: Connection to MGHPCC is down - Update. 
 
Dear Discovery Cluster Users,
Networking services has provided us with an update:
Update 1: “Our vendor has confirmed that this is a fiber cut in South Boston. There is no ETR at this time. RCN ticket# RT-22865 888-972-6622”
We will continue to update you as we receive more information.
Best
Nilay
 
Subject: Connection to MGHPCC from Campus is down.
 
Dear Discovery Cluster Users,
Networking services confirmed that the connection to MGHPCC in Holyoke, MA, where the Discovery Cluster is located, is down, so all existing connections have been terminated. Jobs will continue to run, but you will not be able to log in until connectivity is restored. If you had a live session that was terminated, you will have lost your work. No new connections to the cluster can be made at this time.
Networking is working to ensure that the issue is resolved as soon as possible.
We will update you as soon as we have more information on this.
Best
Nilay