Following the discovery of an intermittent but serious network issue, Gandi teams have determined that rolling maintenance will be necessary.
We regret that, while most of the affected systems will simply require migration with no interruption in service, some will almost certainly require a restart. We will endeavor to keep these interruptions as short as possible and to perform them only when absolutely necessary.
We are starting with dc0 (Paris) this week and will proceed to dc2 (Luxembourg) on Monday, June 9.
The issue has not been detected on dc1 (Baltimore) at the moment, but we will fix it there as well if necessary. We apologize for any inconvenience this may cause.
Timeline of the incidents so far:
* 08:25 UTC: A switch failure makes 12 hosting nodes inaccessible; ~200 virtual machines (VMs) become unreachable.
* 08:40 UTC: The switches are recovered and the VMs are accessible again. Investigation does not reveal the cause of the incident.
* 12:01 UTC: A second incident occurs, affecting 8 nodes and ~180 VMs.
* 12:09 UTC: The switches are recovered and the VMs are available again. Additional data collection measures are put in place to help determine the cause.
* 14:56 UTC: A third incident occurs, affecting 10 nodes and 321 VMs.
* 15:10 UTC: Nodes and VMs are available again. This time, extensive forensic data was collected; we expect to identify the root cause and implement a permanent fix as soon as possible.
We apologize again for the inconvenience this issue may have caused.