There is currently an issue at our Paris datacenter affecting some IaaS/Cloudserver and PaaS/SimpleHosting physical machines.

Our technical team is on site at the datacenter and is analyzing the issue.

More information will be posted in this article.

 

EDIT: All hosting operations are currently stopped.

EDIT: The issue appears to have originated in a network device. The physical nodes and the VMs are coming back up. We are monitoring the impacted VMs to make sure that they respond within the next few minutes.

EDIT: The issue has been resolved. Hosting operations are back up.


Tuesday June 19th 2012 at 9:30 GMT: our mail platform is currently experiencing a slowdown. Our technical team is on location and performing diagnostics to determine the reason for this incident. Affected services are email reception, Gandi's webmail, as well as the customer support ticket processing platform.

Please accept our apologies for the inconvenience. We will keep you informed of developments below as we learn more about the incident.

 

Update 9:50 GMT: Our teams have fixed the problem, and the service is back to normal.

 

Update Wednesday June 20 at 14:15 GMT: another machine has experienced the same problem as yesterday. We rebooted it as a preventative measure; however, it is currently undergoing a filesystem repair. The service on this machine should return to normal in about 45 minutes.

 

Wednesday June 20th 2012 15:29 GMT: The unit has returned to normal operating conditions.


We are going to perform some database maintenance on the night of May 23rd to 24th, from 00:00 CEST to 03:00 CEST (GMT+2).

During this time, the website and the API will be offline, so you will not be able to manage your Gandi services through them.

Services will continue to function normally.



We are currently experiencing issues with webmail authentication. Our teams are working to resolve the problem as soon as possible. This issue does not affect the reception of your emails, only your ability to access them.

 

We apologise for the inconvenience.

 

Update 29/02/2012, 13:30 GMT: The issue has been stable since late yesterday afternoon. We are nevertheless working on the platform to prevent this issue from recurring.


To correct problems we have identified in our storage systems, we need to perform corrective maintenance procedures tonight between 23:30 and 3:30 CET. This will impact some of your server systems, making certain data volumes unavailable for a period of 15 to 20 minutes.
We recommend that you do not restart your servers during this period. All services should return to normal as soon as the maintenance window ends.
If you will be affected by this maintenance, you should have received an email from us advising you of the issues and timing. Please see this page for the contents of that message.
We will update this post to keep you informed of our progress.

Edit 23:30 CET: operation started

Edit 00:32 CET: first reboot done; the equipment is behaving as expected, so we are proceeding with the next steps

Edit 01:00 CET: most reboots are now ongoing, our first upgrades having been successful

Edit 01:30 CET: most of our storage units have been upgraded. If your server has recovered from I/O stalls, this issue is fixed for you; otherwise, the upgrade will be finished within the next hour

Edit 01:34 CET: a compute node crashed during this operation; we are starting the affected virtual servers on another machine right now.

Edit 02:30: maintenance is finished. Thank you for your patience during this operation.


A piece of storage equipment is failing, likely due to defective components. Our teams are currently working to restore service as soon as possible. If you are impacted, we recommend that you do not restart your server. We will keep you informed of our progress on this incident in this article.

[02:28 CET] Recovery successful. Components repaired and storage unit now nominal.


We have a temporary emergency halt on the hosting storage system (filers).  We recommend that you do NOT attempt to restart your server.  The impacted servers should recover in the next few minutes.  We will update you with further information as soon as possible.

 

[edit 00:00] The services are fully restored as of 21:20 CET. Most users were back to nominal function before 19:30, but some took longer to start. Identifiable blocked systems were managed and restarted manually. Please restart your services if they are still unavailable at this time, and contact support if your server is not available and cannot be restarted.


A storage unit is currently experiencing a slowdown. Our teams are working on a solution.

Update (09:45 GMT): The situation improved between 07:00 and 08:00 GMT. There were significant slowdowns between 05:00 and 06:50 GMT.

 

Update (January 25th 09:00 GMT): A storage unit is currently experiencing a slowdown. The incident is similar to yesterday's. Our technical team is working on solving the issue.

 

Update (January 25th 10:00 GMT): The I/O situation improved. Our technical team is still working to find a complete fix to the issue.

 

Update (January 25th 10:22 GMT): A storage unit is currently experiencing a slowdown. The incident is similar to the one this morning. Our technical team is working on solving the issue.

 

Update (January 26th 11:26 GMT): The I/O situation improved. Our technical team is still working to find a complete fix to the issue.

 

Update (January 27th 19:11 GMT): A storage unit is currently experiencing a slowdown. The incident is similar to those earlier this week. Our technical team is working on solving the issue.

 

Update (January 27th 22:00 GMT): The I/O situation is now stabilized. Our technical team is still working to find a complete fix to the issue.

 

Update (February 2nd 03:30 GMT): Another incident is affecting one of our storage units. We are now rebooting the faulty equipment. We recently identified a few corrective actions that we will soon be able to take in order to solve this kind of issue.

 

Update (February 2nd 20:19 GMT): Another incident occurred and a slowdown was observed; however, the situation is now stable.

 

Update (February 6th 02:09 GMT): Slowdown on one of our storage units. Teams working on it.

 

Two storage units are affected by these incidents, which are isolated slowdowns in read/write operations. We suspect that the problem is two-fold: a software problem (blocking of operations), and a hardware problem (some disk models are unusually slow).

When these slowdowns occur, the iSCSI implementation that connects your servers to their disks may malfunction. The result is an "I/O wait" that remains artificially high (100%) even after the storage has become fast again.

We are currently working on these three problems, giving priority to our system's ability to re-establish service after a slowdown.
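
For readers who want to check the "I/O wait" figure described above on their own server, here is a minimal sketch in Python. It assumes a Linux server that exposes the standard aggregate "cpu" line in /proc/stat; the function name iowait_percent and the sample values are purely illustrative and are not part of any Gandi tooling.

```python
def iowait_percent(before: str, after: str) -> float:
    """Return the share of CPU time spent in I/O wait between two samples.

    Each argument is the aggregate "cpu ..." line read from /proc/stat
    (assumed Linux layout: user nice system idle iowait irq softirq steal ...).
    """
    def parse(line: str) -> list[int]:
        fields = line.split()
        if len(fields) < 6 or fields[0] != "cpu":
            raise ValueError("expected the aggregate 'cpu' line from /proc/stat")
        # Keep at most the first eight counters; guest time is already
        # included in the user counter, so it is not added again.
        return [int(value) for value in fields[1:9]]

    first, second = parse(before), parse(after)
    deltas = [b - a for a, b in zip(first, second)]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    # The iowait counter is the fifth value after the "cpu" label.
    return 100.0 * deltas[4] / total


# Illustrative samples (made-up numbers), taken a few seconds apart.
sample_before = "cpu 100 0 50 800 50 0 0 0 0 0"
sample_after = "cpu 110 0 55 800 135 0 0 0 0 0"
print(iowait_percent(sample_before, sample_after))  # prints 85.0
```

On a real server, the two lines would be read from /proc/stat a few seconds apart. A value that stays close to 100% although the storage is responding again matches the symptom described above.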




