There has been a problem with Sitemaker since 5:00 AM CET today.

 

Consequently, all the websites using Sitemaker are offline.

 

Our teams are in the process of fixing the problem, and we expect services to return shortly.

 


Since the webmail update, you may encounter errors in webmail while reading your messages.

You may also encounter issues with attachments, which are currently limited to 2 MB instead of the usual 25 MB.

Our technical team is currently performing the necessary operations to restore normal service as soon as possible.

We apologize for any inconvenience this issue may have caused you.

UPDATE: Most of the issues should now be fixed. We are still analyzing and fixing the remaining problems.


The operations on the SimpleHosting platform are currently stopped.

 

Our team spotted an issue on the SimpleHosting operations.

The operations complete correctly, but their displayed status is not updated (it still shows "operation in progress" although the operation is done).

However, this is blocking pending operations.

 

We are currently analyzing the issue.

 

UPDATE: The problem was located in a logging system, which prevented the operation status from being updated. All operations have now finished.


 

We will be doing maintenance on one of our storage units on Dec. 18th, from 15:00 to 16:00 Pacific Standard Time.
During this time, some e-mail boxes will be unavailable. Mail delivery will be queued and delayed.
EDIT: canceled

We apologize for the inconvenience.  

We will be doing maintenance on our webmail during the afternoon and evening of December 17, in order to update the Roundcube webmail version for all email users.

 

Between 15:00 and 19:00 Pacific time, webmail will not be accessible. Email delivery and email client access will continue normally.

 

We apologize for the inconvenience, and look forward to bringing you an updated version of our webmail client.

Update 2013-12-17 19:15 PST: The maintenance is complete.


An incident is currently underway on our Simple Hosting platform (Paris datacenter only).
 
The reason for the incident is not immediately clear; we are investigating. Please don't launch any operations on your instance for the moment.
 
Updates will be posted here as soon as we have more information.
 

Update Tue Dec 10 21:37:01 CET 2013: This issue has been resolved. Please accept our apologies for the brief period of unavailability.


Simple Hosting instances located in our Baltimore data center (and only there) may currently be experiencing issues. Our technical staff is investigating. Please do not perform any operations on your instance in the meantime.

This post will be updated as the situation evolves.

Update 00:51:20 CET:

A member of our technical staff is currently onsite in Baltimore to address the problem.

Update 01:35:13 CET:

The issue has been resolved. Services should now be operating normally.



The incident of November 11th is part of a series of incidents over the past few weeks caused by the gateway units, which provide Internet access for the Simple Hosting instances.
The Simple Hosting platform has experienced a number of different issues, principally with the gateway equipment, which seems to be the weakest link in the architecture. It is subject to:
  • HSRP instability causing short interruptions in connectivity,
  • Saturation of NAT translation tables as a result of a number of factors, including DDoS and customer activity,
  • High CPU usage under certain conditions.
What will Gandi do to fix the situation, replace this gateway and improve the Simple Hosting product?
  • Replace the network equipment which provides the gateway to the Internet for the Simple Hosting product with more powerful appliances, and a greater number of units (scaling). The new units will better handle the current load and will support the growth of Simple Hosting instances in the near future,
  • Set up a deeper level of monitoring to better detect technical problems,
  • Implement advanced monitoring to detect abuse from specific instances, so that our technical team can react more quickly and handle such abuse before it impacts the quality of service for all other customers.
We apologise for the inconvenience, and please be assured that our teams are endeavouring to correct these issues in the shortest possible time.

We experienced a hardware fault on routing equipment on the Simple Hosting platform.
Below is a chronology of the various events:
- 20:06 UTC : CPU load on the equipment shows significant increase.
- 20:06 UTC : Equipment is running at 100% CPU for no apparent reason, and has failed to respond to commands.
- 20:08 UTC : We made the decision to migrate to secondary equipment.
- 20:08 UTC : The secondary equipment exhibited the same symptoms as the primary, so traffic was not transferred.
- 20:09 UTC : Debugging underway to ascertain the cause of the problem.
- 20:26 UTC : Migration to the now-stabilised secondary equipment.
- 20:27 UTC : Service returned to nominal operation.
- 22:42 UTC : Following this incident, there was a secondary effect on DNS resolution: the Simple Hosting instances had been failing to resolve DNS since 20:06 UTC. The problem is now resolved.
- The network equipment used for the gateways of this service is visibly showing signs of weakness. An in-depth analysis of the anomaly and of the primary unit's behaviour is underway (the cause is likely a memory fault). We are running on the secondary gateway for the moment.

