
We are experiencing an incident with some of our network switching equipment. Some services are partially affected, notably the Blogs service and GandiMail over IPv6.

 

We are working to resolve the issue as soon as possible, and apologise for any inconvenience.

Update 13:08 CET (12:08 GMT): The incident has been resolved.



EDIT: Cancelled.

Due to a major update to the database structure, GandiMail services are experiencing a temporary disruption. Our technical teams are working to minimise the impact as much as possible, but mail delivery into mailboxes will be delayed by around 15 to 20 minutes whilst the work on the database is in progress.

 

We apologise for the inconvenience caused.


Following an incident that occurred on the fiber optic line of one of our providers, Gandi's website and some services were unavailable for about ten minutes between 4:15 and 4:25 (Paris time).

Please accept our apologies for the inconvenience.



We are experiencing a network incident at Gandi; our technical team is working to resolve the problem. The mail and blog services are partially affected.

 

UPDATE 11:52: The problem has been identified in one of our load balancers and is being fixed.

UPDATE 11:59: The problem has been resolved, and our services are available again.


Due to an incident with a management system, we have temporarily suspended email delivery to GandiMail mailboxes. New mails will be stored in the incoming spools and will be delivered once the issue has been resolved.

 

We apologise for any inconvenience this may cause.

 

Update: 16:00 CET (15:00 GMT) - the issue has been resolved and the mail services are operating normally.

 


We have lost contact with our infrastructure at Telehouse1 (Jeuneurs) in Paris as of 03:11 CET (02:11 GMT). Our technicians are en route to investigate the problem.

 

Affected Gandi services are:

one third of the DNS systems, the whois service for the domain registry, and the mail redirection service.

 

We will update here as we obtain more details.

 

UPDATE: 04:01 CET (03:01 GMT) - our core router at Telehouse1 suffered a system crash, resulting in the outage at this site. Our engineer successfully restarted the router, and we have obtained crash dump information from it for analysis. All services are once again operational.


We will be carrying out network maintenance during the night of 15-16 January 2011. This work is part of a multi-phase plan to remove the legacy network topology in Paris and migrate to a more stable, scalable, and efficient hierarchical model.

 

In this phase, the activity will only involve the interconnections between the core and aggregation network elements at our datacenter in St. Denis.

 

This activity will cause several minor interruptions to connectivity for various Gandi services in Paris throughout the maintenance window, each lasting up to five minutes as the migrations are performed in the different sections of the network, but no significant outages are expected.

 

We have scheduled this maintenance window from 02:00 CET (01:00 GMT) to 08:30 CET (07:30 GMT) on 16 January during the period of lowest impact to customers.

 

We will schedule follow-on maintenance over the coming weeks for the remaining network migrations, including work at Telehouse as well as on a number of specific services, and we will of course endeavour to keep any disruption to a minimum.


Today (4 January 2011), one of our routers went offline. This led to a partial, temporary loss of our network, affecting some of our services, such as our website, SiteMaker, GandiBlogs, some email accounts, and all operations on servers. Domain names remained available throughout, although some network paths to certain servers were unreachable.

The incident is currently being resolved, and services will progressively return to normal.

Please accept our apologies for the inconvenience.

 

UPDATE: Here is the technical explanation for yesterday's network incident:

Part of the Gandi France network is based on legacy topologies built over the past ten years, including multi-site spans for various VLANs and, in some cases, a relatively flat architecture. This part of the architecture relies, perhaps unwisely, on the spanning-tree protocol* to ensure a loop-free layer-2 topology in a bridged or switched network. Whilst we have been performing various engineering works over the past 18 months to simplify the architecture, it takes a considerable amount of time to completely dismantle, without significant outages of the Gandi services, what has been built piece by piece over a period of ten years.

Yesterday's incident was exacerbated by the legacy elements of the Gandi France network infrastructure and was caused by a fault in a downstream access switch cluster that created a layer-2 loop in the architecture. This in turn caused an unfortunate situation whereby the layer-2 topology of the legacy network was being constantly recalculated, resulting in the spanning-tree protocol failing to converge, consuming 100% of the resources on the affected switches and thus preventing traffic flow. The offending switch cluster was isolated from the network, but we also had to reload another switch in another datacentre to stop the "snowball" effect caused by the fault.
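
For illustration only, here is a rough sketch (in Python, with invented switch names; it is not our configuration, nor the actual 802.1D algorithm) of the idea behind a spanning tree: given a redundant switched topology, enough links are logically blocked that no layer-2 loop remains, and any change in the topology forces the tree to be recomputed.

    # Illustrative sketch only: a toy model of how a loop-free tree is derived
    # from a redundant switched topology. Switch names and links are invented;
    # real STP (802.1D) elects a root bridge and exchanges BPDUs, which this
    # simplification does not model.

    def spanning_tree(links, root):
        """Return the links kept active; every other link would be 'blocked'."""
        adjacency = {}
        for a, b in links:
            adjacency.setdefault(a, []).append(b)
            adjacency.setdefault(b, []).append(a)

        visited = {root}
        kept = []
        queue = [root]
        while queue:                              # breadth-first walk from the root switch
            node = queue.pop(0)
            for neighbour in adjacency[node]:
                if neighbour not in visited:
                    visited.add(neighbour)
                    kept.append((node, neighbour))
                    queue.append(neighbour)
        return kept

    # Hypothetical redundant topology: three switches in a triangle -> one loop.
    links = [("core", "agg1"), ("core", "agg2"), ("agg1", "agg2")]
    tree = spanning_tree(links, root="core")
    blocked = [l for l in links
               if l not in tree and tuple(reversed(l)) not in tree]
    print("forwarding:", tree)     # links left active
    print("blocked:   ", blocked)  # redundant link disabled to break the loop

When a fault keeps changing the underlying topology, this recomputation never settles, which is the non-convergence described above.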

We have already scheduled significant network engineering work for this quarter to finally unpick the remainder of the legacy topology and migrate to a fully hierarchical model, limiting the layer-2 domains to locally contained subnets and minimising our reliance on protocols such as spanning-tree, which was never designed for such large-scale deployments in the first place. We will communicate the dates and times of the maintenance windows over the coming weeks.

We apologise again for any inconvenience caused during this network incident yesterday.



(* spanning-tree protocol: http://en.wikipedia.org/wiki/Spanning_tree_protocol )


We are currently experiencing an abnormally high load on the incoming mail spools of the GandiMail service. As a result, new mail deliveries may be slower than usual. Our teams are investigating the source of this increased load, and we will keep you updated as we have more information.

 

Update 14:00: The slow spool performance is related to an increased load on the antispam/antivirus filtering of the mail spools. Our teams are actively working to resolve the issue as soon as possible. Inbound mail is still being delivered, but at a slower rate.
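
As a rough illustration of the behaviour described above (a sketch only, with invented names, not the actual GandiMail pipeline): inbound messages wait in the spool and are delivered only after the filtering step, so a heavily loaded filter delays mail rather than losing it.

    # Toy illustration only (hypothetical names): inbound messages wait in a
    # spool and are delivered only after an antispam/antivirus pass.
    # If that pass is slow, mail is delayed in the spool rather than lost.
    import time
    from collections import deque

    spool = deque(["msg-1", "msg-2", "msg-3"])    # incoming mail spool
    delivered = []

    def filter_message(message, delay=0.1):
        """Stand-in for the antispam/antivirus pass; 'delay' models its load."""
        time.sleep(delay)                         # heavier load -> longer delay
        return True                               # assume the message is clean

    while spool:
        msg = spool.popleft()
        if filter_message(msg):
            delivered.append(msg)                 # delivery happens after filtering

    print(delivered)                              # every message still arrives, just later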

 

Update 17:50: Our teams have isolated the issue and have tweaked the processes on the antispam filters to further optimise performance. All mail in the spools has been delivered to the recipient mailboxes, and the system is now running nominally.

 

We apologise for any inconvenience caused by the slower than normal delivery of mails today.

