The whole platform is unreachable for a few minutes. We’re investigating and working to recover as soon as possible.
[09:35] First frontend becomes unreachable.
[09:40] All frontends are unreachable.
[09:55] The hosting provider explains that the issue is due to a DHCPv6 flood - however, our monitoring doesn’t show any unusual DHCPv6 traffic.
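One way to cross-check the provider’s DHCPv6-flood claim is to capture DHCPv6 traffic directly on a server. A minimal sketch with tcpdump, assuming the uplink interface is `eth0` (the interface name is hypothetical):

```shell
# DHCPv6 uses UDP ports 546 (client) and 547 (server/relay).
# Capture up to 100 matching packets without resolving names (-n);
# a flood would show a sustained burst of SOLICIT/ADVERTISE messages.
tcpdump -n -i eth0 'udp and (port 546 or port 547)' -c 100
```

If the capture stays quiet while the provider reports a flood, the flood is likely happening elsewhere on their internal network rather than on the monitored hosts.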
[10:05] Servers are being rebooted into rescue mode to investigate further.
[10:20] They are unreachable even in rescue mode. Support is investigating. We start migrating the DNS record to a new frontend.
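The DNS migration above can be verified from the outside once the record is updated. A sketch assuming a hypothetical record `www.example.com` and a hypothetical new frontend address `203.0.113.10`:

```shell
# Query a public resolver directly to see what the rest of the world resolves;
# +short prints only the answer section.
dig +short A www.example.com @9.9.9.9

# Check the remaining TTL on the cached answer - old clients may keep hitting
# the dead frontend until their cached record expires.
dig A www.example.com @9.9.9.9 | grep -A1 'ANSWER SECTION'
```

A low TTL on the record (set ahead of time) is what makes this kind of emergency failover fast; with a long TTL, traffic drains over to the new frontend only as caches expire.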
[10:51] The support ticket is escalated to a higher tier, and we are asked to leave the servers down, with no ETA, in order to get a proper fix.
[11:05] Another, independent frontend is deployed for HTTP only - services are running, but in degraded mode.
[11:18] HTTPS deployed; all services are up and no longer running in degraded mode.
[07:30] All servers are reachable again, but only in rescue mode. We’re asked to disable IPv6.
[08:30] Still issues on the internal network.
[09:00] Issue resolved, all servers are up and running. Beginning the DNS migration back to the default configuration, without IPv6 support.
[10:00] Migration done. Still working with the hosting provider to get more details.
[2017/02/20] All actions have been applied on both sides to avoid any further downtime from this issue.