*Update about the January 23rd DDoS.*
TL;DR:
It was the biggest DDoS attack we have ever faced.
The attack lasted from 06:25 AM CET to 12:56 PM CET.
Services were unavailable or degraded from 06:32 AM CET to ~11:00 AM CET.
At 06:42 AM CET:
Our public services monitoring was triggered by external probes.
The whole infrastructure team was paged.
The monitoring was very noisy due to the DDoS, and metrics were also degraded, which made it difficult to identify the target.
After further digging, we saw that several public services were under attack.
We had to isolate and defend several service endpoints at the same time.
Because the attackers were targeting different services, the attack was more complicated to mitigate.
Adding to the difficulties, our backbone and some core links were saturated, which made troubleshooting and remediation slower and more complicated, even with out-of-band connections.
We lost a lot of time on that.
At 07:52 AM CET: we were joined by our anti-DDoS provider's team.
We began advertising some service prefixes via our anti-DDoS provider and started mitigation, while isolating those same prefixes from our standard transit and peering providers.
Our priority was to defend our DNS servers.
The DNS cluster itself was holding up well. However, because the network links were saturated, the service was degraded.
Other endpoints like admin.gandi.net and public APIs were still down.
We had difficulty deploying effective mitigations because of technical problems on our side while trying to advertise prefixes only through our anti-DDoS provider.
At 11:00 AM CET, our services were mostly back online.
Our internal postmortem is ongoing to determine what went well and what could be improved. There is room for improvement.