We have not seen a recurrence of this issue.
We have also deployed a code fix that suppresses false nodata alerts when the number of received postbacks drops below a threshold.
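The suppression logic described above can be sketched as a simple threshold guard: if overall intake collapses, the outage is almost certainly on the infrastructure side, so per-device nodata alerts are skipped. This is an illustrative sketch only; the names (`MIN_POSTBACKS_PER_WINDOW`, `check_nodata`) and the threshold value are assumptions, not the actual implementation.

```python
# Illustrative sketch (assumed names and threshold, not the real code):
# skip nodata evaluation when total received postbacks in the window
# fall below a minimum, since a global intake drop indicates an
# infrastructure problem rather than genuinely silent devices.

MIN_POSTBACKS_PER_WINDOW = 100  # assumed threshold for illustration

def check_nodata(postbacks_by_device: dict[str, int]) -> list[str]:
    """Return device IDs that should raise a nodata alert."""
    total = sum(postbacks_by_device.values())
    if total < MIN_POSTBACKS_PER_WINDOW:
        # Intake itself looks degraded; suppress alerts to avoid false nodata.
        return []
    # Normal intake volume: devices with zero postbacks are genuinely silent.
    return [device for device, count in postbacks_by_device.items() if count == 0]
```

With healthy overall intake, a silent device still alerts; during an intake outage, no device alerts.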
We'll be releasing a postmortem about this incident in the next few days.
Apr 21, 20:43 BST
Response times to our postbacks endpoint have normalized. We will continue to monitor this and confirm with our provider whether this was a recurrence. If you got a false nodata alert between 00:00 and 05:00 UTC today, please report it to email@example.com
Apr 21, 07:11 BST
Over the last few hours we have again seen elevated response times to our postback endpoint. We're actively investigating the cause to determine whether this is the same issue or a new one. If you got a false nodata alert, please report it to firstname.lastname@example.org
Apr 21, 06:24 BST
We're keeping this incident open a few more hours because we observed some transient timeouts this morning.
Apr 20, 14:04 BST
Our provider reported: "An upstream transit provider incorrectly advertised a route that caused customer traffic to be incorrectly sent to that provider, which would have caused customers to be unable to reach multiple datacenters for the duration of the event. Routing was corrected by the upstream transit provider at approximately 11:40 UTC and services should have begun to stabilize at that time."
We have confirmed expected monitoring values in the last 40 minutes and have enabled nodata alerting at 14:30 UTC.
Over the next few hours we will continue to monitor network parameters for recurrences.
Apr 19, 15:37 BST
Our provider has informed us that a routing anomaly is causing the higher-than-normal latency and timeouts we've been seeing, and is working to identify its source. We will continue to keep nodata alerts disabled.
Apr 19, 14:19 BST
Our provider has identified the ongoing problem and is working to restore full service. We are still seeing some network degradation, so we are keeping nodata alerts disabled to prevent false nodata triggers. Any remaining alerting delays are residual.
Apr 19, 13:45 BST
We are continuing to work with our provider on this issue. We are seeing a reduced error rate but haven't received confirmation of a fix yet. We're keeping nodata alerts disabled to prevent false nodata triggers.
Apr 19, 13:07 BST
We have identified network degradation on our public Internet uplinks and are reaching out to our provider about it.
This is causing gaps in graphs, delayed alert triggers, and false nodata alerts on some devices.
Apr 19, 12:30 BST
We're currently investigating a high error rate on postback intake.
Apr 19, 12:10 BST