Update - Our provider has completed emergency maintenance on a router that was found to have backplane congestion throughout this incident, which was affecting our network traffic.
We will monitor this throughout tomorrow to confirm the fix.
Sep 21, 22:17 BST
Monitoring - Payload intake and processing have just normalized. We still don't have a resolution statement from our provider, so we will continue to monitor this issue for another 24h.
Sep 21, 21:19 BST
Identified - Our provider has just notified us that they have identified the issue and are currently working on a fix.
Sep 21, 19:30 BST
Update - Our provider continues to work on the issue. We have deployed a code change to reduce the impact on device payload processing, which will minimize gaps on graphs for the small number of devices impacted.
Sep 21, 18:20 BST
Update - We're continuing to work with our provider to resolve this. We'll share more significant updates as and when we have them.
Sep 21, 17:09 BST
Update - We're continuing to work with our provider to resolve this.
Sep 21, 15:23 BST
Investigating - We're observing further problems with processing agent postbacks and we're working with our provider on the network degradation which is causing this incident.
Sep 21, 14:40 BST
Monitoring - As in previous occurrences of this issue, payload intake has normalized and nodata alerts have been re-enabled.
Our provider has not found the cause of this problem yet and will continue to work on it. We'll continue to monitor postback intake and processing closely.
Sep 20, 21:05 BST
Update - Our provider's networking team is still looking into the cause of the degradation we're observing.
Sep 20, 20:02 BST
Update - We are continuing to work with our provider to track down the network performance degradation we are observing.
Sep 20, 18:00 BST
Investigating - We are again observing a lower-than-normal number of received device payloads, and nodata alerts are disabled at the moment. The affected devices will show gaps on their graphs.
Our monitoring has also picked up degradation on the internal network and we are currently working with our provider to find the cause and fix it.
Sep 20, 16:26 BST
Update - This issue has just cleared and nodata alerts are active again. We will continue to look into the root cause of this issue for another 24h.
Sep 19, 18:41 BST
Update - We are again observing a lower-than-normal number of received device payloads, and nodata alerts are disabled at the moment.
Sep 19, 17:45 BST
Update - We have been observing the expected 5min wait time on nodata alerts for the past 3 hours, which means that received payloads have normalized. We will keep monitoring this issue for the next 24 hours.
Sep 18, 17:10 BST
Monitoring - We are observing a lower-than-normal number of received device payloads. The drop has been fluctuating between 1.5% and 2.5%. When the drop reaches 2%, our protection against false "no data" alerts triggers and delays delivery of these alerts. So, instead of the built-in 5min wait, we are seeing waits of up to 15min.
We are continuing to monitor this issue and will update the status in about 3 hours. If it persists, we will adjust our protection threshold.
Sep 18, 12:23 BST
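For readers wondering how the "no data" alert protection described in the Sep 18 update behaves, the following is a minimal sketch only. The 2% threshold and the 5min and 15min waits come from the updates above; the function names, the way the drop is measured against an expected payload count, and the choice to apply the full 15min wait whenever the protection is active are all assumptions made for illustration, not our actual implementation.

```python
# Illustrative sketch only: names and the exact delay rule are hypothetical.
NORMAL_WAIT_MIN = 5                  # built-in wait before a "no data" alert
PROTECTED_WAIT_MIN = 15              # longest wait observed while protection is active
PROTECTION_THRESHOLD_PERCENT = 2.0   # drop in received payloads that triggers protection


def payload_drop_percent(expected: int, received: int) -> float:
    """Return the shortfall of received payloads as a percentage of expected."""
    if expected <= 0:
        raise ValueError("expected must be positive")
    shortfall = max(0, expected - received)
    return shortfall * 100.0 / expected


def nodata_wait_minutes(expected: int, received: int,
                        threshold: float = PROTECTION_THRESHOLD_PERCENT) -> int:
    """Return how long to wait before delivering a "no data" alert.

    Assumption: the protection applies the full extended wait once the
    drop reaches the threshold; the real system may scale the delay.
    """
    if payload_drop_percent(expected, received) >= threshold:
        return PROTECTED_WAIT_MIN
    return NORMAL_WAIT_MIN


if __name__ == "__main__":
    print(nodata_wait_minutes(1000, 985))                 # 1.5% drop
    print(nodata_wait_minutes(1000, 975))                 # 2.5% drop
    print(nodata_wait_minutes(1000, 975, threshold=3.0))  # adjusted threshold
```

Under these assumptions, a drop that hovers around 2% flips the wait between the normal and the extended value, which is why adjusting the threshold was considered as a mitigation.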
Alerting Operational
Alert Delivery Operational
SMS Operational
E-mail Operational
PagerDuty (Incident Creation) Operational
PagerDuty (Notification Delivery) Operational
Slack Operational
Webhooks Operational
HipChat Operational
Push notifications (global) Operational
Push notifications (iOS) Operational
Push notifications (Android) Operational
Agent payloads Operational
API Operational
Availability monitoring Operational
Web UI Operational
Past Incidents
Sep 25, 2016

No incidents reported today.

Sep 24, 2016

No incidents reported.

Sep 23, 2016

No incidents reported.

Sep 22, 2016

No incidents reported.

Sep 17, 2016

No incidents reported.

Sep 16, 2016
Resolved - We have confirmed the previous fix has completely resolved this incident.
Sep 16, 21:34 BST
Monitoring - We have identified one server which was causing errors in payload processing. That has been fixed: we have started to see regular monitoring data from the affected cluster and we have turned no data alerts back on.
All other services have been mostly unaffected for the duration of this incident, except for a small set of devices that will not show data for short periods of time between 14:10 and 17:55.
Sep 16, 20:35 BST
Update - We continue to work on the issue. No data alerts remain disabled.
Sep 16, 19:26 BST
Update - We are still working on the issue. No data alerts remain disabled.
Sep 16, 18:23 BST
Update - We are still working on the issue. The root cause appears to be network degradation at our provider. No data alerts remain disabled.
Sep 16, 17:47 BST
Update - We are still working on the issue. No data alerts remain disabled.
Sep 16, 17:14 BST
Identified - We have identified an issue on one of our clusters which is causing degraded payload processing performance. We are currently working to restore it.
Sep 16, 16:38 BST
Update - We are still investigating this issue. Alerting and the UI are not affected. The impacted devices will not show data for the duration of the incident.
Sep 16, 16:15 BST
Investigating - We have just disabled no data alerts as we're seeing a high number of them being sent.
Sep 16, 15:32 BST
Sep 15, 2016

No incidents reported.

Sep 14, 2016
Resolved - This has now been resolved. We will share the detailed cause as soon as our provider releases it.
Sep 14, 12:26 BST
Monitoring - Our provider has implemented a fix and we are monitoring the results.
Sep 14, 11:03 BST
Update - We have received an updated affected location list:
Europe (CDG - Paris, France), North America (LAX - Los Angeles, CA, United States, ORD - Chicago, IL, United States, SJC - San Jose, CA, United States), and Asia (HKG - Hong Kong, Hong Kong, TPE - Taipei, Taiwan).
Sep 14, 09:40 BST
Identified - Our DNS provider, Cloudflare, is experiencing DNS resolution issues in Los Angeles, Chicago, San Jose, Hong Kong, Paris & Taipei.
We have also confirmed a drop in received payloads, and our "no data alert protection", which delays delivery of "no data" alerts when our inbound payload volume drops by 2%, activated momentarily.
Sep 14, 09:15 BST
Sep 13, 2016

No incidents reported.

Sep 12, 2016

No incidents reported.

Sep 11, 2016

No incidents reported.