Microsoft has revealed that this week’s worldwide Microsoft 365 outage was caused by an infrastructure power outage that forced a failover of its traffic management systems, with impact felt in multiple regions.
Starting on Monday, June 20, at 11:00 PM UTC, customers began experiencing and reporting several issues while trying to access and use Microsoft 365 services.
According to Microsoft, problems encountered during the incident included delays and failures when accessing some Microsoft 365 services.
Customers also reported continuous re-login requests, emails stuck in queues and never delivered, and an inability to access Exchange Online mailboxes despite trying all available connection methods.
The affected services included the Microsoft Teams communication platform, the Exchange Online hosted email platform, SharePoint Online, Universal Print, and the Graph API.
Microsoft’s response while investigating the root cause of the outage also brought to light problems with how the company shares new incident-related information with customers.
Even though Microsoft told customers they could find out more about this incident from the admin center under EX394347 and MO394389, user reports suggest that those incident tickets were not showing up, effectively keeping the customers in the dark.
16-hour-long incident caused by power failure
More than 16 hours after the first signs of the outage were detected, on Tuesday, June 21, at 3:27 PM UTC, Microsoft said in an update to the MO394389 service alert sent to customers that the root cause was an infrastructure power loss.
“An infrastructure power outage necessitated failing over Microsoft 365 traffic management servicing users primarily in Western Europe,” the company explained.
“This action failed to properly complete, leading to functional delays and access failures for several Microsoft 365 services.”
The outage was most severe for customers in Western Europe. Still, the impact extended to “a small percentage” of users throughout EMEA (Europe, the Middle East, and Africa), North America, and the Asia-Pacific region.
Redmond also refuted reports that a separate outage affecting the company’s Outlook on the web service was linked to this incident.
“We’ve confirmed from our updated service monitoring that all services remain healthy following the targeted restarts,” Microsoft added.
“Additionally, we completed our investigation into the potential remaining impact to Outlook on the web and confirmed that this is a known issue which is unrelated to this event.”
On Tuesday, Cloudflare was also hit by a massive outage that affected over a dozen data centers and hundreds of major online platforms and services.
Cloudflare revealed that the incident was caused by a configuration error while implementing a change that would have otherwise increased its network’s resilience.