Google issues apology, incident report for hours-long cloud outage


Thomas Kurian, CEO of Google Cloud, speaks at a cloud-computing conference held by the company in 2019.

Michael Short | Bloomberg | Getty Images

Google apologized for a major outage that the company said was caused by multiple layers of flawed recent updates.

The company released an incident report late on Friday that explained hours of downtime on Thursday. More than 70 Google cloud services stopped working properly across the globe, knocking down or disrupting dozens of third-party services, including Cloudflare, OpenAI and Shopify. Gmail, Google Calendar, Google Drive, Google Meet and other first-party products also malfunctioned.

“We deeply apologize for the impact this outage has had,” Google wrote in the incident report. “Google Cloud customers and their users trust their businesses to Google, and we will do better. We apologize for the impact this has had not only on our customers’ businesses and their users but also on the trust of our systems. We are committed to making improvements to help avoid outages like this moving forward.”

Thomas Kurian, CEO of Google’s cloud unit, also addressed the outage in a post on X on Thursday, saying “we regret the disruption this caused our customers.”

Google in May added a new feature to its “quota policy checks” for evaluating automated incoming requests, but the new feature wasn’t immediately tested in real-world situations, the company wrote in the incident report. As a result, the company’s systems didn’t know how to properly handle data from the new feature, which included blank entries. Those blank entries were then sent out to all Google Cloud data center regions, which prompted the crashes, the company wrote.
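To illustrate the kind of failure described here, the following is a minimal Python sketch of a checker that tolerates a blank field instead of crashing on it. It is not Google's code: the `QuotaPolicy` record, its `limit` field, and the `check_quota` function are hypothetical names invented for this example, and the choice to allow a request when the policy is blank is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QuotaPolicy:
    """Hypothetical policy record; the field names are illustrative only."""
    project: str
    limit: Optional[int]  # None models a "blank entry" in the policy data


def check_quota(policy: QuotaPolicy, requested: int) -> bool:
    """Return True if the request fits within the policy's limit.

    A blank limit is handled explicitly rather than being compared
    against a number, which would raise a TypeError and crash the checker.
    """
    if policy.limit is None:
        # Assumption for this sketch: a blank policy lets the request through.
        return True
    return requested <= policy.limit
```

A record with a blank limit, such as `QuotaPolicy("demo", None)`, passes through `check_quota` without raising, while a populated record is still enforced.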

Engineers figured out the issue in 10 minutes, according to the company. However, the entire incident went on for seven hours after that, with the crash leading to an overload in some larger regions.

As it released the feature, Google did not use feature flags, an increasingly common industry practice that lets a change roll out gradually to limit the impact if problems occur. Feature flags would have caught the issue before the feature became widely available, Google said.
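As a rough illustration of how a feature flag can stage a rollout, here is a minimal Python sketch that enables a feature for only a percentage of units, such as regions or projects. The `flag_enabled` function, the flag name, and the hashing scheme are assumptions made for this example and do not describe Google's tooling.

```python
import hashlib


def flag_enabled(flag_name: str, unit_id: str, rollout_percent: int) -> bool:
    """Decide whether a hypothetical flag is on for one unit.

    Each unit is hashed into a stable bucket from 0 to 99. The flag is on
    when the bucket falls below the rollout percentage, so raising the
    percentage only ever adds units and never removes them.
    """
    digest = hashlib.sha256(f"{flag_name}:{unit_id}".encode()).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < rollout_percent
```

With `rollout_percent` set to 0 the new code path is off everywhere, and an operator can raise it in steps while watching for errors, then drop it back to 0 if a problem appears.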

Going forward, Google will change its architecture so that a failure in one system does not crash the rest, the company said. Google said it will also audit all systems and improve its communications “both automated and human, so our customers get the information they need asap to react to issues.”
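One common way to keep serving when a single check breaks is a pattern often called failing open. The Python sketch below shows the idea under stated assumptions: `run_check_fail_open` and its arguments are hypothetical, and the article does not describe the design Google will actually adopt.

```python
import logging
from typing import Any, Callable


def run_check_fail_open(check: Callable[[Any], bool], request: Any) -> bool:
    """Run a hypothetical policy check without letting it crash the caller.

    If the check itself raises, the error is logged and the request is
    allowed through, so a bug in one check does not stop all traffic.
    """
    try:
        return check(request)
    except Exception:
        logging.exception("policy check failed; allowing request (fail open)")
        return True
```

The trade-off is that a broken check stops enforcing its rule until it is fixed, so this approach suits checks where brief under-enforcement is less harmful than a full outage.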

— CNBC’s Jordan Novet contributed to this report.
