Why DID Facebook, Instagram, WhatsApp and Facebook Messenger go down yesterday? Bungled server update led to a global outage that lasted almost seven HOURS – as experts warn foul play 'can't be ruled out'

 Facebook, Instagram and WhatsApp were all brought down for almost seven hours yesterday in a massive global outage.

Problems began at around 16:45 BST (11:45 ET), leaving users unable to access the three platforms, as well as Facebook Messenger and Oculus, for the rest of the evening. 

Facebook, which owns all the services, has blamed the outage on a bungled server update and insists it was not an attack from outside the company.

The US tech giant said the problem was caused by a faulty update that was sent to its core servers, which effectively disconnected them from the internet. 

But what exactly went wrong and why did it take almost seven hours to fix? Here is MailOnline's breakdown of the issue...

A Facebook staff member reportedly accidentally deleted large sections of the code (pictured) which keeps the website online


Why did Facebook go offline?

Facebook issued a statement saying the cause of the problem was a configuration change to the company's 'backbone routers', which coordinate network traffic between the tech giant's data centres.

'This disruption to network traffic had a cascading effect on the way our data centers communicate, bringing our services to a halt,' the statement said.

Web security firm Cloudflare offered more details about what happened, revealing that Facebook had effectively vanished from the internet.

The social media company made a series of updates to its Border Gateway Protocol (BGP) routes, Cloudflare's chief technology officer John Graham-Cumming said, causing it to 'disappear'.

The BGP allows for the exchange of routing information on the internet and takes people to the websites they want to access.

It is essentially the roadmap that tells networks how to reach the IP address of each website, while the Domain Name System (DNS) translates a site's name into that address.

As a consequence of the BGP problems, DNS resolvers all over the world could no longer resolve Facebook's domain names.
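To see why losing BGP routes also breaks DNS, here is a heavily simplified sketch in Python. It is a toy model, not how Facebook's network is actually built: the domain name, the IP addresses (taken from ranges reserved for documentation) and the function names are all made up for illustration. The only point it makes is that a name server nobody can route to cannot answer lookups.

```python
import ipaddress

# Hypothetical, simplified model: real BGP and DNS are far more involved.
# The prefixes below are documentation-only ranges, not Facebook's.
ROUTES = {
    ipaddress.ip_network("192.0.2.0/24"): "AS-EXAMPLE",     # holds the name server
    ipaddress.ip_network("198.51.100.0/24"): "AS-EXAMPLE",  # holds the web servers
}

# The name server lives at this address and knows where the website is.
NAMESERVER_IP = ipaddress.ip_address("192.0.2.53")
DNS_RECORDS = {"social.example": "198.51.100.10"}


def has_route(routes, address):
    """Return True if any advertised prefix covers the address."""
    return any(address in prefix for prefix in routes)


def resolve(routes, name):
    """Look up a name, but only if the name server itself is reachable."""
    if not has_route(routes, NAMESERVER_IP):
        raise LookupError(f"cannot reach the name server for {name}")
    return DNS_RECORDS[name]


def withdraw_all(routes):
    """Model the faulty update: every advertised route is withdrawn."""
    return {}


if __name__ == "__main__":
    print(resolve(ROUTES, "social.example"))  # lookup works while routes exist
    try:
        resolve(withdraw_all(ROUTES), "social.example")
    except LookupError as error:
        print("Lookup failed:", error)
```

In this sketch the DNS records never change. The lookup fails only because the route to the name server has gone, which is the sense in which a site can 'disappear' while its servers are still running.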

Why were Instagram, WhatsApp and Facebook Messenger also down?

It wasn't just Facebook that went offline - its associated services Instagram, WhatsApp and Facebook Messenger were affected, too. Some people also reported issues with Facebook's virtual reality headset platform, Oculus.

This is because the tech giant has a centralised, single back end for all of its products.

Facebook runs its own systems through the same servers, meaning everything needed to fix the problem – from digital engineering tools to messaging services, even key-fob door locks – was also taken offline. 

Matthew Hodgson, co-founder and CEO of Element and Technical Co-founder of Matrix, said the outage illustrated the advantage of having a 'more reliable' decentralised system that doesn't put 'all the eggs in one basket'.

'There's no single point of failure so they can withstand significant disruption and still keep people and businesses communicating,' he added. 
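The 'all the eggs in one basket' problem can be sketched as a small dependency map. The Python below is an illustration only: the component names are loosely based on the services mentioned above, and the way they are wired together is an assumption, not Facebook's real architecture.

```python
# Hypothetical dependency map: each component lists what it needs to work.
DEPENDS_ON = {
    "backbone": [],
    "facebook": ["backbone"],
    "instagram": ["backbone"],
    "whatsapp": ["backbone"],
    "messenger": ["backbone"],
    "oculus": ["backbone"],
    "engineering_tools": ["backbone"],
    "door_locks": ["backbone"],
    "repair_work": ["engineering_tools", "door_locks"],
}


def affected_by(failed, depends_on):
    """Return every component that goes down, directly or indirectly,
    when `failed` goes down."""
    down = {failed}
    changed = True
    while changed:  # keep sweeping until no new failures appear
        changed = False
        for name, needs in depends_on.items():
            if name not in down and any(dep in down for dep in needs):
                down.add(name)
                changed = True
    return down


if __name__ == "__main__":
    print(sorted(affected_by("backbone", DEPENDS_ON)))
    print(sorted(affected_by("instagram", DEPENDS_ON)))
```

Under these assumptions, a failure in one app stays contained, while a failure in the shared backbone takes down every product along with the tools and door locks needed to repair it.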


How many people were affected?  

Downdetector, which tracks outages, said it was the biggest failure it has ever seen, with 10.6 million problem reports around the world.

In total, Facebook has 2.9 billion monthly active users.

The issues started at 16:44 BST (11:44 ET), with nearly 80,000 reports for WhatsApp and more than 50,000 for Facebook, according to Downdetector.

From around 22:30 BST (17:30 ET), some users were reporting that they were able to access the four platforms once again. However, Facebook did not work again for many people until at least an hour after that.

WhatsApp said it was back up and running 'at 100 per cent' as of 3:30 BST (22:30 ET) this morning.

Could it have been a cyber attack?

Interestingly, Facebook's statement is carefully written and doesn't rule out foul play. 

That being said, an external cyber attack seems unlikely.

A massive denial-of-service hack that could overwhelm one of the world's most popular sites would require either coordination among powerful criminal groups or a very innovative technique.

Sabotage by an insider, however, would be theoretically possible, according to tech experts.

What's also eye-opening is that the outage hampered Facebook's ability to address the problem, because it took down internal tools needed to fix it.

This meant the issue lasted for nearly seven hours, which is highly unusual.  

Users around the world reported problems with Facebook, Instagram and WhatsApp on Downdetector


It compounded a difficult week for Facebook, which has faced accusations of easing up on efforts to stop misinformation, allowing hate to be magnified on its platforms and being aware that Instagram can harm teenage girls' mental health.  

The disruption also occurred just 24 hours after a former Facebook employee gave an interview to CBS News after leaking documents about the social network.

Whistleblower Frances Haugen, who is scheduled to testify before a Senate subcommittee today, said the company had prioritised 'growth over safety'.  

Facebook insisted it was 'just not true' to suggest the company encouraged bad content or did nothing in response. 

Cybersecurity specialist Jake Moore said: 'It is quite interesting that Facebook's statement has not ruled out foul play. 

'Like the locks on a bank safe, the money inside is only as secure as the person with the keys – cybersecurity is as much about a company's own internal security procedures as it is about fending off outsider attacks.'

He reiterated that it was 'not due to an external cyber attack' because web blackouts more often originate from an undiscovered software bug or human error.

So was it a mistake by someone within Facebook?

There's every chance it could have been an accident rather than an intentional act of sabotage. 

It has been claimed that a Facebook staff member may have accidentally deleted large sections of the code which keeps the website online. 

Facebook said its engineering teams had identified 'configuration changes' to its backbone routers that brought its services to a halt.

The company said these changes caused a disruption to network traffic and blocked communication between its data centres. Employees' work passes and email were also reportedly affected by the internal issue. 

Why did it take so long to resolve the problem?

When Facebook's platforms went offline, engineers rushed to the company's data centres to reset the servers manually, only to find they couldn't get inside.

New York Times' technology reporter Sheera Frenkel told BBC's Today programme this was part of the reason it took so long to fix the issue.

'The people trying to figure out what this problem was couldn't even physically get into the building' to work out what had gone wrong, she said.

To make matters worse, one insider claimed the outage was further exacerbated because large numbers of staff are still working from home in the wake of Covid, meaning it took longer for them to get to the data centres. 

Downdetector, which tracks outages, said it was the biggest failure it has ever seen, with 10.6 million problem reports around the world. Pictured, the issues starting at 16:44 BST (11:44 ET)

Engineers were rushed to the company's data centres in Santa Clara, California (pictured), to reset the servers manually

Facebook has not yet gone into much detail about how the issue was finally fixed but it is understood that engineers had to manually reset the servers where the problem originated.

Software testing expert, Adam Leon Smith of BCS, The Chartered Institute for IT, said: 'It is unlikely the issues were directly caused by people working from home, however it is quite possible that it took so long to restore the service because of reduced staffing within the data centre.

'This would compound the problem because the nature of the failure meant that remote access to the data centre was also unavailable.'

How much did the outage cost?

During the blackout, Facebook shares plunged by five per cent, wiping an estimated $7 billion (£5 billion) off founder Mark Zuckerberg's personal fortune.

The website Fortune also estimated that seven hours of downtime could have cost the company up to $100 million (£73 million) in lost ad revenue.

But it's not just Facebook which will have lost out. 

Businesses that rely on its services are also likely to have lost huge sums of money, although so far there have been no estimates of exactly how much.

NetBlocks, which tracks internet outages and their impact, estimates that the outage cost the global economy $160 million (£117 million). 
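For readers who want to check the sums, the short Python sketch below works through the figures quoted above. It assumes a flat seven hours of downtime and uses the exchange rate implied by the $100 million (£73 million) estimate. The variable and function names are ours, and the results are rough illustrations rather than official figures.

```python
# Figures quoted in the article; everything derived from them is an estimate.
AD_LOSS_USD = 100_000_000        # Fortune's upper estimate of lost ad revenue
AD_LOSS_GBP = 73_000_000         # the same estimate in pounds
GLOBAL_COST_USD = 160_000_000    # NetBlocks estimate for the global economy
FORTUNE_DROP_USD = 7_000_000_000 # estimated fall in Mark Zuckerberg's fortune
OUTAGE_HOURS = 7                 # assumption: a flat seven hours of downtime


def per_hour(total, hours):
    """Average cost for each hour of downtime."""
    return total / hours


def to_gbp(usd, rate):
    """Convert dollars to pounds at the given rate."""
    return usd * rate


# Exchange rate implied by the article's own $100m / £73m pairing.
IMPLIED_RATE = AD_LOSS_GBP / AD_LOSS_USD

if __name__ == "__main__":
    hourly = per_hour(AD_LOSS_USD, OUTAGE_HOURS)
    print(f"Lost ad revenue per hour: ${hourly / 1e6:.1f} million")
    print(f"Global cost: £{to_gbp(GLOBAL_COST_USD, IMPLIED_RATE) / 1e6:.1f} million")
    print(f"Fall in fortune: £{to_gbp(FORTUNE_DROP_USD, IMPLIED_RATE) / 1e9:.2f} billion")
```

On these assumptions, the lost ad revenue comes to a little over $14 million for every hour the services were offline.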

What are the chances of it happening again?

The huge global outage Facebook experienced is a fairly uncommon one, although there's not a lot the company can do to avoid a similar situation because of its centralised back end system.

Along with the Fastly outage in June – caused by a single customer changing their settings – and Cloudflare going offline in 2020, it shows the problem of having a single point of failure for a huge number of services that people use.

There are currently no obvious solutions to this, but this latest outage is likely to reignite the debate around internet infrastructure.

For many individuals and businesses too, the incident showed just how much they depend on Facebook and its services not just to communicate, but also to log in to other platforms. 

Reviewed by Your Destination on October 05, 2021
