AstrotalkUK

Not for profit website/blog on astronomy, space and my writing

Facebook Outage – The bigger they are the harder they fall

By Gurbir Dated: October 6, 2021

Suddenly, on 4th October 2021, Facebook disappeared from the Internet for all of its 3 billion users, no matter where in the world they were. There was no warning, and the experience was the same for the head of a large commercial organisation as for a first-year university student using a low-cost Android phone. Users of Instagram and WhatsApp, also owned by Facebook, suffered the same experience. The outage started at about 16:50 BST and service returned at about 22:20 BST. The impact was high because Facebook, a single company, is so large.

Facebook availability. Source: Cloudflare

The “what and why” is gradually emerging. The most surprising thing for me is that it was NOT a cyber attack. There was no malicious software, no ransomware, no DDoS and no hackers or disgruntled former employees. However, by chance, the outage came in the same week that Frances Haugen, a former Facebook employee in the US and now a whistleblower, gave testimony to Congress that Facebook prioritised profit over protecting children from harm.

Facebook explained the cause in a 259-word blog post: “Our engineering teams have learned that configuration changes on the backbone routers that coordinate network traffic between our data centers caused issues that interrupted this communication”. Many independent sources, including Reuters and Cloudflare, also provided explanations.

The failure that prevented users from accessing Facebook also obstructed Facebook engineers attempting to fix it. Apparently, the systems used by Facebook for physical and logical access to its own buildings were also affected by the same outage.

In simple terms, the error involved two of the internet’s many interconnected sub-systems: the Domain Name System (DNS) and the Border Gateway Protocol (BGP). The DNS converts a domain name like facebook.com to the IP address of a server (one of many around the world) hosting the Facebook application. BGP is how the networks that make up the Internet tell each other which routes lead to which IP addresses. In this case, it allows data from one Facebook data centre in, say, South Africa to find another in Norway.

Like signs on the motorway, BGP gives drivers directions to their destination. The “configuration change” that went wrong on 4th October meant that suddenly all the motorway signs (the BGP routes) went blank, and as a result the DNS could no longer reach Facebook’s name servers. The drivers could not see how to get to their destination and the traffic came to a halt.
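To make the motorway analogy a little more concrete, here is a toy Python sketch of how withdrawing routes can also make name lookups fail. It is an illustration under stated assumptions, not a description of Facebook's actual network: the addresses come from ranges reserved for documentation, and the names `RoutingTable`, `resolve`, `NAMESERVER_IP`, `DNS_RECORDS` and `social.example` are all invented for this example.

```python
import ipaddress


class RoutingTable:
    """Toy stand-in for the routes a network has learned via BGP."""

    def __init__(self):
        self.routes = {}  # prefix -> label of the network announcing it

    def announce(self, prefix, origin):
        self.routes[ipaddress.ip_network(prefix)] = origin

    def withdraw(self, prefix):
        self.routes.pop(ipaddress.ip_network(prefix), None)

    def lookup(self, address):
        """Return the origin of the most specific matching prefix, or None."""
        ip = ipaddress.ip_address(address)
        matches = [net for net in self.routes if ip in net]
        if not matches:
            return None
        best = max(matches, key=lambda net: net.prefixlen)
        return self.routes[best]


# Hypothetical data: the name server's address and the records it holds.
NAMESERVER_IP = "192.0.2.53"
DNS_RECORDS = {"social.example": "198.51.100.10"}


def resolve(name, table):
    """Answer a name lookup only if the name server itself is reachable."""
    if table.lookup(NAMESERVER_IP) is None:
        return None  # no route to the name server, so no answer at all
    return DNS_RECORDS.get(name)


table = RoutingTable()
table.announce("192.0.2.0/24", "AS64500")     # where the name servers live
table.announce("198.51.100.0/24", "AS64500")  # where the application lives

before = resolve("social.example", table)

# The faulty "configuration change": every route is withdrawn.
table.withdraw("192.0.2.0/24")
table.withdraw("198.51.100.0/24")

after = resolve("social.example", table)

print(before)
print(after)
```

The servers themselves are untouched in this sketch. Only the "signs" pointing to them have gone, which is enough to make both the name lookup and the route lookup come back empty.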

Although the outage lasted for just 6 hours, it had a huge global impact on individuals, businesses and governments that rely on Facebook for communication, data transfer, payments and education.

Facebook did not explain why this update, something they would have done many times in the past, went awry. It is unclear whether this was a planned or unscheduled update, or why there was no simple rollback mechanism in place for exactly these eventualities.
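The kind of safety net asked about above, usually called a rollback, can be sketched in a few lines of Python. This is a generic pattern and not Facebook's tooling. Every name here (`apply_with_rollback`, `is_healthy`, the `routes` setting) is hypothetical. The idea is to save the known-good settings, apply the change, run a health check, and restore the saved settings if the check fails.

```python
import copy


def apply_with_rollback(config, change, is_healthy):
    """Apply `change` to `config`; restore the old settings if the check fails.

    Returns True if the change was kept, False if it was rolled back.
    """
    snapshot = copy.deepcopy(config)  # keep a copy of the known-good state
    config.update(change)
    if is_healthy(config):
        return True
    config.clear()
    config.update(snapshot)           # put the known-good state back
    return False


def is_healthy(config):
    # Hypothetical check: the network must still advertise at least one route.
    return len(config.get("routes", [])) > 0


config = {"routes": ["192.0.2.0/24", "198.51.100.0/24"]}

# A faulty change that would remove every advertised route.
kept = apply_with_rollback(config, {"routes": []}, is_healthy)
print(kept, config["routes"])
```

As noted earlier in the post, the same failure also obstructed the engineers trying to fix it, so a real mechanism would need to work without depending on the network being changed.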

However, independent security specialists cannot rule out the possibility of sabotage or other sinister activity.

This outage was limited to one company, albeit with a huge user base. A similar outage for Google, Amazon or Apple would potentially have a larger impact, affecting many more applications and businesses. The internet was designed and built around TCP/IP (Transmission Control Protocol / Internet Protocol). It has resilience at its core. That resilience still stands. This incident illustrates the age-old problem of too many eggs (users) in a single basket (Facebook).

Quick update.

Downdetector recorded a further Facebook outage starting late on October 8th and running into the early hours of the 9th. This was a far less significant outage that lasted just a couple of hours and probably had a different cause than Monday’s. Here is how CNN reported it.

Facebook has provided a further update explaining the 4th October outage.

Filed Under: cyber Tagged With: cyber

Copyright © 2008–2023 Gurbir Singh - AstrotalkUK Publications