Legal Zyte-geist #3: What the court’s ruling in the Meta v Bright Data case really means for web scrapers

A brief overview of the court’s ruling in the Meta v Bright Data case

Feb 13, 2024

Welcome to the monthly column about web scraping and legal themes by Sanaea Daruwalla. She is the Chief Legal & People Officer at Zyte. Sanaea has over 15 years of experience representing a wide variety of clients and is one of the leading experts on web data extraction laws.

Disclaimer: This post is for informational purposes only. The content is not legal advice and does not create an attorney-client relationship.

By now we’ve all heard that Meta sued Bright Data for scraping Facebook and Instagram and that the court just ruled that Meta’s terms of service do not prohibit scraping of public data. But what does this really mean for web scrapers?


Bright Data did not violate Meta’s terms, but this ruling does not extend to the scraping of all public data anywhere

Bright Data did not violate Meta’s terms of service or breach any contract with Meta by scraping public Facebook and Instagram data. But that does not mean that no terms of service will ever apply to the scraping of public data. The court only looked at Facebook and Instagram’s terms of service specifically in relation to Bright Data’s scraping. So while it’s a great ruling, it’s limited to this set of facts and the terms put before the court.

The facts the court relied on were:

Bright Data was scraping public Facebook and Instagram data.
In order to scrape that public data, Bright Data was circumventing Meta’s anti-bot measures.
Facebook and Instagram’s terms of service both prohibit scraping.
In 2021 and 2022, Bright Data had Facebook and Instagram accounts, which required it to agree to the terms of service, but they were corporate accounts and were not used for scraping data.
In December 2022, Bright Data disabled all its Meta accounts.
The text of Meta’s terms of service.

The court read both Facebook and Instagram’s terms of service very closely and analyzed every aspect to determine that the terms are only applicable to a user who is actively logged in to their account and is using the account for the purpose of scraping data. As a result, Bright Data’s scraping of public data did not violate Meta’s terms.

Circumventing anti-bot measures and CAPTCHAs to obtain data is not the same as scraping data behind a login, but the court did not determine that every type of circumvention is acceptable in all circumstances

The court did not make a definitive ruling on this point but stated that circumventing CAPTCHAs and anti-bot technology is not equivalent to scraping behind a login. There is a clear difference between defeating anti-scraping tech and piercing privacy walls like a login, so the court concluded that Bright Data only engaged in the scraping of public data.

However, the court did not make a final ruling as to whether or not anti-bot circumvention or CAPTCHA solving could violate other laws or other terms of service. In fact, this wasn’t the issue the court was actually reviewing in this specific case, so while it made some really promising commentary about this topic, it made no final judgments.

Do you want to suggest a topic for the next month's edition? Submit your question in The Web Scraping Club Discord Server, on the Legal Zyte-geist dedicated channel.


Simply having a Facebook or Instagram account does not mean you are bound by Meta’s terms when scraping public data, but the same may not be true of all websites

Meta argued that Bright Data is bound by its website terms because it explicitly agreed to them when it created its business accounts. Bright Data contended that the terms govern the use of Facebook and Instagram, and that scraping public data while logged out is not “use” as defined under the terms of service, as it is not a “user” of the services when conducting logged-out scraping.

The court conducted a very detailed analysis of Facebook’s and Instagram’s terms in order to determine whether they apply to scraping public data while logged out. It reviewed how “user” and “use” of Meta’s platforms were defined throughout the terms of service and found that it is reasonable to interpret the terms to mean that Bright Data did not “use” Facebook or Instagram when it conducted logged-out scraping of public data.

Further, there was no evidence that Bright Data’s accounts were related in any way to its scraping activity. When Bright Data scraped, it did so without logging in, and it did not use its accounts in any way to conduct the scraping. So even though Bright Data was a user for other purposes during the time it had accounts, it was not using Facebook or Instagram as a part of its non-logged-in scraping of public data.

It’s important to note that the court’s decision here is based on its very detailed reading and analysis of Meta’s terms and does not extend to all website terms of service. In a different case, a court could find that a website’s terms do apply to non-logged-in scraping, because the broader question of whether terms can bind logged-out scrapers was not fully decided here.

While this is a great ruling for ethical web scrapers, we need to be cautious about overstating its reach.

In summary, while the court’s ruling is great, its applicability is quite limited. The wider questions of whether terms can ever be enforceable when someone is scraping public data and questions around anti-bot and CAPTCHA circumvention as a whole still remain open. But for now, this order gives good guidance and some hope for where the courts are headed.

For more helpful resources, check out my post on the ruling, the legality of web scraping, and Zyte’s more detailed compliant web scraping checklist.

Explore, connect, and collaborate with Zyte. Join us on LinkedIn and in our Extract Data Community on Discord.

Feb 14, 2024

There's no specific case where terms apply to non-logged-in scraping. But what the court's ruling here showed us is that if terms specifically called out non-logged-in scraping and prohibited it, a court could rule that you cannot scrape that site even when not logged in. However, there are other factors that would come into play, like whether a contract was properly formed, so we're yet to see how courts will rule on that. But the main point is that this case only looked at Facebook and Instagram's terms specifically, so other websites' terms could be viewed differently based on how they are worded.

SirLoras

What is a different case in which the terms do apply to non-logged-in scraping?

The Web Scraping Club
