A Guardian investigation has found that tens of thousands of people have been engaged through an online platform run by Scale AI, a firm in which Meta holds a near half-stake, to label and curate material used to train large artificial‑intelligence systems. The work often involved trawling social media profiles, copying copyrighted images and transcribing explicit audio. Many workers told the paper that their tasks diverged from the advertised aim of refining high‑level models and raised deep ethical concerns.
Scale AI’s Outlier platform recruits people with specialist backgrounds to supply annotated data, and it has been positioned at the centre of Meta’s push into more powerful AI since the social network’s major investment gave it 49% ownership of the firm. Time reported that the deal was part of a $15bn strategy that places Scale’s chief executive at the heart of Meta’s new AI division, a move that has prompted scrutiny of how the company will combine data, talent and influence in a competitive market.
People who performed work for Outlier told The Guardian they routinely encountered material they found upsetting and, in some cases, invasive. One contractor said: "I don’t think people understood quite that there’d be somebody on a desk in a random state, looking at your [social media] profile, using it to generate AI data," reflecting a wider view among workers that ordinary users did not appreciate how their public posts might be repurposed into training datasets.
Several contributors described low pay, unstable hours and intrusive monitoring, and some said recruitment advertising overstated the pay or scope of the work. "A lot of us were really desperate," one worker told The Guardian, explaining why people with other professions, including journalists and academics, took the work. Others reported being asked to transcribe sexually explicit audio, label disturbing imagery or rank images by apparent age, tasks that many found morally troubling.
Independent reports and rights organisations have raised alarms over the privacy implications of exposing users’ personal information, including names, selfies and private messages, to large numbers of contractors, and have noted parallels with earlier episodes in which firms scraped public social‑media content at scale. One human‑rights organisation documented widespread exposure of personally identifiable information to reviewers, and previous reporting has detailed legal actions and controversies over scraping on platforms such as Facebook and Instagram.
Scale AI’s client list has included major tech firms and defence contractors, and the company has supplied data services used to develop commercially and strategically important models. The Guardian noted contracts with US defence suppliers, and other outlets have emphasised how a direct link between a dominant social platform and a major data‑labelling firm could reshape competitive dynamics in the AI industry; OpenAI has said it ceased purchasing services from the supplier in June 2025.
Scale AI told The Guardian that its Outlier platform offers flexible, project‑based work with transparent pay and that contributors choose when to participate. Workers who remain on the platform say the income is unpredictable and that project availability can shift abruptly, yet many continue because alternative work looks uncertain. "I have to be positive about AI because the alternative is not great," one contributor said.
Source Reference Map
Inspired by headline at: [1]
Sources by paragraph:
- Paragraph 1: [2]
- Paragraph 2: [3], [2]
- Paragraph 3: [2]
- Paragraph 4: [2]
- Paragraph 5: [4], [5], [6]
- Paragraph 6: [2], [3]
- Paragraph 7: [2]
Source: Noah Wire Services