A Guardian investigation has found that tens of thousands of people have been engaged through an online platform run by Scale AI, a firm in which Meta holds a near half-stake, to label and curate material used to train large artificial‑intelligence systems. The work often involved trawling social media profiles, copying copyrighted images and transcribing explicit audio. Many workers told the paper that their tasks diverged from the advertised aim of refining high‑level models and raised deep ethical concerns.
Scale AI’s Outlier platform recruits people with specialist backgrounds to supply annotated data, and it has been positioned at the centre of Meta’s push into more powerful AI since the social network’s major investment gave it 49% ownership of the firm. Time reported that the deal was part of a $15bn strategy that places Scale’s chief executive at the heart of Meta’s new AI division, a move that has prompted scrutiny of how the company will combine data, talent and influence in a competitive market.
People who performed work for Outlier told The Guardian they routinely encountered material they found upsetting and, in some cases, invasive. One contractor said: "I don’t think people understood quite that there’d be somebody on a desk in a random state, looking at your [social media] profile, using it to generate AI data," reflecting a wider view among workers that ordinary users did not appreciate how their public posts might be repurposed into training datasets.
Several contributors described low pay, unstable hours and intrusive monitoring, and some said recruitment advertising overstated the pay or scope of the work. "A lot of us were really desperate," one worker told The Guardian, explaining why people with other professions, including journalists and academics, took the work. Others reported being asked to transcribe sexually explicit audio, label disturbing imagery or rank images by apparent age, tasks that many found morally troubling.
Independent reports and rights organisations have raised alarms over the privacy implications of exposing users’ personal information, including names, selfies and private messages, to large numbers of contractors, and have noted parallels with earlier episodes in which firms scraped public social‑media content at scale. One human‑rights organisation documented widespread exposure of personally identifiable information to reviewers, and previous reporting has detailed legal actions and controversies over scraping on platforms such as Facebook and Instagram.
Scale AI’s client list has included major tech firms and defence contractors, and the company has supplied data services used to develop commercially and strategically important models. The Guardian noted contracts with US defence suppliers, and other outlets have emphasised how a direct link between a dominant social platform and a major data‑labelling firm could reshape competitive dynamics in the AI industry; OpenAI has said it ceased purchasing services from the supplier in June 2025.
Scale AI told The Guardian that its Outlier platform offers flexible, project‑based work with transparent pay and that contributors choose when to participate. Workers who remain on the platform say the income is unpredictable and that project availability can shift abruptly, yet many continue because alternative work looks uncertain. "I have to be positive about AI because the alternative is not great," one contributor said.
Source Reference Map
Inspired by headline at: [1]
Sources by paragraph:
- Paragraph 1: [2]
- Paragraph 2: [3], [2]
- Paragraph 3: [2]
- Paragraph 4: [2]
- Paragraph 5: [4], [5], [6]
- Paragraph 6: [2], [3]
- Paragraph 7: [2]
Source: Noah Wire Services