AI Can Now Unmask Anonymous Internet Users, New Study Finds

02-28-2026 • https://www.zerohedge.com, by Tyler Durden

The finding comes from a new study by Simon Lermen (MATS), Daniel Paleka (ETH Zurich), Joshua Swanson (ETH Zurich), Michael Aerni (ETH Zurich), Nicholas Carlini (Anthropic), and Florian Tramèr (ETH Zurich), published on arXiv.

In the paper, "Large-Scale Online Deanonymization with LLMs," the researchers show that modern large language models (LLMs) can re-identify people behind pseudonymous online accounts at a scale and accuracy that far surpass previous techniques.

The study's core contribution is an automated deanonymization pipeline powered by LLMs. Instead of relying on structured datasets or hand-engineered features, as earlier attacks on the Netflix Prize dataset did, the system works directly on raw, unstructured text.

Given posts, comments, or interview transcripts written under a pseudonym, the pipeline extracts identity-relevant signals, searches for likely matches using semantic embeddings, and then uses higher-level reasoning to verify the most promising candidates while filtering out false positives. The result is a scalable attack that mirrors—and in some cases exceeds—the effectiveness of a dedicated human investigator.
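The retrieve-then-verify shape of that pipeline can be illustrated with a toy sketch. Everything in it is an assumption made for illustration, not the researchers' code: a bag-of-words count vector stands in for a semantic embedding, a score-and-margin rule stands in for the LLM reasoning step, and the function names, thresholds, and placeholder texts are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a semantic embedding: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(query_text, candidates, top_k=2):
    """Search step: score every candidate text and keep the top_k."""
    q = embed(query_text)
    scored = [(cosine(q, embed(text)), name) for name, text in candidates.items()]
    scored.sort(reverse=True)
    return scored[:top_k]


def verify(scored, min_score=0.5, min_margin=0.1):
    """Verification step (hypothetical rule): abstain unless the best
    candidate is both strong and clearly ahead of the runner-up."""
    if not scored:
        return None
    best_score, best_name = scored[0]
    runner_up = scored[1][0] if len(scored) > 1 else 0.0
    if best_score >= min_score and best_score - runner_up >= min_margin:
        return best_name
    return None


# Fictional placeholder texts, not real profiles.
candidates = {
    "profile_a": "rust compiler engineer in zurich",
    "profile_b": "pastry chef in lyon",
}
match = verify(rank_candidates("i work on a rust compiler in zurich", candidates))
```

The point of the abstain branch is that a system tuned for high precision declines to name anyone when no candidate stands out, rather than always returning its best guess.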

To evaluate their approach, the researchers constructed three datasets with known ground truth. The first links pseudonymous Hacker News users to real-world LinkedIn profiles, relying on cross-platform clues embedded in public text. The second matches users across movie discussion communities on Reddit. The third takes a single Reddit user's history, splits it into two time-separated profiles, and tests whether the system can reconnect them.
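The third dataset's construction can be sketched as follows. This is a minimal illustration under assumptions, not the paper's procedure: the function name, the cutoff date, the choice to discard posts that fall inside the gap, and the sample posts are all hypothetical.

```python
from datetime import datetime, timedelta


def split_history(posts, cutoff, gap=timedelta(days=365)):
    """Split one user's (timestamp, text) posts into two profiles.

    Posts before `cutoff` form the earlier profile; posts at or after
    `cutoff + gap` form the later one. Posts inside the gap are dropped
    (an assumption of this sketch) so the two profiles never overlap in time.
    """
    earlier = [text for ts, text in posts if ts < cutoff]
    later = [text for ts, text in posts if ts >= cutoff + gap]
    return earlier, later


# Fictional placeholder history.
posts = [
    (datetime(2020, 1, 1), "post one"),
    (datetime(2020, 6, 1), "post two"),
    (datetime(2021, 3, 1), "post three"),
    (datetime(2021, 8, 1), "post four"),
]
earlier, later = split_history(posts, cutoff=datetime(2020, 7, 1))
```

Because both halves come from the same account, the ground truth is known by construction, which is what makes the split usable as a test.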

Across all three settings, LLM-based methods dramatically outperformed classical baselines, which often achieved near-zero recall.

The headline numbers are striking. In some experiments, the system achieved up to 68% recall at 90% precision, meaning it correctly identified as many as 68% of targets while about nine in ten of the matches it asserted were correct. Even when matching temporally split Reddit accounts separated by a year, performance remained strong. In contrast, traditional non-LLM approaches struggled to produce meaningful matches. The findings suggest that advances in reasoning and representation learning have transformed deanonymization from a niche, data-hungry attack into a broadly applicable capability.
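To make "recall at 90% precision" concrete: assuming recall is counted over all targets, 68% recall on 100 targets means 68 correct identifications, and 90% precision implies roughly 76 matches asserted in total, about 8 of them wrong. The sketch below shows one common way such a figure can be computed from confidence-scored guesses. It is an illustration under assumptions, not the paper's evaluation code; the function name and sample data are hypothetical, and confidences are assumed to be distinct.

```python
def recall_at_precision(predictions, num_targets, min_precision=0.9):
    """Highest recall reachable while precision stays at or above min_precision.

    predictions: list of (confidence, is_correct) pairs, one per asserted match.
    num_targets: total number of targets, including those with no guess.
    """
    ranked = sorted(predictions, key=lambda p: p[0], reverse=True)
    best_recall = 0.0
    correct = 0
    # Lower the confidence cutoff one guess at a time and track precision.
    for made, (_, is_correct) in enumerate(ranked, start=1):
        correct += int(is_correct)
        precision = correct / made
        if precision >= min_precision:
            best_recall = max(best_recall, correct / num_targets)
    return best_recall


# Fictional placeholder scores: four guesses across five targets.
sample = [(0.9, True), (0.8, True), (0.7, False), (0.6, True)]
strict = recall_at_precision(sample, num_targets=5)
relaxed = recall_at_precision(sample, num_targets=5, min_precision=0.75)
```

Relaxing the precision floor lets the system accept lower-confidence guesses, which raises recall at the cost of more wrong matches.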

A key concern, the study says, is that the attack pipeline is composed of individually benign steps: summarizing text, generating embeddings, ranking candidates, and reasoning over matches. No single component appears inherently malicious, making the attack difficult to detect or restrict through conventional safeguards. Moreover, the study finds that increasing model reasoning effort improves deanonymization performance, implying that as frontier models become more capable, the attack may become even more effective by default.

The broader implication is that "practical obscurity"—the idea that scattered, pseudonymous posts are safe because linking them is too labor-intensive—may no longer hold.
