AI answer engine mass poisoning: 56% of Gemini 3’s correct answers have no source support

ChainNews ABMedia

According to a deep-dive analysis published by Pedro Dias in The Inference on April 21, 2026, AI model collapse is not a “future threat” for the industry to worry about; it is happening in real time, in a different form: AI Q&A engines cite web content generated by other AIs as authoritative sources at the very moment of the query. This contamination loop requires no model retraining at all. The piece’s core metaphor is “the Ouroboros learns to cite itself.”

Key Differences Between Model Collapse and Retrieval Contamination

The traditional concern about AI model degradation centers on model collapse: synthetic content progressively pollutes training data, degrading the quality of each successive model generation. This is a chronic risk that only becomes visible after multiple rounds of retraining.

Pedro Dias’ warning points to a different layer: retrieval contamination. Q&A engines built on RAG (retrieval-augmented generation), such as Perplexity, Google AI Overviews, ChatGPT, and Grok, fetch web content at the moment a user asks a question and use it as the basis for their answers. If the retrieved pages themselves contain erroneous AI-generated content, the engine presents it to readers as fact, and the contamination takes effect immediately, with no retraining involved.
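To make the mechanism concrete, here is a minimal sketch of the query-time pipeline the article describes. The index, helper names, and scoring below are illustrative assumptions, not any real engine’s internals; the structural point is that whatever the retriever returns flows straight into the prompt as trusted context, with no provenance check in between.

```python
# Minimal sketch of a query-time RAG pipeline. All names and data are
# hypothetical illustrations, not any real engine's API or index.

WEB_INDEX = [  # stand-in for a live web index; the second page is "poisoned"
    {"url": "https://example.com/changelog",
     "text": "Official changelog: no core algorithm update shipped in September 2025."},
    {"url": "https://seo-blog.example",
     "text": "Google rolled out the Perspective Core Algorithm Update in September 2025."},
]

def search_web(query: str, k: int = 5) -> list[dict]:
    """Stand-in retriever: naive keyword overlap against the index."""
    terms = set(query.lower().split())
    return sorted(WEB_INDEX,
                  key=lambda p: -len(terms & set(p["text"].lower().split())))[:k]

def answer(query: str) -> str:
    pages = search_web(query)
    # No check of who authored these pages or whether they are themselves
    # AI-generated: retrieval contamination enters the answer right here.
    context = "\n".join(f"[{p['url']}] {p['text']}" for p in pages)
    return (f"Sources:\n{context}\n\nQuestion: {query}\n"
            "Answer using only the sources above.")  # a real engine sends this to an LLM

print(answer("Was there a core algorithm update in September 2025?"))
```

Because this runs at query time, a poisoned page influences answers as soon as the index picks it up; no retraining is involved.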

Three Real-World Cases: AI Engines Being Fooled by the Fake Information They Generate Themselves

The author lists three specific events:

  1. The Lily Ray incident: Perplexity cited a so-called “September 2025 Perspective Core Algorithm Update” as authoritative information about a Google algorithm update. No such update exists; the source was an AI-generated SEO blog post with fabricated content.

  2. Thomas Germain’s test: reporter Thomas Germain published a test blog post declaring himself “the most powerful tech journalist for eating hot dogs.” Within 24 hours it was ranked first and cited by Google AI Overviews and ChatGPT, and the engines even fabricated a nonexistent “North Dakota championship tournament” as supporting evidence.

  3. Grokipedia: Musk’s xAI encyclopedia project has generated or rewritten 885,279 articles, including factual errors (for example, the death date of Canadian singer Feist’s father was recorded incorrectly) and unsourced claims with no supporting evidence. By mid-February 2026, Grokipedia had lost most of its visibility on Google.

Oumi Research: Gemini 3 Is Highly Accurate, but 56% of Its Correct Answers Have No Sources

In an evaluation commissioned by the NYT and conducted by Oumi, Gemini 2 scored 85% accuracy on the SimpleQA benchmark, and Gemini 3 improved that to 91%. But the same test showed that 56% of Gemini 3’s correct answers are “ungrounded”: the model gets the answer right but provides no verifiable supporting source. For Gemini 2, the corresponding proportion was 37%.
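Combining the two reported percentages yields a derived figure worth spelling out (a back-of-the-envelope calculation from the numbers above, not a figure from the Oumi report): the share of answers that are both correct and grounded.

```python
# Back-of-the-envelope: share of answers that are both correct and
# grounded, derived from the percentages reported above.
for model, accuracy, ungrounded_among_correct in [
    ("Gemini 2", 0.85, 0.37),
    ("Gemini 3", 0.91, 0.56),
]:
    grounded_correct = accuracy * (1 - ungrounded_among_correct)
    print(f"{model}: ~{grounded_correct:.1%} correct AND grounded")
# Output (roughly): Gemini 2 ≈ 53.5%, Gemini 3 ≈ 40.0%
```

By this measure, the share of answers a reader can both trust and independently verify falls from roughly 54% to roughly 40% between generations.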

The new generation of models is therefore “more accurate on the surface” while regressing in answer traceability. For media, research, and fact-checking scenarios, this regression is more damaging than a raw error rate, because readers cannot trace an answer back to an authoritative original document to verify it themselves.

Industry Scale: Google AI Overviews Reach 2 Billion Users

The scale of the problem is enormous: Google AI Overviews has more than 2 billion monthly active users, Google handles over 5 trillion searches per year, and ChatGPT has nearly 900 million weekly active users (50 million of them paying). In other words, for the vast majority of internet users, the channels through which they obtain factual information already pass through a Q&A engine layer where they may be exposed to contamination from AI-generated content.

A separate Ahrefs study shows that 44% of the sources cited by ChatGPT are “best X” style list articles. These are precisely the AI-generated pieces the SEO industry mass-produces to offset traffic lost to Q&A engines, and they in turn form a major source of contamination for those same engines.
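As a rough illustration of how such a share might be measured, here is a toy classifier for “best X” listicle titles. The regular expression and sample titles are illustrative assumptions, not Ahrefs’ methodology.

```python
import re

# Toy heuristic for flagging "best X" listicle-style sources.
# Illustrative only; not Ahrefs' actual classification method.
LISTICLE = re.compile(r"^(the\s+)?(\d+\s+)?best\b|top\s+\d+", re.IGNORECASE)

cited_titles = [                      # hypothetical sample of cited pages
    "10 Best CRM Tools in 2026",
    "The Best Budget Laptops",
    "Quarterly Earnings Report, Q1 2026",
    "Top 5 VPNs for Streaming",
]
share = sum(bool(LISTICLE.search(t)) for t in cited_titles) / len(cited_titles)
print(f"listicle share: {share:.0%}")  # listicle share: 75%
```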

Structural Conclusion: The Citation Layer Has Decoupled From Reliable Author Identity

The author’s final conclusion: the citation layer of AI Q&A engines has decoupled from reliable author identity. The SEO industry produces AI content → Q&A engines ingest it as fact → readers believe it → the SEO industry is rewarded and produces even more AI content, completing a self-reinforcing contamination loop. At present, the industry lacks any clear accountability mechanism that would hold AI engines responsible for the quality of the sources they cite.
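The loop can also be written down as a toy dynamical sketch: if engines cite pages in proportion to their share of results, and the SEO industry reinvests won citations into more AI content, the AI-generated share compounds. All the rates below are made-up illustrative parameters, not measurements.

```python
# Toy simulation of the self-reinforcing contamination loop.
# All rates are made-up illustrative parameters, not measurements.
ai_share = 0.10       # initial share of AI-generated pages in results
reinvestment = 0.30   # new AI content produced per unit of citations won
churn = 0.05          # share of AI pages demoted or filtered each cycle

for cycle in range(1, 11):
    citations_won = ai_share                      # engines cite what ranks
    ai_share += reinvestment * citations_won * (1 - ai_share)
    ai_share *= (1 - churn)                       # weak filtering pressure
    print(f"cycle {cycle}: AI-generated share ≈ {ai_share:.1%}")
```

Under these parameters the share only grows, because the reinvestment incentive outweighs the filtering pressure; a binding accountability mechanism would have to invert that inequality.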

For users, this means that, for now, answers from Perplexity, AI Overviews, and ChatGPT cannot be treated as the end point of fact-checking; you still need to trace claims back to official primary sources yourself to confirm their accuracy.

