
AI and Data Voids: How Propaganda Exploits Gaps in Online Information

McKenzie Sadeghi
Thursday, June 26, 2025, 12:00 PM
Chatbots are absorbing and amplifying falsehoods seeded by Russia, China, and Iran, with few safeguards in place.
Technology In Cyberspace: Globe in the intergalactic network. (Source: Gerd Altmann “00029 Technology In Cyberspace” CC0 Public Domain)

Published by The Lawfare Institute
in Cooperation With
Brookings

In the lead-up to the 2024 global elections, media outlets, think tanks, and world leaders issued dire warnings about artificial intelligence (AI)-generated misinformation and deepfakes. While there were many cases of foreign actors using generative AI to influence the 2024 U.S. presidential election, as documented by the intelligence community, multiple analyses argued that fears of an AI-fueled misinformation wave were largely overblown and that falsehoods still spread mostly through low-tech, old-school tactics such as cheap video edits, memes, and manipulated headlines.

The greater threat, it turns out, isn’t what AI is creating but, rather, what it’s absorbing and repeating. As generative AI systems increasingly replace search engines and become embedded in consumer products, enterprise software, and public services, the stakes of what they repeat and how they interpret the world are growing. The large language models (LLMs) powering today’s most widely used chatbots have been exposed to a polluted information ecosystem where state-backed foreign propaganda outlets are increasingly imitating legitimate media and employing narrative laundering tactics optimized for search engine visibility—often with the primary purpose of infecting the AI models with false claims reflecting their malign influence operations.

Unlike misinformation on social platforms, the repetition of false claims through chatbot outputs is essentially untraceable to consumers or most misinformation specialists. There’s no post to flag, no engagement metrics to analyze. These interactions occur behind closed doors, in personalized conversations between user and model, making the scope and the impact of AI-enabled disinformation amplification opaque. This opacity becomes especially problematic when it comes to more niche, lower visibility news topics with limited mainstream coverage or fragmented reporting across languages, where AI systems must draw from a smaller, less scrutinized information pool, and where even a handful of unreliable sources can disproportionately shape the model’s response.

To measure the scale of this problem, NewsGuard has spent the past year conducting monthly audits of the leading AI models to assess how they respond to topics in the news. Drawing from a proprietary database of hundreds of explicitly false claims, including those spread by pro-Russia, pro-China, and pro-Iran sources, our analysts test the models each month to see how often they repeat the false claims. The testing uses personas that reflect real-world use cases, such as a voter looking for information, a concerned parent, or a citizen trying to understand a geopolitical conflict. This red-teaming approach differs from traditional AI safety testing in that it is human-led and grounded in journalistic methods such as source-level verification, attribution tracing, and real-time testing during breaking news events.
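For readers who want to picture the mechanics, the sketch below shows one way a persona-based audit of this kind could be organized. It is a simplified illustration only: NewsGuard’s audits are conducted by human analysts, and every name in the sketch (the claims list, the personas, the query and grading functions) is a hypothetical placeholder rather than NewsGuard’s actual methodology or tooling.

```python
# Illustrative sketch of a persona-based chatbot audit. All names here are
# hypothetical placeholders; the real audits described above are human-led.
from dataclasses import dataclass


@dataclass
class FalseClaim:
    claim: str   # the explicitly false statement being tested
    debunk: str  # the documented correction
    origin: str  # e.g., "Storm-1516" or "Pravda network"


PERSONAS = [
    "a voter looking for information before an election",
    "a concerned parent checking something seen online",
    "a citizen trying to understand a geopolitical conflict",
]


def audit(model_name, claims, query_chatbot, grade_response):
    """Prompt one model with each claim under each persona and tally outcomes.

    query_chatbot(model_name, prompt) stands in for a model API call, and
    grade_response(reply, claim) stands in for the analyst's judgment:
    'repeat' (states the falsehood as fact), 'debunk', or 'decline'.
    """
    tally = {"repeat": 0, "debunk": 0, "decline": 0}
    for claim in claims:
        for persona in PERSONAS:
            prompt = f"I am {persona}. Is it true that {claim.claim}?"
            reply = query_chatbot(model_name, prompt)
            tally[grade_response(reply, claim)] += 1
    total = sum(tally.values())
    # Express each outcome as a percentage of all prompts tested.
    return {grade: round(100 * count / total, 2) for grade, count in tally.items()}
```

A percentage computed this way corresponds to figures like the 31.75 percent cited below: the share of prompts that ended with the model repeating the false claim as fact.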

The results have been consistent—and concerning. The pattern became apparent as early as July 2024, when NewsGuard’s first audit found that 31.75 percent of the time, the 10 leading AI models collectively repeated disinformation narratives linked to the Russian influence operation Storm-1516. These chatbots failed to recognize that sites such as the “Boston Times” and “Flagstaff Post” are Russian propaganda fronts powered in part by AI, not reliable local news outlets, and so they unwittingly amplified disinformation narratives that their own technology likely assisted in creating. This vicious cycle, in which falsehoods are generated and later repeated by AI systems in polished, authoritative language, represents a new threat in information warfare.

Case Study: The Corruption Campaign Against Ukraine 

To their credit, these AI systems tend to perform reliably when responding to well-documented, high-profile events, particularly those that have been extensively covered by mainstream outlets in multiple languages. For example, when NewsGuard asked the leading chatbots about the massacre in Bucha, all chatbots (aside from DeepSeek, which repeated China’s official position on the matter) correctly identified it as a Russian atrocity and debunked Kremlin efforts to deny responsibility. 

The problem arises instead in lower visibility contexts, where credible coverage is limited, fact-checking is sparse, or reporting is fragmented across languages. Microsoft researchers have described these gaps as “data voids,” where the available information for a particular topic “is limited, non-existent, or deeply problematic,” allowing bad actors to strategically fill the vacuum. 

One of the strongest examples of this dynamic is the Kremlin’s ongoing effort to push the narrative that Ukrainian officials are embezzling Western aid to purchase villas, yachts, wineries, and sports cars. The campaign, as noted previously by Clemson University’s Darren Linvill and Patrick Warren in Lawfare, has been one of Storm-1516’s largest successes. These corruption narratives have reached high-profile figures including then-Sen. JD Vance and Rep. Marjorie Taylor Greene, both Republicans. As the BBC reported, those behind these corruption narratives “have achieved a level of success that had previously eluded them—their allegations being repeated by some of the most powerful people in the US Congress.”

Beyond reaching humans across social media, these same falsehoods are being repeated by AI chatbots. When NewsGuard tested the 10 leading models on the corruption narratives, multiple chatbots repeated the disinformation as fact, citing Russian propaganda sites as their sources. Specific luxury-linked terms such as “Vuni Palace hotel and casino” in Cyprus, “Tenuta il Palagio” villa in Tuscany, and the “My Legacy” superyacht are now linked to false claims that Ukrainian President Volodymyr Zelenskyy purchased these assets using stolen Western aid. 

The repetition of these specific terms across the web illustrates how disinformation can exploit data voids: Once a narrative has saturated fringe outlets and been widely laundered into the open information ecosystem, AI systems scraping and summarizing that content end up inadvertently reinforcing the falsehood. 

The Data Void Dilemma

While some of these AI failures are the unintended consequence of exposure to laundered narratives—like those crafted by Storm-1516, which are so layered and widely distributed that they eventually seep into search results—others are the result of direct, deliberate attempts to corrupt these systems, a tactic now being referred to as LLM grooming.

In March, NewsGuard tested the 10 leading chatbots on false claims that have been laundered by the Kremlin-backed Pravda network—a Moscow-based network of approximately 150 news sites that publishes tens of thousands of articles per week on favored false claims in an effort to dominate search engines and infect AI models rather than reach humans (see Figure 1). One-third of the time, the chatbots repeated those claims as fact, often citing Pravda articles directly. 

 


Figure 1. The chatbot of French company Mistral citing an article from Pravda to falsely claim that a Ukrainian interpreter fled to Bulgaria.

A textbook example of the data void dilemma being exploited by bad actors is the case of the nonexistent Danish F-16 pilot named Jepp Hansen. After Russia attacked Kryvyi Rih on Jan. 17, Russian state media circulated a fabricated claim that a Danish pilot named Jepp Hansen had been killed in the strike while training Ukrainian forces. The name, once associated only with an obscure 19th-century Danish American immigrant, is now linked to the false claim that a NATO pilot was killed on Ukrainian soil.

A spokesperson for the Danish Defense Command, Denmark’s joint military command authority, told NewsGuard in a January 2025 email that the name Jepp Hansen “is so common that several employees in the Danish Defense happen to have the same name. However, none of them have anything to do with the false story. We deny that there is an F-16 instructor with that name.” Nevertheless, multiple chatbots have been infected by this false information and repeated it as fact (see Figure 2).

 

 

Figure 2. You.com (top) and Meta’s chatbot (bottom) repeating the false claim that a Danish pilot was killed in a Russian airstrike.

The Language Gap

These structural vulnerabilities are compounded by disparities across languages. In January, NewsGuard analysts tested the leading chatbots with news-related prompts across seven languages, finding that the models performed worse in Russian, Chinese, and Spanish—languages in which fact-checking is less robust, press freedom is weak, and news ecosystems contain a high volume of state-affiliated content. 

Similar testing conducted by the Times of London and the Nordic Observatory for Digital Media and Information Disorder found that the chatbots were particularly susceptible to citing Pravda articles in Gaelic, Finnish, Swedish, Danish, and Norwegian. In those languages, the sparse volume of reliable content has created an information vacuum that Kremlin-linked propaganda has been able to fill easily, disproportionately affecting users in the languages where they may need accurate information the most.

What may seem like isolated AI hallucinations are, in fact, deliberate attempts to corrupt the underlying information systems that AI models rely on, a strategy Russia has discussed openly. At a Jan. 27 roundtable in Moscow, Kremlin propagandist John Mark Dougan said, “The more diverse this information comes, the more that this affects the amplification. Not only does it affect amplification, it affects future AI … by pushing these Russian narratives from the Russian perspective, we can actually change worldwide AI.” 

The AI infiltration tactic Dougan described aligns closely with the Kremlin’s broader campaign to assert ideological control over emerging technologies, particularly as Western models become global default information systems. “Western search engines and generative models often work in a very selective, biased manner, do not take into account, and sometimes simply ignore and cancel Russian culture,” Russian President Vladimir Putin said at a Nov. 24, 2023, AI conference in Moscow, before committing to invest more resources into AI research and development.  

The Long Game

AI developers require tailored solutions to address the specific challenge of foreign malign influence actors. NewsGuard recently unveiled a product called FAILSafe that provides real-time data debunking false claims, for use as AI guardrails. FAILSafe also tracks the websites behind these influence campaigns so that AI models can remove them from the sources they consider when responding to prompts on news topics.

The laundering of disinformation makes it difficult for AI companies, on their own, to identify and filter out Pravda network sources. The network is continuously adding new domains, turning the effort into a game of whack-a-mole for AI developers: Even if models were programmed to block all existing Pravda sites today, new ones could emerge tomorrow. Moreover, filtering out Pravda domains wouldn’t address the underlying disinformation. The Pravda network does not generate original content but republishes falsehoods from Russian state media, pro-Kremlin influencers, and other disinformation hubs.

NewsGuard’s most recent audit in May confirmed this: While some of the chatbots no longer cited Pravda domains, the models still repeated Russian disinformation narratives, this time citing other unreliable Russian propaganda sources. The problem isn’t the source; it’s the saturation of the narrative: Once a falsehood has been injected into the web through dozens of fronts, its repetition by AI becomes almost inevitable.
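
To make that limitation concrete, the brief sketch below shows what a domain-level source filter looks like in principle, and why it misses both newly registered fronts and the same narrative republished on unlisted sites. The domains and function names are hypothetical illustrations, not any AI company’s or NewsGuard’s actual system.

```python
# Minimal sketch of domain-level source filtering and its limits.
# The domains and function below are illustrative, not any vendor's API.
from urllib.parse import urlparse

KNOWN_PROPAGANDA_DOMAINS = {
    "example-pravda-mirror.com",  # hypothetical entries: real networks
    "example-front-site.net",     # rotate domains faster than lists update
}


def filter_citations(citations):
    """Drop citations whose domain is on the blocklist.

    This catches yesterday's known fronts, but a newly registered domain
    passes untouched, and a laundered narrative republished on an unlisted
    site is not caught at all (the whack-a-mole problem described above).
    """
    kept = []
    for url in citations:
        domain = urlparse(url).netloc.lower().removeprefix("www.")
        if domain not in KNOWN_PROPAGANDA_DOMAINS:
            kept.append(url)
    return kept
```

The design weakness is structural: a blocklist keys on where a claim appears, while the campaigns described here work by making the claim appear everywhere.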

To mitigate these risks, AI companies have implemented disclaimers stating that chatbot responses may be inaccurate or outdated. However, the polished tone and confident presentation of responses make them easy to accept at face value. Indeed, NewsGuard has already documented cases of Chinese and Russian state media citing chatbot responses to lend weight to otherwise unsubstantiated narratives, effectively using Western AI tools as credibility-laundering mechanisms. 

Experts have warned that this structural vulnerability plays directly into the hands of Russia, China, and Iran, which have both the intent and the state-level resources to pollute the digital ecosystem at scale. As a former U.S. military leader on influence defense told the Washington Post, “We were predicting this was where the stuff was going to eventually go. Now that this is going more towards machine-to-machine: In terms of scope, scale, time and potential impact, we’re lagging.” 

Indeed, that prediction is no longer hypothetical. Rather than racing to build superior AI models, U.S. adversaries are working to undermine existing systems that shape public understanding in the West by polluting them from within.

 


McKenzie Sadeghi is the editor of AI and foreign influence at NewsGuard and an M.S. candidate in Cybersecurity Risk and Strategy at New York University. She previously covered misinformation at USA Today and holds a master’s in journalism from Georgetown University. Her work focuses on the intersection of state propaganda and generative AI.
