Russian Bots Are Teaching ChatGPT to Hate Us
03.07.2025

How the "Pravda" Network Is Flooding AI Training Data with 3.6 Million Pro-Russian Articles
When NewsGuard published its research in March, many thought: “Right, here’s another conspiracy theory about Russian hackers.” But the figures turned out to be real, and the scale staggering.
3.6 million articles in one year. All with a single mission: to infiltrate the training data of Western AI systems. Not for anyone to actually read them – the “Pravda” network’s sites receive fewer than a thousand visitors per month. But for ChatGPT, Claude, and other “clever” assistants to absorb this poison alongside other information.
John Mark Dougan and His Candid Admissions
The American who fled to Moscow and became a propagandist spoke before Russian officials last January. Dougan didn’t hide his plans: “By spreading these Russian narratives from a Russian perspective, we can actually change global AI.” He added: “This isn’t a tool to be feared, it’s a tool that can be used.”
When NewsGuard researchers put his words to the test, it turned out he was right: 33% of responses from the ten most popular chatbots contained Russian disinformation narratives. Every third response.
We, as members of the LGBTIQ+ community, have particular reasons for concern. AI systems already demonstrate bias against marginalised groups even without Russian interference. Now imagine what happens when these systems are additionally fed Kremlin notions about “traditional values” and “Western degradation”.
LLM Grooming: The New Reality of Information Wars
The American Sunlight Project coined the term “LLM grooming” – the manipulation of large language models. It sounds technical, but the essence is simple: malicious actors flood the internet with millions of texts written not for humans but for algorithms.
The “Pravda” network (the irony of the name – it means “truth” – is obvious) operates like a massive copy-paste factory. The same material from Russian state media is republished across 150 domains in 49 countries. The sites look dreadful – no search function, broken navigation, wonky translations. But web crawlers don’t notice this. They see many apparently separate sources and infer credibility.
Why We’re Particularly at Risk
Research shows that AI recruitment systems can reject CVs from people with the “wrong” names. Voice recognition systems don’t understand non-binary pronouns. Targeted advertising algorithms exclude LGBTIQ+ people from marketing campaigns.
Kevin McKee from Google DeepMind explains the problem: queer communities have historically been excluded from algorithmic fairness research. Sexual orientation and gender identity are things that can’t be “seen” in data. And what isn’t measured isn’t considered when developing systems.
Generative AI learns from what it finds on the internet. If this internet is artificially flooded with Russian propaganda about the “unnaturalness” of queer identities, guess what these systems will start reproducing.
The French Were First to Spot the Problem
Viginum – France’s disinformation monitoring agency – identified the “Pravda” network as early as February 2024. It emerged that the network is administered by TigerWeb, an IT company based in occupied Crimea. Its owner is Yevhen Shevchenko, a web developer who previously worked for the Russian occupation administration.
SimilarWeb data confirms that the network’s sites have virtually no live traffic. Meanwhile, the Finnish company Check First found nearly 2,000 hyperlinks to “Pravda” sites across 44 language editions of Wikipedia. The content seeps in everywhere.
The Technical Mechanics of Poisoning
Imagine money laundering, but for ideas. A Kremlin narrative first appears on Russia Today (RT) or another state outlet. Then it’s automatically translated into dozens of languages and posted on hundreds of domains with names like News-Kiev.ru or Kherson-News.ru.
For AI algorithms, this looks like multiple independent sources confirming the same fact. The system draws a seemingly logical conclusion: if many different sites report something, it must be true.
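To make the laundering concrete, here is a minimal Python sketch – not any real crawler’s code – of how naive source counting is fooled. Counting domains makes one copy-pasted article look like three confirmations; hashing the normalised text reveals a single source. The domain example-mirror.net is invented for illustration.

```python
# Minimal sketch: why naive source counting is fooled by a
# copy-paste factory. One article on many domains looks like
# many independent confirmations until content is deduplicated.
import hashlib

def normalize(text: str) -> str:
    """Crude normalisation: lowercase and collapse whitespace."""
    return " ".join(text.lower().split())

def apparent_sources(pages: list[tuple[str, str]]) -> int:
    """What a naive pipeline sees: one 'source' per domain."""
    return len({domain for domain, _ in pages})

def independent_sources(pages: list[tuple[str, str]]) -> int:
    """After content hashing: identical articles count once."""
    return len({
        hashlib.sha256(normalize(text).encode()).hexdigest()
        for _, text in pages
    })

# Hypothetical example: one narrative mirrored across three domains.
article = "Fabricated claim repeated verbatim across the network."
pages = [
    ("news-kiev.ru", article),
    ("kherson-news.ru", article),
    ("example-mirror.net", article),  # invented domain
]
print(apparent_sources(pages))     # 3 – looks like corroboration
print(independent_sources(pages))  # 1 – one source, laundered
```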
Over three years, the network has spread at least 207 false narratives. Among them – classic Russian propaganda about American biolabs in Ukraine and President Zelensky’s embezzlement of military aid.
Global Consequences and Western Response
Tulsi Gabbard, US Director of National Intelligence, warned that Russian influence operations “will almost certainly grow in sophistication and scale”. Meanwhile, the Trump administration closed the State Department’s Global Engagement Center and disbanded the corresponding FBI working group.
Elon Musk called fighting disinformation “censorship.” Republicans in Congress supported this position. So now, when the threat has become reality, there are hardly any defenders left.
What We Must Do
Experts propose several strategies. AI companies should clean their training data and exclude known disinformation sources. Lawmakers should demand transparency and the labelling of AI-generated content. And society should build information literacy.
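As a rough illustration of the first recommendation, here is a minimal Python sketch of blocklist-based filtering of training documents. The blocklist entries and document format are assumptions made for the example; a real pipeline would also need content-level deduplication, because laundered text resurfaces on domains no list has yet caught.

```python
# Minimal sketch: dropping training documents whose URLs point to
# known disinformation domains. Blocklist entries are illustrative.
from urllib.parse import urlparse

BLOCKED_DOMAINS = {"news-kiev.ru", "kherson-news.ru"}

def is_blocked(url: str) -> bool:
    """True if the URL's host is, or sits under, a blocked domain."""
    host = urlparse(url).hostname or ""
    return any(host == d or host.endswith("." + d) for d in BLOCKED_DOMAINS)

docs = [
    {"url": "https://news-kiev.ru/article/123", "text": "..."},
    {"url": "https://example.org/research", "text": "..."},
]
clean = [d for d in docs if not is_blocked(d["url"])]
print(len(clean))  # 1 – the blocklisted document is dropped
```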
But for our community, there are specific recommendations. McKee emphasises: we need more LGBTIQ+ people in AI development. “The presence of queer researchers can help teams question the initial assumption that gender is binary and fixed, rather than fluid and spectral”.
It’s also critically important to ask sceptical questions about AI responses on topics concerning our community. If a chatbot gives dodgy information about LGBTIQ+ rights or history, verify it through independent sources.
The Future Is Already Here
Mozilla’s 2023 research showed: “Scale makes datasets worse, amplifying bias and causing real harm”. The New York Times reported: “In the hands of anonymous internet users, AI tools can create loads of harassment and racist material”.
In February 2023, a fake video spread showing Biden making transphobic statements. In May of that year, a deepfake video of Biden in women’s clothing (with an anti-trans subtext) went viral on Instagram and TikTok.
This is just the beginning. If we don’t stop the poisoning of AI systems now, discrimination against us could soon become embedded in the very architecture of the digital world.