RE: LeoThread 2025-10-18 23-22

Part 6/10:

One concrete example is the potential to engineer code or chatbot behavior. By creating fake GitHub repositories or injecting malicious scripts into open-source projects, a threat actor could subtly guide the model to generate harmful or biased code, or to promote specific agendas. This not only affects the model’s training but can also influence downstream applications like chatbots and automated systems.

How Easily Can These Attacks Be Systematized?

The research underscores just how easy it is to carry out such poisoning attacks. As few as 250 malicious documents—roughly 420,000 tokens—are sufficient to compromise models, which constitute a mere 0.0016% of the training corpus. These documents could be simple GitHub repos, blog posts, or social media content.