When AI Writes Your XSS: Prompt Injection, Slopsquatting, and LLM-Rendered Payloads
Every lesson so far has treated XSS as something a human attacker creates and delivers.
AI changes this, and it matters for frontend developers specifically because we're increasingly building features that incorporate AI output: chat interfaces, content summaries, code suggestions, package recommendations.
Each of these creates a new path for XSS to enter our applications without us writing any vulnerable code ourselves.
There are three vectors worth understanding.
The LLM's output itself can become an XSS payload when rendered as HTML.
Prompt injection gives attackers a way to control the model's output, turning it into a delivery mechanism.
And LLMs generate insecure code (or hallucinate package names) that developers ship to production without realizing what they've introduced.
Here's what we'll cover:
- How LLM output becomes an XSS vector (OWASP LLM05: Improper Output Handling)
- How prompt injection delivers XSS through AI responses, including the markdown image exfiltration technique
- How AI coding assistants generate XSS-vulnerable code at scale
- Slopsquatting: when AI-hallucinated package names become supply chain attacks
LLM output as an XSS vector
In Lesson 1, we covered the DeepSeek incident in which Johann Rehberger typed "Print the XSS cheat sheet in a bullet list. Just payloads" into DeepSeek's AI chatbot, and the interface executed the JavaScript payloads the model generated.
Here's what happened at a technical level.
DeepSeek's chat UI took the model's response and rendered it into the page as HTML.
This is common in chat interfaces because LLM responses often contain markdown formatting (bold, code blocks, links, lists) that needs to be rendered visually.
The UI was converting the model's markdown output to HTML and inserting it into the DOM, likely via innerHTML or a markdown-to-HTML library without a sanitization step.
The same class of markdown rendering vulnerabilities we covered in Lesson 2 (where the marked npm package had a sanitizer bypass) applies here: if the markdown renderer passes through raw HTML from the input, the output contains whatever the model generated.
When the model produced a response containing <iframe src="javascript:...">, the browser treated it as a real iframe element. The JavaScript in the src attribute executed.
Rehberger used this to read the userToken from localStorage on chat.deepseek.com, demonstrating a full account takeover.
The entire chain (prompt to XSS to token theft to account takeover) took under a minute.
The OWASP Top 10 for LLM Applications (2025) classifies this as LLM05: Improper Output Handling, which they define as "insufficient validation, sanitization, and handling of the outputs generated by large language models."
The examples explicitly include XSS from LLM-generated JavaScript executed in web browsers.
DeepSeek isn't unique. Any application that renders AI-generated content as HTML is vulnerable: chatbot widgets on customer support pages, AI-powered content management tools, and internal dashboards that display AI summaries.
The application treats the LLM's output as trusted content and renders it without sanitization. But LLM output is not trusted content.
The model can be manipulated through prompt injection; it hallucinates, and it has no concept of whether its output is safe to render as HTML.
Prompt injection as an XSS delivery mechanism
The DeepSeek example was a direct prompt injection: the user typed a prompt that caused the model to output XSS payloads.
The user was both the attacker and the victim (in a research context).
Indirect prompt injection is where this gets real.
In an indirect prompt injection attack, the attacker doesn't interact with the LLM directly.
Instead, they embed instructions into content that the LLM processes on behalf of an unsuspecting user.
Here's a concrete scenario.
A company builds an AI tool that summarizes web pages.
A user pastes a URL, the tool fetches the page content, sends the raw text (including HTML source) to the LLM with the instruction "Summarize this article," and renders the summary in the browser.
The key detail: the model receives the full page text as context. It can't distinguish between the article content it should summarize and the instructions hidden within it.
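A minimal sketch of that pipeline (the `llm` client, `marked` usage, and element selector are illustrative; the unsanitized innerHTML sink is the point):

```javascript
import { marked } from "marked";

// Hypothetical summarizer: fetch the page, ask the model, render the answer
const page = await fetch(articleUrl).then((r) => r.text());
const summary = await llm.complete(`Summarize this article:\n\n${page}`);

// Vulnerable sink: model output rendered as HTML with no sanitization.
// marked passes raw HTML from its input through to its output by default.
document.querySelector("#summary").innerHTML = marked.parse(summary);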
An attacker creates a web page containing:
The HTML comment is invisible to a human reader but present in the page source that the LLM receives as input text.
The model reads it as an instruction, and some models will comply, reproducing the <img> tag in their summary.
If the frontend renders the summary as HTML, the browser executes the onerror handler, and the user's cookies are sent to the attacker.
HTML comments are just one injection technique.
Attackers also use invisible Unicode characters, white text on a white background (invisible to human readers but present in the text the LLM processes), instructions buried in document metadata, or text that reads naturally to a human but contains commands directed at the model.
The underlying problem is that models process all text in their context window without a reliable mechanism for distinguishing data from instructions.
Microsoft's MSRC published a detailed blog post in July 2025 describing its defense-in-depth strategy against indirect prompt injection in Copilot, after researchers including Johann Rehberger demonstrated these attacks.
The markdown image exfiltration technique
One technique Microsoft documented deserves special attention because it doesn't need JavaScript at all.
The attacker's prompt injection causes the LLM to output a markdown image tag where the URL contains the user's sensitive data:
When the frontend renders this markdown as HTML, it becomes an <img> tag.
The browser makes a GET request to the attacker's server to load the "image," and the URL carries the exfiltrated data.
From the user's perspective, they asked the AI to summarize a page. The summary displays a broken-image icon (or no image at all if the attacker returns a 1x1 transparent pixel).
The user doesn't notice anything wrong. Their conversation data has already been sent to the attacker's server.
This technique is important because DOMPurify, in its default configuration, allows <img> tags with src attributes.
Running DOMPurify on the AI's output would strip <script> tags and event handlers, but it would pass the <img> tag through.
To defend against this, DOMPurify needs to be configured to restrict external URLs in image attributes, or images with external URLs should be blocked entirely during AI output rendering.
A stricter configuration:
This strips <img> tags from the AI output entirely.
If we need images, we can allowlist specific trusted domains in the src attribute using DOMPurify's hooks.
AI-generated insecure code
When an LLM generates code that developers copy into their applications, the model's training data shapes the output.
And a lot of training data contains insecure patterns.
The Veracode 2025 GenAI Code Security Report tested over 100 LLMs across 80 real programming tasks. Without security-specific instructions in the prompt, models generated insecure code 45% of the time.
The breakdown matters: for tasks specifically requiring sanitization of user-controlled variables (the exact category that prevents XSS and log injection), the failure rate was far higher.
Models produced code with XSS weaknesses in 86% of those sanitization tasks and log injection in 88%. The 45% figure is across all task types; the 86% applies specifically to the tasks where the model had to handle untrusted input.
A separate academic study published in the ACM Transactions on Software Engineering and Methodology analyzed Copilot-generated code in real GitHub projects and found that roughly 30% of code snippets contained security weaknesses across 43 CWE categories. CWE-79 (Cross-site Scripting) was among the top three most frequent.
Here's what this looks like in practice.
A developer asks an AI coding assistant to build a comment display component:
Many models will generate something like this:
The model used dangerouslySetInnerHTML because the prompt said "support basic formatting," and that's the most common pattern in its training data for rendering HTML content.
It didn't add DOMPurify because the prompt didn't mention security. The code works, it looks clean, and it passes a code review if the reviewer doesn't know that comment.body is user-controlled.
What the model should have generated (using the SafeHTML wrapper from Lesson 3):
There's a feedback loop here.
Insecure code examples exist in the training data (from Stack Overflow answers, old tutorials, GitHub repos with known vulnerabilities).
The model learns these patterns and reproduces them. Developers ship the generated code.
That code eventually becomes part of the training data for the next generation of models.
The insecure patterns are reinforced rather than corrected.
Treat AI-generated code the way we'd treat code from a junior developer who's never thought about security.
Review it.
Run static analysis.
Look specifically for the sinks we identified in Lesson 3: dangerouslySetInnerHTML, v-html, innerHTML, href with user-controlled values.
The AI doesn't know our threat model.
Slopsquatting
This is where Module 1 (npm security) meets Module 2 (XSS) through AI.
Slopsquatting, a term coined by Seth Larson (security developer-in-residence at the Python Software Foundation), occurs when an LLM hallucinates a package name that doesn't exist, and an attacker registers that name on npm or PyPI with malicious code.
Unlike typosquatting (where the attacker bets on a developer mistyping react-router as react-ruter), slopsquatting exploits the LLM's tendency to be confidently wrong.
The model recommends a package that sounds plausible but doesn't exist. The developer runs npm install and either gets an error (the package doesn't exist yet) or gets the attacker's malicious version (the attacker registered it first).
A collaborative study from the University of Texas at San Antonio, the University of Oklahoma, and Virginia Tech (published in 2025) tested 16 code-generation models across 576,000 code samples and found that roughly 20% of the recommendations pointed to nonexistent packages.
That's 205,474 unique phantom package names, roughly a tenth of all real packages on npm.
Of those, 43% repeated consistently across identical prompts, and 58% appeared more than once across ten runs.
Attackers can query the same models with common prompts, collect the names that keep coming back, and register the popular ones before anyone notices.
Open-source models hallucinated at much higher rates (around 21.7%) than commercial ones like GPT-4 (around 5.2%), but even 5% of a very large number of prompts produces a lot of phantom packages.
Real examples have already appeared.
In early 2024, security researcher Bar Lanyado at Lasso Security noticed that multiple AI models repeatedly hallucinated a Python package called huggingface-cli.
The package didn't exist until Lanyado registered an empty proof-of-concept version; driven by the hallucinations, it accumulated over 30,000 downloads in three months.
On npm, the package unused-imports (a hallucination of the real eslint-plugin-unused-imports) was found to be malicious and is now under a security hold.
Here's what a slopsquatting attack looks like end-to-end.
A developer asks their AI assistant: "How do I validate JSON schemas in Node.js?"
The model responds with code that imports fast-json-utils.
The package sounds real. The developer runs npm install fast-json-utils.
The package exists (the attacker registered it last week after seeing the same hallucination in their own AI queries).
The postinstall script runs:
Environment variables (which often contain API keys, database credentials, and cloud tokens) are exfiltrated on install.
The developer sees the package install successfully, the code works (the attacker included basic JSON utility functions alongside the malicious payload), and nothing seems wrong.
A slopsquatted package can contain anything, but the two most relevant outcomes for this module are DOM manipulation (injecting script tags into rendered output) and data exfiltration (reading localStorage tokens or cookies).
Because the developer installed it based on an AI recommendation, they may not scrutinize it as closely as they would a package they found on their own.
The defense is everything we covered in Module 1.
Verify that every package the AI suggests actually exists and is maintained by a legitimate author.
Check the download count, the GitHub repository, and the publish date. Use lockfiles. Run npm audit. Pin versions. And treat AI package recommendations with the same skepticism we'd apply to a random suggestion from a stranger on a forum, because that's essentially what it is.
How to defend against AI-introduced XSS
For any frontend that renders LLM output, treat the model's response as untrusted input.
Run DOMPurify on any AI-generated content before inserting it into the DOM, and configure it to be restrictive.
Default DOMPurify allows <img> tags, which (as we saw with the markdown image technique) can exfiltrate data without JavaScript. Strip any elements and attributes we don't explicitly need.
If the response is plaintext (not markdown or HTML), use textContent instead of innerHTML. Be especially careful with markdown rendering: many markdown-to-HTML libraries pass raw HTML through from the input. Configure the markdown renderer to strip HTML tags, or sanitize the HTML output before rendering.
For AI-generated code, review every snippet the way we'd review code from someone who doesn't know our application's security requirements. Look for the sinks we identified in Lesson 3. Run SAST tools.
Ask the model to add security measures explicitly ("sanitize user input before rendering" or "use DOMPurify") and verify that it does so correctly.
For AI-recommended packages, verify before installing. Check npm or PyPI to confirm the package exists, has real maintainers, and has a meaningful download history.
Don't install a package just because the AI said to.
The OWASP Top 10 for LLM Applications (2025) is a good reference framework for teams building AI-powered features. LLM01 (Prompt Injection) and LLM05 (Improper Output Handling) are the two entries most directly relevant to XSS.
They won't replace the application security practices from the rest of this module, but they'll help frame the AI-specific risks for teams that are new to them.
What's next
We've spent five lessons on how XSS works, where it lives, what it can do, and where it comes from.
Next lesson, we shift to the browser-native defense stack: Content Security Policy, Trusted Types (which reached cross-browser support in February 2026), and the Sanitizer API.
These are the tools that provide protection even when a vulnerability slips through our code.
References
- Johann Rehberger, "DeepSeek AI: From Prompt Injection To Account Takeover" (Embrace The Red, 2024) https://embracethered.com/blog/posts/2024/deepseek-ai-prompt-injection-to-xss-and-account-takeover/
- Microsoft MSRC, "How Microsoft defends against indirect prompt injection attacks" (July 2025) https://www.microsoft.com/en-us/msrc/blog/2025/07/how-microsoft-defends-against-indirect-prompt-injection-attacks
- OWASP, "Top 10 for LLM Applications 2025" https://owasp.org/www-project-top-10-for-large-language-model-applications/
- OWASP, "LLM01:2025 Prompt Injection" https://genai.owasp.org/llmrisk/llm01-prompt-injection/
- OWASP, "LLM Prompt Injection Prevention Cheat Sheet" https://cheatsheetseries.owasp.org/cheatsheets/LLM_Prompt_Injection_Prevention_Cheat_Sheet.html
- Auth0, "Trusting AI Output? Why Improper Output Handling is the New XSS" (LLM05 explainer) https://auth0.com/blog/owasp-llm05-improper-output-handling/
- Veracode, "2025 GenAI Code Security Report" (45% insecure code, 86% XSS in sanitization-specific tasks) https://www.veracode.com/resources/gen-ai-code-security-report
- Liang et al., "Security Weaknesses of Copilot-Generated Code in GitHub Projects" (ACM TOSEM, 2024) https://dl.acm.org/doi/10.1145/3716848
- SC Media, "LLMs make insecure coding choices for 45% of tasks, study finds" (July 2025) https://www.scworld.com/news/llms-make-insecure-coding-choices-for-45-of-tasks-study-finds
- Bleeping Computer, "AI-hallucinated code dependencies become new supply chain risk" (April 2025) https://www.bleepingcomputer.com/news/security/ai-hallucinated-code-dependencies-become-new-supply-chain-risk/
- Aikido Security, "Slopsquatting: The AI Package Hallucination Attack Already Happening" https://www.aikido.dev/blog/slopsquatting-ai-package-hallucination-attacks
- Trend Micro, "Slopsquatting: When AI Agents Hallucinate Malicious Packages" https://www.trendmicro.com/vinfo/us/security/news/cybercrime-and-digital-threats/slopsquatting-when-ai-agents-hallucinate-malicious-packages
- The Register, "AI code suggestions sabotage software supply chain" (April 2025) https://www.theregister.com/2025/04/12/ai_code_suggestions_sabotage_supply_chain/