Cross-site scripting (also known as XSS) is a type of attack aimed at web application users. An attacker injects client-side code (typically JavaScript) into a vulnerable web application in such a way that the script runs in the browsers of users who visit the vulnerable page.

Cross-Site Scripting / XSS attack scenario.

Imagine that you’ve built a web application that lets your users send private messages to each other. One of the users finds out that you do not encode messages, so it is possible to send raw HTML or JavaScript code to another person. The user decides to send this message to a buddy:

<div style="position:absolute;top:0;left:0;bottom:0;right:0;background:black">
    <h1>Surprise</h1>
</div>

<script>alert('peek-a-boo');</script>

As you can imagine, this is what the other user would see:

Screenshot of XSS “peek-a-boo” example.

What can be done with XSS

A Cross-Site Scripting vulnerability can be exploited in many different ways, e.g.:

  • session hijacking/stealing cookies
  • performing unauthorized activities
  • corrupting/replacing page contents
  • phishing/keylogging
  • hosting malware

How to prevent XSS attacks

Before we consider how to protect our application against XSS, we need to consider what we allow our users to do. Security countermeasures depend on the context, the location (e.g. script tag, style tag, div tag on your page) and the functionality. Let’s take a look at a few practical examples.

Displaying “text” data sent by user

In this scenario we want to print any text sent by the user. The data will be rendered in a container and we don’t want to allow any formatting.

The most basic way is to HTML-encode all data. This means that all “special” characters are replaced by HTML entities. The browser in turn treats entities as literal characters to print, instead of as HTML code. Take a look at this example:

Input: <script>alert('hello');</script>
HTML encoded: &lt;script&gt;alert('hello');&lt;/script&gt;

Without encoding, the input string would show a JavaScript alert; the encoded string would just print the text.

In ASP.NET Core Razor views, output written with @ is HTML-encoded by default, so whenever you render a string this way it is encoded. Try this:

@{
    string userInput = "<script>alert('hello');</script>";
}

The text is: @userInput
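
Outside of a Razor view, the same kind of encoding can be applied by hand. The sketch below assumes the built-in System.Net.WebUtility.HtmlEncode method; the TextRenderer class and its ToSafeHtml method are hypothetical names used only for illustration.

```csharp
using System.Net;

public static class TextRenderer
{
    // Hypothetical helper: encodes user input so the browser prints it as text.
    public static string ToSafeHtml(string userInput)
    {
        // WebUtility.HtmlEncode replaces <, >, &, " and ' with HTML entities.
        return WebUtility.HtmlEncode(userInput ?? string.Empty);
    }
}
```

Calling TextRenderer.ToSafeHtml("<script>alert('hello');</script>") returns &lt;script&gt;alert(&#39;hello&#39;);&lt;/script&gt;. Note that this encoder also replaces the apostrophes, which matters when the value ends up inside a quoted attribute.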

Displaying “formatted text” data

Often we want to include some kind of WYSIWYG editor on our page. This is a tricky scenario: we want to give our users the ability to format the data (e.g. bold or underline) without allowing them to do too much.

It’s obvious that we cannot encode all input. This case is much more complex and difficult: the data must be sanitized, which means allowing only a specific subset of HTML markup (e.g. whitelisting the b and u HTML tags) and making sure that it doesn’t contain XSS code.

We can render pure HTML content from a variable in ASP.NET Core using Razor and IHtmlHelper, but it doesn’t do any HTML sanitization:

@Html.Raw("<script>alert(1);</script>Hello <b>WORLD</b>!!!")

In this case I’d advise you to use an HTML sanitization library or a different approach altogether. You can think of a non-HTML markup, e.g. Markdown, where you would use non-HTML syntax to format your text. As an example, you make text bold by putting it inside double asterisks (like this: **bolded**). The parsing engine would then change it to valid HTML. Beware: while this approach can prevent XSS attacks, it is not a surefire way of stopping malicious code. If the parser is vulnerable, users could still try to abuse it.
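
To make the whitelisting idea concrete, here is a minimal sketch, not a replacement for a real sanitization library. It assumes that only the attribute-less b and u tags are allowed: it encodes everything first and then turns exactly those encoded tags back into markup. SimpleFormatter and Render are hypothetical names.

```csharp
using System.Net;

public static class SimpleFormatter
{
    // Assumption: only these attribute-less, lowercase tags are allowed.
    private static readonly string[] AllowedTags = { "b", "u" };

    public static string Render(string input)
    {
        // Step 1: encode everything, so no user-supplied markup survives.
        string html = WebUtility.HtmlEncode(input ?? string.Empty);

        // Step 2: re-enable only the exact whitelisted tags.
        foreach (string tag in AllowedTags)
        {
            html = html.Replace("&lt;" + tag + "&gt;", "<" + tag + ">")
                       .Replace("&lt;/" + tag + "&gt;", "</" + tag + ">");
        }
        return html;
    }
}
```

For the input <script>alert(1);</script>Hello <b>WORLD</b>!!! this returns &lt;script&gt;alert(1);&lt;/script&gt;Hello <b>WORLD</b>!!!, so the script is printed as text while the bold formatting is kept. The sketch does not check that tags are balanced, so an unclosed <b> can still affect the formatting of the rest of the page.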

Attribute based XSS

You should also know that XSS can be injected inside attributes, which makes sanitization even more complex. An example of this is an HTML link, where you let users enter both the URL and the link title.

A link gives us two more sources of untrusted input: the href attribute and the link description. We cannot simply assume that users will enter correct and valid data all the time.

Another example. Let’s assume that each user has a profile page which lets them provide a short bio. You allow them to include links inside:

<h1>@Model.UserName profile</h1>
@if (Model.HasBio)
{
    @Html.Raw(Model.Bio)
}

The URL and title are entered through a modal window and inserted into your editor as:

<a href="{0}">{1}</a>

What would happen if the user entered the following data?

URL: https://softdevpractice.om
Title: SoftDevPractice

Url: javascript:alert('Why did you do that?')
Title: don't click me

Url: #">Broken
Title: <b>Text

The only way I can think of to counter that is using HTML sanitization or using alternate markup formatting (see previous section).
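
For the link case specifically, a narrow form of sanitization could look like the sketch below. It assumes that only absolute http and https URLs should be accepted and that a rejected URL degrades to plain encoded text. LinkBuilder and BuildLink are hypothetical names; Uri.TryCreate and WebUtility.HtmlEncode are standard .NET APIs.

```csharp
using System;
using System.Net;

public static class LinkBuilder
{
    // Hypothetical helper: builds an <a> element from untrusted URL and title.
    public static string BuildLink(string url, string title)
    {
        // The title is always encoded, so markup in it is printed as text.
        string safeTitle = WebUtility.HtmlEncode(title ?? string.Empty);

        // Accept only absolute URLs with an http or https scheme.
        // This rejects javascript: URLs and fragments such as #">Broken.
        bool isSafeUrl = Uri.TryCreate(url, UriKind.Absolute, out var uri)
            && (uri.Scheme == Uri.UriSchemeHttp || uri.Scheme == Uri.UriSchemeHttps);

        if (!isSafeUrl)
        {
            // Assumption: on a bad URL we show the title without a link.
            return safeTitle;
        }

        // Encode the URL too, so quotes cannot break out of the attribute.
        string safeHref = WebUtility.HtmlEncode(uri.AbsoluteUri);
        return "<a href=\"" + safeHref + "\">" + safeTitle + "</a>";
    }
}
```

With the data above, the first entry produces a normal link, while the second and third entries produce only the encoded titles, don&#39;t click me and &lt;b&gt;Text, without any link.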

Summary

  • Treat every input channel as an untrusted source of data.
  • Consider what kind of input you allow and what its purpose is.
  • Sanitize HTML.
  • Prefer whitelisting over blacklisting.