In this article, we will look at HTML injection, a type of attack on web applications.

What is HTML Injection?

The essence of this type of injection attack is to inject HTML code through vulnerable parts of a website. An attacker sends HTML code through any vulnerable field in order to change the design of the website or any information displayed to the user.

As a result, users see content supplied by the attacker as if it were part of the site. In general, HTML injection is simply the injection of markup code into a page's document.

The data sent in this type of attack can vary widely. It could be a few HTML tags that simply display the submitted information, or it could be an entire fake form or page. When the attack succeeds, the browser treats the attacker's markup as a legitimate part of the page and renders it.

Changing the look of a website is not the only risk this type of attack carries. It is closely related to XSS, in which an attacker steals other people's personal data, so theft of users' personal data can also result from HTML injection.

Types of HTML Injection

This attack does not seem very difficult to understand or execute, since HTML is considered a fairly simple language. However, there are different ways to carry it out, and the injection itself can be classified in several ways.

First, the different types can be sorted by the risks they carry.

As mentioned, this injection attack can be performed with two different goals:

  • To change the appearance of the displayed website.
  • To steal personal information from other users.

In addition, this injection attack can be performed through various parts of a website, such as data entry fields and URLs.

However, the main types are: 

  • Stored HTML Injection
  • Reflected HTML Injection

Stored HTML Injection:

A stored HTML injection attack occurs when malicious HTML code is saved on the web server and rendered every time a user loads the affected page or calls the corresponding function.

Reflected HTML Injection:

In a reflected injection attack, the malicious HTML code is not persistently stored on the web server. Instead, the website immediately echoes the malicious input back in its response.

Reflected injection can in turn be divided into several types:

  • Reflected GET
  • Reflected POST
  • Reflected URL

A reflected injection attack is performed differently depending on the HTTP method, GET or POST. As a reminder, a GET request carries its parameters in the URL's query string, whereas a POST request carries the submitted data in the request body.
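To make the difference between the two methods concrete, here is a minimal Python sketch (standard library only) showing where the same form fields end up in each case. The host `example.test`, the path, and the field values are placeholders chosen for illustration, and no request is actually sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical form fields; "initials" carries the HTML being tested.
fields = {"choice": "nmap", "initials": "<h1>test</h1>"}
encoded = urlencode(fields)

# GET: the encoded fields travel in the URL's query string.
get_request = Request("http://example.test/poll?" + encoded, method="GET")

# POST: the same encoded fields travel in the request body instead.
post_request = Request(
    "http://example.test/poll", data=encoded.encode("ascii"), method="POST"
)

print(get_request.full_url)
print(post_request.full_url, post_request.data)
```

With GET the payload is visible in the address itself, which is why a crafted link is enough to deliver it. With POST the URL stays clean and the payload has to be placed in the submitted data.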

To find out which method a given element of the website uses, we can check the page source code.

For example, a tester can check the source code of a login form to see which method it uses, and then select the appropriate HTML injection technique.
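Reading the method out of the page source can also be done programmatically. The following is a rough sketch using Python's built-in `html.parser`; the class name, helper name, and sample markup are made up for illustration.

```python
from html.parser import HTMLParser


class FormMethodFinder(HTMLParser):
    """Collects (action, method) for every <form> tag in a page."""

    def __init__(self):
        super().__init__()
        self.forms = []

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            attributes = dict(attrs)
            # HTML forms default to GET when no method attribute is given.
            method = (attributes.get("method") or "get").upper()
            self.forms.append((attributes.get("action"), method))


def find_form_methods(page_source):
    finder = FormMethodFinder()
    finder.feed(page_source)
    return finder.forms


# Hypothetical page source with one POST form and one form without a method.
sample = (
    '<form action="/login" method="post"><input name="user"></form>'
    '<form action="/search"></form>'
)
print(find_form_methods(sample))
```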

Reflected GET occurs when input sent in a GET request is displayed (reflected) on the website. Let's say we have a simple page with a data entry form that is vulnerable to this attack. If we enter HTML code as a parameter value, it is inserted into the HTML document and rendered on the page.

http://mutillidae/index.php?page=user-poll.php&csrf-token=&choice=nmap&initials=%3Ca%22%22+href%3D%22%22onclick%3Dalert%28%22HTML%22%29%3Eprotey%3C%2Fa%3E&user-poll-php-submit-button=Submit+Vote

Payload:

<a"" href=""onclick=alert("HTML")>protey</a>
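To see how the payload relates to the request URL, the query string can be decoded with Python's `urllib.parse`. The sketch below uses the example URL from this section with the stray spaces removed, on the assumption that they are formatting artifacts rather than part of the real request.

```python
from urllib.parse import urlsplit, parse_qs

url = (
    "http://mutillidae/index.php?page=user-poll.php&csrf-token=&choice=nmap"
    "&initials=%3Ca%22%22+href%3D%22%22onclick%3Dalert%28%22HTML%22%29%3Eprotey%3C%2Fa%3E"
    "&user-poll-php-submit-button=Submit+Vote"
)

# parse_qs percent-decodes each value and turns "+" back into a space.
params = parse_qs(urlsplit(url).query)
payload = params["initials"][0]
print(payload)
```

The injected markup sits entirely in the `initials` parameter; the other parameters are ordinary form fields.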

Reflected POST HTML Injection is a little more complicated. It happens when malicious HTML is sent in place of the expected parameters of a POST request.

For example, suppose we have a login form that is vulnerable to HTML injection and submits its data using the POST method. If we enter HTML instead of the expected values, it will be POSTed and displayed on the website.

To carry out a Reflected POST HTML attack, it helps to use a browser plugin that intercepts and alters the submitted data. One of them is the Mozilla Firefox “Tamper Data” plugin. The plugin captures the submitted data and allows the user to modify it. The modified data is then sent and displayed on the site.

For example, if we use a plugin like this to send the same kind of HTML, it will be displayed in much the same way as in the previous example (depending on the application).

Payload:

<h1> PROTEY.NET </h1>

Reflected URL happens when HTML is sent via the website URL itself, inserted into the website's HTML document, and displayed on the page.

Payload:

<a href="data:text/html;base64,PHNjcmlwdD5hbGVydCg5KTwvc2NyaXB0Pg">protey</a>
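The `href` in this payload is a `data:` URI whose content is Base64-encoded, so the markup it carries is not visible at a glance. A short Python sketch for inspecting it follows; the variable names are illustrative.

```python
import base64

data_uri = "data:text/html;base64,PHNjcmlwdD5hbGVydCg5KTwvc2NyaXB0Pg"
header, encoded = data_uri.split(",", 1)

# The payload omits the trailing "=" padding, which b64decode requires,
# so pad the string to a multiple of four characters first.
padded = encoded + "=" * (-len(encoded) % 4)
decoded = base64.b64decode(padded).decode("ascii")

print(header)
print(decoded)
```

Decoding shows that the link target is a small HTML document containing a script tag, which is worth checking for whenever encoded content appears in a parameter.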

How is HTML injection done?

To perform this type of injection, an attacker must first find vulnerable parts of the website. As already mentioned, these can be data entry fields and the site's URLs.

Malicious HTML code can enter the page via innerHTML. Recall that innerHTML is a property of DOM elements that lets us write dynamic HTML. It is often used to display data from entry fields such as comment fields, questionnaires, registration forms, etc. Therefore, these elements are the most vulnerable to HTML attacks.

Let's say we have a questionnaire where we fill in the appropriate answers and our name. When the questionnaire is completed, a confirmation message is displayed that includes the specified user name or selection.

Suppose the name specified by the user is Tester_name. The code for this confirmation message might then look like this:

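// Vulnerable on purpose: the "user" URL parameter is written to the page unescaped.
// The element id "confirmation" is a placeholder.
var user = decodeURIComponent(location.href.substring(location.href.indexOf("user=") + 5));

document.getElementById("confirmation").innerHTML = "Thank you for filling our questionnaire, " + user;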

The code shown is vulnerable to such an attack. If you enter HTML code in the questionnaire form, it will be rendered as part of the confirmation page.
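The same concatenation pattern can be mimicked outside the browser to see why it is a problem. The Python sketch below is only an analogy to the innerHTML case: `build_confirmation` and `tags_in` are hypothetical helpers, and the parser stands in for what a browser would do with the resulting markup.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Records the name of every opening tag a parser finds in the markup."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def build_confirmation(user_name):
    # Mirrors the vulnerable pattern: the name is concatenated into the markup as-is.
    return "<p>Thank you for filling our questionnaire, " + user_name + "</p>"


def tags_in(markup):
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


print(tags_in(build_confirmation("Tester_name")))
print(tags_in(build_confirmation("<h1>Injected heading</h1>")))
```

With a plain name, the parser sees only the paragraph the developer intended. With markup in the name, it also sees an element that came entirely from user input.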

The same thing happens with comment fields. Suppose we have a comment form that is vulnerable to HTML injection.

In the form, the user enters a name and comment text. All saved comments are listed on the page and loaded on page load. Therefore, if malicious code was typed and saved, it will also be loaded and displayed on the site.

For example, if we save the code shown below in the comment box, a popup with the message “Hello!” will be displayed on page load.

<html><body><script>alert("Hello!");</script></body></html>
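The stored case can be sketched with a tiny in-memory model. `CommentBoard` below is a hypothetical stand-in for a real comment table and page template, written in Python purely to show why a saved payload keeps coming back.

```python
class CommentBoard:
    """In-memory stand-in for a comment table; hypothetical, for illustration only."""

    def __init__(self):
        self.comments = []

    def save(self, author, text):
        # Stored as submitted: no validation or escaping happens here.
        self.comments.append((author, text))

    def render(self):
        # Every page load rebuilds the list from storage, so a stored payload
        # is sent to every visitor until it is removed.
        items = "".join(
            f"<li><b>{author}</b>: {text}</li>" for author, text in self.comments
        )
        return f"<ul>{items}</ul>"


board = CommentBoard()
board.save("alice", "Nice article")
board.save("mallory", '<script>alert("Hello!");</script>')

first_visit = board.render()
second_visit = board.render()
print(first_visit)
```

Unlike the reflected case, no crafted link or form submission is needed on later visits, because the payload is already part of the stored data.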

Another way to perform this type of injection is through a link on a website. Let's say we have a link to a PHP page whose URL contains a parameter named “site” with the value “1”.

If, instead of the value “1”, we specify HTML code with some display text for the “site” parameter, that text will be displayed on the “Page Not Found” page. This only happens if the page is vulnerable to an HTML attack.

Suppose we enter the following tag instead of the parameter value:

<input type="text" name="foo" value="" onmouseover=alert(HTML_Injection) //">

If the page is vulnerable, the injected markup will then be rendered on the website.

Also, as already mentioned, more than just a fragment of HTML code can be injected. An entire malicious page can also be sent to the end user.

For example, suppose a user opens a login page and enters their credentials. If a malicious page is loaded instead of the original page and the user submits their credentials through it, a third party can obtain them.

How do I check for HTML injection?

When starting to check for possible injection attacks, a tester should first list all potentially vulnerable parts of a website.

As a reminder, these can be:

  • All data entry fields
  • Links

Then manual tests can be performed.

When manually testing whether HTML injection is possible, you can enter simple HTML code, for example, to check whether the text is rendered. There is no point in starting with very complex HTML code; simple code is usually enough to test its rendering.

For example, these can be simple tags with text:

<h1> HTML Injection testing </h1>

Or the search form code if you want to test something more complex:

<form method="post" action="index.html">
<p><input type="text" name="search" value="" placeholder="Search text"></p>
<p class="search_text">
<label>
<input type="checkbox" name="search_text" id="search_text">
Enter your search text:
</label>
</p>
<p class="submit"><input type="submit" name="commit" value="Search"></p>
</form>

If the HTML is rendered on the page, or stored and rendered later, the tester can be confident that this injection attack is possible. You can then try more complex code, for example, to display a fake login form.
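When reviewing the response by hand, the key question is whether the submitted markup comes back unchanged or in escaped form. The helper below is a simplified Python sketch of that check; `classify_reflection` and the sample pages are made up for illustration, and a real response would need closer inspection of where the input lands in the page.

```python
import html


def classify_reflection(response_body, probe):
    """Return 'raw', 'escaped', or 'absent' for a probe string in a response body."""
    if probe in response_body:
        return "raw"  # markup came back unchanged: injection is likely possible
    if html.escape(probe) in response_body:
        return "escaped"  # markup was neutralised before output
    return "absent"


probe = "<h1>HTML Injection testing</h1>"
vulnerable_page = "<div>Results for <h1>HTML Injection testing</h1></div>"
safe_page = "<div>Results for &lt;h1&gt;HTML Injection testing&lt;/h1&gt;</div>"

print(classify_reflection(vulnerable_page, probe))
print(classify_reflection(safe_page, probe))
```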

Another option is an HTML injection scanner. Automated scanning for this attack can save a lot of time, although there are fewer tools for testing HTML injection than for other attacks.

One possible solution is the WAS application. WAS can be considered a fairly strong vulnerability scanner because it tests with different inputs rather than stopping when the first one fails.

The “Tamper Data” browser plugin mentioned earlier is also useful for testing: it captures the submitted data, allows the tester to modify it, and then sends it on.

We can also find online scanning tools where you only need to provide a link to a website and it will be scanned for HTML injection. When testing is complete, a summary is displayed.
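As a rough idea of what such tools do with a supplied link, the Python sketch below generates one test URL per query parameter, replacing that parameter's value with a probe. This is not how any particular scanner works; `build_probe_urls`, the host `example.test`, and the parameter names are assumptions for illustration, and nothing is sent over the network.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def build_probe_urls(url, probe):
    """Yield (parameter_name, test_url) pairs, replacing one query parameter at a time."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    for index, (name, _) in enumerate(pairs):
        mutated = list(pairs)
        mutated[index] = (name, probe)
        yield name, urlunsplit(parts._replace(query=urlencode(mutated)))


for name, test_url in build_probe_urls(
    "http://example.test/page.php?site=1&lang=en", "<h1>probe</h1>"
):
    print(name, test_url)
```

Changing one parameter at a time keeps the results easy to attribute: if the probe shows up in a response, it is clear which input carried it.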

I would like to point out that when choosing a scanning tool, we must pay attention to how it analyzes the results, and whether it is accurate enough.

However, keep in mind that manual testing should not be forgotten. This way, we can be sure which input data is being used and which exact results we get. It also makes it easier to analyze the results.

Based on my experience in software testing, both ways of testing require familiarity with this type of injection. Otherwise, it would be difficult to choose a suitable automation tool and analyze its results. In addition, it is always advisable to test manually as well, as this gives us more confidence in the quality.

How to prevent HTML injection?

The main cause of this attack is usually developer inattention or lack of awareness. This type of injection occurs when input and output are not properly validated. Therefore, the main rule for preventing an HTML attack is appropriate data validation.
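One common way to validate input is to accept only values that match an expected pattern, rather than trying to list everything that is forbidden. Here is a minimal Python sketch for a name field; the allowed character set and length limit are assumptions and would need to fit the actual application.

```python
import re

# Assumed policy for a "name" field: letters, digits, spaces and hyphens, 1-50 characters.
NAME_PATTERN = re.compile(r"[A-Za-z0-9 \-]{1,50}")


def is_valid_name(value):
    # fullmatch requires the whole string to fit the pattern, not just a prefix.
    return NAME_PATTERN.fullmatch(value) is not None


print(is_valid_name("Tester name"))
print(is_valid_name("<h1>PROTEY.NET</h1>"))
```

Because angle brackets and quotes are simply not in the allowed set, markup is rejected without the check needing to know anything about specific tags.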

Every input must be checked for script code or HTML code. Usually this means checking whether the input contains script or HTML tags in angle brackets, such as <script> </script> or <html> </html>.

There are many functions for checking for and escaping these special characters. The choice of function depends on the programming language you are using.
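In Python, for instance, the standard library provides `html.escape` for this purpose. The sketch below applies it to a confirmation message like the one discussed earlier; `render_confirmation` is a hypothetical helper, and other languages and template engines offer their own equivalents.

```python
import html


def render_confirmation(user_name):
    # html.escape converts &, < and > (and, by default, quotes) into character references.
    safe_name = html.escape(user_name)
    return "<p>Thank you for filling our questionnaire, " + safe_name + "</p>"


print(render_confirmation('<script>alert("Hello!");</script>'))
```

The escaped name is displayed to the user as literal text instead of being interpreted as markup, so the input no longer changes the structure of the page.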

Remember that good security testing is also part of prevention. Because HTML injection is relatively rare, there is little information about it or about tools for detecting it. However, this part of security testing should not be skipped, as you never know when you might need it.

In addition, both the developer and the tester should have a good knowledge of how this attack is performed. A good understanding of the attack process can help prevent it.

Comparison with other attacks

Compared to other possible attacks, HTML injection is generally not considered as dangerous as SQL injection, JavaScript injection, or even XSS. It will not destroy the entire database or steal all the data in it. However, it should not be considered insignificant.

As mentioned earlier, the main purpose of this type of injection is to change the appearance of the displayed website with malicious intent, showing the end user information or data supplied by the attacker. These risks can be considered less important.

Changing the look of your website can damage your company’s reputation. If an attacker ruins the look of your site, it can change the way visitors think about your company.

Another risk associated with this attack is the theft of other users' identities.

As mentioned, with HTML injection, an attacker can inject an entire page that will be displayed to the end user. If the end user then enters their login details on a fake login page, those details will be sent to the attacker. This is, of course, the riskiest part of the attack.

It should be noted that this type of attack is less often chosen to steal data from other users, as there are many other possible attacks.

However, it is very similar to an XSS attack that steals the user’s cookies and the personal data of other users. There are also HTML-based XSS attacks. Therefore, testing against XSS and HTML attacks can be very similar and can be done together.

Conclusion

Since HTML injection is not as popular as other attacks, it can be considered less risky. Therefore, testing for this type of injection is sometimes skipped.

Also worth noting is the small amount of information about HTML Injection. Therefore, testers may decide not to conduct this type of testing. However, in this case, the risks of an HTML attack may not be sufficiently appreciated.

As we have seen in this guide, with this type of injection, your entire website design can be ruined or user login credentials can be stolen. Therefore, it is highly recommended to include HTML injection in your security testing.