Using the htmlspecialchars() Function in PHP

This tutorial will show you how to use the htmlspecialchars() function in PHP.

PHP htmlspecialchars() Function

The htmlspecialchars() function is incredibly useful in PHP, especially when you have text you intend to output.

You can easily convert any special characters to their HTML entity equivalent using this function.

One of the key reasons you will want to do this is to help prevent XSS. “XSS” stands for cross-site scripting, an attack in which a bad actor attempts to inject malicious code into your website.

By using this function, many of the characters a script relies on will be converted to HTML entities, which stops the browser from blindly executing the code.

Any user input should be sanitized using functions like htmlspecialchars() to reduce the chance of a malicious attack.
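As a minimal sketch of this idea, the snippet below escapes a piece of user input before placing it inside a paragraph. The variable names and the sample payload are hypothetical and used only for illustration. The flags and encoding are passed explicitly so the result does not depend on the PHP version in use.

```php
<?php

// Hypothetical user input containing an injected script (illustrative only).
$user_comment = '<script>alert("XSS");</script>';

// Escape at output time so the browser displays the text instead of running it.
$safe_comment = htmlspecialchars($user_comment, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

// Prints: <p>&lt;script&gt;alert(&quot;XSS&quot;);&lt;/script&gt;</p>
echo '<p>' . $safe_comment . '</p>';
```

Because the angle brackets and quotes are now entities, the browser shows the script as text rather than treating it as markup.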

Syntax of the htmlspecialchars() Function in PHP

Let us start by exploring the htmlspecialchars() function’s syntax within PHP. The syntax will show us all the parameters and what it returns.

Below you can see that this function takes four parameters. Only the first parameter is required, which is the string you want to be processed.

htmlspecialchars(
    string $string,
    int $flags = ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML401,
    ?string $encoding = null,
    bool $double_encode = true
): string

This function will return the converted string.

If neither the “ENT_IGNORE” nor the “ENT_SUBSTITUTE” flag is set and the string contains an invalid code unit sequence, PHP will return an empty string.
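The following sketch shows this behavior with a string that is deliberately not valid UTF-8. The sample string and variable names are assumptions made for the example. The flags are passed explicitly because the default flags on PHP 8.1 and newer already include “ENT_SUBSTITUTE”.

```php
<?php

// "\x80" is a lone continuation byte, so this string is not valid UTF-8.
$invalid = "a\x80b";

// Neither ENT_IGNORE nor ENT_SUBSTITUTE is set, so the result is an empty string.
$strict = htmlspecialchars($invalid, ENT_QUOTES, 'UTF-8');

// ENT_SUBSTITUTE swaps the invalid byte for U+FFFD and keeps the rest of the string.
$substituted = htmlspecialchars($invalid, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

var_dump($strict);
var_dump($substituted);
```

An unexpectedly empty result is often the first hint that the input and the encoding passed to the function do not match.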

htmlspecialchars() Conversions within PHP

Below you can see a list of the characters that PHP will replace when using the htmlspecialchars() function.

| Character | Replacement |
| --- | --- |
| & (Ampersand) | &amp; |
| " (Double Quote) | &quot; (won’t be replaced if the ENT_NOQUOTES flag is set) |
| ' (Single Quote) | &#039; (ENT_HTML401) or &apos; (ENT_HTML5, ENT_XML1, ENT_XHTML); only replaced if the ENT_QUOTES flag is set |
| < (Less Than) | &lt; |
| > (Greater Than) | &gt; |
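To illustrate these conversions, here is a short sketch that runs one string containing all five characters through the function twice. The sample text is made up for this example. It also shows how the single-quote replacement differs between the HTML 4.01 and HTML5 document types.

```php
<?php

// Sample text containing every character that htmlspecialchars() can replace.
$text = "Tom & \"Jerry\" <'cat'>";

// ENT_QUOTES converts both quote styles; ENT_HTML401 uses &#039; for single quotes.
$html401 = htmlspecialchars($text, ENT_QUOTES | ENT_HTML401, 'UTF-8');

// ENT_HTML5 uses &apos; for single quotes instead.
$html5 = htmlspecialchars($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');

echo $html401 . PHP_EOL; // Tom &amp; &quot;Jerry&quot; &lt;&#039;cat&#039;&gt;
echo $html5 . PHP_EOL;   // Tom &amp; &quot;Jerry&quot; &lt;&apos;cat&apos;&gt;
```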

Parameters

In this section, we are going to explore each of the parameters that this function has to offer. One of the key optional parameters you will want to look into is “$flags”.

The “$flags” parameter matters most because it lets you alter how the htmlspecialchars() function behaves.

$string (REQUIRED)

With this parameter, you specify the string that you want to be converted. PHP’s “htmlspecialchars()” function will run through this string, converting the special characters to HTML entities.

$flags (OPTIONAL)

The flags allow you to control the behavior of the “htmlspecialchars()” function in PHP. Using these you can control how it handles quotes, invalid code sequences, and the document type.

As this is a bitmask, you combine multiple flags with the bitwise OR operator (|).

As of PHP 8.1, the function defaults to the flag bitmask “ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML401”. On older releases of PHP, the default is “ENT_COMPAT | ENT_HTML401”.

If you are curious about what these flags do, let us quickly go through them with this handy table. For most use cases, the default flags will suffice.

| Constant | Description |
| --- | --- |
| ENT_COMPAT | Convert double-quotes but not single-quotes. |
| ENT_QUOTES | Convert both single-quotes and double-quotes. |
| ENT_NOQUOTES | Don’t convert double-quotes or single-quotes. |
| ENT_IGNORE | Discard invalid code unit sequences silently instead of returning an empty string. For security reasons, it is recommended not to use this flag. |
| ENT_SUBSTITUTE | Replace any invalid code unit sequences with a Unicode Replacement Character, U+FFFD (UTF-8) or &#xFFFD; (otherwise), instead of returning an empty string. |
| ENT_DISALLOWED | Replace any code points that are invalid for the given document type with a Unicode Replacement Character, U+FFFD (UTF-8) or &#xFFFD; (otherwise). Useful for ensuring that the converted string conforms to a particular standard. |
| ENT_HTML401 | Handles the string using the HTML 4.01 standard. |
| ENT_XML1 | Handles the string using the XML 1 standard. |
| ENT_XHTML | Handles the provided string as XHTML. |
| ENT_HTML5 | Handles the provided string using the HTML5 standard. |
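As a quick sketch of how the quote flags differ, the example below runs the same string through the function three times, each time combining a quote flag with a document-type flag using the bitwise OR operator. The sample string and variable names are chosen purely for illustration.

```php
<?php

// Sample text with one single quote and a pair of double quotes.
$text = "It's \"quoted\"";

// Each call combines one quote flag with the HTML 4.01 document type.
$compat   = htmlspecialchars($text, ENT_COMPAT | ENT_HTML401, 'UTF-8');
$quotes   = htmlspecialchars($text, ENT_QUOTES | ENT_HTML401, 'UTF-8');
$noquotes = htmlspecialchars($text, ENT_NOQUOTES | ENT_HTML401, 'UTF-8');

echo $compat . PHP_EOL;   // It's &quot;quoted&quot;
echo $quotes . PHP_EOL;   // It&#039;s &quot;quoted&quot;
echo $noquotes . PHP_EOL; // It's "quoted"
```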

$encoding (OPTIONAL)

PHP allows you to control the encoding the htmlspecialchars() function uses when it converts your characters.

If this encoding is set to null (the default option), then it will use whichever encoding is set with the “default_charset” configuration option.
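The sketch below shows why the encoding matters, using a string whose bytes are valid ISO-8859-1 but not valid UTF-8. The sample string is an assumption made for this example. “ENT_SUBSTITUTE” is left out on purpose so the difference between the two encodings is visible.

```php
<?php

// "\xE9" is "é" in ISO-8859-1, but on its own it is an incomplete sequence in UTF-8.
$latin1 = "caf\xE9 <b>";

// Treated as UTF-8, the string is invalid, so the result is an empty string.
$as_utf8 = htmlspecialchars($latin1, ENT_QUOTES, 'UTF-8');

// Treated as ISO-8859-1, the byte is accepted and only the tag is escaped.
$as_latin1 = htmlspecialchars($latin1, ENT_QUOTES, 'ISO-8859-1');

var_dump($as_utf8);
var_dump($as_latin1);
```

Passing the encoding explicitly, rather than relying on “default_charset”, keeps the behavior the same across servers with different configurations.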

$double_encode (OPTIONAL)

When the “$double_encode” option is set to “true”, PHP will convert everything within the given string, even if it is an existing HTML entity.

With this set to “false”, PHP will only convert a special character if it isn’t part of an HTML entity already.

By default, the “$double_encode” parameter is set to “true”.
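To make the difference concrete, here is a small sketch that escapes a string which already contains an “&amp;” entity, once with each setting. The sample text and variable names are illustrative only.

```php
<?php

// The ampersand here is already written as an HTML entity.
$text = 'Fish &amp; Chips <b>';

// With double encoding on, the existing entity is escaped a second time.
$double = htmlspecialchars($text, ENT_QUOTES, 'UTF-8', true);

// With double encoding off, the existing entity is left as it is.
$single = htmlspecialchars($text, ENT_QUOTES, 'UTF-8', false);

echo $double . PHP_EOL; // Fish &amp;amp; Chips &lt;b&gt;
echo $single . PHP_EOL; // Fish &amp; Chips &lt;b&gt;
```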

Using the htmlspecialchars() Function in PHP

Within this section, we will be exploring how to use the htmlspecialchars() function within PHP.

Basic Usage of the htmlspecialchars() Function

Let us start by exploring the most basic usage of PHP’s htmlspecialchars() function.

For this example, we will ignore all optional parameters and focus purely on the “$string” parameter. When you pass only the required parameter, PHP falls back to the function’s default options.

Let us start this example by declaring a variable called “$example_script”, to which we will assign a PHP string containing a short piece of JavaScript. This script will trigger an alert when executed.

We then use PHP’s echo statement to print out this string.

First, however, we wrap this string in the htmlspecialchars() function. This way, PHP will replace any special characters, such as the less-than and greater-than signs, with safe HTML entities in the output.

<?php

$example_script = "<script>alert('PiMyLifeUp');</script>";

echo htmlspecialchars($example_script);

?>

After running the above script, you will see the following displayed as plain text in the browser.

<script>alert('PiMyLifeUp');</script>

The HTML source that PHP actually generates is shown below. Here you can see the various symbols that were replaced with their HTML entity equivalents.

The browser renders these entities as the symbols shown in the previous output.

&lt;script&gt;alert(&#039;PiMyLifeUp&#039;);&lt;/script&gt;

Using the Flags Parameter

For our next example, let us show you how to use the flags parameter of the “htmlspecialchars()” function in PHP.

With this example, we will be setting the “ENT_NOQUOTES” flag. This flag tells the function not to convert single-quotes or double-quotes.

We start the script by defining a variable called “$example_script” and assigning it a simple script that we intentionally want to break when it is output.

We then use PHP’s htmlspecialchars() function to convert our “$example_script” string. We pass the “ENT_NOQUOTES” flag as the second parameter.

Finally, we output the result to the screen using the echo statement.

<?php

$example_script = "<script>alert('PiMyLifeUp');</script>";

echo htmlspecialchars($example_script, ENT_NOQUOTES);

?>

After running the above example, you will end up with the following HTML. With this result, you can see how the function didn’t touch the single quotes within our example script.

&lt;script&gt;alert('PiMyLifeUp');&lt;/script&gt;

Conclusion

Throughout this tutorial, we have shown you how to use PHP’s htmlspecialchars() function.

This function is incredibly useful for escaping user input, especially when it is going to be output to a web page.

Outputting user input without escaping it leaves your site open to attacks such as XSS.

Please comment below if you have questions about using the htmlspecialchars() function within your PHP code.

Be sure to check out our many other PHP tutorials. Alternatively, we have guides that cover other programming languages if you want to learn something new.
