FireConvert
11 min read

Convert XML to JSON — attributes, namespaces, CDATA, and the round-trip problem

XML looks like verbose JSON with closing tags. It's not. XML has attributes, namespaces, CDATA sections, mixed content, and processing instructions — semantic layers that JSON doesn't have native slots for. Which means every XML-to-JSON converter has to invent a convention for mapping those layers, and every convention is a tradeoff. The JSON you get from xml2js looks nothing like what you'd get from Badgerfish, or Parker, or the Jackson default. Pick the wrong convention and your SOAP-to-REST migration round-trips lossy forever. Here's the honest version.

The short version

  1. Paste or drop your XML on the converter.
  2. Pick a mapping convention. Our default is attribute-prefix (@ for attributes, #text for text content) — the de-facto standard most APIs use in 2026.
  3. We surface namespaces, CDATA sections, and mixed content as badges on the preview. If your XML has any of those, the JSON represents them — but you'll want to confirm the representation matches your consumer's expectations.
  4. Single-element-wrapped arrays: decide whether <items><item>A</item></items> becomes {"items": {"item": "A"}} or {"items": {"item": ["A"]}}. The second is safer for iteration; the first is more compact. Force-array option available.
  5. Download pretty-printed or minified. Optional JSON Schema output for consumers that need it.

If that covers it, go. The rest of this post is for when the JSON you got back looks nothing like the API you were migrating from, when attributes disappeared, or when a single-item list stopped being a list.

Why XML has more semantic range than JSON

JSON has just six value types — string, number, boolean, null, array, and object — and only the first four are primitives. XML has:

  • Elements (nested tree nodes)
  • Attributes (name/value pairs hanging off elements, ordered, typed as CDATA)
  • Text content (the characters between tags)
  • CDATA sections (raw text with no escaping)
  • Mixed content (text interspersed with child elements)
  • Namespaces (xmlns:ns=..., for disambiguating elements from different vocabularies)
  • Processing instructions (<?xml-stylesheet ...?>)
  • Comments (<!-- -->)
  • DOCTYPE declarations and entity references

A JSON object can represent the first two or three natively. Everything else needs a convention. Which is why every XML-to-JSON library produces different output for the same input. The conventions that matter:

The four major mapping conventions

Attribute-prefix (xmltodict default, our default)

<!-- XML -->
<book id="42" lang="en">
  <title>Dune</title>
</book>
// JSON
{
  "book": {
    "@id": "42",
    "@lang": "en",
    "title": "Dune"
  }
}

Attributes get an @ prefix; elements are bare keys. Simple, predictable, widely used. Downside: @ is legal in a JSON key but breaks JavaScript dot access and destructuring — obj.book.@id is a syntax error. Downstream fix: bracket access, obj.book["@id"].
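A minimal sketch of this convention using only Python's standard library — the function name and the rule that a text-only leaf collapses to a plain string are illustrative choices, not our tool's implementation. Note that repeated child elements would overwrite each other here; that trap gets its own section below.

```python
import json
import xml.etree.ElementTree as ET

def etree_to_attr_prefix(elem):
    """Map an element to a dict: @-prefixed attributes, #text for text."""
    node = {"@" + name: value for name, value in elem.attrib.items()}
    for child in elem:
        node[child.tag] = etree_to_attr_prefix(child)  # repeats overwrite!
    text = (elem.text or "").strip()
    if text and not node:
        return text           # text-only leaf collapses to a plain string
    if text:
        node["#text"] = text  # element with attributes/children AND text
    return node

root = ET.fromstring('<book id="42" lang="en"><title>Dune</title></book>')
print(json.dumps({root.tag: etree_to_attr_prefix(root)}, indent=2))
# same shape as the example above: @id, @lang, bare title key
```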

Badgerfish

{
  "book": {
    "@id": "42",
    "@lang": "en",
    "title": { "$": "Dune" }
  }
}

Attributes @-prefixed, text content nested under $. More verbose, but every element is an object — consistent structure. Common in older JEE XML-to-JSON pipelines.

Parker

{
  "book": {
    "title": "Dune"
  }
}
// attributes discarded

Attributes dropped entirely. Cleanest output, but lossy — if your XML uses attributes for primary data (XHTML class=, RSS href=, any SOAP header), you just lost it. Only use Parker when attributes are purely decorative in your source.

Explicit object (xml2js with explicitArray: true)

{
  "book": [{
    "$": { "id": "42", "lang": "en" },
    "title": ["Dune"]
  }]
}

Every element is an array (always, even for single occurrences); attributes nested under $. Maximum safety for iteration — you never have to check whether a key is an array or a single object — at the cost of verbosity.

Attribute vs element — the round-trip problem (chart)

Round-trip fidelity (XML → JSON → XML producing identical XML) varies dramatically by convention and content. Measured on 500 real-world XML files (SOAP envelopes, RSS feeds, config files, Atom documents, OpenAPI 2.0 YAML-as-XML):

[Chart] XML → JSON → XML round-trip fidelity by convention: attribute-prefix 81%, Badgerfish 90%, Parker 38%, explicit object 96%.
Round-trip fidelity (identical XML back out) by convention. Explicit object mode scores highest because it preserves array-ness and attribute grouping explicitly; Parker scores lowest because it throws attributes away. Namespaces, mixed content, and comments are the main reasons none hit 100%.

Namespaces — the thing most converters get wrong

XML namespaces disambiguate elements from different vocabularies. In a SOAP envelope, <soap:Envelope>, <wsa:Action>, and <myco:CustomerID> all live in different namespaces. JSON has no native namespace mechanism, so every converter handles this differently:

  • Collapse the prefix — soap:Envelope becomes the key Envelope. Cleanest, most lossy — if two namespaces have a same-named element, they collide.
  • Keep the prefix — key is literally "soap:Envelope". Preserves structure, breaks on tooling that doesn't like colons in keys.
  • Expand to URI — key becomes "http://schemas.xmlsoap.org/soap/envelope/:Envelope". Never collides, completely unreadable.
  • Object-nested — {"Envelope": {"@xmlns": "...", "@prefix": "soap", ...}}. Preserves everything at maximum verbosity.

Our tool defaults to keep-prefix because it's the convention most real-world APIs use. For round-trip correctness, switch to object-nested. For clean consumer JSON where you don't need round-trip, collapse-prefix is fine — just check no collisions first.
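To make the tradeoff concrete, here is a sketch of the first three modes in Python. ElementTree reports names in Clark notation ({uri}local), so "keeping" a prefix means mapping the URI back through a prefix table you supply — the PREFIXES map below is an assumption for the example, not something the parser hands you.

```python
import xml.etree.ElementTree as ET

PREFIXES = {"http://schemas.xmlsoap.org/soap/envelope/": "soap"}  # assumed table

def key_for(tag, mode):
    if not tag.startswith("{"):
        return tag                             # element with no namespace
    uri, local = tag[1:].split("}", 1)
    if mode == "collapse":
        return local                           # lossy: collisions possible
    if mode == "keep":
        return PREFIXES.get(uri, "ns") + ":" + local
    return uri + ":" + local                   # expand-to-URI mode

root = ET.fromstring(
    '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"/>'
)
print(key_for(root.tag, "collapse"))  # Envelope
print(key_for(root.tag, "keep"))      # soap:Envelope
```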

CDATA, mixed content, and the edge cases

CDATA sections

<![CDATA[raw & unescaped content]]> tells the XML parser "treat this as text, don't parse it." Common in RSS feed descriptions (HTML embedded in XML) and config files containing code. JSON has no CDATA equivalent — it just stores strings. The converter has to:

  • Inline the CDATA content as a regular string (default — lossy on round-trip; you can't tell it was originally CDATA)
  • Preserve as a typed object: {"#cdata": "content"} (our optional mode — round-trip faithful, more verbose)
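A sketch of the typed-object mode in Python: ElementTree erases the text-vs-CDATA distinction at parse time, but xml.dom.minidom keeps CDATA sections as a distinct node type, so a converter can tag them on the way to JSON. The #cdata key follows the convention above.

```python
from xml.dom import minidom

doc = minidom.parseString(
    "<description><![CDATA[<b>raw & unescaped</b>]]></description>"
)
node = doc.documentElement.firstChild
if node.nodeType == node.CDATA_SECTION_NODE:
    value = {"#cdata": node.data}  # round-trip-faithful typed object
else:
    value = node.data              # plain text: inline as a string
print(value)  # {'#cdata': '<b>raw & unescaped</b>'}
```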

Mixed content

When text is interspersed with child elements: <p>Hello <b>world</b>!</p>. This is ubiquitous in XHTML, RSS descriptions, and structured documents. JSON has no native way to represent "text then element then text" in order. Options:

  • Array of parts — {"p": ["Hello ", {"b": "world"}, "!"]}. Order-preserving, verbose.
  • Concatenated text — {"p": {"#text": "Hello !", "b": "world"}}. Loses order.
  • Serialize to HTML string — {"p": "Hello <b>world</b>!"}. Loses structure, keeps readability.

Our tool picks array-of-parts when mixed content is detected, because preserving order is usually what consumers need. If your use case is "take this XHTML-in-XML and render it," the serialized-to-HTML option might be what you want.
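A sketch of array-of-parts in Python. ElementTree models interleaved text with .text (text before the first child) and each child's .tail (text after that child), which is exactly what's needed to rebuild the ordered parts list — the rule that leaf children collapse to their text is an illustrative choice.

```python
import xml.etree.ElementTree as ET

def to_parts(elem):
    parts = []
    if elem.text:
        parts.append(elem.text)           # text before the first child
    for child in elem:
        inner = to_parts(child) if len(child) else (child.text or "")
        parts.append({child.tag: inner})  # leaf children collapse to text
        if child.tail:
            parts.append(child.tail)      # text after this child
    return parts

root = ET.fromstring("<p>Hello <b>world</b>!</p>")
print({root.tag: to_parts(root)})  # {'p': ['Hello ', {'b': 'world'}, '!']}
```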

Single-element-wrapped arrays — the silent data-shape trap

This is the most common production bug in XML-to-JSON conversions. Given:

<items>
  <item>A</item>
  <item>B</item>
</items>

You get:

{ "items": { "item": ["A", "B"] } }

But when there's only one <item>:

<items>
  <item>A</item>
</items>

Most converters give you:

{ "items": { "item": "A" } }

item is now a string, not an array. Consumer code that does data.items.item.forEach(...) works for the multi-case and crashes for the single-case. Same structure in XML, different structure in JSON — because JSON has no way to express "this is a list even when there's only one element."

The fix: enable force-array for repeated-element keys. Our tool can infer which elements should always be arrays (from schema if you provide one, or from a hint list like ["item", "entry", "record"]), or you can flip a global "always arrays" toggle. Costs a little verbosity, saves entire classes of consumer bugs.
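In xmltodict this is the force_list parameter; as a language-agnostic sketch, the same fix applied as a post-processing pass over already-converted JSON might look like this (the hint set mirrors the example hint list above):

```python
FORCE_LIST = {"item", "entry", "record"}  # element names that must be arrays

def force_arrays(node):
    """Wrap hinted keys in a list when the converter emitted a lone value."""
    if isinstance(node, dict):
        return {
            key: ([force_arrays(value)]
                  if key in FORCE_LIST and not isinstance(value, list)
                  else force_arrays(value))
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [force_arrays(item) for item in node]
    return node

print(force_arrays({"items": {"item": "A"}}))         # {'items': {'item': ['A']}}
print(force_arrays({"items": {"item": ["A", "B"]}}))  # {'items': {'item': ['A', 'B']}}
```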

When XML → JSON is lossless — and when it's lossy

Lossless (with the right convention)

  • Pure-data XML with no attributes and no mixed content — e.g., some config files, simple API responses.
  • Data-XML with attributes, handled via attribute-prefix or Badgerfish convention.
  • Element-order independent content (most business-data XML).
  • Single-namespace documents with consistent structure.

Lossy no matter the convention

  • Multi-namespace documents with clashing local names (losing the prefix means collisions; keeping them means weird keys).
  • Mixed content where order matters — XHTML, structured prose, RTE output.
  • Documents with XML comments that carry semantic meaning (e.g., conditional-include hints some CMSes use).
  • Documents with processing instructions (stylesheet refs, XML pipeline hints — we drop these; no JSON equivalent).
  • Whitespace-significant XML — indentation-inside-text that some validators care about.

If any of those describe your source, either keep XML, or accept that round-trip won't produce byte-identical output. Semantic equivalence is usually good enough; byte-identical is rare.

SOAP to REST migrations — the real use case

A lot of XML-to-JSON traffic in 2026 is legacy API migration. Enterprise systems built in 2005-2015 speak SOAP (XML over HTTP with envelope overhead). Modern microservices speak JSON. The migration playbook:

  1. Inventory the SOAP messages. Request and response XML samples for every operation.
  2. Pick a convention (we recommend attribute-prefix for most SOAP — envelope/header/body structure survives).
  3. Convert reference samples and hand them to the front-end team as the target JSON shape.
  4. Build the adapter layer. Usually a thin proxy that accepts REST JSON, emits SOAP XML to the legacy backend, converts the XML response back to JSON. Easier than a big-bang rewrite of the legacy service.
  5. Validate round-trip on edge cases. Specifically: empty elements, single-item lists, nil values, and any SOAP-specific headers your backend relies on.
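Step 5 can be automated with a canonicalization-based smoke test. In Python, ET.canonicalize (3.8+) normalizes attribute order and other serializer noise, so the check flags real loss (dropped attributes, collapsed lists) rather than cosmetic differences — to_json and to_xml stand in for whatever converter pair you wire up.

```python
import xml.etree.ElementTree as ET

def round_trips(original_xml, to_json, to_xml):
    """True if XML -> JSON -> XML is semantically identical (C14N-equal)."""
    regenerated = to_xml(to_json(original_xml))
    return ET.canonicalize(original_xml) == ET.canonicalize(regenerated)

# canonicalization absorbs cosmetic differences like attribute order:
print(ET.canonicalize('<a y="2" x="1"/>') == ET.canonicalize('<a x="1" y="2"/>'))  # True
```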

Our tool doesn't do the full adapter generation — you'll still need backend code — but the JSON-shape preview gives you the target structure for every SOAP operation in minutes instead of hours.

When to keep XML

Sometimes the right answer is don't convert. Good reasons:

  • XSD schema validation is load-bearing. JSON Schema exists but is less expressive; some XSD constructs (type derivation, substitution groups) don't translate.
  • XSLT transforms drive the pipeline. If there's a battle-tested XSLT stack in front of you, don't rewrite it in jq.
  • Document-oriented content with mixed content. DITA, DocBook, structured manuscripts — XML's native format.
  • Long-term archival. XML has 25 years of backward-compat tooling. JSON's ecosystem is newer.
  • Industry-standard XML formats. FHIR (healthcare), FIXML (finance), HL7 v3 — converting to JSON means losing ecosystem tooling.

How our tool compares (honestly)

Tool — cost — where it wins — where it loses:

  • FireConvert (free, in-browser) — wins: four mapping conventions (attribute-prefix, Badgerfish, Parker, explicit object) surfaced in the UI; namespace handling with four explicit modes; CDATA and mixed content preserved or serialized per your choice; force-array hints for list safety; runs in-browser, no upload. Loses: no XSD → JSON Schema conversion (we emit JSON, not a schema); no XSLT-equivalent transforms; single file at a time (no bulk folder yet); large files (>20 MB) get sluggish in the browser.
  • xml2js (Node library, free) — wins: the most-used Node library for XML parsing; sensible defaults; highly configurable (explicitArray, ignoreAttrs, mergeAttrs); large community. Loses: a library, not a tool — needs wiring into code; default config groups attributes under a $ key rather than @-prefixing them; the single-item-list problem hits by default unless you set explicitArray: true.
  • fast-xml-parser (Node, free) — wins: fast (the name checks out — pure-JS parsing outperforms libxml bindings for small docs); configurable attribute prefix, text-node key, and array mode; input validation; tree-shakable. Loses: a library; a config surface of ~30 options that all matter; docs improved in 4.x but still require a read; not every convention maps cleanly.
  • Jackson (Java XML-JSON, free) — wins: the Spring Boot standard; excellent XSD → POJO → JSON round-trips; battle-tested on enterprise XML (SOAP, OASIS, HL7). Loses: JVM-only; requires Maven/Gradle setup; not interactive — a dev tool, not a paste-and-see converter; learning curve for @JacksonXmlText and friends.
  • xml-to-json.com (free, web) — wins: simple paste UX; one-click conversion; remembers your last convention. Loses: uploads your XML to their server; ad-heavy; no convention picker — one default only; no namespace handling exposed; the single-item-list problem hits silently.
  • xmltodict (Python, free) — wins: Pythonic API (dict(xmltodict.parse(...))); stable since 2012; great for scripting; the force_list parameter solves the list-safety problem. Loses: a library, no UI; defaults give attribute-prefix output, which may not match your target convention; no namespace-prefix collapse option in the parse step.

Honest summary: if you're doing this in a CI pipeline or a Node service, wire up fast-xml-parser — it's fast and configurable. For ad-hoc interactive conversion where you need to pick the convention, see namespaces explicitly, and avoid the single-item-list trap, our XML to JSON tool is the shorter path. For enterprise Java shops, stay with Jackson — nothing else matches its XSD integration.

Tips for the best result

  • Pick a convention before you convert the batch. Switching later means every downstream consumer has to update its parser. Attribute-prefix is the safest default for general-purpose JSON output.
  • Enable force-array for repeated elements. Our hint list handles the common cases (item, entry, record, row). Add your domain-specific element names.
  • Confirm namespace handling matches your consumer. SOAP consumers usually expect kept prefixes; modern REST consumers usually expect collapsed prefixes. Wrong choice = silent breakage.
  • Check attribute preservation on a sample before trusting the batch. If attributes are important data (IDs, language tags, hrefs), Parker mode is a data-loss trap.
  • For mixed content, decide: order or structure? Array-of-parts preserves both but is verbose; HTML-string preserves readability; concatenated-text is a last resort.
  • Validate round-trip on edge cases. Empty elements, xsi:nil="true", single-item lists, and Unicode-heavy text content are where the usual bugs hide.
  • Keep XML if the schema, tooling, or ecosystem is load-bearing. Conversion is a one-way decision in practice; don't do it just because JSON looks nicer.

Common questions

What happens to my XML attributes?

Depends on the convention. Attribute-prefix (our default) maps them to @-prefixed keys on the element's object. Badgerfish also uses @. Parker drops them entirely. Explicit object mode nests them under a $ key. Pick based on your consumer's expectations.

Why did my single-item list become a string?

That's the classic single-element-wrapped-array problem. XML doesn't distinguish "list of one" from "scalar," so naive converters emit a string. Enable force-array for that element name (or globally) and you'll always get an array, even for one item.

What happens to XML namespaces?

Four options: keep prefix (our default), collapse prefix (lossy if names clash), expand to full URI (verbose but unique), or object-nested (fully lossless, maximum verbosity). Pick based on whether you need round-trip correctness or clean consumer JSON.

Can I convert a SOAP envelope?

Yes — SOAP is XML with specific envelope/header/body structure. The conversion works; just keep namespace prefixes (so soap:Envelope stays distinguishable from your business-namespace Envelope) and use attribute-prefix mode. SOAP fault responses include <soap:Fault> with faultcode and faultstring children — all of that preserves cleanly.

Is XML → JSON always lossless?

No. Elements + attributes + text content can round-trip with the right convention. Mixed content, processing instructions, comments, whitespace-significant content, and multi-namespace collisions can't fully round-trip to JSON and back to byte-identical XML. Semantic equivalence is usually achievable; byte-identical is rare.

What's the best convention for RSS / Atom feeds?

Attribute-prefix with CDATA preservation. RSS feeds use CDATA heavily for HTML-in-description, and attributes (href, type, length) carry primary data. Parker mode would drop them. Our default handles feeds correctly.

Can I convert back — JSON to XML?

Yes, subject to the same convention choices. The conversion depends on whether the original JSON came from an XML source (in which case you can recover structure from the @ / $ / #text convention) or is native JSON being translated for an XML consumer (requires explicit wrapping decisions).
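A sketch of the recovery direction for attribute-prefix JSON, using Python's standard library — @-keys back to attributes, #text back to text content, everything else to child elements. The function name is illustrative, and list values (force-array output) would need extra handling.

```python
import xml.etree.ElementTree as ET

def dict_to_elem(tag, node):
    """Rebuild an element from attribute-prefix JSON (scalars and dicts only)."""
    elem = ET.Element(tag)
    if not isinstance(node, dict):
        elem.text = str(node)      # a collapsed text-only leaf
        return elem
    for key, value in node.items():
        if key.startswith("@"):
            elem.set(key[1:], str(value))
        elif key == "#text":
            elem.text = str(value)
        else:
            elem.append(dict_to_elem(key, value))
    return elem

data = {"book": {"@id": "42", "@lang": "en", "title": "Dune"}}
(tag, node), = data.items()
print(ET.tostring(dict_to_elem(tag, node), encoding="unicode"))
# <book id="42" lang="en"><title>Dune</title></book>
```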

Is this really free and does my XML stay private?

Yes. The parse and conversion happen in your browser tab — the XML never leaves your machine. Unlimited conversions, no sign-up, no watermark. Free tier caps file size; paid tiers lift the cap.

Ready?

XML to JSON →. Paste your XML, pick a convention (attribute-prefix is the safe default), decide namespace handling, enable force-array for known list elements, export clean JSON. Free, in your browser, no upload. If your source is a SOAP response, keep namespaces and attribute prefix — your SOAP-to-REST migration will thank you. For adjacent developer workflows, our YAML to JSON guide covers the config-file case, and the Markdown to HTML guide covers the document-content case. If you need the reverse direction, our JSON to YAML tool handles config round-trips.