What is XML?
Extensible Markup Language, a self-describing text format for structured data, predating JSON and still ubiquitous in enterprise systems, sitemaps, and RSS feeds.
Also known as: Extensible Markup Language
XML uses nested tags to describe data: <book><title>Dune</title><author>Frank Herbert</author></book>. It was designed in the late 1990s as a simpler alternative to SGML and became the default interchange format for the early 2000s web. SOAP, RSS, Atom, sitemaps, SVG, and Office documents are all XML under the hood.
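As a minimal sketch of how nested tags map onto a tree, the book snippet above can be read with Python's standard-library `xml.etree.ElementTree`. The variable names here are illustrative only.

```python
import xml.etree.ElementTree as ET

doc = "<book><title>Dune</title><author>Frank Herbert</author></book>"

# fromstring() parses the text and returns the root element (<book>)
book = ET.fromstring(doc)

# Child elements can be looked up by tag name...
title = book.findtext("title")
author = book.findtext("author")

# ...or iterated in document order
fields = {child.tag: child.text for child in book}

print(book.tag, fields)
```

Each element carries a tag, optional text, and child elements, which is all the "self-describing" structure amounts to in practice.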
JSON ate XML's lunch in the 2010s for HTTP APIs because it is shorter, less ceremonious, and maps directly to JavaScript objects. XML persists where it is locked in: enterprise integrations (banking, healthcare, government), document formats (DOCX, EPUB, SVG), and SEO standards (sitemaps, RSS). Schemas (XSD) and transforms (XSLT) give XML validation and transformation capabilities that JSON has only recently been catching up to with JSON Schema and JSONata.
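To make the "shorter" claim concrete, here is a rough sketch that converts the same book record to JSON and compares the two serializations. It only handles flat child elements; attributes, namespaces, and repeated tags would need conventions that this example deliberately ignores.

```python
import json
import xml.etree.ElementTree as ET

xml_text = "<book><title>Dune</title><author>Frank Herbert</author></book>"
root = ET.fromstring(xml_text)

# Flat child elements map cleanly onto a dict keyed by tag name.
# This is an illustrative mapping, not a general XML-to-JSON converter.
record = {root.tag: {child.tag: child.text for child in root}}

# Compact separators so the comparison is not skewed by whitespace
json_text = json.dumps(record, separators=(",", ":"))

print(json_text)
print("XML chars:", len(xml_text), "JSON chars:", len(json_text))
```

The saving comes mostly from JSON not repeating each name in a closing tag.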
For crawlers, XML matters because sitemaps are XML. The structure is well-defined (urlset, url, loc, lastmod, changefreq, priority) so a sitemap parser is one or two helper functions. RSS and Atom feeds are also XML and remain a clean way to discover new content on a site without crawling every page.
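A sitemap parser along those lines might look like the sketch below. It uses only the standard library, assumes the document is a plain `urlset` (not a sitemap index), and the sample URLs are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org protocol
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
FIELDS = ("loc", "lastmod", "changefreq", "priority")


def parse_sitemap(xml_text):
    """Return one dict per <url> entry, keeping whichever fields are present."""
    root = ET.fromstring(xml_text)
    entries = []
    for url in root.findall("sm:url", NS):
        entry = {}
        for field in FIELDS:
            value = url.findtext(f"sm:{field}", namespaces=NS)
            if value is not None:
                entry[field] = value.strip()
        # <loc> is the only required child; skip entries without it
        if "loc" in entry:
            entries.append(entry)
    return entries


# Hypothetical sample data, not from a real site
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2024-01-15</lastmod>
    <changefreq>daily</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>https://example.com/about</loc>
  </url>
</urlset>"""

entries = parse_sitemap(SAMPLE)
for e in entries:
    print(e["loc"], e.get("lastmod", "(no lastmod)"))
```

The namespace map is the one detail that tends to trip people up: without it, `findall("url")` matches nothing, because every element in a sitemap lives in the sitemaps.org namespace.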
In the wild
- →Parsing a sitemap.xml with 50,000 <url> entries to seed a crawl
- →An SVG file (which is XML) being inspected to extract icon paths
- →A legacy SOAP integration where every request is a <soap:Envelope> of nested elements
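The SVG case can be sketched with the same standard-library parser. The icon below is a made-up example, and `extract_paths` is a hypothetical helper name; the only real requirement is the SVG namespace.

```python
import xml.etree.ElementTree as ET

SVG_NS = {"svg": "http://www.w3.org/2000/svg"}

# Hypothetical two-path icon used only for illustration
ICON = """<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">
  <g>
    <path d="M4 4h16v16H4z"/>
    <path d="M8 12h8"/>
  </g>
</svg>"""


def extract_paths(svg_text):
    """Return the `d` attribute of every <path>, at any nesting depth."""
    root = ET.fromstring(svg_text)
    # ".//" searches all descendants, so paths inside <g> groups are found too
    return [p.get("d") for p in root.iterfind(".//svg:path", SVG_NS) if p.get("d")]


paths = extract_paths(ICON)
print(paths)
```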
FAQ
XML or JSON for new APIs?
JSON. XML's verbosity, namespaces, and schema complexity rarely earn their keep on a modern HTTP API. The exception is integrations with systems that already speak XML.
Can XML handle binary data?
Not natively; binary has to be base64-encoded and embedded as text. CDATA sections do not help with binary: they wrap text that should not be parsed as markup (raw HTML inside an XML doc, for instance).
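Both points can be shown in a short sketch. The `attachment` element and its `encoding` attribute are invented for this example and are not part of any standard.

```python
import base64
import xml.etree.ElementTree as ET

# Arbitrary bytes, including values that are not legal XML characters
payload = bytes([0, 255, 16, 128])

# Binary: base64-encode, then embed as ordinary element text.
# <attachment encoding="base64"> is a hypothetical convention.
attachment = ET.Element("attachment", {"encoding": "base64"})
attachment.text = base64.b64encode(payload).decode("ascii")
xml_text = ET.tostring(attachment, encoding="unicode")

# Round trip: parse the XML back and decode the text
decoded = base64.b64decode(ET.fromstring(xml_text).text)

# CDATA: the parser hands the wrapped markup back as plain text
doc = "<post><body><![CDATA[<p>Hello & welcome</p>]]></body></post>"
body = ET.fromstring(doc).findtext("body")

print(xml_text)
print(decoded == payload, body)
```

Without the CDATA wrapper, the `<p>` would be parsed as a child element and the bare `&` would be a well-formedness error.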
Are sitemaps required to be XML?
No. Google accepts XML sitemaps, plain-text URL lists, RSS feeds, and Atom feeds. XML is the conventional choice and the easiest to validate.
Related terms
JSON: JavaScript Object Notation, a lightweight text format for representing structured data, supported natively by every modern language.
Sitemap: An XML file that lists every important URL on a site so search engines and crawlers can discover them efficiently.
SVG: Scalable Vector Graphics, an XML-based image format that describes shapes mathematically, so the image is sharp at any resolution.
YAML: A human-readable data serialization format that uses indentation rather than braces, popular for configuration files (CI pipelines, Kubernetes manifests, OpenAPI specs).