If you're asking whether llms.txt replaces sitemap.xml, you're asking the wrong question. They solve different problems.
sitemap.xml is about completeness. It tells crawlers which URLs exist.
llms.txt is about judgment. It tells AI agents which pages are worth reading first.
That distinction matters because documentation discovery is no longer just "can Google find this page?" It is also "if an agent needs to solve a task, where should it start?"
For most docs teams, the right answer is simple: keep your sitemap, add llms.txt, and do not try to make one impersonate the other.
If you need the basics first, start with "What Is llms.txt? A Practical Guide for SaaS Docs Teams." If you want examples of good llms.txt structure, read "llms.txt Examples: Real Patterns for API Docs, Help Centers, and Developer Docs."
What sitemap.xml Does
sitemap.xml is a crawler inventory.
Its job is straightforward:
- list URLs
- optionally include lastmod
- help search engines discover pages
- help search engines prioritize crawling
A typical sitemap entry looks like this:

    <url>
      <loc>https://docs.example.com/authentication</loc>
      <lastmod>2026-05-30</lastmod>
    </url>

There is no editorial intent here. The sitemap is not trying to say "read this page first" or "these are the three pages that matter most for API onboarding." It is just telling a crawler what exists.
That is exactly what it should do.
For Google and traditional search engines, this is useful. For AI agents trying to answer a question or complete a task, it is often not enough.
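To make the "inventory" point concrete, here is a minimal Python sketch of what a consumer gets back from a sitemap. The `read_sitemap` helper and the second URL in the sample are assumptions for illustration; the namespace is the standard sitemaps.org one. Notice that the result is a flat list: nothing in it says which page to read first.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org 0.9 schema.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Sample input. The /webhooks entry is hypothetical.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://docs.example.com/authentication</loc>
    <lastmod>2026-05-30</lastmod>
  </url>
  <url>
    <loc>https://docs.example.com/webhooks</loc>
  </url>
</urlset>"""


def read_sitemap(xml_text):
    """Return (url, lastmod) pairs; lastmod is None when the tag is absent."""
    root = ET.fromstring(xml_text)
    entries = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", namespaces=NS)
        lastmod = url.findtext("sm:lastmod", namespaces=NS)
        entries.append((loc.strip(), lastmod))
    return entries


entries = read_sitemap(SAMPLE)
for loc, lastmod in entries:
    print(loc, lastmod)
```

Every entry carries the same weight. Any ranking has to come from somewhere else.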
What llms.txt Does
llms.txt is a curated docs guide for agents.
Its job is different:
- explain what the docs set covers
- point to the most important pages
- group links by real tasks or concepts
- reduce ambiguity about where an agent should start
A typical llms.txt section looks like this:

    # Acme API Docs

    Developer documentation for Acme's REST API. Covers authentication,
    webhooks, rate limits, and SDK setup.

    ## Start Here

    - [Quickstart](https://docs.acme.com/quickstart): Make your first request
    - [Authentication](https://docs.acme.com/authentication): API keys and OAuth
    - [Errors](https://docs.acme.com/errors): Error codes and retry guidance

This is not a full site inventory. It is a small, opinionated map.
That makes it useful in the exact places where a sitemap is weak:
- "what page should I read first?"
- "which auth path is canonical?"
- "where are the webhook docs?"
- "what should I look at before writing code?"
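As a rough illustration of why this shape is easy for a tool to act on, the sketch below parses an llms.txt file into sections and links. It assumes the layout from the Acme example (H2 headings followed by `- [title](url): note` bullets); `parse_llms_txt` is a hypothetical helper, not part of any library.

```python
import re

# One bullet of the form "- [Title](https://url): optional note".
LINK = re.compile(
    r"^-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)]+)\)(?::\s*(?P<note>.*))?$"
)

SAMPLE = """# Acme API Docs

Developer documentation for Acme's REST API.

## Start Here

- [Quickstart](https://docs.acme.com/quickstart): Make your first request
- [Authentication](https://docs.acme.com/authentication): API keys and OAuth
"""


def parse_llms_txt(text):
    """Group links by the H2 section they appear under, in file order."""
    sections = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif current is not None:
            match = LINK.match(line)
            if match:
                note = match["note"] or ""
                sections[current].append((match["title"], match["url"], note))
    return sections


sections = parse_llms_txt(SAMPLE)
first_section = next(iter(sections))
print(first_section, sections[first_section][0])
```

Because sections keep their order, "what page should I read first?" becomes a lookup rather than a guess.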
The Real Difference: Completeness vs Curation
The difference is not XML versus Markdown.
The real difference is this:
- sitemap.xml optimizes for completeness
- llms.txt optimizes for usefulness
That one distinction explains most of the confusion.
If you generate llms.txt straight from your sitemap, you usually lose the thing that makes llms.txt valuable. You get a second, worse sitemap.
If you try to use a sitemap as a task guide, you get a giant URL dump with no editorial signal.
They overlap in the broad sense that both help discovery. But they help different kinds of discovery.
Side-by-Side
Here is the simplest way to think about them:
| | sitemap.xml | llms.txt |
|---|---|---|
| Primary audience | search engine crawlers | AI agents and tools |
| Format | XML | plain text / Markdown |
| Main purpose | list what exists | point to what matters |
| Coverage | exhaustive | selective |
| Maintenance style | generated | curated |
| Best for | crawl discovery | task-oriented docs guidance |
| Bad at | telling agents where to start | representing every page on the site |
That table is more useful than arguing about whether one is "better."
They are not substitutes. They sit at different layers.
When a Sitemap Is Enough
For some jobs, sitemap.xml is enough.
If your goal is:
- making sure Google can discover your pages
- exposing a large docs surface to traditional crawling
- tracking freshness through lastmod
- helping search engines notice newly published docs
then the sitemap is doing exactly what you need.
A sitemap is also better whenever completeness matters more than editorial guidance.
For example:
- versioned docs with lots of pages
- large reference surfaces
- generated API docs
In those cases, you still want the sitemap even if you also publish llms.txt.
When llms.txt Changes the Outcome
llms.txt matters when an agent needs to do more than just discover pages.
It matters when the agent needs help choosing.
Examples:
- your docs have both OAuth and API key auth, but one is the recommended default
- your product has three SDKs, but most users should start with one
- your help center has 200 pages, but only 8 solve most support tasks
- your developer docs have architecture pages that matter before implementation
These are not crawl problems. They are prioritization problems.
That is what llms.txt helps with.
The file is valuable because it captures editorial judgment in a machine-readable way.
The Common Mistake: Turning llms.txt into a Sitemap Clone
This is probably the most common failure mode.
Teams publish llms.txt, but instead of curating it, they dump every docs URL into the file.
At that point, it stops being helpful.
You end up with:
- a longer file
- more ambiguity
- weaker prioritization
- no clear starting path
If the file is just a second inventory, agents still have to guess where to begin.
A good llms.txt should feel like a senior engineer narrowing the search space, not a crawler export.
The Other Common Mistake: Expecting llms.txt to Replace Search Infrastructure
The opposite mistake is also common.
People hear "AI discovery" and assume llms.txt should replace traditional discovery files.
It should not.
You still want:
- sitemap.xml
- canonical URLs
- clean internal linking
- crawlable pages
- good metadata
llms.txt is additive. It does not make your search infrastructure irrelevant.
If your sitemap is broken, llms.txt does not save you. If your docs structure is weak, llms.txt only papers over part of the problem.
A Practical Default for Docs Teams
If you run a docs site in 2026, the default setup is pretty simple:
Keep sitemap.xml exhaustive
Let it do the boring job:
- include every public docs page you want crawled
- generate it automatically
- update lastmod
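The "generate it automatically" step can be as small as the following Python sketch. It assumes your build already knows each page's URL and last-modified date; `build_sitemap`, the /webhooks URL, and its date are placeholders for illustration.

```python
import xml.etree.ElementTree as ET
from datetime import date

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """pages: iterable of (url, last_modified_date) pairs. Returns XML text."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    # Sort so the output is stable between builds.
    for url, modified in sorted(pages):
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        ET.SubElement(entry, "lastmod").text = modified.isoformat()
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical page list; in practice this would come from the docs build.
pages = [
    ("https://docs.example.com/webhooks", date(2026, 5, 12)),
    ("https://docs.example.com/authentication", date(2026, 5, 30)),
]
xml_text = build_sitemap(pages)
print(xml_text)
```

There is no ranking logic here on purpose. Every page the build knows about goes in.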
Keep llms.txt short and curated
Let it do the editorial job:
- quickstart
- auth
- errors
- webhooks
- SDKs
- a few core concept pages if needed
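One way to keep the file curated is to store the picks as a small hand-edited structure and render llms.txt from it. The sketch below assumes that approach; `render_llms_txt` and the shape of `CURATED` are illustrative choices, and the links reuse the Acme example from earlier in this article.

```python
def render_llms_txt(title, summary, sections):
    """sections: list of (heading, [(link_title, url, note), ...]) in reading order."""
    lines = [f"# {title}", "", summary, ""]
    for heading, links in sections:
        lines.append(f"## {heading}")
        lines.append("")
        for link_title, url, note in links:
            lines.append(f"- [{link_title}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Hand-maintained list: the order is the editorial judgment.
CURATED = [
    ("Start Here", [
        ("Quickstart", "https://docs.acme.com/quickstart", "Make your first request"),
        ("Authentication", "https://docs.acme.com/authentication", "API keys and OAuth"),
        ("Errors", "https://docs.acme.com/errors", "Error codes and retry guidance"),
    ]),
]

text = render_llms_txt(
    "Acme API Docs",
    "Developer documentation for Acme's REST API.",
    CURATED,
)
print(text)
```

The rendering is mechanical. The value is in who edits `CURATED` and how often.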
Do not make them mirror each other
If both files contain the same long list of URLs, you probably are not getting much value from llms.txt.
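A quick way to spot mirroring is to measure how much of the sitemap shows up in llms.txt. The Python sketch below is one possible check, not a standard tool: `sitemap_coverage` is a hypothetical helper, the inputs are made up, and any threshold you alert on is a judgment call for your team.

```python
import re

# Captures the URL part of a Markdown link: [title](https://...)
URL_IN_LINK = re.compile(r"\]\((https?://[^)\s]+)\)")


def sitemap_coverage(llms_text, sitemap_urls):
    """Fraction of sitemap URLs that also appear as links in llms.txt."""
    linked = set(URL_IN_LINK.findall(llms_text))
    known = set(sitemap_urls)
    if not known:
        return 0.0
    return len(linked & known) / len(known)


# Hypothetical inputs for illustration only.
sitemap_urls = [
    "https://docs.acme.com/quickstart",
    "https://docs.acme.com/authentication",
    "https://docs.acme.com/errors",
    "https://docs.acme.com/changelog/2026-05",
]
llms_text = "- [Quickstart](https://docs.acme.com/quickstart): Make your first request\n"

ratio = sitemap_coverage(llms_text, sitemap_urls)
print(f"{ratio:.0%} of sitemap URLs appear in llms.txt")
```

A ratio close to 1.0 on a large site suggests the file is acting as a second inventory. A low ratio is what a curated file usually looks like.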
What to Ship First
If you only have time for one hour of work, do this:
- make sure your sitemap is present and healthy
- create a small llms.txt with 10 to 20 important links
- group those links by real tasks, not by internal nav labels
That gets you most of the benefit.
You can refine from there.
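If you want a guardrail in CI for those two rules, something like the following Python sketch would do. It assumes links are Markdown bullets and sections are H2 headings; `lint_llms_txt` is a hypothetical helper, and the 10 to 20 range is simply the guideline above turned into defaults.

```python
import re

# A Markdown bullet that starts with a link: "- [Title](url)".
LINK = re.compile(r"^\s*-\s*\[[^\]]+\]\([^)]+\)")


def lint_llms_txt(text, min_links=10, max_links=20):
    """Return a list of warnings; an empty list means both checks passed."""
    warnings = []
    section = None
    total = 0
    ungrouped = 0
    for line in text.splitlines():
        if line.startswith("## "):
            section = line[3:].strip()
        elif LINK.match(line):
            total += 1
            if section is None:
                ungrouped += 1
    if not min_links <= total <= max_links:
        warnings.append(f"{total} link(s); expected {min_links} to {max_links}")
    if ungrouped:
        warnings.append(f"{ungrouped} link(s) not under a section heading")
    return warnings


# Hypothetical file with one link outside any section.
draft = (
    "# Docs\n\n"
    "- [Changelog](https://docs.acme.com/changelog): Release notes\n\n"
    "## Start Here\n\n"
    "- [Quickstart](https://docs.acme.com/quickstart): Make your first request\n"
)
warnings = lint_llms_txt(draft)
for warning in warnings:
    print(warning)
```

The check cannot tell whether the groupings reflect real tasks. That part still needs a human review.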
What Neither File Solves
It is worth being explicit about the limit.
Neither sitemap.xml nor llms.txt solves:
- bloated HTML payloads
- weak docs information architecture (IA)
- broken examples
- unclear product boundaries
- missing implementation guidance
And neither one tells the agent how to use your product well in the deeper sense.
That is why we keep treating llms.txt as a discovery layer, not the whole AI-readable docs story. If you want the systems argument for that, read "llms.txt Isn't Enough."
The Short Version
If you want one sentence:
sitemap.xml tells crawlers what exists. llms.txt tells agents what matters.
Most teams should ship both.
One is infrastructure.
The other is judgment.