llms.txt
llms.txt is a proposed standard for a plain-text file served at the root of a website (for example, https://example.com/llms.txt) that gives large language models a curated, human-readable map of the most important content on the site. It is to LLMs roughly what sitemap.xml is to search engines — but optimised for context windows, not crawlers.
Why llms.txt exists
Modern AI systems often have to summarise a site from a single page or a small set of fetched URLs. Marketing chrome, navigation, cookie banners and JavaScript-rendered widgets can crowd out the substance. A site's most useful content for an LLM — its documentation, definitions, pricing, comparisons — is rarely the easiest to extract from raw HTML.
llms.txt addresses this by letting the publisher hand-write a markdown index that points the model directly at clean, canonical resources. It is an editorial tool, not a permission tool.
What llms.txt looks like
The file is markdown-flavoured. A typical structure starts with the site's name as an H1, a short description, and one or more sections of links with brief annotations:
    # Example Co
    > Open-source platform for X.

    ## Docs
    - [Getting started](https://example.com/docs/start): five-minute setup
    - [API reference](https://example.com/docs/api): endpoints, auth, rate limits

    ## Product
    - [Pricing](https://example.com/pricing): plans and add-ons
An optional llms-full.txt can hold the full text of those resources concatenated into a single file, ready to drop into a model's context window.
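To make the structure concrete, here is a minimal Python sketch of how a consumer might read such a file into sections and links. It only handles the layout shown in the example (H1 name, blockquote summary, H2 sections, annotated link bullets). The names `LlmsIndex`, `parse_llms_txt` and `LINK_RE` are hypothetical, not part of any published parser.

```python
import re
from dataclasses import dataclass, field

# Matches a link bullet such as "- [Title](https://example.com/page): note".
# The ": note" part is optional.
LINK_RE = re.compile(
    r"^-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<note>.*))?$"
)


@dataclass
class LlmsIndex:
    name: str = ""
    summary: str = ""
    # Maps each H2 section title to a list of (title, url, note) tuples.
    sections: dict = field(default_factory=dict)


def parse_llms_txt(text: str) -> LlmsIndex:
    """Parse the simple llms.txt layout shown above (an illustrative subset)."""
    index = LlmsIndex()
    current = None  # title of the H2 section being read, if any
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# ") and not index.name:
            index.name = line[2:].strip()
        elif line.startswith("> ") and not index.summary:
            index.summary = line[2:].strip()
        elif line.startswith("## "):
            current = line[3:].strip()
            index.sections[current] = []
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                index.sections[current].append(
                    (match["title"], match["url"], match["note"] or "")
                )
    return index


SAMPLE = """\
# Example Co
> Open-source platform for X.

## Docs
- [Getting started](https://example.com/docs/start): five-minute setup
- [API reference](https://example.com/docs/api): endpoints, auth, rate limits

## Product
- [Pricing](https://example.com/pricing): plans and add-ons
"""

index = parse_llms_txt(SAMPLE)
for section, links in index.sections.items():
    print(section, [title for title, _url, _note in links])
```

Because each link carries its annotation, a consumer can choose which URL to fetch from the parsed index alone, without downloading every page first.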
llms.txt vs. robots.txt vs. sitemap.xml
| File | Audience | Purpose | Format |
|---|---|---|---|
| robots.txt | Crawlers | Allow / disallow access | Plain text directives |
| sitemap.xml | Search engines | List every indexable URL | Structured XML |
| llms.txt | LLMs and AI agents | Curate the most useful content | Markdown index |
Does it actually work?
llms.txt is an emerging standard. Major AI vendors have not formally committed to consuming it, but several agentic frameworks, code assistants, and research tools already check for it when summarising or indexing a domain. The cost of implementation is low — a handful of lines of markdown — and the upside is a higher-fidelity representation of your site when an AI system needs one.
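As a rough illustration of what "checking for it" could involve, the Python sketch below derives the root-level llms.txt URL from any page URL on a site. The function name `llms_txt_url` is hypothetical, and real tools may locate the file differently. The sketch only builds the URL and leaves the actual fetch to the caller.

```python
from urllib.parse import urlsplit, urlunsplit


def llms_txt_url(page_url: str, filename: str = "llms.txt") -> str:
    """Return the root-level URL for `filename` on the same origin as `page_url`."""
    parts = urlsplit(page_url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not an absolute URL: {page_url!r}")
    # Keep scheme and host, replace the path, drop query string and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/" + filename, "", ""))


candidate = llms_txt_url("https://example.com/docs/api?lang=en#auth")
print(candidate)
```

A tool would then request that URL and fall back to its normal HTML extraction if the file is missing.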
How to implement llms.txt
- Author /llms.txt as a curated markdown index, not a dump of every URL
- List your strongest evergreen content first: docs, definitions, comparison pages
- Annotate each link in plain English so a model can pick the right one without fetching
- Optionally publish /llms-full.txt with the full text of those resources concatenated
- Serve both files with Content-Type: text/plain and a long cache lifetime
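The steps above can be sketched in Python as a small generator that renders the index from a hand-curated list of links. This is one possible approach under stated assumptions, not a reference implementation: `render_llms_txt` and `HEADERS` are hypothetical names, and the `charset=utf-8` parameter and the one-day `max-age` are illustrative choices that the checklist does not prescribe.

```python
def render_llms_txt(name, summary, sections):
    """Render a markdown index from a site name, a summary, and link sections.

    `sections` maps a section title to a list of (title, url, note) tuples,
    listed in the order they should appear (strongest content first).
    """
    lines = [f"# {name}", f"> {summary}"]
    for section_title, links in sections.items():
        lines.append("")  # blank line before each section heading
        lines.append(f"## {section_title}")
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
    return "\n".join(lines) + "\n"


# Suggested response headers. The charset and the one-day max-age are
# assumptions for illustration; pick a cache lifetime that suits the site.
HEADERS = {
    "Content-Type": "text/plain; charset=utf-8",
    "Cache-Control": "public, max-age=86400",
}

body = render_llms_txt(
    "Example Co",
    "Open-source platform for X.",
    {
        "Docs": [
            ("Getting started", "https://example.com/docs/start", "five-minute setup"),
            ("API reference", "https://example.com/docs/api", "endpoints, auth, rate limits"),
        ],
        "Product": [
            ("Pricing", "https://example.com/pricing", "plans and add-ons"),
        ],
    },
)
print(body)
```

Keeping the link list in code or configuration means the file can be regenerated whenever a page is added or retired, while the selection itself stays an editorial decision.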
Frequently asked questions
Does llms.txt replace robots.txt?
No. robots.txt still controls crawler access. llms.txt assumes a model has already chosen to read your site and helps it find the best content.
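To show how the two files can work together, the Python sketch below filters links taken from an llms.txt through the rules in robots.txt, using the standard library's `urllib.robotparser`. The robots.txt content, the link list and the `ExampleAgent` user-agent name are all made up for illustration. Whether a given AI agent performs this check is up to that agent.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /internal/
"""

# Hypothetical links as they might be listed in a site's llms.txt.
LINKS_FROM_LLMS_TXT = [
    "https://example.com/docs/start",
    "https://example.com/internal/roadmap",
]

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Keep only the llms.txt links that robots.txt allows this agent to fetch.
allowed = [
    url for url in LINKS_FROM_LLMS_TXT if parser.can_fetch("ExampleAgent", url)
]
print(allowed)
```

In this sketch robots.txt decides whether a URL may be fetched, and llms.txt only suggests which permitted URLs are worth reading.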
Is llms.txt the same as ai.txt?
No. ai.txt is an alternative permission-style proposal focused on training opt-outs. llms.txt is a content discovery file. The two can coexist.
Will it improve my AI search visibility?
It can — particularly for agents and code assistants that fetch a single URL to summarise a brand. It is not a substitute for sound technical SEO, structured data, or earning citations through quality content.
Related Terms
What Are AI Crawlers?
AI crawlers are bots run by AI vendors. Some fetch pages to train models, others fetch pages live to power answers inside AI search products.
What Is Schema Markup?
Schema markup is structured code added to a page using the Schema.org vocabulary. It describes the content in a way search engines and AI systems can parse reliably.
What Is Structured Data?
Structured data is information marked up in a defined format so machines can read it without guessing. On the web, that usually means Schema.org JSON-LD.
Generative Engine Optimization (GEO)
Generative Engine Optimization is the work of shaping how generative AI platforms describe, recommend, and cite your brand when they answer a buyer's question.
Answer Engine Optimization (AEO)
Answer Engine Optimization is the work of becoming the cited source inside AI answers from ChatGPT, Gemini, Claude, and Perplexity, not just a blue link on Google.
What Is Source Attribution?
Source attribution is the practice of an AI system naming and linking the sources it used to generate an answer.
