In late 2024, a new file format emerged: llms.txt. Some called it the “robots.txt for AI.” Others dismissed it as vaporware. The reality, as with most things in early-stage AI optimization, lands somewhere in between. We analyzed the current evidence, tested implementations, and talked to the people building AI retrieval systems. Here’s what we found about whether llms.txt actually helps AI visibility.
TL;DR — llms.txt and AI Visibility
- llms.txt is a new structured file that tells AI crawlers what content to prioritize
- It’s not robots.txt for AI — it’s closer to a curated content index for language models
- Current evidence shows limited direct impact on AI citation rates for most sites
- Most useful for complex or large sites with deep content hierarchies
- Worth implementing alongside schema and content improvements — not as a standalone fix
What Is llms.txt?
llms.txt is a proposed standard for providing AI language models with structured information about a website’s content. Originally proposed by Jeremy Howard of Answer.AI (also a co-founder of fast.ai), the concept is straightforward: just as robots.txt tells search engine crawlers what they can and can’t access, llms.txt tells AI models what your most important content is and how it’s organized.
The file lives at the root of your website (e.g., yoursite.com/llms.txt) and contains a structured, human-readable overview of your site, its purpose, key pages, and content organization.
A typical llms.txt file includes:
- A brief description of the site and organization
- A list of the most important pages with descriptions
- Content categories and their purposes
- Links to key resources like documentation, blog, and product pages
The format is intentionally simple — plain text with Markdown formatting, designed to be easily parsed by both AI models and humans.
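As a concrete illustration, here is a minimal llms.txt following the proposed convention (an H1 title, a blockquote summary, and link lists grouped under H2 headings). The company name and URLs are invented for the example:

```markdown
# Acme Analytics

> Acme Analytics provides self-serve product analytics for SaaS teams.
> This site contains product documentation, a technical blog, and pricing details.

## Documentation

- [Getting Started](https://example.com/docs/start): Installation and first-query walkthrough
- [API Reference](https://example.com/docs/api): REST endpoints, authentication, and rate limits

## Blog

- [Event Tracking Guide](https://example.com/blog/events): How to design an event taxonomy

## Optional

- [Changelog](https://example.com/changelog): Release notes, updated weekly
```

The `## Optional` section is part of the proposed convention for content an AI system can skip when context is limited.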
How llms.txt Works
The concept behind llms.txt addresses a specific problem: when an AI model or its retrieval system encounters your website, how does it efficiently understand what your site is about and where to find the most valuable content?
Current approaches to this problem are imperfect:
- Sitemaps (sitemap.xml): List every URL but provide no context about what each page contains or why it matters.
- Robots.txt: Tells crawlers where they can go, but nothing about what they’ll find there.
- Homepage crawling: AI crawlers can start at your homepage and follow links, but this is inefficient and may miss important pages buried in your site architecture.
llms.txt aims to solve this by providing a curated, context-rich guide to your site’s content. Think of it as a cover letter for your website — a concise document that says: “Here’s who we are, here’s what we know about, and here’s where to find our best content on each topic.”
When an AI system encounters your llms.txt, it can quickly assess:
- Whether your site is relevant to the current query
- Which specific pages are most likely to contain the needed information
- How your content is organized and interconnected
- What topics and areas you specialize in

Who Has Implemented llms.txt?
Adoption has been growing steadily, particularly among technology companies and content-heavy sites. Notable early adopters include:
- Anthropic (anthropic.com): The company behind Claude publishes its own llms.txt, an early high-profile adoption that helped popularize the standard.
- Various documentation sites: Technical documentation platforms were early adopters, as their content is heavily consumed by AI models.
- SaaS companies: Product companies with extensive help documentation and blog content have adopted llms.txt to guide AI retrieval.
- SEO and marketing platforms: Companies in the SEO space have adopted llms.txt as both a practical tool and a signal of AI-readiness.
The adoption curve is similar to schema markup adoption in the early 2010s — early movers are technology companies and SEO-aware businesses, with broader adoption following as the standard gains traction.
Does It Actually Help AI Visibility?
This is the critical question, and the honest answer is: the evidence is promising but not definitive.
What we know:
- AI crawlers do read llms.txt. Server log analysis shows that major AI crawlers (GPTBot, ClaudeBot, PerplexityBot) do request /llms.txt when they crawl a site. This confirms that the standard is being consumed by AI systems.
- It helps with content discovery. For large sites with thousands of pages, llms.txt can direct AI crawlers to the most important content faster than relying on crawl-based discovery alone.
- The format is LLM-friendly. Because llms.txt uses plain text with Markdown, it’s trivially easy for language models to parse. No complex XML parsing or schema interpretation required.
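You can run this check against your own server logs. A minimal sketch, assuming combined-log-format access logs and the publicly documented user-agent names for the major AI crawlers:

```python
from collections import Counter

# User-agent substrings publicly documented by OpenAI, Anthropic, and Perplexity.
AI_BOTS = ("GPTBot", "ClaudeBot", "PerplexityBot")

def llms_txt_hits(log_lines):
    """Count requests for /llms.txt per AI crawler in an access log."""
    hits = Counter()
    for line in log_lines:
        if '"GET /llms.txt' not in line:
            continue  # only count fetches of the file itself
        for bot in AI_BOTS:
            if bot in line:
                hits[bot] += 1
    return hits

# Invented sample lines standing in for a real access log:
sample = [
    '1.2.3.4 - - [01/Mar/2025] "GET /llms.txt HTTP/1.1" 200 512 "-" "GPTBot/1.0"',
    '5.6.7.8 - - [01/Mar/2025] "GET /index.html HTTP/1.1" 200 900 "-" "ClaudeBot"',
    '9.9.9.9 - - [02/Mar/2025] "GET /llms.txt HTTP/1.1" 200 512 "-" "ClaudeBot"',
]
print(llms_txt_hits(sample))  # Counter({'GPTBot': 1, 'ClaudeBot': 1})
```

In practice you would feed this `open("/var/log/nginx/access.log")` instead of the sample list; the substring matching is deliberately loose, since crawler user-agent strings include version numbers and URLs.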
What we don’t know:
- Direct citation impact. There’s no published research showing a causal relationship between having an llms.txt file and receiving more AI citations. The standard is too new for rigorous before/after studies.
- Retrieval weight. It’s unclear how much weight AI retrieval systems give to llms.txt versus their existing crawling and indexing processes. It may serve as a tiebreaker rather than a primary signal.
- Cross-platform consistency. Different AI systems may use llms.txt differently, and there’s no guarantee that all major platforms will standardize their approach to it.
Our assessment: llms.txt provides a low-cost signal that makes your site more legible to AI systems. It’s unlikely to be the single factor that determines whether you get cited, but it contributes to the overall machine-readability that correlates with AI citation success. Given the minimal implementation effort (30-60 minutes), the risk-reward ratio strongly favors implementation.

The Crawl vs Retrieval Distinction
Understanding why llms.txt’s impact is indirect requires understanding the two-stage process of AI search:
Stage 1: Crawling and Indexing (where llms.txt helps)
AI systems crawl the web to build their knowledge base and search indices. During this stage, llms.txt helps by providing a structured guide to your content, improving content discovery, and ensuring your most important pages are indexed.
Stage 2: Retrieval and Generation (where content quality matters)
When a user asks a question, the AI retrieves relevant documents from its index and generates an answer with citations. At this stage, the AI is evaluating content quality, relevance, authority, and specificity — not consulting your llms.txt.
This distinction explains why llms.txt alone won’t transform your AI visibility. It optimizes the crawl stage, ensuring your content gets into the AI’s index. But the retrieval stage — where citation decisions happen — depends on your content quality, structure, authority, and relevance.
Think of llms.txt as making sure your book is in the library. Whether anyone checks it out depends on what’s inside.
Should You Implement llms.txt?
Yes. Here’s why:
- Minimal effort, potential upside. Creating an llms.txt file takes 30-60 minutes. The potential benefit — improved content discovery by AI systems — is worth that investment even if the impact is modest.
- Future-proofing. As AI search matures, standards for communicating with AI systems will become more important, not less. Early implementation positions you ahead of competitors who will eventually need to catch up.
- Signal of AI-readiness. Having an llms.txt file signals to AI systems (and to users who check) that your site is aware of and prepared for AI search. This is similar to having schema markup — its presence indicates a site that takes machine readability seriously.
- Content organization benefit. The process of creating llms.txt forces you to identify and articulate your site’s most important content and topic areas. This exercise often reveals content gaps and organizational issues that benefit your overall content strategy.
How to Write a Good llms.txt
Here’s a five-step process for creating an effective llms.txt file:
Step 1: Open with a Clear Site Description
Start with 2-3 sentences describing your organization and what your site offers. Be specific about your domain of expertise.
Step 2: List Your Primary Content Sections
Organize your key pages by category. For each section, provide a brief description of what content it contains and what topics it covers.
Step 3: Highlight Your Most Important Pages
For each section, list the 5-10 most important or comprehensive pages with a one-line description. These should be your pillar content, product pages, and key resources.
Step 4: Include Key Topics and Entities
List the primary topics your site covers and any entities (products, tools, frameworks) that you’re authoritative about. This helps AI models quickly determine if your site is relevant to a given query.
Step 5: Keep It Updated
Review and update your llms.txt quarterly, or whenever you publish significant new content. An outdated llms.txt that points to moved or deleted pages is worse than no llms.txt at all.
The total file should be concise — typically 50-200 lines. Remember, it needs to be easily parseable by an AI model, so clarity and structure matter more than comprehensiveness.
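The checks above are easy to automate before publishing. A sketch of a small linter, using this article's guidelines as thresholds (they are editorial advice, not part of any spec):

```python
import re

def lint_llms_txt(text):
    """Return a list of warnings for an llms.txt draft.

    Thresholds follow the guidance above (one H1, H2 sections,
    Markdown links, roughly 50-200 lines); they are not a formal spec.
    """
    warnings = []
    lines = text.splitlines()
    if not lines or not lines[0].startswith("# "):
        warnings.append("should open with a single '# Site Name' heading")
    if not any(l.startswith("## ") for l in lines):
        warnings.append("no '## Section' headings found")
    if not re.search(r"\[[^\]]+\]\([^)]+\)", text):
        warnings.append("no Markdown links to key pages")
    if len(lines) > 200:
        warnings.append(f"{len(lines)} lines; consider trimming to ~50-200")
    return warnings

draft = "# Acme\n\n## Docs\n- [Start](https://example.com/docs): intro\n"
print(lint_llms_txt(draft))  # [] -- passes all checks
```

An empty list means the draft clears the basic structural checks; anything it flags is worth fixing before the file goes live.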
llms.txt vs robots.txt vs sitemap.xml
| Feature | robots.txt | sitemap.xml | llms.txt |
|---|---|---|---|
| Purpose | Access control (allow/disallow) | URL discovery | Content context and guidance |
| Target audience | Search engine crawlers | Search engine crawlers | AI models and LLM crawlers |
| Format | Custom text format | XML | Plain text / Markdown |
| Contains content descriptions | No | No (URLs and dates only) | Yes |
| Prioritization | No | Priority field (often ignored) | Implicit through page ordering |
| Topic context | No | No | Yes |
| Adoption | Universal | Very high | Growing (early stage) |
These three files are complementary, not competing. You should have all three:
- robots.txt to control crawl access
- sitemap.xml to ensure complete URL discovery
- llms.txt to provide content context for AI systems
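A quick way to audit whether a site serves all three files is to request each well-known path and check the status code. A sketch with the HTTP call injected as a parameter, so the logic can be tested offline (the `fake` dictionary below stands in for real requests):

```python
from urllib.parse import urljoin

WELL_KNOWN = ("/robots.txt", "/sitemap.xml", "/llms.txt")

def check_well_known(base_url, fetch_status):
    """Report which of the three complementary files a site serves.

    fetch_status is any callable mapping a URL to an HTTP status code,
    e.g. a thin wrapper around urllib.request or requests.head.
    """
    return {path: fetch_status(urljoin(base_url, path)) == 200
            for path in WELL_KNOWN}

# Offline stub standing in for real HTTP responses:
fake = {"https://example.com/robots.txt": 200,
        "https://example.com/sitemap.xml": 200,
        "https://example.com/llms.txt": 404}
print(check_well_known("https://example.com", lambda url: fake.get(url, 404)))
# {'/robots.txt': True, '/sitemap.xml': True, '/llms.txt': False}
```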
llms-full.txt: The Extended Format
Some implementations also include an llms-full.txt file that provides more comprehensive content — sometimes including actual page content in Markdown format. This extended version serves as a single-file knowledge base that AI models can consume without crawling individual pages.
The trade-off: llms-full.txt can be very large and may contain content you’d normally only serve behind page loads. It’s most useful for documentation sites and knowledge bases where comprehensive AI comprehension is the priority. For most business sites, the standard llms.txt is sufficient.
Implementation Checklist
Ready to add llms.txt to your site? Here’s the quick checklist:
- Create a plain text file named llms.txt
- Add your site description and organization overview
- List your primary content sections with descriptions
- Include URLs to your 10-20 most important pages
- Upload to your site root (accessible at yoursite.com/llms.txt)
- Verify it’s accessible by visiting the URL in your browser
- Set a quarterly calendar reminder to review and update it
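Part of the verification step can be scripted: pull every absolute URL out of the draft file so each can be re-checked (for example with a HEAD request) before upload. A sketch:

```python
import re

LINK_RE = re.compile(r"\[[^\]]+\]\((https?://[^)\s]+)\)")

def extract_links(llms_txt_text):
    """Return every absolute URL linked from an llms.txt draft,
    so each can be re-verified before the file is (re)published."""
    return LINK_RE.findall(llms_txt_text)

text = ("## Docs\n"
        "- [Start](https://example.com/docs/start): intro\n"
        "- [API](https://example.com/docs/api): reference\n")
print(extract_links(text))
# ['https://example.com/docs/start', 'https://example.com/docs/api']
```

Running this on each quarterly review catches links to moved or deleted pages, which is the main way an llms.txt goes stale.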
Want to know how your overall AI visibility stacks up? BlueJar’s free GEO audit evaluates your site’s AI readiness across multiple dimensions — including llms.txt, schema markup, content structure, and more — and provides a 0-100 GEO score with specific improvement recommendations.
FAQ: llms.txt
Is llms.txt an official web standard?
Not yet. It’s a proposed convention, similar to how robots.txt started as an informal agreement before becoming an official standard. There’s growing adoption, but no W3C or IETF specification yet. This means the format may evolve, but the core concept — providing AI-readable site context — is unlikely to change.
Will implementing llms.txt guarantee more AI citations?
No. llms.txt improves content discovery at the crawl stage, but AI citations depend on content quality, authority, and relevance at the retrieval stage. llms.txt is one component of a comprehensive GEO strategy, not a silver bullet.
Does llms.txt replace robots.txt?
No. They serve different purposes. robots.txt controls crawler access (what AI crawlers can and can’t access). llms.txt provides content context and guidance (what your site is about and where to find key content). You should have both.
How often should I update llms.txt?
Quarterly at minimum, or whenever you add significant new content sections, launch new products, or restructure your site. A stale llms.txt that references deleted or moved pages can be counterproductive.
Should I include every page on my site?
No. llms.txt should be curated, not comprehensive. Include your 10-20 most important pages per content section. For comprehensive URL listing, that’s what sitemap.xml is for. llms.txt is about quality and context, not completeness.
Can llms.txt hurt my AI visibility?
Only if it’s inaccurate or misleading. If your llms.txt claims expertise in topics your site doesn’t actually cover, or links to pages that return 404 errors, it could send negative signals. Keep it accurate and current.
What is an llms.txt file and why does it matter?
llms.txt is a plain text file placed at your domain root (yourdomain.com/llms.txt) that provides AI language models with a structured overview of your site’s content, key pages, and content summaries. It’s designed to be both human-readable and AI-readable, helping AI systems better understand your site’s purpose and content — similar to robots.txt but for AI comprehension rather than crawling rules.
How do I create an llms.txt file?
Create a plain text file named llms.txt in your site’s root directory. Include: (1) a brief site overview/description, (2) key pages organized by topic with URLs and summaries, (3) important content categories and their purpose, (4) contact and company information. Keep it concise — typically 50-200 lines — and written in clear prose that both humans and AI can understand.
Does llms.txt directly affect ChatGPT or Perplexity citations?
There’s no published evidence of a direct effect. llms.txt helps AI systems understand your site’s content structure and key pages, which may improve how accurately your content is represented, but it’s not a guarantee of citation — underlying content quality, authority, and schema markup matter more. Its clearest benefit is reducing the chance that AI systems misrepresent what your site covers.
Where can I see examples of well-structured llms.txt files?
BlueJar maintains its own llms.txt at beta.bluejar.ai/llms.txt as a reference example. Many tech companies and SaaS providers have published llms.txt files, and the llms.txt community site (llmstxt.org) maintains documentation and examples of well-structured files.