llms.txt Best Practices: The Most Common Mistakes and How to Avoid Them

Having an llms.txt is good. Having an llms.txt that AI systems can actually use is better. This guide shows the typical mistakes and explains what really belongs in the file — and what does not.

The 5 most common mistakes

Most llms.txt files fail not because of technical issues — but because of content. These five mistakes come up most frequently:

Mistake #1

Too vague and non-specific

Descriptions like "We are an innovative company that develops solutions for our customers" are worthless for AI systems. They provide no concrete context and could apply to thousands of websites.

→ Be specific: What exactly do you do, for whom, and with what result?

Mistake #2

Incorrect or outdated information

An llms.txt with outdated product names, old prices or discontinued services is worse than no llms.txt — AI systems will then give incorrect information about your website.

→ Update llms.txt with every major change, and specify the date.

Mistake #3

Wrong encoding or broken special characters

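If the file is saved in an encoding other than UTF-8 (for example ISO-8859-1 or Windows-1252), non-ASCII characters such as umlauts, accented letters or typographic quotes turn into garbled bytes. AI systems may then be unable to process the file correctly.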

→ Always save the file as UTF-8 without BOM and check it with the validator.

Mistake #4

Sensitive information in llms.txt

Internal price lists, employee data, unpublished products or confidential business information do not belong in llms.txt — the file is public and accessible to everyone.

→ Only include publicly available information intended for users.

Mistake #5

Incorrect Content-Type output

If the web server does not deliver llms.txt as text/plain, some AI systems may not be able to process the file correctly. This is especially common with content management systems.

→ Check after deployment: curl -I yourdomain.com/llms.txt
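
The same check can be scripted so it runs after every deployment. The following is a minimal sketch in Python, not a reference implementation: the function names are hypothetical, and requiring charset=utf-8 in addition to text/plain is an assumption taken from the checklist at the end of this guide.

```python
import urllib.request


def check_content_type(header_value):
    """Return a list of problems found in a Content-Type header value."""
    parts = [part.strip() for part in header_value.split(";")]
    media_type = parts[0].lower()

    # Collect parameters such as charset=utf-8 into a dict.
    params = {}
    for part in parts[1:]:
        if "=" in part:
            key, value = part.split("=", 1)
            params[key.strip().lower()] = value.strip().strip('"').lower()

    problems = []
    if media_type != "text/plain":
        problems.append(f"media type is {media_type!r}, expected 'text/plain'")
    if params.get("charset") != "utf-8":
        problems.append("charset is missing or not utf-8")
    return problems


def fetch_content_type(url):
    """Send a HEAD request (like curl -I) and return the Content-Type header."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.headers.get("Content-Type", "")
```

A call such as check_content_type(fetch_content_type("https://yourdomain.com/llms.txt")) returns an empty list when the header is as expected, and a list of readable problems otherwise.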

What belongs in — and what does not

Belongs in

A good llms.txt answers the questions an AI system would have about your website before it starts crawling:

  • Clear identity: Name of the website/organisation and what it does — in one sentence
  • Concrete offering: What specific products, services or content are available?
  • Target audience: Who is the website made for?
  • Key URLs: Links to the central pages — not all of them, just the most important
  • Contact: A reachable email address
  • Language(s): In what language is the content available?

✓ Good: "llmshub.de is the German-language platform for llms.txt and LLM visibility. We offer free tools for creating and validating llms.txt files, as well as guides on GEO and AI crawlers."

✗ Bad: "Welcome to our website. We are experts in our field and help our customers succeed."
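
To make the list above concrete, here is a small sketch that assembles an llms.txt from exactly those fields. The section headings ("Key pages", "Contact", "Languages"), the function name and the example.com data are illustrative assumptions, not a fixed standard.

```python
def build_llms_txt(name, summary, audience, key_pages, contact_email, languages):
    """Assemble a small llms.txt from the fields listed above.

    The headings used here are illustrative assumptions.
    """
    lines = [
        f"# {name}",
        "",
        f"> {summary}",
        "",
        f"Target audience: {audience}",
        "",
        "## Key pages",
        "",
    ]
    # Link only the most important pages, one Markdown link per line.
    lines += [f"- [{title}]({url})" for title, url in key_pages]
    lines += [
        "",
        "## Contact",
        "",
        contact_email,
        "",
        "## Languages",
        "",
        ", ".join(languages),
        "",
    ]
    return "\n".join(lines)


# Hypothetical example data.
example = build_llms_txt(
    name="Example Shop",
    summary="Example Shop sells refurbished laptops to students in Germany.",
    audience="Students and first-time laptop buyers",
    key_pages=[
        ("Catalogue", "https://example.com/catalogue"),
        ("Docs", "https://example.com/docs"),
    ],
    contact_email="hello@example.com",
    languages=["German", "English"],
)
print(example)
```

Keeping each field to a single sentence or line forces the kind of precision described above: there is simply no room for filler.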

Does not belong in

  • Advertising copy and marketing language: "Market leader", "unique", "revolutionary" — AI systems value substance, not superlatives
  • Complete page texts: That is what llms-full.txt is for — llms.txt is a structured overview, not a content dump
  • Technical implementation details: Which CMS is used or how the server is configured does not interest AI systems
  • Repetitions from robots.txt: Crawling rules belong in robots.txt, not in llms.txt
  • SEO keywords without context: A list of keywords is not explanatory text

Language and tone

LLMs process natural language — that is their strength. An llms.txt should therefore be written in clearly structured but natural language. Not like a database schema, but not like an advertising text either.

Precision beats length

Two precise sentences about the actual offering are more valuable than a paragraph of vaguely formulated company philosophy. AI systems extract concrete facts — the more clearly these are formulated, the better they can be used.

✓ Precise: "AI-Ready Check tests free of charge whether a website is technically optimised for AI search engines such as ChatGPT and Perplexity. The tool analyses 20 factors and returns a score from 0 to 100."

✗ Vague: "We help companies improve their digital presence and remain visible in the modern AI-driven world."

Use the language of your audience

Write the llms.txt in the main language of your website and target audience. If your website is in English, write the llms.txt in English. If you are targeting international audiences, create separate llms.txt files or add a section in another language.

Tip: Write the llms.txt as if you were explaining to someone in two minutes what your website is and why it matters. No jargon, no filler — direct and concrete.

File size and performance

The llms.txt should stay lean. A rule of thumb: under 10 KB for llms.txt, under 100 KB for llms-full.txt. AI crawlers have limited timeouts — a file that is too large may not be fully loaded.

What unnecessarily bloats the file size

  • Complete product descriptions or blog articles — these belong in llms-full.txt
  • Long lists of URLs — only link to the 5–10 most important pages
  • Redundant information that repeats itself
  • Binary characters or faulty encoding artefacts
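
The size and link rules of thumb can be checked automatically. The sketch below assumes that 1 KB means 1024 bytes, that links are written as Markdown links, and that 10 links is the upper bound; the names are hypothetical.

```python
import re

MAX_BYTES = 10 * 1024  # "under 10 KB", assuming 1 KB = 1024 bytes
MAX_LINKS = 10         # upper end of the "5-10 most important pages" rule
MARKDOWN_LINK = re.compile(r"\[[^\]]*\]\((https?://[^)\s]+)\)")


def check_size_and_links(text):
    """Return a list of problems with the size and link count of an llms.txt."""
    problems = []

    # Measure the size in bytes as it would be served (UTF-8 encoded).
    size = len(text.encode("utf-8"))
    if size >= MAX_BYTES:
        problems.append(f"file is {size} bytes, limit is {MAX_BYTES}")

    links = MARKDOWN_LINK.findall(text)
    if len(links) > MAX_LINKS:
        problems.append(f"{len(links)} links found, at most {MAX_LINKS} recommended")
    if len(set(links)) < len(links):
        problems.append("some URLs are linked more than once")
    return problems
```

An empty result means the file stays within the limits used here; the thresholds are easy to adjust if a stricter or looser budget is preferred.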

Content-Type and encoding

The server must deliver the file correctly. This can easily be tested:

  curl -I https://deinedomain.de/llms.txt

  # Expected output (Content-Type must be text/plain):
  HTTP/2 200
  content-type: text/plain; charset=utf-8

If the Content-Type is wrong, it can usually be corrected on the server side: with an .htaccess rule on Apache, or in the server configuration on Nginx.
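
The header only covers half of this section. The encoding itself has to be checked on the raw bytes of the file. A minimal sketch, with a hypothetical function name, that detects a UTF-8 BOM and bytes that are not valid UTF-8:

```python
import codecs


def check_encoding(raw):
    """Return a list of encoding problems found in the raw bytes of a file."""
    problems = []

    # A UTF-8 BOM is the byte sequence EF BB BF at the very start of the file.
    if raw.startswith(codecs.BOM_UTF8):
        problems.append("file starts with a UTF-8 BOM")

    # Decoding fails if the file was saved in another encoding such as Latin-1.
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        problems.append(f"invalid UTF-8 at byte offset {exc.start}")
    return problems
```

The function expects bytes, for example the result of open("llms.txt", "rb").read(), because opening the file in text mode would already hide the problems it is meant to find.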

Keeping it up to date

An outdated llms.txt is an underestimated risk: AI systems provide information based on the data they have crawled. If your llms.txt still contains old products, expired offers or incorrect contact details, this incorrect information will appear in AI responses.

When to update?

  • New products or services are launched
  • Existing offerings are discontinued or renamed
  • Contact details change
  • The target audience or positioning changes
  • Important new pages are created that should be linked

Specify the date

A ## Last Updated section helps AI systems assess how current the information is. Format: YYYY-MM or YYYY-MM-DD.

Warning: Do not use the date as decoration — if the update date is two years ago but the products have changed since then, that is a warning signal for AI systems.
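
A stale date can be caught automatically. The sketch below assumes the date sits on its own line directly below the "## Last Updated" heading and treats anything older than 365 days as stale; both the layout and the threshold are assumptions, and the function names are hypothetical.

```python
import re
from datetime import date

# Matches "## Last Updated" followed by a YYYY-MM or YYYY-MM-DD line.
LAST_UPDATED = re.compile(
    r"^## Last Updated\s*\n+\s*(\d{4})-(\d{2})(?:-(\d{2}))?\s*$",
    re.MULTILINE,
)


def last_updated(text):
    """Return the date from the Last Updated section, or None if absent or invalid."""
    match = LAST_UPDATED.search(text)
    if match is None:
        return None
    year, month = int(match.group(1)), int(match.group(2))
    day = int(match.group(3) or 1)  # YYYY-MM is read as the first of the month
    try:
        return date(year, month, day)
    except ValueError:
        return None  # e.g. month 13 or day 32


def is_stale(text, today, max_age_days=365):
    """Return True if the date is missing, invalid or older than max_age_days."""
    updated = last_updated(text)
    return updated is None or (today - updated).days > max_age_days
```

Called as is_stale(text, today=date.today()), this only shows that the date is old. Whether the content has actually changed since then still needs a human review.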

llms-full.txt — when it makes sense

llms-full.txt is the optional extension to llms.txt: it contains the complete content of the most important pages in a form optimised for LLMs. The difference lies in purpose:

  • llms.txt: Structured overview — who you are, what you offer, how to reach you
  • llms-full.txt: Complete content — all relevant texts so an LLM can answer without further crawling

When is llms-full.txt worth it?

llms-full.txt makes sense when your website has substantial content that should regularly be used as a source:

  • Documentation pages and knowledge bases
  • Extensive guides and tutorials
  • FAQ collections with many entries
  • Product databases or catalogues

For simple company websites, landing pages or smaller blogs, llms-full.txt is a nice-to-have, not a necessity. The maintenance effort must be worthwhile — an outdated llms-full.txt is worse than none at all.

Keeping size under control

llms-full.txt can be larger than llms.txt: the rule of thumb above is under 100 KB, and even content-heavy sites should stay under 500 KB. What goes in: the most important pages in full text, neatly structured with Markdown headings. What does not go in: navigation, footer, cookie banners, code snippets.

Best practices checklist

These points distinguish a good llms.txt from a very good one:

  • Description is concrete and specific — no generic marketing speak
  • File is accessible at yourdomain.com/llms.txt (HTTP 200)
  • Content-Type is text/plain; charset=utf-8
  • Encoding is UTF-8 without BOM — special characters are displayed correctly
  • File size is under 10 KB
  • No sensitive or internal information included
  • Only the 5–10 most important pages linked — no complete URL list
  • Last update date is current and correct
  • Language matches the main language of the website
  • Content is updated when major website changes occur
  • Technically checked with the llms.txt Validator

Check your llms.txt now

The llmshub.de Validator shows in seconds whether your llms.txt is technically correct and where there is still room for improvement.
