Citegrove · Reddit-first AI citation outreach
llms.txt Generator
llms.txt is the emerging "robots.txt for AI": a markdown manifest that tells ChatGPT, Claude, and Perplexity what your site is and which pages actually matter. Type your domain and we'll generate a starter file from your homepage, bucketing your internal links into the canonical sections. Copy, edit, and publish to /llms.txt.
Frequently asked questions
What is llms.txt?
An emerging, proposed standard (llmstxt.org) for telling AI crawlers what your site is, who it serves, and which pages matter most. It's a markdown file served at /llms.txt: like robots.txt, but for AI discovery rather than crawl permissions.
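To make the format concrete, here is a minimal Python sketch that renders a file in the shape described at llmstxt.org: a title, a blockquote summary, then sections of markdown links. The `Manifest` and `Link` types, the function name, and the example.com data are all hypothetical and used only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Link:
    title: str
    url: str
    note: str = ""  # optional one-line description shown after the link


@dataclass
class Manifest:
    name: str     # rendered as the H1 title
    summary: str  # rendered as the blockquote under the title
    sections: Dict[str, List[Link]] = field(default_factory=dict)


def render_llms_txt(m: Manifest) -> str:
    """Render a manifest as llms.txt-style markdown."""
    lines = [f"# {m.name}", "", f"> {m.summary}", ""]
    for heading, links in m.sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for link in links:
            suffix = f": {link.note}" if link.note else ""
            lines.append(f"- [{link.title}]({link.url}){suffix}")
        lines.append("")
    # Collapse trailing blank lines into a single final newline.
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical site used only for illustration.
example = Manifest(
    name="Example Co",
    summary="Hypothetical developer tools company used for illustration.",
    sections={
        "Documentation": [
            Link("Quickstart", "https://example.com/docs/quickstart",
                 "Install and first run"),
        ],
        "Pricing": [Link("Plans", "https://example.com/pricing")],
    },
)

if __name__ == "__main__":
    print(render_llms_txt(example))
```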
Where do I put the generated file?
Save it to /llms.txt at the root of your domain (e.g. https://yourdomain.com/llms.txt). Then add a Cloudflare/CDN rule (or static asset route) that serves it as text/plain.
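If you are not behind a CDN, the same idea can be sketched with Python's standard library. This is an illustrative sketch, not a production server: the file location `llms.txt`, the helper `build_response`, and the port are assumptions.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from pathlib import Path
from typing import Dict, Optional, Tuple

LLMS_PATH = Path("llms.txt")  # assumed location of the generated file
PLAIN = "text/plain; charset=utf-8"


def build_response(request_path: str,
                   body: Optional[bytes]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a GET request path."""
    path_only = request_path.split("?", 1)[0]  # ignore any query string
    if path_only == "/llms.txt" and body is not None:
        payload, status = body, 200
    else:
        payload, status = b"Not found", 404
    headers = {"Content-Type": PLAIN, "Content-Length": str(len(payload))}
    return status, headers, payload


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = LLMS_PATH.read_bytes() if LLMS_PATH.exists() else None
        status, headers, payload = build_response(self.path, body)
        self.send_response(status)
        for key, value in headers.items():
            self.send_header(key, value)
        self.end_headers()
        self.wfile.write(payload)


def serve(port: int = 8000) -> None:
    """Start a local server; call this manually to try it out."""
    HTTPServer(("127.0.0.1", port), Handler).serve_forever()
```

Calling `serve()` and requesting `http://127.0.0.1:8000/llms.txt` should return the file with a `text/plain` content type.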
How do you decide which pages go in?
We fetch your homepage via Jina Reader, extract every internal link, dedupe by path, and bucket into sections (Documentation, Guides, API reference, Blog, Pricing, etc.) using filename patterns. We cap each section so the file doesn't become a sitemap dump.
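The extract, dedupe, bucket, and cap steps can be sketched roughly as follows. This is a minimal illustration, not the generator's actual code: it assumes the fetched homepage is already available as markdown text, and the path patterns, the cap of 5, and the "Other" fallback section are all hypothetical.

```python
import re
from typing import Dict, List, Tuple
from urllib.parse import urljoin, urlparse

# Hypothetical path patterns; the real rules may differ.
SECTION_PATTERNS = [
    ("Documentation", re.compile(r"^/docs?(/|$)")),
    ("Guides", re.compile(r"^/(guides?|tutorials?)(/|$)")),
    ("API reference", re.compile(r"^/(api|reference)(/|$)")),
    ("Blog", re.compile(r"^/blog(/|$)")),
    ("Pricing", re.compile(r"^/pricing(/|$)")),
]
MAX_PER_SECTION = 5  # assumed cap per section

# Markdown links like [title](href); the lookbehind skips images.
MD_LINK = re.compile(r"(?<!!)\[([^\]]+)\]\(([^)\s]+)\)")


def bucket_links(markdown: str, base_url: str,
                 cap: int = MAX_PER_SECTION) -> Dict[str, List[Tuple[str, str]]]:
    """Group internal links by section, deduped by path and capped."""
    host = urlparse(base_url).netloc
    seen_paths = set()
    sections: Dict[str, List[Tuple[str, str]]] = {}
    for title, href in MD_LINK.findall(markdown):
        parsed = urlparse(urljoin(base_url, href))
        # Keep only http(s) links that point at the same host.
        if parsed.scheme not in ("http", "https") or parsed.netloc != host:
            continue
        # Dedupe by path, treating /docs and /docs/ as the same page.
        path = parsed.path.rstrip("/") or "/"
        if path in seen_paths:
            continue
        seen_paths.add(path)
        section = next(
            (name for name, pat in SECTION_PATTERNS if pat.search(path)),
            "Other",
        )
        bucket = sections.setdefault(section, [])
        if len(bucket) < cap:  # drop overflow so no section becomes a dump
            bucket.append((title, f"{parsed.scheme}://{host}{path}"))
    return sections
```

Deduping before capping means a repeated navigation link does not use up a slot in its section.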
Will this be perfect?
No. Treat it as a roughly 70% starter: your homepage rarely links to every important page (e.g., specific blog posts, deep API endpoints), so edit the output before publishing. We surface the raw section list so you can see exactly what we picked.
Is llms.txt actually used by AI engines yet?
Adoption is still early. Anthropic, Cloudflare, and OpenAI have signalled support; in practice few engines parse it actively today. But the cost of publishing it is near-zero and it future-proofs your AI discoverability.
How is this different from sitemap.xml?
sitemap.xml lists every URL for traditional search crawlers. llms.txt is a curated, human-readable manifest that explains what your site is, written for an LLM to summarize rather than for a crawler to index.
Do you store the generated file or my domain?
No. We log a hashed IP for rate limiting (10 generations per IP per 24 hours). Domains and outputs aren't persisted.
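For readers curious how a limit like this can work without storing raw IPs, here is a minimal in-memory sketch. It is an assumption-laden illustration, not the service's implementation: the salted SHA-256 hash, the sliding window, and the `RateLimiter` class are all hypothetical choices.

```python
import hashlib
import time
from collections import defaultdict, deque
from typing import Optional

WINDOW_SECONDS = 24 * 60 * 60     # 24 hours
MAX_GENERATIONS = 10              # generations allowed per window
SALT = b"replace-with-a-secret"   # hypothetical salt; keep it private


class RateLimiter:
    """Sliding-window limiter keyed by a salted hash of the IP."""

    def __init__(self, limit: int = MAX_GENERATIONS,
                 window: int = WINDOW_SECONDS) -> None:
        self.limit = limit
        self.window = window
        self._hits = defaultdict(deque)  # hashed IP -> request timestamps

    @staticmethod
    def hash_ip(ip: str) -> str:
        # Only this digest is kept; the raw IP is never stored.
        return hashlib.sha256(SALT + ip.encode("utf-8")).hexdigest()

    def allow(self, ip: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        hits = self._hits[self.hash_ip(ip)]
        # Discard timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

A real deployment would likely keep these counters in a shared store with expiry rather than in process memory.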