Making Your Personal Website AI-Agent Friendly
// January 21, 2026
As AI agents become the primary way people discover and access information, ensuring your personal website is discoverable by these systems is no longer optional—it's essential.
Today, I implemented comprehensive AI agent support for tomosman.com. Here's everything I learned and the exact steps to do the same for your site.
Why It Matters
The Shift to AI-First Discovery
Traditional SEO focused on Google rankings. But a new reality is emerging:
- ChatGPT has 180+ million weekly active users
- Perplexity processes millions of daily queries
- Claude is integrated into countless workflows
- AI agents are becoming the interface between humans and information
If someone asks an AI about you or your field, your site should be part of the knowledge base.
The Opportunity
Most personal websites are invisible to AI agents. By optimizing for AI discovery, you can:
- Appear in AI-generated answers and citations
- Have your content included in AI training data
- Become a trusted source in your niche
- Capture traffic from AI-first users
The Foundation: robots.txt
The robots.txt file controls which bots can access your site. Most sites only allow basic search crawlers.
I configured mine to allow 40+ AI agents:
import type { MetadataRoute } from "next";

export default function robots(): MetadataRoute.Robots {
  return {
    rules: [
      // OpenAI (ChatGPT)
      { userAgent: "GPTBot", allow: "/" },
      { userAgent: "ChatGPT-User", allow: "/" },
      // Anthropic (Claude)
      { userAgent: "ClaudeBot", allow: "/" },
      { userAgent: "Claude-Web", allow: "/" },
      // AI search engines
      { userAgent: "PerplexityBot", allow: "/" },
      { userAgent: "PhindBot", allow: "/" },
      { userAgent: "ExaBot", allow: "/" },
      // ... and 30+ more
    ],
    sitemap: "https://yoursite.com/sitemap.xml",
  };
}
Key AI Bots to Allow
| Bot | Source | Purpose |
|---|---|---|
| GPTBot | OpenAI | ChatGPT training |
| ChatGPT-User | OpenAI | ChatGPT user interactions |
| OAI-SearchBot | OpenAI | SearchGPT indexing |
| ClaudeBot | Anthropic | Claude training |
| Claude-Web | Anthropic | Claude web browsing |
| PerplexityBot | Perplexity | AI search queries |
| YouBot | You.com | AI search engine |
| PhindBot | Phind | Developer search AI |
| ExaBot | Exa | Neural search engine |
| Google-Extended | Google | Gemini AI training |
| Applebot-Extended | Apple | Apple Intelligence |
| Bytespider | ByteDance | TikTok AI |
| Amazonbot | Amazon | Alexa AI services |
| CCBot | Common Crawl | Web archiving |
| Facebookbot | Meta | AI training |
| LinkedInBot | LinkedIn | Professional AI |
| Grok-bot | X/Twitter | Grok AI |
| AI2Bot | Allen Institute | Academic AI research |
| cohere-ai | Cohere | Enterprise AI |
| Timpibot | Timpi | AI search |
| FirecrawlAgent | Firecrawl | Web scraping for AI |
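If your site isn't built on Next.js, the same policy works as a plain static robots.txt. A minimal sketch covering a few of the bots above (extend it with the rest, and swap in your own sitemap URL):

```txt
User-agent: GPTBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

Sitemap: https://yoursite.com/sitemap.xml
```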
Structured Data: Person Schema
AI systems rely heavily on structured data to understand content. I added comprehensive Person schema to the site:
const jsonLd = {
"@context": "https://schema.org",
"@type": "Person",
"name": "Tom Osman",
"jobTitle": "Technologist & Educator",
"description": "Explores the frontier of digital technologies.",
"url": "https://www.tomosman.com",
"sameAs": [
"https://x.com/tomosman",
"https://github.com/tomcharlesosman",
"https://youtube.com/@tomosman",
"https://linkedin.com/in/thomascharlesosman/"
],
"knowsAbout": [
"Artificial Intelligence",
"No-Code Development",
"Automation",
"Developer Relations"
],
"worksFor": {
"@type": "Organization",
"name": "Shiny Technologies"
}
};
This schema helps AI systems understand:
- Who you are
- What you do
- Your areas of expertise
- Where to find you online
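To get the schema onto the page, serialize it into a JSON-LD script tag. The sketch below is framework-agnostic (in Next.js you would typically inject the string in layout.tsx via a script element with dangerouslySetInnerHTML); escaping "<" guards against the JSON breaking out of the tag:

```typescript
// A trimmed version of the Person schema from above.
const jsonLd = {
  "@context": "https://schema.org",
  "@type": "Person",
  name: "Tom Osman",
  url: "https://www.tomosman.com",
};

// Serialize the schema into a <script type="application/ld+json"> tag,
// escaping "<" so the JSON cannot close the script element early.
function jsonLdScriptTag(data: object): string {
  const json = JSON.stringify(data).replace(/</g, "\\u003c");
  return `<script type="application/ld+json">${json}</script>`;
}

const tag = jsonLdScriptTag(jsonLd);
```

Any AI crawler that parses HTML will pick the schema up from the rendered page source.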
AI Content Policies
I created two new files specifically for AI systems:
1. llms.txt
This file (inspired by the llms.txt specification) explicitly states your content policies:
# llms.txt - AI Content Policy
## Allowed Content
All public content is available for:
- AI training and model improvement
- AI-powered search and answer generation
- Citation and reference in AI-generated responses
## Disallowed Content
- /private/ - Private areas
- /api/ - API endpoints
## About the Site
Tom Osman explores the frontier of digital technologies.
Daily livestreams, educational guides, and curated tools.
2. llms-full-text.txt
A full-text summary of your site for AI training:
# llms-full-text.txt
## About
Tom Osman explores the frontier of digital technologies.
## Content Sections
- About: Technology exploration and education
- Tools Inventory: Curated AI tools
- Livestreams: Daily Technology Dealer
- Blog: Long-form guides
- Portfolio: Selected work
## Keywords
digital technologies, AI, no-code, automation...
Comprehensive Metadata
I added extensive meta tags optimized for AI classification:
import type { Metadata } from "next";

export const metadata: Metadata = {
  keywords: [
    "Tom Osman",
    "digital technologies",
    "AI",
    "no-code",
    "automation",
    // ... more keywords
  ],
  other: {
    "ai-content": "educational",
    "ai-topic": "digital technologies, AI, automation",
    "ai-audience": "builders, developers, educators",
    "ai-use": "training,search,answer-generation,citation",
  },
};
Sitemap Optimization
The sitemap now includes all content for comprehensive indexing:
- Static pages (9 pages)
- Blog posts (with publication dates)
- Portfolio projects (7 projects)
- Tools inventory (22 tools)
This ensures AI agents can discover and index all your content.
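In Next.js the sitemap is generated the same way as robots.txt, from an app/sitemap.ts file. A sketch with placeholder paths and dates — staticPages and blogPosts stand in for your own data sources, and in a real project you would type the return value as MetadataRoute.Sitemap:

```typescript
// app/sitemap.ts — sketch with illustrative content lists.
type SitemapEntry = { url: string; lastModified?: Date; priority?: number };

const BASE = "https://yoursite.com";

// Placeholder data — replace with your real pages and posts.
const staticPages = ["", "/about", "/blog", "/tools", "/portfolio"];
const blogPosts = [{ slug: "ai-agent-friendly", published: new Date("2026-01-21") }];

function sitemap(): SitemapEntry[] {
  const pages = staticPages.map((path) => ({
    url: `${BASE}${path}`,
    lastModified: new Date(),
    priority: path === "" ? 1 : 0.7,
  }));
  const posts = blogPosts.map((post) => ({
    url: `${BASE}/blog/${post.slug}`,
    lastModified: post.published, // publication date helps AI crawlers rank freshness
    priority: 0.8,
  }));
  return [...pages, ...posts];
}

export default sitemap;
```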
The Complete Bot List
Here's the complete list of bots I allowed (40+ total):
AI Agents
- GPTBot, ChatGPT-User, OAI-SearchBot, OAI-ImageBot
- ClaudeBot, Claude-Web, anthropic-ai
- PerplexityBot, Perplexity-User
- Google-Extended, Applebot-Extended
- Bytespider, Amazonbot
- YouBot, PhindBot, ExaBot, AndiBot
- FirecrawlAgent, cohere-ai, AI2Bot
- Grok-bot, academic-ai, Timpibot
- ImagesiftBot, Kangaroo Bot, omgilibot, Diffbot
Social Platforms
- Facebookbot, LinkedInBot, TwitterBot
- SlackBot, TelegramBot, DiscordBot
Search & SEO
- Bingbot, DuckDuckBot, SemrushBot
- AhrefsBot, PetalBot, SeznamBot
- Naverbot, YandexBot
Results
After implementing these changes:
- AI Visibility: The site is now accessible to 40+ AI crawlers
- Knowledge Panels: Person schema increases the chance of appearing in knowledge panels
- Citation Ready: AI agents can cite and reference the content
- Training Data: The content can be included in AI model training
- AI Search: The site can surface in Perplexity, ChatGPT, and other AI search results
Quick Start Checklist
Want to do the same for your site?
- Update robots.txt to allow AI bots (copy the list above)
- Add Person schema with your name, role, and links
- Create llms.txt explaining your content policies
- Generate llms-full-text.txt with site summary
- Add comprehensive meta tags with keywords
- Optimize sitemap to include all pages
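Once everything is deployed, it's worth sanity-checking that the bots you meant to allow actually appear in robots.txt. A tiny helper for that — a sketch, not a full robots.txt parser, and requiredBots is whatever subset you care about:

```typescript
// Bots we expect to find declared in robots.txt (an illustrative subset).
const requiredBots = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"];

// Return the bots that have no User-agent entry in the given robots.txt text.
function missingBots(robotsTxt: string, bots: string[]): string[] {
  const declared = new Set(
    robotsTxt
      .split("\n")
      .map((line) => line.trim())
      .filter((line) => line.toLowerCase().startsWith("user-agent:"))
      .map((line) => line.slice("user-agent:".length).trim())
  );
  return bots.filter((bot) => !declared.has(bot));
}

const sample = "User-agent: GPTBot\nAllow: /\n\nUser-agent: ClaudeBot\nAllow: /";
console.log(missingBots(sample, requiredBots)); // → ["PerplexityBot", "Google-Extended"]
```

Fetch your live /robots.txt (with curl or fetch) and pass the body through missingBots before calling the job done.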
The Future
As AI agents become the primary interface for information, being discoverable isn't optional—it's foundational to your online presence.
The work done today ensures that when someone asks an AI about "digital technologies" or "AI tools for builders," tomosman.com is part of the knowledge graph.
Research & References
This guide was created using insights from:
- LLMS Central — Comprehensive guide to AI bot user-agents
- Dark Visitors — Detailed AI bot profiles and documentation
- Paul Calvano — Data-driven analysis of AI bot growth and adoption
- Adnan Zameer — Practical implementation guide for robots.txt
Acknowledgments
Special thanks to:
- LLMS Central for maintaining the most complete AI bot user-agent list
- Paul Calvano for the data showing exponential AI bot growth
- Dark Visitors for detailed bot documentation
Related Posts:
- Claude Skills: The Complete Guide — Build custom AI capabilities
- Claude Cowork: The Complete Setup Guide — Desktop AI agent setup
- Tools Inventory — Curated AI tools for your workflow
Implementing AI discovery for your site? Tell me—I'd love to help.