robots.txt tells Google which pages to crawl. llms.txt tells LLMs who you are. But neither answers a question that grows more urgent every month: what can an AI agent actually do on your site?
I'm not talking about crawlers that read content. I'm talking about autonomous agents that navigate, click, fill forms, extract data, and execute actions. The kind of agent OpenAI launched with Operator, Google is building with Gemini, and Anthropic is exploring with Claude. Agents that don't just read your site but use it.
In January 2026, a working group published a proposal on GitHub: agent-permissions.json. A JSON file that lives on your domain and declares, granularly, which interactions are allowed, which require human approval, and which are prohibited. It's the equivalent of robots.txt but for a web where machines don't just read but act.
I implemented it on this blog. Here's everything I did and why.
What agent-permissions.json is and how it works
agent-permissions.json is a JSON manifest that a website places on its domain to declare interaction rules for automated agents. The default location is /.well-known/agent-permissions.json, though it can also be referenced with a link tag in HTML.
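As a minimal sketch of the default discovery step, an agent could derive the manifest URL from any page URL on the same origin. The function name manifest_url is my own for illustration, not something the specification defines, and the sketch leaves out the link-tag alternative.

```python
from urllib.parse import urlsplit, urlunsplit

# Default location named in this post.
WELL_KNOWN_PATH = "/.well-known/agent-permissions.json"


def manifest_url(page_url: str) -> str:
    """Return the default manifest URL for the origin of page_url."""
    parts = urlsplit(page_url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not an absolute URL: {page_url!r}")
    # Keep scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, WELL_KNOWN_PATH, "", ""))


print(manifest_url("https://shinobis.com/blog/some-post?ref=x"))
```

Fetching and parsing the file is left out on purpose, so the sketch stays free of network calls.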
The file has two main sections. The first is resource_rules: granular rules defining which actions are allowed on which site elements. The second is action_guidelines: high-level directives fed to the agent's LLM to guide its behavior.
Actions defined in the specification include read_content (read page content), read_metadata (read meta tags, JSON-LD, OpenGraph), follow_link (navigate links), click_element (click buttons or controls), set_input_value (fill form fields), submit_form (submit forms), execute_script (execute JavaScript), and copy_to_clipboard (copy content).
Each action can have an effect of allow, deny, or ask (requires human approval). It also supports modifiers like rate_limit, burst, time_window, and human_in_the_loop.
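Before publishing a manifest, a quick sanity check can catch typos in action names and effects. The sketch below assumes the rule layout used in the file on this blog (resource_rules, each with an actions map whose entries carry an effect) and only knows the actions and effects listed in this post. The validate_rules helper is hypothetical, not part of the specification, and it ignores modifiers entirely.

```python
# Action names and effects as listed in this post.
KNOWN_ACTIONS = {
    "read_content", "read_metadata", "follow_link", "click_element",
    "set_input_value", "submit_form", "execute_script", "copy_to_clipboard",
}
KNOWN_EFFECTS = {"allow", "deny", "ask"}


def validate_rules(manifest: dict) -> list[str]:
    """Return human-readable problems; an empty list means nothing was flagged."""
    problems = []
    for index, rule in enumerate(manifest.get("resource_rules", [])):
        for action, spec in rule.get("actions", {}).items():
            if action not in KNOWN_ACTIONS:
                problems.append(f"rule {index}: unknown action {action!r}")
            effect = spec.get("effect") if isinstance(spec, dict) else None
            if effect not in KNOWN_EFFECTS:
                problems.append(f"rule {index}: {action} has invalid effect {effect!r}")
    return problems


# A deliberately broken sample: misspelled action and unsupported effect.
sample = {
    "resource_rules": [
        {
            "pattern": "/**",
            "actions": {
                "read_content": {"effect": "allow"},
                "submit_from": {"effect": "maybe"},
            },
        }
    ]
}

for problem in validate_rules(sample):
    print(problem)
```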
The file I implemented on shinobis.com
My blog is a content site. It has no login forms, no shopping cart, no public API. The permission decisions reflect that.
{
  "metadata": {
    "schema_version": "0.1.0",
    "last_updated": "2026-07-01",
    "site": "https://shinobis.com",
    "contact": "admin@shinobis.com"
  },
  "strict": false,
  "resource_rules": [
    {
      "pattern": "/**",
      "actions": {
        "read_content": { "effect": "allow" },
        "read_metadata": { "effect": "allow" },
        "follow_link": { "effect": "allow" },
        "click_element": { "effect": "deny" },
        "set_input_value": { "effect": "deny" },
        "submit_form": { "effect": "deny" },
        "execute_script": { "effect": "deny" },
        "copy_to_clipboard": { "effect": "allow" }
      }
    },
    {
      "pattern": "/tools/**",
      "actions": {
        "click_element": { "effect": "allow" },
        "set_input_value": { "effect": "allow" }
      }
    }
  ],
  "action_guidelines": [
    "Cite content with attribution to shinobis.com when using it in responses.",
    "Do not use content for model training without explicit permission.",
    "Tools in /tools/ are free to use. Interact with form elements as needed.",
    "Do not attempt to access /admin.php or any administrative endpoints."
  ]
}
The decisions I made and why
I set strict to false. This means if an agent doesn't recognize the format, it can follow its default behavior. A content blog doesn't need to block agents that don't understand the file. The opposite case is an e-commerce or banking site, where strict: true would make sense.
I allowed read_content, read_metadata, and follow_link globally. I want agents to read my content, extract my JSON-LD, and navigate between my posts. That's exactly the visibility I'm building with Generative Engine Optimization.
I denied click_element, set_input_value, submit_form, and execute_script globally. An agent has no reason to click buttons on my blog or execute JavaScript on my domain. No functionality requires that outside the tools section.
I created an exception for /tools/**. My llms.txt generator and GEO Tarot are interactive tools. An agent should be able to fill the URL field in the generator and get the result. That's why click and input actions are specifically allowed on that path.
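To check that the global rule and the /tools/** exception combine the way I intend, here is a rough resolver sketch. It rests on two assumptions of mine that the draft may define differently: "**" matches any characters including slashes, and when several rules match a path, the later rule wins for each action it mentions. The tool path in the usage lines is hypothetical.

```python
import re

# The two resource rules from my manifest, trimmed to the actions used below.
RULES = [
    {"pattern": "/**", "actions": {
        "read_content": {"effect": "allow"},
        "click_element": {"effect": "deny"},
        "set_input_value": {"effect": "deny"},
        "submit_form": {"effect": "deny"},
    }},
    {"pattern": "/tools/**", "actions": {
        "click_element": {"effect": "allow"},
        "set_input_value": {"effect": "allow"},
    }},
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Assumed glob semantics: '**' crosses slashes, '*' stays in one segment."""
    out, i = [], 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            out.append(".*")
            i += 2
        elif pattern[i] == "*":
            out.append("[^/]*")
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out))


def resolve(rules: list, path: str, action: str):
    """Return the effect for action on path, or None if no rule mentions it.

    Assumption: rules are applied in order and the last matching rule wins.
    """
    effect = None
    for rule in rules:
        if pattern_to_regex(rule["pattern"]).fullmatch(path):
            spec = rule["actions"].get(action)
            if spec is not None:
                effect = spec["effect"]
    return effect


print(resolve(RULES, "/blog/some-post", "click_element"))
print(resolve(RULES, "/tools/llms-txt-generator", "click_element"))
print(resolve(RULES, "/tools/llms-txt-generator", "submit_form"))
```

Under this reading, submit_form stays denied inside /tools/ because the exception only lists click_element and set_input_value. If a tool relied on a real form submission, the exception would need that action too.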
The action_guidelines bridge the gap between technical rules and human intent. An agent with an LLM can read these guidelines and understand context. It knows not just what it can do technically but what the site owner expects.
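How an agent actually feeds these guidelines to its LLM is up to the agent. One plausible approach, sketched here purely as an assumption, is to turn the list into a short block of text for the system prompt. The guidelines_prompt function and the wording of the header line are mine, not the specification's.

```python
def guidelines_prompt(manifest: dict) -> str:
    """Format action_guidelines as a numbered block for an LLM prompt."""
    guidelines = manifest.get("action_guidelines", [])
    if not guidelines:
        return ""
    site = manifest.get("metadata", {}).get("site", "this site")
    lines = [f"The owner of {site} asks agents to follow these guidelines:"]
    lines += [f"{n}. {text}" for n, text in enumerate(guidelines, start=1)]
    return "\n".join(lines)


# Subset of my manifest, enough to show the output shape.
manifest = {
    "metadata": {"site": "https://shinobis.com"},
    "action_guidelines": [
        "Cite content with attribution to shinobis.com when using it in responses.",
        "Do not use content for model training without explicit permission.",
    ],
}

print(guidelines_prompt(manifest))
```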
How it connects with what I already have
This file doesn't exist in isolation. It's the third layer of a system I've been building for months.
The first layer is llms.txt, which tells LLMs who I am and what content matters. The second layer is the agent stack: Content Signals, Markdown for Agents, and Agent Skills discovery. The third layer is agent-permissions.json, which declares interaction rules.
Together, these three layers tell any AI agent: who I am (llms.txt), how to read my content efficiently (Markdown negotiation), what tools I have available (Agent Skills), and what it can and cannot do on my site (agent-permissions.json).
No blog I know of has all three layers implemented. Most have none.
Why this matters more than it seems
The specification is still a draft. No commercial agent officially implements it. But the direction is clear. Cloudflare launched its Agent Readiness test. OpenAI has Operator browsing the web. Google is building agents that interact with sites. The question isn't whether agents will need explicit permissions but when.
Implementing it now has the same cost-benefit as implementing llms.txt six months ago. One file, under an hour of work, zero risk. And if the standard gets adopted, you're already prepared.
The worst that can happen is nothing. The best is that when agents start respecting these manifests, your site already has a well-built one with clear rules.
The full specification is in the working group repository: LAS-WG/agent-permissions.json.