Time to read: 14 minutes Time to apply: 20 minutes to write protocols, immediate payoff Prerequisites: Patterns 1, 2, 3 (Boot, Skills, Memory)
They either let the agent do anything — and get deployment scares, overwritten files, and broken builds. Or they micromanage every decision — and get approval fatigue, slow sessions, and an agent that's not an agent, just a slow CLI.
The middle ground is decision protocols: rules the agent follows about when to ask, when to proceed, and when to escalate.
Most people never write these down. They just react: "No, don't do that!" after the fact. This is exhausting and doesn't scale — the agent keeps making the same bad decisions because it has no written rules.
What the agent can do immediately, no questions asked:
pytest, npm test, etc.)Your green zone will be different. The key: these are actions where being wrong doesn't cause damage. Worst case: a file ends up in the wrong place, a test fails, a commit needs reverting. All recoverable.
What the agent can do but should tell you about:
patch (targeted edits)pip install, npm install)The agent proceeds — you don't have to approve every one — but you get a summary so you can spot-check. This is where the 10x speedup comes from: the agent does 90% of the work without blocking on you.
What the agent must never do without explicit approval:
These are the boundaries where a mistake isn't just a bad commit — it's a production incident, a compliance violation, or a customer-facing error.
Save this to memory so the agent reads it every session:
Autonomy protocol:
GREEN (proceed without asking):
- Writing new files, running tests, linting, type checking
- Reading/searching files, exploring the codebase
- Committing to feature branches (never main)
- Non-destructive shell commands
AMBER (proceed but tell me):
- Modifying existing files with targeted edits
- Installing packages or changing dependencies
- Creating pull requests
- Deploying to staging
RED (must ask first):
- Deploying to production
- Destructive database operations
- External communications (email, messages, social)
- Modifying payments, billing, or auth
- Accessing production secrets or APIs
- Anything with regulatory/compliance impact
Special: When I say "Proceed" or "Don't stop until finished" —
elevate all AMBER + GREEN to full autonomy. Work through
everything without pausing. Only stop when done or I interrupt.
This is the single highest-leverage protocol I use. It's a mode switch — one word flips the agent from "ask about everything" to "execute everything."
The memory entry:
When user says "Proceed" or "Don't stop until finished," work through all
remaining items without pausing for approval. Commit after each, run tests,
move to next. Only stop when done or user interrupts.
When I say "Proceed," the agent handles an entire multi-step build autonomously: create files → run tests → fix failures → commit → next task → repeat. I come back 15 minutes later to a completed feature with passing tests.
Start restrictive. Loosen as confidence grows.
Week 1: Everything in Amber zone requires approval. You're learning the agent's judgment.
Week 2: Move frequent, consistently-correct actions to Green. File reads, test runs, simple file writes.
Week 3: Move reliable targeted edits to Amber. The agent's patch usage has been clean for 2 weeks.
Week 4: Introduce "Proceed" for small, defined task lists. 3-5 items, all recoverable.
Month 2: Proceed for any task list where all items are in Green/Amber zones.
The key: trust is earned, not granted. Every time the agent makes a good autonomous decision, your confidence increases. Every mistake resets the calibration — not all the way back, but you tighten the relevant zone.
Me: "Add rate limiting to the LLM proxy."
Agent: *writes code* — good.
Agent: *runs tests* — good.
Agent: "Should I commit this?"
Me: "Yes."
Agent: *commits* — good.
Agent: "Should I deploy to production?"
Me: "NO. Never deploy without asking."
Agent: "Sorry. Should I create a PR?"
Me: "Yes."
Agent: *creates PR* — good.
Agent: "Should I merge?"
Me: "No, that needs review."
Agent: "Should I send an email to the team about the change?"
Me: "NO. Never send emails without asking."
8 interruptions. 4 "NO" corrections. Exhausting.
Me: "Add rate limiting to the LLM proxy. Proceed."
Agent: *writes code, runs tests, commits to feature branch, opens PR* — complete.
Me: *comes back 15 minutes later, reviews PR, merges*
Zero interruptions. Zero corrections. Because the agent knew: commits go to feature branches, PRs are created automatically, never deploy, never email.
Time investment: 20 minutes. Return: hundreds of approval interruptions eliminated per month.
Pattern 5: Tool Composition — Your agent has a toolbox: terminal commands, file operations, web searches, sub-agents. Using the wrong tool turns a 30-second task into a 15-minute ordeal. We'll go through the decision matrix for every tool type.
Pattern 4 of 10. From the Works With Agents methodology.
Suggest an improvement, report an error, or just say hi.