
Quick Tips — Just Know This and You Can Command AI
Handing code to agents is only the beginning. To delegate builds, deployments, and monitoring as well, the entire system must be readable by agents. You don’t need to understand Docker internals. The agent handles everything.
To the agent: “Add a /health endpoint to the server. Return DB connection status, error rate, and uptime as JSON.”
This one phrase gives the agent eyes to read system state. With /health, the agent can mechanically verify “is the server alive?” Without it, the agent is a surgeon operating blind.
To the agent: “Configure this project as docker-compose.yml. Include app server and DB. Everything should come up with docker compose up.”
You don’t need to know what Docker is. Just knowing it’s a tool that puts apps in a box so they run identically anywhere is enough. The agent handles everything from installation to configuration.
To the agent: “Set up automatic rollback on deploy failure. If /health fails, revert to previous version.”
Agents will inevitably make mistakes. Mistakes must be reversible. This one phrase is the safety net.
Three phrases. Give the server eyes, declare the system, lay a safety net. The agent does the rest.
Hands-on: Try It
Open any project (or Class 1 app) with Claude Code:
“Add a /health endpoint to the server. Return DB connection status, error rate, and uptime as JSON. All existing Hurl tests must pass.”
After the agent adds code:
“Create a Hurl test that verifies /health returns 200. Also check the JSON response has db, status, and uptime fields.”
This is the start of Observability. The agent can now mechanically read system state.
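For reference, the kind of code the agent might produce for the first prompt can be sketched in Go as below. This is a minimal sketch, not the course's actual implementation: the `Monitor` type, the `pingDB` hook, and the `error_rate` field name are assumptions made for illustration, and the error rate here is a lifetime ratio rather than a time window.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
	"time"
)

// Health is the JSON body served at /health. The field set follows the
// prompt above (status, db, uptime, error rate); the names are assumptions.
type Health struct {
	Status    string  `json:"status"`
	DB        string  `json:"db"`
	Uptime    string  `json:"uptime"`
	ErrorRate float64 `json:"error_rate"`
}

// Monitor collects the numbers that /health reports.
type Monitor struct {
	start    time.Time
	requests atomic.Int64
	failures atomic.Int64
	pingDB   func() error // hypothetical hook; (*sql.DB).Ping would fit here
}

func NewMonitor(pingDB func() error) *Monitor {
	return &Monitor{start: time.Now(), pingDB: pingDB}
}

// Record counts one handled request and whether it failed.
func (m *Monitor) Record(failed bool) {
	m.requests.Add(1)
	if failed {
		m.failures.Add(1)
	}
}

// Snapshot builds the current health report.
func (m *Monitor) Snapshot() Health {
	h := Health{
		Status: "ok",
		DB:     "connected",
		Uptime: time.Since(m.start).Round(time.Second).String(),
	}
	if err := m.pingDB(); err != nil {
		h.Status, h.DB = "degraded", "disconnected"
	}
	if n := m.requests.Load(); n > 0 {
		h.ErrorRate = float64(m.failures.Load()) / float64(n)
	}
	return h
}

// ServeHTTP answers 200 when healthy and 503 otherwise, so that
// `curl -f` and Hurl status assertions can detect a failure.
func (m *Monitor) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	h := m.Snapshot()
	w.Header().Set("Content-Type", "application/json")
	if h.Status != "ok" {
		w.WriteHeader(http.StatusServiceUnavailable)
	}
	json.NewEncoder(w).Encode(h)
}

func main() {
	// Demo without opening a port: call the handler in-process.
	// In a real server: http.Handle("/health", m) and http.ListenAndServe.
	m := NewMonitor(func() error { return nil }) // stand-in for a real DB ping
	m.Record(false)
	rec := httptest.NewRecorder()
	m.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/health", nil))
	fmt.Println(rec.Code, rec.Body.String())
}
```

Returning 503 when the DB is unreachable is a design choice in this sketch: it lets a status-code check alone detect trouble, before anyone parses the JSON.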
Why You Need to Command This Way
Introduction: Beyond the Codebase
In Class 8 we made code agent-readable and writable. We split files with filefunc, secured tests with tsma, and tracked change history with whyso.
But is agent-operable code enough?
After modifying code, you need to build. After building, deploy. After deploying, monitor. If something fails, roll back. If any of these steps requires manual human action, the agent’s autonomous scope ends at “code editing.”
Think of a vibe coder’s reality. “Add feature” produces code. Then what? Run build commands in terminal, enter AWS console to deploy, skim logs by eye, and if problems arise, say “revert” again.
This entire manual process should be connected as one pipeline. The agent edits code, runs tests, builds, deploys, and monitors; humans just press the approval button.
This is an Agent Operable System.
From Codebase to System
| Class 8: Agent Operable Codebase | Class 9: Agent Operable System |
|---|---|
| Can read code | Can read system state |
| Can modify code | Can change system config |
| Verified by tests | Verified by monitoring |
| Persists at file level | Infrastructure state persists |
The codebase is one part of the system. The system is the total of code + infrastructure + deployment pipeline + monitoring + operational procedures.
Feature request
→ SSOT editing (yongol)
→ Code generation (yongol generate)
→ Test pass (Hurl + go test)
→ Build (Docker)
→ Deploy (CI/CD)
→ Monitor (health check + logs)
→ Complete
If any link in this chain is opaque to the agent, everything after it becomes the human’s responsibility. One broken link topples the entire automation.
Four Conditions for an Agent Operable System
For a system to be operable by agents, four conditions must be met.
Condition 1. Observability — All State Mechanically Observable
Agents have no eyes. They can’t see screens. They can’t read dashboards. For an agent to know system state, that state must be emitted as text.
# Human observation
Log into AWS console → CloudWatch dashboard → Check graphs by eye
→ "Oh, CPU is high" → Judgment

# Agent observation
$ curl -s localhost:8080/health | jq .
{
  "status": "ok",
  "db": "connected",
  "uptime": "3h42m",
  "error_rate_5m": 0.02
}
→ error_rate_5m > 0.05? → Alert
Observability’s core: not what humans see, but what machines can parse.
A system without observability is a surgeon operating blind.
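The agent-side half of this check can be sketched in Go: parse the /health body and turn it into a list of problems. The field names and the 0.05 threshold are taken from the example above; the `HealthReport` type and the `Evaluate` function are hypothetical names introduced only for this sketch.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// HealthReport mirrors the example /health response shown above.
type HealthReport struct {
	Status      string  `json:"status"`
	DB          string  `json:"db"`
	Uptime      string  `json:"uptime"`
	ErrorRate5m float64 `json:"error_rate_5m"`
}

// maxErrorRate is the alert threshold used in the example above.
const maxErrorRate = 0.05

// Evaluate parses a /health body and returns every problem it finds.
// An empty result means the system looks healthy. A body that is not
// valid JSON is reported as an error, since the state is then unknown.
func Evaluate(body []byte) ([]string, error) {
	var r HealthReport
	if err := json.Unmarshal(body, &r); err != nil {
		return nil, fmt.Errorf("unparseable health response: %w", err)
	}
	var problems []string
	if r.Status != "ok" {
		problems = append(problems, fmt.Sprintf("status is %q", r.Status))
	}
	if r.DB != "connected" {
		problems = append(problems, fmt.Sprintf("db is %q", r.DB))
	}
	if r.ErrorRate5m > maxErrorRate {
		problems = append(problems,
			fmt.Sprintf("error rate %.2f exceeds %.2f", r.ErrorRate5m, maxErrorRate))
	}
	return problems, nil
}

func main() {
	body := []byte(`{"status":"ok","db":"connected","uptime":"3h42m","error_rate_5m":0.02}`)
	problems, err := Evaluate(body)
	fmt.Println(problems, err)
}
```

Note that a missing field counts as a problem here (an absent `status` is not `"ok"`), which errs on the side of alerting.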
Condition 2. Declarative — All Actions Defined Declaratively
When you tell an agent “deploy,” what does it do?
Without a declarative system: The agent guesses “usually you do it this way.” It SSHes into the server, runs git pull, restarts the process… and misses something.
With a declarative system: Everything is written in files.
# docker-compose.yml: what services run
services:
  app:
    build: .
    ports: ["8080:8080"]
    environment:
      DATABASE_URL: ${DATABASE_URL}

# Makefile: what commands do what
deploy:
	docker compose up -d
	curl -sf localhost:8080/health || (docker compose logs && exit 1)
In a declarative system, what the agent does is clear:
- Read files (docker-compose.yml, Makefile, workflow)
- Execute as files say
- Check results
No guessing. Files are truth.
The SSOT principle from Class 4 applies identically here. Just as we separated decisions from implementation in code, in systems too we separate “what to do” (declaration) from “how to do it” (execution).
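The split between “what” and “how” can be illustrated with a small Go sketch. The declaration is plain data (service name to image), and a pure function derives the actions needed to make the running system match it. The `State` type, the `Plan` function, and the image names are assumptions for illustration; this is not how Docker Compose is implemented, only the same idea in miniature.

```go
package main

import (
	"fmt"
	"sort"
)

// State maps a service name to the image it runs. The declared state would
// normally come from a file such as docker-compose.yml; it is inlined here.
type State map[string]string

// Plan compares the declared state with the observed state and returns the
// actions needed to converge. It only decides; it executes nothing.
func Plan(desired, actual State) []string {
	var actions []string
	for name, image := range desired {
		switch cur, ok := actual[name]; {
		case !ok:
			actions = append(actions, "start "+name+" ("+image+")")
		case cur != image:
			actions = append(actions, "replace "+name+" ("+cur+" -> "+image+")")
		}
	}
	for name := range actual {
		if _, ok := desired[name]; !ok {
			actions = append(actions, "stop "+name)
		}
	}
	sort.Strings(actions) // map iteration order is random; sort for stable output
	return actions
}

func main() {
	desired := State{"app": "app:v2", "db": "postgres:16"} // the declaration
	actual := State{"app": "app:v1", "cache": "redis:7"}   // what is running
	for _, a := range Plan(desired, actual) {
		fmt.Println(a)
	}
}
```

Because the plan is computed from the declaration rather than remembered, running it twice against an already matching system yields no actions. That is the property that removes guessing.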
Docker is a tool that puts apps in a box so they run identically anywhere. Like packing belongings in boxes when moving — put the app and everything it needs in one box. Move the box and it runs the same anywhere. Terraform is a tool for managing servers as code files.
You don’t need to understand Docker and Terraform internals. “Putting apps in a box” — that one line is enough. The agent handles the rest.
Condition 3. Reversible — All Changes Verifiable and Reversible
If the agent deployed and the service died, two things are needed:
- Can tell what went wrong (verifiable)
- Can revert to previous state (reversible)
# Irreversible deploy (terror)
Directly upload and overwrite files on server.
→ Problem → Where's the previous version? → Can't remember → Panic
# Reversible deploy (peace)
git revert HEAD && make deploy
→ Problem → Rollback to previous commit → Recovered in 1 minute
Git’s core from Class 3 returns here. Code rollback is handled by Git. Infrastructure rollback by Terraform. DB rollback by migration down files.
Irreversible changes cannot be delegated to agents. Agents will inevitably make mistakes, so those mistakes must be reversible.
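The “deploy, check /health, revert on failure” loop can be sketched in Go as follows. The three operations are passed in as hooks, so the sketch makes no claim about how they are implemented; in a real project they might shell out to `make deploy`, `curl -sf .../health`, and `git revert`. The names `Ops`, `DeployWithRollback`, and `ErrRolledBack` are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Ops groups the three operations the loop needs. All three are
// hypothetical hooks supplied by the caller.
type Ops struct {
	Deploy   func() error
	Healthy  func() error // nil means the health check passed
	Rollback func() error
}

// ErrRolledBack reports that the deploy was reverted after failing health checks.
var ErrRolledBack = errors.New("deploy failed health check and was rolled back")

// DeployWithRollback deploys, checks health up to `attempts` times with
// `wait` between checks, and rolls back if the system never becomes healthy.
func DeployWithRollback(ops Ops, attempts int, wait time.Duration) error {
	if attempts < 1 {
		attempts = 1
	}
	if err := ops.Deploy(); err != nil {
		return fmt.Errorf("deploy: %w", err)
	}
	var lastErr error
	for i := 0; i < attempts; i++ {
		if i > 0 {
			time.Sleep(wait) // give the service time to come up
		}
		if lastErr = ops.Healthy(); lastErr == nil {
			return nil
		}
	}
	if err := ops.Rollback(); err != nil {
		// The worst case: unhealthy and not reverted. A human must step in.
		return fmt.Errorf("rollback failed after unhealthy deploy (%v): %w", lastErr, err)
	}
	return fmt.Errorf("%w: %v", ErrRolledBack, lastErr)
}

func main() {
	ops := Ops{
		Deploy:   func() error { fmt.Println("deploying"); return nil },
		Healthy:  func() error { return errors.New("/health returned 503") },
		Rollback: func() error { fmt.Println("rolling back"); return nil },
	}
	fmt.Println(DeployWithRollback(ops, 3, 10*time.Millisecond))
}
```

Callers can use `errors.Is(err, ErrRolledBack)` to tell “failed but safely reverted” apart from “failed and still broken”, which is the case that needs a human.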
Condition 4. Human-in-the-loop — Approval Gates Are Explicit
This is the most important of the four conditions.
It is a structure in which the agent judges and the human approves. Not “human instructs, agent executes” but “agent proposes, human approves.” The direction is reversed.
The key is that approval gates are explicit and cannot be bypassed automatically.
| Task | Auto-execute | Needs approval |
|---|---|---|
| Run tests | O | |
| Code formatting | O | |
| Staging deploy | O | |
| Production deploy | | O |
| DB schema change | | O |
| Env variable change | | O |
| Rollback (pre-approved) | O | |
Reversible tasks can auto-execute. Hard-to-reverse or high-impact tasks must go through approval. Declaring this boundary in advance is Human-in-the-loop design.
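Declaring the boundary in advance can be as simple as an allowlist in code. The Go sketch below assumes the classification implied by the rule above (tests, formatting, staging deploys, and pre-approved rollbacks run automatically; production deploys, schema changes, and environment variable changes need approval). The task names and the `Execute` function are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// Task names one kind of operation the agent may attempt.
type Task string

const (
	RunTests      Task = "run-tests"
	FormatCode    Task = "format-code"
	DeployStaging Task = "deploy-staging"
	DeployProd    Task = "deploy-production"
	ChangeSchema  Task = "change-db-schema"
	ChangeEnv     Task = "change-env-variable"
	Rollback      Task = "rollback"
)

// autoAllowed is the declared boundary: only these tasks may run without a
// human. Anything not listed, including tasks nobody thought of, needs approval.
var autoAllowed = map[Task]bool{
	RunTests:      true,
	FormatCode:    true,
	DeployStaging: true,
	Rollback:      true, // pre-approved
}

// ErrNeedsApproval is returned when a gated task is attempted without approval.
var ErrNeedsApproval = errors.New("approval required")

// Execute runs the task only if it is auto-allowed or explicitly approved.
func Execute(t Task, approved bool, run func() error) error {
	if !autoAllowed[t] && !approved {
		return fmt.Errorf("%s: %w", t, ErrNeedsApproval)
	}
	return run()
}

func main() {
	noop := func() error { return nil } // stand-in for the real work
	fmt.Println(Execute(RunTests, false, noop))
	fmt.Println(Execute(DeployProd, false, noop))
	fmt.Println(Execute(DeployProd, true, noop))
}
```

The default-deny lookup is the important design choice: a task missing from the list is gated, so forgetting to classify something fails safe.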
The Agent’s Bottleneck Is Context, Not Intelligence
In Class 8, filefunc removed code’s context pollution. This principle extends to the entire system.
Structuring code lets the same agent handle a 10x wider scope.
And not just code: structuring every layer of the system dramatically widens the agent’s exploration scope:
Code → Structured with filefunc
Config → Declaratively defined with docker-compose.yml, Makefile
Specs → Cross-validated with yongol SSOT
Infra → Persisted with Terraform state
Monitor → Machine-readable with /health + structured logs
The agent’s bottleneck isn’t intelligence. Giving agents structured information is 10x more effective than using smarter models. Just as filefunc structured code in Class 8, Class 9 structures the entire system.
The Complete Pipeline: From “Add Feature” to Deployment
In a project with an Agent Operable System, saying “add an order history query feature” sets off the following:
1. SSOT editing
Agent: Add ListOrders to features.yaml
Agent: Define GET /orders in OpenAPI
Agent: Define orders table in DDL
Agent: Declare service flow in SSaC
Agent: Write test scenario in Hurl
2. Consistency validation
Agent: yongol validate → 0 errors
3. Code generation
Agent: yongol generate → Go handler, sqlc queries, React component
4. Test pass
Agent: go test → PASS
Agent: Hurl tests → PASS
5. Build
Agent: docker build → Success
6. Deploy (approval gate)
Agent: "All validations passed. Requesting staging deploy approval."
Human: "Approved"
Agent: staging deploy → /health check → Normal
7. Production deploy (approval gate)
Agent: "30 minutes error-free on staging. Requesting production deploy approval."
Human: "Approved"
Agent: production deploy → /health check → Normal
8. Complete
Agent: "ListOrders feature deployed.
Monitoring. Auto-rollback on anomaly."
What the human did: “Add order history query feature” + “Approved” twice. What the agent did: Everything else.
This is the complete form of vibe coding scale-up.
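The shape of that pipeline, meaning ordered stages, a stop at the first failure, and a pause at each approval gate, can be sketched in Go. The stage names follow the walkthrough above; the `Stage` type, the `RunPipeline` function, and the `approve` callback are assumptions for illustration, and each stage body is a stand-in for the real command.

```go
package main

import "fmt"

// Stage is one step of the pipeline.
type Stage struct {
	Name  string
	Gated bool         // true: a human must approve before this stage runs
	Run   func() error // stand-in for the real command
}

// RunPipeline executes stages in order. It stops at the first failed stage
// or denied approval, and returns the names of the stages that completed.
func RunPipeline(stages []Stage, approve func(stage string) bool) ([]string, error) {
	var done []string
	for _, s := range stages {
		if s.Gated && !approve(s.Name) {
			return done, fmt.Errorf("stage %q: approval denied", s.Name)
		}
		if err := s.Run(); err != nil {
			return done, fmt.Errorf("stage %q: %w", s.Name, err)
		}
		done = append(done, s.Name)
	}
	return done, nil
}

func main() {
	ok := func() error { return nil }
	stages := []Stage{
		{"validate", false, ok},
		{"generate", false, ok},
		{"test", false, ok},
		{"build", false, ok},
		{"deploy-staging", true, ok},
		{"deploy-production", true, ok},
	}
	done, err := RunPipeline(stages, func(stage string) bool {
		fmt.Println("approval requested for", stage)
		return stage == "deploy-staging" // this demo approves staging only
	})
	fmt.Println(done, err)
}
```

Because the gate is checked inside the runner rather than left to each stage, a stage cannot skip its own approval.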
Vision — The End of Vibe Coding Scale-up
Where we started in Class 1:
Class 1:
"Add feature" → Code emerges → 5 features then crumbles
Class 9:
"Add feature" + "Approve"
→ Code generation → Test pass → Build → Deploy → Monitor
→ Entire pipeline is agent-driven
A structure where non-SWEs can maintain, deploy, and operate 100+ endpoints.
Scale-up is possible without SWEs because decisions are made by humans, while implementation and verification are done by machines.
But one thing is still missing. We structured code, structured the system. But what about data?
Class 10 completes the final puzzle.
Exercise
Required Exercise (Non-technical)
Goal: Add a /health endpoint and verify with Hurl.
Step 1 — Establish Observability
To the agent: "Add a /health endpoint to the server.
Return DB connection status, error rate, and uptime as JSON.
All existing Hurl tests must pass."
Step 2 — Write Hurl Test
To the agent: "Create a Hurl test that verifies /health returns 200.
Also check the JSON response has db, status, and uptime fields."
What to check:
- Does /health return 200?
- Does the Hurl test pass?
Challenge Exercise (Optional)
No need to install Docker yourself or understand config files. Have the agent do all of it.
Step 1 — Docker Compose
To the agent: "If Docker isn't installed, install it.
Configure this project as docker-compose.yml.
Include app server and PostgreSQL.
Everything should come up with docker compose up.
Add build, deploy, test commands to the Makefile."
The agent handles everything from Docker installation to config file creation to execution verification. You just check at the end — “does the app start? Does /health return 200?”
Step 2 — CI/CD Pipeline
To the agent: "Create a GitHub Actions workflow.
On push to main branch:
1. Run go test
2. Run Hurl tests
3. Docker build
4. Deploy to staging if all pass
But production deploy requires manual approval."
Step 3 — Integration Demo
Request one feature addition from the agent. Demo the full pipeline from SSOT editing to staging deploy, with only production deploy requiring manual approval.
What to check:
- From “add feature” to staging deploy, how many times did a human intervene?
- When the agent failed, how many minutes to rollback?
- Could the agent read /health results and assess system state?
Related Articles
Reins Engineering Full Course
| Class | Title |
|---|---|
| Class 1 | How to Command AI |
| Class 2 | How to Distrust AI |
| Class 3 | Unbreakable Apps |
| Class 4 | Decisions Outside Code |
| Class 5 | AI with Reins |
| Class 6 | Lock When It Passes |
| Class 7 | Flipping Sycophancy |
| Class 8 | Agent Factory |
| Class 9 | Automation Beyond Code |
| Class 10 | Law of Data |
Sources
- Class 8 reference: Stanford “Lost in the Middle” (2024), Amazon “Context Length Alone Hurts LLM Performance” (2025) — Unnecessary context degrades agent performance 30-85%
- Observability principles — Machine-parseable structured output (/health endpoints, JSON logs) as prerequisite for agent operation
- Docker Compose — Declarative service configuration for agents to read and execute systems without guessing
- Terraform — Infrastructure as Code, declarative definition and reversible changes of infrastructure state
- CI/CD (GitHub Actions) — Declarative automation of build-test-deploy pipelines
- Human-in-the-loop design — Auto for reversible tasks, approval gate required for high-impact tasks