Class 4

Quick Tips — Just Know This and You Can Command AI

In Class 3 we learned to prevent drift with Hurl tests and Git. That’s enough for up to 50 endpoints. But at larger scale, a new problem emerges. AI mistakes your decisions for implementation details and overwrites them when you just say “clean it up.”

Core principle: Pull decisions out of code. Inside code, your decisions (“this column is an integer”) and details (variable names, error handling) are mixed together, and AI can’t tell them apart. Once decisions are declared in separate specifications, AI can no longer mistake them for details and overwrite them.

Installing yongol:

To the agent: “Install the yongol skill with npx skills add park-jun-woo/yongol”

Let’s declaratively create a Login feature:

To the agent: “Declare Login feature as SSOT”

AI auto-generates an API spec, a DB schema, a service flow, an authorization policy, and a test scenario.

To the agent: “Run yongol validate and get to 0 errors”

If errors appear, AI fixes them itself. 287 rules cross-validate across 10 specifications. When all checkmarks are green, code generation is ready.

yongol is currently Go-only. But the principle — “pull decisions out of code and catch contradictions with cross-validation” — is language-independent. And Hurl tests from Class 3 already work regardless of language.

Hands-on: Try It

Have the agent install the yongol skill:

To the agent: “Install the yongol skill with npx skills add park-jun-woo/yongol”

To the agent: “Declare Login feature as SSOT”

AI will generate 5 files: specs/api/openapi.yaml, specs/db/users.sql, specs/service/auth/login.ssac, specs/policy/authz.rego, and specs/tests/scenario-login.hurl.

To the agent: “Run yongol validate specs/”

0 errors means success.

Let’s intentionally create a contradiction:

To the agent: “Change the email field in OpenAPI to mail”

To the agent: “Run yongol validate specs/”

An error like “OpenAPI says mail but DDL says email” should appear. Looking at one layer alone, there’s no error. Cross two layers and the contradiction appears.

To the agent: “Fix the validate errors”

AI unifies the field names. Validate again. 0 errors.


Why You Need to Command This Way

Previous Class Recap

In Class 3 we learned three things.

  • How to declare and verify API behavior with Hurl
  • How to create save points with Git
  • How to automate verification with CI/CD

These three alone can prevent vibe coding’s biggest enemy — drift. “Add this new feature. But all existing Hurl tests must pass.” This one phrase is the defense line.

But past 50 endpoints, a new problem emerges.

What Happens Past 50

Say you’re building a SaaS with vibe coding. It starts fast.

“Make signup” — 2 minutes. “Make login” — 1 minute. “Make profile edit” — 1 minute.

12 endpoints, 5 tables. Running in 20 minutes.

Past 50, strange things happen. AI makes today’s pattern differently from yesterday’s. Past 100, existing features break silently. Past 200, adding one new feature takes 10x longer than making the first 10.

Why?

Not because AI is stupid.

Three Things Mixed in Code

Open any source file and you will find three things interleaved.

User decisions — “This column is an integer.” “This API is owner-access only.” “Pagination uses cursor style.”

Business logic — Pricing policies, workflows, lifecycle rules.

Implementation details — Variable names, library call order, error handling code.

When AI reads this code, it can’t distinguish which line is your decision and which is a detail. So when you say “refactor,” it mistakes your decisions for details and quietly overwrites them.

Think of it like this. You decided “the entrance must face south” when building a house. But when you told the interior designer “clean up the house,” they changed the entrance direction. “The traffic flow is better this way.” From the designer’s perspective, it’s optimization. From your perspective, it’s disaster.

This is exactly what AI does. Using a bigger model doesn’t fix it. Because the medium (source code) itself can’t preserve decisions.

Pull Decisions Out of Code

The solution is simple. Separate decisions from code.

How you’ve been directing AI:

"Make login API" → AI writes code → Decisions and details get mixed

What yongol proposes:

You declare decisions → AI edits declarations → yongol generates code

Decisions live in declarative specifications; code is a disposable projection. When decisions change, edit the declaration and regenerate code. When details change, just regenerate code. They never mix.

10 SSOT Types — Each Handles One Concern

yongol separates the decisions composing software into 10 declarative specifications (SSOT: Single Source of Truth). Each specification handles one concern.

You don’t need to memorize these names. Grouped by role, they’re intuitive:

Defining data:

| SSOT | Decision it handles | Simply put |
| --- | --- | --- |
| features.yaml | Feature catalog | “What to build” |
| manifest.yaml | Project settings | “Auth is JWT, DB is PostgreSQL” |
| SQL DDL | Data model | “Store these columns in this table” |
| sqlc | DB queries | “Query data with this SQL” |

Defining behavior:

| SSOT | Decision it handles | Simply put |
| --- | --- | --- |
| OpenAPI | API contract | “Send this data to this address, get this response” |
| SSaC | Service flow | “Process in order: query → validate → create → respond” |
| Mermaid stateDiagram | State transitions | “Order changes: pending → approved → complete → cancelled” |

Verifying:

| SSOT | Decision it handles | Simply put |
| --- | --- | --- |
| OPA Rego | Authorization policy | “Only admins can delete” |
| Hurl | Test scenarios | “Call this way, must respond this way” |

Defining screens:

| SSOT | Decision it handles | Simply put |
| --- | --- | --- |
| STML (Service Template Markup Language) | Frontend | “Show this data on screen this way” |

10 types seem like a lot? No need to worry. Three facts to know.

First, 8 of the 10 are industry standards: OpenAPI, SQL DDL, sqlc, Rego, Mermaid, Hurl, and two plain YAML files (features.yaml and manifest.yaml). These are tools professional developers already use, but you’ll never need to write them directly. AI knows them. yongol created only two new ones: SSaC (service flow) and STML (frontend).

Second, you don’t need to learn them. AI knows. You say “make a signup feature” and AI edits the 10 specifications. All you see is the result.

Third, yongol is currently Go-only. It can’t be used with stacks like React+FastAPI or Next.js yet. But the principle learned in Class 4 — pull decisions out of code and catch contradictions with cross-validation — is language-independent. Understanding the principle means you can apply it immediately when the tool expands. And Hurl tests from Class 3 already work regardless of language — you can do API contract verification right now without yongol.

100K Lines vs 12K Lines

Why bother separating decisions? Numbers make it immediately clear.

| Scale | Example | SSOT (decisions only) | Implementation code |
| --- | --- | --- | --- |
| Small | Hair salon booking | ~1,500 lines | ~10K lines |
| Medium | Jira, Notion-class | ~12,500 lines | ~100K lines |
| Large | Shopify-class | ~30,000 lines | ~300K lines |

Take a medium SaaS. Of 100K lines of code, decisions are 12,500 lines. The remaining 87,500 lines are wiring — error handling, library calls, boilerplate.

You can have AI read 100K lines. With 1M token context, it’s physically possible. But being able to read isn’t the same as being able to handle accurately. As context grows, middle information gets missed, and unnecessary tokens blur judgment.

Separating only decisions yields 12,500 lines. In a context with only the core and no noise, AI accuracy improves. It is the same AI doing the same task, but reading one-eighth as much: roughly 8x context compression (100K lines down to 12.5K).
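The ratios behind the table can be checked with a few lines of arithmetic. The sketch below is only an illustration: the type and function names are made up for this example, and the line counts are the approximate figures from the table.

```go
package main

import "fmt"

// scale holds one row of the size-comparison table.
type scale struct {
	name      string
	ssotLines int // lines of declarative specification (decisions only)
	codeLines int // lines of implementation code
}

// compression returns how many times smaller the SSOT is than the code.
func compression(s scale) float64 {
	return float64(s.codeLines) / float64(s.ssotLines)
}

// wiringLines returns the lines of code that carry no decision.
func wiringLines(s scale) int {
	return s.codeLines - s.ssotLines
}

func main() {
	rows := []scale{
		{"Small", 1500, 10000},
		{"Medium", 12500, 100000},
		{"Large", 30000, 300000},
	}
	for _, r := range rows {
		fmt.Printf("%-6s %.1fx smaller, %d lines of wiring\n",
			r.name, compression(r), wiringLines(r))
	}
}
```

For the medium SaaS this gives 8x and 87,500 lines of wiring. Across the three rows the ratio runs from about 6.7x to 10x.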

operationId — The Key That Threads All Layers

If 10 specifications play separately, that’s chaos. They need to be connected. How?

With one name.

Pressing one button sends one request to the server. The address and method on the server that receive that request are called an endpoint.

In a full-stack application, the unit of a feature is an API endpoint. When a user presses a button, an API is called, that API executes service logic, reads the DB, checks permissions, and transitions state. OpenAPI gives each endpoint operation a unique name, the operationId, and that name is the starting point of this flow.

A Feature Chain shows at a glance which files a single feature spans.

Let’s see what happens when you enter an operationId called ExecuteWorkflow:

── Feature Chain: ExecuteWorkflow ──

  OpenAPI    api/openapi.yaml                POST /workflows/{id}/execute
  SSaC       service/workflow/execute_workflow.ssac   @get @empty @auth @state @call @publish @response
  DDL        db/workflows.sql                CREATE TABLE workflows
  DDL        db/execution_logs.sql           CREATE TABLE execution_logs
  Rego       policy/authz.rego               resource: workflow
  StateDiag  states/workflow.md              diagram: workflow → ExecuteWorkflow
  FuncSpec   func/billing/check_credits.go   @func billing.CheckCredits
  FuncSpec   func/billing/deduct_credit.go   @func billing.DeductCredit
  FuncSpec   func/worker/process_actions.go  @func worker.ProcessActions
  FuncSpec   func/webhook/deliver.go         @func webhook.Deliver
  Hurl       tests/scenario-happy-path.hurl  scenario: scenario-happy-path.hurl

You don’t need to be able to read this output. What matters is that one operationId can track everything.

From API spec to DB schema, authorization policy, state transitions, function implementation, test scenario — the entire terrain of one feature is visible on one screen.

This is why yongol calls the operationId a keystone. Just as in architecture the final wedge stone placed at the top of an arch supports the entire arch, one PascalCase identifier physically binds 10 layers.

During code review — “If this feature was modified, shouldn’t that file have changed too?” Compare against the chain for instant confirmation. When a new team member joins and asks “how does ExecuteWorkflow work?”, just show one Feature Chain. Dozens of greps replaced by one command.
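The idea of one name threading through every layer can be pictured as a simple lookup table. The sketch below is not how yongol is implemented; Index, Ref, Add, and Chain are hypothetical names chosen for illustration, and the file paths are taken from the ExecuteWorkflow chain above.

```go
package main

import (
	"fmt"
	"sort"
)

// Ref is one place where an operationId is mentioned.
type Ref struct {
	Layer string // e.g. "OpenAPI", "SSaC", "DDL"
	File  string // path of the specification file
}

// Index maps an operationId to every file that refers to it.
type Index map[string][]Ref

// Add records that file, belonging to layer, refers to operationID.
func (ix Index) Add(operationID, layer, file string) {
	ix[operationID] = append(ix[operationID], Ref{Layer: layer, File: file})
}

// Chain returns a copy of the references for operationID,
// sorted by layer and then by file.
func (ix Index) Chain(operationID string) []Ref {
	refs := append([]Ref(nil), ix[operationID]...)
	sort.Slice(refs, func(i, j int) bool {
		if refs[i].Layer != refs[j].Layer {
			return refs[i].Layer < refs[j].Layer
		}
		return refs[i].File < refs[j].File
	})
	return refs
}

func main() {
	ix := Index{}
	ix.Add("ExecuteWorkflow", "OpenAPI", "api/openapi.yaml")
	ix.Add("ExecuteWorkflow", "SSaC", "service/workflow/execute_workflow.ssac")
	ix.Add("ExecuteWorkflow", "DDL", "db/workflows.sql")
	ix.Add("ExecuteWorkflow", "DDL", "db/execution_logs.sql")
	ix.Add("ExecuteWorkflow", "Rego", "policy/authz.rego")
	ix.Add("ExecuteWorkflow", "Hurl", "tests/scenario-happy-path.hurl")

	for _, r := range ix.Chain("ExecuteWorkflow") {
		fmt.Printf("%-8s %s\n", r.Layer, r.File)
	}
}
```

Because every layer uses the same key, one lookup replaces a search through each layer separately.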

SSaC — Captures Decisions Inside Functions

The most distinctive of the 10 SSOTs is SSaC (Service Sequences as Code).

Looking at existing SSOTs, OpenAPI declares “what requests are received and what responses are given.” SQL DDL declares “what gets stored.” But the function internals — the business flow of “query → validate → create → respond” — had nowhere to be declared. You had to read implementation code to know.

SSaC fills this gap.

Let’s look at a real example. A feature called “accept a proposal (AcceptProposal).”

Reading it in plain English:

  1. Query the proposal
  2. If the proposal doesn’t exist, return “not found” error
  3. Query the project (called a Gig in the code) linked to the proposal
  4. If the project doesn’t exist, return “not found” error
  5. Check if the requester is the project owner
  6. Check if the proposal status allows acceptance
  7. Check if the project status allows acceptance
  8. Change the proposal status to “accepted”
  9. Assign a freelancer to the project and change status to “in progress”
  10. Hold payment in escrow
  11. Publish “proposal accepted” event
  12. Query and return the updated proposal

Written in SSaC:

The code below is just the above English formatted to specification. You don’t need to read it.

// @get Proposal p = Proposal.FindByID({ID: request.id})
// @empty p "Proposal not found" 404
// @get Gig gig = Gig.FindByID({ID: p.GigID})
// @empty gig "Gig not found" 404
// @auth "AcceptProposal" "gig" {ResourceID: request.id} "Forbidden" 403
// @state proposal {status: p.Status} "AcceptProposal" "Cannot accept" 409
// @state gig {status: gig.Status} "AcceptProposal" "Cannot accept on gig" 409
// @put Proposal.UpdateStatus({ID: p.ID, Status: "accepted"})
// @put Gig.AssignFreelancer({ID: gig.ID, FreelancerID: p.FreelancerID, Status: "in_progress"})
// @call billing.HoldEscrowResponse escrow = billing.HoldEscrow({GigID: gig.ID, Amount: gig.Budget})
// @publish "proposal.accepted" {GigID: gig.ID, FreelancerID: p.FreelancerID}
// @get Proposal updated = Proposal.FindByID({ID: p.ID})
// @response { proposal: updated }
func AcceptProposal() {}

You don’t need to read this code. AI writes it. You just confirm whether the English sentences above are correct.

14 lines: 13 annotation lines plus the function signature, using 8 kinds of annotation. What’s inside:

  • Three resource queries (@get): the proposal, its gig, and the updated proposal
  • Two existence checks (@empty)
  • Permission check (@auth)
  • Two state machine checks (@state)
  • Two updates (@put)
  • Escrow processing (@call)
  • Event publishing (@publish)
  • Final response (@response)

Generating implementation code from these 14 lines produces over 100 lines. Error handling, transaction management, type conversion, response formatting: all of it is filled in by code generation. All you care about is the decision of “what order to process in.”

SSaC has fewer than 20 annotations total. You can learn them on one page. And again, you don’t need to learn them yourself. AI writes them.
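To get a feel for why a couple of declared lines grow into many lines of code, here is a hand-written guess at what the first @get and @empty pair might expand to. This is not yongol’s actual output: ProposalStore, HTTPError, ErrNoRows, getProposal, and memStore are all hypothetical names invented for this sketch.

```go
package main

import (
	"errors"
	"fmt"
)

// Proposal is a hypothetical model row; the real columns are not shown in the text.
type Proposal struct {
	ID     int64
	Status string
}

// ErrNoRows stands in for the "no rows" error a real DB layer would return.
var ErrNoRows = errors.New("no rows")

// ProposalStore is a hypothetical query interface standing in for generated query code.
type ProposalStore interface {
	FindByID(id int64) (*Proposal, error)
}

// HTTPError carries the status code and message declared in the SSaC line.
type HTTPError struct {
	Status  int
	Message string
}

func (e *HTTPError) Error() string { return fmt.Sprintf("%d %s", e.Status, e.Message) }

// getProposal sketches a possible expansion of two declared lines:
//
//	@get Proposal p = Proposal.FindByID({ID: request.id})
//	@empty p "Proposal not found" 404
func getProposal(store ProposalStore, id int64) (*Proposal, error) {
	p, err := store.FindByID(id)
	if errors.Is(err, ErrNoRows) || (err == nil && p == nil) {
		return nil, &HTTPError{Status: 404, Message: "Proposal not found"}
	}
	if err != nil {
		return nil, &HTTPError{Status: 500, Message: "internal error"}
	}
	return p, nil
}

// memStore is an in-memory ProposalStore used only for this demonstration.
type memStore map[int64]*Proposal

func (m memStore) FindByID(id int64) (*Proposal, error) {
	p, ok := m[id]
	if !ok {
		return nil, ErrNoRows
	}
	return p, nil
}

func main() {
	store := memStore{1: &Proposal{ID: 1, Status: "pending"}}
	if _, err := getProposal(store, 2); err != nil {
		fmt.Println(err)
	}
	if p, err := getProposal(store, 1); err == nil {
		fmt.Println(p.ID, p.Status)
	}
}
```

The decision (“if the proposal is missing, answer 404 with this message”) fits on one declared line. Everything else in the function is wiring.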

yongol validate — 287 Rules Catch Contradictions

Since decisions are distributed across 10 files, contradictions between files can arise.

  • What if DDL says BIGINT but OpenAPI says string?
  • What if SSaC declares @auth but Rego has no corresponding rule?
  • What if the state diagram has a transition but SSaC has no corresponding function?
  • What if Hurl has a test referencing an endpoint not in features?

Contradictory decisions produce contradictory code. No matter how clean the code, if decisions conflict, behavior goes wrong.

yongol validate catches this. The result of checking all connections between 10 specifications:

You don’t need to understand the output below fully. All green checkmarks mean no contradictions.

✓ manifest        ✓ openapi_ddl       ✓ ssac_rego
✓ openapi         ✓ openapi_ssac      ✓ ssac_authz
✓ ddl             ✓ hurl_openapi      ✓ ssac_sqlc
✓ query           ✓ hurl_statemachine ✓ ddl_statemachine
✓ ssac            ✓ hurl_manifest     ✓ ddl_rego
✓ statemachine    ✓ openapi_manifest  ✓ rego_manifest
✓ rego            ✓ ssac_ddl          ✓ stml_openapi
✓ hurl            ✓ ssac_statemachine
✓ funcspec        ✓ ssac_func

0 errors, 0 warnings

First it validates each SSOT individually, then runs cross-layer validation. ~287 rules check all symbol references between the 10 SSOTs. If there’s a single contradiction, code generation is refused.

Let’s highlight the key point. Existing tools only see their own layer. An OpenAPI validator checks if the OpenAPI spec is valid. A SQL validator checks if DDL is valid. But “OpenAPI says user_id is string but DDL says BIGINT” — nobody catches this kind of cross-layer contradiction. yongol validate’s unique value is this cross-validation.
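A cross-layer check can be sketched in a few lines. The example below is a simplified illustration, not one of yongol’s actual rules: the type-compatibility table, the crossCheck function, and the message wording are assumptions made for this sketch.

```go
package main

import (
	"fmt"
	"sort"
)

// compatible lists which SQL column types each OpenAPI type may map to.
// This mapping is a simplified assumption for illustration.
var compatible = map[string][]string{
	"integer": {"BIGINT", "INTEGER"},
	"string":  {"TEXT", "VARCHAR"},
	"boolean": {"BOOLEAN"},
}

// contains reports whether want appears in list.
func contains(list []string, want string) bool {
	for _, v := range list {
		if v == want {
			return true
		}
	}
	return false
}

// crossCheck compares OpenAPI field types against DDL column types and
// returns one message per contradiction, sorted for stable output.
func crossCheck(openapi, ddl map[string]string) []string {
	var problems []string
	for field, apiType := range openapi {
		sqlType, ok := ddl[field]
		if !ok {
			problems = append(problems,
				fmt.Sprintf("OpenAPI field %q has no DDL column", field))
			continue
		}
		if !contains(compatible[apiType], sqlType) {
			problems = append(problems,
				fmt.Sprintf("OpenAPI says %s is %s but DDL says %s", field, apiType, sqlType))
		}
	}
	sort.Strings(problems)
	return problems
}

func main() {
	// Each map is valid on its own; only the comparison reveals a problem.
	openapi := map[string]string{"user_id": "string", "mail": "string"}
	ddl := map[string]string{"user_id": "BIGINT", "email": "TEXT"}
	for _, p := range crossCheck(openapi, ddl) {
		fmt.Println("✗", p)
	}
}
```

Neither map contains an error when read alone. The contradictions exist only in the comparison, which is why single-layer validators miss them.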

When errors occur, messages like this appear:

✗ SSaC         CancelReservation
               @model Reservation.SoftDelete — method not found in sqlc queries
✗ Cross        1 mismatch

FAILED: Fix errors before codegen.

“The CancelReservation function calls Reservation.SoftDelete, but there’s no SoftDelete method in sqlc queries.” Unambiguous. It tells you exactly where things don’t align.

AI writes freely. When it goes off-rails, validate catches it immediately. Freedom on rails.

yongol agent — Even 4.5B Models Converge to 0 Errors

Validate catches things — great. But does a person need to fix the errors?

No. AI does.

yongol agent specs/ --model ollama:gemma4:e4b --max-rounds 20

This one command makes AI repeat validate → check errors → fix → validate again → fix again. Until 0 errors.
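The loop itself is small. The sketch below shows its shape under stated assumptions: Validator, Fixer, and converge are hypothetical names, and the toy validate and fix functions stand in for yongol validate and the model’s edits. It does not call yongol.

```go
package main

import "fmt"

// Validator returns the current error messages; an empty result means consistent.
type Validator func() []string

// Fixer receives the error messages and edits the specifications.
type Fixer func(errs []string)

// converge alternates validation and fixing until there are no errors or
// maxRounds fix attempts have been used. It returns the number of fix
// rounds performed and whether the specs ended with 0 errors.
func converge(validate Validator, fix Fixer, maxRounds int) (rounds int, ok bool) {
	for rounds = 0; ; rounds++ {
		errs := validate()
		if len(errs) == 0 {
			return rounds, true
		}
		if rounds == maxRounds {
			return rounds, false
		}
		fix(errs)
	}
}

func main() {
	// A toy "spec": the OpenAPI field name should match the DDL column name.
	apiField, ddlColumn := "mail", "email"

	validate := func() []string {
		if apiField != ddlColumn {
			return []string{fmt.Sprintf("OpenAPI says %s but DDL says %s", apiField, ddlColumn)}
		}
		return nil
	}
	// fix stands in for the model's edit in response to the feedback.
	fix := func(errs []string) { apiField = ddlColumn }

	rounds, ok := converge(validate, fix, 20)
	fmt.Printf("converged=%v after %d fix round(s)\n", ok, rounds)
}
```

The round limit matters: if the fixer never resolves an error, the loop stops and reports failure instead of running forever.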

There are experimental results. For one Login endpoint, various models were asked to write 9 SSOT files:

| Model | Size | Environment | Result |
| --- | --- | --- | --- |
| Grok 4.3 | Large | API | 0 errors on first try |
| Gemini 2.5 Flash | Medium | API (free) | 0 errors with 1 feedback |
| Gemma4 | 4.5B | Local (16GB VRAM) | 0 errors with 1 feedback |
| Qwen3 | 8B | Local | 0 errors with 1 feedback |

Even a 4.5B local model works. $0 cost. Offline. Without internet.

Why do small models work? Because validate’s feedback is deterministic fact. “line 41: field name mismatch, expected ‘user_id’, got ‘userId’” is not an opinion. It’s a fact. A fact leaves the AI no room to flatter or argue; it simply accepts the correction and fixes the file.

It’s not the model’s IQ but the precision of feedback that determines the result.

Benchmark: ZenFlow — 32 Endpoints in 69 Minutes

Not theory. Measured results.

ZenFlow — a multi-tenant workflow automation SaaS. Built from scratch entirely with yongol.

| Phase | Content | Time | Cumulative |
| --- | --- | --- | --- |
| Initial build | 10 endpoints, 6 tables, auth, state machine | 13 min | 13 min |
| + Versioning | Workflow cloning, version list | 6 min | 19 min |
| + Webhooks | Webhook CRUD, queue backend | 6 min | 25 min |
| + Template marketplace | Cursor pagination, cross-org cloning | 3 min | 28 min |
| + File attachments | Execution reports, file backend | 4 min | 32 min |
| + Scheduling | Cron scheduling, session backend | 6 min | 38 min |
| + Audit logs | Offset pagination, cache backend | 3 min | 41 min |
| + Dashboard | Relation joins, func response types | 7 min | 48 min |
| + Batch operations | jsonb batch insert | 14 min | 62 min |
| + External API | Geocoding func, column addition | 3 min | 65 min |
| + Conditional updates | Sentinel pattern, auto-assignment | 4 min | 69 min |

Final: 32 endpoints, 14 tables, 47 Hurl requests. 11/11 phases passed.

The most important thing about these numbers isn’t “69 minutes.” It’s that speed didn’t decrease as features were added.

The first phase (initial build) took 13 minutes. The eleventh (conditional updates) took 4 minutes. The ten later phases ranged from 3 to 14 minutes with no upward trend, so the phenomenon called “the 200-endpoint wall” in vibe coding, where the cost of adding features grows exponentially, did not show up in this build.

Existing tests didn’t break either. 47 Hurl requests all passed at every phase.
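The table’s totals are easy to re-check. The sketch below recomputes the cumulative column and the average time of the later phases from the per-phase minutes; the function names are made up for this example.

```go
package main

import "fmt"

// phaseMinutes lists the time of each ZenFlow phase in table order.
var phaseMinutes = []int{13, 6, 6, 3, 4, 6, 3, 7, 14, 3, 4}

// cumulative returns the running total after each phase.
func cumulative(minutes []int) []int {
	out := make([]int, len(minutes))
	total := 0
	for i, m := range minutes {
		total += m
		out[i] = total
	}
	return out
}

// mean returns the average of minutes, or 0 for an empty slice.
func mean(minutes []int) float64 {
	if len(minutes) == 0 {
		return 0
	}
	sum := 0
	for _, m := range minutes {
		sum += m
	}
	return float64(sum) / float64(len(minutes))
}

func main() {
	cum := cumulative(phaseMinutes)
	fmt.Println("total minutes:", cum[len(cum)-1])
	fmt.Printf("initial build: %d min\n", phaseMinutes[0])
	fmt.Printf("average of the 10 later phases: %.1f min\n", mean(phaseMinutes[1:]))
}
```

The total comes to 69 minutes, and the ten phases after the initial build average 5.6 minutes each.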

Can Generated Code Be Edited

“If code is auto-generated, won’t my manual edits be lost?”

They won’t be. yongol generate preserves user edits on re-run.

  • All generated files have a //yg:checked llm=yongol-gen hash=<8hex> annotation.
  • If you modify code, the hash changes.
  • Files with changed hashes are marked as preserved and skipped in the next generate.
  • yongol status shows preserved files and contract drift.

SSOT is the truth, generated code is a projection, but your drawings on the projection are preserved.
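The hash check can be sketched as follows. The text does not say which hash function yongol uses or where the annotation sits in the file, so this example assumes SHA-256 truncated to 8 hex digits and an annotation on the first line. bodyHash, stamp, and userEdited are hypothetical names.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// marker is the annotation prefix quoted in the text.
const marker = "//yg:checked llm=yongol-gen hash="

// bodyHash returns the first 8 hex digits of the SHA-256 of body.
// SHA-256 is an assumption; the real hash function is not documented here.
func bodyHash(body string) string {
	sum := sha256.Sum256([]byte(body))
	return hex.EncodeToString(sum[:])[:8]
}

// stamp prepends the annotation line to freshly generated code.
func stamp(body string) string {
	return marker + bodyHash(body) + "\n" + body
}

// userEdited reports whether the code below the annotation no longer
// matches the recorded hash. A file without the annotation is treated
// as hand-written.
func userEdited(file string) bool {
	header, body, found := strings.Cut(file, "\n")
	if !found || !strings.HasPrefix(header, marker) {
		return true
	}
	return strings.TrimPrefix(header, marker) != bodyHash(body)
}

func main() {
	generated := stamp("package auth\n\nfunc Login() {}\n")
	fmt.Println("untouched file edited?", userEdited(generated))

	edited := generated + "// my tweak\n"
	fmt.Println("modified file edited?", userEdited(edited))
}
```

A generator following this scheme would overwrite only files where userEdited is false and skip the rest.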

Why Bigger Models Aren’t the Answer

“GPT-6 will fix it.”

It won’t. The problem isn’t model intelligence — it’s the medium.

The medium of code doesn’t distinguish decisions from implementation. Whatever model reads code, it sees text where decisions and details are interleaved. No matter how smart the model, if the medium doesn’t provide distinction, it can’t distinguish.

yongol changes the medium. It moves AI’s editing target from code to declarative specifications. Specifications contain only decisions and no implementation details, so AI can’t mistake decisions for details.

When a small LLM edits only SSOTs and validate gives precise feedback at every mistake, it maintains the same level of decision integrity as a much larger model editing raw code.

Not a bigger model, but a more precise structure is the answer.

Agent Workflow — All You See Is the Result

The actual flow of using yongol:

1. You say "make a reservation feature"
2. AI edits SSOTs in specs/
3. yongol validate specs/ — checks consistency
4. If errors → AI fixes the relevant SSOT → back to step 3
5. 0 errors → yongol generate — generates code
6. Hurl tests run automatically
7. Pass → commit. On to the next feature.

You don’t need to read code. You don’t even need to read SSOTs. “Make it” → “Done?” → “Done” — that’s the loop. What changes is that nothing breaks behind the scenes.

The vibe coding experience stays the same. Only the 200-endpoint wall disappears.

Summary — What to Remember from This Class

  1. Code has decisions and details mixed together. AI can’t distinguish them. This is the root cause of drift.

  2. Pull decisions out of code. 10 declarative specifications (SSOT) each handle one concern. 8 of 10 are industry standards.

  3. operationId is the keystone. One name threads through 10 layers. A single Feature Chain shows the complete terrain of a feature.

  4. 287 rules catch cross-layer contradictions. Existing tools only see their own layer. yongol validate catches cracks between layers.

  5. Even 4.5B models converge to 0 errors. It’s not model IQ but precision of feedback that determines results.

Exercise: Declare a Login Endpoint as SSOT and Catch Contradictions

Goal: Experience yongol validate’s cross-validation firsthand.

Step 1: Environment Setup

Copy and run the command below in your terminal to install yongol functionality.

npx skills add park-jun-woo/yongol

This installs the yongol skill in your AI agent.

Step 2: Declare the Login Endpoint

Tell AI: “Declare the Login feature as SSOT.”

AI will generate 5 files:

  • specs/api/openapi.yaml — POST /auth/login
  • specs/db/users.sql — CREATE TABLE users
  • specs/service/auth/login.ssac — @get → @empty → @call → @response
  • specs/policy/authz.rego — Authorization policy
  • specs/tests/scenario-login.hurl — Login test

Step 3: Validate

yongol validate specs/

0 errors means success.

Step 4: Intentionally Create a Contradiction

Tell AI “change the email field in OpenAPI to mail.” Leave DDL and SSaC as-is.

yongol validate specs/

An error should appear. “OpenAPI says mail but DDL says email.”

This is cross-validation. Looking at one layer alone, there’s no error. Cross two layers and the contradiction appears.

Step 5: Resolve the Contradiction

Tell AI: “Fix the validate errors.” AI will unify the field name. Validate again. 0 errors.

Step 6: Code Generation

yongol generate specs/ artifacts/

The complete code for the Login feature is generated. Over 100 lines of implementation code from 16 SSaC lines.

What you should have felt in this exercise:

  • The sensation of decisions (SSOT) and implementation (generated code) being separated
  • The moment cross-validation catches a contradiction between layers
  • The process of AI converging by following validate feedback when told “fix it”

  • yongol — The Keel of AI Coding SaaS — Technical details on yongol’s architecture, 10 SSOTs, and 287 cross-validation rules
  • Feature Chain — A tracking system that threads through 10 layers with one operationId
  • SSaC — Service Sequences as Code. How to declaratively capture business flows inside functions

Reins Engineering Full Course

| Class | Title |
| --- | --- |
| Class 1 | How to Command AI |
| Class 2 | How to Distrust AI |
| Class 3 | Unbreakable Apps |
| Class 4 | Decisions Outside Code |
| Class 5 | AI with Reins |
| Class 6 | Lock When It Passes |
| Class 7 | Flipping Sycophancy |
| Class 8 | Agent Factory |
| Class 9 | Automation Beyond Code |
| Class 10 | Law of Data |

Sources

  • ZenFlow benchmark — 32 endpoints, 14 tables, 47 Hurl requests in 69 minutes. 11/11 phases passed. No speed degradation when adding features
  • yongol agent model experiment — Grok 4.3 (0 errors first try), Gemini 2.5 Flash (0 errors with 1 feedback), Gemma4 4.5B (0 errors with 1 feedback), Qwen3 8B (0 errors with 1 feedback)
  • yongol validate — 287 rules, cross-validation across 10 SSOTs
  • Medium SaaS code size comparison: SSOT 12,500 lines vs implementation code 100K lines (~8x context compression)