Rule engines have stood on the same premise for 60 years: the validation target is a “fact.”
Drools puts Java objects as “facts” into working memory. Rego treats input as already-true data. JSON Schema assumes the document structure is given. It’s all the same assumption — incoming data is fact.
But what is a rule engine for? Validating whether data satisfies rules. Calling something that needs validation “already true” is a contradiction.
Not Facts, but Claims
Validation targets are not facts — they are claims. Assertions that may be true or false. Their validity must be judged by rules.
JWT already follows this principle. It calls sub, exp, iss not “facts” but “claims.” They are the token issuer’s assertions. Only after verifying the signature, checking expiration, and matching the issuer can they be trusted.
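The order of checks described above can be sketched in Go. This is a minimal illustration using only the standard library, not a real JWT implementation: the Claims struct, the Trust function, and the signatureOK flag (a stand-in for actual cryptographic verification) are hypothetical names introduced here.

```go
package main

import (
	"errors"
	"fmt"
)

// Claims holds assertions made by a token issuer. The fields mirror the
// JWT claim names sub, iss, and exp; the struct itself is hypothetical.
type Claims struct {
	Sub string
	Iss string
	Exp int64 // expiration time, seconds since the Unix epoch
}

// Trust returns nil only when every check passes. signatureOK stands in
// for real signature verification, which this sketch does not perform.
func Trust(c Claims, signatureOK bool, wantIss string, now int64) error {
	if !signatureOK {
		return errors.New("signature not verified")
	}
	if now >= c.Exp {
		return errors.New("token expired")
	}
	if c.Iss != wantIss {
		return errors.New("unexpected issuer")
	}
	return nil
}

func main() {
	c := Claims{Sub: "user-1", Iss: "auth.example.com", Exp: 2000}
	fmt.Println(Trust(c, true, "auth.example.com", 1000))  // all checks pass
	fmt.Println(Trust(c, false, "auth.example.com", 1000)) // signature check fails
}
```

Until Trust returns nil, the values in Claims are only assertions. That is the same stance the rest of this article takes toward rule engine input.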
This structure was already established in 1958.
Toulmin’s Argumentation Model
Stephen Toulmin analyzed the structure of argumentation into six elements in 1958:
- Claim: The target of judgment. What must be verified as true or false.
- Ground: The evidence data used for judgment.
- Warrant: The rule that determines whether the ground supports the claim.
- Backing: The justification for why the rule is valid.
- Qualifier: The degree of confidence in the judgment.
- Rebuttal: The exception conditions under which the claim does not hold.
Formal logic says “if the premises are true, the conclusion is true.” Toulmin was different. “A claim is supported by grounds and warrants, but overturned if exception conditions exist.” Every argument is defeasible.
Rule engines have stood on the formal logic side for 60 years. Input is fact, output is allow/deny, exceptions are a separate mechanism. Toulmin stood on the opposite side. Input is claim, output is degree, exceptions are built-in.
The problem was that Toulmin’s book sat on the philosophy shelf, invisible from the rule engine shelf. A 60-year missing link.
So I Built a Rule Engine
toulmin implements Toulmin’s argumentation model as a Go rule engine.
Requirements Evolve
Let’s see how if-else and toulmin respond to the same evolution of requirements.
// Monday: "Only authenticated users, IP blocking applied, internal network exempt from blocking"
g := toulmin.NewGraph("api:access")
auth := g.Rule(isAuthenticated)
blocked := g.Counter(isIPBlocked)
exempt := g.Except(isInternalIP)
blocked.Attacks(auth)
exempt.Attacks(blocked)
// Tuesday: "Add rate limiting"
limited := g.Counter(isRateLimited)
limited.Attacks(auth)
// Wednesday: "Premium users are exempt from rate limits"
premium := g.Except(isPremiumUser)
premium.Attacks(limited)
// Thursday: "During incident response, even premium users are limited"
incident := g.Counter(isIncidentMode)
incident.Attacks(premium)
From Tuesday on, each new requirement adds two lines and changes no existing code. The same evolution with if-else:
// Monday
if user != nil {
	if blockedIPs[ip] {
		if strings.HasPrefix(ip, "10.") {
			allow = true
		}
	} else {
		allow = true
	}
}
// Thursday — 4 levels of nesting, structure unreadable
if user != nil {
	if blockedIPs[ip] {
		if strings.HasPrefix(ip, "10.") {
			allow = true
		}
	} else if isRateLimited(ip) {
		if isPremium(user) {
			if !incidentMode {
				allow = true
			}
		}
	} else {
		allow = true
	}
}
toulmin: 2 lines per requirement, structure unchanged. if-else: Rewrite the entire structure every time.
Rules Are Go Functions
func(ctx Context, specs Specs) (bool, any)
- ctx = judgment material that varies per request (user, IP, context). Accessed via Get/Set.
- specs = judgment criteria fixed at graph declaration time via .With() (thresholds, role names, config).
- Return = (judgment result, evidence). Evidence is a domain-specific free type.
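func CheckOneFileOneFunc(ctx toulmin.Context, specs toulmin.Specs) (bool, any) {
	// Ground: the file under judgment, stored in the context by the caller.
	gf, _ := ctx.Get("file")
	f, ok := gf.(*FileGround)
	if !ok {
		return false, nil // no file ground supplied, so the rule does not fire
	}
	// The rule fires (violation) when the file declares more than one function.
	if len(f.Funcs) > 1 {
		return true, &Evidence{Got: len(f.Funcs), Expected: 1}
	}
	return false, nil
}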
No need to learn a new language like Rego. Just write Go functions. (The TypeScript port rulecat uses the same signature — npm install rulecat.)
spec — Same Function, Different Judgment Criteria
A spec passes a rule’s judgment criteria via the builder at declaration time. Registering the same function with a different spec creates a separate rule — no closure factory needed:
g := toulmin.NewGraph("access")
admin := g.Rule(isInRole).With(&RoleSpec{Role: "admin"}) // ruleID = "isInRole#admin"
editor := g.Rule(isInRole).With(&RoleSpec{Role: "editor"}).Qualifier(0.8)
// Separate example: the same CheckLineCount function registered with two different limits.
lg := toulmin.NewGraph("line-limit")
strict := lg.Rule(CheckLineCount).With(&LineLimit{Max: 100}).Qualifier(0.7)
relaxed := lg.Rule(CheckLineCount).With(&LineLimit{Max: 200}).Qualifier(0.5)
relaxed.Attacks(strict)
A spec value must implement the Spec interface (SpecName() string and Validate() error) and is validated at registration time. Rules that need no spec simply omit .With() (a nil spec).
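What implementing those two methods might look like is sketched here. The Spec interface is redeclared locally so the example compiles without the library. The RoleSpec and LineLimit field names come from the snippets above, but the validation rules and the strings returned by SpecName are assumptions, not the library's documented behavior.

```go
package main

import (
	"errors"
	"fmt"
)

// Spec mirrors the two methods the article says a spec must implement.
// It is declared locally here; the real interface lives in the library.
type Spec interface {
	SpecName() string
	Validate() error
}

// RoleSpec carries the role a rule should check for.
type RoleSpec struct {
	Role string
}

// SpecName returns a label for this spec (naming scheme assumed).
func (s *RoleSpec) SpecName() string { return s.Role }

// Validate rejects an empty role (a hypothetical validation rule).
func (s *RoleSpec) Validate() error {
	if s.Role == "" {
		return errors.New("role must not be empty")
	}
	return nil
}

// LineLimit carries the maximum line count a rule should allow.
type LineLimit struct {
	Max int
}

// SpecName returns a label for this spec (naming scheme assumed).
func (l *LineLimit) SpecName() string { return fmt.Sprintf("max%d", l.Max) }

// Validate rejects non-positive limits (a hypothetical validation rule).
func (l *LineLimit) Validate() error {
	if l.Max <= 0 {
		return errors.New("max must be positive")
	}
	return nil
}

func main() {
	specs := []Spec{&RoleSpec{Role: "admin"}, &LineLimit{Max: 100}, &LineLimit{Max: 0}}
	for _, s := range specs {
		fmt.Println(s.SpecName(), s.Validate())
	}
}
```

Validating at registration time means a misconfigured spec, such as the zero limit in the last entry, would surface when the graph is declared rather than during evaluation.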
Exceptions Are Declared as a Graph
Declare relationships between rules with the Graph Builder API and the engine handles the rest. Functions are identifiers. No string names needed.
g := toulmin.NewGraph("filefunc")
w := g.Rule(CheckOneFileOneFunc)
d := g.Except(TestFileException)
d.Attacks(w)
ctx := toulmin.NewContext()
ctx.Set("file", file)
results, _ := g.Evaluate(ctx)
The same function can be reused in different graphs with different defeat relationships:
strictGraph := toulmin.NewGraph("strict")
strictGraph.Rule(CheckOneFileOneFunc)
// No exceptions — test files not allowed either
lenientGraph := toulmin.NewGraph("lenient")
w := lenientGraph.Rule(CheckOneFileOneFunc)
r1 := lenientGraph.Except(TestFileException)
r2 := lenientGraph.Except(GeneratedFileException).Qualifier(0.8)
r1.Attacks(w)
r2.Attacks(w)
// Both test + generated files are exceptions
Judgment Rationale Is Traced
Pass EvalOption{Trace: true} and the engine tracks not just the verdict but which rules activated and which rules defeated which. Each TraceEntry carries the Toulmin elements directly — Name (Claim), Ground (ctx), Specs (Backing), and Verdict:
results, _ := g.Evaluate(ctx, toulmin.EvalOption{Trace: true})
// results[0].Verdict: 0.0 (rule and exception both active at qualifier 1.0: raw = 1/(1+1) = 0.5, verdict = 2×0.5 - 1)
// results[0].Trace: [
// {Name: "CheckOneFileOneFunc", Role: "rule", Activated: true, Qualifier: 1.0},
// {Name: "TestFileException", Role: "except", Activated: true, Qualifier: 1.0},
// ]
Even with dozens of rules, the answer to “why did this verdict come out?” stays human-readable. Pass Duration: true and the engine also measures per-rule execution time. Audit logging and debugging are built into the engine, so no separate logging is needed.
The Verdict Is Computed by a Single Formula
The h-Categoriser (Amgoud & Ben-Naim, 2013) is applied:
raw = w / (1 + Σ raw(attackers))
verdict = 2 × raw - 1
- +1.0: violation confirmed
- 0.0: undecidable
- -1.0: rebuttal confirmed
When a rule fires, it becomes a warrant. When an exception fires, it becomes an attacker. The formula computes the balance of power between them to produce a verdict. What about exceptions to exceptions? They become attackers of attackers: they weaken the exception and so restore part of the original rule’s strength. This is the compensation principle, a property that only h-Categoriser satisfies.
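The two formulas above can be evaluated by hand or with a few lines of Go. The sketch below is a standalone illustration, not the engine's code: the Node type and the Raw and Verdict functions are hypothetical names, and the recursion assumes the attack graph has no cycles.

```go
package main

import "fmt"

// Node is one activated rule in an acyclic attack graph.
type Node struct {
	Weight    float64 // the rule's qualifier, w
	Attackers []*Node // activated rules that attack this one
}

// Raw computes w / (1 + sum of Raw over the attackers), recursively.
// It assumes the attack graph contains no cycles.
func Raw(n *Node) float64 {
	sum := 0.0
	for _, a := range n.Attackers {
		sum += Raw(a)
	}
	return n.Weight / (1 + sum)
}

// Verdict rescales Raw to the [-1, +1] range: verdict = 2*raw - 1.
func Verdict(n *Node) float64 { return 2*Raw(n) - 1 }

func main() {
	// A rule with no active attackers.
	alone := &Node{Weight: 1}
	fmt.Printf("%.3f\n", Verdict(alone)) // 1.000

	// The same rule attacked by one active exception.
	exception := &Node{Weight: 1}
	attacked := &Node{Weight: 1, Attackers: []*Node{exception}}
	fmt.Printf("%.3f\n", Verdict(attacked)) // 0.000

	// An exception to the exception weakens the attacker.
	counter := &Node{Weight: 1}
	weakened := &Node{Weight: 1, Attackers: []*Node{counter}}
	restored := &Node{Weight: 1, Attackers: []*Node{weakened}}
	fmt.Printf("%.3f\n", Verdict(restored)) // 0.333
}
```

With all weights at 1.0, a single exception pulls the verdict from +1.0 down to 0.0, and an exception to that exception brings it back up to about +0.333 rather than all the way to +1.0.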
Rules Have Three Strengths
Nute’s (1994) classification is applied:
| Strength | Meaning | Example |
|---|---|---|
| Strict | Can never be defeated | “No admin API access without authentication” |
| Defeasible | Can be defeated by exceptions | “One function per file” |
| Defeater | Only blocks other rules, makes no claim of its own | “Test files are exceptions” |
Strict rules reject attack edges. Defeaters only attack and have no judgment of their own. This structurally expresses the enforcement level of rules.
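One way a graph builder could enforce the first of these constraints is sketched below. The Rule type, the Strength constants, and the AddAttack function are hypothetical stand-ins written for this illustration; they are not the toulmin API, and the sketch only covers the rejection of attack edges on strict rules.

```go
package main

import (
	"errors"
	"fmt"
)

// Strength follows the three-way classification in the table above.
type Strength int

const (
	Strict Strength = iota
	Defeasible
	Defeater
)

// Rule is a hypothetical stand-in for a node in the rule graph.
type Rule struct {
	Name      string
	Strength  Strength
	Attackers []*Rule
}

// AddAttack records that attacker defeats target. A strict target
// rejects the edge, so nothing can ever be declared against it.
func AddAttack(attacker, target *Rule) error {
	if target.Strength == Strict {
		return errors.New("strict rule " + target.Name + " cannot be attacked")
	}
	target.Attackers = append(target.Attackers, attacker)
	return nil
}

func main() {
	auth := &Rule{Name: "auth", Strength: Strict}
	oneFunc := &Rule{Name: "one-func", Strength: Defeasible}
	testFile := &Rule{Name: "test-file", Strength: Defeater}

	fmt.Println(AddAttack(testFile, oneFunc)) // edge accepted
	fmt.Println(AddAttack(testFile, auth))    // edge rejected
}
```

Rejecting the edge at declaration time, rather than ignoring it during evaluation, turns a mistaken exception against a strict rule into an immediate error.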
How Is It Different from Rego?
| | Rego | toulmin |
|---|---|---|
| Rule authoring | Must learn Rego DSL | Go functions |
| Exception handling | Manual default/else patterns | Declarative defeats graph |
| Judgment | Binary allow/deny | Continuous [-1, +1] |
| Rule justification | # METADATA (ignored by engine) | spec (part of the structure) |
| Rule strength | None | strict/defeasible/defeater |
| Engine size | Tens of thousands of lines | Hundreds of lines |
| Speed | Interpreter (parse -> AST -> evaluate) | Direct Go function calls |
Rego is broad — it has a Kubernetes, Terraform, and Envoy integration ecosystem. toulmin is deep — it has what Rego lacks (defeasibility, qualifier, backing).
Repositioning the Qualifier
In Toulmin’s original model, the Qualifier is attached to the Claim. “This patient probably should be given penicillin” — a modal qualifier expressing the confidence of the claim.
The toulmin engine repositions the Qualifier from the Claim to each Rule. In a rule engine, a claim is merely the validation target. “This file has 3 functions” — it’s a factual check, not something that needs a confidence level. What determines the quality of judgment is the rule’s confidence:
- “One function per file” — qualifier 1.0 (certain rule)
- “Recommended under 100 lines” — qualifier 0.7 (flexible rule)
Each Rule’s qualifier becomes the initial weight w(a) in h-Categoriser, and the final verdict takes over the role that the Qualifier played in Toulmin’s original model — the confidence of the judgment.
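As a worked illustration of how the qualifier feeds the verdict, take the formulas from the verdict section and a hypothetical scenario: a rule with qualifier 0.7, first with no active attackers, then with a single unattacked attacker of qualifier 1.0.

```latex
\begin{align*}
\text{No attackers:} \quad \mathrm{raw} &= \frac{0.7}{1 + 0} = 0.7, & \mathrm{verdict} &= 2 \times 0.7 - 1 = 0.4 \\
\text{One attacker } (w = 1.0)\text{:} \quad \mathrm{raw} &= \frac{0.7}{1 + 1.0} = 0.35, & \mathrm{verdict} &= 2 \times 0.35 - 1 = -0.3
\end{align*}
```

Under these formulas a flexible rule never reaches +1.0 even when nothing opposes it, and a single certain exception is enough to tip it below zero.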
Empirical Validation: Converting filefunc’s 22 Rules to Toulmin
filefunc is a code structure convention tool for LLM-native Go development. All 22 rules were converted to Toulmin warrants.
Strength Classification
| Strength | Count | Ratio | Examples |
|---|---|---|---|
| Strict | 15 | 68% | F1, F2, F3, F4, A1-A3, A6-A16 |
| Defeasible | 4 | 18% | Q1, Q2, Q3, C4 |
| Defeater | 3 | 14% | F5, F6, test file exception |
Most are strict — code structure conventions inherently minimize exceptions.
Quantitative Results
| Project | Files (before -> after) | Avg LOC/file (before -> after) | SRP violations resolved | Depth violations resolved |
|---|---|---|---|---|
| filefunc | — (compliant from start) | 25.1 | 0 | 0 |
| yongol | 87 -> 1,260 | 244 -> 25.4 | 66 -> 0 | 148 -> 0 |
| whyso | 12 -> 99 | 147.8 -> 24.4 | 12 -> 0 | 23 -> 0 |
yongol went from 87 files to 1,260. The number of files exploded, but average LOC dropped from 244 to 25.4. All 66 SRP violations and 148 depth violations went to 0.
Theoretical Foundation
There is no original theory. It’s all existing research:
| Element | Original Work |
|---|---|
| 6-element structure | Toulmin (1958) |
| strict/defeasible/defeater | Nute (1994) |
| h-Categoriser | Amgoud & Ben-Naim (2013) |
The originality lies in the discovery that these connect. Things that existed separately in philosophy (Toulmin), logic (Nute), and argumentation theory (Amgoud) for 60 years meet at a single point: the software rule engine.
Computing Contracts
The rule of law works not because judges are smart, but because the structure forces judgment. Rules exist, exceptions are declared, and verdicts are computed based on evidence.
toulmin moved this structure into code.
- Warrant = statute
- Backing = legislative intent
- Strength = mandatory vs. discretionary provision
- Rebuttal = exception clause
- Claim = case
- Ground = evidence
- h-Categoriser = verdict
Declare contracts (warrants), declare exceptions (rebuttals), supply evidence (grounds), and the verdict is computed.
Not by human judgment. By formula.
Acc(a) = w(a) / (1 + Σ Acc(attackers))
MIT License. github.com/park-jun-woo/toulmin
Changelog
- 2026-06-18: API update. Builder-pattern graph (Rule/Counter/Except, .With(), .Attacks()), rule signature func(ctx Context, specs Specs), Evaluate(ctx, EvalOption{Trace}), backing → spec, TypeScript port (rulecat) reflected
- 2026-03-22: Initial release