Shadow tribunals - second opinions beside the run, not inside the myth
A strong agent system does not need one louder voice. It needs a primary path, bounded shadow judges, and a clear rule for what disagreement can and cannot do.
A second opinion should sit beside the run, not inside the story about the run.
The pattern that delivers this is a shadow tribunal.
The primary agent path still does the work. One role still owns the main artifact, brief, or verifier pass. But one nearby judge, or two, watches the same boundary and records whether the primary path still looks sane.
The shadow does not exist to add drama. The shadow exists to catch drift early.
What a shadow tribunal is
A shadow tribunal is a bounded set of second-opinion roles running beside the primary path.
Important parts of the definition:
- the primary path stays primary
- the shadow roles have explicit names
- the shadow roles have explicit scope
- the system records disagreement
- promotion power is a separate decision
The last part matters: a judge can watch a run without having any power to stop it.
A shadow judge can exist in three useful modes:
- non-blocking observer
- warning surface
- blocking authority
Most systems should start with the first mode.
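The definition and the three modes can be written down as data. The following is a minimal sketch, not the shape used in any Trinity repo: `ShadowMode`, `ShadowRole`, and `may_promote` are hypothetical names chosen for illustration. It shows a role with an explicit name and scope, and promotion power held as a separate field that defaults to the non-blocking observer.

```python
from dataclasses import dataclass
from enum import Enum


class ShadowMode(Enum):
    """Hypothetical names for the three modes listed above."""
    OBSERVER = "observer"  # non-blocking: record only
    WARNING = "warning"    # surface disagreement, still non-blocking
    BLOCKING = "blocking"  # may stop promotion of the primary output


@dataclass(frozen=True)
class ShadowRole:
    name: str   # explicit name
    scope: str  # the one boundary or risk this role watches
    mode: ShadowMode = ShadowMode.OBSERVER  # start in the first mode


def may_promote(role: ShadowRole, disagrees: bool) -> bool:
    """Only a blocking role that disagrees can hold back the primary output."""
    return not (role.mode is ShadowMode.BLOCKING and disagrees)


observer = ShadowRole(name="voice-judge", scope="voice drift")
blocker = ShadowRole(name="safety-judge", scope="safety framing loss",
                     mode=ShadowMode.BLOCKING)
```

Keeping `mode` as its own field means granting authority later is a one-line change to a role, not a redesign of the judge.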
Why the shadow belongs beside the run
Many systems talk as if quality lives inside the best prompt, the best model, or the best orchestrator.
That story gets brittle fast.
A system can feel stable and still drift:
- a provider update changes tone or refusal behavior
- a retrieval surface starts surfacing weaker context
- a verifier path gets too forgiving
- a teaching loop starts sounding flatter
- a bounded brief quietly broadens into a vague summary
The primary path may still “work.” The shadow exists to name the change while the cost is still small.
This is why the tribunal belongs beside the run. A later postmortem is too late to be useful as a day-to-day sentinel.
The Trinity version
The StoneyTECH Trinity repos already ship the seams for this move:
- shadow/tribunal-config.example.json
- agents/graph-map.json
- integrations/n8n/workflow-stub.jsonc
StoneyTECH-Trinity-Learning-Agent hints at:
- shadow draft judges
- shadow study judges
StoneyTECH-Trinity-Evidence-Agent hints at:
- shadow brief judges reviewing the bounded evidence output
StoneyTECH-Trinity-GVAR-Engine hints at:
- shadow judges beside the verifier loop
- weekly shadow tournaments over retained receipts
The public pattern is already visible. The article simply names what the seams are for.
What the shadow should judge
A shadow judge should not score “everything.”
A useful shadow role watches one narrow risk:
- voice drift
- source drift
- verifier softness
- safety framing loss
- confidence inflation
The narrowness is what keeps the tribunal from turning into theater.
If the shadow role watches one thing, disagreement means something. If the shadow role watches the whole universe, disagreement becomes mush.
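As an illustration of how narrow a useful judge can be, here is a sketch of a source-drift judge. Everything in it is an assumption made for the example: the `Verdict` type, the `source_drift_judge` function, and the idea that the primary output exposes a list of cited source identifiers. The judge answers one question only, which is why its disagreement is easy to read.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Verdict:
    risk: str     # the single named risk this judge watches
    agrees: bool  # True when the primary output looks sane for that risk
    note: str     # short human-readable reason


def source_drift_judge(cited: list[str], allowed: set[str]) -> Verdict:
    """Flag any cited source that is not on the allowed list (hypothetical rule)."""
    unknown = sorted(set(cited) - allowed)
    if unknown:
        return Verdict("source drift", False,
                       "unlisted sources: " + ", ".join(unknown))
    return Verdict("source drift", True,
                   "all cited sources are on the allowed list")


verdict = source_drift_judge(["handbook", "forum-post"], {"handbook"})
```

Here `verdict.agrees` is `False` and the note names the one offending source. A judge asked to score "everything" could not produce a note this specific.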
What disagreement should do first
The safest first policy is:
- primary path continues
- shadow path records disagreement
- disagreement lands in the trace or receipt
- retros and tournaments compare outcomes later
This gives three benefits quickly:
- drift becomes visible
- the primary path stays fast
- the team learns whether the shadow is useful before giving it authority
Only after repeated evidence should a shadow role gain blocking power.
This is the cheapest honest path.
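The first policy above can be sketched in a few lines. This is an assumed shape, not code from the Trinity repos: `run_with_shadows` and the receipt keys are hypothetical. The primary output is returned unchanged, each shadow verdict lands in the receipt, and a crashing shadow judge is recorded without interrupting the primary path.

```python
from typing import Callable


def run_with_shadows(
    primary: Callable[[str], str],
    shadows: dict[str, Callable[[str, str], bool]],
    task: str,
) -> tuple[str, dict]:
    """Run the primary path, then let each shadow judge record a verdict."""
    output = primary(task)  # the primary path continues regardless
    receipt = {"task": task, "output": output, "shadow": {}}
    for name, judge in shadows.items():
        try:
            receipt["shadow"][name] = {"agrees": judge(task, output)}
        except Exception as exc:
            # A broken shadow must never take the primary path down with it.
            receipt["shadow"][name] = {"agrees": None, "error": repr(exc)}
    return output, receipt


output, receipt = run_with_shadows(
    primary=lambda task: task.upper(),
    shadows={
        "length-judge": lambda task, out: len(out) < 3,
        "broken-judge": lambda task, out: 1 / 0 > 0,
    },
    task="hello",
)
```

The sketch runs shadows inline after the primary for simplicity. A real system might run them asynchronously so the primary path stays fast.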
Why weekly tournaments matter
The GVAR ledger already points at the next useful move: compare retained runs over a short horizon.
A shadow tournament helps answer:
- which primary path produced cleaner outcomes
- which shadow judge catches useful drift
- which shadow judge is noisy
- whether disagreement predicts later rework
The move turns the tribunal from folklore into evidence.
A week is a good first window because the memory stays fresh and the storage stays cheap.
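A tournament over one week of retained receipts can be a small aggregation. The sketch below assumes a receipt shape that is not taken from the GVAR ledger: each receipt holds a `shadow` map of judge name to `{"agrees": bool}` and a `reworked` flag set later by whoever had to fix the output. The `tournament` function and the `hit_rate` metric are hypothetical. The hit rate is the share of a judge's disagreements that were followed by rework, which speaks to the last question in the list.

```python
from collections import defaultdict


def tournament(receipts: list[dict]) -> dict[str, dict]:
    """Per-judge counts of runs, disagreements, and disagreements followed by rework."""
    stats = defaultdict(lambda: {"runs": 0, "disagreed": 0, "hits": 0})
    for receipt in receipts:
        for judge, verdict in receipt["shadow"].items():
            s = stats[judge]
            s["runs"] += 1
            if verdict["agrees"] is False:
                s["disagreed"] += 1
                if receipt["reworked"]:
                    s["hits"] += 1
    report = {}
    for judge, s in stats.items():
        # hit_rate is None when the judge never disagreed: no evidence either way.
        hit_rate = s["hits"] / s["disagreed"] if s["disagreed"] else None
        report[judge] = {**s, "hit_rate": hit_rate}
    return report


week = [
    {"shadow": {"voice": {"agrees": False}, "source": {"agrees": True}}, "reworked": True},
    {"shadow": {"voice": {"agrees": False}, "source": {"agrees": False}}, "reworked": False},
    {"shadow": {"voice": {"agrees": True}, "source": {"agrees": True}}, "reworked": False},
]
report = tournament(week)
```

A judge with many disagreements and a low hit rate is a candidate for "noisy". A judge with few disagreements and a high hit rate is catching useful drift.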
What the shadow should never become
A shadow tribunal should not become a mystical chorus.
Warning signs:
- too many judges
- no named risk per judge
- no written disagreement policy
- no receipts
- no retirement path for noisy judges
The result looks sophisticated and teaches nothing.
The good version stays almost boring:
- one primary path
- one or two narrow shadow roles
- one receipt trail
- one explicit rule about whether the shadow can block
The shape is teachable. The shape is upgradeable. The shape survives contact with code.
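The warning signs translate directly into a configuration check. The sketch below is illustrative only: the keys `judges`, `risk`, `retire_when`, `disagreement_policy`, and `receipt_path` are hypothetical and are not the schema of shadow/tribunal-config.example.json. The limit of two judges mirrors the "one or two narrow shadow roles" in the list above.

```python
def lint_tribunal(config: dict, max_judges: int = 2) -> list[str]:
    """Return one message per warning sign found in a tribunal config."""
    problems = []
    judges = config.get("judges", [])
    if len(judges) > max_judges:
        problems.append(f"too many judges: {len(judges)} > {max_judges}")
    for judge in judges:
        name = judge.get("name", "?")
        if not judge.get("risk"):
            problems.append(f"judge {name!r} has no named risk")
        if not judge.get("retire_when"):
            problems.append(f"judge {name!r} has no retirement path")
    if config.get("disagreement_policy") not in {"record", "warn", "block"}:
        problems.append("no written disagreement policy")
    if not config.get("receipt_path"):
        problems.append("no receipt trail configured")
    return problems


boring = {
    "judges": [{"name": "voice", "risk": "voice drift",
                "retire_when": "hit rate below 0.2 for four weeks"}],
    "disagreement_policy": "record",
    "receipt_path": "receipts/",
}
chorus = {"judges": [{"name": "a"}, {"name": "b"}, {"name": "c"}]}
```

`lint_tribunal(boring)` returns an empty list, while `lint_tribunal(chorus)` reports every warning sign at once.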
The right growth path
Start here:
- one primary path
- one non-blocking shadow judge
- disagreement in the receipt
- weekly replay or tournament
Grow later into:
- multiple shadow roles
- n8n fan-out
- provider diversity in the shadow set
- blocking authority for proven judges
The order keeps the tribunal earned instead of decorative.
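One way to keep authority earned is to derive a judge's mode from its tournament history. This is a sketch under stated assumptions: the `earned_mode` function, the four-week minimum, and the 0.6 threshold are all hypothetical values for illustration, not recommendations from the Trinity repos. Each entry in the input is the fraction of that week's disagreements that were followed by rework, or `None` for a week with no disagreements.

```python
from typing import Optional


def earned_mode(
    weekly_hit_rates: list[Optional[float]],
    min_weeks: int = 4,
    min_hit_rate: float = 0.6,
) -> str:
    """Map a judge's weekly hit rates to 'observer', 'warning', or 'blocking'."""
    # Weeks without any disagreement carry no evidence, so they are skipped.
    scored = [rate for rate in weekly_hit_rates if rate is not None]
    if len(scored) < min_weeks:
        return "observer"  # not enough repeated evidence yet
    recent = scored[-min_weeks:]
    if all(rate >= min_hit_rate for rate in recent):
        return "blocking"  # consistently useful in every recent scored week
    if sum(recent) / len(recent) >= min_hit_rate:
        return "warning"   # useful on average, but not yet consistent
    return "observer"


new_judge = earned_mode([0.9, 0.8])
proven_judge = earned_mode([0.7, 0.8, None, 0.9, 0.6])
uneven_judge = earned_mode([0.9, 0.9, 0.9, 0.5])
```

Under these assumed thresholds the new judge stays an observer, the proven judge earns blocking authority, and the uneven judge is limited to warnings.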
Why this matters for public pattern repos
A public agent repo becomes more generous when it shows how second opinions enter the system without pretending full governance is already solved.
This is what the Trinity repos now do.
They do not ship a grand council. They ship the seams:
- shadow config
- graph role map
- workflow stub
- retained receipts where comparison can happen
The seam set is enough for a reader to start adding second opinions the way the rest of the StoneyTECH corpus teaches: small first, explicit first, inspectable first.
Axioms applied in this essay
This article tested 6 of the StoneyTECH engineering axioms. Each verdict is the result of applying that axiom in this specific argument.
- #1 The smallest lever wins: held
The article starts with non-blocking shadow judges before full panel authority.
- #2 Push work down toward determinism: held
Determinism moves into named shadow roles, explicit disagreement policy, and recorded outcomes.
- #5 Never trust 'running' without sentinels: held
Second opinions become sentinels with visible boundaries instead of hidden reassurance.
- #13 Ship with the failure mode named: held
The article names the failure mode plainly: silent drift in the primary path.
- #14 Two cheaper alternatives first: held
Shadow judges start as non-blocking readers before they gain promotion power.
- #16 Don't comment without building. Don't curate without proving: held
The Trinity repos already ship shadow tribunal seams in public, so the article can point to working structure instead of only describing it.
