The full formal treatment is in Paper 3: Unified Architecture. This page summarizes the result and points to the relevant sections.
What “completeness” means here
VERDICT WEIGHT does not claim to cover every conceivable failure mode of every conceivable AI system. It claims that for the target threat surface (autonomous decisioning under adversarial conditions in high-stakes deployments) the eight streams collectively cover the failure classes documented below, and that no smaller subset of the streams retains that coverage. The proof has two parts:
- Coverage: every failure class in the target taxonomy is detected, suppressed, or surfaced by at least one stream.
- Necessity: removing any one stream leaves at least one failure class undetected. This is what the ablation studies measure empirically.
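
Stated symbolically, the two parts might read as follows. The notation is an assumption introduced here for illustration and is not taken from Paper 3: $\mathcal{F}$ is the set of failure classes in the target taxonomy, $\mathcal{S}$ is the set of eight streams, $M(f)$ is the set of mechanisms that detect, suppress, or surface class $f$, and $D(f \mid \mathcal{T})$ is the measured detection rate of class $f$ when only the streams in $\mathcal{T} \subseteq \mathcal{S}$ are active.

$$
\textbf{Coverage:}\quad \forall f \in \mathcal{F}:\; M(f) \neq \varnothing
$$

$$
\textbf{Necessity:}\quad \forall s \in \mathcal{S}\;\; \exists f \in \mathcal{F}:\; D\bigl(f \mid \mathcal{S} \setminus \{s\}\bigr) < D\bigl(f \mid \mathcal{S}\bigr)
$$

The necessity condition is written in its measurable form, a drop in detection rate, because that is the quantity an ablation study can observe.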
Failure taxonomy
| # | Failure class | Mitigated by |
|---|---|---|
| F1 | Miscalibrated model confidence | Streams 1, 5 |
| F2 | Source-correlation collapse (naive Bayes failure) | Stream 4 |
| F3 | Aleatoric / epistemic conflation | Stream 2 |
| F4 | Confidence drift on semantically equivalent inputs | Stream 3 |
| F5 | Confidence-flip adversarial input (Curveball class) | Stream 6 |
| F6 | Tampering with historical decisions | Stream 7 |
| F7 | Compromise of the scoring layer itself | Stream 8 |
| F8 | Forced classification under contradictory evidence | Composition rule (abstention) |
Coverage argument
For each failure class above, the corresponding stream(s) produce a signal that:
- Lowers the composed confidence (F1–F4), or
- Triggers abstention (F8), or
- Triggers veto / abort (F5–F7).
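
The coverage argument can be checked mechanically against the taxonomy. The following is a minimal sketch, not the project's implementation: the names `Response`, `MITIGATED_BY`, `RESPONSE_BY_CLASS`, and `uncovered_classes` are hypothetical, and the two dictionaries are simply transcribed from the table and the list above.

```python
from enum import Enum


class Response(Enum):
    """The three signal types named in the coverage argument."""
    LOWER_CONFIDENCE = "lower composed confidence"
    ABSTAIN = "abstain"
    VETO = "veto / abort"


# Failure class -> mitigating mechanism(s), transcribed from the taxonomy table.
MITIGATED_BY = {
    "F1": {"Stream 1", "Stream 5"},
    "F2": {"Stream 4"},
    "F3": {"Stream 2"},
    "F4": {"Stream 3"},
    "F5": {"Stream 6"},
    "F6": {"Stream 7"},
    "F7": {"Stream 8"},
    "F8": {"Composition rule"},
}

# Failure class -> response, transcribed from the coverage argument.
RESPONSE_BY_CLASS = {
    "F1": Response.LOWER_CONFIDENCE,
    "F2": Response.LOWER_CONFIDENCE,
    "F3": Response.LOWER_CONFIDENCE,
    "F4": Response.LOWER_CONFIDENCE,
    "F5": Response.VETO,
    "F6": Response.VETO,
    "F7": Response.VETO,
    "F8": Response.ABSTAIN,
}


def uncovered_classes(mitigated_by, response_by_class):
    """Return failure classes that lack a mitigating mechanism or a response."""
    return sorted(
        failure_class
        for failure_class, mechanisms in mitigated_by.items()
        if not mechanisms or failure_class not in response_by_class
    )


if __name__ == "__main__":
    # An empty list means every class has both a mechanism and a response.
    print(uncovered_classes(MITIGATED_BY, RESPONSE_BY_CLASS))
```

This only verifies that the mapping is total. It says nothing about how well each stream detects its class in practice, which is what the empirical studies address.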
Necessity argument
Necessity is established empirically through ablation. For each stream, we measure the change in the detection rate of each failure class when that stream is removed:
- Removing Stream 1 degrades F1 detection (raw scores re-enter the aggregate uncorrected).
- Removing Stream 2 collapses F3 detection (no aleatoric/epistemic split).
- Removing Stream 3 silently re-admits F4 (drift becomes invisible).
- Removing Stream 4 re-introduces F2 (correlated sources count as independent).
- Removing Stream 5 removes the calibration map and degrades F1 detection (raw aggregates become systematically overconfident).
- Removing Stream 6 re-admits F5 (the Curveball attack class becomes undetectable).
- Removing Stream 7 removes audit-chain integrity (F6 becomes silent).
- Removing Stream 8 removes the registry kill switch (the response to F7 cannot be enforced).
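
The structural side of the ablation can be sketched from the taxonomy table alone. This is an illustrative assumption, not the measurement procedure used in the studies: it only reports which failure classes lose their sole mitigating stream and which keep another, and it does not compute detection rates. The names `MITIGATED_BY` and `ablate` are hypothetical.

```python
# Failure class -> mitigating streams, transcribed from the taxonomy table.
# F8 is omitted because it is handled by the composition rule, not a stream.
MITIGATED_BY = {
    "F1": {1, 5},
    "F2": {4},
    "F3": {2},
    "F4": {3},
    "F5": {6},
    "F6": {7},
    "F7": {8},
}


def ablate(stream, mitigated_by=MITIGATED_BY):
    """Classify the structural effect of removing one stream.

    Returns (lost, weakened): classes left with no mitigating stream, and
    classes that still have at least one other mitigating stream.
    """
    lost, weakened = [], []
    for failure_class, streams in sorted(mitigated_by.items()):
        if stream not in streams:
            continue  # this class does not depend on the removed stream
        remaining = streams - {stream}
        (weakened if remaining else lost).append(failure_class)
    return lost, weakened


if __name__ == "__main__":
    for stream in range(1, 9):
        lost, weakened = ablate(stream)
        print(f"Stream {stream}: lost={lost} weakened={weakened}")
```

Under this mapping, removing Stream 1 or Stream 5 leaves F1 with one remaining mitigating stream, which is consistent with the list above describing F1 detection as degraded. Every other stream is the only mitigator of its class.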