Design principles

VERDICT WEIGHT is built on five design principles, in priority order:
  1. Composition over monolith. Each stream is independently testable, independently replaceable, and produces an interpretable signal. The composed score is a function of those signals, not a black box trained over them.
  2. Calibration over rank-order. Confidence reported by the system must match correctness observed in the wild. A score of 0.9 must mean “right ~90% of the time.”
  3. Audit by default. Every scoring event is hashed, chained, and reproducible. Audit is not a feature toggle.
  4. Deterministic kill conditions. Adversarial detection and integrity violations route to a binary, registry-level abort, not a soft warning.
  5. No external runtime dependencies. Scoring proceeds without contacting any external service. Suitable for air-gapped and classified environments.

The eight streams

The framework is organized into two layers: core scoring streams (1–5), which produce the calibrated confidence value, and hardening streams (6–8), which protect that value against adversarial manipulation and tampering.

Stream 1: Evidence aggregation

Uncertainty-aware fusion of model outputs, retrieval, and structured priors.

Stream 2: Uncertainty quantification

Decomposition into aleatoric and epistemic components.
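
This page does not say how Stream 2 performs the decomposition. As an illustration only, one widely used approach for an ensemble of classifiers splits predictive entropy into an aleatoric part (the average entropy of each member) and an epistemic part (the remainder, which reflects disagreement between members). The function names below are hypothetical and are not part of VERDICT WEIGHT.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(x * math.log(x) for x in p if x > 0.0)


def decompose_uncertainty(member_probs):
    """Entropy-based split for an ensemble (assumed method, not the framework's).

    member_probs: list of probability vectors, one per ensemble member,
    all over the same classes.
    Returns (total, aleatoric, epistemic) in nats.
    """
    n = len(member_probs)
    k = len(member_probs[0])
    # Average the members' predictions class by class.
    mean = [sum(p[c] for p in member_probs) / n for c in range(k)]
    total = entropy(mean)                                  # entropy of the averaged prediction
    aleatoric = sum(entropy(p) for p in member_probs) / n  # average per-member entropy
    epistemic = max(0.0, total - aleatoric)                # disagreement between members
    return total, aleatoric, epistemic
```

When every member returns the same distribution, the epistemic term is zero and all of the uncertainty is aleatoric. When members are individually certain but contradict one another, the aleatoric term is zero and all of the uncertainty is epistemic.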

Stream 3: Temporal stability

Penalizes confidence drift across semantically equivalent inputs.

Stream 4: Cross-source coherence

Rewards corroboration across independent signal sources.

Stream 5: Calibration

Post-hoc reliability correction.
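
The specific correction Stream 5 applies is not described here. The sketch below assumes histogram binning, one common post-hoc method, together with expected calibration error (ECE) as a way to check whether a score of 0.9 really means “right ~90% of the time.” All function names and the choice of ten bins are assumptions made for illustration, and scores are assumed to lie in [0, 1].

```python
def _bin_index(score, n_bins):
    """Map a score in [0, 1] to a bin; a score of exactly 1.0 goes in the last bin."""
    return min(int(score * n_bins), n_bins - 1)


def fit_histogram_binning(scores, correct, n_bins=10):
    """Learn the empirical accuracy of each score bin from held-out data."""
    hits = [0] * n_bins
    counts = [0] * n_bins
    for s, c in zip(scores, correct):
        b = _bin_index(s, n_bins)
        counts[b] += 1
        hits[b] += int(c)
    # Bins with no data fall back to their midpoint, leaving scores roughly unchanged.
    return [hits[b] / counts[b] if counts[b] else (b + 0.5) / n_bins
            for b in range(n_bins)]


def calibrate(score, bin_accuracy):
    """Replace a raw score with the observed accuracy of its bin."""
    return bin_accuracy[_bin_index(score, len(bin_accuracy))]


def expected_calibration_error(scores, correct, n_bins=10):
    """Weighted average gap between mean confidence and accuracy per bin."""
    conf_sum = [0.0] * n_bins
    hits = [0] * n_bins
    counts = [0] * n_bins
    for s, c in zip(scores, correct):
        b = _bin_index(s, n_bins)
        conf_sum[b] += s
        hits[b] += int(c)
        counts[b] += 1
    n = len(scores)
    return sum(counts[b] / n * abs(hits[b] / counts[b] - conf_sum[b] / counts[b])
               for b in range(n_bins) if counts[b])
```

In practice the bin table would be fit on held-out data and ECE measured on a separate evaluation set, so that the reported calibration is not measured on the same data used to fit the correction.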

Stream 6: SIS / Curveball detection

Detects adversarial inputs designed to perturb confidence.

Stream 7: CPS / hash-chain integrity

Cryptographic provenance for every scoring event.
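
To make the idea of a hash chain concrete, here is a minimal sketch in which each record stores the hash of its predecessor, so altering any earlier event invalidates everything after it. The record layout, the SHA-256 choice, the all-zero genesis value, and the function names are assumptions for illustration; the actual Stream 7 format is not specified on this page.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first record


def _digest(prev_hash, payload):
    """Hash the previous record's hash together with a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_event(chain, payload):
    """Append a scoring event, linking it to the current tail of the chain."""
    prev = chain[-1]["hash"] if chain else GENESIS
    record = {"payload": payload, "prev": prev, "hash": _digest(prev, payload)}
    chain.append(record)
    return record


def verify_chain(chain):
    """Recompute every link in a single pass over the chain."""
    prev = GENESIS
    for record in chain:
        if record["prev"] != prev:
            return False
        if record["hash"] != _digest(prev, record["payload"]):
            return False
        prev = record["hash"]
    return True
```

Verification visits each record once, which is consistent with the O(N) cost listed for audit verification in the computational profile.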

Stream 8: RIS / registry kill switch

Binary abort on integrity compromise.

Composition

The streams compose under a formal rule documented in Eight-stream composition. Two properties of that rule are worth surfacing here:
  1. Hardening streams have veto power. A compromised audit chain (Stream 7) or a triggered kill switch (Stream 8) overrides the composed core score regardless of how high it would otherwise be.
  2. Abstention is a first-class output. When the core streams disagree past a configurable threshold, the framework returns abstention rather than a forced classification. Abstention is the right answer often enough that producing it is a feature, not a failure mode.
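
The two properties above can be sketched as a small decision function. This is not the formal composition rule: the `Verdict` type, the use of max-minus-min spread as the disagreement measure, the default threshold of 0.3, and the plain mean standing in for the composed score are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Verdict:
    status: str                    # "score", "abstain", or "abort"
    score: Optional[float] = None  # populated only when status == "score"
    reason: str = ""


def compose(core_signals: Sequence[float],
            chain_intact: bool,
            kill_switch_triggered: bool,
            disagreement_threshold: float = 0.3) -> Verdict:
    """Hypothetical composition: hardening vetoes first, then abstention, then a score."""
    # Hardening streams override the core score no matter how high it is.
    if kill_switch_triggered:
        return Verdict("abort", reason="registry kill switch triggered (Stream 8)")
    if not chain_intact:
        return Verdict("abort", reason="audit chain compromised (Stream 7)")

    # Abstain when the core streams disagree past the configured threshold.
    spread = max(core_signals) - min(core_signals)
    if spread > disagreement_threshold:
        return Verdict("abstain", reason="core streams disagree")

    # Placeholder aggregate; the real rule is in Eight-stream composition.
    return Verdict("score", score=sum(core_signals) / len(core_signals))
```

Checking the vetoes before looking at the core signals keeps the abort path independent of the score, so a high composed value can never mask an integrity failure.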

Computational profile

Layer                                  Complexity
Core streams (single decision)         O(1)
Pipeline (K decisions)                 O(K)
Audit verification (chain length N)    O(N)

See Complexity analysis for the full derivation and empirical wall-clock measurements.

What the framework does not do

  • It does not train a model. VERDICT WEIGHT scores decisions produced by an upstream model stack you already control.
  • It does not replace human review. It produces a calibrated gate. The gate threshold is a policy decision; it is not the framework’s job to decide what is “confident enough.”
  • It does not promise to detect every adversarial input. Stream 6 raises the cost of a particular attack class (Curveball / confidence-flip). It does not claim universal robustness, and the head-to-head comparison is honest about what is and is not measured.