Purpose

A confidence score that does not distinguish what the system does not yet know from what is inherently noisy is operationally useless. Stream 2 performs that decomposition.
  • Aleatoric uncertainty — irreducible noise in the input itself. More data does not help.
  • Epistemic uncertainty — gaps in what the model has learned. More data, or different data, would help.
A high aleatoric reading means the input itself is ambiguous; the system should not be confident, and no amount of evidence-gathering will change that. A high epistemic reading means the input is outside the model’s reliable operating envelope; the system should escalate to human review or to a more capable model.
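
The two readings call for different responses, and a small routing rule makes that explicit. The sketch below is illustrative only: the function name `route`, the action labels, the 0.5 cut-offs, and the choice to check the epistemic reading first are all assumptions, not part of the framework's documented interface.

```python
from enum import Enum


class Action(Enum):
    PROCEED = "proceed"
    LOW_CONFIDENCE = "report_low_confidence"  # ambiguous input; more evidence will not help
    ESCALATE = "escalate"                     # hand off to human review or a stronger model


def route(aleatoric: float, epistemic: float,
          aleatoric_cutoff: float = 0.5,
          epistemic_cutoff: float = 0.5) -> Action:
    """Pick a response from the two readings (hypothetical cut-offs on a 0..1 scale)."""
    # Assumption: epistemic is checked first, so an input outside the reliable
    # operating envelope is escalated even if it also looks noisy.
    if epistemic >= epistemic_cutoff:
        return Action.ESCALATE
    if aleatoric >= aleatoric_cutoff:
        return Action.LOW_CONFIDENCE
    return Action.PROCEED
```

For example, `route(aleatoric=0.9, epistemic=0.1)` yields `Action.LOW_CONFIDENCE`, while `route(aleatoric=0.1, epistemic=0.9)` yields `Action.ESCALATE`.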

What the stream does

1. Estimate total uncertainty

From the upstream evidence (entropy of model logits, retrieval-set diversity, source disagreement), produce a total uncertainty estimate.

2. Decompose into aleatoric and epistemic components

Apply the variance-decomposition rule to split total uncertainty into the two components.

3. Penalize the contribution accordingly

Both components reduce the stream’s confidence contribution $c_2$, but they reduce it through different mechanisms.

4. Surface the decomposition in the audit record

The aleatoric / epistemic split is recorded for downstream review, not just folded silently into the score.
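
One way to picture the estimate, split, and record steps is the law of total variance applied to a small ensemble. This is a minimal sketch under stated assumptions: the page does not say which variance-decomposition rule Stream 2 uses, so the ensemble of per-member probabilities, the function names `decompose` and `audit_record`, and the record fields are hypothetical. The penalty on $c_2$ is left out because its mechanisms are not described here.

```python
from statistics import fmean, pvariance
from typing import Dict, Sequence


def decompose(member_probs: Sequence[float]) -> Dict[str, float]:
    """Split the predictive variance of a binary outcome across ensemble members.

    member_probs[k] is member k's probability for the positive class
    (an assumed stand-in for the upstream evidence).
    """
    mean_p = fmean(member_probs)
    total = mean_p * (1.0 - mean_p)                          # Var[y]
    aleatoric = fmean(p * (1.0 - p) for p in member_probs)   # E[Var[y | member]]
    epistemic = pvariance(member_probs)                      # Var[E[y | member]]
    return {"total": total, "aleatoric": aleatoric, "epistemic": epistemic}


def audit_record(member_probs: Sequence[float]) -> Dict[str, float]:
    """Keep the split visible for review (hypothetical field names)."""
    return {"stream": 2, **decompose(member_probs)}
```

With this choice of rule the two components always sum to the total, so the audit record can be checked for internal consistency: `audit_record([0.2, 0.4, 0.9])` reports a total of 0.25 split into roughly 0.163 aleatoric and 0.087 epistemic.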

Why both components matter

Conflating the two components is a common reason calibration fails out of distribution. A model can be well calibrated on in-distribution data and badly miscalibrated out of distribution, precisely because its epistemic uncertainty was never measured. Stream 2’s decomposition lets the framework surface “I don’t know what I don’t know” as a first-class output instead of hiding it inside a single conflated number.
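
A small worked case shows how a single number can hide the distinction. Assume, purely for illustration, a binary outcome scored by two ensemble members with probabilities $p_1, p_2$ and a split by the law of total variance; neither assumption is taken from the framework's specification.

```latex
\begin{align*}
\text{Case A: } p_1 = p_2 = 0.5
  &\;\Rightarrow\; \bar p = 0.5,\quad
  \underbrace{\bar p\,(1-\bar p)}_{\text{total}} = 0.25,\quad
  \underbrace{\tfrac12\sum_{k} p_k(1-p_k)}_{\text{aleatoric}} = 0.25,\quad
  \underbrace{\tfrac12\sum_{k} (p_k-\bar p)^2}_{\text{epistemic}} = 0 \\
\text{Case B: } p_1 = 0,\ p_2 = 1
  &\;\Rightarrow\; \bar p = 0.5,\quad
  \text{total} = 0.25,\quad
  \text{aleatoric} = 0,\quad
  \text{epistemic} = 0.25
\end{align*}
```

Both cases report the same total of 0.25. Case A is an inherently ambiguous input on which the members agree; Case B is an input on which they disagree completely. Only the split tells them apart.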

Interaction with other streams

  • Stream 2’s epistemic estimate feeds Stream 4 (cross-source coherence) as a reliability signal.
  • Stream 2’s total uncertainty feeds Stream 5 (calibration) as an input to the reliability map.
  • A high epistemic reading can independently trigger abstention if it crosses the configured threshold.
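
The three interactions above can be sketched as a single fan-out step. The function name `fan_out`, the dictionary keys, and the strict `>` comparison are assumptions made for illustration; the page only states which quantity goes to which stream and that a configured threshold exists.

```python
from typing import Dict, Union


def fan_out(total: float, epistemic: float,
            abstain_threshold: float) -> Dict[str, Union[float, bool]]:
    """Route Stream 2's readings to their downstream consumers (hypothetical keys)."""
    return {
        # Epistemic estimate goes to Stream 4 as a reliability signal.
        "stream4_reliability_signal": epistemic,
        # Total uncertainty goes to Stream 5 as an input to the reliability map.
        "stream5_total_uncertainty": total,
        # Abstention depends on the epistemic reading alone.
        "abstain": epistemic > abstain_threshold,
    }
```

Note that abstention is decided from the epistemic reading alone: a call such as `fan_out(total=0.25, epistemic=0.02, abstain_threshold=0.15)` does not abstain even though total uncertainty is high.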

What this stream does not do

  • It does not detect adversarial inputs. Adversarial inputs are typically crafted to produce a wrong answer while keeping observable uncertainty low, so an uncertainty decomposition cannot be relied on to flag them. Detection is the job of Stream 6.
  • It does not provide a Bayesian posterior. The decomposition is variance-based, not posterior-based, by design — it does not require a tractable posterior over model weights.