Layer-4 validation
Tag-quality labelling
Blind-label a random sample per field so the harness can measure model-vs-human agreement. Once a field clears the bar (30 labels, Cohen's κ ≥ 0.60), its dimension renders as trusted instead of experimental. Label what you see — the model's answer is hidden to avoid bias.
Production quality — 0/30 labelled
—