Metrics¶
Pure functions over already-computed pipeline outputs, plus the typed report
dataclasses they return. Each evaluator takes ground truth as a keyword-only
gold argument. Per Architecture (ADR-0002) these live in the metrics
layer, not in core — nothing in the contract depends on them.
Functions¶
Precision/recall/F1 over all candidate pairs against |
|
Pair-completeness@k for each k in |
|
Single-k pair-completeness (kwarg |
|
B³ clustering quality of |
|
Sweep a similarity threshold over scored |
|
Decompose end-to-end recall into matcher x blocker components. |
Reports¶
Contract: pairs that errored (a |
|
Pair-completeness@k. |
|
B³ (Bagga-Baldwin) clustering quality. |
|
Full precision/recall/F1 curve over a threshold grid. |
|
End-to-end honest number: matcher metrics adjusted by blocker pair-completeness@k ( |