denselinkage.metrics.blocking_metrics

denselinkage.metrics.blocking_metrics(candidates: Sequence[CandidatePair], *, gold: LabeledPairs, ks: Sequence[int], directed: bool = True) → BlockingMetrics

Pair-completeness@k for each k in ks.

Candidates are grouped by query record (record_b) and ranked by similarity; PC@k is the fraction of gold pairs recalled when each query keeps its top-k candidates. Candidates are expected to be blocker-oriented (record_a indexed/reference, record_b query), as produced by BlockingIndex.query. To sweep ks meaningfully, pass the blocker’s full ranked retrieval (a large top_k), not an already-truncated set.
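The grouping and ranking described above can be sketched in plain Python. This is an illustration of the PC@k definition, not the library's implementation: Pair is a hypothetical stand-in for CandidatePair (its similarity field name is an assumption), gold is simplified to a set of ordered (record_a, record_b) tuples rather than a LabeledPairs object, ties in similarity keep input order, and returning 0.0 for an empty gold set is an arbitrary choice.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Sequence, Set, Tuple


class Pair(NamedTuple):
    # Hypothetical stand-in for CandidatePair; field names are assumptions.
    record_a: str  # indexed / reference record
    record_b: str  # query record
    similarity: float


def pair_completeness_at_k(
    candidates: Iterable[Pair],
    gold: Set[Tuple[str, str]],
    ks: Sequence[int],
) -> Dict[int, float]:
    """Return {k: fraction of gold pairs kept when each query keeps its top-k}."""
    # Group candidates by query record (record_b).
    by_query: Dict[str, List[Pair]] = defaultdict(list)
    for cand in candidates:
        by_query[cand.record_b].append(cand)

    # Rank each query's candidates by descending similarity (stable on ties).
    for group in by_query.values():
        group.sort(key=lambda c: c.similarity, reverse=True)

    result: Dict[int, float] = {}
    for k in ks:
        # Keep the top-k candidates of every query, as ordered pair keys.
        kept = {
            (c.record_a, c.record_b)
            for group in by_query.values()
            for c in group[:k]
        }
        # Assumption: an empty gold set yields 0.0 rather than raising.
        result[k] = len(kept & gold) / len(gold) if gold else 0.0
    return result


example_candidates = [
    Pair("a1", "q1", 0.90),
    Pair("a2", "q1", 0.80),
    Pair("a3", "q2", 0.70),
    Pair("a4", "q2", 0.95),
]
example_gold = {("a2", "q1"), ("a4", "q2")}
example_result = pair_completeness_at_k(example_candidates, example_gold, [1, 2])
```

In this toy input, k=1 keeps only the best candidate per query, so the gold pair ("a2", "q1") is missed and PC@1 is 0.5; at k=2 both gold pairs are kept and PC@2 is 1.0. If the input had already been truncated to one candidate per query, PC@2 could not rise above PC@1, which is why the full ranked retrieval should be passed.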

Pair identity (D1): same rule as linkage_metrics. directed=True (link) compares ordered pairs; directed=False (dedupe) canonicalizes each pair to an unordered key.
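One way to picture the two identity modes is a small key function. This is a hypothetical sketch: pair_key is not part of the library, and sorting the two identifiers is only one possible canonicalization, assumed here for illustration; the actual D1 rule is defined with linkage_metrics.

```python
from typing import Tuple


def pair_key(record_a: str, record_b: str, directed: bool = True) -> Tuple[str, str]:
    """Hypothetical helper: build the key used to compare a pair against gold."""
    if directed:
        # Link mode: (a, b) and (b, a) are different pairs.
        return (record_a, record_b)
    # Dedupe mode (assumed canonicalization): sort so order no longer matters.
    first, second = sorted((record_a, record_b))
    return (first, second)
```

With directed=False, pair_key("r7", "r2", directed=False) and pair_key("r2", "r7", directed=False) produce the same key, so a gold pair matches regardless of which side the blocker treated as the query.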