denselinkage.mining.mine_hard_negatives

denselinkage.mining.mine_hard_negatives(candidates: Sequence[CandidatePair], *, gold: LabeledPairs, n: int | None = None, directed: bool = True) → list[CandidatePair]

The hardest negatives among candidates: scored pairs not in gold, ordered by descending similarity (ties broken by id for determinism).

These are the pairs the blocker ranked as similar but that gold says are non-matches: contrastive material that a Phase-C trainer turns into the negatives of a TrainingPairs. Unscored pairs (similarity_score is None) carry no hardness signal and are excluded. n caps the result at the n hardest pairs (None returns all). directed has the same meaning as in linkage_metrics; pass directed=False for dedupe candidates. Raises ValueError if n is negative.
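The behaviour described above can be illustrated with a small, self-contained sketch. It is not the library's implementation and does not import denselinkage. Pair is a hypothetical stand-in for CandidatePair, the left_id and right_id field names are assumptions, gold is modelled as a plain set of id tuples rather than a LabeledPairs, and directed=False is assumed to mean that (a, b) and (b, a) count as the same pair.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterable, Sequence


@dataclass(frozen=True)
class Pair:
    """Hypothetical stand-in for CandidatePair; field names are assumptions."""
    left_id: str
    right_id: str
    similarity_score: float | None = None


def sketch_mine_hard_negatives(
    candidates: Sequence[Pair],
    *,
    gold: Iterable[tuple[str, str]],
    n: int | None = None,
    directed: bool = True,
) -> list[Pair]:
    """Illustrative re-implementation of the documented behaviour."""
    if n is not None and n < 0:
        raise ValueError("n must be non-negative")

    def key(left: str, right: str) -> tuple[str, str]:
        # Assumption: undirected comparison ignores the order of the two ids.
        return (left, right) if directed else tuple(sorted((left, right)))

    gold_keys = {key(left, right) for left, right in gold}

    # Keep scored pairs that gold does not list as matches.
    negatives = [
        p for p in candidates
        if p.similarity_score is not None
        and key(p.left_id, p.right_id) not in gold_keys
    ]

    # Descending similarity; ties broken by id so the order is deterministic.
    negatives.sort(key=lambda p: (-p.similarity_score, p.left_id, p.right_id))
    return negatives if n is None else negatives[:n]


candidates = [
    Pair("a", "b", 0.95),  # listed in gold, so not a negative
    Pair("a", "c", 0.90),
    Pair("b", "d", 0.90),  # ties with ("a", "c"); ordered after it by id
    Pair("c", "a", 0.80),
    Pair("d", "e", None),  # unscored, so excluded
]
hardest = sketch_mine_hard_negatives(candidates, gold={("a", "b")}, n=2)
print([(p.left_id, p.right_id) for p in hardest])
# [('a', 'c'), ('b', 'd')]
```

Under the stated assumption about directed, passing gold={("b", "a")} with directed=False would still exclude the ("a", "b") candidate in this sketch, whereas with directed=True it would be returned as the hardest negative.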