Rebecca C. Steorts
August 30, 2016
Handmatching is one of the most widely used methods in practice. Why?
Benefits: This is easy and simple to implement. Downsides:
There are obvious extensions of this.
Exact matching is very easy to implement and is widely used. Like all record linkage methods, it's not scalable.
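To make the idea concrete, here is a minimal sketch of exact matching between two small files. The record layout, the field names (`first`, `last`, `dob`), and the helper `exact_match` are all hypothetical and chosen only for illustration; two records are linked only if they agree exactly on every listed field.

```python
from collections import defaultdict


def exact_match(file_a, file_b, fields):
    """Return index pairs (i, j) where file_a[i] and file_b[j] agree exactly on all fields."""
    # Group the records of file_b by their tuple of field values.
    index = defaultdict(list)
    for j, rec in enumerate(file_b):
        index[tuple(rec[f] for f in fields)].append(j)

    # Look up each record of file_a in that grouping.
    links = []
    for i, rec in enumerate(file_a):
        for j in index.get(tuple(rec[f] for f in fields), []):
            links.append((i, j))
    return links


# Hypothetical toy data: "John" vs. "Jon" differ by one character.
file_a = [
    {"first": "Maria", "last": "Smith", "dob": "1980-01-02"},
    {"first": "John", "last": "Doe", "dob": "1975-05-17"},
]
file_b = [
    {"first": "Jon", "last": "Doe", "dob": "1975-05-17"},
    {"first": "Maria", "last": "Smith", "dob": "1980-01-02"},
]

links = exact_match(file_a, file_b, ["first", "last", "dob"])
print(links)
```

In this toy example the "John"/"Jon" pair is not linked, because a single typographical difference is enough to break exact agreement.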
\[ \begin{aligned} P[(a,b) \in M] &= p_M \\ P[\Gamma_{ab} = g \mid (a,b) \in M] &= \pi_{g \mid M} \\ P[\Gamma_{ab} = g \mid (a,b) \in U] &= \pi_{g \mid U} \end{aligned} \] Putting this together, we arrive at a two-component mixture model: \[ P[\Gamma_{ab} = g] = p_M \pi_{g \mid M} + (1-p_M)\pi_{g \mid U} \]
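As a numerical illustration of the mixture, the sketch below evaluates the marginal probability of a comparison pattern \( g \) and, by Bayes' rule, the probability that a pair showing that pattern is a match. The parameter values and the function names `mixture_prob` and `posterior_match` are made up for illustration and are not estimates from any data set.

```python
def mixture_prob(p_m, pi_g_m, pi_g_u):
    """Marginal probability of pattern g: p_M * pi_{g|M} + (1 - p_M) * pi_{g|U}."""
    return p_m * pi_g_m + (1.0 - p_m) * pi_g_u


def posterior_match(p_m, pi_g_m, pi_g_u):
    """P[(a,b) in M | Gamma_ab = g], obtained from the mixture by Bayes' rule."""
    return p_m * pi_g_m / mixture_prob(p_m, pi_g_m, pi_g_u)


# Hypothetical values: matches are rare, and pattern g is far more
# likely among matches than among non-matches.
p_m = 0.01
pi_g_m = 0.9
pi_g_u = 0.001

marginal = mixture_prob(p_m, pi_g_m, pi_g_u)
posterior = posterior_match(p_m, pi_g_m, pi_g_u)
print(f"P[Gamma = g] = {marginal:.5f}")
print(f"P[match | Gamma = g] = {posterior:.4f}")
```

Under these assumed values, a pattern that is rare among non-matches yields a high posterior match probability even though matches themselves are rare.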
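Each pair of records falls into one of four classifications, according to whether it is linked or not linked under the truth and under the estimate: correct links (CL), false negatives (FN), false positives (FP), and correct non-links (CNL).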
\[ FNR = \frac{FN}{CL+FN} \] \[ FPR = \frac{FP}{FP + CNL}. \]
\[ FDR = \frac{FP}{CL + FP}, \] where, by convention, we take \( FDR = 0 \) if the numerator and denominator are both 0.
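The three error rates can be computed directly from a set of true links and a set of estimated links. The sketch below is one way to do this, assuming links are stored as sets of record-index pairs and that the total number of candidate pairs is known; the function name `error_rates` and the toy data are hypothetical. It applies the convention \( FDR = 0 \) when no links are estimated, and it assumes the FNR and FPR denominators are nonzero.

```python
def error_rates(true_links, est_links, n_pairs):
    """Return (FNR, FPR, FDR) for estimated links compared against the truth."""
    cl = len(true_links & est_links)   # correct links
    fn = len(true_links - est_links)   # true links that were missed
    fp = len(est_links - true_links)   # estimated links that are wrong
    cnl = n_pairs - cl - fn - fp       # correct non-links

    fnr = fn / (cl + fn)               # assumes at least one true link
    fpr = fp / (fp + cnl)              # assumes at least one true non-link
    # Convention: FDR = 0 when numerator and denominator are both 0.
    fdr = 0.0 if cl + fp == 0 else fp / (cl + fp)
    return fnr, fpr, fdr


# Hypothetical toy example with 20 candidate pairs in total.
true_links = {(0, 1), (2, 3), (4, 5)}
est_links = {(0, 1), (2, 3), (6, 7)}

fnr, fpr, fdr = error_rates(true_links, est_links, n_pairs=20)
print(f"FNR = {fnr:.3f}, FPR = {fpr:.3f}, FDR = {fdr:.3f}")
```

Here CL = 2, FN = 1, FP = 1, and CNL = 16, so FNR = 1/3, FPR = 1/17, and FDR = 1/3.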