A framework for accurate, efficient private record linkage

Durham, Elizabeth Ashley

dc.creator	Durham, Elizabeth Ashley
dc.date.accessioned	2020-08-22T00:02:52Z
dc.date.available	2012-04-09
dc.date.issued	2012-04-09
dc.identifier.uri	https://etd.library.vanderbilt.edu/etd-03262012-144837
dc.identifier.uri	http://hdl.handle.net/1803/11417
dc.description.abstract	Record linkage is the task of identifying records from multiple data sources that refer to the same individual. Private record linkage (PRL) is a variant of the task in which data holders wish to perform linkage without revealing identifiers associated with the records. PRL is desirable in various domains, including health care, where it may not be possible to reveal an individual’s identity due to confidentiality requirements. In medicine, PRL can be applied when datasets from multiple care providers are aggregated for biomedical research, thus enriching data quality by reducing duplicate and fragmented information. Additionally, PRL has the potential to improve patient care and minimize the costs associated with replicated services, by bringing together all of a patient’s information. This dissertation is the first to address the entire life cycle of PRL and introduces a framework for its design and application in practice. Additionally, it addresses how PRL relates to policies that govern the use of medical data, such as the HIPAA Privacy Rule. To accomplish these goals, the framework addresses three crucial and competing aspects of PRL: 1) computational complexity, 2) accuracy, and 3) security. As such, this dissertation is divided into several parts. First, the dissertation begins with an evaluation of current approaches for encoding data for PRL and identifies a Bloom filter-based approach that provides a good balance of these competing aspects. However, such encodings may reveal information when subject to cryptanalysis and so, second, the dissertation presents a refinement of the encoding strategy to mitigate vulnerability without sacrificing linkage accuracy. Third, this dissertation introduces a method to significantly reduce the number of record pair comparisons required, and thus computational complexity, for PRL via the application of locality-sensitive hash functions. Finally, this dissertation reports on an extensive evaluation of the combined application of these methods with real datasets, which illustrates that they outperform existing approaches.
dc.format.mimetype	application/pdf
dc.subject	private record linkage
dc.subject	data sharing
dc.subject	entity resolution
dc.title	A framework for accurate, efficient private record linkage
dc.type	dissertation
dc.contributor.committeeMember	Mark Frisse
dc.contributor.committeeMember	Dario Giuse
dc.contributor.committeeMember	Murat Kantarcioglu
dc.contributor.committeeMember	Yuan Xue
dc.type.material	text
thesis.degree.name	PHD
thesis.degree.level	dissertation
thesis.degree.discipline	Biomedical Informatics
thesis.degree.grantor	Vanderbilt University
local.embargo.terms	2012-04-09
local.embargo.lift	2012-04-09
dc.contributor.committeeChair	Bradley Malin

Files in this item

Name: dissertation.pdf
Size: 2.763 MB
Format: PDF

This item appears in the following Collection(s)

Electronic Theses and Dissertations
Electronic theses and dissertations of masters and doctoral students submitted to the Graduate School.
