A framework for accurate, efficient private record linkage

dc.creator: Durham, Elizabeth Ashley
dc.date.accessioned: 2020-08-22T00:02:52Z
dc.date.available: 2012-04-09
dc.date.issued: 2012-04-09
dc.identifier.uri: https://etd.library.vanderbilt.edu/etd-03262012-144837
dc.identifier.uri: http://hdl.handle.net/1803/11417
dc.description.abstract: Record linkage is the task of identifying records from multiple data sources that refer to the same individual. Private record linkage (PRL) is a variant of the task in which data holders wish to perform linkage without revealing identifiers associated with the records. PRL is desirable in various domains, including health care, where it may not be possible to reveal an individual’s identity due to confidentiality requirements. In medicine, PRL can be applied when datasets from multiple care providers are aggregated for biomedical research, thus enriching data quality by reducing duplicate and fragmented information. Additionally, PRL has the potential to improve patient care and minimize the costs associated with replicated services, by bringing together all of a patient’s information. This dissertation is the first to address the entire life cycle of PRL and introduces a framework for its design and application in practice. Additionally, it addresses how PRL relates to policies that govern the use of medical data, such as the HIPAA Privacy Rule. To accomplish these goals, the framework addresses three crucial and competing aspects of PRL: 1) computational complexity, 2) accuracy, and 3) security. As such, this dissertation is divided into several parts. First, the dissertation begins with an evaluation of current approaches for encoding data for PRL and identifies a Bloom filter-based approach that provides a good balance of these competing aspects. However, such encodings may reveal information when subject to cryptanalysis and so, second, the dissertation presents a refinement of the encoding strategy to mitigate vulnerability without sacrificing linkage accuracy. Third, this dissertation introduces a method to significantly reduce the number of record pair comparisons required, and thus computational complexity, for PRL via the application of locality-sensitive hash functions. Finally, this dissertation reports on an extensive evaluation of the combined application of these methods with real datasets, which illustrates that they outperform existing approaches.
dc.format.mimetype: application/pdf
dc.subject: private record linkage
dc.subject: data sharing
dc.subject: entity resolution
dc.title: A framework for accurate, efficient private record linkage
dc.type: dissertation
dc.contributor.committeeMember: Mark Frisse
dc.contributor.committeeMember: Dario Giuse
dc.contributor.committeeMember: Murat Kantarcioglu
dc.contributor.committeeMember: Yuan Xue
dc.type.material: text
thesis.degree.name: PHD
thesis.degree.level: dissertation
thesis.degree.discipline: Biomedical Informatics
thesis.degree.grantor: Vanderbilt University
local.embargo.terms: 2012-04-09
local.embargo.lift: 2012-04-09
dc.contributor.committeeChair: Bradley Malin