Using Abstraction to Overcome Problems of Sparsity, Irregularity, and Asynchrony in Structured Medical Data

VanHouten, Jacob Paul

dc.creator	VanHouten, Jacob Paul
dc.date.accessioned	2020-08-22T20:35:17Z
dc.date.available	2018-07-29
dc.date.issued	2016-07-29
dc.identifier.uri	https://etd.library.vanderbilt.edu/etd-07252016-100314
dc.identifier.uri	http://hdl.handle.net/1803/13588
dc.description.abstract	Electronic health records (EHRs) are rich data sources that can be analyzed to discover new, clinically relevant patterns of disease manifestations. However, sparsity, irregularity, and asynchrony in health records pose challenges for their use in such discovery tasks, as standard statistical and machine learning techniques have limited ability to handle these complications. Abstracting the clinical data into models and then using elements of those models as input to statistical and machine learning algorithms is one approach to overcoming these challenges. This dissertation provides insight into the use of different models for this purpose. First, I examine the effect of model complexity on algorithm performance. Specifically, I examine how well different models capture the low-specificity information distributed throughout electronic health data. For several predictive algorithms, low-complexity models turn out to be nearly as powerful as high-complexity models and much less costly. I then explore the use of continuous longitudinal models of laboratory results and diagnosis billing codes to discover clinically relevant patterns between and among these data. I look for associations between clusters of specific laboratory values and single billing codes, and identify known associations as well as others that are consistent with current medical knowledge but not expected a priori. Finally, I use the same longitudinal abstraction models as inputs into more complex probabilistic models that adjust for indirect associations, and find that diagnosis codes can be used to predict the laboratory status of a patient.
dc.format.mimetype	application/pdf
dc.subject	clinical informatics
dc.subject	data mining
dc.subject	medical records
dc.subject	data representation
dc.subject	machine learning
dc.title	Using Abstraction to Overcome Problems of Sparsity, Irregularity, and Asynchrony in Structured Medical Data
dc.type	dissertation
dc.contributor.committeeMember	Katherine E. Hartmann
dc.contributor.committeeMember	Nancy M. Lorenzi
dc.contributor.committeeMember	Michael E. Matheny
dc.contributor.committeeMember	Christopher J. Fonnesbeck
dc.type.material	text
thesis.degree.name	PHD
thesis.degree.level	dissertation
thesis.degree.discipline	Biomedical Informatics
thesis.degree.grantor	Vanderbilt University
local.embargo.terms	2018-07-29
local.embargo.lift	2018-07-29
dc.contributor.committeeChair	Thomas A. Lasko

Files in this item

Name: VanHouten_Dissertation.pdf
Size: 1.614 MB
Format: PDF

This item appears in the following Collection(s)

Electronic Theses and Dissertations
Electronic theses and dissertations of masters and doctoral students submitted to the Graduate School.