Using Abstraction to Overcome Problems of Sparsity, Irregularity, and Asynchrony in Structured Medical Data

dc.creator: VanHouten, Jacob Paul
dc.date.accessioned: 2020-08-22T20:35:17Z
dc.date.available: 2018-07-29
dc.date.issued: 2016-07-29
dc.identifier.uri: https://etd.library.vanderbilt.edu/etd-07252016-100314
dc.identifier.uri: http://hdl.handle.net/1803/13588
dc.description.abstract: Electronic health records (EHRs) are rich data sources that can be analyzed to discover new, clinically relevant patterns of disease manifestations. However, sparsity, irregularity, and asynchrony in health records pose challenges for their use in such discovery tasks, as standard statistical and machine learning techniques possess limited ability to handle these complications. Abstracting the clinical data into models and then using elements of those models as input to statistical and machine learning algorithms is one approach to overcoming these challenges. This dissertation provides insight into the use of different models for this purpose. First, I examine the effect of model complexity on algorithm performance. Specifically, I examine how well different models capture the low-specificity information distributed throughout electronic health data. For several predictive algorithms, low-complexity models turn out to be nearly as powerful as, and much less costly than, high-complexity models. I then explore the use of continuous longitudinal models of laboratory results and diagnosis billing codes to discover clinically relevant patterns between and among these data. I look for associations between clusters of specific laboratory values and single billing codes, and identify known associations as well as others that are consistent with current medical knowledge but not expected a priori. Finally, I use the same longitudinal abstraction models as inputs into more complex probabilistic models that adjust for indirect associations, and find that diagnosis codes can be used to predict the laboratory status of a patient.
dc.format.mimetype: application/pdf
dc.subject: clinical informatics
dc.subject: data mining
dc.subject: medical records
dc.subject: data representation
dc.subject: machine learning
dc.title: Using Abstraction to Overcome Problems of Sparsity, Irregularity, and Asynchrony in Structured Medical Data
dc.type: dissertation
dc.contributor.committeeMember: Katherine E. Hartmann
dc.contributor.committeeMember: Nancy M. Lorenzi
dc.contributor.committeeMember: Michael E. Matheny
dc.contributor.committeeMember: Christopher J. Fonnesbeck
dc.type.material: text
thesis.degree.name: PHD
thesis.degree.level: dissertation
thesis.degree.discipline: Biomedical Informatics
thesis.degree.grantor: Vanderbilt University
local.embargo.terms: 2018-07-29
local.embargo.lift: 2018-07-29
dc.contributor.committeeChair: Thomas A. Lasko

