Statistical Methods for the Analysis of Error-Prone Electronic Health Records: Impact of Source Data Verification, Time Discretized Multiple Imputation, and Variance Estimation with Incompatible Imputation and Analysis Models

dc.creator: Giganti, Mark Joseph
dc.date.accessioned: 2020-08-22T20:51:41Z
dc.date.available: 2020-08-29
dc.date.issued: 2018-08-29
dc.identifier.uri: https://etd.library.vanderbilt.edu/etd-08232018-184014
dc.identifier.uri: http://hdl.handle.net/1803/13990
dc.description.abstract: Observational data from electronic health records (EHRs) are prone to errors that are often correlated across multiple variables. One strategy to assess EHR data quality is to compare the research study data to the original source document for a subset of records and document discrepancies. Given the resource-intensiveness of this source data verification (SDV), it is imperative to be able to justify its continued implementation. Using a data audit from an international HIV setting as a practical example, I propose a framework for assessing the impact of audits on study results and illustrate its implementation. Given that the discrepancies in the originally collected data are substantial enough to impact epidemiological inferences, I propose a method to obtain unbiased and efficient estimates in time-to-event analyses while incorporating both the original error-prone data for all subjects and the audited data for the subsample of subjects. This time-discretized modeling and imputation (TDMI) approach uses discrete time models built in a validation sample to multiply impute covariate and outcome values in the remaining unvalidated records. Imputation variance estimates were calculated using an approach proposed by Robins and Wang (2000) that allows for incompatibility between imputation and analysis models. We provide a tutorial for calculating this imputation variance estimator using multiple examples and provide comprehensive R code.
dc.format.mimetype: application/pdf
dc.subject: electronic health records
dc.subject: missing data
dc.subject: time-to-event outcomes
dc.subject: measurement error
dc.subject: variance estimation
dc.title: Statistical Methods for the Analysis of Error-Prone Electronic Health Records: Impact of Source Data Verification, Time Discretized Multiple Imputation, and Variance Estimation with Incompatible Imputation and Analysis Models
dc.type: dissertation
dc.contributor.committeeMember: Peter Rebeiro
dc.contributor.committeeMember: Qingxia (Cindy) Chen
dc.contributor.committeeMember: Bryan Shepherd
dc.type.material: text
thesis.degree.name: PHD
thesis.degree.level: dissertation
thesis.degree.discipline: Biostatistics
thesis.degree.grantor: Vanderbilt University
local.embargo.terms: 2020-08-29
local.embargo.lift: 2020-08-29
dc.contributor.committeeChair: Jonathan Schildcrout