dc.creator | Giganti, Mark Joseph | |
dc.date.accessioned | 2020-08-22T20:51:41Z | |
dc.date.available | 2020-08-29 | |
dc.date.issued | 2018-08-29 | |
dc.identifier.uri | https://etd.library.vanderbilt.edu/etd-08232018-184014 | |
dc.identifier.uri | http://hdl.handle.net/1803/13990 | |
dc.description.abstract | Observational data from electronic health records (EHRs) are prone to errors that are often correlated across multiple variables. One strategy to assess EHR data quality is to compare the research study data to the original source documents for a subset of records and document discrepancies. Given the resource-intensiveness of this source data verification (SDV), it is imperative to be able to justify its continued implementation. Using a data audit from an international HIV setting as a practical example, I propose a framework for assessing the impact of audits on study results and illustrate its implementation. Given that the discrepancies in the originally collected data are substantial enough to impact epidemiological inferences, I propose a method to obtain unbiased and efficient estimates in time-to-event analyses while incorporating both the original error-prone data for all subjects and the audited data for the subsample of subjects. This time-discretized modeling and imputation (TDMI) approach uses discrete-time models built in a validation sample to multiply impute covariate and outcome values in the remaining unvalidated records. Imputation variance estimates were calculated using an approach proposed by Robins and Wang (2000) that allows for incompatibility between imputation and analysis models. I provide a tutorial for calculating this imputation variance estimator, using multiple examples and comprehensive R code. | |
dc.format.mimetype | application/pdf | |
dc.subject | electronic health records | |
dc.subject | missing data | |
dc.subject | time-to-event outcomes | |
dc.subject | measurement error | |
dc.subject | variance estimation | |
dc.title | Statistical Methods for the Analysis of Error-Prone Electronic Health Records: Impact of Source Data Verification, Time Discretized Multiple Imputation, and Variance Estimation with Incompatible Imputation and Analysis Models | |
dc.type | dissertation | |
dc.contributor.committeeMember | Peter Rebeiro | |
dc.contributor.committeeMember | Qingxia (Cindy) Chen | |
dc.contributor.committeeMember | Bryan Shepherd | |
dc.type.material | text | |
thesis.degree.name | PHD | |
thesis.degree.level | dissertation | |
thesis.degree.discipline | Biostatistics | |
thesis.degree.grantor | Vanderbilt University | |
local.embargo.terms | 2020-08-29 | |
local.embargo.lift | 2020-08-29 | |
dc.contributor.committeeChair | Jonathan Schildcrout | |