dc.description.abstract | Chart reviews are often needed to collect or adjudicate information for retrospective studies in the absence of reliable research databases. For studies with smaller sample sizes, reviewing all the records is relatively feasible. With large datasets, such as electronic health record (EHR) data, a full review is far more challenging because of limited time and resources. Appropriate sampling strategies for chart review are therefore warranted. In this thesis, different sampling strategies for chart review were considered in the context of risk prediction modeling using EHR data. The impact of EHR data quality issues on risk factor effects and risk prediction model performance was evaluated under the Cox proportional hazards model framework. Two chart review sampling strategies, random sampling and case-cohort sampling, were considered to correct data errors and improve model performance. Extensive simulation studies were conducted under different risk factor distributions, error rates, and event rates. Finally, the two chart review sampling strategies were applied to evaluate the Sequential Organ Failure Assessment (SOFA) score for predicting thirty-day mortality using EHR data. | |