dc.description.abstract | This work centers on the use of electronic health record (EHR)-derived medication data to perform population pharmacokinetic studies and is divided into chapters, each focusing on a different aspect of this field. In the first, we conducted a population pharmacokinetic (PK) study with 363 subjects using real-world data extracted from EHRs to estimate the tacrolimus population PK profile. We assessed the sensitivity of the PK parameter estimates to assumptions about dose timing using last-dose times extracted by our own natural language processing system, medExtractR. Our findings suggest that drugs with a slower elimination rate (or a longer half-life) are less sensitive to dose timing errors and that experimental designs which only allow for trough blood concentrations are usually insensitive to deviation in absorption rate. In the next, we examined fentanyl pharmacogenetics. CYP3A4 and CYP3A5 encode enzymes which metabolize fentanyl; genetic variants in these genes impact fentanyl pharmacokinetics in adults. In a pediatric cohort, we found that a genotype of CYP3A5*1/*3 or CYP3A5*1/*6 (i.e., intermediate metabolizer status) was associated with clearance 0.84-fold (95% confidence interval [CI]: 0.71 to 1.00) that of CYP3A5*1/*1 (i.e., normal metabolizer status). CYP3A5*3/*3, CYP3A5*3/*6, or CYP3A5*6/*6 (i.e., poor metabolizer status) was associated with clearance 0.76-fold (95% CI: 0.58 to 0.99) that of CYP3A5*1/*1. In the final model, expected clearance was 8.9 and 6.8 L/hr for a normal and poor metabolizer, respectively, with median population covariates (9 months old, 7.7 kg, low surgical severity). In the final study, we developed a set of models for predicting valid doses from invalid doses in EHR-sourced medication data. We built models using supervised methods such as Random Forests and Adaptive Boosting as well as unsupervised methods such as Markov Models and Hidden Markov Models. 
We tested the models on cohorts of medication notes for two drugs, tacrolimus and lamotrigine. In the tacrolimus test set, the best model was the Hidden Markov Model (root mean squared error [RMSE] = 1.4 mg). In the lamotrigine set, it was the Two-Stage Random Forest Model (RMSE = 168 mg). | |