Electronic Health Record Phenotyping of Radionecrosis
Jean-Baptiste, Samuel R.
0000-0001-7442-3573
2024-07-15
Abstract
Radionecrosis is a serious complication that can occur in patients who have received radiation therapy to the brain. However, the complexity and heterogeneity of radionecrosis presentations pose challenges for manual identification from electronic health records (EHRs). We hypothesized that by leveraging advanced computational methods and clinically relevant curated features, we could develop accurate and efficient phenotyping algorithms for identifying radionecrosis in the EHR. This study aimed to develop and evaluate advanced phenotyping algorithms for the automated identification of radionecrosis in patients who have undergone brain radiation therapy.
We retrospectively evaluated a cohort of 865 patients, with a median age of 63 (IQR 53-71), who received brain radiation therapy. Following treatment, 60 patients (7%) developed radionecrosis. We then developed a naive rule-based algorithm, a weighted scoring function, and several machine learning models (including Logistic Regression, Support Vector Machines, and Random Forests) to identify radionecrosis cases from structured EHR data and unstructured clinical notes. The algorithms were evaluated on sensitivity, specificity, accuracy, precision, and area under the receiver operating characteristic curve (AUC).
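As an illustration of how a weighted scoring function of this kind can work, the following minimal sketch combines binary EHR-derived features through a logistic function and applies a decision threshold. The feature names, weights, intercept, and threshold are hypothetical placeholders chosen for illustration; they are not the features or coefficients used in this study.

```python
from math import exp

# Hypothetical feature weights (assumed for illustration only; in practice
# such weights would be the coefficients of a fitted logistic regression).
WEIGHTS = {
    "necrosis_diagnosis_code": 2.0,       # structured data: diagnosis code present
    "note_mentions_radionecrosis": 1.5,   # unstructured data: term found in a clinical note
    "steroid_order_after_radiation": 1.0, # structured data: medication order present
}
INTERCEPT = -3.0   # hypothetical baseline log-odds
THRESHOLD = 0.5    # hypothetical decision threshold on the predicted probability


def score(features):
    """Return a probability-like score from binary features (missing features count as 0)."""
    z = INTERCEPT + sum(w * float(features.get(name, 0)) for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + exp(-z))


def classify(features, threshold=THRESHOLD):
    """Flag a patient as a candidate radionecrosis case when the score meets the threshold."""
    return score(features) >= threshold


# Example patients (hypothetical).
all_present = {name: 1 for name in WEIGHTS}
note_only = {"note_mentions_radionecrosis": 1}

print(classify(all_present))  # all three features present
print(classify(note_only))    # a single note mention only
```

Lowering the threshold in a sketch like this trades precision for sensitivity, which is the usual tuning decision when the target condition is uncommon in the cohort.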
Most patients (501, 57.9%) received radiation for metastatic disease to the brain, while 195 patients (22.5%) underwent radiation for a primary brain tumor. The weighted scoring function, whose feature weights were derived through logistic regression, achieved a sensitivity of 0.82, specificity of 0.94, accuracy of 0.92, and precision of 0.74. Machine learning models, particularly Logistic Regression and Support Vector Machines with linear and radial basis function kernels, demonstrated high discriminative ability (AUC > 0.90). Feature importance analysis revealed key predictors of radionecrosis, including never-smoker status and specific patient characteristics.
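For reference, the four threshold-based metrics reported above follow directly from a confusion matrix. The sketch below computes them from true/false positive and negative counts; the counts in the example are hypothetical and are not the study's data.

```python
def phenotyping_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),                # true cases correctly flagged
        "specificity": tn / (tn + fp),                # non-cases correctly left unflagged
        "accuracy": (tp + tn) / (tp + fp + tn + fn),  # all correct calls over all patients
        "precision": tp / (tp + fp),                  # flagged patients who are true cases
    }


# Hypothetical counts for a 100-patient evaluation set (illustration only).
metrics = phenotyping_metrics(tp=8, fp=4, tn=86, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

When only a small share of the cohort has the condition, as with the 7% radionecrosis rate here, accuracy can remain high even for a weak classifier, so sensitivity and precision carry more of the information.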
The development of phenotyping algorithms for the accurate identification of radiation-induced toxicity is crucial for optimizing patient care and advancing precision medicine in oncology. The phenotyping algorithms developed here demonstrate the potential for automated identification of radionecrosis from EHR data using machine learning approaches. This study lays the foundation for future research and validation efforts to refine these methods. Accurate and efficient curation of large radiation oncology datasets has implications for downstream adverse outcome risk stratification, observational studies, hypothesis generation, and prospective trial recruitment.