Integrated Analysis of Genetic and Proteomic Data
Reif, David Michael
Biological organisms are complex systems that dynamically integrate inputs from a multitude of physiological and environmental factors. Complex clinical outcomes arise from the concerted interactions among the myriad components of a biological system. Therefore, in addressing questions concerning the etiology of phenotypes as complex as common human disease or adverse reaction to vaccination, it is essential that the systemic nature of biology be taken into account. Analysis methods must integrate the information provided by each data type in a manner analogous to the operation of the body itself. It is hypothesized that such integrated approaches will provide a more comprehensive portrayal of the mechanisms underlying complex phenotypes and lend confidence to the biological interpretation of analytical conclusions.
This dissertation concerns the development of a comprehensive analysis paradigm wherein experimental data of multiple types were analyzed jointly in the study of complex phenotypes. Flexible machine learning methods were used to integrate information that is insensitive to spatial and temporal flux (genetic polymorphisms) with information subject to dynamic changes (protein concentrations measured at multiple time points). This strategy was applied to genetic and proteomic data in both simulated and real analysis situations. Results of studies using simulated data indicated that utilizing multiple data types is beneficial when the disease model is complex and the data type associated with the phenotypic outcome is unknown. The successful application to combined genetic and proteomic data from smallpox vaccine studies supported the hypothesis that such integrated approaches are analytically beneficial.
Considering the rapid progress in experimental technologies able to reliably generate vast quantities of data, as well as continual improvements in cost efficiency, it is expected that datasets including multiple types of experimental information will become commonplace in the near future. It is hoped that the positive conclusions from this dissertation will help spur the adoption of an analytical approach that rightfully takes the broader physiological context of complex biological systems into account.