A Bayesian framework to integrate genomic annotations for identification of autism risk genes
Autism spectrum disorder (ASD) is a group of complex neurodevelopmental disorders with a strong genetic basis. Large-scale sequencing studies have provided strong evidence for dozens of ASD risk genes. However, an estimated 1000 or so genes contribute to ASD risk, and identifying the remaining risk genes is still challenging. To close the gap between the number of anticipated and known ASD genes, several computational approaches have been developed to prioritize ASD genes using either biological annotations or functional network information. Yet very few methods have attempted to integrate these two types of evidence. In this paper, we present an approach based on Bayesian model selection that draws on both biological annotations and functional network information. Unlike previous approaches, ours combines a more comprehensive set of evidence with known ASD genes that are strongly supported by sequencing studies, yielding a predicted probability of ASD risk for each gene across the genome. We validated our predictions by testing candidate genes for 1) enrichment in gene lists from independent sequencing studies or expert curation and 2) enrichment of heritability in a recent GWAS. Using the candidate genes identified in this study, we found strong involvement of striatal medium spiny neurons and early developmental stages in ASD. We hope that this framework provides an integrative approach to ASD gene discovery and advances our understanding of the biology of ASD.