dc.description.abstract | Uropathogenic Escherichia coli (UPEC) is a genotypically and phenotypically diverse bacterial species and is the predominant etiological agent of bacteriuria. Unlike diarrheagenic strains of E. coli, UPEC does not have a specific genomic signature to differentiate it from other benign, urocolonizing strains of E. coli. In this work, I investigate the genomics underlying UPEC and advance the development of genomics methods for studying host-pathogen interactions. I provide a large dataset of over 700 sequenced strains of E. coli isolated from patients with various presentations of bacteriuria, along with associated de-identified patient metadata. Using various genomics techniques, I assessed whether mobile genetic elements or mutations in the core genome predict whether a bacterial strain came from a patient with a symptomatic or an asymptomatic presentation of bacteriuria. In these studies, I show that mobile genetic elements, including plasmids, prophages, and transposons, do not constitute a defining feature of UPEC, nor do they significantly contribute to the acquisition of putative virulence factors within urinary E. coli. A genome-wide association study and a novel application of a polygenic
score suggest that symptom development in the patient may be primed by an accumulation of core gene mutations, rather than by a single genotype. These data suggest that, while no single gene appears to be responsible for the UPEC phenotype, accumulation of certain variants may lead to increased pathogenic potential of strains. Collectively, this work increases data availability for sequenced urinary E. coli, advances our knowledge of the genomics underlying UPEC, and provides methods for quantifying the cumulative nature of genomic variants underlying complex bacterial phenotypes. | |