
Machine Learning on Single-Cell RNA-Seq to Advance our Understanding of Clonal Hematopoiesis

dc.contributor.advisor: Schmidt, Douglas C
dc.contributor.advisor: Spencer-Smith, Jesse
dc.contributor.advisor: Bick, Alexander
dc.contributor.advisor: Heimlich, Jonathan B
dc.creator: Sharber, Brian
dc.date.accessioned: 2024-01-19T20:17:08Z
dc.date.available: 2024-01-19T20:17:08Z
dc.date.created: 2023-12
dc.date.issued: 2023-11-16
dc.date.submitted: December 2023
dc.identifier.uri: http://hdl.handle.net/1803/18551
dc.description.abstract: Clonal Hematopoiesis of Indeterminate Potential (CHIP) is characterized by genetic mutations within blood-forming stem cells, leading to the emergence of mutated blood cell populations. CHIP is associated with elevated risks of various diseases, including malignancies and cardiovascular ailments, and this intricate relationship between genetics and health underscores the need for comprehensive investigation. Current methods for identifying CHIP cells are time-consuming and cost-intensive. Using single-cell RNA sequencing (scRNA-seq) data, this study applies machine learning classifiers and techniques to build a more cost-effective pipeline for identifying CHIP cells and to deepen our understanding of CHIP's complex genomic landscape. A specialized machine learning classifier is tailored within the pipeline to the nuances of CHIP-related single-cell RNA expression data and analyzes its critical features. Using model-agnostic methods such as permutation importance, the pipeline reduces hundreds of features to the most critical ones while maintaining a high level of accuracy. Exploration with the TET2 dataset pinpoints 13 key genes that play a pivotal role in distinguishing CHIP cells from non-CHIP cells, using a model that accurately classifies CHIP cells 91% of the time. Exploration with the DNMT3A dataset pinpoints 3 key features, using a model that accurately classifies CHIP cells 81% of the time. The classifier developed within the pipeline has the potential to assist in the precise identification of CHIP cells and to reveal distinct RNA expression profiles. This research aims to illuminate the genetic and functional facets of CHIP cells, paving the way for advances in disease prediction, diagnostics, and potential therapeutic interventions.
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.subject: Machine Learning
dc.subject: Single-Cell RNA Sequencing
dc.subject: Clonal Hematopoiesis of Indeterminate Potential
dc.title: Machine Learning on Single-Cell RNA-Seq to Advance our Understanding of Clonal Hematopoiesis
dc.type: Thesis
dc.date.updated: 2024-01-19T20:17:09Z
dc.type.material: text
thesis.degree.name: MS
thesis.degree.level: Masters
thesis.degree.discipline: Computer Science
thesis.degree.grantor: Vanderbilt University Graduate School
dc.creator.orcid: 0009-0005-9598-8211

