LUNG CANCER RISK ESTIMATION WITH IMPERFECT DATA FROM MULTIPLE MODALITIES

dc.contributor.advisor: Landman, Bennett
dc.creator: Gao, Riqiang
dc.date.accessioned: 2022-04-08T15:08:52Z
dc.date.created: 2022-03
dc.date.issued: 2022-03-10
dc.date.submitted: March 2022
dc.identifier.uri: http://hdl.handle.net/1803/17091
dc.description.abstract: Lung cancer risk estimation is the problem of predicting a patient's risk of lung cancer from medical data. Advances in deep learning and the availability of large-scale datasets from multiple modalities provide new opportunities for lung cancer risk estimation; however, inherent challenges remain in practice. This dissertation proposes a series of approaches that use data from multiple modalities and target the challenges of imperfect data to predict lung cancer risk. Patients may have repeated CT images and a collection of clinical data, including demographic and radiographic features. Starting with data collection from research repositories, we propose an image quality assessment tool to address potential data quality issues. Then, to use multiple CT images from the same patient for diagnosis, we propose new recurrent models that, respectively, capture information beyond what a single image provides and handle irregularly sampled serial CTs for risk prediction. Beyond the image modality alone, work that integrates image and non-image modalities for lung cancer prediction with deep learning remains relatively scarce. Using a multi-path deep learning model that combines the two modalities for risk prediction, we show that tabular clinical features, as the non-image modality, provide information complementary to the image modality. Data can be missing in either or both modalities, and existing methods struggle to impute multi-modal missing data when 1) the missing data span heterogeneous modalities (e.g., image vs. non-image) or 2) one modality is largely missing. We therefore propose a novel generative imputation model that models the joint distribution of multi-modal data. Apart from discrimination, confidence calibration is essential for trustworthy prediction. We present a contrastive analysis of confidence calibration with representative calibration models across general computer vision and medical imaging prediction tasks. To address the specific clinical interests of screening and incidental cohorts, respectively, we conduct validation studies that evaluate our previously proposed model from a clinical perspective. Finally, we conclude this dissertation by sketching the big picture of general prediction models, with three major opportunities and four significant challenges, and by outlining future directions.
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.subject: Lung Cancer, Serial CT, Multi-modal, Missing Data
dc.title: LUNG CANCER RISK ESTIMATION WITH IMPERFECT DATA FROM MULTIPLE MODALITIES
dc.type: Thesis
dc.date.updated: 2022-04-08T15:08:52Z
dc.type.material: text
thesis.degree.name: PhD
thesis.degree.level: Doctoral
thesis.degree.discipline: Computer Science
thesis.degree.grantor: Vanderbilt University Graduate School
local.embargo.terms: 2022-09-01
local.embargo.lift: 2022-09-01
dc.creator.orcid: 0000-0002-8729-1941
dc.contributor.committeeChair: Landman, Bennett

