Epigenetic prediction of smoking status using machine-learning methods

Restricted; Files & ToC

Liu, Tianxiao (Fall 2019)

Permanent URL: https://etd.library.emory.edu/concern/etds/0k225c080?locale=es


Background: Tobacco smoking is a recognized major risk factor for many adverse health outcomes. Although many DNA methylation sites have been reported to be associated with tobacco smoking, few studies have focused on building models that predict smoking status from DNA methylation data. This study aims to predict smoking status with machine learning algorithms that are precise, generalizable, and rely on a small number of predictors.

Methods: An epigenetic prediction analysis of smoking status was performed on 218 male Caucasian twins using DNA methylation data and two machine learning methods, random forests and elastic net. The prediction models were trained and tested on two non-overlapping subsets.

Results: Prediction accuracy was higher for distinguishing current from non-current smokers than for distinguishing past from never smokers. For past versus never smokers, elastic net was more accurate than random forests on smaller predictor sets. After variable tuning and predictor selection, the performance of random forests in predicting past versus never smokers improved for all predictor sets.

Conclusion: This study suggests that machine learning approaches can be used to understand smoking risks from DNA methylation data with a relatively small set of DNA methylation predictors.
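The general idea of training a sparse classifier on one subset and testing it on a non-overlapping subset can be illustrated with a minimal sketch. This is not the thesis's pipeline: the data are simulated, the sample size of 218 is borrowed from the abstract only for scale, and the number of sites, effect sizes, penalty settings, and all function names are assumptions made for illustration. Only an elastic net logistic model is sketched (fitted by proximal gradient descent, standard library only); random forests are not shown.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (proximal step for the L1 term)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def simulate(n_samples=218, n_sites=30, n_informative=3, seed=0):
    """Simulated methylation beta values; all effect sizes are hypothetical."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        smoker = 1 if rng.random() < 0.3 else 0
        row = []
        for j in range(n_sites):
            mean = 0.7 - 0.2 * smoker if j < n_informative else 0.5
            row.append(min(1.0, max(0.0, rng.gauss(mean, 0.05))))
        X.append(row)
        y.append(smoker)
    return X, y


def split_indices(n, test_fraction=0.3, seed=1):
    """Return non-overlapping (train, test) index lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = int(n * test_fraction)
    return idx[n_test:], idx[:n_test]


def standardize(train, other):
    """Z-score both sets using means and standard deviations of the training set."""
    p = len(train[0])
    means = [sum(r[j] for r in train) / len(train) for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in train) / len(train)) or 1.0
           for j in range(p)]

    def scale(rows):
        return [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]

    return scale(train), scale(other)


def fit_elastic_net_logistic(X, y, lam=0.2, alpha=0.5, lr=0.5, n_iter=300):
    """Logistic loss + lam * (alpha * |w|_1 + (1 - alpha) / 2 * |w|_2^2).

    The intercept b is not penalized.
    """
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= lr * grad_b
        for j in range(p):
            step = w[j] - lr * (grad_w[j] + lam * (1 - alpha) * w[j])
            w[j] = soft_threshold(step, lr * lam * alpha)
    return w, b


X, y = simulate()
train_idx, test_idx = split_indices(len(X))
X_train, X_test = standardize([X[i] for i in train_idx], [X[i] for i in test_idx])
y_train = [y[i] for i in train_idx]
y_test = [y[i] for i in test_idx]

w, b = fit_elastic_net_logistic(X_train, y_train)
selected = [j for j, wj in enumerate(w) if wj != 0.0]  # sites kept by the L1 term
pred = [1 if sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) >= 0.5 else 0
        for xi in X_test]
accuracy = sum(int(pi == yi) for pi, yi in zip(pred, y_test)) / len(y_test)
print("selected sites:", selected)
print("held-out accuracy:", round(accuracy, 3))
```

The L1 part of the penalty sets the weights of uninformative sites to exactly zero, which is one way a model can end up with a small predictor set. Standardizing with training-set statistics only keeps information from the test subset out of the fitting step.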

Table of Contents

This table of contents is under embargo until 03 January 2022

About this Master's Thesis

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
Subfield / Discipline
  • English
Research field
Keyword
Committee Chair / Thesis Advisor
Last modified
