Predicting Combined Chemotherapeutic Agents’ Efficacy Synergy via Multiple Regression Models and Cross Validation Technique, Upon Inter-Patient and Intra-Patient Levels
Zhang, Xiaozhu (Spring 2019)
Abstract
Background: Precision medicine is crucial to cancer treatment because it minimizes potentially lethal side effects and maximizes drug efficacy, and accurately modeling individual drug efficacy is the key step. However, previous studies mostly modeled single-drug efficacy, whereas combination chemotherapy is more frequently applied in practice. In this study, we compared and integrated several models and algorithms to predict individual response to multiple-drug combinations at both the intra-patient and inter-patient levels. Ultimately, we aim to move cancer treatment one step further down the road toward precision medicine.
Methods: We are interested in three key variables: two drug dosages, gene expression, and gene mutation. By adding these variables to the model one at a time and evaluating model performance, we can determine their relative importance for prediction. Linear regression, ridge regression, lasso regression, elastic net regression, and random forest algorithms were applied in model construction. Goodness of fit was evaluated by the R-squared value obtained under 10-fold cross-validation (CV) and leave-one-out cross-validation (LOOCV). Models were built on single-cell-line data as well as on data pooled from four cell lines to investigate their ability to predict synergy at the intra-cell-line and inter-cell-line levels.
Results and Conclusion: Compared with the baseline model, in which dosage information is the only explanatory variable, the secondary model with added gene expression data generated a significantly larger R-squared. However, adding mutation data to form the final model did not improve accuracy; its R-squared values were nearly the same as those of the secondary model. In addition, models built on multiple cell lines were unable to predict drug synergy. Among the five regression methods, the random forest algorithm consistently produced the largest R-squared in each model. 10-fold CV showed better generalizability, and LOOCV coupled with the random forest algorithm built the best model. In conclusion, this study demonstrated the feasibility of predicting the efficacy synergy of multiple chemotherapeutic agents from their dosage information and gene expression data within a cell line. Adding mutation information yielded results below expectation. More information is needed to model drug synergy among patients.
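The evaluation scheme described in the abstract (add feature blocks step by step, then score each model by cross-validated R-squared) can be illustrated with a minimal sketch. This is not the thesis code: the data are synthetic, the names `fit_ridge`, `cv_r2`, `baseline`, and `secondary` are hypothetical, and a closed-form ridge regression stands in for the five methods the thesis compares. It also assumes that R-squared is computed on pooled out-of-fold predictions, which the abstract does not specify; some pooling is needed for LOOCV because R-squared is undefined on a single held-out sample.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_ridge(X, y, lam):
    """Ridge regression with an unpenalized intercept (features not standardized)."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    for i in range(1, p):          # skip index 0 so the intercept is not shrunk
        A[i][i] += lam
    b = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(p)]
    return solve(A, b)

def predict(beta, X):
    return [beta[0] + sum(w * v for w, v in zip(beta[1:], r)) for r in X]

def cv_r2(X, y, k, lam=1.0, seed=0):
    """R-squared on pooled out-of-fold predictions; k == len(y) gives LOOCV."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    pred = [0.0] * len(y)
    for test in folds:
        held = set(test)
        train = [i for i in idx if i not in held]
        beta = fit_ridge([X[i] for i in train], [y[i] for i in train], lam)
        for i, p in zip(test, predict(beta, [X[i] for i in test])):
            pred[i] = p
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, pred))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

# Synthetic stand-in data (assumption): two dosages and one expression feature.
rng = random.Random(42)
n = 60
dose = [[rng.uniform(0, 1), rng.uniform(0, 1)] for _ in range(n)]
expr = [[rng.gauss(0, 1)] for _ in range(n)]
y = [0.5 * d[0] + 0.5 * d[1] + 2.0 * e[0] + rng.gauss(0, 0.1)
     for d, e in zip(dose, expr)]

baseline = dose                                   # dosages only
secondary = [d + e for d, e in zip(dose, expr)]   # dosages + expression

for name, X in [("baseline", baseline), ("secondary", secondary)]:
    print(name, "10-fold:", round(cv_r2(X, y, k=10), 3),
          "LOOCV:", round(cv_r2(X, y, k=n), 3))
```

Because the synthetic response is driven mostly by the expression feature, the secondary model scores much higher than the baseline here. That outcome is built into the toy data and says nothing about the thesis results.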
Table of Contents
Introduction
Methods
1. Data description
2. Baseline model establishment
3. Secondary model construction
4. Final model and model comparison
Results
Discussion
Acknowledgment
Bibliography
About this Master's Thesis
| School | |
|---|---|
| Department | |
| Subfield / Discipline | |
| Degree | |
| Submission | |
| Language | |
| Research Field | |
| Keyword | |
| Committee Chair / Thesis Advisor | |
Primary PDF
| Title | Date Uploaded |
|---|---|
| Predicting Combined Chemotherapeutic Agents’ Efficacy Synergy via Multiple Regression Models and Cross Validation Technique, Upon Inter-Patient and Intra-Patient Levels | 2019-04-08 00:35:51 -0400 |