Assessment of diagnostic accuracy after biomarker combination in the same study: The issue of over-optimism and potential solutions

Zhao, Zhiwei (Spring 2019)

Permanent URL: https://etd.library.emory.edu/concern/etds/5x21tg461?locale=en
Published

Abstract

   Biomarker combination is an important tool in disease diagnosis, since a single marker is usually not sufficient. A few studies have focused on biomarker combination rules. Ideally, researchers would have independent training and validation datasets, but this is rarely the case in the real world. It is well known that using a single dataset both to develop and to evaluate a combination rule can produce an over-optimistic estimate of its performance. This thesis addresses that problem. We used logistic regression to generate the combination rule and the area under the ROC curve (AUC) to assess it. We used k-fold cross-validation to counter the over-optimism. To reduce bias, we proposed a two-sample jackknife bias-reduced approach as well as a bootstrap bias-reduced approach. For inference, the bootstrap was used to estimate the standard error, and a double bootstrap was proposed to improve the estimate for the bootstrap bias-reduced estimators. A prostate cancer dataset was used to illustrate these methods in a real-world application.
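The over-optimism described in the abstract can be illustrated with a minimal sketch. This is not the thesis's code: the simulated data, the function names (`auc`, `fit_logistic`, `apparent_auc`, `cv_auc`), the gradient-ascent fit, and the choice to pool held-out scores across folds are all assumptions made for illustration. The markers are pure noise, so the true AUC of any combination is 0.5. The apparent AUC (fit and evaluated on the same data) can be compared against a k-fold cross-validated AUC. The jackknife and bootstrap bias corrections are not shown.

```python
import math
import random


def auc(scores, labels):
    """Empirical AUC: P(case score > control score), ties counted as 1/2."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((c > d) + 0.5 * (c == d) for c in cases for d in controls)
    return wins / (len(cases) * len(controls))


def fit_logistic(X, y, steps=300, lr=0.5):
    """Logistic regression by plain gradient ascent (illustrative only).

    Returns [intercept, coef_1, ..., coef_p].
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)
    for _ in range(steps):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            eta = beta[0] + sum(b * v for b, v in zip(beta[1:], xi))
            eta = max(-30.0, min(30.0, eta))  # guard against overflow in exp
            resid = yi - 1.0 / (1.0 + math.exp(-eta))
            grad[0] += resid
            for j, v in enumerate(xi):
                grad[j + 1] += resid * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def score(beta, X):
    """Linear combination score for each row of X."""
    return [beta[0] + sum(b * v for b, v in zip(beta[1:], xi)) for xi in X]


def apparent_auc(X, y):
    """Fit and evaluate on the same data (the over-optimistic estimate)."""
    return auc(score(fit_logistic(X, y), X), y)


def cv_auc(X, y, k=5, seed=0):
    """k-fold CV: score each subject with a rule fit without that subject's
    fold, then compute one AUC on the pooled held-out scores (an assumed
    variant; averaging per-fold AUCs is another option)."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    held_out_scores = [0.0] * len(y)
    for fold in (idx[i::k] for i in range(k)):
        held = set(fold)
        train = [i for i in idx if i not in held]
        beta = fit_logistic([X[i] for i in train], [y[i] for i in train])
        for i, s in zip(fold, score(beta, [X[i] for i in fold])):
            held_out_scores[i] = s
    return auc(held_out_scores, y)


# Simulated data (hypothetical): 30 cases, 30 controls, 5 pure-noise markers.
rng = random.Random(1)
n, p = 60, 5
y = [1] * (n // 2) + [0] * (n // 2)
X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]

apparent = apparent_auc(X, y)
cross_validated = cv_auc(X, y, k=5)
print(f"apparent AUC:  {apparent:.3f}")
print(f"5-fold CV AUC: {cross_validated:.3f}")
```

Because no marker carries signal here, any amount by which the apparent AUC exceeds 0.5 reflects the rule being tuned to the same subjects it is evaluated on. The cross-validated estimate avoids that reuse, although it is itself not unbiased, which is the motivation the abstract gives for further bias reduction.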

Table of Contents

1 Introduction

2 Methods

2.1 Notations

2.2 Point Estimation

2.2.1 Standard Approach

2.2.2 Cross-Validation

2.2.3 Jackknife Bias Correction

2.2.4 Bootstrap Bias Correction

2.3 Variance Estimation

3 Simulations

3.1 Situation with only one informative biomarker

3.2 Situation with subtypes

3.3 Results

4 Real Data Application

5 Discussion

About this Master's Thesis

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
School
Department
Subfield / Discipline
Degree
Submission
Language
  • English
Research field
Keyword
Committee Chair / Thesis Advisor
Committee Members