Design Strategies for Studies Using Logistic Regression to Analyze Data on Pooled Samples (Public)
Yan, Xiaobo (Spring 2021)
Abstract
We review logistic regression modeling of pooled samples, fitted by the maximum likelihood (ML) approach, to estimate the odds ratios of potential risk factors and to predict disease prevalence. We determine the preferred pooling methods for categorical and for continuous variables. For categorical variables, random pooling within subsets stratified by the variables of interest yields the most accurate and most efficient estimates of both the coefficients and the prevalence. For continuous variables, we use statistical software to form a prespecified number of pools with the k-means clustering algorithm in order to optimize estimation performance. We also modify the k-means clustering function embedded in SAS to constrain the maximum pool size, which accounts for laboratory operability and test limitations. We compare estimates obtained under perfect and imperfect testing (sensitivity and specificity) to demonstrate the need to adjust the ML for test bias. Both of our proposed strategies were the most efficient while maintaining good accuracy on the malaria data and the simulated data. Potential further study of imperfect tests is discussed at the end of the study.
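The likelihood behind the pooled analysis can be illustrated with a short sketch. This is not the thesis's SAS code: it assumes the standard group-testing form, in which a pool is truly positive when at least one member is diseased and the assay's sensitivity and specificity mix true and false positives. The function names and argument layout are hypothetical and chosen only for illustration.

```python
import math


def individual_risk(x, beta):
    """Logistic model: P(disease | x), with beta[0] as the intercept."""
    eta = beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))
    return 1.0 / (1.0 + math.exp(-eta))


def pool_positive_prob(pool, beta, sensitivity=1.0, specificity=1.0):
    """P(pooled test is positive) for a pool given as a list of covariate vectors."""
    # A pool is truly negative only if every member is disease-free.
    all_negative = 1.0
    for x in pool:
        all_negative *= 1.0 - individual_risk(x, beta)
    # Assumed error model: true positives detected with probability
    # `sensitivity`, true negatives misread with probability 1 - `specificity`.
    return sensitivity * (1.0 - all_negative) + (1.0 - specificity) * all_negative


def pooled_log_likelihood(pools, results, beta, sensitivity=1.0, specificity=1.0):
    """Bernoulli log-likelihood over pools; results[k] is 1 if pool k tested positive."""
    total = 0.0
    for pool, z in zip(pools, results):
        p = pool_positive_prob(pool, beta, sensitivity, specificity)
        total += math.log(p) if z == 1 else math.log(1.0 - p)
    return total
```

With the default `sensitivity=1.0` and `specificity=1.0` the sketch reduces to the perfect-test case, and a pool of size one reduces to ordinary logistic regression. Maximizing `pooled_log_likelihood` over `beta` with any numerical optimizer would give the ML estimates under these assumptions.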
Table of Contents
1 Introduction
2 Methodology
2.1 Standard multiple logistic regression
2.2 Pooling strategies
2.3 Logistic regression in the pooling setting
3 Results
3.1 Motivational study
3.1.1 Age as a categorical variable
3.1.2 Age as a continuous variable
3.2 Simulation
3.2.1 Age as a categorical variable
3.2.2 Age as a continuous variable
4 Discussion
4.1 Imperfect tests
4.2 Overall prevalence
4.3 Investigating implausible simulation results
5 References
About this Master's Thesis
School | |
---|---|
Department | |
Subfield / Discipline | |
Degree | |
Submission | |
Language | |
Research Field | |
Keywords | |
Committee Chair / Thesis Advisor | |
Committee Members | |
Primary PDF
Title | Date Uploaded |
---|---|
Design Strategies for Studies Using Logistic Regression to Analyze Data on Pooled Samples | 2021-04-24 16:40:43 -0400 |