Generalized Quantile Random Forest with Smoothed Estimating Equations

Sui, Jiayu (Spring 2023)

Permanent URL: https://etd.library.emory.edu/concern/etds/f7623d92j?locale=en

Abstract

Quantile regression establishes the relationship between one or more independent variables and specific quantiles or percentiles of a dependent variable. It has been a handy supplement to least squares regression in the analysis of real-life applications. There are two random forest based implementations of quantile regression: the quantile regression forest (quantregForest) by Meinshausen [9] and the quantile regression in the Generalized Random Forest (GRF) framework by Athey et al. [1]. The latter achieves better performance by redesigning the splitting rule around the quantile regression moment condition. However, this moment condition contains an indicator function, which makes it non-smooth and non-differentiable.
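For reference, here is a minimal Python sketch of the non-smoothed quantile moment condition described above; the function name and the simulated sanity check are illustrative and not taken from the thesis code.

```python
import numpy as np

def quantile_moment(y, theta, tau):
    """Non-smooth quantile moment: psi_tau(y; theta) = tau - 1{y <= theta}.

    E[psi_tau(Y; theta)] = 0 identifies theta as the tau-th quantile of Y.
    The indicator makes psi a step function in theta, hence non-smooth
    and non-differentiable at y = theta.
    """
    return tau - (y <= theta).astype(float)

# Sanity check: the sample estimating equation is approximately zero
# at the empirical tau-th quantile.
rng = np.random.default_rng(0)
y = rng.normal(size=1000)
theta_hat = np.quantile(y, 0.75)
print(quantile_moment(y, theta_hat, 0.75).mean())  # close to 0
```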

By applying Kaplan and Sun [6]'s smoothed estimating equation (SEE) to the quantile regression moment condition, we approximated the original moment condition with a smoothed one whose flexible bandwidth adjusts the bias-variance tradeoff. Using a self-constructed Python implementation of the GRF framework, we inserted the new smoothed moment condition and observed how this modification affects the performance of quantile estimation.
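The sketch below illustrates how the indicator can be replaced by a smooth kernel CDF with bandwidth h, in the spirit of the SEE approach; the Gaussian kernel and the function name are assumptions for illustration, not necessarily the exact kernel or interface used in the thesis implementation.

```python
import numpy as np
from scipy.stats import norm

def smoothed_quantile_moment(y, theta, tau, h):
    """Smoothed quantile moment: tau - G((theta - y) / h), with G a smooth
    kernel CDF (a Gaussian CDF here, as an illustrative choice).

    As h -> 0 this recovers the non-smooth indicator 1{y <= theta};
    a larger h gives a smoother, differentiable moment at the cost of
    additional smoothing bias, i.e. the tradeoff the bandwidth controls.
    """
    return tau - norm.cdf((theta - y) / h)

# Usage: with a small bandwidth the smoothed moment is close to the
# non-smooth one evaluated at the same candidate quantile.
rng = np.random.default_rng(1)
y = rng.normal(size=1000)
theta_hat = np.quantile(y, 0.75)
print(smoothed_quantile_moment(y, theta_hat, 0.75, h=0.05).mean())
```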

On the random forest level, testing the quantile GRF with SEE on simulated data did not show a positive effect on estimation accuracy. However, further inspection of the decision tree level quantile estimation reveals that, with a larger training sample, the quantile estimation should be able to reach satisfactory accuracy. Hence, we plan to further improve the program's run time and develop a better approach to hyperparameter tuning, so that the program can be executed with larger training samples and more trees in a reasonable runtime. This would allow us to develop more understanding of the random forest level performance given sufficiently large training samples and tree numbers.

Table of Contents

1 Introduction
2 Background and Related Work
  2.1 Quantile Regression Forest
  2.2 Generalized Quantile Random Forest
3 Method
  3.1 Smoothed Moment Condition
  3.2 Splitting Rule Modification
  3.3 Estimation Stage
4 Result
5 Discussion and Future Works
6 Conclusion
Bibliography

About this Honors Thesis

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
Language
  • English