Predicting Baseball Player Performance with OLS Regression and Out-of-Sample Forecasting
Treiman, Lauren (Spring 2020)
Abstract
Objective: To assist Major League Baseball (MLB) teams in contract negotiations by better predicting players’ future performance.
Methods: Players from 1871 to 2018 with at least 7 years of MLB experience were analyzed to determine the most important factors affecting their performance. I used wins above replacement (WAR) as the dependent variable to measure players’ value and Ordinary Least Squares (OLS) regression to predict players’ future WAR. Initially, players from the 2010s were analyzed with out-of-sample forecasting by comparing players with similar WAR. Multiple regression models of comparable players were then developed from different decades with 1-6 years of past experience. Future performance over multiple seasons was then predicted for players competing in the early 2010s by using comparable players from the last 3 decades (1990s-2010s) with 6 years of past experience. To best reflect the contract negotiation process, only each sampled player’s actual WAR from their first 6 years in MLB was used to predict the rest of their career. Thus, the WAR predictions for their 7th, 8th, ... years were in turn used to predict performance towards the end of their career.
Results: The model was most accurate when it analyzed only the 3 most recent decades of past players (players since the 1990s for batters in the 2010s) in conjunction with the past 6 WAR values. The regression model’s predictions were within 2 WAR of the actual WAR, and the model accurately predicted players’ performance trends throughout their careers. My model should help teams by providing additional information that will improve evaluation of a player’s performance for the next four years after seven years in MLB.
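The approach described in the Methods can be illustrated with a minimal sketch. It is not the thesis's actual model: it assumes a single OLS regression of seventh-year WAR on the first six years of WAR, fits it on randomly generated (hypothetical) comparable players, checks it on a held-out group, and then feeds each prediction back in to forecast later seasons. The function names, the synthetic data, and the rolling six-year window are all assumptions made for illustration.

```python
import numpy as np


def fit_ols(X, y):
    """Fit y ~ intercept + X by ordinary least squares; intercept comes first."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def predict_next(coef, last_six):
    """Predict the next season's WAR from the six most recent WAR values."""
    return float(coef[0] + np.dot(coef[1:], last_six))


def forecast(coef, first_six, n_years):
    """Forecast n_years seasons, reusing each prediction as an input (assumed scheme)."""
    window = list(first_six)
    preds = []
    for _ in range(n_years):
        nxt = predict_next(coef, window[-6:])
        preds.append(nxt)
        window.append(nxt)
    return preds


# Hypothetical comparable players: 200 careers, 7 seasons of WAR each (synthetic).
rng = np.random.default_rng(0)
careers = rng.normal(2.0, 1.5, size=(200, 7))

# Out-of-sample check: fit on 150 players, evaluate on the 50 held out.
train, test = careers[:150], careers[150:]
coef = fit_ols(train[:, :6], train[:, 6])
holdout_pred = np.column_stack([np.ones(len(test)), test[:, :6]]) @ coef
mae = float(np.mean(np.abs(holdout_pred - test[:, 6])))

# Multi-year forecast for a hypothetical player with six known seasons.
new_player = [1.2, 2.5, 3.1, 4.0, 3.6, 3.3]
preds = forecast(coef, new_player, n_years=4)
print("holdout MAE:", round(mae, 2))
print("forecast WAR for years 7-10:", [round(p, 2) for p in preds])
```

Because the synthetic seasons are independent draws, the fitted coefficients here carry no baseball meaning; the sketch only shows the mechanics of fitting on past players, scoring on unseen players, and chaining predictions forward.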
Table of Contents
1 Introduction
1.1 Contracts
1.2 Alternative Projection Methods
2 Methods
2.1 Datasets
2.2 Variables
2.3 Predictive Metric Analysis
2.4 Player Classification
2.5 The Model
3 Results
3.1 Groups
3.2 Handedness
3.3 Predictive Metrics
3.4 Single Year Predictions
3.5 Multi-Year Predictions
4 Discussion
4.1 Background
4.2 Approach
4.3 Rate of Improvement
4.4 Handedness
4.5 Predictive Metrics
4.6 Implications
4.7 Limitations
4.8 Future Approaches
5 Conclusion
References
Appendix
About this Honors Thesis
School | |
---|---|
Department | |
Degree | |
Submission | |
Language | |
Research Field | |
Keyword | |
Committee Chair / Thesis Advisor | |
Committee Members | |
Primary PDF
Title | Date Uploaded |
---|---|
Predicting Baseball Player Performance with OLS Regression and Out-of-Sample Forecasting | 2020-04-11 16:26:16 -0400 |