Analyzing Batting Patterns of Major League Baseball Players For Advance Scouting Reports: Using R to Generate High-Level Spatial Plots of PITCHf/x Data
Superak, Hillary Margolin (2011)
Abstract
Baseball, regarded as "America's National Pastime," has been a constantly evolving sport in many respects. In particular, the concept of data analytics has increasingly been applied to baseball in recent years. Statistical and graphical analyses have enhanced two main areas of Major League Baseball: player selection and game strategy. In 2006, Sportvision developed PITCHf/x, a system of high-speed cameras installed in every MLB stadium that records every pitch. PITCHf/x data include the three-dimensional spatial coordinates of the ball's trajectory, along with several other pitch characteristics. This technology provided the first opportunity to evaluate individual pitches based on information not contained in box scores or other statistics.
Graphical analyses have been conducted on the PITCHf/x data, but there is room for improvement. Specifically, new techniques were needed that could display spatial data for large data sets, such as an entire season's worth of pitches received by a batter. Also desirable was a method of plotting two-dimensional spatial data that could be categorized according to the levels of a third, discrete variable. The figures needed to be informative, yet concise and easily interpretable.
A user-friendly graphical analysis tool for the advance scouting of MLB batters was developed. A function was programmed in R to generate three types of plots: (1) heat map density plots, (2) heat map contour plots, and (3) hexbin pie charts. Plots were created according to the hierarchical four-level "pitch universe": (1) all pitches, (2) swings only, (3) balls in play only, and (4) base hits only. Classification by pitch type was also examined.
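The nested structure of the pitch universe, and the idea of summarizing a discrete variable within spatial bins, can be illustrated with a minimal sketch. The thesis implements its plotting function in R; the Python below is only an illustration under stated assumptions. The record fields (`x`, `z`, `pitch_type`, `swing`, `in_play`, `hit`), the helper names, and the sample data are all hypothetical, and square bins stand in for the hexagonal bins that a hexbin pie chart would use.

```python
import math
from collections import Counter

# Hypothetical pitch records. Field names and values are illustrative only,
# not the variables or data used in the thesis.
pitches = [
    {"x": -0.4, "z": 2.1, "pitch_type": "FF", "swing": True,  "in_play": True,  "hit": True},
    {"x":  0.9, "z": 3.4, "pitch_type": "SL", "swing": False, "in_play": False, "hit": False},
    {"x":  0.1, "z": 2.6, "pitch_type": "FF", "swing": True,  "in_play": True,  "hit": False},
    {"x": -0.3, "z": 2.3, "pitch_type": "CH", "swing": True,  "in_play": False, "hit": False},
]


def pitch_universe(records):
    """Return the four nested levels: all pitches, swings, balls in play, base hits."""
    swings = [p for p in records if p["swing"]]
    in_play = [p for p in swings if p["in_play"]]
    hits = [p for p in in_play if p["hit"]]
    return {"all": records, "swings": swings, "in_play": in_play, "hits": hits}


def bin_by_category(records, key, width=0.5):
    """Count the levels of a discrete variable inside square spatial bins.

    Assumption: square bins of side `width` replace hexagonal bins for brevity.
    """
    bins = {}
    for p in records:
        cell = (math.floor(p["x"] / width), math.floor(p["z"] / width))
        bins.setdefault(cell, Counter())[p[key]] += 1
    return bins


levels = pitch_universe(pitches)
sizes = {name: len(subset) for name, subset in levels.items()}
swing_bins = bin_by_category(levels["swings"], "pitch_type")
```

Each level is filtered from the one above it, so every base hit is also a ball in play, a swing, and a pitch. The per-bin counts in `swing_bins` are the kind of quantity a pie chart drawn at each bin location would display as proportions.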
Results for two players, Jason Heyward and Dustin Pedroia, were evaluated. As expected, the "hotspot" for Jason Heyward was consistently in the lower outside area of the strike zone; for Pedroia, it was in the center of the strike zone. Due to Pedroia's small stature, his power hits were located primarily on the inside of the strike zone; Heyward, a much taller player, was more successful on the outside. The readily apparent sensitivity of these plots to different batting tendencies speaks to the legitimacy of these graphical analyses as a potentially valuable advance scouting tool.
Table of Contents
Introduction
  Baseball: America's National Pastime
  Major League Baseball
  A Numbers Game
  SABR: Society for American Baseball Research
  A New Graphical Analysis Tool for Advance Scouting
Current Methods & Technology
  PITCHf/x
  Graphics
  Areas for Improvement
New Developments
  Advance Scouting of Batters
  Hexbin Pie Charts
  Contour Plots
  Density Plots
Methodology
  The Data Set
  Data Cleaning and Variable Manipulations
  Preparation for Plotting
  Plotting Function
  Hexbin Pie Charts
  Heat Map Contour Plots
  Heat Map Density Plots
  Validation
Results & Interpretation
  Jason Heyward
  Dustin Pedroia
  Note for Plot Analysis
Discussion & Possibilities for Future Research
  Value of Graphical Analysis to Advance Scouting
  Limitations
  Ideas for Further Analysis
  Further Technological Developments
Tables & Figures
Figure 1: Example of a relative frequency display using PITCHf/x data
Figure 2: Example of a polar plot using PITCHf/x data
Figure 3: Example of a hit chart using PITCHf/x data
Figure 4: Example of a strike chart using PITCHf/x data
Figure 5: Example of spatial plot of the pitcher's actions using PITCHf/x data
Table 1: PITCHf/x Variables Retained for Analyses
Table 2: New Variables Created for Plotting Purposes
Figure 6: Diagram of the four-level pitch universe
Figure 7: Heat map density plot for Heyward's swings
Figure 8: Heat map contour plot for Heyward's swings
Figure 9: Heat map density plot for Pedroia's swings
Figure 10: Heat map contour plot for Pedroia's swings
Figure 11: Heat map density plot for Heyward's balls in play
Figure 12: Heat map contour plot for Heyward's balls in play
Figure 13: Heat map density plot for Pedroia's balls in play
Figure 14: Heat map contour plot for Pedroia's balls in play
Figure 15: Heat map density plot for good outcome vs. Heyward
Figure 16: Heat map contour plot for good outcome vs. Heyward
Figure 17: Heat map density plot for good outcome vs. Pedroia
Figure 18: Heat map contour plot for good outcome vs. Pedroia
Figure 19: Heat map density plot of Heyward's base hits
Figure 20: Heat map contour plot of Heyward's base hits
Figure 21: Heat map density plot of Pedroia's base hits
Figure 22: Heat map contour plot of Pedroia's base hits
Figure 23: Hexbin pie chart of all pitches to Heyward, classified by pitch type
Figure 24: Hexbin pie chart of all pitches to Pedroia, classified by pitch type
Figure 25: Hexbin pie chart of all pitches to Heyward, classified by swinging or not
Figure 26: Hexbin pie chart of all pitches to Pedroia, classified by swinging or not
Figure 27: Hexbin pie chart of all pitches to Heyward, classified by BIP or not
Figure 28: Hexbin pie chart of all pitches to Pedroia, classified by BIP or not
Figure 29: Hexbin pie chart of all pitches to Heyward, classified by outcome for pitcher
Figure 30: Hexbin pie chart of all pitches to Pedroia, classified by outcome for pitcher
Figure 31: Hexbin pie chart of all pitches to Heyward, classified by base hit or not
Figure 32: Hexbin pie chart of all pitches to Pedroia, classified by base hit or not
Figure 33: Hexbin pie chart of all pitches to Heyward, classified by type of base hit
Figure 34: Hexbin pie chart of all pitches to Pedroia, classified by type of base hit
Figure 35: Hexbin pie chart of all swings for Heyward, classified by pitch type
Figure 36: Hexbin pie chart of all swings for Pedroia, classified by pitch type
Figure 37: Hexbin pie chart of all swings for Heyward, classified by BIP or not
Figure 38: Hexbin pie chart of all swings for Pedroia, classified by BIP or not
Figure 39: Hexbin pie chart of all swings for Heyward, classified by outcome for pitcher
Figure 40: Hexbin pie chart of all swings for Pedroia, classified by outcome for pitcher
Figure 41: Hexbin pie chart of all BIP for Heyward, classified by pitch type
Figure 42: Hexbin pie chart of all BIP for Pedroia, classified by pitch type
Figure 43: Hexbin pie chart of all BIP for Heyward, classified by outcome for pitcher
Figure 44: Hexbin pie chart of all BIP for Pedroia, classified by outcome for pitcher
Figure 45: Hexbin pie chart of all good-outcome pitches to Heyward, classified by pitch type
Figure 46: Hexbin pie chart of all good-outcome pitches to Pedroia, classified by pitch type
Figure 47: Hexbin pie chart of all base hits for Heyward, classified by base hit type
Figure 48: Hexbin pie chart of all base hits for Heyward, classified by power
Figure 49: Hexbin pie chart of all base hits for Pedroia, classified by base hit type
Figure 50: Hexbin pie chart of all base hits for Pedroia, classified by power
About this Master's Thesis
School | |
---|---|
Department | |
Subfield / Discipline | |
Degree | |
Submission | |
Language | |
Research Field | |
Keyword | |
Committee Chair / Thesis Advisor | |
Committee Members | |
Partnering Agencies | |
Primary PDF
Title | Date Uploaded |
---|---|
Analyzing Batting Patterns of Major League Baseball Players For Advance Scouting Reports: Using R to Generate High-Level Spatial Plots of PITCHf/x Data | 2018-08-28 13:39:01 -0400 |