Analyzing Batting Patterns of Major League Baseball Players For Advance Scouting Reports: Using R to Generate High-Level Spatial Plots of PITCHf/x Data

Superak, Hillary Margolin (2011)

Permanent URL: https://etd.library.emory.edu/concern/etds/sq87bv041?locale=en%255D

Abstract

Baseball, regarded as "America's National Pastime," is a constantly evolving sport. In
particular, data analytics has been applied to baseball with increasing frequency in recent
years. Statistical and graphical analyses have enhanced two main areas of Major League
Baseball (MLB): player selection and game strategy. In 2006, Sportvision developed PITCHf/x,
a system of high-speed cameras installed in every MLB stadium that records every pitch.
PITCHf/x data include the three-dimensional spatial coordinates of the ball's trajectory,
along with several other pitch characteristics. This technology provided the first
opportunity to evaluate individual pitches based on information not contained in box scores
or other statistics.

Graphical analyses have been conducted on the PITCHf/x data, but there is room for
improvement. Specifically, new techniques were needed that could display spatial data for
large data sets, such as an entire season's worth of pitches received by a batter. Also
desirable was a method of plotting two-dimensional spatial data categorized according to the
levels of a third, discrete variable. The figures needed to be informative, yet concise and
easily interpretable.

A user-friendly graphical analysis tool for the advance scouting of MLB batters was
developed. A function was programmed in R to generate three types of plots: (1) heat map
density plots, (2) heat map contour plots, and (3) hexbin pie charts. Plots were created
according to the hierarchical four-level "pitch universe": (1) all pitches, (2) swings only,
(3) balls in play only, and (4) base hits only. Classification by pitch type was also
examined.

Results for two players, Jason Heyward and Dustin Pedroia, were evaluated. As expected, the
"hotspot" for Heyward was consistently in the lower outside area of the strike zone; for
Pedroia, it was in the center of the strike zone. Because of Pedroia's small stature, his
power hits were located primarily on the inside of the strike zone; Heyward, a much taller
player, was more successful on the outside. The readily apparent sensitivity of these plots
to different batting tendencies speaks to the legitimacy of these graphical analyses as a
potentially valuable advance scouting tool.
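
The four-level "pitch universe" and the idea of classifying spatial bins by a third, discrete variable can be illustrated with a short sketch. This is not the thesis's R function: it is a Python illustration under stated assumptions. The record fields (`px`, `pz`, `pitch_type`, `swing`, `in_play`, `hit`), the sample pitches, the helper names, and the 0.5-foot bin size are all hypothetical, and square bins stand in for hexagons to keep the example short.

```python
import math
from collections import Counter, defaultdict

# Hypothetical pitch records: plate location in feet (px, pz), a pitch-type
# code, and flags for swing / ball in play / base hit. Illustrative data only.
PITCHES = [
    {"px": -0.4, "pz": 2.1, "pitch_type": "FF", "swing": True, "in_play": True, "hit": True},
    {"px": -0.3, "pz": 2.2, "pitch_type": "SL", "swing": True, "in_play": True, "hit": False},
    {"px": 0.6, "pz": 3.0, "pitch_type": "FF", "swing": True, "in_play": False, "hit": False},
    {"px": 1.4, "pz": 1.0, "pitch_type": "CH", "swing": False, "in_play": False, "hit": False},
    {"px": -0.2, "pz": 2.4, "pitch_type": "FF", "swing": False, "in_play": False, "hit": False},
]


def pitch_universe(pitches):
    """Return the four nested levels: all pitches, swings, balls in play, base hits."""
    swings = [p for p in pitches if p["swing"]]
    in_play = [p for p in swings if p["in_play"]]  # a ball in play requires a swing
    hits = [p for p in in_play if p["hit"]]        # a base hit requires a ball in play
    return {"all": list(pitches), "swings": swings, "in_play": in_play, "hits": hits}


def bin_by_category(pitches, key, bin_size=0.5):
    """Count the levels of a discrete variable inside each square spatial bin."""
    bins = defaultdict(Counter)
    for p in pitches:
        cell = (math.floor(p["px"] / bin_size), math.floor(p["pz"] / bin_size))
        bins[cell][p[key]] += 1
    return bins


def category_shares(counter):
    """Convert one bin's counts into proportions (the slices of that bin's pie)."""
    total = sum(counter.values())
    return {level: count / total for level, count in counter.items()}


universe = pitch_universe(PITCHES)
bins = bin_by_category(universe["all"], "pitch_type")
for cell, counts in sorted(bins.items()):
    print(cell, category_shares(counts))
```

Each level of the universe is a subset of the one before it, so the same binning step can be reused at any level, for example `bin_by_category(universe["swings"], "in_play")` to split swings by whether the ball was put in play.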

Table of Contents

Introduction
  Baseball: America's National Pastime
  Major League Baseball
  A Numbers Game
  SABR: Society for American Baseball Research
  A New Graphical Analysis Tool for Advance Scouting

Current Methods & Technology
  PITCHf/x
  Graphics
  Areas for Improvement

New Developments
  Advance Scouting of Batters
  Hexbin Pie Charts
  Contour Plots
  Density Plots

Methodology
  The Data Set
  Data Cleaning and Variable Manipulations
  Preparation for Plotting
  Plotting Function
  Hexbin Pie Charts
  Heat Map Contour Plots
  Heat Map Density Plots
  Validation

Results & Interpretation
  Jason Heyward
  Dustin Pedroia
  Note for Plot Analysis

Discussion & Possibilities for Future Research
  Value of Graphical Analysis to Advance Scouting
  Limitations
  Ideas for Further Analysis
  Further Technological Developments




TABLES & FIGURES

Figure 1: Example of a relative frequency display using PITCHf/x data
Figure 2: Example of a polar plot using PITCHf/x data
Figure 3: Example of a hit chart using PITCHf/x data
Figure 4: Example of a strike chart using PITCHf/x data
Figure 5: Example of spatial plot of the pitcher's actions using PITCHf/x data
Table 1: PITCHf/x Variables Retained for Analyses
Table 2: New Variables Created for Plotting Purposes
Figure 6: Diagram of the four-level pitch universe
Figure 7: Heat map density plot for Heyward's swings
Figure 8: Heat map contour plot for Heyward's swings
Figure 9: Heat map density plot for Pedroia's swings
Figure 10: Heat map contour plot for Pedroia's swings
Figure 11: Heat map density plot for Heyward's balls in play
Figure 12: Heat map contour plot for Heyward's balls in play
Figure 13: Heat map density plot for Pedroia's balls in play
Figure 14: Heat map contour plot for Pedroia's balls in play
Figure 15: Heat map density plot for good outcome vs. Heyward
Figure 16: Heat map contour plot for good outcome vs. Heyward
Figure 17: Heat map density plot for good outcome vs. Pedroia
Figure 18: Heat map contour plot for good outcome vs. Pedroia
Figure 19: Heat map density plot of Heyward's base hits
Figure 20: Heat map contour plot of Heyward's base hits
Figure 21: Heat map density plot of Pedroia's base hits
Figure 22: Heat map contour plot of Pedroia's base hits
Figure 23: Hexbin pie chart of all pitches to Heyward, classified by pitch type
Figure 24: Hexbin pie chart of all pitches to Pedroia, classified by pitch type
Figure 25: Hexbin pie chart of all pitches to Heyward, classified by swinging or not
Figure 26: Hexbin pie chart of all pitches to Pedroia, classified by swinging or not
Figure 27: Hexbin pie chart of all pitches to Heyward, classified by BIP or not
Figure 28: Hexbin pie chart of all pitches to Pedroia, classified by BIP or not
Figure 29: Hexbin pie chart of all pitches to Heyward, classified by outcome for pitcher
Figure 30: Hexbin pie chart of all pitches to Pedroia, classified by outcome for pitcher
Figure 31: Hexbin pie chart of all pitches to Heyward, classified by base hit or not
Figure 32: Hexbin pie chart of all pitches to Pedroia, classified by base hit or not
Figure 33: Hexbin pie chart of all pitches to Heyward, classified by type of base hit
Figure 34: Hexbin pie chart of all pitches to Pedroia, classified by type of base hit
Figure 35: Hexbin pie chart of all swings for Heyward, classified by pitch type
Figure 36: Hexbin pie chart of all swings for Pedroia, classified by pitch type
Figure 37: Hexbin pie chart of all swings for Heyward, classified by BIP or not
Figure 38: Hexbin pie chart of all swings for Pedroia, classified by BIP or not
Figure 39: Hexbin pie chart of all swings for Heyward, classified by outcome for pitcher
Figure 40: Hexbin pie chart of all swings for Pedroia, classified by outcome for pitcher
Figure 41: Hexbin pie chart of all BIP for Heyward, classified by pitch type
Figure 42: Hexbin pie chart of all BIP for Pedroia, classified by pitch type
Figure 43: Hexbin pie chart of all BIP for Heyward, classified by outcome for pitcher
Figure 44: Hexbin pie chart of all BIP for Pedroia, classified by outcome for pitcher
Figure 45: Hexbin pie chart of all good-outcome pitches to Heyward, classified by pitch type
Figure 46: Hexbin pie chart of all good-outcome pitches to Pedroia, classified by pitch type
Figure 47: Hexbin pie chart of all base hits for Heyward, classified by base hit type
Figure 48: Hexbin pie chart of all base hits for Heyward, classified by power
Figure 49: Hexbin pie chart of all base hits for Pedroia, classified by base hit type
Figure 50: Hexbin pie chart of all base hits for Pedroia, classified by power

About this Master's Thesis

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
Language
  • English