Sunday, February 3, 2013

What Makes a Good Pitcher?

Now that I've covered the statistics I like to use when evaluating a hitter, I turned to the statistics that measure the ability of a pitcher.  For this analysis, I used team pitching statistics from 1988 to 2012, since 1988 was the earliest year for which I could find one of the underlying stats I wanted to look at.

Many different statistics are used to judge a pitcher.  The most basic, and probably the most misleading, is win-loss record.  A pitcher on a very good offensive team inherently has an advantage in this statistic over a pitcher on a very bad one.  One prime example of this is the 2004 season for Ben Sheets of the Milwaukee Brewers.  Let's look at his statistics from that year:

ERA = 2.70
WHIP = 0.983
Strikeouts = 264
Walks = 32

I'm assuming most people know what ERA, strikeouts, and walks are.  However, WHIP is a slightly less known statistic.  WHIP means "Walks plus Hits per Inning Pitched".  Generally, a WHIP below 1.200 is considered good and a WHIP above 1.400 is considered bad.  To put it in perspective, here was the WHIP for a selected group of Milwaukee Brewers in 2012:

Yovani Gallardo =     1.304
Marco Estrada =       1.142
Mike Fiers =          1.261
Randy Wolf =          1.574
Shaun Marcum =        1.266
Zack Greinke =        1.203
John Axford =         1.442
Francisco Rodriguez = 1.333
Kameron Loe =         1.434
Manny Parra =         1.653
Jose Veras =          1.507
Jim Henderson =       1.272

In short, Ben Sheets was spectacular in 2004.  The gap between Ben Sheets' 2004 WHIP and Yovani Gallardo's 2012 WHIP was larger than the gap between Yovani Gallardo's 2012 WHIP and Randy Wolf's 2012 WHIP.
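As a quick check on that comparison, here is a minimal Python sketch.  The `whip` helper and its sample inputs are illustrative assumptions (the post does not list hits or innings pitched); the three WHIP values are the ones quoted above.

```python
def whip(walks: int, hits: int, innings_pitched: float) -> float:
    """Walks plus hits allowed, divided by innings pitched."""
    return (walks + hits) / innings_pitched

# Hypothetical season line, only to show the formula in use.
example_whip = whip(walks=30, hits=90, innings_pitched=100.0)

# WHIP values quoted in the post.
sheets_2004 = 0.983
gallardo_2012 = 1.304
wolf_2012 = 1.574

gap_sheets_gallardo = gallardo_2012 - sheets_2004
gap_gallardo_wolf = wolf_2012 - gallardo_2012

print(f"Example WHIP:       {example_whip:.3f}")
print(f"Sheets to Gallardo: {gap_sheets_gallardo:.3f}")
print(f"Gallardo to Wolf:   {gap_gallardo_wolf:.3f}")
```

The first gap comes out larger than the second, which is the point of the comparison.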

On the other hand, let's look at Rick Helling's 1998 season.  Here are his statistics:

ERA = 4.41
WHIP = 1.327
Strikeouts = 164
Walks = 78

His 1998 season wasn't bad, especially considering he pitched in the American League and in a hitter-friendly park.  However, there is no way his season even compares to Ben Sheets' 2004 season.

However, win and loss record would say otherwise.  Rick Helling finished 1998 with a 20-7 record whereas Ben Sheets finished 2004 with a 12-14 record. The win-loss statistic would argue that Helling was a better pitcher despite the fact Sheets was Cy Young Award worthy in every other category.  How could the win-loss record show such a disparity?

A lot has to do with luck, but the main culprit was the starting lineups for each of the teams.  An 8-7 win counts the same as a 1-0 win with the win-loss statistic, but a 1-0 loss also counts the same as an 8-7 loss, so offense, something that the pitcher has very little to do with, plays a role.  Let's look at the offense each team put out there most of the season and their Eq. Run Statistic.  I took the 14 hitters with the most plate appearances from each team for each of those years for this analysis.

Texas Rangers - 1998
Ivan Rodriguez      6.141
Will Clark          6.568
Mark McLemore       4.466
Kevin Elster        3.726
Fernando Tatis      3.647
Rusty Greer         6.069
Tom Goodwin         4.817
Juan Gonzalez       7.602
Lee Stevens         5.507
Luis Alicea         5.506
Roberto Kelly       6.485
Mike Simms          7.687
Todd Zeile          5.490
Royce Clayton       4.886
Average             5.614

Milwaukee Brewers - 2004
Chad Moeller        2.492
Lyle Overbay        6.284
Junior Spivey       5.224
Craig Counsell      3.729
Wes Helms           4.148
Geoff Jenkins       5.118
Scott Podsednik     3.851
Brady Clark         5.485
Keith Ginter        5.325
Bill Hall           3.297
Ben Grieve          5.259
Gary Bennett        3.260
Russell Branyan     5.646
Chris Magruder      4.065
Average             4.513


Please note that the averages used above are simple averages and not weighted in any way.  However, the average Eq. Runs for the top 14 players for the Rangers was about 1 run better than the average Eq. Runs for the top 14 players for the Brewers.  Even taking luck out of the equation, Ben Sheets had about a 1.10 run disadvantage to Helling through no fault of his own.  Sheets' ERA of 2.70 would, in theory, account for the same record as a 3.80 ERA for a Texas Rangers pitcher.  Clearly, it didn't.  Sheets got extraordinarily bad run support (and was therefore very unlucky) from that 2004 team; Doug Davis finished with a 12-12 record and a 3.39 ERA, and Victor Santos finished with an 11-12 record and a 4.97 ERA that year.  On the other hand, Rick Helling got very lucky and got extraordinarily good run support from the 1998 Texas Rangers.
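The averages and the run-support gap can be reproduced from the two lists above.  This is a small Python sketch of that arithmetic; the variable names are my own, and adding the gap to Sheets' ERA is the same rough approximation the paragraph makes, not a rigorous adjustment.

```python
# Eq. Runs for the 14 hitters listed for each team, in the order shown above.
rangers_1998 = [6.141, 6.568, 4.466, 3.726, 3.647, 6.069, 4.817,
                7.602, 5.507, 5.506, 6.485, 7.687, 5.490, 4.886]
brewers_2004 = [2.492, 6.284, 5.224, 3.729, 4.148, 5.118, 3.851,
                5.485, 5.325, 3.297, 5.259, 3.260, 5.646, 4.065]

def simple_average(values):
    """Unweighted mean, matching the simple averages in the lists."""
    return sum(values) / len(values)

rangers_avg = simple_average(rangers_1998)
brewers_avg = simple_average(brewers_2004)
gap = rangers_avg - brewers_avg

sheets_era = 2.70
equivalent_era = sheets_era + gap  # rough "same record" ERA for a Rangers pitcher

print(f"Rangers average: {rangers_avg:.3f}")
print(f"Brewers average: {brewers_avg:.3f}")
print(f"Gap:             {gap:.3f}")
print(f"Equivalent ERA:  {equivalent_era:.2f}")
```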

So we know win-loss record is not a good way of evaluating a pitcher.  How about ERA?  This statistic is much better than win-loss record.  However, there are two fundamental flaws with it.  First, the definition of an earned run is based on the definition of an error, and errors are not a good indicator of a defense's success.  If all defenses were equal, ERA would be fine.  But a ground ball pitcher who has Yuniesky Betancourt at shortstop has an inherent disadvantage against a ground ball pitcher who has Ozzie Smith in his prime at shortstop.  Ozzie Smith is likely to have more errors than Yuniesky Betancourt, but there is a simple reason for that: Ozzie Smith will get to more ground balls than Betancourt.  To illustrate that, we will look at perhaps the most basic of all defensive statistics: range factor.  Basically, it measures how many plays (putouts plus assists) a player converts into outs per nine innings.

Yuniesky Betancourt, 2011:    RF / 9 innings = 4.12   21 Errors
Ozzie Smith, 1983:            RF / 9 innings = 5.51   21 Errors

What does this mean?  A team with 1983 Ozzie Smith can expect to give up about 1.4 fewer hits (likely singles) per game than a team with 2011 Yuniesky Betancourt.  Sabermetricians generally agree that a single is worth about 0.57 runs, so a team with 2011 Yuniesky Betancourt can expect to give up about 0.8 runs more per game (1.4 × 0.57) than a team with 1983 Ozzie Smith at shortstop.  So, even though 2011 Betancourt had exactly as many errors as 1983 Ozzie Smith (and therefore the difference between RA and ERA should be approximately the same for both players), a 2011 Brewers pitcher would likely still be at a disadvantage of roughly 0.80 in ERA because of Betancourt.
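The arithmetic behind that estimate can be sketched in a few lines of Python.  It assumes, as the paragraph does, that every play Betancourt fails to convert becomes a single; the constant and variable names are mine.  With the figures quoted above, the product comes to roughly 0.8 runs per game.

```python
RUNS_PER_SINGLE = 0.57  # run value of a single, as quoted in the post

# Range factor per 9 innings, from the two lines above.
rf_smith_1983 = 5.51
rf_betancourt_2011 = 4.12

# Assumption: each extra play made by Smith would otherwise have been a single.
extra_plays_per_game = rf_smith_1983 - rf_betancourt_2011
extra_runs_per_game = extra_plays_per_game * RUNS_PER_SINGLE

print(f"Extra plays per game: {extra_plays_per_game:.2f}")
print(f"Extra runs per game:  {extra_runs_per_game:.2f}")
```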

The other inherent issue with ERA is that it is not a predictive statistic; it is a result statistic.  We are looking for a statistic that will predict the number of runs a team will give up; ERA is a statistic that tells you how many runs a team gave up.  It's a small inherent issue (the defense issue is much bigger), but it would be reasonable to believe that ERA and RA should correlate exceptionally well because they measure approximately the same thing.

The defensive issue also appears in WHIP.  WHIP, in theory, should eliminate the defense from the equation of how good a pitcher is.  However, as discussed above, while errors are not included in the statistic, there will be a defensive bias to it.  A pitcher with 1983 Ozzie Smith at shortstop will likely have a lower WHIP than the same pitcher with 2011 Yuniesky Betancourt at shortstop.

Is there any way we can take defense out of the equation?  The answer is yes.  There are a few things that the pitcher, and only the pitcher, affects.  Neither the team's starting lineup nor the team's defense has any effect when a pitcher gives up a walk, a strikeout, or a home run.  Nothing a fielder can do will stop a walk, a strikeout, or, most likely, a home run (the number of catches where a fielder reaches over the fence to take one away is immaterial).  Walk percentage and strikeout percentage are good indicators of how good a pitcher is; normally over 10% of the plate appearances against a pitcher end in a strikeout and over 5% end in a walk, so the amounts are not immaterial.  Home runs, on the other hand, are affected very much by luck and chance.  A pitcher who gives up 15 home runs over the season has a home run percentage that is over twice that of a pitcher who gives up 7 home runs over the season.  Do we really want to downgrade a pitcher over 8 plate appearances over the course of a season?  The answer is probably no, but pitching to contact is a significant part of determining how good a pitcher is.  How do we include that without including an immaterial statistic?
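To make the per-plate-appearance rates concrete, here is a minimal Python sketch.  Every number in it is hypothetical (a made-up pitcher facing 900 batters); only the 15 versus 7 home run comparison comes from the paragraph above.

```python
def rate(events: int, plate_appearances: int) -> float:
    """Share of plate appearances that ended in the given event."""
    return events / plate_appearances

# Hypothetical season totals, for illustration only.
batters_faced = 900
strikeouts = 180
walks = 54

k_pct = rate(strikeouts, batters_faced)
bb_pct = rate(walks, batters_faced)

# Two pitchers with the same workload but 15 vs. 7 home runs allowed.
hr_pct_a = rate(15, batters_faced)
hr_pct_b = rate(7, batters_faced)

print(f"Strikeout percentage: {k_pct:.1%}")
print(f"Walk percentage:      {bb_pct:.1%}")
print(f"Home run percentages: {hr_pct_a:.2%} vs. {hr_pct_b:.2%}")
print(f"Ratio: {hr_pct_a / hr_pct_b:.2f}x from a gap of {15 - 7} plate appearances")
```

The home run ratio looks dramatic even though it rests on a handful of plate appearances, which is why the rate is so sensitive to luck.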

The answer is that we can use fly ball / ground ball ratio.  Fly balls generally lead to more hits and runs than ground balls, and home runs are fly balls.  That is one measure of a ball in play that the pitcher can somewhat control.

I ran a correlation analysis between runs against per game and strikeout percentage, walk percentage, and ground ball / fly ball ratio and found that while none of the correlations are exceptionally strong, they are all reasonably strong.  Strikeout percentage has a -0.375 correlation, walk percentage has a 0.4821 correlation, and ground ball / fly ball ratio has a 0.2796 correlation with runs against per game.  The scattergrams for these three analyses are below:


To try to come up with the best stat to analyze this information, I combined all three stats and found a coefficient and exponent for each stat.  The best I could do is a -0.5964 correlation.  Squaring that gives an r² of about 0.356, which means that the statistic explains about 35.6% of the variation in runs against per game.  The other 64.4% is likely due mainly to luck, but also to competition (an Orioles pitcher will have to face the Rays, Yankees, Red Sox, and Blue Jays more than any other teams all year whereas a Rangers pitcher will get to face the Angels, Athletics, Mariners, and Astros), stadium (a Padres pitcher will likely give up fewer runs than a Rockies pitcher because of the size and location of the stadiums, i.e. the thinness of the air), defense (we can take it out of the formula but it's always there), and various other reasons.  In all reality, a -0.5964 correlation is not a bad measure of a pitcher's success.  A scattergram is shown below:

I use the formula on the screen to determine Eq. Runs Against, which I will use as a statistic in future posts.
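The actual coefficients and exponents are not reproduced in the text, so the following Python sketch only shows the general shape of such an analysis.  The `composite` function, its coefficients and exponents, and the five team rows are all invented for illustration and are not the Eq. Runs Against formula.  The last lines show how a correlation of -0.5964 converts to a share of variance explained.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def composite(k_pct, bb_pct, gb_fb,
              coefs=(1.0, -1.0, 0.1), exps=(1.0, 1.0, 1.0)):
    """Hypothetical composite: one coefficient and one exponent per stat."""
    stats = (k_pct, bb_pct, gb_fb)
    return sum(c * s ** e for c, s, e in zip(coefs, stats, exps))

# Invented team seasons: (K%, BB%, GB/FB ratio, runs against per game).
teams = [
    (0.20, 0.070, 1.30, 3.9),
    (0.17, 0.090, 1.10, 4.8),
    (0.19, 0.080, 1.20, 4.2),
    (0.16, 0.100, 1.00, 5.1),
    (0.18, 0.085, 1.15, 4.5),
]

scores = [composite(k, bb, gbfb) for k, bb, gbfb, _ in teams]
runs_against = [ra for _, _, _, ra in teams]
r = pearson_r(scores, runs_against)
print(f"Correlation on the invented data: {r:.3f}")

# Converting the post's reported correlation to variance explained.
reported_r = -0.5964
r_squared = reported_r ** 2
print(f"r = {reported_r}, r squared = {r_squared:.4f}")
```

In practice the coefficients and exponents would be tuned to make the correlation as strong as possible on the real 1988 to 2012 team data.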

So in conclusion, there is a hierarchy in pitching statistics.  Win-loss record is the worst of the most common statistics.  ERA is better in that it takes the team's offense out of the equation, but it doesn't take the team's defense out of it.  In addition, it's a result stat and not a predictive stat.  WHIP takes out the result part of the stat, but it still doesn't take defense out of it.  The best way, of the ways mentioned in this post, would be to look at strikeout percentage, walk percentage, and ground ball / fly ball ratio.
