Larry Mahnken and SG's

Replacement Level Yankees Weblog

"Hey, it's free!"





Disclaimer: If you think this is the official website of the New York Yankees, you're an idiot. Go away.


January 26, 2004


Automatic For The People
by Larry Mahnken

If you've read Moneyball, or visit sites like Baseball Primer and Baseball Prospectus, you probably have heard of Vörös McCracken. McCracken, now a consultant for the Boston Red Sox (Booooo!), came up with a new way of evaluating pitchers a few years back (or, if you're RossCW, concocted an elaborate hoax that has taught us nothing about nothing).

The problem with evaluating pitchers has always been the relationship between pitchers and the defenders behind them. Whether a ball put in play is converted into an out or falls in for a hit has an enormous impact on the number of runs a pitcher gives up, and determining how much credit should go to the pitcher and how much to his fielders was seemingly an insurmountable barrier to separating pitching from defense. The two could not be distinguished from one another once the ball was put in play, so statistics used to evaluate pitchers would always be inaccurate.

Up to this point, it had always been assumed that a pitcher had control of where a ball put in play was going to go. Vörös decided to test this assumption, looking at the year-to-year correlations of pitchers' HR rates, BB rates, SO rates and Batting Average on Balls in Play. What he found shocked him. While the defense-independent stats (HR, BB and SO) correlated very well, BABIP (or $H) correlated so poorly as to not be particularly significant statistically.
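For anyone who wants to try this kind of check at home, here's a minimal sketch of a year-to-year correlation in Python. I'm assuming a plain Pearson correlation, since nothing above says exactly which measure Vörös used, and `pearson_r` is just a name made up for this illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers.

    Assumption: xs holds each pitcher's rate in year one and ys holds the
    same pitchers' rates, in the same order, in year two.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-deviations and of squared deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a sample has no variation")
    return cov / sqrt(var_x * var_y)
```

Feed it the same pitchers' SO rates for two consecutive seasons, then do the same with their $H. A result near 1 means the stat carries over from year to year, and a result near 0 means it mostly doesn't.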

Further, McCracken looked at the individual $H for pitchers, and saw that many players who were among the league leaders in $H in one season were near the bottom the next, though their park and their defense didn't change. Vörös came to a radical conclusion: that pitchers did not appear to have any impact on the result of a ball put in play, that whether it was a hit or not was a result of defense, park and--more importantly--luck.

From this conclusion, McCracken devised DIPS (Defense Independent Pitching Statistic), which used only statistics not impacted by defense to determine a pitcher's value and skill. DIPS correlated year-to-year much better than ERA or Bill James' Component ERA did, an indication that it was measuring a pitcher's skill better than the previous stats.

Well, it turns out that Vörös was wrong: some pitchers did have the ability to control $H, most notably knuckleballers. Vörös released a second version of DIPS, which adjusted for the handedness of the pitcher and whether or not he was a strict knuckleballer. Further refutation of Vörös' $H conclusion came from a study by Diamond Mind's Tom Tippett, which concluded that as you increase the sample size, the correlation for $H becomes stronger statistically, and further, that some pitchers demonstrated a clear ability to impact $H, though for the most part it didn't become clear until the pitcher had played for many seasons, and the effect was usually not sizable anyway.

This doesn't make DIPS irrelevant, of course. It still measures a pitcher's skill better than ERA, because all Tippett's study did was show that you probably can't completely distinguish pitching from defense. But since including $H in the equation adds more noise than useful data, a defense-independent stat like DIPS is useful. It's flawed, though, just like the rest of the stats, because it doesn't measure all of a pitcher's useful skills.

Anyway, Vörös used to post DIPS stats on his site after each season, until he was hired by the Red Sox (Booooooo!), at which point the cause was taken up by Jay Jaffe, who has released DIPS numbers for each of the past two seasons.

Well, there are people out there like me who don't want to wait until season's end (or January, in this case) to find out what the numbers are. To make it easy to figure out DIPS, Vörös came up with a "Quick and Dirty" formula to use on the fly:

((IP*2.35)+(H*0.805)+(HR*10.76)+(BB*2.76)-(SO*1.53))/((IP*0.712)+(H*0.244)+(SO*0.096)-(HR*0.244))

Which gives an answer that's close enough. Problem is, for people like me, that's like saying that pi is 3.14. It's close enough, but it's not pi. The other problem is that the actual formula for DIPS, if written out like the Quick DIPS formula, is about as long as pi. There are several calculations that need to be made, as well as the inclusion of Park Factors, and it makes the calculation of just one player's DIPS ERA quite a task, let alone everyone on your team's.
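If you'd rather not punch the Quick DIPS formula into a calculator every time, here's a rough Python sketch of it. It's only the quick version quoted above, not the full DIPS 2.0 calculation, and it makes a few assumptions: the function name `quick_dips_era` is made up for this example, the formula is read as one sum divided by another, and IP is a true decimal (so 200 and a third innings is 200.333, not the box-score 200.1).

```python
def quick_dips_era(ip, h, hr, bb, so):
    """Quick DIPS ERA, using the coefficients from the formula quoted above.

    Assumption: ip is innings pitched as a real decimal, not box-score
    notation, and the other arguments are season totals.
    """
    # Top of the fraction: what the pitcher's line "should" have cost him.
    numerator = (ip * 2.35) + (h * 0.805) + (hr * 10.76) + (bb * 2.76) - (so * 1.53)
    # Bottom of the fraction, built from IP, H, SO and HR.
    denominator = (ip * 0.712) + (h * 0.244) + (so * 0.096) - (hr * 0.244)
    if denominator <= 0:
        raise ValueError("stat line is too small or too odd for the quick formula")
    return numerator / denominator
```

Call it with a pitcher's season totals, something like `quick_dips_era(ip=200, h=180, hr=20, bb=50, so=180)` (a made-up stat line, not a real pitcher), and you get back a number on the ERA scale. More strikeouts push it down, and more home runs and walks push it up.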

So, the only solution was to make a spreadsheet. I took Vörös' instructions, entered all the formulas into an Excel spreadsheet, and was able to churn out DIPS in seconds, not minutes. I'm sure dozens of other losers, er, statheads like me have done that, too.

I thought about distributing it a few times, since there might be some interest in being able to calculate DIPS on the fly, but I wasn't confident that my numbers were right, so I held onto it, except for sending it to a couple of friends. Well, with Jay Jaffe releasing the 2003 numbers today, I decided a couple of weeks ago to send my worksheet to him and ask him to take a look at it. He checked it over and said it looked good to him, so now I'm ready to pass it on to you.

You can download the workbook here. It includes Regular DIPS version 2.0, Quick DIPS, and Park Factors for the past five seasons, plus a worksheet to calculate Park Factors for other seasons, if you wish. I made it in Excel 2000, so it might not work in earlier versions.

I think the workbook is pretty straightforward and easy to use, but if you have any questions, feel free to email me and I'll help you out as best I can. I've got a few ideas for making the workbook better, and if you have any suggestions, go ahead and email those to me, too. Feel free to distribute this file to anyone you want, as long as the credits and links remain on the page. Enjoy.

* * *

The file is also available in my Yahoo! group, here.