Larry Mahnken and SG's

Replacement Level Yankees Weblog

"Hey, it's free!"

Larry Mahnken
Sean McNally
Fabian McNally
John Brattain


Disclaimer: If you think this is the official website of the New York Yankees, you're an idiot. Go away.

July 11, 2006

More on Run Values
by SG

Mike K. requested a breakdown of the numbers I posted yesterday, and I think it's a good time to explain some of this stuff in a little more detail for those of you who may be a little less familiar with this sort of thing. If you're not a fan of math or statistics, I'd recommend skipping this entry and going to read the latest tripe from Mike Lupica or something.

I used to be a big-time reader of Baseball Prospectus and cited their numbers blindly without wondering about how they came up with them or questioning them in any manner. I still think they do some interesting work but not enough to entice me to pay for it. I've since moved on to trying to figure as much of this stuff out by myself as I can. That's why I no longer use things like VORP, WARP, and their other numbers which are often mysteriously calculated.

Many of the sabermetricians whose opinions I respect hold that the best way to assess a player's contribution is to use linear weights. What are linear weights? Funny you should ask. Linear weights (abbreviated as LWTS) is a way to figure out the run value of the various events that a player compiles, both good and bad. It was developed by Pete Palmer and introduced in the book The Hidden Game of Baseball back in 1984. The idea is that every positive and negative outcome on the baseball field has a corresponding run value. You can combine these into a formula to estimate a player's run contribution to his team. The formula I'm using was developed by TangoTiger and can be found on his page here.
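To make the idea concrete, here is a minimal sketch of how a linear weights total is put together: multiply each event count by its run value and add everything up. The weights and the stat line below are placeholders made up for illustration; they are not the values from TangoTiger's formula, and the names `ILLUSTRATIVE_WEIGHTS` and `linear_weights` are hypothetical.

```python
# Placeholder run values per event, for illustration only.
# These are NOT the weights from the formula referenced in the post.
ILLUSTRATIVE_WEIGHTS = {
    "1B": 0.47,
    "2B": 0.77,
    "3B": 1.04,
    "HR": 1.40,
    "BB": 0.31,
    "OUT": -0.27,
}


def linear_weights(counts, weights=ILLUSTRATIVE_WEIGHTS):
    """Sum each event count times its run value."""
    return sum(weights[event] * n for event, n in counts.items())


# Hypothetical stat line, made up for the example.
sample = {"1B": 60, "2B": 15, "3B": 1, "HR": 10, "BB": 30, "OUT": 220}
print(round(linear_weights(sample), 1))
```

The point is only the shape of the calculation: positive events add runs, outs subtract them, and the total is runs above or below average.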

I like linear weights because it includes a lot more information than most other metrics, and because I can calculate it using publicly available stats. That is where the offensive component for each player is calculated. Then, there is a positional adjustment for each player based on the historic level of offense at that position. This can be re-calibrated from season to season depending on the depth at the position, but I'm currently using the following positional adjustments (in runs per 150 games).

1B -9
2B 6
3B 2
C 16
CF 0
DH -4
LF -9
RF -7
SS 9

What this means is that the average first baseman over 150 games would be expected to be nine runs above the average offensive player. Therefore, you must deduct this from the player's offensive value when comparing him to other players. Catcher is historically the weakest offensive position in baseball, so you would add 16 runs to a catcher's performance over 150 games to compare their value to other players. This is what has helped Jorge Posada be the most valuable Yankee so far. Posada has an LWTS of 13.4. I then add the 16 run/150 game multiplier pro-rated over the 76 games he's played in (8.1) to arrive at his positional adjusted linear weights total on offense of 21.5.
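The Posada arithmetic can be checked with a short sketch. The adjustment table is the one listed above; the function names `positional_adjustment` and `ps_lwts` are made up for this example.

```python
# Positional adjustments in runs per 150 games, as listed in the post.
POSITION_ADJ_PER_150 = {
    "1B": -9, "2B": 6, "3B": 2, "C": 16, "CF": 0,
    "DH": -4, "LF": -9, "RF": -7, "SS": 9,
}


def positional_adjustment(position, games, per_games=150):
    """Pro-rate the 150-game adjustment over the games actually played."""
    return POSITION_ADJ_PER_150[position] * games / per_games


def ps_lwts(lwts, position, games):
    """Offensive linear weights plus the pro-rated positional adjustment."""
    return lwts + positional_adjustment(position, games)


print(round(positional_adjustment("C", 76), 1))  # 8.1
print(round(ps_lwts(13.4, "C", 76), 1))          # 21.5
```

A first baseman with the same 13.4 LWTS in 76 games would lose about 4.6 runs instead of gaining 8.1, which is the whole reason the adjustment matters when comparing players across positions.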

So in the table below are the offensive linear weights (LWTS) and positional adjusted linear weights (psLWTS) for every player who has batted for the Yankees this season.

The fewer the plate appearances, the more you need to adjust for what the player can be expected to do going forward. As you can see from the list above, Derek Jeter has been the Yankees' most valuable offensive player overall, with Jason Giambi second, Alex Rodriguez third, and Jorge Posada fourth.

Position players obviously don't just contribute offensively, so the next part of the equation is trying to quantify a player's defensive value. The system I'm using is based on zone rating.

Zone rating is a decimal from 0 to 1 that represents the share of balls hit into a player's zone that he turned into plays. Think of it like this:

Zone Rating = Plays Made/Plays Available

Unfortunately, Plays Available is not publicly recorded. However, thanks to the hard work of Chone Smith and Chris Dial at Baseball Think Factory we can estimate it by using a proxy of the plays made. How do we do that?

For 1B, you can either use a formula which breaks down as Assists/2 + Putouts/6, or you can use the historic average for 1B, which is that the average 1B would see 281 chances in their zone over 1440 innings.

Divide 281 by 1440 and multiply it by the innings played and you can approximate how many plays the first baseman you are looking at had available to them. This will not be exact but since Zone Rating is based on the actual plays made in zone, it will give us a good enough approximation.
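As a quick sketch of the innings-based estimate for first basemen (the function name and the 720-inning example are made up for illustration):

```python
AVG_1B_CHANCES = 281        # average in-zone chances for a 1B, per the post
FULL_SEASON_INNINGS = 1440  # innings over which that average is measured


def estimate_1b_plays_available(innings):
    """Scale the historic 1B chance total to the innings actually played."""
    return AVG_1B_CHANCES * innings / FULL_SEASON_INNINGS


# A hypothetical first baseman with half a season of innings.
print(round(estimate_1b_plays_available(720), 1))  # 140.5
```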

For all other infielders, we treat assists as a play made, so you just divide the assists by the zone rating to get an estimate for plays available.

For outfielders, putouts are treated as a play made, so putouts divided by zone rating give us an estimate for plays available.

Chone's method for evaluating a catcher's defense is available in his article, as well as the details of how he adds in double play ratings for middle infielders and OF assists and errors.

So, we have an estimate of plays available and plays made for each player. We then compare this to what the average defender is doing at the same position, to come up with the number of plays that the player made or did not make compared to what an average player would have if faced with the same amount of opportunities. We do that by multiplying the plays available for the individual by the league average zone rating, and then comparing the individual's plays made to what the average plays made would be. This is then multiplied by the average run value for a play at that position to get a run value for the player's defense.
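The steps in the last few paragraphs can be sketched as two small functions. This is only an outline of the range portion under the assumptions described above (assists as plays made for infielders, putouts for outfielders); it leaves out catchers, double plays, arms and errors, and the function names are hypothetical.

```python
def plays_available(plays_made, zone_rating):
    """Back out plays available from plays made and the player's zone rating."""
    return plays_made / zone_rating


def range_runs(plays_made, zone_rating, league_zr, runs_per_play):
    """Runs above (+) or below (-) an average defender given the same chances."""
    available = plays_available(plays_made, zone_rating)
    expected = available * league_zr  # plays an average defender would make
    return (plays_made - expected) * runs_per_play


# Hypothetical shortstop: 200 assists and a .850 zone rating, compared to
# a .837 league average at 0.753 runs per play.
print(round(range_runs(200, 0.850, 0.837, 0.753), 1))  # about 2.3
```

Note that this version carries full precision through every step, so it can differ by a few tenths of a run from a hand calculation that rounds plays to whole numbers along the way.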

If you want to play around with this yourself, here are the average Zone Ratings at each position through the All Star Break.

AL
1B 0.834
2B 0.825
3B 0.764
SS 0.837
LF 0.858
CF 0.876
RF 0.875

1B 0.842
2B 0.812
3B 0.787
SS 0.835
LF 0.854
CF 0.866
RF 0.879

And here are the average opportunities and the run values for plays made/not made at each position from Chris Dial's article.

POS AvgZROps Runs/play
1B 281 0.798
2B 507 0.754
3B 430 0.800
SS 532 0.753
LF 348 0.831
CF 462 0.842
RF 365 0.843

AvgZROps = Average plays at that position over 1440 defensive innings
Runs/play = Run Value of a play at that position

Let's run through this with an example.

Bernie Williams has a Zone Rating this season of .784 in RF and he's made 76 putouts. So we divide 76 by .784 to get an estimate that Bernie has had 97 plays hit into his zone. The average RF in the AL has a ZR of .875, which means that over the same 97 plays Bernie saw in RF, an average RF would have made 85 plays. This means Bernie made 9 fewer catches than an average RF. This seems to match my personal observations pretty closely.

The average run value of each missed catch in RF is estimated as .843 runs, since it counts as an out not recorded as well as a hit that could be a single or an extra base hit. So you multiply the -9 by .843 and you get a defensive hit of -7.6.

Next, I compare Bernie's "arm" to that of the average RF. The average RF has 76 assists in 10898 innings, or .006974 assists/inning. Bernie has 1 assist in 326 innings, or .003067 assists/inning. That means we are estimating Bernie's arm to have had a negative impact so far of 1.3 runs. Bernie's error rate is slightly better than average, which brings his overall value in RF so far to -8.7.

Since there are too many numbers and words in this post, let's break this up with a graphical representation.
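For anyone who would rather replay the Bernie arithmetic than take it on faith, here is a rough sketch that follows the same steps, rounding plays to whole numbers as the text does. The variable names are made up; the error component is left out because the post does not give the inputs for it.

```python
putouts, zone_rating = 76, 0.784
league_zr, runs_per_play = 0.875, 0.843  # AL average RF ZR, runs per RF play

# Range, rounding to whole plays at each step as in the text.
available = round(putouts / zone_rating)            # 97 plays in his zone
average_made = round(available * league_zr)         # 85 for an average RF
plays_vs_avg = putouts - average_made               # -9
range_runs_rounded = plays_vs_avg * runs_per_play   # about -7.6

# Same calculation without the intermediate rounding.
range_runs_exact = (putouts - putouts / zone_rating * league_zr) * runs_per_play

# Arm: assists compared with an average RF over the same innings.
league_assist_rate = 76 / 10898                     # about .006974 per inning
bernie_innings, bernie_assists = 326, 1
expected_assists = league_assist_rate * bernie_innings
assists_vs_avg = bernie_assists - expected_assists  # about -1.27

print(available, average_made, plays_vs_avg)
print(round(range_runs_rounded, 1), round(range_runs_exact, 1))
print(round(assists_vs_avg, 2))
```

Two things stand out. Skipping the whole-number rounding gives roughly -7.4 rather than -7.6, so the last decimal should not be taken too seriously. And Bernie comes out about 1.27 assists short of average, which lines up with the 1.3 run figure only if each assist is worth about one run; that conversion is an inference here, not something stated above.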

There are problems with zone rating, and it's only fair to mention them here. First of all, it does not consider the speed of a batted ball. A player on a team with a lot of bad pitching may appear worse than they are if they are being asked to make difficult plays more often than the typical player would. It also does not include certain plays such as popups and it does not factor in positioning. Also, we are reverse engineering and estimating some of the numbers we are working with here, which is going to introduce error bars that need to be considered. I will also point you to an article that Mike K linked in an earlier entry, which lists some other concerns regarding ZR.

I still like this system because it's fairly easy for me to run and for the most part it matches my perceptions, but it's not perfect and we should not take it as an absolute by any stretch.

So, with all disclaimers issued, here are the defensive breakdowns for your 2006 Yankees.

Since the subject of Derek Jeter and defense comes up fairly often, you can see by the numbers above that Jeter has not made 8 plays that an average SS would have been expected to make. This may seem like a lot, but it's less than one play a week. When we say Jeter is a bad defensive player, it's not that he's horrendous and missing dozens of balls a week. He just misses a play or two every two weeks that an average SS would have made, and it's not necessarily that obvious to the naked eye.

So, adding it all up, you get the chart I posted yesterday, broken down like this. By request from an anonymous poster, I've added three columns to show how the current performance would break down over 150 games. Again, players with small amounts of playing time will need to have their expected performance adjusted accordingly. For example, it's highly unlikely that Nick Green would be more valuable offensively than Alex Rodriguez over 150 games. Hideki Matsui's hitting was below par so far this season too, and if he is healthy and can return he'd probably be expected to be closer to his form from 2003-2005. Melky's offensive and defensive stats are not very impressive right now, but he's a 21 year old who's seemingly improving every day in some ways, so I'd like to think that he would do better going forward as well.
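The 150-game columns are a straight pro-rating of what a player has done so far. A minimal sketch, with a hypothetical function name and Posada's 21.5 positional adjusted runs in 76 games as the input:

```python
def prorate_to_150(value, games, target_games=150):
    """Scale a run total to a 150-game pace. No regression is applied."""
    if games <= 0:
        raise ValueError("games must be positive")
    return value * target_games / games


print(round(prorate_to_150(21.5, 76), 1))  # 42.4
```

This is a pace, not a projection. The smaller `games` is, the more a hot or cold stretch gets blown up, which is why the small-sample players need to be discounted by hand. The choice of what counts as a game (appearances or starts) also moves the result quite a bit for part-time players.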

One note: I messed up Andy Phillips's position adjustment yesterday, as I gave him too much of a penalty for games played. He has only started 39 games despite appearing in 67, so I adjusted that accordingly, which gives him a few runs back on offense and a few more runs of defensive value if he were to play in 150 games.

And there you have it. Obviously this doesn't include factors like clutch hitting and other things which we cannot easily measure, and those things should not be discounted. On the whole though, this should give you some idea of why the Yankees are where they are right now and who the chief contributors are both positively and negatively.

A few things which should be noted from the chart above. Despite his bad offensive numbers, Miguel Cairo has been a good defender at almost every position he's played. In fact, he's been good enough defensively that his overall value is just about average. I realize it's hip to focus on something like OBP and decree Cairo as a problem, but if you look at these numbers and the totality of his contribution, he's a fine bench player. Pro-rating Terrence Long to 150 games really hammers home how horrible he was, doesn't it?

Tomorrow, the pitchers!