April 10, 2006
Win Contributions by Larry Mahnken
The last thing the world needs is a new offensive statistic, but I wanted to share something I've been playing around with the last couple of days. It's not meant to be predictive, it's just meant to measure value in a different way than other statistics.
The statistic is based on these ideas, which may be wrong, because I'm not a pro at this:
1) The run-value of an event is variable depending on the Base-Out State when it occurs. This, I believe, is a fairly non-controversial statement. A grand slam is the same event as a solo homer, but it's obviously worth more runs. A single with a runner on is worth more than one with the bases empty, and a single with the bases empty is worth the same as a walk.
2) All runs in a game have the same value, regardless of when they were scored. The first run of a game and the 15th are worth the same, though each is worth less than a run scored in a 6-run game.
3) The ultimate value of an event to a team is dependent on the ultimate outcome of the game. A run scored in a loss is worthless, while a run scored in a win is valuable. (edited to make more sense. I hope the rest of this still makes sense with the edits)
The final two statements, I believe, are the controversial ones. Subscribers to the Game-State theory of value (pioneered by the Mills brothers) believe that a run that happens late in a close game is worth more than one that happens earlier in the same game, and that tack-on runs are worth progressively less. I don't buy this. If you score ten runs in the first it's the same as scoring ten in the ninth -- the direct impact on the likely outcome of the game at the time is different, but in the end, all other things being equal, they had the same impact on the actual outcome.
In the third statement I am making the point that the goal of a team is to win ballgames, not to score runs, and that a game can be won or lost on offense. If you score 0 runs, you'll never win, and you can always score enough runs to win. This statement holds true with pitching and defense in the opposite direction, and ultimately it can be said that you win because you score enough runs on offense and prevent enough runs on defense -- while you lose because you didn't score enough or prevent enough.
So how's this stat work? It's pretty simple.
First, I find the base-out state for every event on offense in the game (I'll explain at the end of this whole thing why I didn't do pitching and defense -- to simplify, it requires a whole lot more data that I don't have). Using Tangotiger's Run Expectancy Matrix, I find the expected runs scored for each state.
OK, here's where I made another decision I'm thinking a lot of people will disagree with. I figured what the worst possible outcome of each event was, and what the RE was for it. With nobody on and nobody out, the worst possible outcome was one out with nobody on, while with two on and no out, the worst that could happen is a triple play. Obviously there's a greater chance of an out in the first situation than a triple play in the second situation, but I made no adjustment for that. I'm not sure if I should, or how to do so if I should.
The reason I did this is so there would be no negative values. I calculated the value of each event as being the difference between the RE of the outcome and the RE of the "worst possible" outcome. I also calculated the difference between the outcome and the "best possible" outcome -- which is, of course, a home run.
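The two differences described above can be sketched in a few lines of Python. This is only a rough illustration under some assumptions the post doesn't spell out: that the "RE of the outcome" means runs scored on the play plus the run expectancy of the resulting state, that the worst case retires the batter and every runner (up to three outs), and that three outs has an RE of 0. The names `event_value`, `worst_value`, `best_value` and the `TOY_RE` numbers are made up for the example; they are not Tangotiger's actual figures.

```python
from typing import Dict, Tuple

# A base-out state: (bases, outs). Bases are a 3-character string for
# first/second/third, e.g. "110" = runners on first and second.
State = Tuple[str, int]
REMatrix = Dict[State, float]

EMPTY = "000"

# Made-up placeholder values for illustration only -- NOT Tangotiger's matrix.
TOY_RE: REMatrix = {
    ("000", 0): 0.50, ("000", 1): 0.25, ("000", 2): 0.10,
    ("100", 0): 0.90, ("100", 1): 0.50, ("100", 2): 0.20,
}

def re_of(re_matrix: REMatrix, bases: str, outs: int) -> float:
    """Run expectancy of a state; assumed to be 0 once the inning is over."""
    return 0.0 if outs >= 3 else re_matrix[(bases, outs)]

def worst_value(re_matrix: REMatrix, bases: str, outs: int) -> float:
    """Assumed worst case: batter and all runners are out, capped at 3 outs."""
    runners = bases.count("1")
    worst_outs = min(3, outs + 1 + runners)
    return re_of(re_matrix, EMPTY, worst_outs)

def best_value(re_matrix: REMatrix, bases: str, outs: int) -> float:
    """Best case is a home run: everyone scores, bases empty, same outs."""
    runners = bases.count("1")
    return (runners + 1) + re_of(re_matrix, EMPTY, outs)

def event_value(re_matrix: REMatrix, before: State, after: State,
                runs_scored: int) -> Tuple[float, float]:
    """Return (value above worst outcome, shortfall below best outcome)."""
    bases, outs = before
    after_bases, after_outs = after
    outcome = runs_scored + re_of(re_matrix, after_bases, after_outs)
    value = outcome - worst_value(re_matrix, bases, outs)
    shortfall = best_value(re_matrix, bases, outs) - outcome
    return value, shortfall

# A leadoff single: bases empty, nobody out -> runner on first, nobody out.
value, shortfall = event_value(TOY_RE, ("000", 0), ("100", 0), runs_scored=0)
```

With the toy numbers, the leadoff single comes out at about 0.65 above the worst outcome and about 0.60 below a home run. Neither figure can go negative, which is the point of using the worst outcome as the baseline.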
OK, so the next step is to add up the "value" of every event for the team, as well as the total shortfall from the best possible value. You then add up the same two totals for each player.
If the team wins, then each player's "Win Contribution" is his share of the team's total value (this is why I set the baseline as the worst possible outcome -- so the lowest possible contribution is 0). If they lose, each player's "Loss Contribution" is his share of the team's total runs below the best possible outcome.
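Here is a minimal sketch of that last step for a single game, assuming each player's totals have already been added up as a pair of (value above worst, shortfall below best). The function name `game_contributions` and the two batters are hypothetical, invented for the example.

```python
from typing import Dict, Tuple

def game_contributions(player_totals: Dict[str, Tuple[float, float]],
                       team_won: bool) -> Dict[str, float]:
    """Split one game among the hitters.

    player_totals maps a player to (value above worst, shortfall below best).
    In a win the shares come from the value totals; in a loss they come
    from the shortfall totals.
    """
    column = 0 if team_won else 1
    team_total = sum(totals[column] for totals in player_totals.values())
    if team_total == 0:
        # Degenerate case (nothing to divide up): give everyone zero.
        return {name: 0.0 for name in player_totals}
    return {name: totals[column] / team_total
            for name, totals in player_totals.items()}

# Hypothetical one-game totals for two batters (not real data).
game = {"Batter A": (1.5, 2.0), "Batter B": (0.5, 6.0)}
win_shares = game_contributions(game, team_won=True)
loss_shares = game_contributions(game, team_won=False)
```

Since the shares in any one game add up to 1, a season's worth of Win Contributions should add up to the team's win total, and the Loss Contributions to its loss total.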
It's pretty simple, though I'm not yet sure how well it works. I've only run it for the Yankees for the first six games, and here are the totals:
Player            Wins   Losses
Jorge Posada      .356   .455
Hideki Matsui     .312   .468
Alex Rodriguez    .257   .550
Robinson Cano     .219   .380
Derek Jeter       .218   .367
Johnny Damon      .199   .344
Jason Giambi      .158   .476
Bernie Williams   .116   .352
Gary Sheffield    .086   .537
Miguel Cairo      .048   .000
Bubba Crosby      .021   .000
Andy Phillips     .010   .000
Kelly Stinnett    .000   .072
Hopefully, at the end of the season, this will reflect which players contributed most to victories and were most responsible for the defeats. As you can see, the player currently most responsible for the Yankees' defeats is Alex Rodriguez, just ahead of Gary Sheffield, because he's made outs in so many high-RE situations. If you want to convert these numbers to a winning percentage (which is fair), you'll find that no regular has a Pct. over .500 -- which of course isn't surprising. While A-Rod has the most loss contributions, he's also contributed heavily to their wins, and his .319 Pct. is not much different from the team's .333. The regular with the worst Pct. is Gary Sheffield, who has been responsible for only about 4.3% of their wins but 13.4% of their losses.
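For anyone who wants to check the arithmetic, here is a small sketch of the conversion, assuming Pct. is simply Wins / (Wins + Losses) and that the team is 2-4 (which is what six games and a .333 Pct. imply). The function names are made up for the example. Note that the rounded table values give A-Rod roughly .318 rather than .319; the small gap presumably comes from rounding in the table.

```python
from typing import Optional

def contribution_pct(win_contrib: float, loss_contrib: float) -> Optional[float]:
    """Winning percentage from contributions: W / (W + L), or None if both are 0."""
    total = win_contrib + loss_contrib
    return win_contrib / total if total > 0 else None

def share_of_team(contrib: float, team_games: int) -> float:
    """Fraction of the team's wins (or losses) credited to one player."""
    return contrib / team_games

# Values taken from the table; the 2 wins and 4 losses are inferred from the
# six games played and the team's .333 Pct.
arod_pct = contribution_pct(0.257, 0.550)
sheffield_win_share = share_of_team(0.086, 2)
sheffield_loss_share = share_of_team(0.537, 4)
```

Sheffield's two shares come out to 4.3% of the wins and about 13.4% of the losses, matching the figures quoted in the post.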
Now here's why I didn't do pitching and defense: lack of data.
A pitcher's value shouldn't be based on the outcome except for walks, strikeouts and homers. For any ball in play, the pitcher's value should be the expected run value of where the ball was hit -- the difference between that and the actual outcome goes to the fielder. I suppose I could buy the data from BIS or STATS or something, but that would cost a LOT. I'd then have to parse the data by Base-Out state to find values for each point on the field. I'd love to have the data and time to do that, but for now let's see how nicely this stat works out, then maybe we'll go more in-depth.
--posted at 2:03 PM by Larry Mahnken