Larry Mahnken and SG's

Replacement Level Yankees Weblog

"Hey, it's free!"


The Replacement Level Yankees Weblog has moved!  Our new home is:
http://www.replacementlevel.com

Featuring:
Larry Mahnken
SG
sjohnny
TVerik
Sean McNally
Fabian McNally
John Brattain






Disclaimer: If you think this is the official website of the New York Yankees, you're an idiot. Go away.


April 17, 2006


Pythagorean Records and Forecasted Standings
by SG

Despite the Yankees' .500 record through 12 games, the team has played quite well. To put their start in perspective, I decided to run a little exercise similar to Baseball Prospectus's Adjusted Standings Report, which uses a team's runs scored and runs allowed to estimate the record it has earned based on how it has actually performed.

The heart of this type of analysis is Bill James's Pythagorean winning percentage. The definition can be found at Baseball-Reference.com:

Pythagorean winning percentage is an estimate of a team's winning percentage given their runs scored and runs allowed. Developed by Bill James, it can tell you when teams were a bit lucky or unlucky. It is calculated by

(Runs Scored)^1.83
---------------------------------------------------------
(Runs Scored)^1.83 + (Runs Allowed)^1.83

The traditional formula uses an exponent of two, but this has proven to be a little more accurate.
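For anyone who would rather see that as code, here is a minimal Python sketch of the formula quoted above. The function name and its parameters are my own labels, not anything from Baseball-Reference.com, and the exponent defaults to the 1.83 given in the definition.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=1.83):
    """Estimate winning percentage from runs scored and runs allowed.

    Pass exponent=2 for the traditional version of the formula.
    Raises ZeroDivisionError if both run totals are zero.
    """
    scored_term = runs_scored ** exponent
    allowed_term = runs_allowed ** exponent
    return scored_term / (scored_term + allowed_term)


# Hypothetical run totals, for illustration only.
print(round(pythagorean_win_pct(70, 50), 3))
```

A team that scores exactly as many runs as it allows comes out at .500, and the estimate climbs as the gap between runs scored and runs allowed widens.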


The theory is that a team's runs scored and runs allowed will balance out over the course of the season, so that you can use them to see if a team is lucky or unlucky, and how they should perform going forward.

It is probably too early to run this type of exercise given the limited opposition that most teams have played, but what the hell, it's an off day. Below are the standings through the end of the year if we assume the pythagorean theory holds true from this point forward.



In this set of standings, I've taken each team's actual record and added its projected record over the remaining games, assuming it plays to its Pythagorean record with the same rates of runs scored and allowed as it has had so far. As you can see, the Yankees have been quite unlucky so far, and the Mets could very well be the greatest team in the history of baseball. In other words, it's way too early for this to be very meaningful. And I know the Mariners are projected as playing 163 games, but it's a rounding error and I'm not in the mood to fix it.
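The projection described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the spreadsheet actually used for the standings: the function name, the 162-game default, and the choice to round wins and derive losses from the total are all mine.

```python
def project_final_record(wins, losses, runs_scored, runs_allowed,
                         season_games=162, exponent=1.83):
    """Actual record plus Pythagorean-paced results over the remaining games."""
    scored_term = runs_scored ** exponent
    allowed_term = runs_allowed ** exponent
    pythag_pct = scored_term / (scored_term + allowed_term)

    remaining = season_games - wins - losses
    projected_wins = round(wins + pythag_pct * remaining)
    # Derive losses from the season total instead of rounding them separately,
    # so the projected record always adds up to season_games.
    projected_losses = season_games - projected_wins
    return projected_wins, projected_losses


# Hypothetical 6-6 team with made-up run totals.
print(project_final_record(6, 6, 70, 50))
```

Rounding wins and losses independently is one plausible way a projected record ends up at 163 games; computing losses from the season total avoids that.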

There's no question that this early in the season, a team's runs scored and allowed could be skewed by a variety of things that would make it look better or worse than it really is. Instead of just looking at raw runs scored and runs allowed, it may be beneficial to use a statistic that corrects for random variance by taking each team's component stats on offense and defense and projecting how many runs it would be expected to score and allow going forward, to smooth out any flukes. There are a lot of different methods to do this, but the one that I like the best is Base Runs, by David Smyth. Based on Smyth's research, it has been shown to be more accurate than the better-known Runs Created, particularly on a team-wide level.

The idea here is that you are factoring out over- and under-performance in particular situations to get a more reasonable run estimate on both the offensive and defensive side going forward. The formula is in the link above, but in a nutshell you basically just combine the majority of good and bad outcomes and assign run values for each one to arrive at an estimated run value. You can use this to see if teams are doing flukishly well or poorly, and get a feel for how likely current trends are to continue.
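To make that a little more concrete, here is a sketch of one commonly cited simple version of Base Runs. Treat the coefficients as an assumption: several variants of Smyth's formula are in circulation, and this is not necessarily the one behind the numbers in this post. The function and parameter names are my own.

```python
def base_runs(at_bats, hits, total_bases, home_runs, walks):
    """Estimate runs from component stats: baserunners * score rate + home runs."""
    # A: baserunners, excluding home runs (which always score).
    baserunners = hits + walks - home_runs
    # B: advancement factor (coefficients are from one simple published variant).
    advancement = (1.4 * total_bases - 0.6 * hits
                   - 3 * home_runs + 0.1 * walks) * 1.02
    # C: outs, approximated as at-bats minus hits.
    outs = at_bats - hits
    # D: home runs, counted as automatic runs.
    return baserunners * advancement / (advancement + outs) + home_runs


# Hypothetical season-long team totals, for illustration only.
print(round(base_runs(5500, 1500, 2400, 180, 550)))
```

Running the same function on what a team's pitchers and defense have allowed gives an estimated runs-allowed figure, and the two estimates can then be fed into the Pythagorean formula in place of actual runs.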

As you would expect with a system that corrects for anomalous performances, the numbers tend to approach a more realistic level, as you can see below.



Again here, I am calculating the teams' expected records over their remaining games based on their Base Runs scored and allowed and adding that to their actual record to arrive at projected final standings.

All I would take out of this is that the Yankees should be OK, despite their .500 record so far, as long as they can keep performing at a similar level and stay healthy. If you're a Royals fan, get ready for Chiefs training camp. I'll also go out on a limb and say the AL West winner will win more than 65 games.

I'll keep an eye on this as the year moves on and post about it on occasion, because I know many of you can't get enough statistics.

The Yanks are getting ready for a two-game set in Toronto starting tomorrow. In the first game, it'll be Randy Johnson vs. Gustavo Chacin, who went 0-4 with a 5.32 ERA in four starts against the Yanks last year. The Yankees will face two lefties, Chacin and Ted Lilly, so I'm sure Joe Torre will aggravate us all by starting Miguel Cairo at first at least once. Obviously, a win would be nice, but I think it'll be more important to see that Johnson is healthy after leaving his last start after just five innings and 87 pitches for precautionary reasons.