I might as well revisit this now, especially in light of Zack Greinke's amazing start. A little historical perspective never hurt anyone.
In my early days as a seat-filler at Driveline Mechanics, I was learning MySQL basics (thanks to Colin Wyers' series at Statistically Speaking) and, in an effort to get people here to read my stuff, would occasionally post "supplementary" pieces on the Royals so that readers would go look at the explanations over there. One product of this was my anticlimactic (for me, anyway) posting of a list of (Perhaps) the 20 Most Valuable Single Season Performances by Royals Starting Pitchers, which used a simple but fairly accurate version of WAR (Wins Above Replacement), averaging a pitcher's FIP and ERA to get RAR and WAR.
Then I was the student. Now I am the master (of boring posts no one cares about... don't worry, I'll come up with Stay in the Bullpen: Chuck and the 2009 Royals when I have time). Read on for the details of my latest wanna-be-sabermetric venture into Royals' history.
So it doesn't get lost in the mess:
Skip-able Biographical Background
I'll try to avoid going too much into technical details. I use Wins Above Replacement (WAR) as my "unit" of comparison. For a brief introduction to different ways of evaluating pitchers, the meaning and use of "replacement level," and more, check out my earlier Thoughts on Pitcher Value, which gets at the basic concepts. More developed concepts on using Pythagorean winning percentage estimates and stuff can be found in this extension on pitcher value, which was then developed further and used in conjunction with projection systems to rank the MLB starting rotations by WAR.
I've wanted to get back to doing the history stuff for fun, but while I know how to crudely apply park factors, I couldn't get a "big" or "complete" set that was easily importable into MySQL (although TucsonRoyal did help me out, I was too lazy to alter the tables -- thanks, TR, it's on me). In the course of another project involving park adjustments for wOBA/linear weights, I got a complete set of 5-year park factors for 1871-2008 from terpsfan, who is also awesome. Not only will that help with my upcoming posts on offense at Driveline, but I realized I could use them for this sort of stuff, so away we go.
Boring Methodology Stuff Worth At Least Skimming
Here is the list. Maybe before or after looking, you should read the following to make sense of what these numbers mean. Again, reading my earlier Driveline posts linked above will also help, and although I'm willing to try to answer any questions (or other) you have on how I came up with these numbers, the answers you're looking for might be found there. Other stuff that helped me out can be found in my list of links for aspiring nerdlings.
- FIP: I decided to use FIP over ERA or RA for reasons Gopherballs goes into here. Note that my numbers don't exactly match those at sites like FanGraphs or The Hardball Times for various reasons -- people use slightly different formulae for FIP in terms of dealing with IBBs, HBPs, "scaling" to lgERA/RA, IPs, etc. So don't worry, it's OK. Note that FIP sometimes isn't as harsh to horrible pitchers (or their horrible seasons) as you'd expect, since it assumes a basic competence on balls in play, etc. That's not a problem at the top of the list, which is my main focus, but since for fun I included 182 seasons (see below), you might wonder why certain pitchers did as "well" as they did. I know that FIP has its limits relative to RA, but in general I think it's better than RA. I actually have the BaseRuns SQLs all set up and tested, but I'm not quite comfortable with everything there yet, so, yeah, another version of this might be coming soon. Contain the excitement.
- Scaling: While the listed FIP is the usual scaled-to-ERA version, in calculating value I scaled to RA, which is a more accurate way of judging pitcher value relative to position players, etc., just as RA is better than ERA (other than the fact the "earned run" rule is stupid). To avoid clutter, I don't have that column in the spreadsheet table, just a familiar ERA-scaled FIP.
- Park Adjustments: I park-adjusted the FIP scaled to RA using terpsfan's 5-year park factors (similar to Patriot's).
- Win%: This is probably the best way to judge a pitcher's "rate" of performance against the league average of the season, e.g., a 4.00 FIP is much more impressive in the 1998 AL than in the 1980 NL. In the same way that we use the Pythagorean process to get a team's expected winning percentage, we can do it with a pitcher, using his RA (or FIP-RA, in this case) as "runs allowed" and the league average RA as "runs scored" to see what the pitcher's "support neutral" winning percentage would be against an average pitcher and offense. I use the more accurate PythagenPat.
- RAR: Runs (saved) Above Replacement. Remember that value above replacement is rate of performance over replacement level times playing time. Replacement level winning percentage for starting pitchers, at least in the modern era, is set by Tom Tango at .380. To adjust for the relative difficulty of the AL and NL over the past two decades or so, we set replacement level at .370 for the AL and .390 for the NL. But when comparing different eras of the Royals, what is fair, since we aren't comparing different leagues, but eras, and value relative to era rather than "how would this guy fare if he pitched today" or whatever? I finally settled on .370 for 1991 on, and .380 for previous seasons. This is not the most precise way of doing it (one should really adjust replacement level specifically by league and decade, but I'm not there yet).
- WAR: This is the big one, since this is how I ranked the seasons. Why WAR instead of RAR? First, across history, runs created/saved aren't equal. Each run is more valuable in a low run-scoring environment than in a high one. This is accounted for in both position player and pitcher WAR calculations. However, pitchers (especially starters) have an effect on their own run environment -- so, as others (like Rally and FanGraphs and more) do, I use a dynamic run-to-win conversion to account for the fact that, e.g., pitchers like Pedro 1999, Greinke 2008, and Lima 2005 dramatically affect the value of each run scored/prevented when they are pitching.
- Keep in mind that this is about value relative to season, not "ability," which is much more nebulous. So, yeah, pitchers who were allowed to pitch 250+ innings had more opportunities to be valuable than most pitchers today. That's the way it is. When you pitch more innings, of course, your performance in each inning is affected, etc. Keep in mind that I've done things with replacement level, run-to-win conversion, and win% estimates to adapt to the lower/higher run environments. I think this is a fair list. Keep in mind that despite all my efforts, something like RAR/WAR isn't that precise. Rounding to two decimal places is something I went back and forth on, as it gives an illusion of precision, but I finally went with it just to get a decent sort of all the seasons around 4.3/4.4.
- TucsonRoyal posted career numbers from Rally's awesome site recently. This should not be taken as "better" than Rally's numbers or in competition with him in any way. He uses RA and adjusts for the quality of defense and whatnot, which is awesome, and probably more accurate in a way, but I wanted to do my own thing here, and don't have his capabilities. And I'm a Royals fan, so you know that I'm unbiased.
- I used the awesome free database from Baseball Databank. As far as I know, while it does give number of games started and games played for each pitcher, it doesn't say how many innings were pitched in starting and relieving roles, and in some cases (some of Tom Gordon's seasons in the 1990s, other players in the 1970s) this makes a difference, since the value of being slightly above average in terms of FIP is much different for starters and relievers. To get around this, I arbitrarily set my minimum games started at 8, and used a minimum proportion of games started to overall appearances. Yes, it's arbitrary, and I know some relief appearances slipped in there, but it shouldn't make a huge difference. If you don't like it, well, then there's not much I can do about it. I know that sabermetrics isn't a popularity contest; I just want people to like me.
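For anyone who would rather see the chain from FIP to WAR as code than as bullet points, here is a rough Python sketch of the steps described above. It is not my actual SQL. The function names, the 0.287 PythagenPat exponent, and the shortcut of turning win% over replacement into wins via IP/9 are assumptions for illustration, and the FIP constant is left as a parameter because, as noted, everyone scales it a little differently. Park adjustment and the ERA-to-RA rescaling are left out entirely.

```python
def fip(hr, bb, hbp, k, ip, constant):
    """Common FIP form; `constant` scales it to league ERA (or RA).

    Handling of IBB/HBP varies by source, so treat this as one variant.
    """
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


def pythagenpat_win_pct(pitcher_ra9, league_ra9, exponent_power=0.287):
    """Support-neutral win% of a pitcher against a league-average opponent.

    The pitcher's runs allowed per 9 (FIP scaled to RA here) plays the role
    of "runs allowed"; the league RA per 9 plays the role of "runs scored".
    The exponent depends on the run environment the pitcher himself creates,
    which is what makes the run-to-win conversion dynamic.
    0.287 is an assumed value; other values near it are also in use.
    """
    run_env = pitcher_ra9 + league_ra9      # total runs per game
    x = run_env ** exponent_power           # PythagenPat exponent
    return league_ra9 ** x / (league_ra9 ** x + pitcher_ra9 ** x)


def replacement_win_pct(year):
    """Replacement level used in this post: .370 from 1991 on, .380 before."""
    return 0.370 if year >= 1991 else 0.380


def wins_above_replacement(win_pct, ip, year):
    """Rate over replacement times playing time (IP / 9 = game equivalents)."""
    return (win_pct - replacement_win_pct(year)) * ip / 9.0


if __name__ == "__main__":
    # Made-up season line, purely for illustration (not a real pitcher).
    season_fip_ra = fip(hr=15, bb=50, hbp=5, k=180, ip=210.0, constant=3.5)
    wpct = pythagenpat_win_pct(season_fip_ra, league_ra9=4.8)
    print(round(season_fip_ra, 2), round(wpct, 3),
          round(wins_above_replacement(wpct, 210.0, 2008), 2))
```

A pitcher whose RA matches the league comes out at exactly .500, which is a handy sanity check before pointing anything like this at the whole database.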
Some Quick Looks
Here is that spreadsheet again. I hope people will find this interesting enough to find their own stuff to talk about. For reference, here is what I came up with for the top 10 seasons by WAR for Royals starters:
For comparison's sake, here are some other (somewhat random) Royals seasons of interest:
I'm not sure I'm ready to post my "greatest pitching seasons of all time" stuff yet, but for more perspective, here are some great post-1990 seasons from anyone: the ten best, followed by a few others chosen at random.
I've never really liked Johnson or Schilling. Or Clemens, for that matter, so that was fun... remember that this last list is only from 1991 on.
Well, there's another post that's 10 times as long as I wanted it to be. I hope someone finds it interesting.