I'm sure you all remember my award-winning CHONE-based projections for the 2009 Royals, built with a spreadsheet from Beyond the Box Score, that came out long ago (and by long ago, I mean earlier this week). Since then, Sky Kalkman has "perfected" his spreadsheet and started Beyond the Box Score's Community Projection Project. Yours truly offered to "coordinate" the Royals projections over here. I hope that's cool, 'cause it's too late. The idea is that the nerdy portion of each team's internet fanbase (us) fills out its team's tab of this massive EditGrid spreadsheet (click the tabs at the bottom to get to KCA -- you'll have to "arrow over" pretty far to the right). As before, I've filled in the basic values, but this time, I didn't get too crazy and specific. I just went with some very lightly "contoured" playing time predictions and stuff.
Why? Because this isn't my personal projection, but a community projection that I hope we can all come together and participate in to some extent, despite some recent disagreements and what-not.
So what do we do?
To repeat what I said in my earlier post, I simply entered CHONE's projected offensive and pitching rate stats, available both from CHONE's own site and from Fangraphs. I used CHONE as my starting point because it's free and easily accessible to everyone, with a level of detail not yet available from any other free projection system, and it's completely out now. That doesn't mean we can't incorporate stuff from other projection systems as they come out (PECOTA and ZiPS are excellent, but neither is fully out yet, and PECOTA is only properly accessible to Baseball Prospectus subscribers). I hope that we also get input from those systems when they come out with their Royals projections... but more on that in a minute.
Anything is open to question and stuff, but here are a few things I've already done that I think are pretty good so far, so you don't have to look at them (although I'm always open to suggestions) and we don't waste too much energy nitpicking:
- Baserunning: I basically used Baseball Prospectus's Baserunning Stat (EQBRR) to generate a projection for the baserunning numbers. I subtracted the steals portion, since that is already included in wOBA. In general, I tried to be very conservative with this part in particular, since BP only has two years of data, and it isn't clear how to project this stuff. So the range is generally no greater than +1 or -1 run (+0.1 or -0.1 win) per player (Crisp has outstanding baserunning numbers).
- Defense: As before, I first went through and input CHONE's projections, which not everyone likes. Defense is very difficult to project, especially for players with fewer than 3 full years of data. So after I put in all of CHONE's projections, I went through and checked each player's recent years of bUZR stats. If there was a major discrepancy, I would change things, sometimes with reference to the Fans Scouting Reports. When in doubt, I moved closer to league average (0). Thus, as I described in my other post, despite Alex Gordon's dreadful (-9/150) projection from CHONE, I elected to have him as "average" because all the good PBP I've seen has him as substantially above average in 2007, and below average this season. His "career" bUZR/150 is +2.5. The Fans Scouting Report also has him well above average in 2007 and average in 2008. When in doubt, go to the average.
- Positions: I might need more help here than I think, but in general, I wanted to keep it simple, so I didn't list utility guys like Willie Bloomquist or Esteban German at all their possible positions, just both at 2B. Yeah, the defense will be different for each, but I think it evens out in the end. Mark Teahen is another issue, but I'll address that down below. Suggestions here are appreciated.
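For the spreadsheet-averse, here's a rough sketch of the baserunning bookkeeping described above. It is not how the spreadsheet actually works: the function names and the hard cap are my own assumptions (the real numbers were eyeballed, and a guy like Crisp can sit outside the usual range), and the 10-runs-per-win conversion is just what the "+1 run (+0.1 win)" figure implies.

```python
# Illustrative sketch only: the names and the hard cap are assumptions,
# not the spreadsheet's actual logic.
RUNS_PER_WIN = 10.0  # implied by "+1 run (+0.1 win)" above


def conservative_baserunning(non_steal_runs, cap=1.0):
    """Clamp a non-steal baserunning estimate (in runs) to [-cap, +cap]."""
    return max(-cap, min(cap, non_steal_runs))


def runs_to_wins(runs):
    """Convert runs to wins at the assumed 10 runs per win."""
    return runs / RUNS_PER_WIN


if __name__ == "__main__":
    raw = 3.2  # hypothetical raw estimate, steals already subtracted
    capped = conservative_baserunning(raw)
    print(capped, runs_to_wins(capped))
```

The point is just that the baserunning column is kept small on purpose, so an argument over one player's number moves the team total by a tenth of a win at most.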
So What Do We Do Now?
The place where the "human element" is most needed is what some might call the "subjective" element (pictured left) of a team's projection, or more accurately, the areas where statistics can't help us (I reject the "not stats" = "not objective" on dorky philosophical grounds). What I've entered so far are, first, the players that I think will start, and then an assortment of bench players that seem to be on the roster at the moment. Part of a good projection is that it hits a "midpoint" between the best and worst -- it's not a personal prediction. In this case, getting more "wins" for the projected Royals isn't really the goal. Indeed we're going to be "wrong," more likely than not. The goal is to be as close to right as possible. So at this point, we have to prepare for all possibilities, good and bad, and choose a midpoint. That's why CHONE's "main" projections are his 50th percentile projections, and why I started with them. But we'll get into what to do about projections we think would be better in just a minute...
- Who is going to be on the team, and how much will they play? As for playing time and positions, it's tough to tell, especially for bench players. After discussing with NYRoyal here and Sky over at BtBS, my thought is to balance realistic usage with optimal usage. Both NY and (to a lesser extent) Sky favor "realism," and I agree with that, but I also think that reality has a tendency to pound its way through the thickest skulls. But let's move on to some examples...
- Who is going to do what in the bullpen? Yes, due to his contract, reputation, and (ahem) pedigree, pr0f3550r Kyle Farnsworth will probably start the year as setup man. As you can see on the spreadsheet (all the pitcher projections use FIP-ERA, by the way, as per Sky's advice), he doesn't project anywhere close to the #1 reliever behind Joakim Soria, but we all know that he's going to set up for the first part of the year, at least. I think that if the projections hold, eventually Trey will look to some of the other options. While I've contoured the IPs a bit so that they reflect reality and skills somewhat, they're relatively flat. Here's where we need to do some work. Leverage makes a difference, but I also flattened that a bit, to represent the fluctuating roles behind Soria. The other question is about the back end of the bullpen. Are Jimmy Gobble and/or Joel Peralta really going to make the team? Should we just project them for minimal innings? I couldn't fit all the possibilities on there... Remember that leverage needs to average out to 1, and total bullpen innings to 505.
- What about guys who might start and relieve? As has been noted, we don't know if Carlos Rosa or (shudder) Horacio Ramirez will start or relieve, or how much of each. I've put Rosa as a #6/#7 starter, and Ho-Ram as a garbage time reliever. It would be easier just to think this balances out, but if you guys want to split their time into both, that's OK.... but I don't want to have to alter the formatting of the spreadsheet too much, if at all, to add more spaces.
- How many innings will guys pitch? I suspect there will be some disagreement here. I was, again, pretty conservative, but I still think we'll end up being "high" for some guys. Yes, Gil Meche pitched 200+ innings the last two years, and Zack Greinke pitched 200+ in 2008. But those were some fairly surprising developments, from a projection standpoint, given previous performance. And remember that a midpoint has to take in the possibility of injuries, and even a minor one can cost a starter 3 starts, and 15 innings... So I've come up with a very basic curve starting with Meche at 175 on down. That actually might be high for a starting point, but whatever. This is a community projection, not my personal one, so let's see what we can do. Remember to stay on target (940 starter innings, 505 reliever innings, as listed), and that scrubs will have to pitch at some point.
- How many PAs per batter? Yes, the projection would be more exciting if we gave Alex Gordon and David DeJesus 680 ABs, relegated Guillen to bench duty, and so on... but, well, that's a (hopeful) prediction, not a projection. Things get tricky here, again. In general, for each position, we want to "project" at just under 700 ABs (I have fewer for the catchers because they'll generally be hitting eighth or ninth). This gets a bit complicated with bench guys who might play multiple positions. I put German and Bloomquist both as 2Bs (if we think German should be "cut," that's another issue) because those are closer to their primary positions, and the worse/better defensive ratings they'd get by moving to other positions are already sort of reflected in their respective positional adjustments. Teahen's in a similar situation (which is why he seems to get so many PAs -- time at first and 3rd, as well as backing up both Guillen and DeJesus). Keep in mind that while we want the basic projections done in the next week or two, we'll keep tweaking it for roster moves until Opening Day.
- What if we don't agree with the wOBA/ERA/team projections? I'm guessing this is likely to be a sore spot with some people. But keep in mind what I said above -- this isn't a prediction, but a projection that is sort of a "baseline" for predictions of different possibilities. Player performance which deviates from a midpoint projection is almost by definition unprojectable. We don't "win" anything more in this case by projecting the Royals for a higher win total -- the goal here is relative accuracy. If you look at my probability chart from my "beta" post, you'll see that just because a team is projected for about 79 wins, that doesn't mean we think they have no chance of winning 84. In fact, on that earlier version, the 79-win projection still had a 36.5% chance of finishing .500. But I digress from player performance... I came up with an idea of how we can alter things while still staying relatively "objective" in our overall team projection (which is the real goal here).
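If you want to sanity-check a proposed innings distribution before suggesting it, something like the sketch below would do. The targets (940 starter innings, 505 reliever innings, leverage averaging out to 1) are the ones listed above. Everything else is an assumption on my part: the function name, the placeholder pitchers, every innings and leverage figure other than Meche's 175, and the choice to weight leverage by innings rather than take a simple average.

```python
# Illustrative sketch only; not the spreadsheet's actual formulas.
STARTER_IP_TARGET = 940
RELIEVER_IP_TARGET = 505


def check_pitching_plan(starter_ip, relievers):
    """starter_ip: {name: innings}; relievers: {name: (innings, leverage)}.

    Returns the distance from each innings target and the innings-weighted
    average leverage (assumed weighting), which should come out near 1.
    """
    total_sp = sum(starter_ip.values())
    total_rp = sum(ip for ip, _ in relievers.values())
    weighted_li = sum(ip * li for ip, li in relievers.values()) / total_rp
    return {
        "starter_ip_gap": total_sp - STARTER_IP_TARGET,
        "reliever_ip_gap": total_rp - RELIEVER_IP_TARGET,
        "avg_leverage": weighted_li,
    }


# Placeholder numbers: only Meche's 175 comes from the post.
starters = {"Meche": 175, "SP2": 170, "SP3": 160, "SP4": 150,
            "SP5": 140, "SP6": 80, "SP7": 65}
bullpen = {"RP1": (70, 1.8), "RP2": (70, 1.3), "RP3": (70, 1.1),
           "RP4": (70, 1.0), "RP5": (75, 0.9), "RP6": (75, 0.6),
           "RP7": (75, 0.38)}

report = check_pitching_plan(starters, bullpen)
print(report)
```

A nonzero gap means somebody's innings have to come from (or go to) somebody else, which is really the whole argument about playing time in one number.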
How I suggest we "change" individual projections we think are too high/low: Look, I know we all have at least a couple of players we think are going to do better than CHONE, ZiPS, etc. project. And for individual predictions, fine, use that. I won't repeat the "midpoint" point I've made a couple of times. I think it goes without saying that we're more likely to have an accurate community projection if we stick close to the midpoints and just try to get playing time/leverage distributions right. Having said that, maybe we think the CHONE midpoints I posted for wOBA (FIP for pitchers, remember -- on CHONE's site he just does ERA, but FIP for the 50th percentile is carried at Fangraphs) are simply too high/low for some players. My suggestion is twofold:
- If you think that a player is projected too low, you can argue it, but the argument has to rest on firm statistical evidence from another top-flight projection system (ZiPS and, of course, PECOTA are what I'm thinking here). Indeed, I hope that people will take a look at those projections whenever possible (they aren't out yet for the Royals, at least). If you can make a case for it on those grounds (and no using the 80th percentile PECOTA projection to refute the 50th percentile CHONE projection), and most people agree, I'll calculate the wOBA/FIP if necessary and substitute that projection in.
- But what if none of the projection systems' 50th percentile is what we "know" a player's most likely performance will be? How can we balance our homer-ific wisdom with a reasonable projection? I came up with this: I'll give us one "mulligan," that is, one player can be "bumped up" to something like his 60th percentile CHONE projection (I would suggest young Mr. Greinke, but it's up to the group). From there, I suggest that for every player we "bump up" to his 60th (or maybe 70th) percentile, we have to "bump down" another player with similar playing time to the 40th percentile (or however many "notches" correspond to the guy who is bumped up). Does that make sense? So no "hey, let's bump DDJ up to his 70th percentile, oh, okay, I guess we have to bump Brayan Pena down to his 30th..." That way, we can put character, or a little garlic, as Vinnie Colaiuta might say, into our vanilla projections and process without skewing it too badly. Maybe I can even post polls for promotions/demotions if that would work...
Sound good? I hope I'm not making this more complicated than it is, and that it can be fun and bring people who may have been arguing about too many things together. The link to the EditGrid is here, just browse to the KCA tab. It is also down below. Let me know if you have any questions, and let the Great Royals Review/BtBS/KCA Projection Debate of 2009 Begin (and hopefully we can rationally get the vanilla projection beyond 77.3 wins)!
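To make the bump-up/bump-down rule concrete, here is a minimal sketch of the fairness check I have in mind. Treat it as one possible reading, not a binding rulebook: the function name, the 20% "similar playing time" tolerance, and the plate appearance numbers in the examples are all made up for illustration.

```python
# Illustrative sketch only: the tolerance and PA figures are assumptions.
def pair_is_fair(up, down, pa_tolerance=0.2):
    """Check one bump-up/bump-down pair.

    up and down are (percentile, projected_pa) tuples. The pair is fair when
    the bump down is as many percentile points below 50 as the bump up is
    above it, and the two players have similar playing time.
    """
    up_pct, up_pa = up
    down_pct, down_pa = down
    same_size = (up_pct - 50) == (50 - down_pct)
    similar_time = abs(up_pa - down_pa) <= pa_tolerance * max(up_pa, down_pa)
    return up_pct > 50 and same_size and similar_time


# A regular bumped to his 70th, "paid for" by a backup at his 30th: not fair.
print(pair_is_fair((70, 650), (30, 200)))
# Two regulars with similar playing time, one notch each way: fair.
print(pair_is_fair((60, 600), (40, 560)))
```

The one "mulligan" would simply be a single bump up that is exempt from this check.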
Note: The spreadsheet is set so that only the "coordinators" for each team (I'm the one for the Royals) can edit it. It's just simpler that way -- once we come to a consensus on each change, I'll make it.
Another Note: People should jump in with whatever comments they want to make, but I think I might start some "sub-threads" below, say, each day, so that people can contribute to a specific discussion...
Update, January 20, 5:11 PM EST: While we now have a "reasonable" and "objective" projection of 77.5 wins, this process hasn’t been quite as, um, "fun" as I’d hoped. I thought there’d be more angry criticisms of CHONE’s projections, or something. So, with a generous "thank you" to Colin Wyers, I’ve done something else… I’ve used a prototype of Colin’s form so that you can submit your own projections to me! Click here to see and use the form. Right now, I only have one for offense. Although you’re "required" to fill in most of the important counting stats, if you’re more comfortable with "projecting" rate stats for each player, fill in those, too, and I’ll make those the "priority" for your player. This is mostly meant as a way to liven up this projection process. The more entries I get, the more weight I’ll give them. Basically, then, I’ll do some crude "averaging" of this, meaning, after a few days, if we get a number of entries for a certain player, I’ll look and see which CHONE percentile the overall results are closest to, and then sort of average the 50th percentile with that. Or something. It’s all very nebulous, but I hope you guys will get into it, and also that not too many jokes will come through… If I’m missing anyone, or there’s a simple way to make it easier, let me know.
Again, CLICK HERE to enter your own projections, whether based on Marcel/CHONE/PECOTA/Dempsey/Other system/your heart/your gut