Prospect enthusiasts always want to know what a prospect's ceiling, floor, or major league comparison is. That question is almost impossible to answer and is filled with variables, but there are systems out there that look at age, skill, and level similarities to provide a quantifiable comparable for a prospect. These systems aren't perfect, but they are based on cold, hard data.
One such system is the Comparison and Likeness (CAL) system created by Joseph Werner at our sister site Beyond the Box Score. More information about Joe: he's the owner of ProspectDigest, his work has been featured on ESPN, and he's easily reachable on Twitter.
When I first read his post introducing the CAL system at BTBS (coincidentally, it was about Mike Moustakas), I was intrigued by the process. I love projection and comparables systems. I'm a big believer in ZiPS, and I think PECOTA's five-year and similarity score index (so much so that I wrote an article using it) is a great start to figuring out what a prospect may be given his similarity to other former prospects.
I reached out to Joseph/Joe/Joey on Twitter to see if he could provide me the CAL system results for Royals prospects. Not only was he very willing to give his information to me, but he did me one better and threw in a few current Royals pro bono (if you think that Royals prospect comps are "for the public good" as pro bono means...). What a swell guy.
Some of the information is still behind the scenes and if you have questions I'm sure Joe would be more than willing to answer them, but I'll let him explain his system better than I can.
What is CAL?
Before that question can be answered, let's take a look at what CAL isn't -- namely, a projection system in the traditional sense. CAL doesn't forecast an upcoming season -- or seasons -- like PECOTA or ZiPS or Oliver or any other groundbreaking well-known projection system.
So what exactly is CAL then? It's a player classification system whose singular goal is quite simple -- to provide a better context in which minor league numbers can be evaluated by finding closely related players. It's another piece to the analytical puzzle.
Taking Bill James' original similarity score formula and reworking it with a litany of differently weighted statistics, CAL searches through the database for players of the same ilk. Each player starts out with 1000 points and subtractions are made for differences in age, level of competition, and, of course, position, among many others. (Some of the statistics used are strikeout percentage, walk percentage, and home run rates for pitchers, and plate discipline, Isolated Power, and Speed Score, which was developed by James himself, for hitters.)
Ideally, a score of 980 or higher represents a potentially strong correlation between skillsets and, again, a potentially similar development path. Below 980 points, skillsets begin to differ, more so the further a score falls from that mark. One player may have more speed or less power or play a different position.
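To make the points idea concrete, here is a minimal sketch of a 1000-point similarity score for two hitter seasons. It is not CAL's actual formula: the level ordering, the penalty values, the choice of walk and strikeout percentage as stand-ins for plate discipline, and every name in the code are assumptions made up for illustration.

```python
from dataclasses import dataclass

# Minor league levels from lowest to highest (an assumed ordering for this sketch).
LEVELS = ["R", "A-", "A", "A+", "AA", "AAA"]

# Hypothetical penalties: points lost per unit of difference.
# CAL's real weights are not given in this article.
AGE_PENALTY = 10.0
LEVEL_PENALTY = 15.0
POSITION_PENALTY = 10.0
STAT_PENALTIES = {"bb_pct": 2.0, "k_pct": 1.5, "iso": 200.0, "speed_score": 3.0}


@dataclass
class HitterSeason:
    name: str
    age: int
    level: str
    position: str
    bb_pct: float       # walk percentage, e.g. 8.0 for 8%
    k_pct: float        # strikeout percentage
    iso: float          # Isolated Power
    speed_score: float  # Speed Score


def similarity(a: HitterSeason, b: HitterSeason) -> float:
    """Start at 1000 points and subtract for every difference between two seasons."""
    score = 1000.0
    score -= AGE_PENALTY * abs(a.age - b.age)
    score -= LEVEL_PENALTY * abs(LEVELS.index(a.level) - LEVELS.index(b.level))
    if a.position != b.position:
        score -= POSITION_PENALTY
    for stat, penalty in STAT_PENALTIES.items():
        score -= penalty * abs(getattr(a, stat) - getattr(b, stat))
    return score


# Two made-up seasons, same age, level, and position, slightly different stats.
player_a = HitterSeason("Player A", 21, "A+", "SS", 8.0, 15.0, 0.120, 5.0)
player_b = HitterSeason("Player B", 21, "A+", "SS", 7.0, 17.0, 0.110, 5.5)
print(round(similarity(player_a, player_b), 1))  # 991.5
```

With these invented weights the two seasons score 991.5, which would clear the 980 mark described above. Change the age, level, or position and the score drops quickly, which is the behavior the text describes.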
Basically, at its root, CAL provides a list of players with similar skillsets and conclusions can be drawn from the evidence. If a player's top five CALs all flamed out in the minor leagues, well, there's enough evidence to suggest that the player may struggle as well. It is not definitive proof, but it allows one to make a more well-educated projection.
The database has been built using FanGraphs' minor league statistics, which means the history begins with the 2006 season. This also brings me to another interesting point: it's not an expansive database; it extends just eight seasons. So as the data size continues to grow we'll have an even better understanding of CAL's potential as an analytical tool.
Now there are two things to note. First, the scores aren't based on a player's collective history; they're based on just one season at a time. For example, Player A's age-23 season matches up well with Player B's age-23 season. Second, until advanced minor league defensive data becomes available, CAL focuses only on a hitter's offensive ability. It simply uses a player's position as another filter.
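As a rough illustration of those two notes, the sketch below treats every player-season as its own record and uses position purely as a filter before ranking. The `Season` record, the `score` function and its weights, and the sample data are all hypothetical; they only show the shape of a season-at-a-time search, not how CAL is really implemented.

```python
from typing import NamedTuple


class Season(NamedTuple):
    player: str
    year: int
    age: int
    position: str
    iso: float     # Isolated Power
    bb_pct: float  # walk percentage, e.g. 10.0 for 10%


def score(a: Season, b: Season) -> float:
    """Placeholder 1000-point score with made-up weights."""
    return (1000.0
            - 10.0 * abs(a.age - b.age)
            - 200.0 * abs(a.iso - b.iso)
            - 2.0 * abs(a.bb_pct - b.bb_pct))


def top_comps(target: Season, pool: list, n: int = 5) -> list:
    """Rank single seasons against the target, using position only as a filter."""
    candidates = [s for s in pool
                  if s.position == target.position and s.player != target.player]
    ranked = sorted(candidates, key=lambda s: score(target, s), reverse=True)
    return [(s.player, s.year, round(score(target, s), 1)) for s in ranked[:n]]


target = Season("Player A", 2013, 20, "1B", 0.150, 10.0)
pool = [
    Season("Player B", 2009, 20, "1B", 0.160, 11.0),
    Season("Player B", 2010, 21, "1B", 0.150, 10.0),
    Season("Player C", 2012, 20, "SS", 0.150, 10.0),  # filtered out: different position
    target,                                           # filtered out: same player
]
for comp in top_comps(target, pool):
    print(comp)
```

Because each season is scored separately, the same player can show up more than once in a list of comps, once for each season that matches.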
Finally, each of the following examples depicts each player as if he is currently working through the minor leagues.
Is CAL a predictive tool for hitters?
Nothing is 100% definitive. However, I've run through numerous test cases that suggest that CAL can become -- and is -- a useful analytical tool for hitters. The system has shown the ability to root through a lot of the statistical mumbo-jumbo -- and all the unnecessary hype -- and sniff out some of the bigger prospect busts and surprises, including some late-blooming big leaguers as well.
Again, CAL provides the evidence to allow the user to make better educated guesses about a player by looking at his contemporaries.
As with most similarity and projection systems, we are missing a lot of real-world data that could improve our systems, such as defensive data, but Joe has certainly created an admirable system. One thing I noticed with his results is that many of the comparables that CAL provided matched those that scouts or the baseball community have offered.
Below are the CAL results for a handful of current Royals players and prospects. Joe has left notes by each player to further explain what CAL was likely thinking or what could cause a variation in the similarity.
So, by the grace of Joe, and whether you like the results or not, here are the CAL system results for the players.
|Christian Colon||21||A+||Eduardo Nunez
|This is the absolute perfect example of what I envision CAL to be:
littered throughout all of Colon's top comparisons are nothing
but fringe major leaguers and utility guys (Nunez, Getz, Barney,
Hernandez, Sogard, Sanchez). And I think it's pretty safe to assume
that we're all in agreement -- with or without the use of CAL --
Colon slides directly into that group.
REMEMBER: CAL works by looking at the evidence it provides
and making educated decisions.
|Billy Butler||20||AA||Logan Morrison
|Not only does this show proper prospect attrition rate, but look at
MLB'ers career wRC+: 103 Morrison, 113 Carp, 98 Wieters, and
129 Freeman. Ignoring this season, Butler's is 120.
Also something to point out: not one player on this list
ever topped 30 home runs. Career ISOs: .172 Morrison, .154 Butler, .168 Carp, .166 Wieters, .183 Freeman
|21||AAA||Logan Morrison (2010)
Logan Morrison (2009)
|Yordano Ventura||20||A||Andrew Bellatti
|A prime example of CAL's usefulness when it comes to pitchers:
This list is absolutely littered with quality (future) big league arms
(Montas, Hellickson, Zimmer, Archer, Cole, Paxton, Kennedy,
Odorizzi, Smyly, Wheeler).
It's important to note -- and I plan on discussing this in next
week's piece -- pitchers, especially big arms like Ventura, have
the potential to move quickly through systems, so that does impact
|Kyle Zimmer||21||A+||Yordano Ventura
|Again, the list is littered with front-of-the-rotation-type arms:
Ventura, Archer, Cole.
|Raul Mondesi||17||A||Elvis Andrus
|It's a pretty decent list, but not one player has an impact bat. I think
the Andrus comp is pretty solid, personally.
|Eric Hosmer||19||A||Lars Anderson
|Again, look at the patterns: Outside of Singleton (maybe), not
one player has developed that type of power projected for Hosmer.
Morrison (three times), Butler, Smoak, Barton, Rasmus, Belt.
Obviously, CAL wasn't too impressed by Hosmer's early work.
Logan Morrison (2009)
Logan Morrison (2010)
|Hunter Dozier||22||A+||Abel Nieves
|He was viewed as a reach in order to sign Manaea later, but
this is not an encouraging list. At. All.
|Bubba Starling||20||A||Slade Heathcott
|Toolsy outfielders who never really panned out. Not surprising.|
Michael Taylor (WAS)
|Wil Myers||19||A||Jaff Decker
|Again, look at the patterns: Harper, Buxton, Heyward, Winker, Rasmus, McCutchen, Jackson, Stanton, Jones, Bruce, Rizzo. Above-average or better big league bats. No question.
His slow start to last season in AAA coupled with a smaller
sample size skewed CAL's opinion of him in 2013.
Jay Bruce (2007)
Jay Bruce (2008)
|Patrick Leonard||20||A||Connor Narron
|Threw him in here because of his inclusion in the Myers trade.
Despite some impressive numbers this season, it's an unimpressive list.
|Miguel Almonte||20||A||Christian Binford
|It's a pretty uninspiring list, really, with the exception of Parsons,
Binford, and Ranaudo. Conclusion: back end starting pitcher or
solid relief arm.
|Sal Perez||20||A+||Francisco Hernandez
|This one was shockingly off (nothing's perfect), though Salome
was once viewed as a top backstop prospect before eating himself
out of the league.
Ramos' career wRC+ is 108. Perez's career wRC+ is 108.
|Alex Gordon||22||AA||Pedro Alvarez
|It's a solid list of names. Look at the career wRC+:
105 Alvarez, 97 Brown, 147 Braun, 113 Headley, 110 Gordon.
Obviously, Braun is the outlier, but I only had one minor league
season to work off of (2006).
|The McCutchen, Myers and Gomez comps stand out, but there's
an awful lot of fourth outfielder-types mixed in. Bonifacio had that
hamate injury last year, which saps a player's power for a while.
I think he looks like a solid league average regular, though there
is some risk involved.
|Sean Manaea||22||4||Tyler Thornburg
|The vagaries of having just 80+ innings to work with. Some encouraging names mixed in though (Arrieta and Roark).|
|Christian Binford||20||A||Tyler Herron
|A lot of back-end-type arms with the exception of Odorizzi.|
Note - I asked Joe if there were certain things he wanted to stress about the CAL system and he had this to say:
1. CAL is designed to look at a player's total production, not specifics (like avg., OBP, slug, HR, etc...)
2. CAL is a player classification system. It's up to the analyst to make the educated analysis.
I won't analyze the data myself; I'll leave that to you all in the comments. I can already guess what the Wil Myers comments will be... Interestingly, I wrote about Jorge Bonifacio as Wil Myers last August, when Bonifacio was performing well in AA, and CAL produced that result.
Again, big thanks to Joe for giving us this information. He's in the beginning stages of introducing the CAL system, which you will likely find more of over at BTBS, and we're grateful to him for letting us in on the project. Hopefully his information guides you further along in the endless pursuit of baseball knowledge.