Prospect enthusiasts always want to know a prospect's ceiling, floor, or major league comparison. That question is almost impossible to answer and is filled with variables, but there are systems out there that look at age, skill, and level similarities to provide a quantifiable comparable for a prospect. These systems aren't perfect, but they are based on cold, hard data.
One such system is the Comparison and Likeness (CAL) system created by Joseph Werner at our sister site Beyond the Box Score. More information about Joe: he's the owner of ProspectDigest, his work has been featured on ESPN, and he's easily reachable on Twitter.
When I first read his post introducing the CAL system at BTBS (coincidentally, it was about Mike Moustakas), I was intrigued by the process. I love projection and comparables systems. I'm a big believer in ZiPS, and I think PECOTA's five-year and similarity score index (so much so that I wrote an article using it) is a great start to figuring out what a prospect may become, given his similarity to other former prospects.
I reached out to Joseph/Joe/Joey on Twitter to see if he could provide me the CAL system results for Royals prospects. Not only was he very willing to give his information to me, but he did me one better and threw in a few current Royals pro bono (if you think that Royals prospect comps are "for the public good" as pro bono means...). What a swell guy.
Some of the information is still behind the scenes and if you have questions I'm sure Joe would be more than willing to answer them, but I'll let him explain his system better than I can.
What is CAL?
Before that question can be answered, let's take a look at what CAL isn't -- namely, a projection system in the traditional sense. CAL doesn't forecast an upcoming season -- or seasons -- like PECOTA or ZiPS or Oliver or any other groundbreaking well-known projection system.
So what exactly is CAL then? It's a player classification system whose singular goal is quite simple -- to provide a better context in which minor league numbers can be evaluated by finding closely related players. It's another piece to the analytical puzzle.
Taking Bill James' original similarity score formula and reworking it with a litany of differently weighted statistics, CAL searches through the database for players of the same ilk. Each player starts out with 1000 points and subtractions are made for differences in age, level of competition, and, of course, position, among many others. (Some of the statistics used are strikeout percentage, walk percentage, and home run rate for pitchers, and plate discipline, Isolated Power, and Speed Score, which was developed by James himself, for hitters.)
Ideally, players with a score of 980 or higher represent a potentially strong correlation between skillsets and, again, a potentially similar development path. Below 980 points, skillsets begin to differ, and more so the further the score falls from that mark. One player may have more speed or less power or play a different position.
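To make the "start at 1000 and subtract" idea concrete, here is a minimal sketch of a similarity score in that style. It is not Joe's actual formula: the statistics chosen, every weight, and both flat penalties are assumptions made up for illustration, and the players are invented. Only the 1000-point starting value and the 980 threshold come from the description above.

```python
from dataclasses import dataclass


@dataclass
class Season:
    """One minor league season for one hitter (all fields are illustrative)."""
    player: str
    age: int
    level: str      # "A", "A+", "AA", "AAA"
    position: str
    k_pct: float    # strikeout percentage, e.g. 18.0
    bb_pct: float   # walk percentage, e.g. 9.0
    iso: float      # Isolated Power, e.g. 0.150


# ASSUMPTION: these weights and penalties are invented for illustration.
# The real CAL weights are not given in this article.
PER_UNIT_WEIGHTS = {"age": 20.0, "k_pct": 4.0, "bb_pct": 4.0, "iso": 400.0}
LEVEL_PENALTY = 50.0
POSITION_PENALTY = 50.0
STRONG_MATCH = 980.0  # threshold quoted in the text


def similarity(a: Season, b: Season) -> float:
    """Start at 1000 and subtract a weighted penalty for each difference."""
    score = 1000.0
    for stat, weight in PER_UNIT_WEIGHTS.items():
        score -= weight * abs(getattr(a, stat) - getattr(b, stat))
    if a.level != b.level:
        score -= LEVEL_PENALTY
    if a.position != b.position:
        score -= POSITION_PENALTY
    return score


# Made-up seasons, not real players.
target = Season("Prospect X", 21, "A+", "SS", 14.0, 8.0, 0.100)
close = Season("Comp Y", 21, "A+", "SS", 15.0, 8.0, 0.100)
far = Season("Comp Z", 23, "AA", "1B", 24.0, 12.0, 0.200)

for comp in (close, far):
    s = similarity(target, comp)
    label = "strong match" if s >= STRONG_MATCH else "weaker match"
    print(f"{comp.player}: {s:.2f} ({label})")
```

With these invented weights, the near-identical season stays above 980 while the older, higher-level first baseman falls well below it, which is the behavior the 980 cutoff is meant to capture.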
Basically, at its root, CAL provides a list of players with similar skillsets and conclusions can be drawn from the evidence. If a player's top five CALs all flamed out in the minor leagues, well, there's enough evidence to suggest that the player may struggle as well. It is not definitive proof, but it allows one to make a more well-educated projection.
The database has been built using FanGraphs' minor league statistics, which means the history begins with the 2006 season. This also brings me to another interesting point: it's not an expansive database; it extends just eight seasons. So as the data size continues to grow, we'll have an even better understanding of CAL's potential as an analytical tool.
Now there are two things to note. The first is that the scores aren't based on a player's collective history; they cover just one season at a time. For example, Player A's age-23 season matches up well with Player B's age-23 season. The second is that until advanced minor league defensive data becomes available, CAL focuses only on a hitter's offensive ability. It simply uses a player's position as another filter.
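As a rough illustration of those two points, the sketch below compares one target season against a pool of single seasons, uses position purely as a filter, and keeps the top few matches. The `toy_score` function, its weights, the field names, and all of the players are hypothetical stand-ins, not part of CAL itself.

```python
def toy_score(a: dict, b: dict) -> float:
    """Stand-in for a CAL-style score; the weights are invented."""
    return 1000.0 - 20.0 * abs(a["age"] - b["age"]) - 5.0 * abs(a["k_pct"] - b["k_pct"])


def top_comps(target: dict, pool: list, n: int = 5) -> list:
    """Rank individual seasons against one target season."""
    same_position = [
        s for s in pool
        if s["position"] == target["position"]   # position acts as a filter
        and s["player"] != target["player"]      # never match a player to himself
    ]
    ranked = sorted(same_position, key=lambda s: toy_score(target, s), reverse=True)
    return [(s["player"], s["year"], toy_score(target, s)) for s in ranked[:n]]


# Made-up data: each entry is one season, so a player can appear more than once.
target = {"player": "Player A", "year": 2013, "age": 23, "position": "SS", "k_pct": 15.0}
pool = [
    {"player": "Player B", "year": 2009, "age": 23, "position": "SS", "k_pct": 16.0},
    {"player": "Player C", "year": 2010, "age": 24, "position": "SS", "k_pct": 15.0},
    {"player": "Player D", "year": 2011, "age": 23, "position": "1B", "k_pct": 15.0},
    {"player": "Player B", "year": 2010, "age": 24, "position": "SS", "k_pct": 20.0},
]

comps = top_comps(target, pool, n=2)
all_comps = top_comps(target, pool)
for player, year, score in comps:
    print(f"{player} ({year}): {score:.2f}")
```

Because every row is a single season rather than a career, the same name can land on a list twice, which is why the results further down show entries such as Logan Morrison (2009) and Logan Morrison (2010) side by side.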
Finally, each of the following examples depicts each player as if he is currently working through the minor leagues.
Is CAL a predictive tool for hitters?
Nothing is 100% definitive. However, I've run through numerous test cases that suggest that CAL can become -- and is -- a useful analytical tool for hitters. The system has shown the ability to root through a lot of the statistical mumbo-jumbo -- and all the unnecessary hype -- and sniff out some of the bigger prospect busts and surprises, including some late-blooming big leaguers as well.
Again, CAL provides the evidence to allow the user to make better educated guesses by looking at a player's contemporaries.
As with most similarity and projection systems, we are missing a lot of data that could improve them, such as defensive data, but Joe has certainly created an admirable system. One thing I noticed with his results is that many of the comparables that CAL provided matched those that scouts or the baseball community have sourced.
Below are the CAL results for a handful of current Royals players and prospects. Joe has left notes by each player to further explain what CAL was likely thinking or what could cause a variation in the similarity.
So, with Joe's good graces, and whether you like the results or not, here are the CAL system results for the players.
PLAYER | Age | Level | Comparison | CAL
Christian Colon | 21 | A+ | Eduardo Nunez, Eric Stamets, Hector Made, Shane Opitz, Edgar Duran, Didi Gregorius | 960.27, 936.00, 927.73, 926.87, 926.00, 925.47
Christian Colon | 22 | AA | Chin-Lung Hu, Angel Sanchez, Jonathan Herrera, Ozzie Martinez, Rey Navarro, Didi Gregorius | 947.20, 945.07, 938.93, 926.73, 922.07, 907.07
Christian Colon | 23 | AA | Alberto Gonzalez, Eric Sogard, Chris Getz, Darwin Barney, Josh Horton | 958.07, 926.67, 917.73, 913.93, 907.00
Christian Colon | 24 | AAA | Darwin Barney, Anderson Hernandez, Alberto Gonzalez, Niuman Romero, Chin-Lung Hu | 928.53, 925.93, 921.92, 912.47, 909.80
Christian Colon | 25 | AAA | Angel Sanchez, Eric Sogard, Cole Figueroa, Luis Hernandez, Andres Blanco | 917.93, 917.73, 916.47, 890.20, 889.53
Notes: This is the absolute perfect example of what I envision CAL to be: littered throughout all of Colon's top comparisons are nothing but fringe major leaguers and utility guys (Nunez, Getz, Barney, Hernandez, Sogard, Sanchez). And I think it's pretty safe to assume that we're all in agreement -- with or without the use of CAL -- Colon slides directly into that group. REMEMBER: CAL works by looking at the evidence it provides and making educated decisions.
PLAYER | Age | Level | Comparison | CAL
Billy Butler | 20 | AA | Logan Morrison, Mike Carp, Chris Marrero, Freddie Freeman, Kyle Blanks | 869.67, 858.00, 856.20, 855.93, 854.00
Billy Butler | 21 | AAA | Logan Morrison (2010), Eric Duncan, Logan Morrison (2009), Ji-Man Choi, Matt Wieters | 858.53, 855.07, 819.20, 817.67, 801.60
Notes: Not only does this show a proper prospect attrition rate, but look at the major leaguers' career wRC+: 103 Morrison, 113 Carp, 98 Wieters, and 129 Freeman. Ignoring this season, Butler's is 120. Also worth pointing out: not one player on this list ever topped 30 home runs. Career ISOs: .172 Morrison, .154 Butler, .168 Carp, .166 Wieters, .183 Freeman.
PLAYER | Age | Level | Comparison | CAL
Yordano Ventura | 20 | A | Andrew Bellatti, Dennis Neuman, Kyle Ginley, Carlos Vazquez, Francellis Montas, Jeremy Hellickson | 957.10, 926.90, 924.90, 920.90, 919.00, 915.00
Yordano Ventura | 21 | A+ | Kyle Zimmer, Christian Friedrich, Jared Lansford, Chris Archer, Gerrit Cole | 968.90, 926.20, 914.00, 912.00, 910.90
Yordano Ventura | 22 | AA | James Paxton, Ian Kennedy, Jake Odorizzi, Drew Smyly, Dan Smith | 964.80, 939.00, 933.80, 932.00, 928.10
Yordano Ventura | 22 | AAA | Patrick Corbin, Will Startup, Zack Wheeler, Carlos Villanueva, Dana Eveland | 913.10, 911.10, 908.00, 906.10, 902.00
Notes: A prime example of CAL's usefulness when it comes to pitchers: This list is absolutely littered with quality (future) big league arms (Montas, Hellickson, Zimmer, Archer, Cole, Paxton, Kennedy, Odorizzi, Smyly, Wheeler). It's important to note -- and I plan on discussing this in next week's piece -- pitchers, especially big arms like Ventura, have the potential to move quickly through systems, so that does impact comparisons.
PLAYER | Age | Level | Comparison | CAL
Kyle Zimmer | 21 | A+ | Yordano Ventura, Christian Friedrich, Jared Lansford, Nick Kingham, Chris Archer, Wade Davis, Gerrit Cole | 968.90, 908.90, 896.90, 895.80, 894.90, 893.90, 893.80
Notes: Again, the list is littered with front-of-the-rotation-type arms: Ventura, Archer, Cole.
PLAYER | Age | Level | Comparison | CAL
Raul Mondesi | 17 | A | Elvis Andrus, Jose Vinicio, Andrew Velazquez, Dorssys Paulino, Delino Deshields | 943.87, 882.80, 866.40, 863.13, 856.67
Raul Mondesi | 18 | A+ | Jose Vinicio, Elvis Andrus, Chris Owings, Leury Garcia, Jonathan Villar | 852.33, 849.87, 834.20, 832.33, 828.20
Notes: It's a pretty decent list, but not one player has an impact bat. I think the Andrus comp is pretty solid, personally.
PLAYER | Age | Level | Comparison | CAL
Eric Hosmer | 19 | A | Lars Anderson, Nick Delmonico, Aaron Hicks, Daryl Jones, Jon Singleton | 934.27, 914.60, 912.00, 899.87, 889.67
Eric Hosmer | 20 | A+ | Logan Morrison, Christian Yelich, Lars Anderson, Nick Evans, Ryan Wheeler, Joc Pederson | 864.00, 854.20, 845.87, 842.67, 842.67, 836.73
Eric Hosmer | 20 | AA | Oscar Taveras, Maikel Franco, Brandon Belt, Colby Rasmus, Billy Butler | 853.67, 789.80, 777.33, 774.00, 767.40
Eric Hosmer | 21 | AAA | Daric Barton, Logan Morrison (2009), Logan Morrison (2010), Billy Butler, Justin Smoak | 838.60, 836.00, 827.33, 812.80, 793.73
Notes: Again, look at the patterns: Outside of Singleton (maybe), not one player has developed the type of power projected for Hosmer. Morrison (three times), Butler, Smoak, Barton, Rasmus, Belt. Obviously, CAL wasn't too impressed by Hosmer's early work.
PLAYER | Age | Level | Comparison | CAL
Hunter Dozier | 22 | A+ | Abel Nieves, Mike Costanzo, Karexon Sanchez, Niko Sanchez, Thomas Pham, Chase Headley | 943.47, 908.60, 905.00, 903.93, 897.80, 896.73
Hunter Dozier | 22 | AA | Reid Engel, Marcos Vechionacci, Jarek Cunningham, Stephen King, Michael Mosby | 917.53, 910.87, 886.13, 866.40, 862.00
Notes: He was viewed as a reach in order to sign Manaea later, but this is not an encouraging list. At. All.
PLAYER | Age | Level | Comparison | CAL
Bubba Starling | 20 | A | Slade Heathcott, Justin Jacobs, Drew Vettleson, Joe Benson, Roman Pena | 929.60, 924.33, 922.73, 920.13, 917.00
Bubba Starling | 21 | A+ | Thomas Pham, Michael Taylor (WAS), Shaun Cumberland, Tim Battle, Wilkin Ramirez | 948.80, 932.20, 931.27, 927.27, 923.33
Notes: Toolsy outfielders who never really panned out. Not surprising.
PLAYER | Age | Level | Comparison | CAL
Wil Myers | 19 | A | Jaff Decker, Byron Buxton, Josh Sale, John Drennen, Bryce Harper | 928.27, 888.20, 887.80, 883.87, 872.27
Wil Myers | 19 | A+ | Byron Buxton, Colby Rasmus, Nick Weglarz, Jesse Winker, Jason Heyward | 856.60, 854.73, 843.20, 838.07, 831.80
Wil Myers | 20 | AA | Tyler Austin, Andrew Lambo, Robbie Grossman, John Drennen, Andrew McCutchen, Austin Jackson | 902.67, 880.67, 879.27, 873.07, 870.80, 870.80
Wil Myers | 21 | AA | Giancarlo Stanton, Travis Snider, Miguel Sano, Christian Yelich, Joe Benson | 758.13, 752.53, 737.93, 733.33, 711.93
Wil Myers | 21 | AAA | Adam Jones, Jay Bruce (2007), Anthony Rizzo, Christian Yelich, Jay Bruce (2008) | 926.00, 866.33, 863.27, 862.53, 858.13
Wil Myers | 22 | AAA | Michael Saunders, Josh Reddick, Brandon Wood, Wladimir Balentien, Nick Weglarz | 908.53, 899.87, 890.73, 887.00, 874.67
Notes: Again, look at the patterns: Harper, Buxton, Heyward, Winker, Rasmus, McCutchen, Jackson, Stanton, Jones, Bruce, Rizzo. Above-average or better big league bats. No question. His slow start to last season in AAA coupled with a smaller sample size skewed CAL's opinion of him in 2013.
PLAYER | Age | Level | Comparison | CAL
Patrick Leonard | 20 | A | Connor Narron, Daryl Jones, Mark Trumbo, Brandon Snyder, Jacob Kuebler | 952.53, 934.40, 933.27, 926.47, 909.40
Patrick Leonard | 21 | A+ | Kirk Nieuwenhuis, Dustin Geiger, Beau Mills, Jake Marisnick, Daryl Jones | 929.27, 921.87, 915.13, 898.67, 894.73
Notes: Threw him in here because of his inclusion in the Myers trade. Despite some impressive numbers this season, it's an unimpressive list.
PLAYER | Age | Level | Comparison | CAL
Miguel Almonte | 20 | A | Christian Binford, Edgar Osuna, Tyler Herron, Dimaster Delgado, Wesley Parsons | 946.20, 944.90, 942.10, 939.00, 939.00
Miguel Almonte | 21 | A+ | Jon Barrett, Patrick Urckfitz, Bryan Shaw, Ryan Berry, Anthony Ranaudo | 953.90, 943.90, 938.00, 933.90, 930.90
Notes: It's a pretty uninspiring list, really, with the exception of Parsons, Binford, and Ranaudo. Conclusion: back-end starting pitcher or solid relief arm.
PLAYER | Age | Level | Comparison | CAL
Sal Perez | 20 | A+ | Francisco Hernandez, Tomas Telis, Angel Salome, Alex Monsalve, Rossmel Perez | 891.00, 869.47, 869.00, 868.47, 862.20
Sal Perez | 21 | AA | Wilson Ramos, Angel Salome, Christian Bethancourt, Carlos Paulino | 931.67, 887.80, 886.73, 879.47
Notes: This one was shockingly off (nothing's perfect), though Salome was once viewed as a top backstop prospect before eating himself out of the league. Ramos' career wRC+ is 108. Perez's career wRC+ is 108.
PLAYER | Age | Level | Comparison | CAL
Alex Gordon | 22 | AA | Pedro Alvarez, Mat Gamel, Jerry Sands, Domonic Brown, Ryan Braun, Chase Headley, Jedd Gyorko | 890.73, 887.67, 885.73, 884.60, 877.67, 873.33, 872.00
Notes: It's a solid list of names. Look at the career wRC+: 105 Alvarez, 97 Brown, 147 Braun, 113 Headley, 110 Gordon. Obviously, Braun is the outlier, but I only had one minor league season to work off of (2006).
PLAYER | Age | Level | Comparison | CAL
Jorge Bonifacio | 19 | A | Andrew McCutchen, Luis Domoromo, Delta Cleary, Edward Salcedo, Jordan Schafer | 948.33, 932.47, 923.87, 908.13, 907.13
Jorge Bonifacio | 20 | A+ | Gorkys Hernandez, Moises Sierra, John Drennen, Ramon Flores, Marvin Lowrance | 918.47, 917.00, 912.93, 908.73, 906.00
Jorge Bonifacio | 20 | AA | Wil Myers, Yorman Rodriguez, Tyler Austin, Andrew Lambo, Carlos Gomez | 913.27, 868.73, 866.60, 853.93, 851.60
Jorge Bonifacio | 21 | AA | Raymond Fuentes, Xavier Avery, Gorkys Hernandez, Kaleb Cowart, Tyler Austin | 942.27, 925.87, 924.33, 923.20, 898.80
Notes: The McCutchen, Myers and Gomez comps stand out, but there's an awful lot of fourth outfielder-types mixed in. Bonifacio had that hamate injury last year, which saps a player's power for a while. I think he looks like a solid league-average regular, though there is some risk involved.
PLAYER | Age | Level | Comparison | CAL
Sean Manaea | 22 | 4 | Tyler Thornburg, Billy Bullock, Jake Arrieta, Sam Demel, Tanner Roark | 940.80, 932.90, 925.20, 920.80, 918.00
Notes: The vagaries of having just 80+ innings to work with. Some encouraging names mixed in though (Arrieta and Roark).
PLAYER | Age | Level | Comparison | CAL
Christian Binford | 20 | A | Tyler Herron, Gregory Billo, Justin Nicolino, Wesley Parsons, Miguel Almonte | 989.90, 966.00, 954.10, 947.20, 946.20
Christian Binford | 21 | A+ | Liam Hendriks, Joe Wieland, Edwin Escobar, James Parr, Daniel Herrera, Rafael Montero, Jake Odorizzi | 907.00, 904.00, 903.00, 897.00, 888.90, 887.00, 886.90
Notes: A lot of back-end-type arms with the exception of Odorizzi.
Note - I asked Joe if there were certain things he wanted to stress about the CAL system and he had this to say:
1. CAL is designed to look at a player's total production, not specifics (like avg., OBP, slug, HR, etc.).
2. CAL is a player classification system. It's up to the analyst to make the educated analysis.
I won't analyze the data myself; I'll leave that to you all in the comments. I can already guess what the Wil Myers comments will be... Interestingly, I wrote about Jorge Bonifacio as a Wil Myers comp last August, when Bonifacio was performing well in AA, and CAL produced that same result.
Again, big thanks to Joe for giving us this information. He's in the beginning stages of introducing the CAL system, which you will likely find more of over at BTBS, and we're grateful to him for letting us in on the project. Hopefully his information guides you further along in the endless pursuit of baseball knowledge.
If you didn't do so the first time, go check out Joe's site ProspectDigest, and follow him on Twitter.