The recent Hall of Fame voting got me thinking, which doesn’t often happen….is there a way to adjust the statistics put up by known steroid users by controlling for the variable of steroid use? I am fully aware that there is no such thing as an original idea, much less one like this that comes from me, but this isn’t exactly me trying to be the Einstein to the sabermetric Poincaré (or is it the other way around?). Anyways, this is just intended to start a discussion about the statistical possibilities, and hopefully to garner the opinions of the folks who are actually qualified to speak somewhat authoritatively about the matter.
Question #1 - Is steroid use quantifiable?
My thinking on this is that, yes, steroid use can be quantified using a 0 for “didn’t use” and a 1 for “used.” Yes, this is how complex my statistical analysis is.
Question #2 - Is there a large enough sample size to work with?
There have been enough admitted users to establish a workable sample of users. One could also look at the Mitchell Report and use the players identified there, similar to what was done....here's where I should link to the other research I found, but literature reviews are for punks and nancy boys. While pulling the sample of players who used steroids may initially seem like the hard part, I am not sure you could pull a sample of players who definitively did not use (I’m looking at you, Mike Sweeney). As with many research projects, establishing the sample could be the most problematic step, especially considering that you would want two sample groups with similar statistical profiles prior to the introduction of the variable you want to study. Anyways, like any good researcher, I will choose to make my own adjustments where necessary in order to ensure I can conduct my research. As a result, I will say that we can establish a sample.
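For what it's worth, here is a rough sketch of how the "two groups with similar statistical profiles" part might be done: pair each user with the non-user whose pre-use numbers look the most alike. Everything in it is an assumption on my part: the player names and stat lines are made up, the two stats (OPS and HR per 600 PA) are just placeholders, and the distance scaling is arbitrary. It is one simple way to do matching, not the way.

```python
import math

# Hypothetical pre-use stat lines: (name, OPS, HR per 600 PA). All numbers invented.
users = [("User A", 0.850, 28.0), ("User B", 0.760, 15.0)]
non_users = [("Clean X", 0.845, 27.0), ("Clean Y", 0.700, 10.0), ("Clean Z", 0.765, 16.0)]

def distance(p, q):
    # Multiply the OPS gap by 100 so it is on a scale comparable to HR (arbitrary choice).
    return math.hypot((p[1] - q[1]) * 100, p[2] - q[2])

def match_controls(users, pool):
    """Greedy 1-to-1 nearest-neighbor matching without replacement."""
    available = list(pool)
    pairs = []
    for u in users:
        best = min(available, key=lambda c: distance(u, c))
        available.remove(best)  # each non-user can be matched only once
        pairs.append((u[0], best[0]))
    return pairs

pairs = match_controls(users, non_users)
print(pairs)
```

The hard part from the paragraph above is still there: the pool of "clean" players is only as good as our confidence that they were actually clean.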
Question #3 – How do you identify when a player first used steroids and how long they used?
We get some indication of timelines from player admissions, but I would be interested to see if there is any credible data on players' physical measurements over time (height, weight, dome size, etc.). I would think an extensive analysis of changes in physical features compared to a scientifically defined norm would be one way to try to establish when use began and ended, but I'm not sure that is feasible. It would also make sense to look for the first season in which a known steroid user ridiculously outperformed their projection (pick whichever projection system you like), but I think that may be problematic as well, since plenty of breakout seasons have nothing to do with steroids. I think it would be difficult to determine a timeline of use, but not impossible.
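The "first season of outperforming the projection" idea could be sketched something like this. The seasons, the projected numbers, and the 100-point OPS threshold are all invented for illustration, and a flagged year is only a candidate start date, not evidence of anything.

```python
def first_breakout(seasons, threshold=0.100):
    """Return the first year where actual OPS beats projected OPS by more than
    `threshold`, or None if no season qualifies. The threshold is an arbitrary assumption."""
    for year, projected, actual in seasons:
        if actual - projected > threshold:
            return year
    return None

# Hypothetical player: (year, projected OPS, actual OPS). All numbers invented.
seasons = [
    (1996, 0.800, 0.810),
    (1997, 0.805, 0.790),
    (1998, 0.800, 0.950),
    (1999, 0.820, 0.960),
]

print(first_breakout(seasons))
```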
Question #4 – Haven’t I already answered my own question by the answers to questions #1, #2 and #3?
Yes, but that won’t stop me from continuing.
Question #5 – If steroid use is quantifiable, a sample can be had, and general information about use beginning and end dates is available, is there a way to control for steroid use in order to adjust known steroid users’ statistics?
This is where I would like other opinions. As I stated, in no way do I think this idea is novel, but I do think it’s worth discussing. I think if a statistically sound way to control for steroid use were established, HOF voters could then use the adjusted statistics as their basis for voting on a known user. We’re of course talking about things like wins, BA, RBIzzzzzzz….not those fancy shmancy saber stats used by baseball nerds.
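To give the discussion something concrete to shoot at, here is one possible approach, sketched with made-up numbers: a simple difference-in-differences. It uses the 0/1 flag from Question #1, compares how much users' numbers changed from before to after the assumed start of use against how much the matched non-users changed over the same span, and treats the gap as the "steroid effect" to subtract. This is my own illustration, not an established method, and it leans on a big assumption: that the two groups would have aged the same way if nobody had used.

```python
from statistics import mean

# Hypothetical players: (used: 0 or 1, HR/season before, HR/season after). All invented.
players = [
    (1, 25, 40),
    (1, 20, 33),
    (0, 24, 27),
    (0, 21, 22),
]

def did_effect(players):
    """Difference-in-differences: average change for users minus average change for non-users."""
    change = {0: [], 1: []}
    for used, pre, post in players:
        change[used].append(post - pre)
    return mean(change[1]) - mean(change[0])

def adjust(post_stat, effect):
    """Subtract the estimated effect from a user's post-use stat."""
    return post_stat - effect

effect = did_effect(players)
print(effect, adjust(40, effect))
```

With a real sample you would want a lot more than four players, and probably a regression with age and park controls instead of raw averages, but the basic logic would be the same.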
First post ever….if you have anything unkind to say, please couch it with some sort of compliment. Like…..dude, that was an awesome idea, but you should have kept it to yourself. Or......you'd be a really great singer if your voice didn't suck.