After that Billy Butler analysis I wrote earlier this week, I got more curious about fly balls, specifically at Kauffman. I really wanted to dig deep, do some astute analysis, and really quantify the effect of the parking lot on depressing offense at the K.
Anyway, I gathered all fly balls hit at Kauffman Stadium in 2014 using Baseball Savant, which means the numbers won't quite match up with Baseball Reference or FanGraphs. I really hate not having a single source of truth for determining what is a fly ball and what is a line drive, but such is our lot in the public domain. I have to use whichever dataset is most appropriate for my goals, and I needed fly ball distance. As far as I know, only Baseball Savant and Baseball Heat Maps (using the PITCHf/x database) have that.
I also gathered all fly balls hit at all stadiums and parking lots everywhere in 2014. In order to compare the data sets, I needed to establish a baseline. That baseline was the median* fly ball distance of 2014, which turned out to be 267.3 feet. I also wanted to establish some measure of variation, which was the standard deviation. The SD came out to be 55.6 feet.
*I use median because if the fly ball distance is normally distributed, the median and the mean are basically the same. If the distance is not normally distributed, the median is the better measure of central tendency.
Using those numbers, I could bucket all fly balls into distance categories that aren't arbitrary: 0-1 standard deviations above the median, 1-2 standard deviations above the median, and so on. I had two goals with this. First, I wanted to determine the frequency of fly balls within each category. Second, I wanted to determine the production allowed within each category. In this way, we might gain a little more insight into how fly balls play at Kauffman, where we know they don't play well for hitters because of the long fences and the defense. Onward.
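For anyone who wants to replicate the bucketing, here is a minimal sketch of the idea, not the exact code behind the numbers in this post. The function names are made up for illustration, and the use of the sample standard deviation is an assumption, since the flavor of SD isn't specified above.

```python
import math
from statistics import median, stdev


def league_baseline(distances):
    """Return (median, standard deviation) of league-wide fly ball distances in feet."""
    # stdev() is the sample standard deviation; which flavor was actually
    # used for the 55.6 ft figure is an assumption here.
    return median(distances), stdev(distances)


def sd_bucket(distance, center, sd):
    """Signed bucket index: 0 is 0-1 SD above the median, -1 is 0-1 SD below, and so on."""
    # A ball hit exactly the median distance lands in bucket 0.
    return math.floor((distance - center) / sd)


def bucket_label(index):
    """Map a bucket index onto the category names used in the tables."""
    if index == 0:
        return "above median"
    if index == -1:
        return "below median"
    if index > 0:
        return f"{index} sd above"
    return f"{-index - 1} sd below"


# League figures quoted in the post: median 267.3 ft, SD 55.6 ft.
MEDIAN_FT, SD_FT = 267.3, 55.6

for feet in (120, 200, 250, 300, 330, 400):  # made-up distances
    print(feet, bucket_label(sd_bucket(feet, MEDIAN_FT, SD_FT)))
```

Because the bucket edges come from the league's own median and spread, they move with the data instead of being round numbers picked by hand.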
First, I'll note this. The total production allowed by the league on fly balls was a .187-.545 BA/SLG line. At Kauffman, the total production allowed on fly balls was .164-.430. The relative production allowed by Kauffman was 82. It's pretty easy to hypothesize why: defense, fewer home runs due to park dimensions, and worse hitters on the Royals. Let's look at more granular data using the standard deviation groupings. The table below shows, for each category (0-1 SD above and below the median, 1-2 SD above and below the median, and 2-3 SD above and below the median), the frequency and production allowed relative to league average, along with the sample size at Kauffman. A relative number of 100 is average.
| Category | Rel FREQ | Rel PRD | Sample Size |
|---|---|---|---|
| 2 sd below | 112 | 86 | 4 |
| 1 sd below | 92 | 110 | 140 |
| 1 sd above | 80 | 102 | 158 |
| 2 sd above | 81 | 99 | 24 |
There's not much to make of either extreme. There were only 4 fly balls hit whose distance was 2 standard deviations below the median. At the other end, 2 standard deviations above, the rate was less than league average. I'd be willing to pin that on the Royals' hitters being not very good, the Royals' pitchers being pretty good, and some luck.
In the middle parts, maybe there is enough to say something. The thing that jumps out at me the most is the above median category. There was about a league average rate of fly balls hit within that distance category, but the production allowed on those fly balls was stifled. The second thing that jumps out at me is the relative frequency of the 1 sd above category. Far fewer of those fly balls were hit at Kauffman compared to the league. The third thing that jumps out at me is that there were more below median fly balls hit at Kauffman compared to the league.
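The relative numbers read like any park index: the Kauffman figure divided by the league figure, times 100. Here is a rough sketch of that arithmetic. The function names and the per-bucket count dictionaries are hypothetical. How BA and SLG get folded into a single production number isn't spelled out above, so the sketch indexes them separately rather than guessing at a formula.

```python
def relative_index(park_value, league_value):
    """Scale a park figure so that 100 means league average."""
    if league_value == 0:
        return None  # no league baseline to compare against
    return 100 * park_value / league_value


def relative_frequency(park_counts, league_counts, bucket):
    """Index the share of fly balls landing in one bucket against the league share."""
    park_share = park_counts[bucket] / sum(park_counts.values())
    league_share = league_counts[bucket] / sum(league_counts.values())
    return relative_index(park_share, league_share)


# Overall fly ball lines quoted in the post: league .187/.545, Kauffman .164/.430.
print(round(relative_index(0.164, 0.187), 1))  # relative BA, about 87.7
print(round(relative_index(0.430, 0.545), 1))  # relative SLG, about 78.9
```

The 82 quoted for overall production falls between those two component indexes.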
Going a little further, and introducing more noise, I looked at the rate of singles, doubles, triples, and home runs within each category compared to the league. Again, a 100 number is average. The table below shows those numbers.
| Category | Singles rate | Doubles rate | Triples rate | HR rate |
|---|---|---|---|---|
| 2 sd below | 160 | 0 | 0 | 0 |
| 1 sd below | 113 | 100 | N/A | 0 |
| 1 sd above | 209 | 150 | 134 | 87 |
| 2 sd above | N/A | 208 | N/A | 98 |
I tried to bold and italicize the numbers I felt were worth looking at. Other numbers that weren't bolded/italicized either didn't have enough sample size and/or weren't far enough away from average to be worth talking about.
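As a sketch of how a rate table like this could be built: count each hit type within a bucket, divide by the number of fly balls in that bucket, and index the park rate against the league rate. The outcome labels and function names here are hypothetical. Treating N/A as "the league rate was zero, so there's nothing to divide by" is only one plausible reading of the table, not something confirmed above.

```python
from collections import Counter

HIT_TYPES = ("single", "double", "triple", "home_run")


def hit_type_rates(outcomes):
    """Share of fly balls in one bucket that went for each hit type."""
    if not outcomes:
        raise ValueError("bucket has no fly balls")
    counts = Counter(outcomes)  # missing hit types count as 0
    return {hit: counts[hit] / len(outcomes) for hit in HIT_TYPES}


def relative_rates(park_outcomes, league_outcomes):
    """Index park rates against league rates; 100 is average, None stands in for N/A."""
    park = hit_type_rates(park_outcomes)
    league = hit_type_rates(league_outcomes)
    return {
        hit: None if league[hit] == 0 else round(100 * park[hit] / league[hit])
        for hit in HIT_TYPES
    }


# Made-up outcomes for a single distance bucket.
park = ["single"] * 2 + ["out"] * 8
league = ["single"] * 10 + ["double"] * 5 + ["out"] * 85
print(relative_rates(park, league))
```

With samples as small as some of these buckets, a single extra double can swing an index by dozens of points, which is why only the larger cells are worth reading much into.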
First, remember the ridiculous relative production number in the above median category, which was 62. The doubles rate is very low; many of those doubles were probably just caught at Kauffman due to the defense. I'd hypothesize that the outfielders on opposing teams play deeper to, in theory, prevent the double but give up the single, whereas the Royals' defense might just catch everything in this category regardless of depth. Also, there are no cheap home runs at Kauffman. The league home run rate in this category was 1.57%; 130 home runs were hit in this category. There were 0 home runs at Kauffman in this category.
A similar phenomenon occurs in the below median and 1 sd below categories. Those go for singles more often than league average, which I think lends a little more evidence to the playing deeper hypothesis.
The next thing to look at is the 1 sd above category. The home run rate is below league average, which you would expect due to the dimensions of Kauffman. However, the production allowed for that category, remember, was 102. That's because those would-be home runs become doubles and triples, and no defense can really stop them. These are the really hard-hit balls that go for homers at other stadiums but might bounce off the wall at Kauffman. The slugging allowed at Kauffman in this category is lower than the league's, but the batting average on these fly balls is higher at Kauffman.
So, potential insights and hypotheses:
1) The Royals' defense is good enough not to have to play deep to prevent the double but give up the single at Kauffman.
2) Opposing outfield defenses might have to play deep to prevent the double but give up the single. Remember how under Dayton the Royals led MLB in singles? Perhaps this is a contributing factor. Perhaps Dayton has determined he needs to find hitters who can plop fly balls in front of a deeper opposing defense. I would need a much larger sample size and to separate the Royals' and opponents' fly balls to determine this.
3) Really hard-hit balls go for home runs less often at Kauffman than at other stadiums. Fairly obvious.
4) Those really hard-hit balls still go for hits though, since no defense can stop them.
5) There appears to be a combination of the Royals' weaker hitters and the Royals' pitching managing fly balls to produce a greater frequency of weaker hit fly balls than harder hit fly balls.
Next steps:
1) Gather a larger sample size
2) Separate out Royals' hitters and opposing hitters
3) Figure out the rate of singles, doubles, triples, and homers among each distance bucket by Royals' hitters and opposing teams' hitters
4) Figure out the overall production allowed in each distance bucket by Royals' hitters and opposing teams' hitters
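The splitting in steps 2 through 4 could be sketched roughly like this. The record keys (`batter_team`, `bucket`, `total_bases`) and the `"KC"` team code are assumptions made for illustration, not real column names from any data source. The sketch also treats every fly ball as an at-bat, which ignores sacrifice flies.

```python
from collections import defaultdict


def production_by_side_and_bucket(fly_balls, home_team="KC"):
    """Group fly balls by (Royals/Opponents, distance bucket) and compute BA and SLG."""
    totals = defaultdict(lambda: {"n": 0, "hits": 0, "bases": 0})
    for ball in fly_balls:
        side = "Royals" if ball["batter_team"] == home_team else "Opponents"
        cell = totals[(side, ball["bucket"])]
        cell["n"] += 1
        cell["hits"] += 1 if ball["total_bases"] > 0 else 0
        cell["bases"] += ball["total_bases"]
    return {
        key: {
            "n": cell["n"],
            "avg": cell["hits"] / cell["n"],   # hits per fly ball
            "slg": cell["bases"] / cell["n"],  # total bases per fly ball
        }
        for key, cell in totals.items()
    }


# Made-up records: total_bases is 0 for an out, 1-4 for a single through a homer.
sample = [
    {"batter_team": "KC", "bucket": "above median", "total_bases": 0},
    {"batter_team": "KC", "bucket": "above median", "total_bases": 2},
    {"batter_team": "DET", "bucket": "above median", "total_bases": 4},
    {"batter_team": "DET", "bucket": "1 sd above", "total_bases": 0},
]
for key, line in production_by_side_and_bucket(sample).items():
    print(key, line)
```

Keeping the sample size `n` next to each line makes it easy to see which Royals-versus-opponents cells are too thin to trust.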
I wish I had HITf/x.