KC Royals and BABIP

Let’s take a look at some Royals hitters and how luck and other factors have shaped their numbers over the last few seasons.

I’m an economist who does forecasting for a living, so all these stats are a labor of love. Plus, I have become completely sold on the three true outcomes approach to baseball. Basically, I’m looking at the difference between BABIP and expected BABIP (an average of a couple of different methods of calculating it) to look for luck; that difference is the unlabeled number given first on each line below, where a positive value means lucky. I’m also looking at line drive rate variance to see whether expected BABIP is a reasonable estimate. I also like walk and K rates and isolated power (if all this sounds a lot like PECOTA minus its comparables, it should).

Quick note: of course young guys can improve, but this is a historical stat take on things to see how reliable those stats are for us to judge our guys on.
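
As a rough illustration of the luck number listed for each player below, here is a minimal Python sketch. It assumes the standard BABIP definition, (H - HR) / (AB - K - HR + SF), and takes expected BABIP as an input. The function names and the sample counting stats are made up for illustration and are not any Royal's actual line.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies):
    """Batting average on balls in play, using the standard definition."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (hits - home_runs) / balls_in_play


def luck(actual_babip, expected_babip):
    """Positive means more hits than the batted-ball mix predicts (lucky)."""
    return actual_babip - expected_babip


# Hypothetical season, not a real player's stats.
actual = babip(hits=150, home_runs=20, at_bats=550, strikeouts=100, sac_flies=5)
print(f"BABIP {actual:.3f}, luck vs. a .290 expectation {luck(actual, 0.290):+.3f}")
```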


Mark Teahen:
06:  0.037    very lucky    .290/.357/.517
07:  0.041    very lucky    .285/.353/.410
08: -0.015    unlucky       .243/.312/.381

Trends: solid walk rate, high K rate, unpredictable power, below ML avg LD%
Bottom Line: current numbers more likely to be the "real" Teahen, I'm really down on him after looking at those past BABIPs

John Buck:
05: -0.009    neutral             .242/.287/.389
06: -0.019    unlucky            .245/.306/.396
07: -0.030    very unlucky    .222/.308/.429
08:  0.008    neutral             .247/.323/.407

Trends: consistent production ind. of luck, good power, OK walk rate, high k rate, low LD%
Bottom Line: League-average catcher offensively, and very consistently so. Royals can win with that.

Alex Gordon:
07:  0.006    neutral        .247/.314/.411
08:  0.007    neutral        .252/.343/.406

Trends: increasing walk rate, developing power, high k rate, avg LD%, huge L/R splits
Bottom Line: Gordon's future lies in developing his incredible tools and learning to lay off the slider away from lefties

Joey Gathright:
05:  0.027    lucky             .276/.316/.340
06: -0.024    unlucky        .238/.321/.292
07:  0.020    lucky             .307/.371/.342
08: -0.007    neutral         .251/.302/.267

Trends: declining walk rate, very low and improving K rate, no power with some speed-based doubles, variable but likely very low LD% with extreme GB tendencies
Bottom Line: 07 looks fluky with both luck and unusually high LD%, if that was his breakout year, he has regressed badly with career lows in several key areas.

Jose Guillen:
05: -0.012    neutral             .283/.338/.479
06: -0.047    very unlucky    .216/.276/.398
07:  0.034    lucky                 .290/.353/.460
08: -0.014    neutral             .261/.291/.456

Trends: has forgotten how to take a walk as a Royal but was ok before, average K rate, great power, below avg LD%
Bottom Line: If he remembers how to walk at his career rate, legit batter, 91 OPS+ OF/DH until then.

Mike Aviles:
08:  0.067    wow lucky    .340/.365/.550

Trends: low walk rate, low K rate, legit power, avg LD%
Bottom Line: flashing legit offensive tools, but continuing the .340 average is a luck-based pipe dream. Age 27 rookie season is worrisome on development curve.

Tony Pena Jr.:
06: -0.047    Very unlucky         .227/.261/.341
07: -0.006    neutral                   .267/.284/.356
08: -0.117    God hates TPJ    .148/.169/.196

Trends: very low walk rate, high K rate, low and dropping iso power, declining LD%
Bottom Line: I think 07 Pena is closer to the real offensive performance to expect, but does his defense offset what Aviles brings? RR.com has spoken

David DeJesus:
05:  0.010    neutral        .293/.359/.445
06:  0.005    neutral        .295/.364/.446
07: -0.014    neutral        .260/.351/.372
08: -0.013    neutral        .301/.361/.457

Trends: Good walk rate, very low and declining K rate, some power, great LD%
Bottom Line: The real deal! 07 was caused by low LD% unlike rest of career. Very excited about him.

Billy Butler (BAM BAM):
07:  0.014    neutral        .292/.347/.447
08: -0.001    neutral        .271/.331/.394

Trends: good walk rate, low K rate, developing power, low LD%
Bottom Line: Once he learns to drive the ball like he did in the minors, he becomes an elite hitter.

Ross Gload:
06:  0.024    lucky             .327/.354/.462
07: -0.007    neutral         .288/.318/.441
08: -0.027    unlucky        .275/.321/.346

Trends: low walk rate, very low K rate, lower than normal iso power - avg other years - doubles based, good LD%
Bottom Line: a lot better than some people make him out to be, but below avg power, especially HRs, for a 1B limits his value, future is on a ML bench

Mitch Maier:
sample size

Bottom Line: needs ABs to see who he is

Esteban German:
06:  0.073    extremely lucky    .326/.422/.459
07: -0.008    neutral                  .264/.351/.376
08: -0.024    unlucky                 .244/.300/.336

Trends: down year in general, historically, good walk rate, lowish K rate, light doubles power, avg LD%
Bottom Line: tough 08, but 07 numbers look repeatable. A .350 OBP with low K rate can play, but still looks like super sub at best.

Alberto Callaspo
07: -0.077    very unlucky    .215/.265/.271
08: -0.019    unlucky            .290/.349/.330
small samples

Trends: avg walk rate, very low K rate, no power, good LD%
Bottom Line: I expected more power from him, but right now, a German clone in value. Future depends on what position Aviles ends up at.

Miguel Olivo:
05: -0.028    unlucky       .217/.246/.367
06:  0.007    neutral        .263/.287/.440
07: -0.002    neutral        .237/.262/.405
08:  0.024    lucky            .261/.293/.477

Trends: low walk rate, high K rate, good power, low LD%
Bottom Line: all his value stems from his ability to hit the long ball, which plays OK for a catcher. Good back-up, marginal starter

Mark Grudzielanek
05:  0.000    neutral        .294/.334/.407
06: -0.005    neutral        .297/.331/.409
07:  0.027    lucky            .302/.346/.426
08:  0.002    neutral        .299/.345/.399

Trends: Greek God of consistency, slightly below avg walk rate, low K rate, light power, exceptional LD%
Bottom Line: he has to break down at some point, right?





Comments


loving this...

so, the initial numbers are the +/- difference between the player’s BA and what should be expected?

by Freneau on Aug 4, 2008 3:30 PM EDT reply actions  

BABIP

I liked the post; it’s interesting to see the numbers, but I must admit, I am not a BABIP-luck proponent at all, ESPECIALLY for hitters. Ignoring wind and sun-aided hits, whether a ball in play is recorded as a hit is a matter of its velocity and its trajectory, where a ball’s trajectory is essentially its vertical angle relative to the ground and its lateral angle relative to the left and right field lines. I’m willing to grant that the latter is predominantly a matter of luck: a low line drive straight up the middle is a single, but if it’s 10 degrees to the right or left, it will be caught by a middle infielder.

A ball’s vertical trajectory determines whether it is called a ground ball, a line drive, or a fly ball, and a batter’s ability to affect his batted ball’s vertical trajectory over the course of a season is absolutely NOT a matter of luck. Alex Gordon generates twice as many fly balls as Joey Gathright. If they’re both still playing in 5 years, Alex Gordon will still be generating twice as many fly balls as Joey Gathright. I know most xBABIP calculations take things like LD%, GB%, and FB% into account, but these 3-way classifications are not sufficiently specific, nor are they especially accurate. One fly ball is not the same as the next, and it’s not even always clear whether a batted ball is a fly or a line drive. Likewise, a one-hopper through the infield has a lot more in common with a line drive than it does a ground ball pounded straight down into the dirt.

xBABIP calculations that rely on LD/FB/GB percentages are assuming that the majority of one player’s flies, grounders, and line drives are leaving the bat at roughly the same vertical angles as the next. This assumption is certainly flawed, but at least the calculations are taking the vertical angles of batted balls into account when determining their probability of generating hits. The much more serious flaw in these calculations is that they don’t even take the velocity of batted balls into account. Certain players consistently hit balls harder than others. Likewise, certain players consistently hit balls more weakly than others. And the velocity of a ball off the bat has a HUGE impact on whether it generates a hit. A weakly hit fly ball is called a pop-up and is a nearly automatic out; a hard-hit fly ball lands behind an outfielder for a double. Even a ground ball that’s hit straight down will generate a hit if it’s hit hard enough to produce a large bounce. Likewise, a line drive is an easy out for an infielder if it’s hit softly. Any calculation that purports to generate a player’s expected batting average on balls in play without taking into account the speed of that player’s batted balls is neglecting the single most important factor in that expectancy. Because of this, we end up calling players who consistently make hard contact lucky, and players like TPJ who consistently fail to get the barrel of the bat on the ball extremely unlucky. Does this make sense?

by kcdc1 on Aug 6, 2008 3:53 PM EDT up reply actions  

Line drives aren't just or even primarily about trajectory

There is a significant, inherent velocity element in a line drive.

This is just my opinion. I could easily be wrong.

by Scott McKinney on Aug 6, 2008 4:12 PM EDT up reply actions  

Callaspo...

interesting breakdown. somehow in my mind, i had him pegged as having more offensive upside and Jason Smith like “pop”

by Freneau on Aug 4, 2008 3:32 PM EDT reply actions  

I think he does have more offensive upside than Jason Smith

Less “pop” but likely a better hitter overall.

This is just my opinion. I could easily be wrong.

by Scott McKinney on Aug 4, 2008 3:40 PM EDT up reply actions  

Is there a solid link between BABIP and LD%

As in, if a player is under/overperforming the normal BABIP, is the line drive % the reason or is it really just luck?

by djk royal on Aug 4, 2008 3:45 PM EDT reply actions  

I believe LD% is a component of xBABIP

So if you have a high LD%, your expected BABIP is higher. The same is true for pitchers.

This is just my opinion. I could easily be wrong.

by Scott McKinney on Aug 4, 2008 3:46 PM EDT up reply actions  

LD% is the most important aspect

2 formulas, one is 2007 data on BA by hit type, second is my own regression of 2002-2008 data

0.73LD% + 0.24GB% + 0.15FB%
0.59LD% + 0.30GB% + 0.17FB%

by ZeppelinDZ on Aug 4, 2008 3:48 PM EDT up reply actions  

So, just a "simple"...

0.66LD% + 0.27GB% + 0.16FB% = xBABIP

or did you take the regression data and use a multiplier?

by stlfan on Aug 5, 2008 8:43 PM EDT up reply actions  

i actually emailed the fangraph guys on this issue

both equations actually give very similar results, and honestly, xBABIP has never really been officially defined mathematically, but i just ran both equations and used a simple average of the 2 values. the only player who really gives very different results is gator since he is so extreme GB (65-70%). im open to some discussion on this cause it needs to be and can be better defined.

by ZeppelinDZ on Aug 5, 2008 11:58 PM EDT up reply actions  
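
The averaging ZeppelinDZ describes can be sketched in Python roughly as follows. The two sets of weights are the ones quoted above; the function names are illustrative, the inputs are assumed to be fractions that sum to 1, and both sample batted-ball mixes are hypothetical.

```python
WEIGHTS_2007 = {"ld": 0.73, "gb": 0.24, "fb": 0.15}        # 2007 BA by hit type
WEIGHTS_REGRESSION = {"ld": 0.59, "gb": 0.30, "fb": 0.17}  # 2002-2008 regression


def weighted_xbabip(ld, gb, fb, weights):
    """Expected BABIP from batted-ball shares given as fractions summing to 1."""
    if abs(ld + gb + fb - 1.0) > 1e-6:
        raise ValueError("LD% + GB% + FB% must sum to 1")
    return weights["ld"] * ld + weights["gb"] * gb + weights["fb"] * fb


def averaged_xbabip(ld, gb, fb):
    """Simple average of the two formulas."""
    first = weighted_xbabip(ld, gb, fb, WEIGHTS_2007)
    second = weighted_xbabip(ld, gb, fb, WEIGHTS_REGRESSION)
    return (first + second) / 2


# Hypothetical mixes: a balanced hitter and an extreme ground-ball hitter.
mixes = {"balanced": (0.20, 0.42, 0.38), "extreme GB": (0.13, 0.67, 0.20)}
for name, mix in mixes.items():
    a = weighted_xbabip(*mix, WEIGHTS_2007)
    b = weighted_xbabip(*mix, WEIGHTS_REGRESSION)
    print(f"{name}: {a:.3f} vs {b:.3f}, average {averaged_xbabip(*mix):.3f}")
```

With these made-up inputs the two formulas land within a few points of each other for the balanced hitter but about 26 points apart for the extreme ground-ball hitter, which is consistent with the remark about Gathright.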

Did not know that.

At one time I thought the general rule was that a ball put in play would be a hit about 30% of the time no matter what, and anything over/under is essentially luck. If I understand you correctly, you’re saying that BABIP is different player to player based on their line drive %? So Gathright would normally have a much lower BABIP than say Guillen due to how hard they typically hit the ball.

by djk royal on Aug 4, 2008 3:49 PM EDT up reply actions  

Batters and pitchers do have some influence on what happens to balls put in play

And that is seen primarily through how often they hit line drives. So yes, expected BABIP has a lot to do with how often one hits line drives. Actual BABIP fluctuates based on a number of factors out of the pitcher’s and hitter’s control.

So Gathright would normally have a much lower BABIP than say Guillen due to how hard they typically hit the ball.

I would say basically yes. But Gathright is an interesting case. Gathright eschews a lot of potential line drives because he bunts so often, which keeps his LD% down (12.9% this year, which is very low). But, amazingly, last year his LD% was quite high (22.8%). That looks like a lucky/flukey result to me. And this has nothing to do with BABIP, but his infield hit percentage is also down this year, and that’s his bread and butter.

This is just my opinion. I could easily be wrong.

by Scott McKinney on Aug 4, 2008 3:58 PM EDT up reply actions  

ya, there are even more in depth ways to calc xBABIP

which involves factoring out HRs, and giving coef. to IF hits and bunts, but batted balls data was as easy as some of the other to process and collect so i left it out.

gator is probably the only player on the team that it would affect that much, and it wouldn't be that much

by ZeppelinDZ on Aug 4, 2008 4:02 PM EDT up reply actions  

wow that was a messed up post

quotes shouldn’t be there, sentence cut off, you get the picture tho

by ZeppelinDZ on Aug 4, 2008 4:03 PM EDT up reply actions  

for the flyball rates

are you including or excluding infield fly balls and HRs?

by Gopherballs on Aug 4, 2008 4:37 PM EDT up reply actions  

included

in my data, LD%+GB%+FB%=1 for all years and players

by ZeppelinDZ on Aug 4, 2008 4:40 PM EDT up reply actions  

well, small issue

there is a hiccup with HRs, but i tested the data and it has only a small effect. a HR hitter like Guillen is likely the only one affected, but he has a long enough data history that i don’t think we question his abilities too much. results are the same for the most part

by ZeppelinDZ on Aug 4, 2008 4:45 PM EDT up reply actions  

INVALID! Un-rec'd!

How dare you sir!

This is just my opinion. I could easily be wrong.

by Scott McKinney on Aug 4, 2008 4:48 PM EDT up reply actions  

this post is the closest thing i have to peer review

i would be happy to email anyone my spreadsheet to review. it has lots of colors from excel 2007 conditional formatting.

by ZeppelinDZ on Aug 4, 2008 4:51 PM EDT up reply actions  

I think you did a great job

This is just my opinion. I could easily be wrong.

by Scott McKinney on Aug 4, 2008 5:11 PM EDT up reply actions  

A quick and dirty xBABIP analysis is LD% + .120

The Hardball Times’ Dave Studeman did a study a few years ago and found a strong correlation between LD% and BABIP in which BABIP tends to normalize around a player’s line drive percentage plus .120. Using the full batted ball data like Zeppelin is more accurate, but LD% + .120 is easier and works well enough as a quick and dirty evaluation (kind of like OPS versus more advanced metrics like weighted on-base average (wOBA) or EqA).

The non-regressed numbers in Zeppelin’s formula correspond to the rates at which a type of batted ball should become a hit in front of an average defense: line drives should go for hits about 73% of the time, groundballs about 24% of the time, and flyballs about 15% of the time. Line drives go for hits most often, so they have the biggest effect on BABIP.

by Gopherballs on Aug 4, 2008 3:57 PM EDT up reply actions  
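
For anyone who wants to try the quick version, the LD% + .120 rule of thumb is a one-liner. This is only a sketch: the function name is made up, LD% is assumed to be entered as a fraction, and the three line-drive rates are ones quoted elsewhere in these comments.

```python
def quick_xbabip(ld_pct):
    """Rule of thumb: expected BABIP is roughly LD% (as a fraction) plus .120."""
    return ld_pct + 0.120


# Line-drive rates of 12.9%, 22.8% and 24.9%, as quoted in the comments.
for ld_pct in (0.129, 0.228, 0.249):
    print(f"LD% {ld_pct:.1%}: quick xBABIP {quick_xbabip(ld_pct):.3f}")
```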

ya, LD + .12 is a great quick reference

since all my data was already in excel, doing full calcs was just as easy

by ZeppelinDZ on Aug 4, 2008 3:59 PM EDT up reply actions  

Very good insight

Aviles should say Powerball though

Every fight is a food fight when you’re a cannibal.
-- Demetri Martin

by kcscoliny on Aug 4, 2008 3:50 PM EDT reply actions  

How is BABIP affected by home park?

Seems like Kauffman is especially prone to dunk and dinks and slices with its spacious outfield and hard outfield surface.

Relive Royals History at royalsretro.blogspot.com

by RoyalsRetro on Aug 4, 2008 4:12 PM EDT reply actions  

actually, this is a great point

none of my data is broken out by park factor etc. and I think it likely has a big effect on things. i believe the appropriate data is available on retrosheet, but as i mentioned in an above comment, retrosheet’s batted balls data is weird and kinda screws things up.

it’s something that would be a great statistics thesis if there are any young stats students out there.

by ZeppelinDZ on Aug 4, 2008 4:16 PM EDT up reply actions  

I think a really big park would lead to more lucky hits because of the spacious outfield

I think this would affect Royals players some, but not to a great degree. Kauffman Stadium isn’t that huge. Certainly bigger than average but not by a lot. I think that is one reason why the K has played as a very mild hitter’s park since the fences were moved back in. And is the OF at the K harder than most?

This is just my opinion. I could easily be wrong.

by Scott McKinney on Aug 4, 2008 4:32 PM EDT up reply actions  

I've heard a few players say that the outfield surface is hard and fast

Which leads singles to become doubles and doubles to become triples. We also have very deep alleys.

Relive Royals History at royalsretro.blogspot.com

by RoyalsRetro on Aug 4, 2008 4:33 PM EDT up reply actions  

I hadn't heard that

And if true, that would certainly help create extra base hits. I do think you have something here overall. I just don’t know if Kauffman Stadium’s dimensions and surface greatly or even significantly help players with the BABIP. It might.

This is just my opinion. I could easily be wrong.

by Scott McKinney on Aug 4, 2008 4:36 PM EDT up reply actions  

How is BABIP correlated to OPS?

It seems that this does an excellent job of describing how lucky or unlucky our hitters have been as far as their batting average goes. While batting average is a useful descriptor, it is really not that effective of a method for evaluating offensive performance. Obviously, players with a higher batting average have a better chance of having a higher OPS, but that is not strictly true.

I tried to make a table of the comparison, but when I previewed the post it looked all goofy. Instead, I’m just gonna list the OPS numbers so you don’t have to add them up.

Teahen: ‘06-.874, ‘07-.763, ‘08-.693
Buck: ‘05-.676, ‘06-.702, ‘07-.737, ‘08-.730
Gordon: ‘07-.725, ‘08-.749

If you compare some of the above variations in OPS with the variations in luck, the data is pretty confusing. I only analyzed the first three batters listed, but I guess it would hold true for the rest of the team.

From this, I conclude that the luck factor is not correlated to OPS production at all. Teahen was very lucky in ‘06, but he was even luckier in ‘07 and saw his OPS drop by about 110 points. His production dropped more with that increase in luck than it did with the decrease in luck he has had this year (even on a percentage basis). Buck’s luck decreased by about .010 from ‘05-’06 and ‘06-’07, yet his OPS jumped about 30 points both years. This year, his first year with positive luck, is the first year his OPS has declined. Gordon’s luck has remained almost identical, yet his OPS jumped 25 points.

I guess all I’m asking is what does this information tell us about our lineup’s actual production or expected production?

by KCBear on Aug 4, 2008 5:20 PM EDT reply actions   1 recs
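
The year-over-year comparison KCBear makes by eye can be tabulated with a few lines of Python. This is just a sketch: the luck and OPS values are copied from the post and the comment above, while the data layout and function name are illustrative.

```python
# Luck (BABIP minus expected BABIP) and OPS by season, copied from the post.
seasons = {
    "Teahen": [(2006, 0.037, 0.874), (2007, 0.041, 0.763), (2008, -0.015, 0.693)],
    "Buck": [(2005, -0.009, 0.676), (2006, -0.019, 0.702),
             (2007, -0.030, 0.737), (2008, 0.008, 0.730)],
    "Gordon": [(2007, 0.006, 0.725), (2008, 0.007, 0.749)],
}


def year_over_year(rows):
    """Return (year, change in luck, change in OPS) for consecutive seasons."""
    changes = []
    for (_, luck0, ops0), (year1, luck1, ops1) in zip(rows, rows[1:]):
        changes.append((year1, round(luck1 - luck0, 3), round(ops1 - ops0, 3)))
    return changes


for player, rows in seasons.items():
    for year, d_luck, d_ops in year_over_year(rows):
        print(f"{player} {year}: luck {d_luck:+.3f}, OPS {d_ops:+.3f}")
```

Six season pairs is far too small a sample to settle the correlation question either way, but it makes the pattern easy to see.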

it means the players are progressing or regressing

the luck factor tells whether they have more hits than they should. However, production can go up while luck goes down if, for example, the hitter’s LD% increases. And alex gordon’s increase in ops is almost entirely due to his increasing walk rate.

by RoyalFlush on Aug 4, 2008 5:57 PM EDT up reply actions  

I still don't understand.

I don’t get how their luck is related to whether or not they are progressing. They have no control over it; that is why it’s called luck. This luck factor is the batting average equivalent of FIP-ERA.

However, the xBABIP-BABIP only accounts for total number of hits, and not the type of hits. The type of hit is as important as the number of hits. Certainly LD% is important, but even line drives are not all the same. A line drive in the gap is not the same as a soft liner in the infield, but they are treated the same in this equation. If a “should-be” double is caught and “shouldn’t be” single is not, there is no change in xBABIP-BABIP despite the fact the player’s overall production is lower than it statistically should be.

Gordon’s increase in OPS is certainly due to the increase in walk rate, and the fact that xBABIP-BABIP doesn’t factor that in makes it a less valuable measurement. I think xBABIP-BABIP is not that useful of a statistic. If there is absolutely no correlation at all between xBABIP-BABIP and expected OPS, then what is its purpose?

Does some sort of xSLG stat exist?

Also, xBABIP-BABIP needs a shorter abbreviation.

by KCBear on Aug 4, 2008 6:24 PM EDT up reply actions   1 recs

but at some point

all numbers are an abstraction. You learn to deal with some loss of data validity as the price of staying sane, rather than watching video and trying to calculate the ascending arc angle of a line drive.

That’s what the scouts are for- to build a story around the stats.

but back to the point: I think you’re onto something when you bring in walk rate if we are trying to determine who is good vs who is lucky because the theory is that good hitters work pitchers into throwing a pitch that they can drive. Otherwise, they don’t swing at it and if they have a good eye, they will draw walks.

So LD% and Walk Rate and some luck factor would give you a good idea of who a good batter is.

Then you can factor in a power metric to tell you who is a good batter and has strength.

Will it be perfect? No because it’s an abstraction. But I think it would be useful.

. . . a weary nation turns to Gil Meche

by vegasroyals on Aug 5, 2008 5:36 PM EDT up reply actions  

The same way that Batting Average correlates to OBP, SLG, and OPS (which counts BA twice)

For a ballpark estimate, the easy, short-form way is to figure out the number of hits attributed to luck and then subtract them from the total and then recalculate OBP, SLG, and OPS.

The better, long-form estimate is to use the above for OBP only and then for SLG, figure out for which batted ball types (LD v. FB v. GB) the hitter was lucky and recalculate total bases based on the type of hit (groundball hits are almost always singles, flyball hits are usually extra bases), with the adjusted OPS being the addition of the adjusted OBP and the adjusted SLG.

So using the short-form way for, say, Teahen 2006, his luck was roughly the equivalent of 15 hits. If my math is correct, subtracting out 15 hits from his OBP and 15 total bases from his SLG results in an adjusted line of 253/321/478 or a 799 OPS.

by Gopherballs on Aug 4, 2008 6:35 PM EDT up reply actions  
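
The short-form adjustment described above might look like this in Python. It is a sketch under stated assumptions: every lucky hit is treated as a single that becomes an out, the function names are made up, and the counting stats are hypothetical numbers chosen only to resemble a .290/.357/.517 season, not Teahen's actual 2006 totals.

```python
def slash_line(ab, h, bb, hbp, sf, tb):
    """Return (BA, OBP, SLG) from counting stats."""
    ba = h / ab
    obp = (h + bb + hbp) / (ab + bb + hbp + sf)
    slg = tb / ab
    return ba, obp, slg


def luck_adjusted_line(ab, h, bb, hbp, sf, tb, luck_hits):
    """Short-form adjustment: remove the lucky hits from both H and TB."""
    return slash_line(ab, h - luck_hits, bb, hbp, sf, tb - luck_hits)


# Hypothetical counting stats, not a real player's totals.
stats = dict(ab=400, h=116, bb=40, hbp=2, sf=3, tb=207)
ba, obp, slg = luck_adjusted_line(**stats, luck_hits=15)
print(f"adjusted: {ba:.3f}/{obp:.3f}/{slg:.3f}, OPS {obp + slg:.3f}")
```

The long-form version would replace the single `luck_hits` argument with separate adjustments per batted-ball type, each with its own total-bases value.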

Okay

I can see how it could perhaps be used as an intermediary step in finding useful data. However, I just plotted the above data in a chart in Excel, and it doesn’t correlate very well. Not surprisingly, this data is most predictive of actual production for hitters with a low slugging percentage.

I don’t know how to post this chart from Excel on here, or I would do it. If somebody else knows how, I can e-mail it to them or follow directions. It shows a general correlation overall, but with extreme deviation on both sides.

I would conclude from the chart that Batting Average Luck would only be useful if it is broken down into components (LD, FB, GB) of luck as Gopherballs suggests above. However, I’m not sure if the data exists to do this. Is there somewhere that lists not only LD, FB and GB percentages, but also lists what percentage of each type of batted ball actually became a hit?

If there is such a database, I think it would be possible to construct a batting equation very similar to FIP (only in that it eliminates luck) that could be incredibly useful, perhaps called FIB: Fielding Independent Batting. It would be the offensive statistic to end all offensive statistics. It would be the FIB that never lied. Anyway, before I get too carried away, I’ve never seen such a database (but that certainly doesn’t mean it doesn’t exist).

The flip side is that if no such data exists, the xBABIP-BABIP data is pretty useless.

What do you guys think and do you think we could make some sort of FIB statistic?

by KCBear on Aug 4, 2008 7:50 PM EDT up reply actions  

The Hardball Times and Fangraphs have the batted ball data percentages, updated daily

There has been a lot written on these subjects, so I would suggest you search those sites’ archives (as well as Baseball Think Factory or Tom Tango’s The Book site) or google the issue generally. If nothing comes up, I bet a polite email to Dave Studeman or THT generally, Tom Tango, the Fangraph guys or the Baseball Think Factory guys would get some answers for you. As another starting point, you might want to read the Fangraphs article that Zeppelin linked on his previous posting (see sidebar to the right).

Comparing a hitter’s expected BABIP with his actual BABIP is a very useful tool for helping identify players who may have performed over their heads or under their talents. I would not reject it simply because it does not automatically spit out a number scaled to OPS or your statistic of choice.

by Gopherballs on Aug 4, 2008 8:30 PM EDT up reply actions  

I use both of those sites frequently

and have seen nothing like what I am talking about. I don’t mean batted ball percentages. For example, DeJesus has a LD% of 24.9 this year. That is easy enough to find out. What I am looking for is what percentage of his LDs became hits (or, more importantly, his SLG for LDs). Does anyone know where to find that? If it is on FanGraphs or THT, please give me a link because I haven’t noticed it.

Every article I read seems to assume that the data I’m looking for is irrelevant. Has it been proven that the percentage of LDs that become hits remains constant independent of batter? I would guess that similar to primary batted ball data, it will be about the same for pitchers but could vary greatly for hitters. If it does indeed vary for hitters, then I think there needs to be a revised method of determining xBABIP. If the percentage of each type of batted ball that becomes a hit is not independent of the batter, xBABIP needs to be calculated on an individual basis.

by KCBear on Aug 4, 2008 8:59 PM EDT up reply actions  
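
To make KCBear's question concrete, here is a hedged Python sketch of what an individualized xBABIP could look like if per-hitter hit rates by batted-ball type were available. Everything here is an assumption for illustration: the record format, the function names, and the sample data are hypothetical, and the league rates are the 73%/24%/15% figures quoted earlier in the thread.

```python
from collections import defaultdict

LEAGUE_RATES = {"LD": 0.73, "GB": 0.24, "FB": 0.15}  # league rates quoted in this thread


def hit_rates_by_type(batted_balls):
    """Fraction of each batted-ball type that went for a hit."""
    counts = defaultdict(lambda: [0, 0])  # type -> [hits, total]
    for ball_type, is_hit in batted_balls:
        counts[ball_type][0] += int(is_hit)
        counts[ball_type][1] += 1
    return {t: hits / total for t, (hits, total) in counts.items()}


def xbabip_from_rates(shares, rates):
    """Expected BABIP from batted-ball shares and per-type hit rates."""
    return sum(shares[t] * rates[t] for t in shares)


# Hypothetical past records for one hitter: (type, became a hit?).
history = ([("LD", True)] * 15 + [("LD", False)] * 5
           + [("GB", True)] * 10 + [("GB", False)] * 30
           + [("FB", True)] * 6 + [("FB", False)] * 34)
own_rates = hit_rates_by_type(history)

# Hypothetical batted-ball mix for a later season.
shares = {"LD": 0.22, "GB": 0.40, "FB": 0.38}
print("league weights:", round(xbabip_from_rates(shares, LEAGUE_RATES), 3))
print("own hit rates: ", round(xbabip_from_rates(shares, own_rates), 3))
```

The hitter's own rates should come from a different sample than the season being judged (prior years, for example); otherwise the "expected" number simply reproduces the actual BABIP.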

the data you want is on retrosheet

but for a number of reasons, it complicates things a lot and I don’t have the time or the desire to really do proper analysis with that. as for the rest of your points, Gopher is making the key points. when I get back to work where my spreadsheet is, I’ll see if I can’t pull some data to illustrate some of what you are wondering about.

by ZeppelinDZ on Aug 4, 2008 11:37 PM EDT up reply actions  

Thanks

I appreciate that. I’m glad people do this stuff.

by KCBear on Aug 5, 2008 12:41 AM EDT up reply actions  

xBABIP-BABIP

is useful, but only as it relates to batting average. That means it’s not that useful for determining which players are productive and which are not. It is a descriptive stat more than it is a productive stat. It can tell you if a hitter has gotten more or fewer hits than they should have, but even if you know how many hits they should have, it doesn’t necessarily mean that much.

It doesn’t need to be scaled to OPS necessarily, it just needs to relate directly to run scoring somehow or it is irrelevant. Batting average is not very relevant to run scoring. I do think it can be useful as a component for creating such a statistic.

by KCBear on Aug 4, 2008 9:04 PM EDT up reply actions  

Batting average as a component is useful

This year, the line for an average AL hitter is 265/333/415 for an OPS of 748. Thus batting average accounts for roughly 80% of the average OBA, roughly 65% of the average SLG, and roughly 70% of the average OPS. If you think OPS is useful as a “productive stat” (whatever that means), incorporating xBABIP-BABIP into your analysis can get you roughly 70% of the way there. That should be useful.

According to the Hardball Times glossary, there are 14 different versions of Runs Created, with the basic formula created by Bill James years ago as OBP*TB. Searching for articles on Runs Created should lead to the more accurate (and more complex) versions.

The data that you seem to be missing for your analysis is the average total bases on hits by batted ball type. I would bet somebody has figured that out, but you might need to look or ask around to find it.

by Gopherballs on Aug 5, 2008 12:51 AM EDT up reply actions  
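
The percentages in the comment above are easy to check. The sketch below redoes the arithmetic in Python and adds the basic Runs Created formula in its usual counting-stat form, (H + BB) * TB / (AB + BB); the function name and the sample inputs for it are hypothetical.

```python
# Average AL hitter line quoted above: .265/.333/.415.
ba, obp, slg = 0.265, 0.333, 0.415
ops = obp + slg

share_of_obp = ba / obp
share_of_slg = ba / slg
share_of_ops = 2 * ba / ops  # OPS counts batting average twice

print(f"{share_of_obp:.0%} of OBP, {share_of_slg:.0%} of SLG, {share_of_ops:.0%} of OPS")


def basic_runs_created(h, bb, tb, ab):
    """Basic Runs Created: OBP * TB, with OBP simplified to (H + BB) / (AB + BB)."""
    return (h + bb) * tb / (ab + bb)


# Hypothetical counting stats, only to show the call.
print(basic_runs_created(h=150, bb=50, tb=240, ab=550))
```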

kinda a continuation on this

try to think about OPS by fundamental components.
OPS -> SLG + OBP
OBP -> BA + Walk rate +(plus other shit, but ill keep it simple)
SLG -> BA + XBHs (iso power to some degree)
BA -> xBABIP + K rate + Luck (all factors have luck, but this one has solid statistical reference, which is the purpose of the study)
xBABIP -> LD% + GB% + FB%

so by measuring that luck factor (for the scientifically minded, it’s basically an error term) along with other fundamental components, you can determine the source of the OPS stat. if you can then forecast those fundamental stats (which is not too hard because many are very repeatable) you can forecast OPS.

by ZeppelinDZ on Aug 5, 2008 9:41 AM EDT up reply actions  

a regression output suggests

OPS = .59 + .38 LD% - .55 K% + .49 BB% + 1.72 LUCK + 1 IsoP
regression and SEs significant, R squared .95

there probably should be more variables included, but this is for a simple example.

the LUCK variable is (roughly) the number quoted in my post.

Thus, someone with a LUCK factor .02 (or moderate luck) gets roughly a .035, or 35 point boost, to OPS.

by ZeppelinDZ on Aug 5, 2008 11:20 AM EDT up reply actions  
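
Plugging numbers into the regression above is straightforward. In this Python sketch the coefficients are the ones quoted in the comment, but the units are an assumption: the comment does not say whether the rates are fractions or percentages, and fractions are used here because they produce plausible OPS values. The function name and the sample hitter are hypothetical.

```python
COEFFICIENTS = {"intercept": 0.59, "ld": 0.38, "k": -0.55,
                "bb": 0.49, "luck": 1.72, "iso": 1.0}


def predicted_ops(ld, k, bb, luck, iso):
    """OPS from the quoted regression; rates assumed to be fractions."""
    c = COEFFICIENTS
    return (c["intercept"] + c["ld"] * ld + c["k"] * k
            + c["bb"] * bb + c["luck"] * luck + c["iso"] * iso)


# Hypothetical hitter, with and without a LUCK factor of .02.
neutral = predicted_ops(ld=0.19, k=0.17, bb=0.09, luck=0.0, iso=0.150)
lucky = predicted_ops(ld=0.19, k=0.17, bb=0.09, luck=0.02, iso=0.150)
print(f"neutral {neutral:.3f}, lucky {lucky:.3f}, boost {lucky - neutral:.3f}")
```

The boost is 1.72 * .02 = .0344, which is the "roughly 35 point" figure in the comment.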

Thanks again

I’ll have some fun messing around with these numbers.

by KCBear on Aug 5, 2008 3:44 PM EDT up reply actions  

Charts in Excel

If you save the spreadsheet as html, the charts will be saved as gifs. You should be able to post them in that format.

. . . a weary nation turns to Gil Meche

by vegasroyals on Aug 5, 2008 5:12 PM EDT up reply actions  

Or you can pretty easily cut-and-paste an excel spreadsheet into something like MS Paint and save it as a JPEG

That’s how I did the Royals Payroll Plus post.

This is just my opinion. I could easily be wrong.

by Scott McKinney on Aug 5, 2008 5:21 PM EDT up reply actions  

Might be too marginal . . .

just thinking out loud-

The reason that LD% is a good proxy for this stuff is that line drives are most likely to fall for hits. If you start extending the analysis into the individual results for ground balls and fly balls you’re likely to find the second order data of what we see here: the one pure result [home run] plus a lot of statistical noise – aka luck.

The pattern you identified is going to be there- only magnified. That is, a guy who has a lot of groundballs becoming hits is super lucky because he’s not making solid contact- the same for a guy who hits a lot of flyballs.

But someone who’s hitting line drives is making solid contact and should expect a certain result, and that nexus of making solid contact AND BABIP-xBABIP along with a walk percentage is going to tell you who is good, good and lucky, good but not lucky, not good, not good but lucky, not good and not lucky, and TPJ.

Again, just thinking out loud.

. . . a weary nation turns to Gil Meche

by vegasroyals on Aug 5, 2008 5:25 PM EDT up reply actions  

without looking at any numbers to back this up

strikeouts would probably be the first thing I would look at when luck and OPS go in different directions

by PopeSoria on Aug 4, 2008 7:09 PM EDT

this post is awesome

I’m an econ major taking advanced econometrics next semester, and this study gives me new hope and unbridled enthusiasm. Really cool stuff. And for anyone who has watched pretty much every game this year, I think it’s obvious that this is the real Teahen. Almost no line drives anymore. Most of his hits are ground balls through the infield or broken-bat bloop hits.

by RoyalFlush on Aug 4, 2008 5:51 PM EDT

You econ guys

Need to dazzle us with more stats. This is out of my league to compile, but I can grasp the concepts and enjoy the reading.

And I’m afraid you’re right about Teahen. I see him wandering around the league as a utility guy, then suddenly figuring it out at age 30 and having a fruitful second half career.

Relive Royals History at royalsretro.blogspot.com

by RoyalsRetro on Aug 4, 2008 6:19 PM EDT

xBABIP-BABIP abbreviation suggestions

My suggestion is BAL (Batting Average Luck)

by KCBear on Aug 4, 2008 6:30 PM EDT

Better

Batting Average Luck Legitimation Statistic

OMG Banny. FWIW I am only crdtng u w/3 runs allwd bc of DDJ OMFG

by Matt Klaassen on Aug 4, 2008 8:47 PM EDT

Interesting stuff

I think it’s a good point that this shows that Gload isn’t as terrible as everyone makes him out to be. Sure, he doesn’t have enough power for a starting 1B, but as a bench guy he would be really useful with his defense and ability to get on base. It’s not his fault that management hasn’t gotten someone else in here to make that a reality.

BTW, I’m not Joel in disguise.

by I need more Esteban on Aug 4, 2008 8:04 PM EDT

I figured you weren't

because the title of your comment wasn’t “DZEPPELIN…”

Great job, DZ, not that it means much coming from someone like me, but really cool stuff. I’d love to use this formula and ones like it to generate my own xBABIP or XBA spreadsheet so that I wouldn’t need to subscribe to Shandler during fantasy season.

OMG Banny. FWIW I am only crdtng u w/3 runs allwd bc of DDJ OMFG

by Matt Klaassen on Aug 4, 2008 8:49 PM EDT

He's not as terrible as his stats this year would suggest

He’s more like the way, way below average 1B that he’s been throughout the rest of his career. This year he’s been abysmal. Usually, he’s merely bad.

This is just my opinion. I could easily be wrong.

by Scott McKinney on Aug 4, 2008 11:41 PM EDT

He's been unlucky

But even at his best he should probably be a bench player with that kind of home run power and walk totals.

Relive Royals History at royalsretro.blogspot.com

by RoyalsRetro on Aug 5, 2008 11:15 AM EDT

Zeppelin

You should do this for Royals pitchers too.

This is just my opinion. I could easily be wrong.

by Scott McKinney on Aug 5, 2008 3:59 PM EDT

NY, you should compile the data for me too...

Yeah, it’s a project I can and want to do; it’s down the road as time allows, though. It didn’t take that long, but several things came up that, on a second pass through, I would want to deal with more completely.

Seriously though, if anyone wants to help, get more info, or challenge anything, email is zeppelindz@gmail.com (AIM: zeppelindz). I also have several fun tools I’ve created in Excel just to learn about baseball statistics/tools/variance/advanced Excel features. In particular, I’m trying to learn how to use and process Retrosheet data, so if people have any experience with that, I would be curious.

by ZeppelinDZ on Aug 5, 2008 5:00 PM EDT

I know it is a lot of work

So get to it if you feel like doing the work. It would be interesting to see the results.

This is just my opinion. I could easily be wrong.

by Scott McKinney on Aug 5, 2008 5:08 PM EDT

Nice work here, Zep

rec’d.

A mind without purpose will walk in dark places.

by NHZ on Aug 6, 2008 6:29 PM EDT

Hitting Coaches

Nice post. Interesting info. My question is this:

What kind of effect does a hitting coach have on this?

Don't forget to send your broken maples to the US Forest Service.

by 306008 on Aug 6, 2008 11:11 PM EDT

If a hitting coach can help a player make more solid contact and, therefore, more hard hit balls, then he can affect it

But the difference between one’s actual BABIP and his xBABIP is basically luck. A hitting coach can’t do anything about that. Quite frankly, once a player gets to the majors, I don’t think hitting coaches are going to change or improve a hitter much.
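For anyone who wants to see that gap computed, here is a small sketch. BABIP uses the standard definition, (H - HR) / (AB - K - HR + SF). The expected side uses the common rule of thumb of line drive rate plus .120, which is an assumption here: the post says its xBABIP averages a couple of methods without spelling them out. The function names and the stat line are hypothetical.

```python
def babip(h: int, hr: int, ab: int, k: int, sf: int) -> float:
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = ab - k - hr + sf
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (h - hr) / balls_in_play


def xbabip_from_ld(ld_rate: float) -> float:
    """Rule-of-thumb expected BABIP: LD% + .120 (assumed, not the post's method)."""
    return ld_rate + 0.120


def luck(actual: float, expected: float) -> float:
    """Positive means more hits fell in than the contact profile suggests."""
    return actual - expected


if __name__ == "__main__":
    # Hypothetical stat line, not a real player's numbers.
    actual = babip(h=150, hr=15, ab=550, k=100, sf=5)
    expected = xbabip_from_ld(0.19)
    print(f"BABIP {actual:.3f}, xBABIP {expected:.3f}, luck {luck(actual, expected):+.3f}")
```

A coach who raises a hitter's line drive rate moves the expected number in this sketch; the leftover difference is the part treated as luck.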

This is just my opinion. I could easily be wrong.

by Scott McKinney on Aug 7, 2008 3:17 AM EDT

Hey Hey

Nice job, DZ, again… Mellinger mentioned this post in his blog recently.

OMG Banny. FWIW I am only crdtng u w/3 runs allwd bc of DDJ OMFG

by Matt Klaassen on Aug 7, 2008 2:20 PM EDT
