Wednesday, March 25, 2015

The Trouble With Stats (or the people who pretend to use them)

My father also used to enjoy telling me "figures don't lie, but liars figure." He was essentially trying to explain why people often don't trust those who use numbers to explain things that don't naturally appeal to our (highly flawed) instincts. There is a Simpsons quote along similar lines, but for those of you who are fans, you already know it, and for anyone else, it would be a waste of time.

There is another, larger problem, however, that also damages the reputation of statistical analysis and the things it can tell us about our world. It's when people know/do just enough to be dangerous, and without basic rigour produce findings that are unsupportable. When they spread those findings, people who don't have a strong understanding of the analytical process tend to take them at face value, and then end up being let down later. Rather than blaming the particular culprit who practiced the bad stats, they just blame stats in general. These bad actors aren't always guilty of malice, but it doesn't absolve them of crimes against science, and there is a perfect example on ESPN.com today.

Obligatory picture of Nomar, 'cause baseball!


Check out the article here (it's an Insider article, so if you aren't already behind the pay wall, you won't be able to read it). To summarize, Mr. Keating is trumpeting the value of a simple composite stat, runs per hit (R/H from now on), as some sort of gem in terms of offensive value, in both real and fantasy terms. He begins, ironically enough, by pointing out that "many sabermetricians [statisticians who study baseball] barely glance at stats like runs and RBIs, which depend heavily on a player's offensive context" (which is true), and then immediately goes on to make a vague declaration about the other skills a player with a high R/H must have (not true).

Keating proceeds to provide a handful of historical examples of other players with a high R/H, in fun anecdotal fashion. Fully three of the seven paragraphs in the article don't talk about the particular player that it focuses on, Brian Dozier of the Twins, and even the ones that mention him hardly focus on him. The examples provided vary widely across run environments, from the height of the power-infused steroid era, to pre-WWII baseball, with no accounting for what the differences in offensive numbers looked like at those times.

Finally, he cherry-picks a few contemporary data points in support of his assertion, none of which are particularly relevant to the point he is making (and some of which contradict it).

At no point does he cover anything like the methodology he used to arrive at his conclusions, or even imply that there was any methodology, which might be more disturbing. Here is one thing I know about people who appropriately use statistics: they love to share the gory details.

Now, I will admit, I had a little prior experience with this particular stat, because I had looked into it a few years ago, and ultimately rejected it as not useful. However, context always matters, so it was worth another look. One of the first things that Keating does in the article is subtly undermine sabermetricians and complex statistics, which is curious for a guy whose bio at the bottom of the page says that he covers statistical subjects as a senior reporter. Basically, what he says is that R/H is a stat "...that is so simple you can calculate it in your head and it'll tip you off to hidden fantasy values" (I challenge both parts of that statement).

He mentions BABIP later on without any explanation, so he is assuming a certain amount of sophistication on the part of his readers, which is understandable since most fantasy enthusiasts have more exposure to statistical analysis than an average baseball fan. Certainly, anyone interested in "new" composite stats would also be familiar with OPS, and quite possibly wOBA and wRC as well, so the assumption has to be that R/H offers something that these other stats don't.

This is where the whole thing falls apart under scrutiny. Keating says that R/H is a good proxy for identifying other skills like speed, walk rate, and power. All of those other stats I just mentioned do the same thing; the difference is that they actually work. Depending on your analytical preference (or which site you get your baseball fix from), you might choose to believe in OPS, wOBA, or wRC (or base runs, etc.), but the reality is that they are all pretty good, and so they correlate very highly with one another (r values above .95 between all of them). R/H, on the other hand, correlates with none of them, not even close.
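For anyone who wants to run this kind of check themselves, here is a minimal sketch of the Pearson correlation (the "r value" quoted above) in plain Python. It is not the script behind my numbers; the function names are my own for illustration, and you would have to supply real per-player values for R/H, OPS, wOBA, and so on from whatever source you trust.

```python
from math import sqrt


def runs_per_hit(runs, hits):
    """R/H as described in the article: runs scored divided by hits."""
    return runs / hits


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations (covariance numerator) and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)
```

Feed it one list of R/H values and one list of, say, wOBA values for the same players in the same order. A result near 1 or -1 means the two stats move together; a result near 0 means they mostly don't.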

Of course, those are all stats meant to capture real-world offensive value in a context-neutral way, and maybe R/H only works for fantasy value. So the obvious thing to do would be to look at R/H against ESPN fantasy player values. Unfortunately, once again, R/H has almost no correlation to the player rater values (whereas the fantasy rater correlates really well with those other stats).

"Wait," you might (and should) say, "the fantasy player rater is going to be heavily weighted towards counting stats, and thus might not be appropriate when comparing against a rate stat like R/H." Sadly, you, like Mr. Keating, would be wrong again, as Dozier was 7th in the majors in plate appearances (PA) among qualified batters, so he actually has an unfair advantage in this case.

The truth is, of the players who had enough plate appearances to qualify for the batting title last year, 4 of the top 5 by R/H were not in the top 30 fantasy players, and only 4 of the top 15 made the cut. Most of them were not even guys with a balanced speed/power/discipline profile like Dozier (which is what Keating says the stat should identify), but power hitters like Donaldson, Rizzo, and Bautista. If you look at wRC+ instead, every single one of the top 15 players was in the top 40 by fantasy value, and its r value against fantasy value is more than twice as high as that of R/H (.71 vs .31).
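The "how many of the top N by one stat also land in the top M by fantasy value" comparison is easy to reproduce. Here is a rough sketch, assuming you have two dictionaries keyed by player name; the function names and the tiny sample below are made up for illustration and are not real player data.

```python
def top_n(values_by_player, n):
    """Return the set of the n players with the highest values."""
    ranked = sorted(values_by_player, key=values_by_player.get, reverse=True)
    return set(ranked[:n])


def top_n_overlap(stat_by_player, fantasy_by_player, n_stat, n_fantasy):
    """Count players in the top n_stat by a stat who are also in the top n_fantasy by fantasy value."""
    return len(top_n(stat_by_player, n_stat) & top_n(fantasy_by_player, n_fantasy))


# Placeholder values for four hypothetical players (NOT real data).
toy_stat = {"A": 0.9, "B": 0.8, "C": 0.7, "D": 0.6}
toy_fantasy = {"A": 1.0, "B": 10.0, "C": 9.0, "D": 8.0}

print(top_n_overlap(toy_stat, toy_fantasy, n_stat=2, n_fantasy=2))
```

With real data you would call it with n_stat=15 and n_fantasy=30 for R/H, then again with wRC+ in place of R/H, and compare the two counts.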

A common way to look at how much noise might be in a particular stat is to look at year-over-year correlation, and I will say that for players with at least 350 PA in both 2013 and 2014 there was a slight correlation. At an r value just below .5, however, it wasn't particularly strong, highlighting just how much luck goes into this formula by including runs, and frankly, this number would probably change by looking at more years (I admit to not going deeper here, but I will point out the gap in case someone wants to look into it).

Bottom line: runs per hit is not a great stat overall, and it is not a great stat in the context of fantasy baseball value. I won't venture to say that Keating did this analysis and decided to hide the results from his readers, nor will I say that he was too lazy to do any analysis, but I do know that it took me all of thirty seconds to start finding problems with the assertion. As someone who plays fantasy baseball, I have no interest in giving my competition an advantage, but I also can't figure out how this kind of misleading "analysis" does any good for the field.

The issue isn't just that the number doesn't say what Keating claims it does, because it will, by the nature of its component parts and some of the reasons he gives, have some relationship with the skills he is trying to identify. The issues are that it does so very inefficiently due to flaws in the components used, and that we already have much better stats for highlighting the same information. The last, biggest issue is that despite being at least somewhat aware of these flaws, Keating is touting this approach nonetheless, under the guise of statistical analysis, without providing any evidence in support.

Dozier is a good player, and has fantasy value, but anyone using R/H to evaluate players is going to be disappointed in their fantasy season. It is neither predictive nor descriptive of a player's skill, and shouldn't be used that way. The problem is that by placing this article behind the pay wall of an authority like ESPN, the false premise is given a credence that it doesn't warrant, and undermines statistical analysis as a whole.
