Archive for September, 2018

In a recent tweet, the esteemed sabermetrician and current MLB Statcast honcho, Tom Tango, aka Tango, suggested that an increase in Statcast sprint speed (a runner's speed in feet per second on max-effort runs) from one year (year x) to the next (x+1), concomitant with an increase in offensive performance (from x to x+1), might portend an increase in expected offensive performance in the following year (x+2). He put out a call to saberists and aspiring saberists to look at the data and see if there is any evidence of such an effect. I took up the challenge.

I looked at all batters who had a recorded Statcast speed in 2016 and 2017 and at least 100 PA in each of 2016, 2017, and 2018, and separated them into 3 groups: one, an increase in speed greater than .5 feet per second; two, a decrease in speed of at least .46 feet per second; three, all the rest. I also separated each of those groups into 3 sub-groups by their change in context-neutral wOBA from 2016 to 2017: an increase of at least 21 points, a decrease of at least 21 points, and all the rest.
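As a rough illustration, the grouping rule described above could be sketched in Python like this. The thresholds come from the text; the function names and group labels are made up for the example and are not taken from the actual study code.

```python
# Thresholds taken from the text; all names and labels here are illustrative.
SPEED_GAIN_MIN = 0.5    # a "gain" must be greater than this
SPEED_LOSS_MIN = 0.46   # a "loss" must be at least this large
WOBA_DELTA_MIN = 0.021  # 21 points of wOBA, up or down


def speed_group(speed_change: float) -> str:
    """Classify a 2016-to-2017 change in Statcast speed."""
    if speed_change > SPEED_GAIN_MIN:
        return "gain"
    if speed_change <= -SPEED_LOSS_MIN:
        return "loss"
    return "rest"


def woba_group(woba_change: float) -> str:
    """Classify a 2016-to-2017 change in context-neutral wOBA."""
    if woba_change >= WOBA_DELTA_MIN:
        return "up"
    if woba_change <= -WOBA_DELTA_MIN:
        return "down"
    return "same"


def classify(speed_change: float, woba_change: float) -> tuple:
    """Return the (speed group, wOBA sub-group) pair for one batter."""
    return speed_group(speed_change), woba_group(woba_change)
```

Each batter lands in exactly one of the 3 x 3 cells, so the sub-group PA totals should add up to the PA of the parent speed group.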

Then I looked at their 2018 performance compared to their 2018 projections. I used my own proprietary projections, which are not publicly available but are often referred to on social media and on this blog. They are probably more or less as good as any of the other credible projection systems out there, like Steamer and ZiPS. For the record, I don’t use Statcast speed data in my projection models. In fact, I don’t use any Statcast data at all (such as exit velocity, launch angle, expected wOBA, etc.).

The hypothesis, I suppose, is that if a player sees an increase in Statcast speed and an increase in performance from 2016 to 2017, we will underestimate his expected performance in 2018. This makes sense, as any independent data that is correlated directly or indirectly with performance can and should be used to improve the accuracy of the projections. An increase or decrease in speed might signal a change in health, fitness, or injury status that the projections do not otherwise account for.

Let’s look at the data.

If we look at the players who gain speed, lose speed, and all the rest from 2016 to 2017, with no regard to how their offensive performance changed, we see this:

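| Group | 2018 PA | 16-17 Speed Change | 17-18 Speed Change | 16-17 wOBA Change | 18 Projected wOBA | 18 Actual wOBA |
| --- | --- | --- | --- | --- | --- | --- |
| Speed gain | 9,374 | .88 | -.28 | .031 | .328 | .342 |
| Speed loss | 30,593 | -.72 | .07 | -.004 | .332 | .330 |
| All the rest | 136,030 | 0.00 | -.17 | -.002 | .332 | .331 |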


Lots of good stuff in the above chart! We can clearly see that increased Statcast speed is, on the average, accompanied by a substantial increase in batting performance: an average speed gain of .88 feet per second goes along with a 31-point increase in wOBA. Presumably, either these players were suffering from some malady affecting their speed and offense in 2016 but not in 2017, or they did something in terms of health or fitness in 2017 that increased their speed and wOBA. And yes, we see a substantial underestimation of their 2018 performance: 14 points. Interestingly, in 2018 these players give back around a third of the speed they gained in 2017. Keep in mind that all players, on the average, lose speed from year x to x+1 because of the natural process of aging, probably starting from a very young age.

Also of interest is the fact that for all the other players, those that lose speed and those that have no change, our projections are right on the money. It is surprising that we aren’t overestimating the 2018 performance of players who lose speed. Also of note is the fact that players who lose speed from 2016 to 2017 gain back a little in 2018, even though all players are expected to lose speed from one year to the next. That suggests that at least a small portion of the loss is transient, due to some injury or other health issue. We also don’t see a substantial loss in offensive performance accompanying a large loss in speed: only 4 points in wOBA, not far from the 2-point decline that is normal for all players because of aging.
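The arithmetic behind the numbers quoted above can be reproduced with a few lines of Python. This is only a sketch: the figures are copied from the table, the assignment of rows to groups is inferred from the sign of the 2016-17 speed change, and the helper names are invented for the example.

```python
# (2018 PA, 16-17 speed change, 17-18 speed change, projected wOBA, actual wOBA)
# Row-to-group labels are an assumption based on the sign of the speed change.
GROUPS = {
    "speed gain": (9374, 0.88, -0.28, 0.328, 0.342),
    "speed loss": (30593, -0.72, 0.07, 0.332, 0.330),
    "rest": (136030, 0.00, -0.17, 0.332, 0.331),
}


def miss_in_points(projected: float, actual: float) -> int:
    """Actual minus projected wOBA in points; positive means under-projection."""
    return round((actual - projected) * 1000)


def share_reversed(change_16_17: float, change_17_18: float) -> float:
    """Fraction of the 2016-17 speed change that was undone in 2017-18."""
    return -change_17_18 / change_16_17


for name, (pa, d1, d2, proj, act) in GROUPS.items():
    line = f"{name}: miss {miss_in_points(proj, act):+d} points"
    if d1 != 0:
        line += f", {share_reversed(d1, d2):.0%} of the speed change reversed"
    print(line)
```

A positive miss means the projection was too low, which is the case only for the speed gainers.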

Overall, the data suggest that losing speed may be a combination of injury and “bulking up” resulting in no net gain or loss in offensive performance, whereas a gain in speed suggests better fitness, at least relative to the previous year (which may have been injury-plagued), resulting in a substantial gain in wOBA.

What if we further break up these groups into those that did indeed gain or lose wOBA from 2016 to 2017? How does that affect the accuracy of our projections? For example, if a player gains speed but his offense doesn’t show a substantial uptick, are our projections more solid? Similarly, if a player loses speed and his wOBA substantially decreases, are our projections too pessimistic?

First let’s see how our projections do in general when players gain or lose a substantial amount of offense from 2016 to 2017. Maybe we aren’t weighting recent performance properly or otherwise handling drastic changes in performance well. Good forecasters should always be checking and calibrating these kinds of things.

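| Group | 2018 PA | 16-17 wOBA Change | 18 Projected wOBA | 18 Actual wOBA |
| --- | --- | --- | --- | --- |
| wOBA about the same | 48,120 | .000 | .331 | .334 |
| wOBA increase | 42,720 | .045 | .334 | .330 |
| wOBA decrease | 45,190 | -.041 | .330 | .330 |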


We don’t see any tremendous biases here. Maybe a bit too much weight on a recent uptick in performance. So let’s break those “speed increase and decrease” groups into increases and decreases in wOBA and see how it affects our projections.

Players whose speed increases from 2016 to 2017

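| Sub-group | 2018 PA | 16-17 Speed Change | 17-18 Speed Change | 16-17 wOBA Change | 18 Projected wOBA | 18 Actual wOBA |
| --- | --- | --- | --- | --- | --- | --- |
| wOBA about the same | 3,633 | .85 | -.15 | .006 | .326 | .358 |
| wOBA increase | 5,285 | .92 | -.38 | .054 | .332 | .337 |
| wOBA decrease | 456 | .67 | -.14 | -.035 | .309 | .292 |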


So, the substantial under-projections seem to occur when a player gains speed but his wOBA remains about the same. When his speed and his offensive performance both go up, our projections don’t miss the mark by all that much. I don’t know why that is. Maybe it’s noise. We’re dealing with fairly small sample sizes here. In 4,000 PA, one standard deviation in wOBA is around 10 points. One standard deviation of the difference between a projected and an actual wOBA is even greater, as there are random and other uncertainties in both measures. Interestingly, it appears to be quite unlikely that a player can gain substantial speed while his wOBA decreases: that sub-group has only 456 PA.
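To get a feel for how the noise grows as the samples shrink, here is a minimal sketch that scales the quoted figure (about 10 points of wOBA in 4,000 PA) to other sample sizes. It assumes plate appearances behave like independent trials, so that the standard deviation shrinks with the square root of PA. That is a simplification, and the function name is made up for the example.

```python
import math

REF_PA = 4000        # sample size quoted in the text
REF_SD_POINTS = 10   # quoted SD of wOBA, in points, at REF_PA


def sd_points(pa: int) -> float:
    """Approximate random SD of wOBA (in points) over `pa` plate appearances.

    Assumes independent PA, so the SD scales with 1 / sqrt(PA).
    """
    return REF_SD_POINTS * math.sqrt(REF_PA / pa)


# PA totals of the three "speed gain" sub-groups from the table above
for pa in (3633, 5285, 456):
    print(pa, round(sd_points(pa), 1))
```

These figures cover only the random variation in the actual wOBA. The uncertainty in the difference between projected and actual wOBA is larger, so they are best read as a lower bound on the noise.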

What about for players who lose speed?

Players whose speed decreases from 2016 to 2017

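| Sub-group | 2018 PA | 16-17 Speed Change | 17-18 Speed Change | 16-17 wOBA Change | 18 Projected wOBA | 18 Actual wOBA |
| --- | --- | --- | --- | --- | --- | --- |
| wOBA about the same | 12,444 | -.70 | .12 | .000 | .337 | .341 |
| wOBA increase | 8,188 | -.67 | .05 | .040 | .333 | .327 |
| wOBA decrease | 9,961 | -.79 | .03 | -.045 | .326 | .320 |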


We see a similar pattern here in reverse, although none of the differences between projected and actual are large. When a player loses speed and his offense decreases, we tend to overestimate his 2018 wOBA (our projections are too optimistic), but only by around 6 points. Remember that when a player gains speed and offense, we underestimate his 2018 performance by around the same amount. However, unlike with the “speed gain players,” here we see an overestimation regardless of whether the “speed losers” had an accompanying increase or decrease in offensive performance, and we don’t see a large error in our projections for players who lose speed but don’t gain or lose offense.
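As a sanity check on the sub-group tables, the PA-weighted average of the sub-group misses should roughly reproduce the overall miss for each speed group. The sketch below uses the figures from the two sub-group tables. The data layout and function name are illustrative only.

```python
# (2018 PA, projected wOBA, actual wOBA) for each wOBA sub-group,
# copied from the "speed increases" and "speed decreases" tables.
SUBGROUPS = {
    "gain": [(3633, 0.326, 0.358), (5285, 0.332, 0.337), (456, 0.309, 0.292)],
    "loss": [(12444, 0.337, 0.341), (8188, 0.333, 0.327), (9961, 0.326, 0.320)],
}


def weighted_miss_points(rows) -> float:
    """PA-weighted average of (actual - projected) wOBA, in points."""
    total_pa = sum(pa for pa, _, _ in rows)
    weighted = sum(pa * (actual - projected) for pa, projected, actual in rows)
    return 1000 * weighted / total_pa


for name, rows in SUBGROUPS.items():
    total_pa = sum(pa for pa, _, _ in rows)
    print(f"{name}: {total_pa} PA, miss {weighted_miss_points(rows):+.1f} points")
```

With these inputs the weighted misses come out near +14 points for the speed gainers and near -2 points for the speed losers, in line with the first table, and the sub-group PA sum to the group totals of 9,374 and 30,593.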

I think overall, we see that our initial hypothesis is likely true, namely that when players see an increase in Statcast speed from one year to the next (2016-2017 in this study), we tend to underestimate their year 3 performance (2018 in this case). Similarly, when player speed decreases from year x to x+1, we tend to over-project their year x+2 wOBA, although at a level less than half that of the “speed gainers.” Keep in mind that our samples are relatively small, so more robust work needs to be done in order to gain more certainty in our conclusions, especially when you further break down the two main groups into those that increase, decrease, and remain the same in offensive performance. Age might also be a significant factor here, as the groups might differ in average age and our projection algorithm might not be handling the aging process well once the speed data are taken into account.