Archive for September, 2014

There is a prolific base stealer on first base in a tight game. The pitcher steps off the rubber, varies his timing, or throws over to first several times during the AB. You’ve no doubt heard some version of the following refrain from your favorite media commentator: “The runner is disrupting the defense and the pitcher, and the latter has to throw more fastballs and perhaps speed up his delivery or use a slide step, thus giving the batter an advantage.”

There may be another side of the same coin: The batter is distracted by all these ministrations, he may even be distracted if and when the runner takes off for second, and he may take a pitch that he would ordinarily swing at in order to let the runner steal a base. All of this leads to decreased production from the batter, as compared to a proverbial statue on first, to which the defense and the pitcher pay little attention.

So what is the actual net effect? Is it in favor of the batter, as the commentators would have you believe (after all, they’ve played the game and you haven’t), or does it benefit the pitcher – an unintended negative consequence of being a frequent base stealer?

Now, even if the net effect of a stolen base threat is negative for the batter, that doesn’t mean that being a prolific base stealer is necessarily a bad thing. Attempting stolen bases, given a high enough success rate, presumably provides extra value to the offense independent of the effect on the batter. If that extra value exceeds that given up by virtue of the batter being distracted, then being a good and prolific base stealer may be a good thing. If the pundits are correct and the “net value of distraction” is in favor of the batter, then perhaps the stolen base or stolen base attempt is implicitly worth more than we think.

Let’s also not forget that the stolen base attempt, independent of the success rate, is surely a net positive for the offense, notwithstanding any potential distraction effects. That is due to the fact that when the batter puts the ball in play, whether it is a hit and run or a straight steal, there are fewer forces at second, fewer GDP, and the runner advances the extra base more often on a single, double, or out. Granted, there are a few extra line drive and fly ball DP, but there are many fewer GDP to offset those.

If you’ve already gotten the feeling that this whole steal thing is a lot more complicated than it appears on its face, you would be right. It is also not easy, to say the least, to try and ascertain whether there is a distraction effect and who gets the benefit, the offense or the defense. You might think, “Let’s just look at batter performance with a disruptive runner on first as compared to a non-disruptive runner.” We can even use a “delta,” “matched pairs,” or “WOWY” approach in order to control for the batter, and perhaps even the pitcher and other pertinent variables. For example, with Cabrera at the plate, we can look at his wOBA with a base stealing threat on first and with a non-base stealing threat. We can take the difference, say 10 points in wOBA in favor of the threat (IOW, the defense is distracted and not the batter), and weight that by the number of times we find a matched pair (the lesser of the two PA). In other words, a “matched pair” is one PA with a stolen base threat on first and one PA with a non-threat.

If Cabrera had 10 PA with a stolen base threat and 8 PA with someone else on first, we would weight the wOBA difference by 8 – we have 8 matched pairs. We do that for all the batters, weighting each batter’s difference by their number of matched pairs, and voila, we have a measure of the amount that a stolen base threat on first affects the batter’s production, as compared to a non-stolen base threat. Seems pretty simple and effective, right? Eh, not so fast.

Unfortunately there are myriad problems associated with that methodology. First of all, do we use all PA where the runner started on first but may have ended up on another base, or was thrown out, by the time the batter completed his PA? If we do that, we will be comparing apples to oranges. With the base stealing threats, there will be many more PA with a runner on second or third, or with no runners at all (on a CS or PO). And we know that wOBA goes down once we remove a runner from first base, because we are eliminating the first base “hole” with the runner being held on. We also know that the values of the offensive components differ depending on the runners and outs. For example, with a runner on second, the walk is not as valuable to the batter and the K is worse than a batted ball out, which has a chance to advance the runner.

What if we only look at PA where the runner was still at first when the batter completed his PA? Several researchers have done that, including myself and my co-authors in The Book. The problem with that method is that those PA are not an unbiased sample. For the non-base stealers, most PA will end with a runner on first, so that is not a problem. But with a stolen base threat on first, if we only include those PA that end with the runner still on first, we are only including PA that are likely biased in terms of count, score, game situation, and even the pitcher. In other words, we are only including PA where the runner has not attempted a steal yet (other than on a foul ball). That could mean that the pitcher is difficult to steal on (many of these PA will be with a LHP on the mound), the score is lopsided, the count is biased one way or another, etc. Again, if we only look at times where the PA ended with the runner on first, we are comparing apples to oranges when looking at the difference in wOBA between a stolen base threat on first and a statue.

It almost seems like we are at an impasse and there is nothing we can do, unless perhaps we try to control for everything, including the count, which would be quite an endeavor. Fortunately there is a way to solve this – or at least come close. We can first figure out the overall difference in value to the offense between having a base stealer and a non-base stealer on first, including the actual stolen base attempts. How can we do that? That is actually quite simple. We need only look at the change in run expectancy starting from the beginning to the end of the PA, starting with a runner on first base only. We can then use the delta or matched pairs method to come up with an average difference in change in RE. This difference represents the sum total of the value of a base stealer at first versus a non-base stealer, including any effect, positive or negative, on the batter.

From there we can try and back out the value of the stolen bases and caught stealings (including pick-offs, balks, pick-off errors, catcher errors on the throw, etc.) as well as the extra base runner advances and the avoidance of the GDP when the ball is put into play. What is left is any “distraction effect” whether it be in favor of the batter or the pitcher.

First, in order to classify the base runners, I looked at their number of steal attempts per times on first (BB+HP+S+ROE) for that year and the year before. If it was greater than 20%, they were classified as a “stolen-base threat.” If it was less than 2%, they were classified as a statue. Those were the two groups I looked at vis-à-vis the runner on first base. All other runners (the ones in the middle) were ignored. Around 10% of all runners were in the SB threat group and around 50% were in the rarely steal group.
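As a rough illustration, the classification rule can be sketched in a few lines of Python. The function name and return labels are hypothetical and not taken from the original study; only the 20% and 2% cutoffs come from the text, and the inputs are assumed to be the two-year totals.

```python
from typing import Optional

def classify_runner(steal_attempts: int, times_on_first: int) -> Optional[str]:
    """Classify a runner from his combined two-year totals (hypothetical helper).

    times_on_first is BB + HP + S + ROE, as defined in the text.
    Returns "threat", "statue", or None for the ignored middle group.
    """
    if times_on_first <= 0:
        return None  # no opportunities, so nothing to classify
    rate = steal_attempts / times_on_first
    if rate > 0.20:
        return "threat"
    if rate < 0.02:
        return "statue"
    return None  # runners in the middle are excluded from both groups

print(classify_runner(30, 100))  # threat
print(classify_runner(1, 100))   # statue
print(classify_runner(10, 100))  # None
```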

Then I looked at all situations starting with a runner on first (in one or the other stolen base group) and ending when the batter completes his PA or the runner makes the third out of the inning. The batter may have completed his PA with the runner still on first, on second or third, or with no one on base because the runner was thrown out or scored, via stolen bases, errors, balks, wild pitches, passed balls, etc.

I only included innings 1-6 (to try and eliminate pinch runners, elite relievers, late and close-game strategies, etc.) and batters who occupied the 1-7 slots. I created matched pairs for each batter such that I could use the “delta method” described above to compute the average difference in RE change. I did it year by year, i.e., the matched pairs had to be in the same year, but I included 20 years of data, from 1994-2013. The batters in each matched pair had to be on the same team as well as the same year. For example, Cabrera’s matched pairs of 8 PA with base stealers and 10 PA with non-base stealers would be in one season only. In another season, he would have another set of matched pairs.

Here is how it works: Batter A may have had 3 PA with a base stealer on first and 5 with a statue. His average change in RE (everyone starts with a runner on first only) at the end of the PA may have been +.130 runs for those 3 PA with the stolen base threat on first at the beginning of the PA.

For the 5 PA with a non-threat on first, his average change in RE may have been +.110 runs. The difference is .02 runs in favor of the stolen base threat on first and that gets weighted by 3 PA (the lesser of the 5 and the 3 PA). We do the same thing for the next batter. He may have had a difference of -.01 runs (in favor of the non-threat) weighted by, say, 2 PA. So now we have (.02 * 3 – .01 * 2) / 5, or .008 runs per PA, as our total average difference in RE change using the matched pair or delta method. Presumably (hopefully) the pitcher, score, parks, etc. are the same or very similar for both groups. If they are, then that final difference represents the advantage of having a stolen base threat on first base, including the stolen base attempts themselves.
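The weighting in this example can be sketched in Python as follows. This is only an illustration of the delta method as described here, not the code used for the study. The function name and tuple layout are assumptions, and for the second batter only the -.01 difference and the weight of 2 come from the text; his individual RE levels and his 4 PA with a statue are made up to produce them.

```python
def matched_pair_delta(batters):
    """Weighted average difference in RE change (threat minus statue).

    batters: iterable of (avg_re_threat, pa_threat, avg_re_statue, pa_statue),
    one tuple per batter-season. Each difference is weighted by the number
    of matched pairs, i.e. the lesser of the two PA counts.
    """
    weighted_sum = 0.0
    total_pairs = 0
    for re_threat, pa_threat, re_statue, pa_statue in batters:
        pairs = min(pa_threat, pa_statue)
        weighted_sum += (re_threat - re_statue) * pairs
        total_pairs += pairs
    if total_pairs == 0:
        raise ValueError("no matched pairs")
    return weighted_sum / total_pairs

example = [
    (0.130, 3, 0.110, 5),  # Batter A: +.02 difference, 3 matched pairs
    (0.100, 2, 0.110, 4),  # next batter: -.01 difference, 2 matched pairs (levels hypothetical)
]
print(round(matched_pair_delta(example), 3))  # 0.008
```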

A plus number means a total net advantage to the offense with a prolific base stealer on first, including his SB, CS, and speed on the bases when the ball is put into play, and a negative number means that the offense is better off with a slow, non-base stealer on first, which is unlikely of course. Let’s see what the initial numbers tell us. By the way, for the changes in RE, I am using Tango’s 1969-1992 RE matrix from this web site:

We’ll start the analysis with no out situations. One of the advantages of a base stealer on first is staying out of the GDP (again, offset by a few extra line drive and fly ball DP). There were a total of 5,065 matched pair PA (adding the lesser of the two PA for each matched pair). Remember a matched pair is a certain batter with a base stealing threat on first and that same batter in the same year with a non-threat on first. The runners are on first base when the batter steps up to the plate but may not be when the PA is completed. That way we are capturing the run expectancy change of the entire PA, regardless of what happens to the runner during the PA.

The average advantage in RE change (again, that is the ending RE after the PA is over minus the starting RE, which is always with a runner on first only, in this case with 0 out) was .032 runs per PA. So, as we expect, a base stealing threat on first confers an overall advantage to the offensive team, at least with no outs. This includes the net run expectancy of SB (including balks, errors, etc.) and CS (including pick-offs), advancing on WP and PB, advancing on balls in play, staying out of the GDP, etc., as well as any advantage or disadvantage to the batter by virtue of the “distraction effect.”

The average wOBA of the batter, for all PA, whether the runner advanced a base or was thrown out during the PA, was .365 with a non-base stealer on first and .368 for a base stealer.

What are the differences in individual offensive components between a base stealing threat and a non-threat originally on first base? The batter with a statue who starts on first base has a few more singles, which is expected given that he hits with a runner on first more often. As well, the batter with a base stealing threat walks and strikes out a lot more, due to the fact he is hitting with a base open more often.

If we then compute the RE value of SB, CS (and balks, pickoffs, errors, etc.) for the base stealer and non-base stealer, as well as the RE value of advancing the extra base and staying out of the DP, we get an advantage to the offense with a base stealer on first of .034 runs per PA.

So, if the overall value of having a base stealer on first is .032 runs per PA, and we compute that .034 runs comes from greater and more efficient stolen bases and runner advances, we must conclude that there is a .002 runs disadvantage to the batter when there is a stolen base threat on first base. That corresponds to around 2 points in wOBA. So we can say that with no outs, there is a 2 point penalty that the batter pays when there is a prolific base stealer on first base, as compared to a runner who rarely attempts a SB. In 5,065 matched PA, one SD of the difference between a threat and non-threat is around 10 points in wOBA, so we have to conclude that there is likely no influence on the batter.
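For readers who want to see where a figure like 10 points can come from, here is a back-of-the-envelope sketch. It assumes the per-PA standard deviation of wOBA value is roughly .5 and treats the two groups as independent samples of 5,065 PA each. Both are my assumptions for illustration, not numbers or methods stated in the study.

```python
import math

def sd_of_difference(per_pa_sd: float, n_pairs: int) -> float:
    """SD of the difference between two independent sample means,
    each taken over n_pairs plate appearances."""
    return math.sqrt(2 * per_pa_sd ** 2 / n_pairs)

ASSUMED_PER_PA_SD = 0.5  # assumption: rough spread of wOBA value per PA

sd = sd_of_difference(ASSUMED_PER_PA_SD, 5065)
print(round(sd * 1000))  # 10 (points of wOBA)
```

Under those assumptions, a 2 point difference is about a fifth of one SD.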

Let’s do the same exercise with 1 and then 2 outs.

With 1 out, in 3,485 matched pair PA, batters with non-threats on first had a wOBA of .388 and batters with threats had a wOBA of .367. The former had many more singles and of course fewer BB (a lot fewer) and K. Overall, with a non-base stealer starting on first base at the beginning of the PA, batters produced an RE that was .002 runs per PA better than with a base stealing threat. In other words, having a prolific, and presumably very fast, base stealer on first base offered no overall advantage to the offensive team, including the value of the SB, base runner advances, and avoiding the GDP.

If we compute the value that the stolen base threats provide on the base paths, we get .019 runs per PA, so the disadvantage to the batter by virtue of having a prolific base stealer on first base is .021 runs per PA, which is the equivalent of the batter losing 24 points in wOBA.

What about with 2 outs? With 2 outs, we can ignore the GDP advantage for the base stealer as well as the extra value from moving up a base on an out. So, once we get the average RE advantage for a base stealing threat, we can more easily factor out the stolen base and base running advantage to arrive at the net advantage or disadvantage to the batter himself.

With 2 outs, the average RE advantage with a base stealer on first (again, as compared to a non-base stealer) is .050 runs per PA, in a total of 2,390 matched pair PA. Here, the batter has a wOBA of .350 with a non-base stealer on first, and .345 with a base stealer. There is still a difference in the number of singles because of the extra hole with the first baseman holding on the runner, as well as the usual greater rate of BB with a prolific stealer on base. (Interestingly, with 2 outs, the batter has a higher K rate with a non-threat on base – it is usually the opposite.) Let’s again tease out the advantage due to the actual SB/CS and base running and see what we’re left with. Here, you can see how I did the calculations.

With the non-base stealer, the runner on first is out before the PA is completed 1.3% of the time, he advances to second, 4.4% of the time, and to third, .2%. The total RE change for all that is .013 * -.216 + .044 * .109 + .002 * .157, or .0023 runs, not considering the count when these events occurred. The minus .216, plus .109, and plus .157 are the change in RE when a base runner is eliminated from first, advances from first to second, and advances from first to third prior to the end of the PA (technically prior to the beginning of the PA). The .013, .044, and .002 are the frequencies of those base running events.
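The non-base stealer calculation above is just a frequency-weighted sum of RE changes, which can be sketched as below. The dictionary keys and the function name are illustrative. The three RE changes and the three frequencies are the ones quoted in the paragraph above.

```python
# Change in RE with 2 outs and a runner on first, as quoted in the text
RE_DELTA_2_OUTS = {"out": -0.216, "to_second": 0.109, "to_third": 0.157}

def pre_pa_running_value(freqs, re_delta=RE_DELTA_2_OUTS):
    """Expected RE change per PA from runner events before the PA ends.

    freqs maps each event to its frequency per PA (hypothetical layout).
    """
    return sum(freqs[event] * re_delta[event] for event in re_delta)

non_stealer = {"out": 0.013, "to_second": 0.044, "to_third": 0.002}
print(round(pre_pa_running_value(non_stealer), 4))  # 0.0023
```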

For the base stealer, we have .085 (thrown out) * -.216 + .199 (advance to 2nd) * .109 + .025 (advance to 3rd) * .157, or .0117. So the net advantage to the base stealer from advancing or being thrown out is .0117 minus .0023, or .014 runs per PA.

What about the advantage to the prolific and presumably fast base stealers from advancing on hits? The above .014 runs was from advances prior to the completion of the PA, from SB, CS, pick-offs, balks, errors, WP, and PB.

The base stealer advances the extra base from first on a single 13.5% more often and 21.7% more often on a double. Part of that is from being on the move and part of that is from being faster.

12.5% of the time, there is a single with a base stealing threat on first. He advances the extra base 13.5% more often, but the extra base with 2 outs is only worth .04 runs, so the gain is negligible (.0007 runs).

A runner on second followed by a single occurs 2.8% of the time with a stolen base threat on base. The base stealer advances the extra base and scores 14.6% more often than the non-threat, and each of those extra advances is worth .73 runs (being able to score from second on a 2-out single is extremely valuable), for a total gain of .73 * .028 * .146, or .003 runs.

With a runner on first and a double, the base stealer gains an extra .0056 runs.

So, the total base running advantage when the runner on first is a stolen base threat is .00925 runs per PA. Add that to the SB/CS advantage of .014 runs, and we get a grand total of .023 runs.
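The arithmetic behind these totals can be checked with a short sketch. The variable names are mine. Every number is quoted from the preceding paragraphs, and the .0056 for doubles is taken as given because the text does not break it down further.

```python
# Extra value per PA from advancing on hits with 2 outs (threat vs. non-threat)
single_from_first = 0.125 * 0.135 * 0.04    # single frequency * extra advance rate * run value
single_from_second = 0.028 * 0.146 * 0.73   # scoring from second on a single
double_from_first = 0.0056                  # quoted directly from the text

running_on_hits = single_from_first + single_from_second + double_from_first
sb_cs_advantage = 0.014                     # SB/CS advantage quoted in the text

total_runner_value = running_on_hits + sb_cs_advantage
print(round(running_on_hits, 4))     # 0.0093
print(round(total_runner_value, 3))  # 0.023
```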

Remember that the overall RE advantage was .050 runs, so if we subtract out the base runner advantage, we get a presumed advantage to the batter of .050 – .023, or .027 runs per PA. That is around 31 points in wOBA.
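Here is a small sketch of the same back-out step for all three out states. The conversion from runs per PA to points of wOBA is not spelled out in the text. The wOBA scale of 1.15 used below is an assumption on my part that happens to reproduce the quoted point values, and the function name is illustrative.

```python
ASSUMED_WOBA_SCALE = 1.15  # assumption: wOBA points = runs per PA * 1.15 * 1000

def distraction_effect(overall_runs: float, baserunning_runs: float):
    """Back out the batter 'distraction effect' from the overall RE advantage.

    Returns (runs per PA, approximate points of wOBA).
    """
    runs = overall_runs - baserunning_runs
    points = round(runs * ASSUMED_WOBA_SCALE * 1000)
    return round(runs, 3), points

# (outs, overall RE advantage, SB and base running value), all from the text
results = {
    outs: distraction_effect(overall, running)
    for outs, overall, running in [(0, 0.032, 0.034), (1, -0.002, 0.019), (2, 0.050, 0.023)]
}
for outs, (runs, points) in results.items():
    print(outs, runs, points)
# 0 -0.002 -2
# 1 -0.021 -24
# 2 0.027 31
```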

So let’s recap what we found. For each of no outs, 1 out, and 2 outs, we computed the average change in RE for every batter with a base stealer on first (at the beginning of the PA) and a non-base stealer on first. That tells us the value of the PA from the batter and the base runner combined. (That is RE24, by the way.) We expect that this number will be higher with base stealers, otherwise what is the point of being a base stealer in the first place if you are not giving your team an advantage?

Table I – Overall net value of having a prolific and disruptive base stealing threat on first base at the beginning of the PA, the value of his base stealing and base running, and the presumed value to the batter in terms of any “distraction effect.” Plus is good for the offense and minus good for the defense.

Outs | Overall net value   | SB and base running value | “Batter distraction” value
0    | +.032 runs (per PA) | .034 runs                 | -.002 runs (-2 points of wOBA)
1    | -.002 runs          | .019 runs                 | -.021 runs (-24 pts)
2    | +.050 runs          | .023 runs                 | +.027 runs (+31 pts)


We found that very much to be the case with no outs and with 2 outs, but not with 1 out. With no outs, the effect of a prolific base stealer on first was .032 runs per PA, the equivalent of raising the batter’s wOBA by 37 points, and with 2 outs, the overall effect was .050 runs, the equivalent of an extra 57 points for the batter. With 1 out, however, the prolific base stealer is in effect lowering the wOBA of the batter by 2 points. Remember that these numbers include the base running and base stealing value of the runner as well as any “distraction effect” that a base stealer might have on the batter, positive or negative. In other words, RE24 captures the influence of the batter as well as the base runners.

In order to estimate the effect on the batter component, we can “back out” the base running value by looking at how often the various base running events occur and their value in terms of the “before and after” RE change. When we do that, we find that with 0 outs there is no effect on the batter from a prolific base stealer starting on first base. With 1 out, there is a 24 point wOBA disadvantage to the batter, and with 2 outs, there is a 31 point advantage to the batter. Overall, that leaves around a 3 or 4 point negative effect on the batter. Given the relatively small sample sizes of this study, one would not want to reject the hypothesis that having a prolific base stealer on first base has no net effect on the batter’s performance. Why the effect depends so much on the number of outs, and what if anything managers and players can do to mitigate or eliminate these effects, I will leave for the reader to ponder.