Archive for the ‘Fielding’ Category

Now that Adam Eaton has been traded from the White Sox to the Nationals, much has been written about his somewhat unusual “splits” in his outfield defense as measured by UZR and DRS, two of the more popular batted-ball defensive metrics. In RF, his career UZR per 150 games is around +20 runs; in CF, -8 runs. He has around 100 career games in RF and 300 in CF. These numbers do not include “arm runs,” as I’m going to focus only on range and errors in this essay. If you are not familiar with UZR or DRS, you can do some research on the net or just assume that they are useful metrics for quantifying defensive performance and for projecting defense.

In 2016 Eaton was around -13 in CF and +20 in RF. DRS was similar but with a narrower (though still unusual) spread. We expect that a player who plays both CF and the corners in a season or within a career will have a spread of around 5 or 6 runs between CF and the corners (more between CF and RF than between CF and LF). For example, a CF’er who has a UZR of zero, and thus is exactly average among all CF’ers, will have an expected UZR of around +5.5 at the corners, again a bit more in RF than LF (LF’ers are better fielders than RF’ers).

This has nothing to do with how “difficult” each position is (that is hard to define anyway – you could even make the argument that the corner positions are “harder” than CF), as UZR and DRS are calculated as runs above or below the average fielder at that position. It merely means that the average CF’er is a better fielder than the average corner OF’er by around 5 or 6 runs. Mostly they are faster. The reason teams put their better fielders in CF is not because it is an inherently more “difficult” position but because it gets around twice the number of opportunities per game as the corner positions, so putting your best fielder there leverages talent in the OF.

Back to Eaton. He appears to have performed much better in RF than we would expect given his performance in CF (or vice versa) or even overall. Does this mean that he is better suited to RF (and perhaps LF, where he hasn’t played much in his career) or that the big, unusual gap we see is just a random fluctuation, or somewhere in the middle as is often (usually) the case? Should the Nationals make every effort to play him in RF and not CF? After all, their current RF’er, Harper, has unusual splits too, but in the opposite direction – his career CF UZR is better than his career RF UZR! Or perhaps the value they’re getting from Eaton is diminished if they’re going to play him in CF rather than RF.

How could a fielder have such unusual defensive splits that are solely or mostly due to chance? For the same reason a hitter can have unusual but random platoon splits, or a pitcher unusual but random home/road or day/night splits. A metric like UZR or DRS, like almost all metrics, contains a large element of chance, or noise if you will. That noise comes from two sources – one, the data and methodology are far from perfect, and two, actual defensive performance can fluctuate randomly (or for reasons we are just not aware of) from one time period to another – from play to play, game to game, or position to position, for various reasons or for no reason at all.

To the first point, just because our metric “says” that a player was +10 in UZR, that does not necessarily mean that he performed exactly that well. In reality, he might have performed at a +15 level, or he might have performed at a 0 or even a -10 level. It’s more likely of course that he performed at +5 than at +20 or 0, but because of the limits of our data and methodology, the +10 is only an estimate of his performance. To the second point, actual fielding performance, even if we could measure it precisely, is, like hitting and pitching, subject to random fluctuations for reasons known (or at least speculated) and unknown to us. On one play a player can get a great jump and make a spectacular play, and on another that same player can take a bad route, get a bad jump, the ball can pop out of his glove, etc. Some days fielders probably feel better than others. Etc.

So whenever we compare one time period to another, or one position to another – even positions which require similar, perhaps even identical, skills, like in the OF – it is possible, even likely, that we are going to get different results by chance alone, or at least because of the two dynamics I explained above (don’t get hung up on the words “luck,” “chance,” or “random”). Statistics tells us that those random differences will be more and more unlikely the further away we get from what is expected (e.g., we expect that play in CF will be 5 or 6 runs “worse” than play in RF or LF); however, statistics also tells us that any difference, even a large one like we see with Eaton (or larger), can and does occur by chance alone.

At the same time, it is possible, maybe even likely, that a player could somehow be more suited to RF (or LF) than CF, or vice versa. So how do we determine how much of an unusual “split” in OF defense is likely chance and how much is likely “skill?” In other words, what would we expect future defense to be in RF and in CF for a player with unusual RF/CF splits? Remember that future performance always equates, more or less, to an estimate of talent. For example, if we find strong evidence that almost all of these unusual splits are due to chance alone (virtually no skill), then we must assume that the player with the unusual splits will revert to normal splits in any future time frame. In the case of Eaton, that would mean that we would construct an OF projection based on all of his OF play, adjusted for position, and then do the normal adjustment for our CF or RF projection, such that his RF projection will be around 7 runs greater than his CF projection, rather than the 20-run or more gap that we see in his past performance.

To examine this question, I looked at all players who played at least 20 games in CF and RF or LF from 2003 through 2015. I isolated those with various unusual splits. I also looked at all players to establish a baseline. At the same time, I crafted a basic one-season Marcel-like projection from that CF and corner performance combined. The way I did that was to adjust the corners to represent CF by subtracting 4 runs from LF UZR and 7 runs from RF UZR. Then I regressed that number based on the number of total games in that one season, added in an aging factor (-.5 runs for players under 27 and -1.5 runs for players 27 and older), and the resulting number was a projection for CF.

We can then take that number and add 4 runs for a LF projection and 7 runs for a RF projection. Remember these are range and errors only (no arm). So, for example, if a player was -10 in CF per 150 in 50 games and +3 in RF in 50 games, his projection would be:

Subtract 7 runs from his RF UZR to convert it into “CF UZR,” so it’s now -4. Average that with his -10 UZR in CF, which gives him a rate of -7 runs per 150 over those 100 games. I am using 150 games as the 50% regression point, so we regress this player 150/(150+100), or 60%, toward a mean of -3 (because these are players who play both CF and corner, they are below-average CF’ers). That comes out to -4.6. Add in an aging factor, say -.5 for a 25-year-old, and we get a projection of -5.1 for CF. That would mean a projection of -1.1 in LF (a +4 run adjustment) and +1.9 in RF (a +7 run adjustment), assuming normal “splits.”
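
To make the arithmetic concrete, here is a minimal sketch of that projection in Python. This is not the actual UZR or projection code – the function name and structure are mine – but the constants (the -4/-7 corner conversions, the 150-game regression point, the -3 mean for dual-position players, and the aging factors) are the ones described above.

    # Minimal sketch of the Marcel-like OF projection described above.
    # Illustrative only; all constants are taken from the text.

    CORNER_TO_CF = {"LF": -4, "RF": -7}   # convert corner UZR/150 to a CF-equivalent
    REGRESSION_GAMES = 150                # the 50% regression point, in games
    DUAL_PLAYER_MEAN = -3                 # mean CF UZR/150 for players who play both

    def project_cf(stints, age):
        """stints: list of (position, uzr_per_150, games) tuples."""
        games = sum(g for _, _, g in stints)
        # Convert every stint to CF-equivalent UZR/150, weighted by games.
        cf_rate = sum((uzr + CORNER_TO_CF.get(pos, 0)) * g
                      for pos, uzr, g in stints) / games
        # Regress toward the dual-player mean; at 150 games the weights are 50/50.
        w = games / (games + REGRESSION_GAMES)
        regressed = w * cf_rate + (1 - w) * DUAL_PLAYER_MEAN
        # Aging factor from the text: -0.5 under age 27, -1.5 at 27 and older.
        return regressed + (-0.5 if age < 27 else -1.5)

    cf = project_cf([("CF", -10, 50), ("RF", 3, 50)], age=25)
    print(round(cf, 1), round(cf + 4, 1), round(cf + 7, 1))   # -5.1, -1.1, 1.9 (CF, LF, RF)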

So let’s look at some numbers. To establish a baseline and test (and calibrate) our projections, let’s look at all players who played CF and LF or RF in season one (min 20 games in each) and then their next season in either CF or the corners:

Position    UZR season one          UZR season two      Projected UZR
LF or RF    +6.0 (N games=11629)    +2.1 (N=42866)      +2.1
CF          -3.0 (N=9955)           -0.8 (N=23083)      -0.9

The spread we see in the “UZR season one” column is based on the “delta method.” It is expected to be a little wider than the normal talent spread we expect between CF and LF/RF, which is around 6 runs. That is because of selective sampling. Players who do well at the corners will tend to also play CF, and players who play poorly in CF will tend to get some play at the corners. The spread we see in the “UZR season two” column does not mean anything per se. In season two these are not necessarily players who played both positions again (they played either one or the other or both). All it means is that, of players who played both positions in season one, they were 2.1 runs above average at the corners and 0.8 runs below average in CF in season two.

Now let’s look at the same table for players like Eaton, who had larger than normal splits between a corner position and CF. I used a threshold of at least a 10-run difference (5.5 is typical). There were 254 players who played at least 20 games in CF and in RF or LF in one season and then played in LF in the next season, and 138 players who played in CF and LF or RF in one season and in RF in the next.

Position    UZR season one          UZR season two      Projected UZR
LF or RF    +12.7 (N games=4924)                        +1.4
CF          -12.3 (N=4626)                              +0.3

For now, I’m leaving the third column, their UZR in season two, empty. These are players who appeared to be better suited at a corner position than in CF. If we assume that these unusual splits are merely noise, a random fluctuation, and that we expect them to have a normal split in season two, we can use the method I describe above to craft a projection for them. Notice the small split in the projections. The projection model I am using creates a CF projection and then it merely adds +4 runs for LF and +7 for RF. Given a 25-run split in season one rather than a normal 6-run split, we might assume that these players will play better, maybe much better, in RF or LF than in CF, in season two. In other words, there is a significant “true talent defensive split” in the OF. So rather than 1.4 in LF or RF (our projection assumes a normal split), we might see a performance of +5, and instead of .3 in CF, we might see -5, or something like that.

Remember that our projection doesn’t care how the CF and corner OF UZR’s are distributed in season one. It assumes static talent and just converts corner UZR to CF UZR by subtracting 4 or 7 runs. Then when it finalizes the CF projection, it assumes we can just add 4 runs for a LF projection and 7 runs for a RF one. It treats all OF positions the same, with a static conversion, regardless of the actual splits. The projection assumes that there is no such thing as “true talent OF splits.”

Now let’s see how well the projection does with that assumption (no such thing as “true talent OF defensive splits”). Remember that if we assume that there is “something” to those unusual splits, we expect our CF projection to be too high and our LF/RF projection to be too low.

Position    UZR season one          UZR season two      Projected UZR
LF or RF    +12.7 (N games=4924)    +0.9 (N=16857)      +1.4
CF          -12.3 (N=4626)          +0.8 (N=10250)      +0.3

We don’t see any evidence of a “true talent OF split” when we compare projected to actual. In fact, we see the opposite effect, which is likely just noise (our projection model is pretty basic and not very precise). Instead of seeing better than expected defense at the corners as we might expect from players like Eaton who had unusually good defense at the corners compared to CF in season one, we see slightly worse than projected defense. And in CF, we see slightly better defense than projected even though we might have expected these players to be especially unsuited to CF.

Let’s look at players, unlike Eaton, who have “reverse” splits. These are players who, in at least 20 games each in CF and in LF or RF, had a better UZR in CF than at the corners.

Position    UZR season one          UZR season two      Projected UZR
LF or RF    -4.8 (N games=3299)     +1.4 (N=15007)      +2.4
CF          +7.8 (N=3178)           -4.4 (N=6832)       -2.6

Remember, the numbers in column two, season one UZR “splits” are based on the delta method. Therefore, every player in our sample had a better UZR in CF than in LF or RF and the average difference was 12.6 runs (in favor of CF) whereas we expected an average difference of minus 6 runs or so (in favor of LF/RF). The “delta method” just means that I averaged all of the players’ individual differences weighted by the lesser of their games, either in CF or LF/RF.
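If the delta method is unclear, here is a small illustration in Python. The numbers are made up and the function is mine; this is just the weighted-average-of-differences idea described above, not anyone’s production code.

    # The "delta method": average each player's individual CF-minus-corner
    # difference, weighted by the lesser of his games at the two positions.
    # The sample data below are invented for illustration.

    def delta_method(players):
        """players: list of (cf_uzr, cf_games, corner_uzr, corner_games)."""
        weights = [min(cf_g, c_g) for _, cf_g, _, c_g in players]
        diffs = [cf - c for cf, _, c, _ in players]
        return sum(d * w for d, w in zip(diffs, weights)) / sum(weights)

    sample = [(7.8, 60, -4.8, 40), (2.0, 30, -1.0, 80), (12.0, 25, 3.0, 50)]
    print(round(delta_method(sample), 1))   # average CF-minus-corner split, in runs/150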

Again, according to the “these unusual splits must mean something” (in terms of talent and what we expect in the next season) theory, we expect these players to significantly exceed their projection in CF and undershoot it at the corners. Again, we don’t see that. We see that our projections are high for both positions; in fact, we overshoot more in CF than in RF/LF, exactly the opposite of what we would expect if there were any significance to these unusual splits. Again we see no evidence of a “true talent split in OF defense.”

For players with unusual splits in OF defense, we see that a normal projection at CF or at the corners suffices. We treat LF/RF/CF UZR exactly the same, making static adjustments regardless of the direction and magnitude of the empirical splits. What about the idea that “we don’t know what to expect with a player like Eaton?” I don’t really know what that means, but we hear it all the time when we see numbers that look unusual or “trendy” or appear to follow a “pattern.” Does it mean we expect there to be more fluctuation in season two UZR? Perhaps, even though on the average they revert to normal spreads, we see a wider spread of results in these players who exhibit unusual splits in season one. Let’s look at that in our final analysis.

When we look at all players who played CF and LF/RF in season one, remember the average spread was 9 runs: +6 at the corners and -3 in CF. In season two, 28% of the players who played RF or LF had a UZR greater than +10, and 26% of those in CF had a UZR of -10 or worse. The standard deviation of the distribution in season two UZR was 13.9 runs for LF/RF and 15.9 in CF.

What about our players like Eaton? Can we expect more of them to have a poor UZR in CF and a great one at a corner? No. 26% of these players had a UZR greater than +10 at a corner, and 25% had a UZR less than -10 in CF, around the same as all “dual” players in season one. In fact, we get a smaller spread with these players with unusual splits, as we would expect, given that their means in CF and at the corners are actually closer together (look at the tables above). The standard deviation of the distribution in season two UZR for these players was 13.2 runs for LF/RF and 15.3 in CF, slightly smaller than for all “dual” players combined.

In conclusion, there is simply nothing to write about when it comes to Eaton’s or anyone else’s unusual outfield UZR or DRS splits. If you want to estimate their UZR going forward simply adjust and combine all of their OF numbers and do a normal projection. It doesn’t matter if they have -16 in LF and +20 in CF, 0 runs in CF only, or +4 runs in LF only. It’s all the same thing with exactly the same projection and exactly the same distribution of results the next season.

As far as we can tell there is simply no such thing (to any significant or identifiable degree) as an outfielder who is more suited to one OF position than another. There is outfield defense – period. It doesn’t matter where you are standing in the OF. The ability to catch line drives and fly balls in the OF is more or less the same whether you are standing in the middle or on the sides of the OF (yes it could take some time to get used to a position if you are unfamiliar with it). If you are good in one location you will be good at another, and if you are bad at one location you will be bad at another. Your UZR or DRS might change in a somewhat predictable fashion depending upon what position, CF, LF, or RF is being measured, but that’s only because the players you are measured against (those metrics are relative) differ in their average ability to catch fly balls and line drives. More importantly, when you see a player who has an unusual “split” in their outfield numbers, like Eaton, you will be tempted to think that they are intrinsically better at one position than another and that the unusual split will tend to continue in the future. When you see really large splits you will be tempted even more. Remember the words in this paragraph and remember this analysis to avoid being fooled by randomness into drawing faulty conclusions, as all human beings, even smart ones, are wont to do.


Note: There is the beginning of a very good discussion about this topic on The Book blog. If this topic interests you, feel free to check it out and participate if you want to.

I’ve been thinking about this for many years and in fact I have been threatening to redo my UZR methodology, in order to try and reduce one of the biggest weaknesses inherent in most if not all of the batted ball advanced defensive metrics.

Here is how most of these metrics work: Let’s say a hard hit ball was hit down the third base line and the third baseman made the play and threw the runner out. He would be credited with an out minus the percentage of time that an average fielder would make the same or similar play, perhaps 40% of the time. So the third baseman would get credit for 60% of a “play” on that ball, which is roughly .9 runs (the difference between the average value of a hit down the 3rd base line and an out) times .6 or .54 runs. Similarly, if he does not make the play, he gets debited with .4 plays or minus .36 runs.
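In code, the standard credit/debit scheme looks something like this minimal sketch (mine, not the actual UZR engine), using the 40% catch rate and the roughly .9 runs per play from the example:

    # Standard credit/debit scheme: (1 - p) plays for a catch, -p plays for
    # a miss, where p is the league-average catch rate for the bucket,
    # converted to runs at roughly 0.9 runs per play.

    RUNS_PER_PLAY = 0.9   # hit value minus out value for this kind of ball

    def play_runs(caught, avg_catch_rate):
        plays = (1 - avg_catch_rate) if caught else -avg_catch_rate
        return plays * RUNS_PER_PLAY

    print(round(play_runs(True, 0.40), 2))    # catch: +0.6 plays = +0.54 runs
    print(round(play_runs(False, 0.40), 2))   # miss:  -0.4 plays = -0.36 runs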

There are all kinds of adjustments which can be made, such as park effects, handedness of the batter, speed of the runner, outs and base runners (these affect the positioning of the fielders and therefore the average catch rate), and even the G/F ratio of the pitcher (e.g., a ground ball pitcher’s “hard” hit balls will be a little softer than a fly ball pitcher’s “hard” hit balls).

Anyway, here is the problem with this methodology which, as I said, is basic to most if not all of these defensive metrics, and it has to do with our old friend Bayes. As is usually the case, this problem is greater in smaller sample sizes. We don’t really know the probability of an average fielder making any given play; we can only roughly infer it from the characteristics of the batted ball that we have access to and perhaps from the context that I described above (like the outs, runners, batter hand, park, etc.).

In the above example, a hard hit ground ball down the third base line, I said that the league average catch rate was 40%. Where did I get that number from? (Actually, I made it up, but let’s assume that it is a correct number in MLB over the last few years, given the batted ball location database that we are working with.) We looked at all hard hit balls hit to that approximate location (right down the third base line), according to the people who provide us with the database, and found out that of those 600-some-odd balls over the last 4 years, 40% of them were turned into outs by the third baseman on the field.

So what is wrong with giving a third baseman .6 credit when he makes the play and .4 debit when he doesn’t? Well, surely not every single play, if you were to “observe” and “crunch” the play like, say, Statcast would do, is caught exactly 40% of the time. For any given play in that bucket, whether the fielder caught the ball or not, we know that he didn’t really have exactly a 40% chance of catching it if he were an average fielder. You knew that already. That 40% is the aggregate for all of the balls that fit into that “bucket” (“hard hit ground ball right down the third base line”).

Sometimes it’s 30%. Other times it’s 50%. Still other times it is near 0 (like if the 3rd baseman happens to be playing way off the line, and correctly so) or near 100% (like when he is guarding the line and he gets a nice big hop right in front of him), and everything in between.

On the average it is 40%, so you say, well, what are we to do? We can’t possibly tell from the data how much it really varies from that 40% on any particular play, which is true. So the best we can do is assume 40%, which is also true. That’s just part of the uncertainty of the metric. On the average, it’s right, but with error bars. Right? Wrong!

We do have information which helps us to nail down the true catch percentage of the average fielder given that exact same batted ball, at least how it is recorded by the people who provide us with the data. I’m not talking about the above-mentioned adjustments like the speed of the batter, his handedness, or that kind of thing. Sure, that helps us and we can use it or not. Let’s assume that we are using all of these “contextual adjustments” to the best of our ability. There is still something else that can help us to tweak those “league average caught” percentages such that we don’t have to use 40% on every hard hit ground ball down the line. Unfortunately, most metrics, including my own UZR, don’t take advantage of this valuable information even though it is staring us right in the face. Can you guess what it is?

The information that is so valuable is whether the player caught the ball or not! You may be thinking that that is circular logic or perhaps illogical. We are using that information to credit or debit the fielder. How and why would we also use it to change the base line catch percentage – in our example, 40%? In comes Bayes.

Basically what is happening is this: Hard ground ball is hit down the third base line. Overall 40% of those plays are made, but we know that not every play has a 40% chance of being caught because we don’t know where the fielder was positioned and we don’t really know the exact characteristics of the ball which greatly affect its chances of being caught: it was hit hard, but how hard? What kind of a bounce did it take? Did it have spin? Was it exactly down the line or 2 feet from the line (they were all classified as being in the same “location”)? We know the runner is fast (let’s say we created a separate bucket for those batted balls with a fast runner at the plate), but exactly how fast was he? Maybe he was a blazer and he beat it out by an eyelash.

So what does that have to do with whether the fielder caught the ball or not? That should be obvious by now. If the third baseman did not catch the ball, on the average, it should be clear that the ball tended to be one of those balls that were harder to catch than the average ball in that bucket. In other words, the chance that a ball which was not caught would have been caught by an average fielder is clearly less than 40%. Similarly, if a ball was caught, by any fielder, it was more likely to be an easier play than the average ball in that bucket. What we want are conditional probabilities, based on whether the ball was caught or not.

How much easier are the caught balls than the not-caught ones in any given bucket? That’s hard to say. Really hard to say. One would have to have lots of information in order to apply Bayes’ theorem to better estimate the “catch rate” of a ball in a particular bucket based on whether it is caught or not caught. I can tell you that I think the differences are pretty significant. It mostly depends on the spread (and what the actual distribution looks like) of actual catch rates in any given bucket. That depends on a lot of things – for one, the “size” and accuracy of the locations and other characteristics which make up the buckets. For example, if the unique locations were pretty large – say, one “location bucket” is anywhere from down the third base line to 20 feet off the bag (about 1/7 of the total distance from line to line) – then the spread of actual catch rates versus the average catch rate in that bucket is going to be huge. Therefore the difference between the true catch rates for caught balls and non-caught balls is going to be large as well.

Speed of the batted ball is important as well. On very hard hit balls, the distribution of actual catch rates within a certain location will tend to be polarized or “bi-modal.” Either the ball will tend to be hit near the fielder and he makes the play or a little bit away from the fielder and he doesn’t. In other words, a catch might have a 75% true catch rate and non-catch, 15%, on the average, even if the overall rate is 40%.

Again, most metrics use the same base line catch rate for catches and non-catches because that seems like the correct and intuitive thing to do. It is incorrect! The problem, of course, is what number to assign to a catch and to a non-catch in any given bucket. How do we figure that out? Well, I haven’t gotten to that point yet, and I don’t think anyone else has either (I could be wrong). I do know, however, that it is guaranteed that if I use 39% for a non-catch and 41% for a catch, in that 40% bucket, I am going to be more accurate in my results, so why not do that? Probably 42/38 is better still. I just don’t know when to stop. I don’t want to go too far so that I end up cutting my own throat.

This is similar to the problem with park factors and MLE’s (among other “adjustments”). We don’t know that using 1.30 for Coors Field is correct but we surely know that using 1.05 is better than 1.00. We don’t know that taking 85% of player’s AAA stats to convert them to a major league equivalency is correct, but we definitely know that 95% is better than nothing.

Anyway, here is what I did today (other than torture myself by watching the Ringling Brothers and…I mean the Republican debates). I took a look at all ground balls that were hit in vector “C” according to BIS and were either caught or went through the infield in less than 1.5 seconds – basically, hard hit balls down the third base line. If you watch these plays, even though I would put them all in the same bucket in the UZR engine, it is clear that some are easy to field and others are nearly impossible. You would be surprised at how much variability there is. On paper they “look” almost exactly the same. In reality they can vary from day to night and everything in between. Again, we don’t really care about the variance per se, but we definitely care about the mean catch rates when the balls are caught and when they are not.

Keep in mind that we can never empirically figure out those mean catch rates the way we do when we aggregate all of the plays in the bucket (and then simply use the average catch rate of all of those balls). You can’t figure out the “catch rate” of a group of balls that were caught – it would be 100%, right? We are interested in the rate at which an average fielder would catch the balls that were caught by these particular fielders, for whatever reasons they caught them. Likewise, we want to know the league average catch rate on the group of balls that were not caught by these particular fielders, for whatever reasons.

We can make these estimates (the catch rates of caught balls and non-caught balls in this bucket) in one of two ways: the first way is probably better and much less prone to human bias. It is also way more difficult to do in practice. We can try and observe all of the balls in this bucket and then try and re-classify them into many buckets according to the exact batted ball characteristics and fielder positioning. In other words, one bucket might be hard hit ground huggers right down the line with the third baseman playing roughly 8 feet off the line. Another might be, well, you get the point. Then we can actually use the catch rates in those sub-buckets.

When we are done, we can figure out the average catch rate on balls that were caught and those that were not, in the entire bucket. If that is hard to conceptualize, try constructing an example yourself and you will see how it works.
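Here is one such constructed example, in Python, with made-up sub-bucket catch rates. Note how different the average true catch rate is for the balls that get caught versus the ones that don’t, even though the overall bucket rate is a single number:

    # A constructed example: one bucket split into hypothetical sub-buckets
    # with known true catch rates for an average fielder.

    # (number of balls, true catch rate)
    sub_buckets = [(100, 0.90),   # hit right at the fielder
                   (200, 0.40),   # a step or two away
                   (300, 0.05)]   # nearly unreachable

    n = sum(b for b, _ in sub_buckets)
    catches = sum(b * p for b, p in sub_buckets)             # expected catches
    overall = catches / n
    # Average true catch rate, conditional on the outcome:
    p_caught = sum(b * p * p for b, p in sub_buckets) / catches
    p_missed = sum(b * (1 - p) * p for b, p in sub_buckets) / (n - catches)

    print(round(overall, 2), round(p_caught, 2), round(p_missed, 2))
    # 0.31 overall, but ~0.61 for the caught balls and ~0.17 for the missed ones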

As I said, that is a lot of work. You have to watch a lot of plays and try and create lots and lots of sub-buckets. And then, even in the sub-buckets you will have the same situation, although a much less problematic one. For example, in one of those sub-buckets, a caught ball might be catchable 20% of the time in reality and a non-caught one only 15% – not much to worry about. In the large, original bucket, it might be 60% for a catch and 25% for a non-catch. And that is a problem, especially for small samples.

Keep in mind that this problem is mitigated in large samples, but it never goes away: the metric will always overrate a good performance and underrate a bad one, and in small samples, even a full season, the effect is substantial. The better the numbers, the more they overstate the actual performance, and the same is true for bad numbers. This is why I have been saying for years to regress what you see from UZR or DRS, even if you want to estimate “what happened.” (You would have to regress even more if you want to estimate true fielding talent.)

This is one of the problems with simply combining offense and defense to generate WAR. The defensive component needs to be regressed while the offensive one does not (base running needs to be regressed too; it suffers from the same malady as the defensive metrics).

Anyway, I looked at 20 or so plays in one particular bucket and tried to use the second method of estimating true catch rates for catches and non-catches. I simply observed the play and tried to estimate how often an average fielder would have made the play whether it was caught or not.

This is not nearly as easy as you might think. For one thing, guessing an average “catch rate” number like 60% or 70%, even if you’ve watched thousands of games in your life like I have, is incredibly difficult. The 0-10% and 90-100% ones are not that hard. Everything else is. I would guess that my uncertainty is something like 25% on a lot of plays, and my uncertainty on that estimate of uncertainty is also high!

The other problem is bias. When a play is made, you will overrate the true average catch rate (how often an average fielder would have made the play) and vice versa for plays that are not made. Or maybe you will underrate them because you are trying to compensate for the tendency to overrate them. Either way, you will be biased by whether the play was made or not, and remember you are trying to figure out the true catch rate on every play you observe with no regard to whether the play was made or not. (In actuality maybe whether it was made or not can help you with that assessment).

Here is a condensed version of the numbers I got. In that one location, presumably from the third base line to around 6 feet off the line, for ground balls that arrive in less than 1.5 seconds (I have 4 such categories of speed/time for GB), the average catch rate overall was 36%. However, for balls that were not caught (and I only looked at 6 random ones), I estimated the average catch rate to be 11% (that varied from 0 to 35%). For balls that were caught (also 6 of them), it was 53% (from 10% to 95%). That is a ridiculously large difference, and look at the variation even within those two groups (caught and not caught). Even though using 11% for non-catches and 53% for catches is better than using 36% for everything, we are still making lots of mistakes within the new caught and not-caught buckets!

How does that affect a defensive metric? Let’s look at a hypothetical example: Third baseman A makes 10 plays in that bucket and misses 20. Third baseman B makes 15 and misses 15. B clearly had a better performance, but how much better? Let’s assume that the average fielder makes 26% of the plays in the bucket and the misses are 15% and the catches are 56% (actually a smaller spread than I estimated). Using 15% and 56% yields an overall catch rate of around 26%.

UZR and most of the other metrics will do the calculations this way: Player A’s UZR is 10 * .74 – 20 * .26, or plus 2.2 plays which is around plus 2 runs. Player B is 15 * .74 – 15 * .26, or plus 7.2 plays, which equals plus 6.5 runs.

What if we use the better numbers, 15% for missed plays and 56% for made ones? Now for Player A we have: 10 * .44 – 20 * .15, or 1.4 plays, which is 1.3 runs. Player B is 3.9 runs. So Player A’s UZR for those 30 plays went from +2 to +1.3, and Player B went from +6.5 to +3.9. Each player regressed around 35-40% toward zero. That’s a lot!
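Here is the same Player A/Player B comparison as a quick Python sketch, so you can see where that 35-40% “regression” comes from (again, illustrative code, not the UZR engine):

    # Player A makes 10 and misses 20; Player B makes 15 and misses 15.
    # Naive scheme: one bucket rate (26%) for everything.
    # Conditioned scheme: 56% for catches, 15% for misses.

    RUNS_PER_PLAY = 0.9

    def uzr_runs(makes, misses, p_catch, p_miss):
        plays = makes * (1 - p_catch) - misses * p_miss
        return plays * RUNS_PER_PLAY

    for makes, misses in [(10, 20), (15, 15)]:
        naive = uzr_runs(makes, misses, 0.26, 0.26)
        conditioned = uzr_runs(makes, misses, 0.56, 0.15)
        print(makes, misses, round(naive, 1), round(conditioned, 1))
    # Player A: ~ +2.0 -> +1.3 runs; Player B: ~ +6.5 -> +3.9 runs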

Now I have to figure out how to incorporate this “solution” into all of the UZR buckets in some kind of fairly elegant way, short of spending hundreds of hours observing plays. Any suggestions would be appreciated.

Many people don’t realize that one of the (many) weaknesses of UZR, at least for the infield, is that it ignores any ground ball in which the infield was configured in some kind of a “shift” and it “influenced the play.” I believe that’s true of DRS as well.

What exactly constitutes “a shift” and how they determine whether or not it “influenced the play” I unfortunately don’t know. It’s up to the “stringers” (the people who watch the plays and input and categorize the data) and the powers that be at Baseball Info Solutions (BIS). When I get the data, there is merely a code, “1” or “0”, for whether there was a “relevant shift” or not.

How many GB are excluded from the UZR data? It varies by team, but in 2015 so far, about 21% of all GB have been classified by BIS as “hit into a relevant shift.” The average team has had 332 ground balls ignored by UZR (and presumably DRS) because of a shift, and 1268 GB that were included in the data that the UZR engine uses to calculate individual UZR’s. The number of shifts varies considerably from team to team, with the Nationals, somewhat surprisingly, employing the fewest, with only 181, and the Astros a whopping 682 so far this season. Remember, these are not the total number of PA in which the infield was in a shifted configuration. These are the number of ground balls in which the infield was shifted and the outcome was “relevant to the shift,” according to BIS. Presumably, the numbers track pretty well with the overall number of times that each team employs some kind of a shift. It appears that Washington disdains the shift, relatively speaking, and that Houston loves it.

In 2014, there were many fewer shifts than in this current season. Only 11% of ground balls involved a relevant shift, about half the 2015 rate. The trailer was the Rockies, with only 92, and the leader, the Astros, with 666. The Nationals last year had the 4th fewest in baseball.

Here is the complete data set for 2014 and 2015 (as of August 30):

2014

Team GB Shifted Not shifted % Shifted
ari 2060 155 1905 8
atl 1887 115 1772 6
chn 1958 162 1796 8
cin 1938 125 1813 6
col 2239 92 2147 4
hou 2113 666 1447 32
lan 2056 129 1927 6
mil 2046 274 1772 13
nyn 2015 102 1913 5
phi 2105 177 1928 8
pit 2239 375 1864 17
sdn 1957 133 1824 7
sln 2002 193 1809 10
sfn 2007 194 1813 10
was 1985 116 1869 6
mia 2176 125 2051 6
ala 1817 170 1647 9
bal 1969 318 1651 16
bos 1998 247 1751 12
cha 2101 288 1813 14
cle 2003 265 1738 13
det 1995 122 1873 6
kca 1948 274 1674 14
min 2011 235 1776 12
nya 1902 394 1508 21
oak 1980 244 1736 12
sea 1910 201 1709 11
tba 1724 376 1348 22
tex 1811 203 1608 11
tor 1919 328 1591 17

2015

Team GB Shifted Not shifted % Shifted
ari 1709 355 1354 21
atl 1543 207 1336 13
chn 1553 239 1314 15
cin 1584 271 1313 17
col 1741 533 1208 31
hou 1667 682 985 41
lan 1630 220 1410 13
mil 1603 268 1335 17
nyn 1610 203 1407 13
phi 1673 237 1436 14
pit 1797 577 1220 32
sdn 1608 320 1288 20
sln 1680 266 1414 16
sfn 1610 333 1277 21
was 1530 181 1349 12
mia 1591 229 1362 14
ala 1493 244 1249 16
bal 1554 383 1171 25
bos 1616 273 1343 17
cha 1585 230 1355 15
cle 1445 335 1110 23
det 1576 349 1227 22
kca 1491 295 1196 20
min 1655 388 1267 23
nya 1619 478 1141 30
oak 1599 361 1238 23
sea 1663 229 1434 14
tba 1422 564 858 40
tex 1603 297 1306 19
tor 1539 398 1141 26

The individual fielding data (UZR) for infielders that you see on Fangraphs is based on non-shifted ground balls only, or on ground balls where there was a shift but it wasn’t relevant to the outcome. The reason that shifts are ignored in UZR (and DRS, I think) is because we don’t know where the individual fielders are located. It could be a full shift, a partial shift, the third baseman could be the left-most fielder as he usually is or he could be the man in short right field between the first baseman and the second baseman, etc. The way most of the PBP defensive metrics work, it would be useless to include this data.

But what we can do, with impunity, is include all ground ball data in a team UZR. After all, if a hard ground ball is hit at the 23 degree vector, and we are only interested in team fielding, we don’t care who is the closest fielder or where he is playing. All we care about is whether the ball was turned into an out, relative to the league average out rate for a similar ground ball, adjusted for context. In other words, using the same UZR methodology, we can calculate a team UZR using all ground ball data, with no regard for the configuration of the IF on any particular play. And if it is true that the type, number and timing (for example, against which batters and/or with which pitchers) of shifts is relevant to a team’s overall defensive efficiency, team UZR in the infield should reflect not only the sum of individual fielding talent and performance, but also the quality of the shifts in terms of hit prevention. In addition, if we subtract the sum of the individual infielders’ UZR on non-shift plays from the total team UZR on all plays, the difference should reflect, at least somewhat, the quality of the shifts.

I want to remind everyone that UZR accounts for several contexts. One, park factors. For infield purposes, although the dimensions of all infields are the same, the hardness and quality of the infield can differ from park to park. For example, in Coors Field in Colorado and Chase Field in Arizona, the infields are hard and quick, and thus more ground balls scoot through for hits even if they leave the bat with the same speed and trajectory.

Two, the speed of the batter. Obviously faster batters require the infielders to play a little closer to home plate and they beat out infield ground balls more often than slower batters. In some cases the third baseman and/or first baseman have to play in to protect against the bunt. This affects the average “caught percentage” for any given category of ground balls. The speed of the opposing batters tends to even out for fielders and especially for teams, but still, the UZR engine tries to account for this just in case it doesn’t, especially in small samples.

The third context is the position of the base runners and number of outs. This affects the positioning of the fielders, especially the first baseman (whether first base is occupied or not). The handedness of the batters is the next context. As with batter speed, these also tend to even out in the long run, but it is better to adjust for them just in case.

Finally, the overall GB propensity of the pitchers is used to adjust the average catch rates for all ground balls. The more GB oriented a pitcher is, the softer his ground balls are. While all ground balls are classified in the data as soft, medium, or hard, even within each category, the speed and consequently the catch rates, vary according to the GB tendencies of the pitcher. For example, for GB pitchers, their medium ground balls will be caught at a higher rate than the medium ground balls allowed by fly ball pitchers.
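Schematically, you can think of each of these adjustments as nudging the bucket’s baseline catch rate up or down before any credit or debit is assigned. The sketch below is purely illustrative – the bucket key and every multiplier are invented, and the real UZR engine’s adjustments may well work differently:

    # Purely schematic: context adjustments nudging a bucket's baseline
    # catch rate. All keys and multipliers below are invented.

    BASELINE = {("GB", "hard", "vector_C"): 0.40}   # hypothetical bucket

    def adjusted_catch_rate(bucket, park=1.0, batter_speed=1.0,
                            runners_outs=1.0, batter_hand=1.0, pitcher_gb=1.0):
        rate = BASELINE[bucket]
        for factor in (park, batter_speed, runners_outs, batter_hand, pitcher_gb):
            rate *= factor
        return rate

    # e.g., a quick Coors-like infield and a fast batter both depress the out rate
    print(round(adjusted_catch_rate(("GB", "hard", "vector_C"),
                                    park=0.95, batter_speed=0.97), 2))   # ~0.37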

So keep in mind that individual and team UZR adjust as best they can for these contexts. In most cases, there is not a whole lot of difference between the context-adjusted UZR numbers and the unadjusted ones. Also keep in mind that the team UZR numbers you see in this article are adjusted for park, batter hand and speed, and runners/outs, the same as the individual UZR’s you see on Fangraphs.

For this article, I am interested in team UZR including when the IF is shifted. Even though we are typically interested in individual defensive performance and talent, it is becoming more and more difficult to evaluate individual fielding for infielders, because of the prevalence of the shift, and because there is so much disparity in how often each team employs the shift (so that we might be getting a sample of only 60% of the available ground balls for one team and 85% for another).

One could speculate that teams that employ the most shifts would have the best team defense. To test that, we could look at each team’s UZR versus their relevant shift percentage. The problem, of course, is that the talent of the individual fielders is a huge component of team UZR, regardless of how often a team shifts. There may also be selective sampling going on. Maybe teams that don’t have good infield defense feel the need to shift more often such that lots of shifts get unfairly correlated with (but are not the cause of) bad defense.

One way we can separate out talent from shifting is to compare team UZR on all ground balls with the total of the individual UZR’s for all the infielders (on non-shifted ground balls). The difference may tell us something about the efficacy of the shifts and non-shifts. In other words, total team individual infield UZR, which is just the total of each infielder’s UZR as you would see on Fangraphs (range and ROE runs only), represents what we generally consider to be a sample of team talent. This is measured on non-shifted ground balls only, as explained above.

Team total UZR measures team runs saved or cost with no regard for who fielded each ball, and it is based on every ground ball, shifted or not. It represents how the team actually performed on defense and is a much better measure of team defense than totaling the individual UZR’s. The difference between the two, then, to some degree, represents how efficient teams are at shifting or not shifting, regardless of how often they shift.
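The bookkeeping behind the last column of the tables below looks roughly like this sketch, under my own assumptions – the regression constant is invented, and the real proration is more careful than this:

    # Sketch: compare team UZR on all GB to the total of the individual
    # UZR's, after prorating the individual total (measured on non-shifted
    # GB only) up to 100% of plays, with some regression to damp the
    # "on pace for" problem. The regression constant is invented.

    def shift_efficiency(team_uzr_all_gb, individual_uzr_nonshift,
                         nonshift_gb, total_gb, regress_gb=300):
        rate = individual_uzr_nonshift / nonshift_gb        # runs per non-shifted GB
        rate *= nonshift_gb / (nonshift_gb + regress_gb)    # regress toward zero
        prorated = rate * total_gb                          # scale to all GB
        return team_uzr_all_gb - prorated

    # hypothetical team: +8 runs of individual UZR on 1200 non-shifted GB,
    # but only +3 runs of team UZR on all 1600 GB
    print(round(shift_efficiency(3.0, 8.0, 1200, 1600), 1))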

There are lots of issues that would have to be considered when evaluating whether shifts work or not. For example, maybe shifting too much with runners on base results in fewer DP because infielders are often out of position. Maybe stolen bases are affected for the same reason. Maybe the number and quality of hits to the outfield change as a result of the shift. For example, if a team shifts a lot, maybe they don’t appear to record more ground ball outs, but the shifted batters are forced to try and hit the ball to the opposite field more often and thus they lose some of their power.

Maybe it appears that more ground balls are caught, but because pitchers are pitching “to the shift” they become more predictable and batters are actually more successful overall (despite their ground balls being caught more often). Maybe shifts are spectacularly successful against some stubborn and pull-happy batters and not very successful against others who can adjust or even take advantage of a shift in order to produce more, not less, offense. Those issues are beyond the scope of UZR and this article.

Let’s now look at each team in 2014 and 2015, their shift percentage, their overall team UZR, their team UZR when shifting, when not shifting, and their total individual combined UZR when not shifting. Remember this is for the infield only.

2015

Team % Shifts Shift Runs Non-Shift Runs Team Runs Total Individual Runs Team Minus Individual after prorating Ind Runs to 100% of plays
KCA 20 -2.2 10.5 10 26.3 -19.6
LAN 13 -5 -7.3 -13.3 0.8 -14.2
TOR 26 -2.5 13.9 11 22.6 -15.6
CHA 15 -7.7 -12.3 -21.8 -11.9 -8.9
CLE 23 0.6 3.3 3.3 12.8 -11.4
MIN 23 3.5 -11.6 -7.6 1.8 -9.7
MIL 17 0.3 -7.1 -6.7 2.5 -9.5
SEA 14 -2.6 -8.7 -13.8 -5.1 -8.3
SFN 21 2.3 12.6 15.8 24.4 -11.8
MIA 14 0.5 2.7 2.4 8.4 -6.7
ARI 21 3.4 -1.5 2.1 8 -7.0
HOU 41 -7.6 -3.2 -11.3 -6.1 -3.1
PHI 14 -6.4 -16.4 -23.5 -19 -3.0
COL 31 -7.3 0 -5.5 -1.5 -3.7
ATL 13 3.1 6.9 9.8 12.6 -3.7
SLN 16 -1.1 -5.8 -8.8 -7 -1.1
DET 22 1.8 -16.2 -17.8 -16 0.5
ALA 16 -2.4 -0.4 -3.6 -2.8 -0.5
BOS 17 0.3 4.8 3.5 2.7 0.5
NYN 13 -3.8 3.1 0.8 -2.7 3.7
WAS 12 1.1 -9.4 -8.4 -12.6 5.1
CIN 17 5 9.8 16.2 11.2 3.9
CHN 15 0.2 18.7 17.4 10.5 6.0
BAL 25 10.6 -0.5 14.4 5.8 7.6
SDN 20 7.5 -6.8 1.5 -7.8 10.3
TEX 19 4.1 12.8 19.6 10.1 8.3
TBA 40 0.1 4.5 7 -9.2 19.3
NYA 30 0.1 11.8 12.2 -6.6 20.2
PIT 32 0.3 0.3 0.1 -21 26.0
OAK 23 3.9 -8.8 -5 -31.4 31.1

The last column, as I explained above, represents the difference between how we think the infield defense performed based on individual UZR’s only (on non-shifted GB), prorated to 100% of the games (the proration is actually regressed so that we don’t have the “on pace for” problem), and how the team actually performed on all ground balls. If the difference is positive, then we might assume that the shifts and non-shifts are being done in an effective fashion regardless of how often shifts are occurring. If it is negative, then somehow the combination of shifts and non-shifts are costing some runs. Or the difference might not be meaningful at all – it could just be noise. At the very least, this is the first time that you are seeing real infield team defense being measured based on the characteristics of each and every ground ball and the context in which they were hit, regardless of where the infielders are playing.

First of all, if we look at all the teams that have a negative difference in the last column, the teams that presumably have the worst shift/no-shift efficiency, and compare them to those that are plus and presumably have the best shift/no-shift efficiency, we find that there is no difference in their average shift percentages. For example, TBA and HOU have the most shifts by far, and HOU “cost” their team 5.2 runs while TBA benefited by 16.2 runs. LAN and WAS had the fewest shifts, and one of them gained 4 runs while the other lost 14 runs. The other teams are all over the board with respect to number of shifts and the difference between the individual UZR’s and team UZR.

Let’s look at that last column for 2014 and compare it to 2015 to see if there is any consistency from year to year within teams. Do some teams consistently do better or worse with their shifting and non-shifting, at least for 2014 and 2015? Let’s also see if adding more data gives us any relationship between the last column (delta team and individual UZR) and shift percentage.

Team 2015 % Shift 2014 % Shift 2015 Team Minus Individual 2014 Team Minus Individual Combined 2014 and 2015 Team Minus Individual
HOU 41 32 -5.2 45.6 40.4
TBA 40 22 16.2 12.7 28.9
PIT 32 17 21.1 5.5 26.6
TEX 19 11 9.5 9.9 19.4
WAS 12 6 4.2 13.0 17.2
OAK 23 12 26.4 -9.3 17.1
BAL 25 16 8.6 7.6 16.2
NYN 13 5 3.5 9.0 12.5
NYA 30 21 18.8 -8.4 10.4
CHA 15 14 -9.9 12.5 2.6
CHN 15 8 6.9 -5.8 1.1
TOR 26 17 -11.6 12.6 1.0
DET 22 6 -1.8 2.4 0.6
SFN 21 10 -8.6 6.0 -2.6
CIN 17 6 5 -8.2 -3.2
CLE 23 13 -9.5 5.2 -4.3
MIL 17 13 -9.2 3.1 -6.1
ARI 21 8 -5.9 -0.2 -6.1
SDN 20 7 9.3 -15.7 -6.4
MIA 14 6 -6 -0.9 -6.9
BOS 17 12 0.8 -10.6 -9.8
KCA 20 14 -16.3 6.3 -10.0
ATL 13 6 -2.8 -7.5 -10.3
PHI 14 8 -4.5 -6.2 -10.7
ALA 16 9 -0.8 -11.6 -12.4
SLN 16 10 -1.8 -12.2 -14.0
LAN 13 6 -14.1 -2.5 -16.6
MIN 23 12 -9.4 -9.3 -18.7
SEA 14 11 -8.7 -11.3 -20.0
COL 31 4 -4 -23.0 -27.0

Although there appears to be little correlation from one year to the next for each of the teams, we do find that the teams with the least efficient shifts/non-shifts (negative values in the last column) averaged 14% shifts per season in 2014 and 2015, while those with the most efficient (plus values in the last column) shifted an average of 19%. As well, the two teams with the biggest gains, HOU and TB, had the most shifts, at 37% and 31% per season, respectively. The two worst teams, Colorado and Seattle, shifted 17% and 13% per season. On the other hand, the team with the fewest shifts in baseball in 2014 and 2015 combined, the Nationals, gained over 17 runs in team UZR on all ground balls compared to the total of the individual UZR’s on non-shifted balls only, suggesting that the few shifts they did employ were very effective, which seems reasonable.

It is also interesting to note that the team with the worst difference between team and individual UZR in 2014, the Rockies, shifted only 4% of the time, easily the lowest rate in baseball. In 2015, they have been one of the heaviest-shifting teams, and still their team UZR is 4 runs worse than their total individual UZR’s. Still, that’s a lot better than in 2014.

It also appears that many of the smarter teams are figuring out how to optimize their defense beyond the talent of the individual players. TB, PIT, HOU, WAS, and OAK are at the top of the list in plus value deltas (the last column). These teams are generally considered to have progressive front offices. Some of the teams with the most negative numbers in the last column, those teams which appear to be sub-optimal in their defensive alignment, are LAN, MIN, SEA, PHI, COL, ATL, SLN, and ALA, all with reputations for having less than progressive front offices and philosophies, to one degree or another. In fact, other than a few outliers, like Boston, Texas, and the White Sox, the order of the teams in the chart above looks like a reasonable order of teams from most to least progressive teams. Certainly the teams in the top half appear to be the most saber-savvy teams and those in bottom half, the least.

In conclusion, it is hard to look at this data and figure out whether and which teams are using their shifts and non-shifts effectively. There doesn’t appear to be a strong correlation between shift percentage and the difference between team and individual defense, although there are a few anecdotes that suggest otherwise. As well, in the aggregate for 2014 and 2015 combined, teams that have been able to outperform the total of their individual UZR’s on team defense have shifted more often, 19% to 13%.

There also appears to the naked eye to be a strong correlation between the perceived sabermetric orientation of a team’s front office and the efficiency of their shift/non-shift strategy, at least as measured by the numbers in the last column, explained above.

I think the most important thing to take away from this discussion is that there can be a fairly large difference between team infield UZR which uses every GB, and the total of the individual UZR’s which uses only those plays in which no shift was relevant to the outcome of the play. As well, the more shifts employed by a team, the less we should trust that the total of the individual performances are representative of the entire team’s defense on the infield. I am also going to see if Fangraphs can start publishing team UZR for infielders and for outfielders, although in the outfield, the numbers should be similar if not the same.

I just downloaded my Kindle version of the brand spanking new Hardball Times Annual, 2014 from Amazon.com. It is also available from Createspace.com (best place to order).

Although I was disappointed with last year’s Annual, I have been very much looking forward to reading this year’s, as I have enjoyed it tremendously in the past, and have even contributed an article or two, I think. To be fair, I am only interested in the hard-core analytical articles, which comprise a small part of the anthology. The book is split into 5 parts, according to the TOC: one, the “2013 Season,” which consists of reviews/views of each of the six divisions plus one chapter about the post-season; two, general Commentary; three, History; four, Analysis; and finally, a glossary of statistical terms and short bios of the various illustrious authors (including Bill James and Rob Neyer).

As I said, the only chapters which interest me are the ones in the Analysis section, and those are the ones that I am going to review, starting with Jeff Zimmerman’s, “Shifty Business, or the War Against Hitters.” It is mostly about the shifts employed by infielders against presumably extreme pull (and mostly slow) hitters. The chapter is pretty good with lots of interesting data mostly provided by Inside Edge, a company much like BIS and STATS, which provides various data to teams, web sites, and researchers (for a fee). It also raised several questions in my mind, some of which I wish Jeff had answered or at least brought up himself. There were also some things that he wrote which were confusing – at least in my 50+ year-old mind.

He starts out, after a brief intro, with a chart (BTW, if you have the Kindle version, unless you make the font size tiny, some of the charts get cut off) that shows the number, BABIP, and XBH% of plays where a ball was put into play with a shift (and various kinds of shifts), no shift, no doubles defense (OF deep and corners guarding lines), infield in, and corners in (expecting a bunt). This is the first time I have seen any data with a no-doubles defense, infield in, and with the corners up anticipating a bunt. The numbers are interesting. With a no-doubles defense, the BABIP is quite high and the XBH% seems low, but unfortunately Jeff does not give us a baseline for XBH% other than the values for the other situations, shift, no shift, etc., although I guess that pretty much includes all situations. I have not done any calculations, but the BABIP for a no-doubles defense is so high and the reduction in doubles and triples is so small, that it does not look like a great strategy off the top of my head. Obviously it depends on when it is being employed.

The infield-in data is also interesting. As expected, the BABIP is really elevated. Unfortunately, I don’t know if Jeff includes ROE and fielder’s choices (with no outs) in that metric. What is the standard? With the infield in, there are lots of ROE and lots of throws home where no out is recorded (a fielder’s choice). I would like to know if these are included in the BABIP.

For the corners playing up expecting a bunt, the numbers include all BIP, mostly bunts I assume. It would have been nice had he given us the BABIP when the ball is not bunted (and when it is). An important consideration for whether to bunt or not is how much not bunting increases the batter’s results when he swings away.

I would also have liked to see wOBA or some metric like that for all situations – not just BABIP and XBH%. It is possible, in fact likely, that walk and K rates vary in different situations. For example, perhaps walk rates increase when batters are facing a shift because they are not as eager to put the ball in play or the pitchers are trying to “pitch into the shift” and are consequently more wild. Or perhaps batters hit more HR because they are trying to elevate the ball as opposed to hitting a ground ball or line drive. It would also be nice to look at GDP rates with the shift. Some people, including Bill James, have suggested that the DP is harder to turn with the fielders out of position. Without looking at all these things, it is hard to say that the shift “works” or doesn’t work just by looking at BABIP (and even harder to say to what extent it works).

Jeff goes on to list the players against whom the shift is most often employed. He gives us the shift and no shift BABIP and XBH%. Collectively, their BABIP fell 37 points with the shift and it looks like their XBH% fell a lot too (although for some reason, Jeff does not give us that collective number, I don’t think). He writes:

…their BABIP [for these 20 players] collectively fell 37 points…when hitting with the shift on. In other words, the shift worked.

I am not crazy about that conclusion – “the shift worked.” First of all, as I said, we need to know a lot more than BABIP to conclude that “the shift worked.” And even if it did “work” we really want to know by how much in terms of wOBA or run expectancy. Nowhere is there an attempt by Jeff to do that. 37 points seems like a lot, but overall it could be only a small advantage. I’m not saying that it is small – only that without more data and analysis we don’t know.

Also, when and why are these “no-shifts” occurring? Jeff is comparing shift BIP data to no-shift BIP data and he is assuming that everything else is the same. That is probably a poor assumption. Why are these no-shifts occurring? Probably first and foremost because there are runners on base. With runners on base, everything is different. It might also be with a completely different pool of pitchers and fielders. Maybe teams are mostly shifting when they have good fielders? I have no idea. I am just throwing out reasons why it may not be an apples-to-apples comparison when comparing “shift” results to “no-shift” results.

It is also likely that the pool of batters is different with a shift and no shift, even though he only looked at the batters who had the most shifts against them. In fact, a better method would have been a “delta” method, whereby he would use a weighted average of the differences between shift and no-shift for each individual player.

He then lists the speed score and GB and line drive pull percentages for the top ten most shifted players. The average Bill James speed score was 3.2 (I assume that is slow, but again, I don’t see where he tells us the average MLB score), GB pull % was 80% and LD pull % was 62%. The average MLB GB and LD pull %, Jeff tells us, is 72% and 50%, respectively. Interestingly several players on that list were at or below the MLB averages in GB pull %. I have no idea why they are so heavily shifted on.

Jeff talks a little bit about some individual players. For example, he mentions Chris Davis:

“Over the first four months of the season, he hit into an average of 29 shifts per month, and was able to maintain a .304 BA and a .359 BABIP. Over the last two months of the season, teams shifted more often against him…41 times per month. Consequently, his BA was .250 and his BABIP was .293.

The shift was killing him. Without a shift employed, Davis hit for a .425 BABIP…over the course of the 2013 season. When the shift was set, his BABIP dropped to .302…

This reminds me a little of the story that Daniel Kahneman, 2002 Nobel Prize Laureate in Economics, tells about teaching military flight instructors that praise works better than punishment. One of the instructors said:

“On many occasions I have praised flight cadets for clean execution of some aerobatic maneuver, and in general when they try it again, they do worse. On the other hand, I have often screamed at cadets for bad execution, and in general they do better the next time.”

Of course the reason for that was “regression towards the mean.” No matter what you say to someone who has done poorer than expected, they will tend to do better next time, and vice versa for someone who has just done better than expected.

If Chris Davis hits .304 the first four months of the season with a BABIP of .359, and his career numbers are around .260 and .330, then no matter what you do against him (wear your underwear backwards, for example), his next two months are likely going to show a reduction in both of these numbers! That does not necessarily imply a cause and effect relationship.
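A quick simulation makes the point. Take a hitter whose true BABIP is .330, select only the stretches where he happened to run hot, and look at what happens next, with nothing about his talent or the defense changing (all numbers here are made up):

    # Regression toward the mean: a .330 true-BABIP hitter whose hot
    # stretches are followed by ordinary ones, shift or no shift.
    import random
    random.seed(1)

    TRUE_BABIP, BIP = 0.330, 300   # balls in play per "stretch"

    def observed_babip():
        return sum(random.random() < TRUE_BABIP for _ in range(BIP)) / BIP

    hot_first, next_stretch = [], []
    for _ in range(10000):
        first = observed_babip()
        if first >= 0.360:                  # keep only the hot starts
            hot_first.append(first)
            next_stretch.append(observed_babip())

    print(round(sum(hot_first) / len(hot_first), 3))        # ~.375 in the hot stretch
    print(round(sum(next_stretch) / len(next_stretch), 3))  # ~.330, back to talent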

He makes the same mistake with several other players that he discusses. In fact, I have always had the feeling that at least part of the “observed” success of the shift was simply regression toward the mean. Imagine this scenario – I’m not saying that this is exactly what happens or happened, but to some extent I think it may be true. You are a month into the season, and X number of players, say they are all pull hitters, are just killing you with hits to the pull side. Their collective BA and BABIP is .380 and .415. You decide enough is enough and you shift against them. What do you think is going to happen, and what do you think everyone is going to conclude about the effectiveness of the shift, especially when they compare the “shift” to “no-shift” numbers?

Again, I think that the shift gives the defense a substantial advantage. I am just not 100% sure about that and I am definitely not sure about how much of an advantage it is and whether it is correctly employed against every player.

Jeff also shows us the number of times that each team employs the shift. Obviously not every team faces the same pool of batters, but the differences are startling. For example, the Orioles shifted 470 times and the Nationals 41! The question that pops into my mind is, “If the shift is so obviously advantageous (37 points of BABIP) why aren’t all teams using it extensively?” It is not like it is a secret anymore.

Finally, Jeff discusses bunting to beat the shift. That is obviously an interesting topic. Jeff shows that not many batters opt to do that, but when they do, they reach base 58% of the time. Unfortunately, out of around 6,000 shifts where the ball was put into play, players only bunted 48 times! That is an amazingly low number. Jeff (likely correctly) hypothesizes that players should be bunting more often (a lot more often?). That is probably true, but I don’t think we can say how often and by whom. Maybe most of the players who did not bunt are terrible bunters and all they would be doing is bunting back to the pitcher or fouling the ball off or missing. And BTW, telling us that a bunt in play results in reaching base 58% of the time is not quite the whole story. We also need to know how many bunt attempts resulted in a strike. Imagine that a player attempted to bunt 10 times, fouled it off or missed it 9 times, and reached base once. That is probably not a good result even though it looks like he bunted with a 1.000 average!

It is also curious to me that 7 players bunted into a shift almost 4 times each, and reached base 16 times (a .615 BA). They are obviously decent or good bunters. Why are they not bunting every time until the shift is gone against them? They are smart enough to occasionally bunt into a shift, but not smart enough to always do it? Something doesn’t seem right.

Anyway, despite my many criticisms, it was an interesting chapter and well-done by Jeff. I am looking forward to reading the rest of the articles in the Analysis section and if I have time, I will review one or more of them.