Archive for the ‘Lineups’ Category

Note: I updated the pinch hitting data to include a larger sample (previously it went back to 2008; it now goes back to 2000).

Note: It was pointed out by a commenter below and another one on Twitter that you can’t look only at innings where the #9 and #1 batters batted (eliminating innings where the #1 hitter led off), as Russell did in his study, and which he uses to support his theory (he says that it is the best evidence). That creates a huge bias, of course. It eliminates all PA in which the #9 hitter made the last out of an inning or at least an out was made while he was at the plate. In fact, the wOBA for a #9 hitter, who usually bats around .300, is .432 in innings where he and the #1 hitter bat (after eliminating so many PA in which an out was made). How that got past Russell, I have no idea.  Perhaps he can explain.
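To get a feel for the size of that bias, here is a minimal sketch in Python. It is not Russell's method or mine; it is a toy model that uses an on-base rate rather than wOBA, and it assumes (hypothetically) that the #9 hitter is equally likely to come up with 0, 1, or 2 outs. The only PA the "same inning" filter throws away are the ones where he makes the third out.

```python
def conditional_on_base_rate(p_reach: float, p_two_outs: float) -> float:
    """On-base rate of the #9 hitter among the PA kept by the 'same inning' filter.

    In this simplified model a PA is dropped only when the #9 hitter makes
    the third out, which happens with probability p_two_outs * (1 - p_reach).
    """
    kept = 1.0 - p_two_outs * (1.0 - p_reach)
    return p_reach / kept


true_rate = 0.300  # hypothetical true on-base rate for a #9 hitter
filtered = conditional_on_base_rate(true_rate, p_two_outs=1 / 3)  # assumed out distribution
print(f"true {true_rate:.3f} -> filtered {filtered:.3f}")
```

Even this crude model inflates a .300 rate to roughly .390 once the inning-ending outs are removed, which is the same direction as the jump from .300 to .432 described above.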

Recently, Baseball Prospectus published an article by one of their regular writers, Russell Carleton (aka Pizza Cutter), in which he examined whether the so-called “times through the order” penalty (TTOP) was in fact a function of how many times a pitcher has turned over the lineup in a game or whether it was merely an artifact of a pitcher’s pitch count. In other words, is it pitcher fatigue or batter familiarity (the more the batter sees the pitcher during the game, the better he performs) which causes this effect?

It is certainly possible that most or all of the TTOP is really due to fatigue, as “times through the order” is clearly a proxy for pitch count. In any case, after some mathematical gyrations that Mr. Carleton is wont to do (he is the “Warning: Gory Mathematical Details Ahead” guy) in his articles, he concludes unequivocally that there is no such thing as a TTOP – that it is really a PCP or Pitch Count Penalty effect that makes a pitcher less and less effective as he goes through the order and it has little or nothing to do with batter/pitcher familiarity. In fact, in the first line of his article, he declares, “There is no such thing as the ‘times through the order’ penalty!”

If that is true, this is a major revelation which has slipped through the cracks in the sabermetric community and its readership. I don’t believe it is, however.

As one of the primary researchers (along with Tom Tango) of the TTOP, I was taken quite aback by Russell’s conclusion, not because I was personally affronted (the “truth” is not a matter of opinion), but because my research suggested that pitch count or fatigue was likely not a significant part of the penalty. In my BP article on the TTOP a little over 2 years ago, I wrote this: “…the TTOP is not about fatigue. It is about familiarity. The more a batter sees a pitcher’s delivery and repertoire, the more likely he is to be successful against him.” What was my evidence?

First, I looked at the number of pitches thrown going into the second, third, and fourth times through the order. I split that up into two groups—a low pitch count and a high pitch count. Here are those results. The numbers in parentheses are the average number of pitches thrown going into that “time through the order.”

Times Through the Order | Low Pitch Count | High Pitch Count
1 | .341 | .340
2 | .351 (28) | .349 (37)
3 | .359 (59) | .359 (72)
4 | .361 (78) | .360 (97)

 

If Russell’s thesis were true, you should see a little more of a penalty in the “high pitch count” column on the right, which you don’t. The penalty appears to be the same regardless of whether the pitcher has thrown few or many pitches. To be fair, the difference in pitch count between the two groups is not large and there is obviously sample error in the numbers.

The second way I examined the question was this: I looked only at individual batters in each group who had seen few or many pitches in their prior PA. For example, I looked at batters in their second time through the order who had seen fewer than three pitches in their first PA, and also batters who saw more than four pitches in their first PA. Those were my two groups. I did the same thing for each time through the order. Here are those results. The numbers in parentheses are the average number of pitches seen in the prior PA, for every batter in the group combined.

 

Times Through the Order | Low Pitch Count each Batter | High Pitch Count each Batter
1 | .340 | .340
2 | .350 (1.9) | .365 (4.3)
3 | .359 (2.2) | .361 (4.3)

 

As you can see, if a batter sees more pitches in his first or second PA, he performs better in his next PA than if he sees fewer pitches. The effect appears to be much greater from the first to the second PA. This lends credence to the theory of “familiarity” and not pitcher fatigue. It is unlikely that 2 or 3 extra pitches would cause enough fatigue to elevate a batter’s wOBA by 8.5 points per PA (the average of 15 and 2, the “bonuses” for seeing more pitches during the first and second PA, respectively).
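The arithmetic behind that 8.5-point figure can be spelled out in a few lines of Python. The wOBA values are copied from the table above; the variable names are just for illustration.

```python
# wOBA by time through the order, from the table above:
# (few pitches seen in prior PA, many pitches seen in prior PA)
woba = {2: (0.350, 0.365), 3: (0.359, 0.361)}

# "Bonus" in points of wOBA for having seen more pitches the previous time up.
bonuses = {tto: round((high - low) * 1000) for tto, (low, high) in woba.items()}
average_bonus = sum(bonuses.values()) / len(bonuses)

print(bonuses, average_bonus)
```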

So how did Russell come to his conclusion and is it right or wrong? I believe there is a fatal flaw in his methodology which led him to a faulty conclusion (that the TTOP does not exist).

Among other statistical tests, here is the primary one which led Russell to conclude that the TTOP is a mirage and merely a product of pitcher fatigue due to an ever-increasing pitch count:

“This time, I tried something a little different. If we’re going to see a TTOP that is drastic, the place to look for it is as the lineup turns over. I isolated all cases in which a pitcher was facing the ninth batter in the lineup for the second time and then the first batter in the lineup for the third time. To make things fair, neither hitter was allowed to be the pitcher (this essentially limited the sample to games in AL parks), and the hitters needed to be faced in the same inning. Now, because the leadoff hitter is usually a better hitter, we need to control for that. I created a control variable for all outcomes using the log odds ratio method, which controls for the skills of the batter, as well as that of the pitcher. I also controlled for whether or not the pitcher had the platoon advantage in either case.”

First of all, there was no reason to limit the data to “the same inning”. Regardless of whether the pitcher faces the 9th and 1st batters in the same inning or they are split up (the 9 hitter makes the last out), since one naturally follows the other, they will always have around the same pitch count, and the leadoff hitter will always be one time through the order ahead of the number nine hitter.

Anyway, what did Russell find? He found that TTOP was not a predictor of outcome. In other words, the effect on the #9 hitter was the same as on the #1 hitter, even though the #1 hitter had faced the pitcher one more time than the #9 hitter.

I thought about this for a long time and I finally realized why that would be the case even if there was a “times through the order” penalty (mostly) independent of pitch count. Remember that in order to compare the effect of TTO on that #9 and #1 hitter, he had to control for the overall quality of the hitter. The last hitter in the lineup is going to be a much worse hitter overall than the leadoff hitter, on the average, in his sample.

So the results should look something like this if there were a true TTOP: Say the #9 batters are normally .300 wOBA batters, and the leadoff guys are .330. In this situation, the #9 batters should bat around .300 (during the second time through the order we see around a normal wOBA) but the leadoff guys should bat around .340 – they should have a 10 point wOBA bonus for facing the pitcher for the third time.

Russell, without showing us the data (he should!), presumably gets something like .305 for the #9 batters (since the pitcher has gone essentially 2 ½ times through the lineup, pitch count-wise) and the leadoff hitters should hit .335, or 5 points above their norm as well (maybe .336 since they are facing a pitcher with a few more pitches under his belt than the #9 hitter).

So if he gets those numbers, .335 and .305, is that evidence that there is no TTOP? Do we need to see numbers like .340 and .300 to support the TTOP theory rather than the PCP theory? I submit that even if Russell sees numbers like the former ones, that is not evidence that there is no TTOP and it’s all about the pitch count. I believe that Russell made a fatal error.

Here is where he went wrong:

Remember that he uses the log-odds method to compute the baseline numbers, or what he would expect from a given batter-pitcher matchup, based on their overall season numbers. In this experiment, there is no need to do that, since both batters, #1 and #9, are facing the same pitcher the same number of times. All he has to do is use each batter’s seasonal numbers to establish the baseline.

But where do those baselines come from? Well, it is likely that the #1 hitters are mostly #1 hitters throughout the season and that #9 hitters usually hit at the bottom of the order. #1 hitters get around 150 more PA than #9 hitters over a full season. Where do those extra PA come from? Some of them come from relievers of course. But many of them come from facing the starting pitcher more often per game than those bottom-of-the-order guys. In addition, #9 hitters sometimes are removed for pinch hitters late in a game against a starter such that they lose even more of those 3rd and 4th time through the order PA. Here is a chart of the mean TTO per game versus the starting pitcher for each batting slot:

 

Batting Slot | Mean TTO/game
1 | 2.15
2 | 2.08
3 | 2.02
4 | 1.98
5 | 1.95
6 | 1.91
7 | 1.86
8 | 1.80
9 | 1.77

(By the way, if Russell’s thesis is true, bottom of the order guys have it even easier, since they are always batting when the pitcher has a higher pitch count, per time through the order. Also, this is the first time you have been introduced to the concept that the top of the order batters have it a little easier than the bottom of the order guys, and that switching spots in the order can affect overall performance because of the TTOP or PCP.)

What that does is result in the baseline for the #1 hitter being higher than for the #9 hitter, because the baseline includes more pitcher TTOP (more times facing the starter for the 3rd and 4th times). That makes it look like the #1 hitter is not getting his advantage as compared to the #9 hitter, or at least he is only getting a partial advantage in Russell’s experiment.

In other words, the #9 hitter is really a true .305 hitter and the #1 hitter is really a true .325 hitter, even though their seasonal stats suggest .300 and .330. The #9 hitters are being hurt by not facing starters late in the game compared to the average hitter and the #1 hitters are being helped by facing starters for the 3rd and 4th times more often than the average hitter.

So if #9 hitters are really .305 hitters, then the second time through the order, we expect them to hit .305, if the TTOP is true. If the #1 hitters are really .325 hitters, despite hitting .330 for the whole season, we expect them to hit .335 the third time through the order, if the TTOP is true. And that is exactly what we see (presumably).

But when Russell sees .305 and .335 he concludes, “no TTOP!” He sees what he thinks is a true .300 hitter hitting .305 after the pitcher has thrown around 65 pitches and what he thinks is a true .330 hitter hitting .335 after 68 or 69 pitches. He therefore concludes that both hitters are being affected equally even though one is batting for the second time and the other for the third time – thus, there is no TTOP!

As I have shown, those numbers are perfectly consistent with a TTOP of around 8-10 points per times through the order, which is exactly what we see.
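Here is that argument as a small Python sketch. The numbers are the illustrative ones used above (not Russell's actual data, which he did not publish), the 10-point penalty per time through the order is an assumed round figure, and the second time through the order is treated as a hitter's "normal" level.

```python
TTOP = 0.010  # assumed penalty per extra time through the order, in wOBA


def expected_woba(true_talent: float, time_through_order: int) -> float:
    # Treat the second time through as the hitter's "normal" level, as in the text.
    return true_talent + TTOP * (time_through_order - 2)


# Illustrative seasonal baselines vs. "true" talent after removing the TTO bias.
hitters = {
    "#9": {"seasonal": 0.300, "true": 0.305, "tto": 2},
    "#1": {"seasonal": 0.330, "true": 0.325, "tto": 3},
}

apparent_bonus = {}
for slot, h in hitters.items():
    observed = expected_woba(h["true"], h["tto"])
    apparent_bonus[slot] = round((observed - h["seasonal"]) * 1000)
    print(slot, f"observed {observed:.3f},",
          f"{apparent_bonus[slot]:+d} points vs. seasonal baseline")
```

Both hitters come out 5 points above their seasonal baseline even though a full 10-point TTOP is built into the model, which is why equal "bonuses" for the #9 and #1 hitters do not rule out a TTOP.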

Finally, I ran one other test which I think can give us more evidence one way or another. I looked at pinch hitting appearances against starting pitchers. If the TTOP is real and pitch count is not a significant factor in the penalty, we should see around the same performance for pinch hitters regardless of the pitcher’s pitch count, since the pinch hitter always faces the pitcher for the first time and the first time only. In fact, this is a test that Russell probably should have run. The only problem is sample size. Because there are relatively few pinch hitting PA versus starting pitchers, we have quite a bit of sample error in the numbers. I split the sample of pinch hitting appearances up into 2 groups: Low pitch count and high pitch count.

 

Here is what I got:

PH TTO | Overall | Low Pitch Count | High Pitch Count
2 | .295 (PA=4901) | .295 (PA=2494) | .293 (PA=2318)
3 | .289 (PA=10774) | .290 (PA=5370) | .287 (PA=5404)

 

I won’t comment on the fact that the pinch hitters performed a little better against pitchers with a low pitch count (the differences are not nearly statistically significant) other than to say that there is no evidence that pitch count has any influence on the performance of pinch hitters who are naturally facing pitchers for the first and only time. Keep in mind that the times through the order (the left column) is a good proxy for pitch count in and of itself and we also see no evidence that that makes a difference in terms of pinch hitting performance. In other words, if pitch count significantly influenced pitching effectiveness, we should see pinch hitters overall performing better when the pitcher is in the midst of his 3rd time through the order as opposed to the 2nd time (his pitch count would be around 30-35 pitches higher). We don’t. In fact, we see a worse performance (the difference is not statistically significant – one SD is 8 points of wOBA).
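As a quick check on "not statistically significant," the gap between the two overall rows can be put in standard deviation units. This sketch assumes the 8 points quoted above is the standard deviation of the difference between the two groups; the text does not spell that out.

```python
woba_second_time = 0.295  # pinch hitters vs. a starter in his 2nd time through
woba_third_time = 0.289   # pinch hitters vs. a starter in his 3rd time through
one_sd = 0.008            # assumed to be one SD of the difference (8 points of wOBA)

z = (woba_third_time - woba_second_time) / one_sd
print(f"difference = {z:+.2f} SD")
```

A gap of less than one standard deviation is well within what random variation alone would produce.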

 

I have to say that it is difficult to follow Russell’s chain of logic and his methodology in many of his articles because he often fails to “show his work” and he uses somewhat esoteric and opaque statistical techniques. In this case, I believe that he made a fatal mistake in his methodology as I have described above which led him to the erroneous conclusion that, “The TTOP does not exist.” I believe that I have shown fairly strong evidence that the penalty that we see pitchers incur as the game wears on is mostly or wholly a result of the TTO and not due to fatigue caused by an increasing pitch count.

I look forward to someone doing additional research to support one theory or the other.


Those of you who follow me on Twitter know that I am somewhat obsessed with how teams (managers) construct their lineups. With few exceptions, managers tend to do two things when it comes to setting their daily lineups: One, they follow more or less the traditional model of lineup construction, which is to put your best overall offensive player third, a slugger fourth, and scrappy, speedy players in the one and/or two holes. Two, they monkey with lineups based on things like starting pitcher handedness (relevant), hot and cold streaks, and batter/pitcher matchups, the latter two generally being not so relevant. For example, in 2012, the average team used 122 different lineups.

If you have read The Book (co-authored by Yours Truly, Tom Tango and Andy Dolphin), you may remember that the optimal lineup differs from the traditional one. According to The Book, a team’s 3 best hitters should bat 1, 2, and 4, and the 4th and 5th best hitters 3 and 5. The 1 and 2 batters should be more walk prone than the 4 and 5 hitters. Slots 6 through 9 should feature the remaining hitters in more or less descending order of quality. As we know, managers violate or in some cases butcher this structure by batting poor, sometimes awful hitters, in the 1 and 2 holes, and usually slotting their best overall hitter third. They also sometimes bat a slow, but good offensive player, often a catcher, down in the order.

In addition to these guidelines, The Book suggests placing good base stealers in front of low walk, and high singles and doubles hitters. That often means the 6 hole rather than the traditional 1 and 2 holes in which managers like to put their speedy, base stealing players. Also, because the 3 hole faces a disproportionate number of GDP opportunities, putting a good hitter who hits into a lot of DP, like a Miguel Cabrera, into the third slot can be quite costly. Surprisingly, a good spot for a GDP-prone hitter is leadoff, where a hitter encounters relatively few GDP opportunities.

Of course, other than L/R considerations (and perhaps G/F pitcher/batter matchups for extreme players) and when substituting one player for another, optimal lineups should rarely if ever change. The notion that a team has to use 152 different lineups (like TB did in 2012) in 162 games, is silly at best, and a waste of a manager’s time and sub-optimal behavior at worst.

Contrary to the beliefs of some sabermetric naysayers, most good baseball analysts and sabermetricians are not unaware of or insensitive to the notion that some players may be more or less happy or comfortable in one lineup slot or another. In fact, the general rule should be that player preference trumps a “computer generated” optimal lineup slot. That is not to say that it is impossible to change or influence a player’s preferences.

For those of you who are thinking, “Batting order doesn’t really matter, as long as it is somewhat reasonable,” you are right and you are wrong. It depends on what you mean by “matter.” It is likely that in most cases the difference between a prevailing, traditional order and an optimal one, notwithstanding any effect from player preferences, is on the order of less than 1 win (10 or 11 runs) per season; however, teams pay on the free agent market over 5 million dollars for a player win, so maybe those 10 runs do “matter.” We also occasionally find that the difference between an actual and optimal lineup is 2 wins or more. In any case, as the old sabermetric saying goes, “Why do something wrong, when you can do it right?” In other words, in order to give up even a few runs per season, there has to be some relevant countervailing and advantageous argument, otherwise you are simply throwing away potential runs, wins, and dollars.

Probably the worst lineup offense that managers commit is putting a scrappy, speedy, bunt-happy, bat-control, but poor overall offensive player in the two hole. Remember that The Book (the real Book) says that the second slot in the lineup should be reserved for one of your two best hitters, not one of your worst. Yet teams like the Reds, Braves, and the Indians, among others, consistently put awful hitting, scrappy players in the two-hole. The consequence, of course, is that there are fewer base runners for the third and fourth hitters to drive in, and you give an awful hitter many more PA per season and per game. This might surprise some people, but the #2 hitter will get over 100 more PA than the #8 hitter, per 150 games. For a bad hitter, that means more outs for the team with less production. It is debatable what else a poor, but scrappy hitter batting second brings to the table to offset those extra empty 100 PA.

The other mistake (among many) that managers make in constructing what they (presumably) think is an optimal order is using current season statistics, and often spurious ones like BA and RBI, rather than projections. I would venture to guess that you can count on one hand, at best, the number of managers that actually look at credible projections when making decisions about likely future performance, especially 4 or 5 months into the season. Unless a manager has a time machine, what a player has done so far during the season has nothing to do with how he is likely to do in the upcoming game, other than how those current season stats inform an estimate of future performance. While it is true that there is obviously a strong correlation between 4 or 5 months past performance and future performance, there are many instances where a hitter is projected as a good hitter but has had an awful season thus far, and vice versa. If you have read my previous article on projections, you will know that projections trump seasonal performance at any point in the season (good projections include current season performance to-date – of course). So, for example, if a manager sees that a hitter has a .280 wOBA for the first 4 months of the season, despite a .330 projection, and bats him 8th, he would be making a mistake, since we expect him to bat like a .330 hitter and not a .280 hitter, and in fact he does, according to an analysis of historical player seasons (again, see my article on projections).

Let’s recap the mistakes that managers typically make in constructing what they think are the best possible lineups. Again, we will ignore player preferences and other “psychological factors” not because they are unimportant, but because we don’t know when a manager might slot a player in a position that even he doesn’t think is optimal in deference to that player. The fact that managers constantly monkey with lineups anyway suggests that player preferences are not that much of a factor. Additionally, more often than not I think, we hear players say things like, “My job is to hit as well as I can wherever the manager puts me in the lineup.” Again, that is not to say that some players don’t have certain preferences and that managers shouldn’t give some, if not complete, deference to them, especially with veteran players. In other words, an analyst advising a team or manager should suggest an optimal lineup taking into consideration player preferences. No credible analyst is going to say (or at least they shouldn’t), “I don’t care where Jeter is comfortable hitting or where he wants to hit, he should bat 8th!”

Managers typically follow the traditional batting order philosophy which is to bat your best hitter 3rd, your slugger 4th, and fast, scrappy, good-bat handlers 1 or 2, whether they are good overall hitters or not. This is not nearly the same as an optimal batting order, based on extensive computer and mathematical research, which suggests that your best hitter should bat 2 or 4, and that you need to put your worst hitters at the bottom of the order in order to limit the number of PA they get per game and per season. Probably the biggest and most pervasive mistake that managers make is slotting terrible hitters at the top, especially in the 2-hole. Managers also put too many base stealers in front of power hitters and hitters who are prone to the GDP in the 3 hole.

Finally, managers pay too much attention (they should pay none) to short term and seasonal performance as well as specific batter/pitcher past results when constructing their batting orders. In general, your batting order versus lefty and righty starting pitchers should rarely change, other than when substituting/resting players, or occasionally when player projections significantly change, in order to suit certain ballparks or weather conditions, or extreme ground ball or fly ball opposing pitchers (and perhaps according to the opposing team’s defense). Other than L/R platoon considerations (and avoiding batting consecutive lefties if possible), most of these other considerations (G/F, park, etc.) are marginal at best.

With that as a background and primer on batting orders, here is what I did: I looked at all 30 teams’ lineups as of a few days ago. No preference was made for whether the opposing pitcher was right or left-handed or whether full-time starters or substitutes were in the lineup on that particular day. Basically these were middle of August random lineups for all 30 teams.

The first thing I did was to compare a team’s projected runs scored based on adding up each player’s projected linear weights in runs per PA and then weighting each lineup slot by its average number of PA per game, to the number of runs scored using a game simulator and those same projections. For example, if the leadoff batter had a linear weights projection of -.01 runs per PA, we would multiply that by 4.8 since the average number of PA per game for a leadoff hitter is 4.8. I would do that for every player in the lineup in order to get a total linear weights for the team. In the NL, I assumed an average hitting pitcher for every team. I also added in every player’s base running (not base stealing) projected linear weights, using the UBR (Ultimate Base Running) stat you see on Fangraphs. The projections I used were my own. They are likely to be similar to those you see on Fangraphs, The Hardball Times, or BP, but in some cases they may be different.
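The "adding up the linear weights" step can be sketched in Python as follows. This is a minimal illustration, not my actual code: only the 4.8 PA per game for the leadoff slot comes from the text, while the drop of about 0.11 PA per slot and the example projections are assumptions made up for the sketch.

```python
# PA per game by lineup slot. Only the leadoff value (4.8) comes from the text;
# the ~0.11 step per slot is an illustrative assumption.
PA_PER_GAME = [round(4.8 - 0.11 * i, 2) for i in range(9)]


def lineup_linear_weights(runs_per_pa, games=150):
    """Sum each slot's projected runs/PA weighted by that slot's PA, over `games`."""
    if len(runs_per_pa) != len(PA_PER_GAME):
        raise ValueError("expected one projection per lineup slot")
    per_game = sum(lw * pa for lw, pa in zip(runs_per_pa, PA_PER_GAME))
    return per_game * games


# Hypothetical projections (batting plus base running), runs above average per PA.
example = [-0.01, 0.02, 0.03, 0.025, 0.01, 0.0, -0.005, -0.015, -0.03]
leadoff_per_game = example[0] * PA_PER_GAME[0]  # -0.01 * 4.8, as in the text

print(f"leadoff slot: {leadoff_per_game:+.3f} runs per game")
print(f"lineup total: {lineup_linear_weights(example):+.1f} runs per 150 games")
```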

In order to calculate runs per game in a simulated fashion, I ran a simple game simulator which uses each player’s projected singles, doubles, triples, HR, UIBB+HP, ROE, G/F ratio, GDP propensity, and base running ability. No bunts, steals or any in-game strategies (such as IBB) were used in the simulation. The way the base running works is this: Every player is assigned a base running rating from 1-5, based on their base running projections in runs above/below average (typically from -5 to +5 per season). In the simulator, every time a base running opportunity is encountered, like how many bases to advance on a single or double, or whether to score from third on a fly ball, it checks the rating of the appropriate base runner and makes an adjustment. For example, on an outfield single with a runner on first, if the runner is rated as a “1” (slow and/or poor runner), he advances to third just 18% of the time, whereas if he is a “5”, he advances 2 bases 41% of the time. The same thing is done with a ground ball and a runner on first (whether he is safe at second and the play goes to first), a ground ball, runner on second, advances on hits, tagging up on fly balls, and advancing on potential wild pitches, passed balls, and errors in the middle of a play (not ROE).
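A stripped-down version of the first-to-third adjustment might look like the Python sketch below. The 18% and 41% rates for ratings 1 and 5 are the ones given above; the rates for ratings 2 through 4 are linearly interpolated purely as an assumption, and the function name is hypothetical rather than taken from the real simulator.

```python
import random

# Chance a runner on first takes third on an outfield single, by rating.
# Ratings 1 and 5 (18% and 41%) are from the text; ratings 2-4 are linearly
# interpolated here as an assumption.
P_FIRST_TO_THIRD = {r: 0.18 + (0.41 - 0.18) * (r - 1) / 4 for r in range(1, 6)}


def bases_advanced_on_single(rating: int, rng: random.Random) -> int:
    """Return 2 if the runner goes first to third, else 1."""
    return 2 if rng.random() < P_FIRST_TO_THIRD[rating] else 1


rng = random.Random(0)  # fixed seed so the sketch is repeatable
trials = 100_000
for rating in (1, 3, 5):
    took_third = sum(bases_advanced_on_single(rating, rng) == 2 for _ in range(trials))
    print(f"rating {rating}: took third {took_third / trials:.1%} of the time")
```

The same lookup-and-draw pattern would apply to the other base running events listed above, each with its own table of rates.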

Keep in mind that a lineup does 2 things. One, it gives players at the top more PA than players at the bottom, which is a pretty straightforward thing. Because of that, it should be obvious that you want your best hitters batting near the top and your worst near the bottom. But, if that were the only thing that lineups “do,” then you would simply arrange the lineup in a descending order of quality. The second way that a lineup creates runs is by each player interacting with other players, especially those near them in the order. This is very tricky and complex. Although a computer analysis can give us rules of thumb for optimal lineup construction, as we do in The Book, it is also very player dependent, in terms of each player’s exact offensive profile (again, ignoring things like player preferences or abilities of players to optimize their approach to each lineup slot). As well, if you move one player from one slot to another, you have to move at least one other player. When moving players around in order to create an optimal lineup, things can get very messy. As we discuss in The Book, in general, you want on base guys in front of power hitters and vice versa, good base stealers in front of singles hitters with low walk totals, high GDP guys in the one hole or at the bottom of the order, etc. Basically, constructing an optimal batting order is impossible for a human being to do. If any manager thinks he can, he is either lying or fooling himself. Again, that is not to say that a computer can necessarily do a better job. As with most things in MLB, the proper combination of “scouting and stats” is usually what the doctor ordered.

In any case, adding up each player’s batting and base running projected linear weights, after controlling for the number of PA per game in each batting slot, is one way to project how many runs a lineup will score per game. Running a simulation using the same projections is another way which also captures to some extent the complex interactions among the players’ offensive profiles. Presumably, if you just stack hitters from best to worst, the “adding up the linear weights” method will result in the maximum runs per game, while the simulation should result in a runs per game quite a bit less, and certainly less than with an optimal lineup construction.

I was curious as to the extent that the actual lineups I looked at optimized these interactions. In order to do that, I compared one method to the other. For example, for a given lineup, the total linear weights prorated by number of PA per game might be -30 per 150 games. That is a below average offensive lineup by 30/150 or .2 runs per game. If the lineup simulator resulted in actual runs scored of -20 per 150 games, presumably there were advantageous interactions among the players that added another 10 runs. Perhaps the lineup avoided a high GDP player in the 3-hole or perhaps they had high on base guys in front of power hitters. Again, this has nothing to do with order per se. If a lineup has poor hitters batting first and/or second, against the advice given in The Book, both the linear weights and the simulation methods would bear the brunt of that poor construction. In fact, if those poor hitters were excellent base runners and it is advisable to have good base runners at the top of the order (and I don’t know that it is), then presumably the simulation should reflect that and perhaps create added value (more runs per game) as compared to the linear weights method of projecting runs per game.

The second thing I did was to try and use a basic model for optimizing each lineup, using the prescriptions in The Book. I then re-ran the simulation and re-calculated the total linear weights to see which teams could benefit the most from a re-working of their lineup, at least based on the lineups I chose for this analysis. This is probably the more interesting query. For the simulations, I ran 100,000 games per team, which is actually not a whole lot of games in terms of minimizing the random noise in the resultant average runs per game. One standard error in runs per 150 games is around 1.31. So take these results with a grain or two of salt.
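For a sense of where a standard error like that comes from, here is a short Python sketch. It assumes the usual formula for the standard error of a mean (per-game standard deviation divided by the square root of the number of simulated games) and backs out the per-game standard deviation implied by the 1.31 figure; that implied value is a derived illustration, not a number taken from the simulator.

```python
import math


def se_per_season(sd_runs_per_game: float, n_sim_games: int, season_games: int = 150) -> float:
    """Standard error of a simulated runs-per-game mean, scaled to a season."""
    return sd_runs_per_game / math.sqrt(n_sim_games) * season_games


# Per-game SD implied by the quoted figure of ~1.31 runs per 150 games.
implied_sd = 1.31 * math.sqrt(100_000) / 150

# Quadrupling the number of simulated games halves the standard error.
se_100k = se_per_season(implied_sd, 100_000)
se_400k = se_per_season(implied_sd, 400_000)

print(f"implied SD per game: {implied_sd:.2f} runs")
print(f"SE with 100,000 games: {se_100k:.2f}, with 400,000 games: {se_400k:.3f}")
```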

In the NL, here are the top 3 and bottom 3 teams in terms of additional or fewer runs that a lineup simulation produced, as compared to simply adding up each player’s projected batting and base running runs, adjusting for the league average number of PA per game for each lineup slot.

Top 3

Team | Linear Weights | Lineup Simulation | Gain per 150 games
ARI | -97 | -86 | 11
COL | -23 | -13 | 10
PIT | 10 | 17 | 6

Here are those lineups:

ARI

Inciarte

Pennington

Peralta

Trumbo

Hill

Pacheco

Marte

Gosewisch

 

COL

Blackmon

Stubbs

Morneau

Arenado

Dickerson

Rosario

Culberson

Lemahieu

 

PIT

Harrison

Polanco

Martin

Walker

Marte

Snider

Davis

Alvarez

 

Bottom 3

Team | Linear Weights | Lineup Simulation | Gain per 150 games
LAD | 43 | 28 | -15
SFN | 35 | 27 | -7
WAS | 42 | 35 | -7

 

 

LAD

Gordon

Puig

Gonzalez

Kemp

Crawford

Uribe

Ellis

Rojas

 

SFN

Pagan

Pence

Posey

Sandoval

Morse

Duvall

Panik

Crawford

 

WAS

Span

Rendon

Werth

Laroche

Ramos

Harper

Cabrera

Espinosa

 

In “optimizing” each of the 30 lineups, I used some simple criteria. I put the top two overall hitters in the 2 and 4 holes. Whichever of the two had the greater SLG batted 4th. The next two best hitters batted 1 and 3, with the higher SLG in the 3 hole. From 5 through 8 or 9, I simply slotted them in descending order of quality.
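Those rules are simple enough to write down directly. The Python sketch below is one way to express them; the `Hitter` type, the function name, and the eight example players with their numbers are all hypothetical, and overall quality is represented by a single projected runs figure.

```python
from typing import NamedTuple


class Hitter(NamedTuple):
    name: str
    runs: float  # projected batting runs (overall quality)
    slg: float   # projected slugging percentage


def simple_optimal_order(hitters: list[Hitter]) -> list[str]:
    """Best two hitters bat 2 and 4, next two bat 1 and 3, the rest in descending order."""
    if len(hitters) < 5:
        raise ValueError("need at least five hitters")
    ranked = sorted(hitters, key=lambda h: h.runs, reverse=True)
    top_two = sorted(ranked[:2], key=lambda h: h.slg)    # lower SLG bats 2nd, higher 4th
    next_two = sorted(ranked[2:4], key=lambda h: h.slg)  # lower SLG bats 1st, higher 3rd
    order = [next_two[0], top_two[0], next_two[1], top_two[1]] + ranked[4:]
    return [h.name for h in order]


# Hypothetical eight-man NL lineup (pitcher omitted).
hitters = [
    Hitter("A", 30, 0.450), Hitter("B", 25, 0.480),
    Hitter("C", 15, 0.400), Hitter("D", 12, 0.470),
    Hitter("E", 5, 0.420), Hitter("F", -5, 0.380),
    Hitter("G", -10, 0.370), Hitter("H", -15, 0.330),
]
print(simple_optimal_order(hitters))
```

With these made-up inputs, the two best hitters (A and B) land in the 2 and 4 holes with the higher SLG batting cleanup, and C and D take the 1 and 3 holes.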

Here is a comparison of the simple “optimal” lineup to the lineups that the teams actually used. Remember, I am using the same personnel and changing only the batting orders.

Before giving you the numbers, the first thing that jumped out at me was how little most of the numbers changed. Conventional, and even most sabermetric, thought is that any one reasonable lineup is usually just about as good as any other, give or take a few runs. As well, a good lineup must strike a balance between putting the better hitters at the top of the lineup and putting players there who are good base runners but poor overall hitters.

The average absolute difference between the runs per game generated by the simulator from the actual and the “optimal” lineup was 3.1 runs per 150 games per team. Again, keep in mind that much of that is noise since I am running only 100,000 games per team, which generates a standard error of something like 1.3 runs per 150 games.

The kicker, however, is that the “optimal” lineups, on average, only slightly outperformed the actual ones, by 2/3 of a run per team per 150 games. Essentially there was no difference between the lineups chosen by the managers and ones that were “optimized” according to the simple rules explained above. Keep in mind that a real optimization, one that tried every possible batting order configuration and chose the best one, would likely generate better results.
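
For a sense of scale, a brute-force search is easy to size up. The sketch below only counts batting orders. It assumes 9 freely movable hitters for an AL lineup and, for an NL lineup, 8 position players with the pitcher fixed in the 9 hole (an assumption), each simulated for the same 100,000 games.

```python
import math

GAMES_PER_ORDER = 100_000  # same per-lineup sample size as mentioned above


def brute_force_cost(n_movable_hitters, games_per_order=GAMES_PER_ORDER):
    """Return (number of batting orders, total simulated games needed)."""
    orders = math.factorial(n_movable_hitters)
    return orders, orders * games_per_order


al_orders, al_games = brute_force_cost(9)  # all nine slots free
nl_orders, nl_games = brute_force_cost(8)  # pitcher assumed fixed at 9th
print(al_orders, al_games)  # 362880 36288000000
print(nl_orders, nl_games)  # 40320 4032000000
```

Even the smaller NL case needs billions of simulated games at this sample size, which is one reason to start from a simple rule-based order.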

That being said, here are the teams whose actual lineups out-performed and were out-performed by the “optimal” ones:

Most sub-optimal lineups

Team | Actual Lineup Simulation Results (Runs per 150) | “Optimal” Lineup Simulation Results | Gain per 150 games
STL | 62 | 74 | 12
ATL | 31 | 37 | 6
CLE | -33 | -27 | 6
MIA | 7 | 12 | 5

Here are those lineups. The number after each player’s name represents his projected batting runs per 630 PA (around 150 games). Keep in mind that these lineups faced either RH or LH starting pitchers. When I run my simulations, I am using overall projections for each player, which do not take into consideration the handedness of the batter or of any opposing pitcher.

Cardinals

Name Projected Batting runs
Carpenter 30
Wong -11
Holliday 26
Adams 14
Peralta 7
Pierz -10
Jay 17
Robinson -18

Here, even though we have plenty of good bats in this lineup, Matheny prefers to slot one of the worst in the two hole. Many managers just can’t resist doing so, and I’m not really sure why, other than it seems to be a tradition without a good reason. Perhaps it harkens back to the day when managers would often sac bunt or hit and run after the leadoff hitter reached base with no outs. It is also a mystery why Jay bats 7th. He is even having a very good year at the plate, so it’s not like his seasonal performance belies his projection.

What if we swap Wong and Jay? That generates 69 runs above average per 150 games, which is 7 runs better than with Wong batting second, and 5 runs worse than my original “optimal” lineup. Let’s try another “manual” optimization. We’ll put Jay lead off, followed by Carp, Adams, Holliday, Peralta, Wong, Pierz, and Robinson. That lineup produces 76 runs above average, 14 runs better than the actual one, and better than my computer generated simple “optimal” one. So for the Cardinals, we’ve added 1.5 wins per season just by shuffling around their lineup, and especially by removing a poor hitter from the number 2 slot and moving up a good hitter in Jay (and who also happens to be an excellent base runner).

Braves

Name Projected Batting runs
Heyward 23
Gosselin -29
Freeman 24
J Upton 20
Johnson 9
Gattis -1
Simmons -16
BJ Upton -13

Our old friend Fredi Gonzalez finally moved BJ Upton from first to last (and correctly so, although he was about a year too late). He puts Heyward at leadoff, which is pretty radical, yet he somehow bats one of the worst batters in all of baseball in the 2-hole, accumulating far too many outs at the top of the order. If we do nothing but move Gosselin down to 8th, where he belongs, we generate 35 runs, 4 more than with him batting second. Not a huge difference, but half a win is half a win. They all count and they all add up.

Indians

Name Projected Batting runs
Kipnis 5
Aviles -19
Brantley 13
Santana 6
Gomes 8
Raburn -9
Walters -13
Holt -21
Jose Ramirez -32

The theme here is obvious. When a team puts a terrible hitter in the two-hole, they lose runs, which is not surprising. If we merely move Aviles down to the 7 spot and move everyone up accordingly, the lineup produces -28 runs rather than -33 runs, a gain of 5 runs just by removing Aviles from the second slot.

Marlins

Name Projected Batting runs
Yelich 15
Solano -21
Stanton 34
McGehee -8
Jones -10
Salty 0
Ozuna 4
Hechavarria -27

With the Fish, we have an awful batter in the two hole, a poor hitter in the 4 hole, and decent batters in the 6 and 7 holes. What if we just swap Solano for Ozuna, getting that putrid bat out of the 2 hole? Running another simulation results in 13 runs above average per 150 games, besting the actual lineup by 6 runs.

Just for the heck of it, let’s rework the entire lineup, putting Ozuna in the 2 hole, Salty in the 3 hole, Stanton in the 4 hole, then McGehee, Jones, Solano, and Hechy. Surprisingly, that generates only 12 runs above average per 150, better than their actual lineup, but slightly worse than just swapping Solano and Ozuna. The Achilles’ heel of the actual lineup, as it is for several others, appears to be the poor hitter batting second.

Most optimal lineups

Team | Actual Lineup Simulation Results (Runs per 150) | “Optimal” Lineup Simulation Results | Gain per 150 games
LAA | 160 | 153 | -7
SEA | 45 | 39 | -6
DET | 13 | 8 | -5
TOR | 86 | 82 | -4

Finally, let’s take a look at the actual lineups that generate more runs per game than my simple “optimal” batting order.

Angels

Name Projected Batting runs
Calhoun 20
Trout 59
Pujols 7
Hamilton 17
Kendrick 10
Freese 8
Aybar 0
Iannetta 2
Cowgill -7

 

Mariners

Name Projected Batting runs
Jackson 11
Ackley -3
Cano 35
Morales 1
Seager 13
Zunino -14
Morrison -2
Chavez -24
Taylor -2

 

Tigers

Name Projected Batting runs
Davis -2
Kinsler 6
Cabrera 50
V Martinez 17
Hunter 10
JD Martinez -4
Castellanos -20
Holaday -44
Suarez -23

 

Blue Jays

Name Projected Batting runs
Reyes 11
Cabrera 15
Bautista 34
Encarnacion 20
Lind 6
Navarro -7
Rasmus -1
Valencia -9
Kawasaki -23

Looking at all these “optimal” lineups, the trend is pretty clear. Bat your best hitters at the top and your worst at the bottom, and do NOT put a scrappy, no-hit batter in the two hole! The average projected linear weights per 150 games for the number two hitter in our 4 best actual lineups is 19.25 runs. The average 2-hole hitter in our 4 worst lineups is -20 runs. That should tell you just about everything you need to know about lineup construction.
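
Those two averages can be reproduced directly from the projected batting runs listed in the team tables above. The short Python check below uses the second hitter in each of the eight actual lineups; only the variable names are invented here.

```python
# Number two hitters in the four actual lineups that beat the "optimal" order
# (LAA, SEA, DET, TOR), with projected batting runs from the tables above.
best_actual_two_hole = {"Trout": 59, "Ackley": -3, "Kinsler": 6, "Cabrera": 15}

# Number two hitters in the four most sub-optimal actual lineups
# (STL, ATL, CLE, MIA).
worst_actual_two_hole = {"Wong": -11, "Gosselin": -29, "Aviles": -19, "Solano": -21}


def mean(values):
    """Plain arithmetic mean."""
    values = list(values)
    return sum(values) / len(values)


best_avg = mean(best_actual_two_hole.values())
worst_avg = mean(worst_actual_two_hole.values())
print(best_avg)              # 19.25
print(worst_avg)             # -20.0
print(best_avg - worst_avg)  # 39.25
```

The gap between the two groups is about 39 runs of projected batting value in a single lineup slot.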

Note: According to The Book, batting your pitcher 8th in an NL lineup generates slightly more runs per game than batting him 9th, as most managers do. Tony LaRussa sometimes did this, especially with McGwire in the lineup. Other managers, like Maddon, occasionally do the same. There is some controversy over which option is optimal.

When I ran my simulations above, swapping the pitcher and the 8th hitter in the NL lineups, the resultant runs per game were around 2 runs worse (per 150) than with the traditional order. It probably depends on who the position player is at the bottom of the order and perhaps on the players at the top of the order as well.