In an earlier post, I discussed how to find the best batting order. However, this method neglected the defensive positioning of the players.
In this post, I will describe how to find the best lineup considering both offensive and defensive aspects of the team's performance.
First, I found the positions that each of the players on the team can play. I wanted to find the best lineup and batting order for each of the three pitchers on the team.
For each of the three pitchers, I developed 100 random lineups that filled the 9 defensive positions along with the DH. I did not consider the batting order at this point. However, I ran the Monte Carlo simulation for each of the random lineups with a random batting order. Then I screened out the lineups that did not provide sufficient average runs per game. In this way, I had all of the defensive positions filled and still had a reasonably good average runs per game.
On examination of the lineups that were not screened out, I found that seven players were in each of the lineups. They were the strongest offensive players. I reordered the lineups so that these seven players would be in a specific place in the first seven positions in the batting order to maximize the average number of runs per game. One of these players could possibly play DH.
The last three players in the lineup filled out the defensive positions. So at this point, I had a number of lineups with batting orders that provided good offensive statistics.
I ran the Monte Carlo simulation to do a runoff of the batting orders to find the best offensive lineup for each pitcher while also having all of the defensive positions filled out.
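The random lineup step could be sketched in Python roughly as follows. The roster, the eligibility map and the function name are all made up for illustration (they are not the real team's data), and the Monte Carlo run simulation itself is not included; this only shows one way to draw a lineup that fills every defensive position plus the DH.

```python
import random

FIELD_POSITIONS = ["C", "1B", "2B", "3B", "SS", "LF", "CF", "RF"]

# Hypothetical roster: player -> field positions he can play (not real data).
ELIGIBLE = {
    "P1": {"C", "1B"}, "P2": {"1B", "3B"}, "P3": {"2B", "SS"},
    "P4": {"3B", "SS"}, "P5": {"SS", "2B", "CF"}, "P6": {"LF", "RF"},
    "P7": {"CF", "LF"}, "P8": {"RF", "1B"}, "P9": {"C", "RF"},
    "P10": {"2B", "LF"}, "P11": {"3B", "CF"},
}

def random_lineup(pitcher, rng, max_tries=1000):
    """Return a dict position -> player with the pitcher fixed,
    the 8 other field positions filled by eligible players, and a DH."""
    for _ in range(max_tries):
        available = set(ELIGIBLE)
        lineup = {"P": pitcher}
        # Fill the positions in random order, picking a random eligible player.
        for pos in rng.sample(FIELD_POSITIONS, len(FIELD_POSITIONS)):
            candidates = sorted(p for p in available if pos in ELIGIBLE[p])
            if not candidates:
                break  # dead end, start this attempt over
            choice = rng.choice(candidates)
            lineup[pos] = choice
            available.remove(choice)
        else:
            # Every position is filled; the DH comes from the leftover players.
            lineup["DH"] = rng.choice(sorted(available))
            return lineup
    raise RuntimeError("no valid lineup found")

lineup = random_lineup("Pitcher 1", random.Random(0))
print(lineup)
```

Each lineup drawn this way would then be passed to the run simulation and screened on its average runs per game.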
Monday, 15 April 2013
Tuesday, 2 April 2013
Likelihood of Making the Final Four Tournament with an Eight Team League
In an earlier post, I predicted the chances of team G making the four team final weekend tournament based on the 7 team playoff format. The probability that I estimated for team G was 45%.
This probability would change in the 8 team league. The first round playoff format could have 1st play 8th, 2nd play 7th, 3rd play 6th and 4th play 5th. The opponent team G plays in the first round will depend on how the new team performs.
Teams A, B, C and D will be in the North Division. Teams E, F, G and H will be in the South Division.
I will assume the new team is team A. I used the Pythagorean Formula to determine the runs for and against for team A based on various values of their winning percentage. Then I adjusted the runs for and against from the 2012 season for the other teams based on the results for team A. I then calculated the winning percentage for the other teams in the league.
Then I found the final standings based on the winning percentage of each team.
The final standings would be as shown below based on the performance of team A.
Team A Percent | 0.400 | 0.450 | 0.500 | 0.550 | 0.600 | 0.650 | 0.700 | 0.750 |
Standings | ||||||||
1 | E | E | E | E | E | E | E | A |
2 | B | B | B | B | A | A | A | E |
3 | C | C | C | A | B | B | B | B |
4 | G | G | A | C | C | C | C | C |
5 | F | F | G | G | G | G | G | G |
6 | H | A | F | F | F | F | F | F |
7 | A | H | H | H | H | H | H | H |
8 | D | D | D | D | D | D | D | D |
So if team A has a winning percentage of 0.400 or 0.450, team G plays team F in the first round. Based on team G's head-to-head winning percentage versus team F, the probability of winning the first round of the playoffs would be 0.500.
If team A has a 0.500 winning percentage, then team G would play team A in the first round. Team G's head-to-head winning percentage versus team A is in this case expected to be 0.452. Thus, the probability of team G winning the first round would be 0.411.
For all of the other cases, team G would play team C in the first round. Team G's head-to-head winning percentage versus team C is expected to be 0.435. Thus the probability of winning the first round of the playoffs would be 0.380.
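Reading team G's opponent off the standings table can be sketched in a few lines of Python. The dictionary simply transcribes the columns of the table above (1st place first); the function name is my own.

```python
# Projected standings, 1st to 8th, keyed by team A's assumed win percentage.
STANDINGS = {
    0.400: "EBCGFHAD",
    0.450: "EBCGFAHD",
    0.500: "EBCAGFHD",
    0.550: "EBACGFHD",
    0.600: "EABCGFHD",
    0.650: "EABCGFHD",
    0.700: "EABCGFHD",
    0.750: "AEBCGFHD",
}

def first_round_opponent(order, team):
    """Opponent under the 1v8, 2v7, 3v6, 4v5 format."""
    place = order.index(team)          # 0-based finishing place
    return order[len(order) - 1 - place]

for pct, order in STANDINGS.items():
    print(f"{pct:.3f}: G plays {first_round_opponent(order, 'G')}")
```

This gives F for 0.400 and 0.450, A for 0.500, and C for every higher value.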
The Effect on Regular Season Performance of Adding an Eighth Team to the League
In an earlier post, I discussed how I used the runs for and against to predict the win percentage of a team.
Then I described how the win percentages for two teams could be used to predict results of head-to-head competition.
Finally, I used the head-to-head win percentage to predict the regular season results for a 7 team league.
This season an eighth team has joined the league. It is difficult to predict their win percentage.
The league will be divided into two divisions. The teams will play three games against the other teams in their division and two games against the teams in the other division for a 17 game schedule.
My prediction for team G's runs for and against is unchanged, namely 104 runs for and 120 runs against. Thus, the predicted win percentage for team G is 0.429.
I assumed that teams A, B, C and D were in Division I and teams E, F, G and H were in Division II. I also assumed the two divisions were identical. I calculated the percentile for the standings in each division, that is, 0.8, 0.6, 0.4 and 0.2.
Then I used the inverse normal distribution to calculate the win percentage for the teams in the divisions by finding the standard deviation that would make team G and team C have a win percentage of 0.429.
The table below shows my predictions for the win percentage of the 8 teams.
Division I | Rank | Normal | Win Percent |
A | 1 | 0.80 | 0.736 |
B | 2 | 0.60 | 0.571 |
C | 3 | 0.40 | 0.429 |
D | 4 | 0.20 | 0.264 |
Division II | Rank | Normal | Win Percent |
E | 1 | 0.80 | 0.736 |
F | 2 | 0.60 | 0.571 |
G | 3 | 0.40 | 0.429 |
H | 4 | 0.20 | 0.264 |
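The inverse normal step can be sketched with Python's standard library. I am assuming here that the distribution is centred on a win percentage of 0.500, which is consistent with the symmetric values in the table; the function name is my own.

```python
from statistics import NormalDist

def division_win_pcts(percentiles, anchor_percentile, anchor_win_pct, mean=0.5):
    """Find the standard deviation that puts the anchor team at its known
    win percentage, then map every percentile to a win percentage."""
    z_anchor = NormalDist().inv_cdf(anchor_percentile)
    sd = (anchor_win_pct - mean) / z_anchor
    dist = NormalDist(mean, sd)
    return [dist.inv_cdf(p) for p in percentiles]

# Team G (and team C) sit at the 0.40 percentile with a win percentage of 0.429.
pcts = division_win_pcts([0.8, 0.6, 0.4, 0.2], 0.4, 0.429)
print([round(w, 3) for w in pcts])
```

Under that assumption the fitted standard deviation comes out near 0.28 and the four win percentages round to the values in the table.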
I can now calculate the head-to-head performance of the 8 teams as shown in the table below.
| | 0.736 | 0.571 | 0.429 | 0.264 | 0.736 | 0.571 | 0.429 | 0.264 |
| | A | B | C | D | E | F | G | H |
0.736 | A | 0.500 | 0.677 | 0.788 | 0.886 | 0.500 | 0.677 | 0.788 | 0.886 |
0.571 | B | 0.323 | 0.500 | 0.639 | 0.788 | 0.323 | 0.500 | 0.639 | 0.788 |
0.429 | C | 0.212 | 0.361 | 0.500 | 0.677 | 0.212 | 0.361 | 0.500 | 0.677 |
0.264 | D | 0.114 | 0.212 | 0.323 | 0.500 | 0.114 | 0.212 | 0.323 | 0.500 |
0.736 | E | 0.500 | 0.677 | 0.788 | 0.886 | 0.500 | 0.677 | 0.788 | 0.886 |
0.571 | F | 0.323 | 0.500 | 0.639 | 0.788 | 0.323 | 0.500 | 0.639 | 0.788 |
0.429 | G | 0.212 | 0.361 | 0.500 | 0.677 | 0.212 | 0.361 | 0.500 | 0.677 |
0.264 | H | 0.114 | 0.212 | 0.323 | 0.500 | 0.114 | 0.212 | 0.323 | 0.500 |
Then I can predict the regular season performance of the 8 teams as shown in this table.
Team | A | B | C | D | E | F | G | H | Wins |
A | 0 | 2 | 2 | 3 | 1 | 1 | 2 | 2 | 13 |
B | 1 | 0 | 2 | 2 | 1 | 1 | 1 | 2 | 10 |
C | 1 | 1 | 0 | 2 | 0 | 1 | 1 | 1 | 7 |
D | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 4 |
E | 1 | 1 | 2 | 2 | 0 | 2 | 2 | 3 | 13 |
F | 1 | 1 | 1 | 2 | 1 | 0 | 2 | 2 | 10 |
G | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 2 | 7 |
H | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 0 | 4 |
Losses | 4 | 7 | 10 | 13 | 4 | 7 | 10 | 13 |
Thus my prediction of the regular season results is
Div I | Wins | Losses | Win Percent |
A | 13 | 4 | 0.750 |
B | 10 | 7 | 0.574 |
C | 7 | 10 | 0.426 |
D | 4 | 13 | 0.250 |
Div II | Wins | Losses | Win Percent |
E | 13 | 4 | 0.750 |
F | 10 | 7 | 0.574 |
G | 7 | 10 | 0.426 |
H | 4 | 13 | 0.250 |
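The season totals can be approximated by summing expected wins over the 17 game schedule. This is only a sketch: it uses the head-to-head estimate a(1 - b) / [a(1 - b) + (1 - a)b] with the win percentages from the table above, and the names are my own.

```python
# Win percentages from the table above (the two divisions mirror each other).
WIN_PCT = {"A": 0.736, "B": 0.571, "C": 0.429, "D": 0.264,
           "E": 0.736, "F": 0.571, "G": 0.429, "H": 0.264}
DIVISION = {team: ("I" if team in "ABCD" else "II") for team in WIN_PCT}

def head_to_head(a, b):
    """Probability that a team with win pct a beats a team with win pct b."""
    return a * (1 - b) / (a * (1 - b) + (1 - a) * b)

def expected_wins(team):
    """3 games against division rivals, 2 against the other division."""
    total = 0.0
    for opp in WIN_PCT:
        if opp == team:
            continue
        games = 3 if DIVISION[opp] == DIVISION[team] else 2
        total += games * head_to_head(WIN_PCT[team], WIN_PCT[opp])
    return total

for team in WIN_PCT:
    wins = expected_wins(team)
    print(team, round(wins), 17 - round(wins), round(wins / 17, 3))
```

Because each cell in the grid is rounded separately, the individual entries do not always add up to the rounded season total, which is why the win percentages are slightly different from 13/17, 10/17 and so on.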
Saturday, 30 March 2013
Predicting the First Round of the Playoffs
In this 7 team league, the first round
of the playoffs involves the 2nd through 7th
place teams. The 1st place team gets a bye into the final
weekend double-knockout 4 team tournament involving the three winners of the first round.
In the first round, the
2nd place team plays the 7th place team, the
3rd place team plays the 6th place team, and
the 4th place team plays the 5th place team.
Each of these is a best-of-5 series.
In my last post, I made a prediction
about the regular season standings. So in the first round of
the playoffs, team B would play team F, team C would play team D and
team E would play team G.
I am interested in the probability of
team G winning the best-of-5 first round series and making it to the
final weekend tournament.
Team G's regular season win percentage is estimated to
be 0.429 and team E's regular season win percentage is estimated to be 0.456.
So
the probability of team G beating team E in one game of head-to-head competition
is 0.473.
Team G could win the best-of-5 game
series in 3, 4 or 5 games.
The probability of team G winning the
series in 3 games is
[ 0.473 * 0.473 * 0.473 ] = 0.106.
The probability of team G winning the
series in 4 games is
[ 3 * (1 – 0.473) * 0.473 * 0.473 * 0.473 ] = 0.167.
The probability of team G winning the
series in 5 games is
[ 6 * (1 – 0.473) * (1 – 0.473) * 0.473 * 0.473 * 0.473 ] = 0.176.
Adding up these three probabilities,
the total probability of team G winning the best-of-5 series against
team E is
[ 0.106 + 0.167 + 0.176 ] = 0.449.
So there is a 45% chance that
team G will make it to the final weekend tournament in 2013.
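The same calculation can be written as a small Python function. It assumes every game is independent with the same single-game probability; the function name is my own.

```python
from math import comb

def series_win_prob(p, wins_needed=3):
    """Probability of winning a first-to-`wins_needed` series when each
    game is won independently with probability p."""
    q = 1 - p
    # The series ends on a win after exactly k losses, k = 0 .. wins_needed - 1.
    return sum(comb(wins_needed - 1 + k, k) * p**wins_needed * q**k
               for k in range(wins_needed))

print(round(series_win_prob(0.473), 3))
```

For a best-of-5 series the coefficients comb(2, 0), comb(3, 1) and comb(4, 2) are the 1, 3 and 6 used in the three terms above.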
Regular Season Predictions for 2013
I can estimate the head-to-head
performance of two teams from their win percentages.
If a is the win percentage for Team A
and b is the win percentage for Team B, then the probability of Team
A beating Team B in one game can be estimated as a * (1 – b) / [ a
* (1 – b) + (1 – a) * b ] .
In my last post, I estimated the win
percentage for the 7 teams in the league based on the expected runs
for and against for each team.
I can now estimate the probability of
each team winning in head-to-head competition.
The table below shows these
calculations.
| | 0.765 | 0.562 | 0.521 | 0.456 | 0.429 | 0.425 | 0.364 |
| | A | B | C | E | G | D | F |
0.765 | A | 0.500 | 0.717 | 0.749 | 0.795 | 0.812 | 0.815 | 0.851 |
0.562 | B | 0.283 | 0.500 | 0.541 | 0.605 | 0.631 | 0.634 | 0.692 |
0.521 | C | 0.251 | 0.459 | 0.500 | 0.565 | 0.592 | 0.596 | 0.656 |
0.456 | E | 0.205 | 0.395 | 0.435 | 0.500 | 0.527 | 0.531 | 0.595 |
0.429 | G | 0.188 | 0.369 | 0.408 | 0.473 | 0.500 | 0.504 | 0.568 |
0.425 | D | 0.185 | 0.366 | 0.404 | 0.469 | 0.496 | 0.500 | 0.564 |
0.364 | F | 0.149 | 0.308 | 0.344 | 0.405 | 0.432 | 0.436 | 0.500 |
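A short Python sketch of the head-to-head formula, using the win percentages from the table above (the function and variable names are my own):

```python
def head_to_head(a, b):
    """Probability that a team with win percentage a beats a team
    with win percentage b in a single game."""
    return a * (1 - b) / (a * (1 - b) + (1 - a) * b)

WIN_PCT = {"A": 0.765, "B": 0.562, "C": 0.521, "E": 0.456,
           "G": 0.429, "D": 0.425, "F": 0.364}

# One row of the grid: team G against every team, itself included.
for opp, pct in WIN_PCT.items():
    print(f"G vs {opp}: {head_to_head(WIN_PCT['G'], pct):.3f}")
```

Two equal teams get 0.500, and the two entries for any pairing add up to 1.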
Assuming a balanced 18 game schedule, the win-loss record for the 7 teams would be as follows
Team | A | B | C | E | G | D | F | Wins |
A | 0 | 2 | 2 | 2 | 2 | 2 | 3 | 14 |
B | 1 | 0 | 2 | 2 | 2 | 2 | 2 | 10 |
C | 1 | 1 | 0 | 2 | 2 | 2 | 2 | 9 |
E | 1 | 1 | 1 | 0 | 2 | 2 | 2 | 8 |
G | 1 | 1 | 1 | 1 | 0 | 2 | 2 | 8 |
D | 1 | 1 | 1 | 1 | 1 | 0 | 2 | 7 |
F | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 6 |
Losses | 4 | 8 | 9 | 10 | 10 | 11 | 12 |
Thus team G would be tied with team E
for 4th place at the end of the regular season.
Improvement in Winning Percentage
In an earlier post, I calculated the
expected runs that would be scored by a team using linear weights and
an optimal assignment of plate appearances between the players on the
team. I found that the expected number of runs scored would increase from 90 in 2012 to 104 in 2013.
In another earlier post, I calculated
the expected number of runs allowed based on a linear model of the
pitchers and an equitable distribution of innings for each pitcher.
I found that the expected number of runs allowed in 2013 is 120.
Also in an earlier post, I described
the Pythagorean formula to estimate the win percentage from the runs
for and against.
If I assume that the other teams in the
league do nothing to improve while team G optimizes the plate
appearances and pitchers' innings, I can estimate the number of runs
for and against for the 7 teams in 2013 based on the runs for and
against in 2012 and the improvement of team G.
Below is a table which shows the
estimated win percentages for the 7 teams in 2013 based on the runs
for and against estimated for 2013.
Team | 2012 RF | 2012 RA | 2013 RF | 2013 RA | 2013 Win Pct |
A | 109 | 59 | 109 | 61 | 0.765 |
B | 94 | 81 | 94 | 83 | 0.562 |
C | 92 | 86 | 92 | 88 | 0.521 |
E | 106 | 113 | 106 | 116 | 0.456 |
G | 90 | 119 | 104 | 120 | 0.429 |
D | 74 | 84 | 74 | 86 | 0.425 |
F | 79 | 102 | 79 | 105 | 0.364 |
We can see that in 2013, team G would
move to 5th place in the standings based on estimated win
percentage.
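For reference, here is a minimal Python sketch of the Pythagorean estimate. I am assuming the classic exponent of 2, which reproduces the 0.429 for team G; some rows of the table differ in the third decimal, presumably because the runs shown there are rounded.

```python
def pythagorean_win_pct(runs_for, runs_against, exponent=2):
    """Estimate win percentage from runs for and runs against.
    The exponent of 2 is an assumption, not taken from the table."""
    rf = runs_for ** exponent
    ra = runs_against ** exponent
    return rf / (rf + ra)

print(round(pythagorean_win_pct(104, 120), 3))  # team G in 2013
print(round(pythagorean_win_pct(94, 83), 3))    # team B in 2013
```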
Friday, 29 March 2013
Determining the Necessary Improvement to Move Up in the Standings
In an earlier post, I discussed the
Pythagorean formula for estimating winning percentage from runs for
and runs against.
In this post, I will use the
Pythagorean formula to determine the necessary improvement the
last place team in a league would need to make to move up in the standings.
There were seven teams in the league
and I found the average and standard deviation of the winning
percentages for the league. Then based on the rank, I found the
percentile for the inverse normal distribution. Then based on the
percentile and the average and standard deviation, I found the
expected winning percentage for each of the teams in the league.
I assumed that the last place team would need to improve their offense and their defense equally to move up in the standings. That is, the runs
for would need to increase and the runs against would need to
decrease by the
same amount.
rank | percentile | win pct | runs for | runs against | change |
1 | 0.88 | 0.636 | 119 | 90 | 29 |
2 | 0.75 | 0.580 | 113 | 96 | 23 |
3 | 0.63 | 0.538 | 109 | 100 | 19 |
4 | 0.50 | 0.500 | 105 | 105 | 15 |
5 | 0.38 | 0.462 | 101 | 108 | 11 |
6 | 0.25 | 0.420 | 96 | 113 | 6 |
7 | 0.13 | 0.364 | 90 | 119 | 0 |
Thus if the top four teams of the seven
team league make the playoffs, the last place team from 2012 would
have to increase their runs for by 15 and decrease their runs against by
15 in 2013.
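The size of the change can be worked out in closed form if the Pythagorean exponent is taken to be 2 (an assumption on my part). Setting (RF + d)^2 / [(RF + d)^2 + (RA - d)^2] equal to the target win percentage w and writing r = sqrt(w / (1 - w)) gives d = (r * RA - RF) / (1 + r). A hedged Python sketch, with names of my own choosing:

```python
from math import sqrt

def required_change(runs_for, runs_against, target_win_pct):
    """Equal increase in runs for and decrease in runs against needed to
    reach the target win percentage (Pythagorean exponent of 2 assumed)."""
    r = sqrt(target_win_pct / (1 - target_win_pct))
    return (r * runs_against - runs_for) / (1 + r)

# Last place team in 2012: 90 runs for, 119 runs against.
for target in (0.636, 0.580, 0.538, 0.500, 0.462, 0.420, 0.364):
    print(target, round(required_change(90, 119, target), 1))
```

For a target of 0.500 this gives 14.5 runs, which is the 15 in the table after rounding.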
Wednesday, 27 March 2013
A Goal for Pitchers
A common idea is that successful pitchers stay ahead on the
count. That means pitching more strikes
than balls. It has been suggested by some
baseball coaches that a pitcher should strive to pitch 60% strikes. I collected some data from the recent
International Softball Federation World Tournament that confirmed this
suggested goal for pitchers.
I wanted to answer the question: how important is it to pitch
a high percentage of strikes? I found
data from Major League Baseball on this subject. There were 88 pitchers in the sample. The average strike percentage was 64.0% and
the average earned run average was 3.87.
I did a linear regression to determine the relationship
between strike percentage and earned run average. The significance of the relationship was very high.
The regression equation I found is:
Earned run average = 12.5 – 13.5 * strike percentage
Here are some predicted values for earned run average based
on strike percentage using this equation.
Strike Percentage | Earned Run Average |
70% | 3.06 |
65% | 3.73 |
60% | 4.41 |
55% | 5.08 |
50% | 5.76 |
From these results, I can see why coaches recommend that
pitchers strive to throw at least 60% strikes.
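The regression equation is easy to evaluate in Python. Note that the coefficients quoted above are rounded, so this sketch lands within about 0.01 of the tabulated values rather than exactly on them.

```python
def predicted_era(strike_fraction):
    """Earned run average from the rounded regression equation,
    with strike percentage given as a fraction (0.60 for 60%)."""
    return 12.5 - 13.5 * strike_fraction

for pct in (0.70, 0.65, 0.60, 0.55, 0.50):
    print(f"{pct:.0%}: {predicted_era(pct):.2f}")
```

Each extra percentage point of strikes is worth about 0.135 of a run of earned run average under this equation.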
Sunday, 17 March 2013
Distributing Plate Appearances between Position Players
In my last post, I distributed the innings among the pitching staff of a men's fastpitch softball team.
In this post, I will distribute the plate appearances for
the position players on a men's fastpitch softball team.
This team scored 90 runs in 2012. With the ideal distribution of plate appearances found below, they should be able to improve that to 104 runs in 2013.
In 2012, there were 492 regular season plate appearances distributed
between 15 players. I would like to
distribute these 492 plate appearances for the 2013 regular season among the 14 players while
maximizing the number of runs created by the team.
Recall the linear weights based runs created formula that I found for this men's fastpitch softball league.
Runs Created = 0.44*1b + 0.83*2b + 1.00*3b + 1.38*hr + 0.31*walks
I calculated the runs created per plate appearance for each of the 14 players. Then I ranked the players by runs created per
plate appearance.
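As a quick illustration, the runs created formula and the per plate appearance rate could be coded like this. The stat line in the example is invented, not a real player from the team.

```python
def runs_created(singles, doubles, triples, home_runs, walks):
    """Linear weights runs created, using the weights quoted above."""
    return (0.44 * singles + 0.83 * doubles + 1.00 * triples
            + 1.38 * home_runs + 0.31 * walks)

def runs_per_pa(singles, doubles, triples, home_runs, walks, plate_appearances):
    return runs_created(singles, doubles, triples, home_runs, walks) / plate_appearances

# Hypothetical player: 10 singles, 2 doubles, 1 triple, 1 home run, 5 walks in 45 PA.
print(round(runs_created(10, 2, 1, 1, 5), 2))
print(round(runs_per_pa(10, 2, 1, 1, 5, 45), 2))
```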
I found the mean and standard deviation of the plate
appearances for the team's players in the regular season for 2012.
Then I used the
inverse normal probability distribution with the percentile found using the rank of the player’s
runs created per plate appearance and the mean and standard deviation of the plate appearances for
the team in 2012. In this way, I could determine the ideal number of plate appearances for the 2013 season for each of the players.
rank | runs/pa | norm dist | new pa | new runs |
1 | 0.30 | 0.93 | 56 | 17 |
2 | 0.25 | 0.87 | 50 | 13 |
3 | 0.24 | 0.80 | 47 | 11 |
4 | 0.22 | 0.73 | 44 | 10 |
5 | 0.22 | 0.67 | 41 | 9 |
6 | 0.21 | 0.60 | 39 | 8 |
7 | 0.20 | 0.53 | 36 | 7 |
8 | 0.19 | 0.47 | 34 | 7 |
9 | 0.17 | 0.40 | 32 | 6 |
10 | 0.17 | 0.33 | 29 | 5 |
11 | 0.16 | 0.27 | 27 | 4 |
12 | 0.14 | 0.20 | 24 | 3 |
13 | 0.14 | 0.13 | 20 | 3 |
14 | 0.13 | 0.07 | 15 | 2 |
Total | | | 493 | 104 |
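The allocation step can be sketched with Python's standard library. The percentile 1 - rank / (n + 1) matches the "norm dist" column above. The mean is taken as 492 plate appearances over 14 players, and the standard deviation of 13.7 is only a placeholder that I picked to get numbers of a similar size; it is not the team's actual 2012 value.

```python
from statistics import NormalDist

def ideal_plate_appearances(n_players, mean_pa, sd_pa):
    """Plate appearances by rank (best hitter first) from the inverse normal."""
    dist = NormalDist(mean_pa, sd_pa)
    return [dist.inv_cdf(1 - rank / (n_players + 1))
            for rank in range(1, n_players + 1)]

# sd_pa = 13.7 is a hypothetical placeholder, not a value from the 2012 data.
pa = ideal_plate_appearances(14, 492 / 14, 13.7)
for rank, value in enumerate(pa, start=1):
    print(rank, round(value))
print("total", round(sum(pa)))
```

Because the percentiles are symmetric about 0.5, the unrounded allocations always add back up to the original total; rounding each player separately is what can push the printed total off by a plate appearance or so.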
Friday, 15 March 2013
Balancing the Innings for a Pitching Staff
In my last post, I discussed Pete Palmer’s Linear Weights
formula. I showed how it could be used
to estimate the number of runs produced by a men’s fastpitch softball team.
In this post, I will look at a similar idea of linear
weights to evaluate a pitching staff. Then
I will use the linear weights to balance the innings assigned to each pitcher to
minimize the runs given up by a men’s fastpitch softball team.
I took the pitching statistics for the primary pitchers in a
local men’s fastpitch softball league. I
calculated various pitching statistics in terms of their values per inning. Then I used multiple linear regression to
estimate the runs allowed per inning pitched as a function of hits allowed
(non-homeruns), walks (base on balls and hit by pitch), strikeouts, and
homeruns allowed per inning pitched.
Here is the data that I used.
Pitcher | Hits | Walks | Strikeouts | Homeruns | Runs Allowed |
1 | 1.05 | 0.42 | 1.32 | 0.08 | 0.58 |
2 | 1.23 | 0.57 | 0.91 | 0.16 | 1.25 |
3 | 1.49 | 0.47 | 1.24 | 0.17 | 1.18 |
4 | 0.79 | 0.26 | 1.44 | 0.12 | 0.32 |
5 | 0.83 | 0.68 | 1.43 | 0.00 | 0.48 |
6 | 1.03 | 0.34 | 1.52 | 0.08 | 0.76 |
7 | 1.09 | 0.34 | 1.07 | 0.10 | 0.73 |
8 | 1.13 | 0.54 | 1.02 | 0.17 | 1.22 |
9 | 1.42 | 0.46 | 0.46 | 0.07 | 1.05 |
10 | 1.31 | 0.20 | 0.76 | 0.06 | 0.71 |
11 | 1.09 | 0.76 | 1.27 | 0.15 | 0.97 |
12 | 1.19 | 0.65 | 1.19 | 0.11 | 0.98 |
13 | 0.46 | 0.33 | 1.77 | 0.06 | 0.40 |
14 | 1.15 | 0.66 | 0.66 | 0.16 | 1.23 |
The formula that I obtained from the linear regression is
Runs Allowed = 0.42*Hits + 0.55*Walks – 0.14*Strikeouts +
2.36*Homeruns
Pitchers 1, 2 and 3 are on the same team.
I wanted to balance the number of innings between the three
pitchers. I found that one good way to
do that was to equalize the runs allowed by each pitcher.
Here are the results.
Pitcher | Games | Innings | Hits | Walks | Strikeouts | Homeruns | Runs Allowed |
Weight | | | 0.42 | 0.55 | -0.14 | 2.36 | |
1 | 9 | 60 | 1.05 | 0.42 | 1.32 | 0.08 | 40 |
2 | 5 | 37 | 1.23 | 0.57 | 0.91 | 0.16 | 40 |
3 | 5 | 36 | 1.49 | 0.47 | 1.24 | 0.17 | 40 |
Total | 19 | 133 | | | | | 120 |
So the manager should plan to throw pitcher 1 for 60 innings
or the equivalent of 9 games during the season. Pitchers 2 and 3 would be expected to throw 37
and 36 innings respectively which represents approximately 5 games each.
The entire pitching staff would be expected to allow 120 runs during
the season.
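One way to reproduce the balanced innings is sketched below in Python. It applies the regression weights to each pitcher's per-inning statistics, then gives each pitcher innings in inverse proportion to his predicted runs per inning, so that all three allow the same number of runs over the 133 innings. The closed-form split and the names are my own; the inputs come from the tables above.

```python
WEIGHTS = {"hits": 0.42, "walks": 0.55, "strikeouts": -0.14, "homeruns": 2.36}

# Per-inning statistics for pitchers 1, 2 and 3 from the data table.
STAFF = {
    1: {"hits": 1.05, "walks": 0.42, "strikeouts": 1.32, "homeruns": 0.08},
    2: {"hits": 1.23, "walks": 0.57, "strikeouts": 0.91, "homeruns": 0.16},
    3: {"hits": 1.49, "walks": 0.47, "strikeouts": 1.24, "homeruns": 0.17},
}
TOTAL_INNINGS = 133  # 19 games of 7 innings

def runs_per_inning(stats):
    """Predicted runs allowed per inning from the linear weights."""
    return sum(WEIGHTS[k] * stats[k] for k in WEIGHTS)

def balanced_innings(staff, total_innings):
    """Split the innings so every pitcher allows the same number of runs."""
    rates = {p: runs_per_inning(s) for p, s in staff.items()}
    common_runs = total_innings / sum(1 / r for r in rates.values())
    return {p: common_runs / r for p, r in rates.items()}, common_runs

innings, runs_each = balanced_innings(STAFF, TOTAL_INNINGS)
for pitcher, inn in innings.items():
    print(pitcher, round(inn), round(runs_each))
```

With these inputs the split rounds to 60, 37 and 36 innings at about 40 runs each.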
I can now use the expected offensive production of the
batters on the team from the previous post and the expected runs allowed by the
pitchers shown here in the Pythagorean formula to estimate the winning
percentage of the team during the regular season.