Saturday, 30 March 2013

Predicting the First Round of the Playoffs

In this 7-team league, the first round of the playoffs involves the 2nd through 7th place teams. The 1st place team gets a bye into the final weekend, a 4-team double-knockout tournament in which it joins the three winners of the first round.

In the first round, the 2nd place team plays the 7th place team, the 3rd place team plays the 6th place team, and the 4th place team plays the 5th place team. Each of these is a best-of-5 series.

In my last post, I made a prediction about the regular season standings. So in the first round of the playoffs, team B would play team F, team C would play team D and team E would play team G.

I am interested in the probability of team G winning the best-of-5 first round series and making it to the final weekend tournament.

Team G's regular season win percentage is estimated to be 0.429 and team E's regular season win percentage is estimated to be 0.456. 

So, using the head-to-head formula from my last post, the probability of team G beating team E in a single game is 0.473.

Team G could win the best-of-5 series in 3, 4 or 5 games. In each case, team G must win the final game of the series.

The probability of team G winning the series in 3 games is

[ 0.473 * 0.473 * 0.473 ] = 0.106.

The probability of team G winning the series in 4 games (losing exactly one of the first 3 games, which can happen in 3 ways) is

[ 3 * (1 – 0.473) * 0.473 * 0.473 * 0.473 ] = 0.167.

The probability of team G winning the series in 5 games (losing exactly two of the first 4 games, which can happen in 6 ways) is

[ 6 * (1 – 0.473) * (1 – 0.473) * 0.473 * 0.473 * 0.473 ] = 0.176.

Adding up these three probabilities, the total probability of team G winning the best-of-5 series against team E is

[ 0.106 + 0.167 + 0.176 ] = 0.449.

So there is a 45% chance that team G will make it to the final weekend tournament in 2013.
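
The three cases above can be folded into one small function. The Python sketch below is only an illustration: the name series_win_prob and its parameters are my own choices, and it assumes every game is independent and is won with the same single-game probability.

```python
from math import comb


def series_win_prob(p, wins_needed=3):
    """Probability of winning a series that ends at `wins_needed` wins.

    Assumes each game is independent and won with the same probability p.
    """
    q = 1.0 - p
    total = 0.0
    # The series winner always takes the last game. Before that game they
    # have wins_needed - 1 wins and `losses` losses, in any order.
    for losses in range(wins_needed):
        orderings = comb(wins_needed - 1 + losses, losses)
        total += orderings * p**wins_needed * q**losses
    return total


if __name__ == "__main__":
    # 0.473 is team G's single-game probability against team E from the post.
    print(round(series_win_prob(0.473), 3))
```

For a best-of-5 series the loop produces the same three terms as above, with coefficients 1, 3 and 6.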

Regular Season Predictions for 2013

I can estimate the head-to-head performance of two teams from their win percentages.

If a is the win percentage for Team A and b is the win percentage for Team B, then the probability of Team A beating Team B in one game can be estimated as a * (1 – b) / [ a * (1 – b) + (1 – a) * b ] .
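
As a quick check of this formula, here is a minimal Python sketch. The function name head_to_head is my own, and the two win percentages in the example are the ones estimated for teams G and E.

```python
def head_to_head(a, b):
    """Estimated probability that a team with win percentage a beats
    a team with win percentage b in one game."""
    numerator = a * (1.0 - b)
    return numerator / (numerator + (1.0 - a) * b)


if __name__ == "__main__":
    # Team G (0.429) against team E (0.456).
    print(round(head_to_head(0.429, 0.456), 3))
```

Two properties are worth noting: two teams with the same win percentage get 0.5 each, and the two probabilities for any pairing add up to 1.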

In my last post, I estimated the win percentage for the 7 teams in the league based on the expected runs for and against for each team.

I can now estimate the probability of each team winning in head-to-head competition.

The table below shows these calculations. Each entry is the probability that the row team beats the column team in one game.



                   A      B      C      E      G      D      F
Win Pct  Team  0.765  0.562  0.521  0.456  0.429  0.425  0.364
0.765    A     0.500  0.717  0.749  0.795  0.812  0.815  0.851
0.562    B     0.283  0.500  0.541  0.605  0.631  0.634  0.692
0.521    C     0.251  0.459  0.500  0.565  0.592  0.596  0.656
0.456    E     0.205  0.395  0.435  0.500  0.527  0.531  0.595
0.429    G     0.188  0.369  0.408  0.473  0.500  0.504  0.568
0.425    D     0.185  0.366  0.404  0.469  0.496  0.500  0.564
0.364    F     0.149  0.308  0.344  0.405  0.432  0.436  0.500

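Assuming a balanced 18-game schedule (3 games against each of the 6 opponents), the expected win-loss record for the 7 teams would be as follows. Each entry is the row team's expected wins against the column team, rounded to the nearest game. The Wins and Losses totals appear to be rounded from the unrounded expected values, so they do not always equal the sum of the rounded entries.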


Team     A   B   C   E   G   D   F   Wins
A        0   2   2   2   2   2   3     14
B        1   0   2   2   2   2   2     10
C        1   1   0   2   2   2   2      9
E        1   1   1   0   2   2   2      8
G        1   1   1   1   0   2   2      8
D        1   1   1   1   1   0   2      7
F        0   1   1   1   1   1   0      6
Losses   4   8   9  10  10  11  12

Thus team G would be tied with team E for 4th place at the end of the regular season.
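
The season totals can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the spreadsheet I actually used: the names are my own, and I assume the balanced 18-game schedule means 3 games against each of the 6 opponents.

```python
# Estimated 2013 win percentages from the table above.
WIN_PCT = {"A": 0.765, "B": 0.562, "C": 0.521, "E": 0.456,
           "G": 0.429, "D": 0.425, "F": 0.364}

# Assumption: 18 games split evenly over 6 opponents.
GAMES_PER_OPPONENT = 3


def head_to_head(a, b):
    """Probability that a team with win percentage a beats one with b."""
    numerator = a * (1.0 - b)
    return numerator / (numerator + (1.0 - a) * b)


def expected_wins(win_pct, games_per_opponent):
    """Unrounded expected season wins for every team."""
    totals = {}
    for team, a in win_pct.items():
        totals[team] = sum(
            games_per_opponent * head_to_head(a, b)
            for opponent, b in win_pct.items()
            if opponent != team
        )
    return totals


if __name__ == "__main__":
    games = GAMES_PER_OPPONENT * (len(WIN_PCT) - 1)
    for team, wins in expected_wins(WIN_PCT, GAMES_PER_OPPONENT).items():
        print(f"{team}: {wins:5.2f} wins, {games - wins:5.2f} losses")
```

Summing the unrounded values and rounding only at the end keeps the league total at 63 wins, one for each game played.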

Improvement in Winning Percentage


In an earlier post, I calculated the expected runs that would be scored by a team using linear weights and an optimal assignment of plate appearances between the players on the team. I found that the expected number of runs scored would increase from 90 in 2012 to 104 in 2013.

In another earlier post, I calculated the expected number of runs allowed based on a linear model of the pitchers and an equitable distribution of innings among the pitchers. I found that the expected number of runs allowed in 2013 is 120.

Also in an earlier post, I described the Pythagorean formula to estimate the win percentage from the runs for and against.

If I assume that the other teams in the league do nothing to improve while team G optimizes its plate appearances and pitchers' innings, I can estimate the number of runs for and against for the 7 teams in 2013 based on the runs for and against in 2012 and the improvement of team G.

Below is a table which shows the estimated win percentages for the 7 teams in 2013 based on the runs for and against estimated for 2013.


        2012        2013
Team    RF   RA     RF   RA   Win Pct
A      109   59    109   61     0.765
B       94   81     94   83     0.562
C       92   86     92   88     0.521
E      106  113    106  116     0.456
G       90  119    104  120     0.429
D       74   84     74   86     0.425
F       79  102     79  105     0.364









We can see that in 2013, team G would move to 5th place in the standings based on estimated win percentage.
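
For reference, the Pythagorean estimate can be sketched in Python as follows. The function name is my own, and the exponent of 2 is an assumption (it is the classic choice, but the exponent is not restated in this post).

```python
def pythagorean_win_pct(runs_for, runs_against, exponent=2.0):
    """Estimated win percentage from runs for and runs against.

    Assumption: the classic exponent of 2 is used by default.
    """
    rf = runs_for ** exponent
    ra = runs_against ** exponent
    return rf / (rf + ra)


if __name__ == "__main__":
    # Team G's estimated 2013 runs for and against from the table above.
    print(round(pythagorean_win_pct(104, 120), 3))
```

With the whole-number run totals shown in the table, this reproduces team G's 0.429. Some of the other teams can come out about 0.001 to 0.003 away from the table, presumably because the table was built from unrounded run totals.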



Friday, 29 March 2013

Determining the Necessary Improvement to Move Up in the Standings

In an earlier post, I discussed the Pythagorean formula for estimating winning percentage from runs for and runs against.

In this post, I will use the Pythagorean formula to determine the necessary improvement the last place team in a league would need to make to move up in the standings.

There were seven teams in the league, and I found the average and standard deviation of their winning percentages. Then, based on each rank, I found a percentile. Finally, I used the inverse normal distribution with that percentile and the league average and standard deviation to find the expected winning percentage for each rank.

I assumed that the last place team would need to improve its offense and its defense equally to move up in the standings. That is, the runs for would need to increase and the runs against would need to decrease by the same amount.

rank  percentile  win pct  runs for  runs against  change
1     0.88        0.636    119       90            29
2     0.75        0.580    113       96            23
3     0.63        0.538    109       100           19
4     0.50        0.500    105       105           15
5     0.38        0.462    101       108           11
6     0.25        0.420    96        113           6
7     0.13        0.364    90        119           0

Thus if the top four teams of the seven team league make the playoffs, the last place team from 2012 would have to increase their runs for by 15 and decrease their runs against by 15 in 2013.
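
The two steps behind this table can be sketched in Python. Several things here are assumptions rather than facts from the post: the function names, the percentile rule 1 - rank / (n + 1) (which is consistent with the percentile column), the league mean of 0.500 and standard deviation of 0.118 (illustrative values that roughly reproduce the win pct column), and a Pythagorean exponent of 2.

```python
from math import sqrt
from statistics import NormalDist


def expected_win_pct(rank, n_teams, mean, sd):
    """Expected win percentage for a rank via the inverse normal distribution.

    Assumption: percentile = 1 - rank / (n_teams + 1).
    """
    percentile = 1.0 - rank / (n_teams + 1)
    return NormalDist(mean, sd).inv_cdf(percentile)


def required_change(runs_for, runs_against, target_pct):
    """Equal change d, added to runs for and subtracted from runs against,
    so that the Pythagorean estimate (exponent 2) equals target_pct."""
    # Required ratio (runs_for + d) / (runs_against - d).
    ratio = sqrt(target_pct / (1.0 - target_pct))
    return (ratio * runs_against - runs_for) / (1.0 + ratio)


if __name__ == "__main__":
    # Last place team in 2012: 90 runs for, 119 runs against.
    for rank in range(1, 8):
        pct = expected_win_pct(rank, 7, mean=0.500, sd=0.118)  # assumed mean, sd
        change = required_change(90, 119, pct)
        print(f"rank {rank}: win pct {pct:.3f}, change {change:.1f}")
```

For a target of 0.500 the required change is exactly 14.5 runs, which would explain why the 4th place row shows 105 for both runs for and runs against.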

Wednesday, 27 March 2013

A Goal for Pitchers

A common idea is that successful pitchers stay ahead on the count.  That means pitching more strikes than balls.  It has been suggested by some baseball coaches that a pitcher should strive to pitch 60% strikes.  I collected some data from the recent International Softball Federation World Tournament that confirmed this suggested goal for pitchers.

I wanted to answer the question: how important is it to pitch a high percentage of strikes?  I found data from Major League Baseball on this subject.  There were 88 pitchers in the sample.  The average strike percentage was 64.0% and the average earned run average was 3.87.

I did a linear regression to determine the relationship between strike percentage and earned run average.  The significance of the relationship was very high.

The regression equation I found, with strike percentage expressed as a fraction (for example, 0.60 for 60%), is:

Earned run average = 12.5 – 13.5 * strike percentage

Here are some predicted values for earned run average based on strike percentage using this equation.

Strike Percentage   Earned Run Average
70%                 3.06
65%                 3.73
60%                 4.41
55%                 5.08
50%                 5.76

From these results, I can see why coaches recommend that pitchers strive to throw at least 60% strikes.
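
The regression equation is easy to evaluate in Python. The sketch below uses the rounded coefficients quoted above, and the function name is my own.

```python
def predicted_era(strike_fraction):
    """Predicted earned run average from the fraction of pitches that are strikes.

    Uses the rounded coefficients from the post (12.5 and 13.5).
    """
    return 12.5 - 13.5 * strike_fraction


if __name__ == "__main__":
    for fraction in (0.70, 0.65, 0.60, 0.55, 0.50):
        print(f"{fraction:.0%}  {predicted_era(fraction):.2f}")
```

Because the coefficients are rounded, the results can differ from the table above by about 0.01. Each 5 point drop in strike percentage adds roughly 0.68 to the predicted earned run average.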

 

 

Sunday, 17 March 2013

Distributing Plate Appearances between Position Players

In my last post, I distributed the innings among a men's fastpitch softball team's pitching staff.

In this post, I will distribute the plate appearances for the position players on a men's fastpitch softball team.

In 2012, there were 492 regular season plate appearances distributed between 15 players.  I would like to distribute these 492 plate appearances for the 2013 regular season among the 14 players while maximizing the number of runs created by the team.

Recall the linear weights based runs created formula that I found for this men's fastpitch softball league.

Runs Created = 0.44*1b + 0.83*2b + 1.00*3b + 1.38*hr + 0.31*walks

I calculated the runs created per plate appearance for each of the 14 players.  Then I ranked the players by runs created per plate appearance.
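
Here is a minimal Python sketch of that calculation. The weights are the ones in the formula above. The function names and the sample stat line are hypothetical and are not taken from any real player.

```python
# Linear weights from the runs created formula above.
WEIGHTS = {"1b": 0.44, "2b": 0.83, "3b": 1.00, "hr": 1.38, "walks": 0.31}


def runs_created(stats):
    """Runs created from a dict of counting stats keyed like WEIGHTS."""
    return sum(weight * stats.get(event, 0) for event, weight in WEIGHTS.items())


def runs_created_per_pa(stats, plate_appearances):
    """Runs created per plate appearance, the value used to rank the players."""
    return runs_created(stats) / plate_appearances


if __name__ == "__main__":
    # Hypothetical player: 10 singles, 2 doubles, 1 triple, 1 home run, 5 walks.
    sample = {"1b": 10, "2b": 2, "3b": 1, "hr": 1, "walks": 5}
    print(round(runs_created(sample), 2))
    print(round(runs_created_per_pa(sample, 50), 3))
```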

I found the mean and standard deviation of the plate appearances for the team's players in the regular season for 2012.
 
Then I used the inverse normal probability distribution with the percentile found using the rank of the player’s runs created per plate appearance and the mean and standard deviation of the plate appearances for the team in 2012.  In this way, I could determine the ideal number of plate appearances for the 2013 season for each of the players.

rank  runs/pa  norm dist  new pa  new runs
1     0.30     0.93       56      17
2     0.25     0.87       50      13
3     0.24     0.80       47      11
4     0.22     0.73       44      10
5     0.22     0.67       41      9
6     0.21     0.60       39      8
7     0.20     0.53       36      7
8     0.19     0.47       34      7
9     0.17     0.40       32      6
10    0.17     0.33       29      5
11    0.16     0.27       27      4
12    0.14     0.20       24      3
13    0.14     0.13       20      3
14    0.13     0.07       15      2
Total                     493     104

This team scored 90 runs in 2012.  So with this ideal distribution of plate appearances, they should be able to improve that to 104 runs in 2013.
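
The allocation can be sketched in Python as follows. The runs per plate appearance values are the rounded ones from the table. Everything else is an assumption: the function name, the percentile rule 1 - rank / (n + 1) (which is consistent with the norm dist column), and the mean of 35.0 and standard deviation of 13.5 plate appearances, which are illustrative stand-ins because the 2012 values are not listed in this post.

```python
from statistics import NormalDist

# Rounded runs created per plate appearance from the table above.
RUNS_PER_PA = [0.30, 0.25, 0.24, 0.22, 0.22, 0.21, 0.20,
               0.19, 0.17, 0.17, 0.16, 0.14, 0.14, 0.13]


def allocate_plate_appearances(runs_per_pa, mean_pa, sd_pa):
    """Give better hitters more plate appearances via the inverse normal.

    Returns rows of (rank, runs/pa, percentile, new pa, new runs).
    Assumption: percentile = 1 - rank / (number of players + 1).
    """
    ranked = sorted(runs_per_pa, reverse=True)
    dist = NormalDist(mean_pa, sd_pa)
    plan = []
    for rank, rate in enumerate(ranked, start=1):
        percentile = 1.0 - rank / (len(ranked) + 1)
        new_pa = dist.inv_cdf(percentile)
        plan.append((rank, rate, percentile, new_pa, rate * new_pa))
    return plan


if __name__ == "__main__":
    # Assumed (hypothetical) mean and standard deviation of 2012 plate appearances.
    plan = allocate_plate_appearances(RUNS_PER_PA, mean_pa=35.0, sd_pa=13.5)
    for rank, rate, percentile, new_pa, new_runs in plan:
        print(f"{rank:2d}  {rate:.2f}  {percentile:.2f}  {new_pa:5.1f}  {new_runs:5.1f}")
    print("total pa:", round(sum(row[3] for row in plan)))
    print("total runs:", round(sum(row[4] for row in plan), 1))
```

Because the percentiles are symmetric around 0.50, the plate appearances always add up to the number of players times the mean, so the mean controls the season total and the standard deviation controls how unevenly the plate appearances are spread.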

Friday, 15 March 2013

Balancing the Innings for a Pitching Staff


In my last post, I discussed Pete Palmer’s Linear Weights formula.  I showed how it could be used to estimate the number of runs produced by a men’s fastpitch softball team.

In this post, I will look at a similar idea of linear weights to evaluate a pitching staff.  Then I will use the linear weights to balance the innings assigned to each pitcher to minimize the runs given up by a men’s fastpitch softball team.

I took the pitching statistics for the primary pitchers in a local men’s fastpitch softball league.   I calculated various pitching statistics in terms of their values per inning.  Then I used multiple linear regression to estimate the runs allowed per inning pitched as a function of hits allowed (non-homeruns), walks (base on balls and hit by pitch), strikeouts, and homeruns allowed per inning pitched.

Here is the data that I used.

Pitcher  Hits  Walks  Strikeouts  Homeruns  Runs Allowed
1        1.05  0.42   1.32        0.08      0.58
2        1.23  0.57   0.91        0.16      1.25
3        1.49  0.47   1.24        0.17      1.18
4        0.79  0.26   1.44        0.12      0.32
5        0.83  0.68   1.43        0.00      0.48
6        1.03  0.34   1.52        0.08      0.76
7        1.09  0.34   1.07        0.10      0.73
8        1.13  0.54   1.02        0.17      1.22
9        1.42  0.46   0.46        0.07      1.05
10       1.31  0.20   0.76        0.06      0.71
11       1.09  0.76   1.27        0.15      0.97
12       1.19  0.65   1.19        0.11      0.98
13       0.46  0.33   1.77        0.06      0.40
14       1.15  0.66   0.66        0.16      1.23

The formula that I obtained from the linear regression is

Runs Allowed = 0.42*Hits + 0.55*Walks – 0.14*Strikeouts + 2.36*Homeruns
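
As a small illustration, the formula can be written as a Python function. The function name is my own, and all inputs are per inning pitched, as in the data above.

```python
def runs_allowed_per_inning(hits, walks, strikeouts, homeruns):
    """Estimated runs allowed per inning from per-inning pitching stats,
    using the regression weights from the post."""
    return 0.42 * hits + 0.55 * walks - 0.14 * strikeouts + 2.36 * homeruns


if __name__ == "__main__":
    # Pitcher 1 from the data above.
    print(round(runs_allowed_per_inning(1.05, 0.42, 1.32, 0.08), 3))
```

For pitcher 1 this gives about 0.68 runs per inning, compared with the 0.58 he actually allowed.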

Pitchers 1, 2 and 3 are on the same team.  

I wanted to balance the number of innings between the three pitchers.  I found that one good way to do that was to equalize the runs allowed by each pitcher.

Here are the results.

Pitcher  Games  Innings  Hits  Walks  Strikeouts  Homeruns  Runs Allowed
Weight                   0.42  0.55   -0.14       2.36
1        9      60       1.05  0.42   1.32        0.08      40
2        5      37       1.23  0.57   0.91        0.16      40
3        5      36       1.49  0.47   1.24        0.17      40
Total    19     133                                         120

So the manager should plan to throw pitcher 1 for 60 innings or the equivalent of 9 games during the season.  Pitchers 2 and 3 would be expected to throw 37 and 36 innings respectively which represents approximately 5 games each.

The entire pitching staff would be expected to allow 120 runs during the season.
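
One way to equalize the runs allowed is to give each pitcher innings in inverse proportion to his estimated runs allowed per inning. The Python sketch below takes that approach. The function names are my own, and the 133 total innings is the figure from the results above.

```python
# Per-inning stats (hits, walks, strikeouts, homeruns) for pitchers 1, 2 and 3.
STAFF = [
    (1.05, 0.42, 1.32, 0.08),
    (1.23, 0.57, 0.91, 0.16),
    (1.49, 0.47, 1.24, 0.17),
]
TOTAL_INNINGS = 133


def runs_allowed_per_inning(hits, walks, strikeouts, homeruns):
    """Regression estimate of runs allowed per inning pitched."""
    return 0.42 * hits + 0.55 * walks - 0.14 * strikeouts + 2.36 * homeruns


def equalize_runs(rates, total_innings):
    """Split total_innings so that innings * rate is the same for every pitcher."""
    inverse = [1.0 / rate for rate in rates]
    scale = total_innings / sum(inverse)  # the common runs allowed per pitcher
    return [scale * inv for inv in inverse]


if __name__ == "__main__":
    rates = [runs_allowed_per_inning(*line) for line in STAFF]
    innings = equalize_runs(rates, TOTAL_INNINGS)
    for number, (rate, inn) in enumerate(zip(rates, innings), start=1):
        print(f"pitcher {number}: {inn:5.1f} innings, {rate * inn:5.1f} runs")
```

With these inputs each pitcher is expected to allow a little over 40 runs, so the unrounded staff total is slightly above the 120 obtained by adding the three rounded values.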

I can now use the expected offensive production of the batters on the team from the previous post and the expected runs allowed by the pitchers shown here in the Pythagorean formula to estimate the winning percentage of the team during the regular season.