Saturday, 23 February 2013

How to Determine if a Player is Above Average

So far I have concentrated on Markov chains and Monte Carlo simulation. The Markov chain models use only average teams made up of average players, while the Monte Carlo simulations evaluate team performance based on the data from particular players.

One of the problems with using data from particular players in fastpitch softball is that the sample size of plate appearances is quite small even when I combine the statistics for 2011 and 2012.

The table below shows the data for 3 players in 2011 and 2012: the plate appearances, the times the player reached base safely, the on-base average, and the lower and upper values of the 95% confidence interval around the on-base average.

Player    PA      OB      Lower    OBA      Upper
1           107      46       0.288     0.430     0.478
2            85       31       0.210     0.365     0.417
3            59       18       0.128     0.305     0.365
 
The league on-base average was 0.373. The upper bound of player 3's 95% confidence interval (0.365) is below the league average, so we can say with confidence that this player is significantly below average.

None of the players has a lower bound that is higher than the league average. So according to this analysis I cannot say that any of the players is significantly above average.
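
The decision rule used in the last two paragraphs can be written as a short Python sketch. The bounds are copied from the table above rather than recomputed, because the interval method behind them is not stated here. The function name `classify` and the label strings are illustrative choices, not part of the original analysis.

```python
LEAGUE_OBA = 0.373  # league on-base average quoted in the post

# Player -> (plate appearances, times on base, lower bound, upper bound),
# copied from the 2011-2012 table above.
players = {
    1: (107, 46, 0.288, 0.478),
    2: (85, 31, 0.210, 0.417),
    3: (59, 18, 0.128, 0.365),
}


def classify(lower, upper, league=LEAGUE_OBA):
    """Compare a confidence interval with the league average."""
    if upper < league:
        return "significantly below average"
    if lower > league:
        return "significantly above average"
    return "not significantly different from average"


results = {}
for player, (pa, on_base, lower, upper) in players.items():
    oba = on_base / pa  # on-base average = times on base / plate appearances
    results[player] = classify(lower, upper)
    print(f"Player {player}: OBA {oba:.3f} -> {results[player]}")
```

Only player 3 falls entirely below 0.373; the intervals for players 1 and 2 both contain the league average.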

Another tool that could be helpful in my statistical analysis is Bayesian updating. This method starts with a prior estimate of the probability that a player is above average and updates that probability as more information is collected.

I will use the statistics for 2011 for the players considered in my earlier post. I will begin with an initial estimate of 0.5 for the probability of being above average and update that estimate based on the 2011 statistics.

Player    PA     OB     OBA     Probability of being above average (2011)
1           49      23     0.469                          0.88
2           46      16     0.348                          0.29
3           38      13     0.342                          0.29
 
So it appears that player 1 is likely to be above average, while players 2 and 3 are likely to be below average.
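
One way to carry out such an update is sketched below in Python. It is a minimal sketch under an assumption that is not taken from the analysis above: an "above average" player is assumed to reach base at a rate of `P_ABOVE` and a "below average" player at `P_BELOW`, both hypothetical values set 0.05 either side of the league average. The likelihood model actually used for the table is not stated, so the numbers printed here need not match it exactly.

```python
import math

LEAGUE_OBA = 0.373  # league on-base average quoted in the post

# Hypothetical assumption, for illustration only: the true on-base rate
# under each of the two hypotheses.
P_ABOVE = LEAGUE_OBA + 0.05
P_BELOW = LEAGUE_OBA - 0.05


def log_likelihood(on_base, pa, rate):
    """Binomial log-likelihood without the binomial coefficient,
    which is the same under both hypotheses and cancels out."""
    return on_base * math.log(rate) + (pa - on_base) * math.log(1.0 - rate)


def update(prior, pa, on_base):
    """Return P(above average | data) given the prior P(above average).

    The prior must lie strictly between 0 and 1.
    """
    log_above = math.log(prior) + log_likelihood(on_base, pa, P_ABOVE)
    log_below = math.log(1.0 - prior) + log_likelihood(on_base, pa, P_BELOW)
    # Bayes' rule, rearranged so only the difference of logs is exponentiated.
    return 1.0 / (1.0 + math.exp(log_below - log_above))


# Player -> (plate appearances, times on base) for 2011, from the table above.
season_2011 = {1: (49, 23), 2: (46, 16), 3: (38, 13)}

posteriors_2011 = {
    player: update(0.5, pa, on_base)
    for player, (pa, on_base) in season_2011.items()
}
for player, prob in posteriors_2011.items():
    print(f"Player {player}: P(above average) = {prob:.2f}")
```

With these assumed rates the sketch agrees with the table in direction: player 1 ends up above 0.5, and players 2 and 3 end up below it.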
 
Next I will take the 2011 probability of being above average as the new starting estimate and update it with the 2012 data.
 
Player    2011 Prob    PA    OB    OBA    Probability of being above average (2012)
1               0.88          58    23    0.397                         0.94
2               0.29          39    15    0.385                         0.38
3               0.29          21      5    0.238                         0.05
 
Based on the data from 2011 and 2012, and by applying Bayesian updating, I can safely say that player 1 is above average in terms of on-base average. Players 2 and 3, who had the same probability of being above average at the end of 2011, look quite different after 2012. Player 2 is now somewhat more likely to be above average than after 2011 (0.38 versus 0.29), while player 3 is very likely to be below average (0.05).
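
The season-by-season chaining can also be sketched in Python, this time in odds form. As a labeled assumption, the two hypotheses are again given the hypothetical on-base rates `P_ABOVE` and `P_BELOW`, which do not come from the analysis above, so the printed figures need not match the tables. The sketch illustrates a general property of this kind of model: updating on 2011 and then on 2012 gives the same answer as a single update on the combined data.

```python
# Hypothetical on-base rates under each hypothesis (illustration only).
P_ABOVE = 0.423
P_BELOW = 0.323


def likelihood_ratio(pa, on_base):
    """How much more likely the data are if the player is above average."""
    outs = pa - on_base
    return (P_ABOVE / P_BELOW) ** on_base * ((1 - P_ABOVE) / (1 - P_BELOW)) ** outs


def update(prior, pa, on_base):
    """Posterior odds = prior odds * likelihood ratio, converted back
    to a probability. The prior must lie strictly between 0 and 1."""
    odds = prior / (1 - prior) * likelihood_ratio(pa, on_base)
    return odds / (1 + odds)


# Player -> (plate appearances, times on base), from the tables above.
season_2011 = {1: (49, 23), 2: (46, 16), 3: (38, 13)}
season_2012 = {1: (58, 23), 2: (39, 15), 3: (21, 5)}

sequential = {}
pooled = {}
for player in season_2011:
    pa_11, ob_11 = season_2011[player]
    pa_12, ob_12 = season_2012[player]
    after_2011 = update(0.5, pa_11, ob_11)             # 2011 update
    sequential[player] = update(after_2011, pa_12, ob_12)  # then 2012
    pooled[player] = update(0.5, pa_11 + pa_12, ob_11 + ob_12)
    print(f"Player {player}: sequential {sequential[player]:.3f}, "
          f"pooled {pooled[player]:.3f}")
```

Because the two routes agree, the order in which the seasons arrive does not matter under this model; only the total evidence does.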
