Friday, April 06, 2012

How stable is a baseball player's talent? Part II

David Justice had a pretty good year in 2000, hitting .286 with 41 home runs for an OPS of .961. Here's one of his splits from that season:

         AB    H   2B   3B   HR   RBI   BB    K    AVG     OPS
--------------------------------------------------------------
One     289   67   15    0   13    45   39   52   .232   0.742
Two     235   83   16    1   28    73   38   39   .353   1.230
--------------------------------------------------------------


That's a pretty significant split -- bigger than his home/road or his day/night. Line 1 is not that great, but Line 2 is MVP material.

What is it? The first line is how he hit on even days of the month. The second is how he hit on odd days of the month.

There's actually nothing special about day of the month, or about David Justice in particular. I ran splits for every player-season from 1974 to 2009, and this was one of the biggest, so I just thought I'd show it.

------

Anyway, the reason I did this was because of a suggestion from Tango. In the previous post, I showed a method where we could estimate how much a player's talent in batting average fluctuates from year to year. (The estimate we got was that the SD of talent changes is around .010.)

Tango asked me to repeat the analysis for OBP, K rate, and BB rate. For OBP, it was .0136. For K, it was .0161 (that's strikeouts per plate appearance). And, for BB, it was .0161. I posted those results at his blog.

Those were larger than Tango thought, and so he wondered if it could just be random, and suggested the odd/even analysis. If the method was correct, then, for the odd/even analysis, the same method should give us a talent change of close to zero.

So I did it.

For every player-season from 1974 to 2009, where the player had at least 200 AB on both odd and even days of the month, I calculated the binomial Z-score of the difference between his odd performance in OBP, and his even performance in OBP.
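Here's a minimal sketch of that Z-score calculation. It assumes the standard two-proportion Z-score with a pooled on-base rate, which may not be exactly the formula behind the numbers in this post; the function name and the sample player are hypothetical.

```python
import math

def binomial_z(on_base_odd, pa_odd, on_base_even, pa_even):
    """Z-score of (odd-day OBP - even-day OBP) under a pooled binomial model.

    Assumption: both halves share one true on-base rate, estimated by
    pooling them. This is the textbook two-proportion test, not
    necessarily the exact formula used for the results in this post.
    """
    obp_odd = on_base_odd / pa_odd
    obp_even = on_base_even / pa_even
    pooled = (on_base_odd + on_base_even) / (pa_odd + pa_even)
    # Binomial variance of the difference between two independent rates
    variance = pooled * (1 - pooled) * (1 / pa_odd + 1 / pa_even)
    return (obp_odd - obp_even) / math.sqrt(variance)

# Hypothetical player: on base 100 times in 282 odd-day PA, 90 times in 282 even-day PA
z = binomial_z(100, 282, 90, 282)
print(round(z, 2))
```

A positive Z means the player did better on odd days; swapping the two halves just flips the sign.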

Then, I took the mean and SD of those Z-scores. If everything was just random, we would get a mean of zero, and an SD of 1.00.

It was close. The mean was -.009, and the SD was 1.02.
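To see what "just random" looks like, here's a rough simulation sketch. It gives every simulated player the same true OBP on odd and even days, so there's no talent or circumstance difference at all. The .330 OBP, the 5,000 players, and the seed are all assumptions for illustration, and the Z-score is the pooled two-proportion version, which may differ from the exact calculation used here.

```python
import math
import random
import statistics

def simulate_null_z_scores(n_players=5000, pa_per_half=282, true_obp=0.330, seed=1):
    """Mean and SD of odd/even Z-scores when both halves have identical talent."""
    rng = random.Random(seed)
    z_scores = []
    for _ in range(n_players):
        # Each plate appearance is an independent coin flip with the same OBP
        odd = sum(rng.random() < true_obp for _ in range(pa_per_half))
        even = sum(rng.random() < true_obp for _ in range(pa_per_half))
        pooled = (odd + even) / (2 * pa_per_half)
        # Binomial SD of the difference between the two halves
        sd_diff = math.sqrt(pooled * (1 - pooled) * 2 / pa_per_half)
        z_scores.append((odd - even) / pa_per_half / sd_diff)
    return statistics.mean(z_scores), statistics.pstdev(z_scores)

mean_z, sd_z = simulate_null_z_scores()
print(f"mean = {mean_z:.3f}, SD = {sd_z:.3f}")
```

The mean should land near zero and the SD near 1, although with 5,000 simulated players the SD will still wobble by roughly .01 from one seed to the next.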

By the same method used in the previous post, we can figure that the SD of "talent" changes between odd days and even days is .2 SD (the square root of 1.02 squared, minus 1 squared). With an average of 282 PA in each group, one SD of the binomial difference for a single player is .040. So, the talent change, in terms of OBP, is the product of the two, which is .008.
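The arithmetic in that paragraph can be sketched as follows. The .330 league OBP is my assumption for illustration (the paragraph only gives the resulting .040), and the function names are hypothetical.

```python
import math

def talent_sd_in_z_units(observed_sd_of_z):
    """SD left over after removing binomial luck (which has variance 1 in Z units)."""
    excess_variance = observed_sd_of_z ** 2 - 1.0
    # If the observed SD is below 1, the estimate is clamped to zero
    return math.sqrt(excess_variance) if excess_variance > 0 else 0.0

def binomial_diff_sd(obp, pa_per_half):
    """SD of the luck-only difference in OBP between two halves of equal size."""
    return math.sqrt(2 * obp * (1 - obp) / pa_per_half)

talent_z = talent_sd_in_z_units(1.02)        # about 0.2
luck_sd = binomial_diff_sd(0.330, 282)       # about .040, assuming a .330 OBP
talent_obp = talent_z * luck_sd              # about .008
print(round(talent_z, 2), round(luck_sd, 3), round(talent_obp, 3))
```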

Is .008 in the right range? We wouldn't expect it to be zero, because, even though the days are random, the circumstances aren't. It could be that, just by chance, a certain player had more home games in one group than the other, or faced better pitching, or had more day games, or more games in which the wind was blowing out, and so on.

But, my gut says that .008 is too large to be just the result of random circumstances. But is that really true? .008 is only two plate appearances in 250. Is it reasonable to expect that, typically, if you divide a season into two parts of 250 PA each, the random combination of home/day/baserunners/opposition pitching would lead to an expectation of one fewer out in one group, and one extra out in the next group? When I put it that way, it seems more reasonable ... but my gut still says it's too high.

However: there's a lot of randomness involved. The numbers varied quite a bit from year to year. Recall that the overall SD of the Z-scores, for all 36 seasons combined, was 1.02. Here are the first few single season numbers:

1974: 0.986
1975: 1.010
1976: 1.027
1977: 1.035
1978: 1.026
1979: 0.966
1980: 0.943
...

There are large differences between seasons. Indeed, for some seasons, the SD is less than 1, which really shouldn't happen for any reason except random chance. That is, there's no reason to expect that players' differences between odd and even should be less than if you just assigned plate appearances randomly.

If I continued the series all the way to 2009, we'd find that the SD of all those numbers is 0.051. Since there were 36 seasons in the study, which happens to be the square of exactly 6, the SD of the overall average is 1/6 of 0.051, which is around 0.0085.

The average of the 36 numbers was 1.0156, which is .0156 from 1.000. Since the SD is .0085, what we observed is 1.8 SD from 1.000.
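Those two steps, as a quick sketch. It treats the 36 single-season SDs as independent draws, which is what dividing by the square root of 36 assumes; the variable names are mine.

```python
import math

season_count = 36
sd_across_seasons = 0.051      # SD of the 36 single-season values
mean_of_season_sds = 1.0156    # average of the 36 single-season values

# Standard error of the average: SD divided by the square root of the count
standard_error = sd_across_seasons / math.sqrt(season_count)

# How far the observed average sits from 1.000, in standard errors
sds_from_one = (mean_of_season_sds - 1.0) / standard_error

print(round(standard_error, 4), round(sds_from_one, 1))   # 0.0085 1.8
```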

But, as I said, we shouldn't be expecting exactly 1.000, because the two groups are not actually identical in the circumstances of those plate appearances. I don't know how much greater than 1.000 we'd expect, though. It might be only a tiny bit.

(By the way, I apologize if all these different SDs are getting confusing. There's the SD of the difference in OBA for a single season, which we talked about in the previous post. There's the SD of the Z-scores for a single season's worth of players, which is the list of numbers above. And, now, there's the SD of all those SDs! Sorry about that.)

-----

The bottom line is: we get our estimate of the SD of "OBP talent change" (which actually includes circumstance change) between odd and even to be around .008. But, the standard error of that is so large, that it could be anywhere between -.008 and .025 -- or, actually, .000 and .025, because negative SDs don't exist.

So, what we learned from this exercise is that this method isn't all that precise, even with 36 seasons' worth of data to work with. That means that our previous estimate, that between-season talent in OBP varies with an SD of .0136, is similarly imprecise.

Maybe for next post, I'll repeat this "SD of the SD" analysis for the real season-to-season data, instead of this odd/even data, and see what the confidence intervals look like.




