Wednesday, January 12, 2011

Do players "steal" rebounding opportunities from teammates?

In my previous post on rebounding, I promised to review some of the evidence supporting the "diminishing returns" hypothesis. That's the theory that, in general, a player's individual rebound totals are mostly a product not of his own ability to snare rebounds, but of his positioning on the court, or the role he is assigned on his team.

That is: if a player gets a lot of rebounds, even adjusted for what position he plays, a large part of his total is rebounds that other players would have gotten to regardless.

Actually, instead of reviewing the evidence, I thought I'd just create my own. However, my numbers here follow on arguments made by others, in other places. (Here, for instance, is a comment by Guy listing four reasons why we should believe that "diminishing returns" is a real phenomenon.)

The easiest test, perhaps, is simply whether, when a player gets more rebounds, it means that his teammates get fewer. If they do, that's evidence that the player is simply grabbing balls his teammates would have gotten anyway. If not, that's support for David Berri's hypothesis, that differences in rebounds are the result of the talent of the particular player credited with the rebound.

So, here's what I did. Using every team in the 2008-2009 season, I ran a regression comparing the total rebounds by each team's centers (combined) to the total for the rest of the team (combined). I typed in the data manually from the team pages at the 82games.com website, because that's where I happened to find it broken down by position. (Strangely, basketball-reference.com doesn't do that.)

It's only one season's worth of data, but, still, the results are striking.

Every extra rebound by the center results in 0.61 fewer rebounds by teammates.

That means that only 39 percent of center rebounds are "real", in the sense that the other team would have got them if not for the center's efforts. If a center is (say) 20 rebounds above average for the season, then, on average, you should estimate that 12 of those rebounds would have been grabbed by a teammate anyway.


(For those of you who (unlike me) prefer correlation coefficients, the r was -.63, and the r-squared was .395.)
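For anyone who wants to try this at home, here is a minimal sketch of that kind of regression in Python. The five teams in it are made up for illustration (they are not the 82games.com numbers), and the helper name `slope_and_r` is just something invented for this example. The last two lines apply the post's own slope of -0.61 to show where "39 percent" and "12 of 20" come from.

```python
from math import sqrt

def slope_and_r(x, y):
    """Least-squares slope of y on x, plus the Pearson correlation."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx, sxy / sqrt(sxx * syy)

# Hypothetical rebounds per game: a team's centers vs. everyone else.
center = [10.5, 12.0, 13.5, 15.0, 16.5]
rest = [30.9, 29.2, 29.6, 27.8, 27.4]
slope, r = slope_and_r(center, rest)

# Applying the slope reported in the post:
post_slope = -0.61
print(round(1 + post_slope, 2))    # 0.39 of each center rebound is new to the team
print(round(20 * -post_slope, 1))  # 12.2 of 20 extra rebounds come from teammates
```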

There's a possible alternative explanation for this: it might be that teams just don't want to acquire too much rebounding talent. So, if they get a center who's good at rebounding, they'll get below-average rebounders at the other positions. Still, even that possibility supports the "diminishing returns" hypothesis. Because, why else would teams limit their rebounders? Baseball teams don't limit their home run hitters ... if basketball teams try to cap the amount of rebounding talent on their team, it must be because they realize that the additional rebounders would be somewhat wasted.

And, in any event, I suspect the effect is much, much too strong to be the result of general manager choice. Here are some other ways of looking at the strength of this finding:

1. Let's convert to a won-loss record.

The center is either above or below average for the league. And the rest of the positions, combined, are either above or below average for the league. Suppose you call it a "win" (for the David Berri theory) when both tendencies are the same (both are above or below average). And suppose you call it a "loss" when the tendencies are different (centers above average and the rest of the team below average, or vice versa).

In that case, the Berri hypothesis goes 7-23.

In extreme cases, it does even worse. Breaking the 7-23 record down into how far centers were from the average:

4-9: Teams' centers within 1 Reb/G of average
3-7: Teams' centers between 1 and 2 Reb/G from average
0-7: Teams' centers more than 2 Reb/G from average

This is exactly what the "diminishing returns" hypothesis predicts. The farther the center is above (below) average, the less likely his teammates are to also be above (below) average.
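The tally itself is simple enough to sketch in a few lines of Python. The deviations below are hypothetical, not the real 2008-2009 figures, and the function name `berri_record` is made up for this example; a team sitting exactly at average on either side would be counted as neither a win nor a loss.

```python
def berri_record(center_dev, rest_dev):
    """Count teams where both deviations share a sign (win) or differ (loss)."""
    wins = sum(1 for c, r in zip(center_dev, rest_dev) if c * r > 0)
    losses = sum(1 for c, r in zip(center_dev, rest_dev) if c * r < 0)
    return wins, losses

# Hypothetical deviations from league average, in rebounds per game.
center_dev = [2.4, -0.6, 1.3, -2.1, 0.4]
rest_dev = [-1.9, -0.2, -0.8, 1.5, -0.3]
print(berri_record(center_dev, rest_dev))   # (1, 4)

# The post's three buckets do add up to the overall 7-23 record.
buckets = [(4, 9), (3, 7), (0, 7)]
print(sum(w for w, _ in buckets), sum(l for _, l in buckets))   # 7 23
```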

2. As others have pointed out before, teams are surprisingly close to each other in total rebounds.

So close, in fact, that it leads to this shocking result:

The variance in team rebounds per game is LESS than the variance in rebounds per game by the center alone!

Think about that, how unusual that is. It doesn't work for many other player skill stats in the world of team sports, does it?

Suppose I ask you to predict how many home runs a random full-time player hit last year. You might guess, say, 18. You'd be off by 18 if the guy hits zero, or you may be off by 36 if the guy turns out to be Jose Bautista. But 36 is your worst-case scenario.

Now, suppose I ask you to predict how many home runs a random TEAM will hit next year. You guess 154, which was the average in 2010. Now, you're in worse shape. The Blue Jays hit 257, and the Mariners hit only 101. The worst case is now 103 for a team, almost triple the worst case of 36 for a player. Instead of -18 to 36, the range of error is now -53 to 103. The variance, and the margin of error, is much higher for a team.

But for NBA rebounds, it's the other way around -- it's actually easier to predict rebounds for a random team than for a random team's centers!

-- For centers, the SD was 1.63, and the range from worst to best was 7.4 (10.3 to 17.7).
-- For teams, the SD was only 1.43. And the range was only 5.0 (38.8 to 43.8).

This piece of evidence is so surprising, and so strong, that it should be enough, by itself, to convince you that something's going on. The SD of rebounds by centers is higher than the SD of rebounds by their teams. That's it in one sentence.
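Here is why that one sentence is so telling, as a rough Python sketch. It relies on the standard identity Var(team) = Var(center) + Var(rest) + 2 * Cov(center, rest), and plugs in the two SDs quoted above. The variable names are invented for the example.

```python
sd_center = 1.63   # SD of center rebounds per game, from the post
sd_team = 1.43     # SD of team rebounds per game, from the post

var_center = sd_center ** 2
var_team = sd_team ** 2

# Var(team) = Var(center) + Var(rest) + 2 * Cov(center, rest), so:
rest_var_plus_2cov = var_team - var_center

# Var(rest) cannot be negative, so the covariance is at most half of that.
max_cov = rest_var_plus_2cov / 2

print(round(var_center, 2), round(var_team, 2))   # 2.66 2.04
print(round(max_cov, 2))                          # -0.31
```

In other words, the team can only vary less than its own centers if center rebounds and teammate rebounds move in opposite directions.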


3. There are five positions on a basketball team. In terms of rebounds, every one of those five positions had a negative correlation with the rest of their team. That's 5-0 for the diminishing returns hypothesis.

There are also 10 different pairs of two positions. If you run the correlation between rebounds for those 10 pairs, eight of them come out negative. So that's 8-2.

(The two positive ones were (1) shooting guard vs. small forward (r = .312; every additional rebound by the small forward leads to .31 more rebounds by the shooting guard), and (2) shooting guard vs. power forward (r = .024; every additional rebound by the power forward leads to .01 more rebounds by the shooting guard).)


4. There was a very, very interesting result for point guard vs. the rest of the team.

Every additional rebound taken by the point guard reduces the rest of the team's rebound total by ... almost exactly one rebound: .96, to be exact.

Taken at face value, that means that 24 out of 25 times, when a point guard gets a rebound, it's because the rest of the team deliberately leaves it for him!

DSMok1 posted this comment last week:

"For a significant number of defensive rebounds, there are multiple defensive players present for the rebound (could get the rebound), while the offense has already cleared out to cut off the fast break. These rebounds do not show value or skill to the player who gets them, but are rather a random/confounding variable. For some teams, their center will grab such "garbage" rebounds. For other teams, maybe the PG will grab them himself (I see OKC and Russell Westbrook do this)."


That's consistent with the data. If the PG only gets the rebound when the entire offense has cleared out, it means that there can't be any value added.

I don't know if this is true or not -- I really don't watch a lot of basketball -- but I find it interesting that the regression and DSMok1 are saying exactly the same thing, only several days apart. Admittedly, it could be just random error ... you'd want to check this for other years to make sure.


5. As I said, every position had a negative correlation with the other positions on the team. Here they are. (UPDATE, 1/23: I realized I accidentally left out one team, and entered one team twice. Corrections have been made below.)

-- PG: every extra rebound reduces his teammates' rebounds by 0.87 (originally 0.96).
-- SG: every extra rebound reduces his teammates' rebounds by 0.64 (originally 0.65).
-- SF: every extra rebound reduces his teammates' rebounds by 0.73 (unchanged).
-- PF: every extra rebound reduces his teammates' rebounds by 0.68 (originally 0.63).
--- C: every extra rebound reduces his teammates' rebounds by 0.69 (originally 0.65).

------

So, there you go: about 2/3 of marginal rebounds are taken away from a teammate.

Actually, it's worse than that! I'm not sure how much worse, but it's worse.

Why? Because the number of defensive rebounds available is mostly a function of your opponent's missed shots. The more missed shots, the more defensive rebounds are available. So your defensive rebounds depend, in part, on how good your team defense is.

If you have a good defense, and more rebounds are available than usual, you'd expect all your players' rebounds to go up above average together. Same, in reverse, if you have a bad defense.

The same is true for offensive rebounds. The worse your shooting, the more offensive rebounds come available, and vice-versa. And you'd expect all five of your players to get more rebounds, to some extent. So, again, rebounding moves together.

And, finally, there's another factor that should cause a certain amount of positive correlation -- pace. Teams that play faster or slower will see their rebound totals rise or fall in unison.

What all that means is that if the "diminishing returns" factor were zero, you'd have a *positive* correlation between teammates' rebounds. The fact that the actual correlation is negative means that the underlying effect has to be more negative than it looks -- it has to first overcome the positive correlation caused by team defense, shooting, and pace.

Does that make sense?

Look at it this way. The defense, shooting, and pace correlations are like gravity, pushing rebounds towards a positive relationship. For the "stealing rebounds" factor to turn that negative *against gravity* means that the negative is a bigger factor than it looked.
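If it helps, here is a toy illustration of the "gravity" idea in Python. Everything in it is hypothetical: five made-up teams, a made-up number of rebounds available to each, and a made-up share going to the center. It is only meant to show the direction of the effect, not its size.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def split(available, center_share):
    """Return (center rebounds, teammates' rebounds) for each team."""
    center = [a * s for a, s in zip(available, center_share)]
    rest = [a - c for a, c in zip(available, center)]
    return center, rest

# "available" = rebounds a team collects per game (defense, shooting, pace);
# "share" = fraction of them that go to the center.
same_share = [0.30] * 5
varied_share = [0.26, 0.34, 0.28, 0.32, 0.30]
varied_available = [39.0, 40.0, 41.0, 42.0, 43.0]
flat_available = [41.0] * 5

r_gravity = pearson_r(*split(varied_available, same_share))   # only gravity varies
r_stealing = pearson_r(*split(flat_available, varied_share))  # only stealing varies
r_both = pearson_r(*split(varied_available, varied_share))    # both at once
print(round(r_gravity, 2), round(r_stealing, 2), round(r_both, 2))
```

With only gravity, the correlation is perfectly positive; with only stealing, perfectly negative. With both at work, the correlation is still negative but weaker than the pure stealing case, which is the sense in which the observed number understates the effect.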

-----

Since I have no idea how to compensate for the "gravity" effect, let's ignore it for now, and stick with the 2/3 estimate. Assume that a typical player rebound takes 2/3 of a rebound away from a teammate, which means that only 1/3 of all individual rebounds are really individual. What does that mean for player evaluation?

The first instinct is to take every rebound, give 1/3 of the credit to the player who grabbed it, and spread the remaining 2/3 among the rest of the players.
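As a rough sketch of what that first instinct would look like in Python: the lineup below is hypothetical, the function name `share_credit` is invented, and splitting the 2/3 *evenly* among the four teammates is an assumption made for simplicity (nothing above says how it should be spread).

```python
def share_credit(rebounds, own_share=1/3):
    """Credit each player with own_share of his rebounds, and split the
    remainder of every rebound evenly among his teammates (an assumption)."""
    n = len(rebounds)
    credit = {}
    for player, own in rebounds.items():
        by_teammates = sum(r for p, r in rebounds.items() if p != player)
        credit[player] = own_share * own + (1 - own_share) * by_teammates / (n - 1)
    return credit

# Hypothetical rebounds per game for one lineup.
lineup = {"PG": 3.0, "SG": 4.0, "SF": 5.0, "PF": 8.0, "C": 12.0}
credit = share_credit(lineup)
for player, value in credit.items():
    print(player, round(value, 2))
```

The team total is unchanged; the credit is just flattened toward the team average.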

Would that work? Well, it's better than giving the entire rebound to one player. But it still has a big problem.

And that problem is: the 2/3 figure is not random, nor is it evenly distributed. It varies by player and team.

There are some centers who deliberately let their teammates have more rebounds. There are others who deliberately take rebounds away from their teammates (not necessarily due to selfishness, by the way -- it could be the coach's strategy). If you just give centers 1/3 of their marginal rebounds, some centers will be regularly overestimated, and some will be regularly underestimated. Because it's not random, it won't even out over a career.

In his "Win Shares" baseball book, Bill James talked about assists by first basemen. When fielding a ground ball, some 1Bs would almost always take it themselves (Steve Garvey), and some would always toss it to the pitcher (Bill Buckner). As a result, Buckner would always wind up with more assists than Garvey. But it doesn't mean he was a better fielder.

Suppose that exactly half of first basemen tossed to the pitcher, and half of them didn't. If you just regress all of them by 50%, you may be closer, but you're still wrong. You want to regress Buckner 100%, and Garvey 0%, in order to get an accurate rating.

The same applies here. When you have two players who are above average in rebounding, how do you know which one is above average because he's really, really good at getting to the ball before the opposition, and which one is above average because he's just taking the easy ones away from his teammate?

One suggestion is, don't even try -- just give all the credit for rebounds to the defense. The problem with that, of course, is that the legitimately great rebounders no longer get credit.

Another suggestion is: use subjective evaluations. What do NBA observers think about who the best rebounders are? Combine that information with the numbers, and try to work out "custom" estimates for each player that still add up to the team's overall performance. The problem with that is that even the "experts" are often wrong about these things, both because they're not perfect, and because they almost certainly let the raw numbers influence their evaluations.

Still, there's probably *some* value there ... everyone was able to tell that Brooks Robinson and Ozzie Smith were great baseball fielders, even without statistics. So it should also be possible to tell, just from observation, who the best rebounders are. But, then, there's also the case of Derek Jeter, who won a lot of Gold Glove awards despite the objective statistics pointing to him as among the worst-fielding shortstops in baseball. So, you have to be careful.

-----

I don't have a good solution here. Obviously, it would be best if there were a way to figure the best rebounders objectively, using the evidence of the statistical record. It would be great if we could just take David Berri's stat, and adjust it a little, and wind up with the right answer.

But I don't see how it's possible, given the limitations of the data, to tell the talented from the "stealers."


In that light, we should maybe just admit our ignorance, for now, and treat rebounding as something that has to be evaluated subjectively. Lump it in with defense, as a team measure, while keeping in mind that individual rebounding skill does exist and needs to be considered for adjusting individual players when appropriate.

It's not a perfect solution, but it's obviously much, much better than giving the entire statistical credit to one player.

---

UPDATE: Eli Witus' excellent 2008 post (and others) point out that diminishing returns are much lower for offensive rebounds than for defensive. Does anyone know of a good source for ORB and DRB breakdowns by team by position, so I can rerun the analysis for them separately?




21 Comments:

At Wednesday, January 12, 2011 7:31:00 AM, Anonymous Anonymous said...

To handle the "gravity" you could use rebounding percentage. It's a stat available at most of the sites with more advanced stats. It adjusts for pace and shooting since it only measures the amount of rebounds grabbed from the available opportunities. When comparing this stat you can clearly see that it differs from rebounds per game.

 
At Wednesday, January 12, 2011 7:39:00 AM, Blogger Phil Birnbaum said...

Right, that would work. The 82games.com site I used for data doesn't have rebounding percentage broken down by position, but it's probably available somewhere else.

 
At Wednesday, January 12, 2011 9:16:00 AM, Anonymous Anonymous said...

I can't put my finger on it, but I think "positionality" is throwing something off here, too. What if, instead of going by "Centers", you ran the study with only three positions: "Point Guards", "Wings", and "Bigs"?

Is Tim Duncan a center or a forward? He's neither. He's a Big. There is little, if any, differentiation on the floor between what a Center and Power Forward do anymore.

-- David A.

 
At Wednesday, January 12, 2011 9:36:00 AM, Anonymous DSMok1 said...

My goodness, Phil. This is brilliant work! Your discussion of team variance vs. position variance basically cinches the matter, in my opinion.

I second "Anonymous" that rebound percentage is probably the way to look at this. I also agree with David A. that positionality is probably a bit of an issue.

Have you ever looked at Eli Witus's seminal work on this subject? It's at http://www.countthebasket.com/blog/2008/02/05/diminishing-returns-and-the-value-of-offensive-and-defensive-rebounds/ .

 
At Wednesday, January 12, 2011 9:53:00 AM, Anonymous DSMok1 said...

Sorry, that was just his first article on the subject. Eli's really good one looks at projected RB% based on the RB% of the players and compares whether they are truly additive. That's at http://www.countthebasket.com/blog/2008/02/23/more-diminishing-returns/

 
At Wednesday, January 12, 2011 10:08:00 AM, Anonymous EvanZ said...

Phil, hoopdata has total rebounds and %'s by position.

I can also give you rebound totals and "missed rebound" totals for every player for last season (and this season) by position in a .csv format.

Are you planning to look at offensive versus defensive rebounds? Diminishing returns are thought to be less on offense, so it would make sense to break it out.

 
At Wednesday, January 12, 2011 11:54:00 AM, Blogger Phil Birnbaum said...

DSMok1, I'm familiar with some of Eli's work ... will check out those links.

Evan: Will check out the hoopdata numbers, thanks. Would be very pleased to rerun this for offensive and defensive rebounds separately if the data is available. And, um, forgive my substantial basketball ignorance, but what's a missed rebound?

 
At Wednesday, January 12, 2011 11:57:00 AM, Anonymous EvanZ said...

Phil, you may be interested in how I currently distribute credit for rebounds.

It starts with considering defense. When a player misses a shot, he is debited (PPP*DRR). PPP is the league-average points scored per possession (currently ~1.06). DRR is the league average defensive rebounding rate (currently ~0.74). The defense is credited with the same amount split among teammates or all to the individual in the case of a block. That leaves 26% of the credit for the rebounder. The credit that the defense gets for the defensive rebound is then also debited from the offense (by the position counterpart).

Using the same logic for an offensive rebound, the offensive rebounder gets credited PPP*DRR (thus, getting 74% of the credit), and the defense is debited by the same amount for "missing" the rebound.

Your finding of 1/3 credit to the rebounder is more than I currently give the defensive rebounder, but quite a bit less than I give the offensive rebounder. Since the offensive rebound generally does tend to be the work of one player (and not the team), it makes sense to me that it would be worth quite a bit more than the defensive rebound.

At any rate, it's nice to see your empirical findings lend some support to theoretically-supported choices Hollinger, Rosenbaum, and others, including myself have made. Thanks, for the great work!

 
At Wednesday, January 12, 2011 12:02:00 PM, Anonymous EvanZ said...

Phil, I consider every rebound a "missed" rebound by the other team. If you think about the offensive rebounding rate, for example,

ORR = ORB/(ORB+DRB_opp)

where ORB is the number of offensive rebounds the team acquired, and DRB_opp is the number of defensive rebounds the opponent acquired.

The same logic can be used at the player level:

ORR=ORB/(ORB+DRB_cp)

where DRB_cp would be the number of defensive rebounds the position-counterpart acquired. Same thing can be done for defensive rebounding rates (DRR). This enables us to take into account the pace and defense/offense inequalities among teams.
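A minimal Python sketch of the two formulas above, for illustration only. The function name `rebound_rate` and the season totals are hypothetical; the only thing taken from the comment is the form own / (own + opponent's).

```python
def rebound_rate(own_rebounds, opponent_rebounds):
    """Share of contested rebounds collected: own / (own + opponent's)."""
    return own_rebounds / (own_rebounds + opponent_rebounds)

# Hypothetical season totals.
team_orr = rebound_rate(own_rebounds=900, opponent_rebounds=2500)    # ORB, DRB_opp
player_orr = rebound_rate(own_rebounds=200, opponent_rebounds=450)   # ORB, DRB_cp
print(round(team_orr, 3), round(player_orr, 3))
```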

 
At Wednesday, January 12, 2011 12:04:00 PM, Blogger Phil Birnbaum said...

EvanZ: Right, I've seen some work that shows offensive rebounds have much less overlap than defensive rebounds. Your numbers make sense ... if a DReb is worth 0.26, and an OReb is worth 0.74, the average rebound is about 0.4, which is close to the .33ish that my little study found.

OK, now I see what you mean by "missed rebounds".

Just took a quick look at the Eli Witus link you sent ... I *had* seen that when it came out three years ago, and I'd forgotten. I'll take a closer look when I have more time, and maybe update my post to link to it.
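One way to reproduce the "about 0.4" arithmetic, sketched in Python. The weighting is an assumption: it supposes 74% of rebounds are defensive and 26% offensive, matching the league-average rate EvanZ quoted. The comment itself does not say how the average was taken.

```python
drr = 0.74                 # league-average defensive rebounding rate (EvanZ's figure)
dreb_credit = 1 - drr      # rebounder's share of a defensive rebound
oreb_credit = drr          # rebounder's share of an offensive rebound

# Assumption: weight each type by how often it occurs.
weighted = drr * dreb_credit + (1 - drr) * oreb_credit
print(round(weighted, 3))  # 0.385
```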

 
At Wednesday, January 12, 2011 12:06:00 PM, Blogger Phil Birnbaum said...

One question about missed rebounds ... for individuals, why do you use the other team's position counterpart? Is it because they normally wind up in the same space, and normally compete for the same balls?

 
At Wednesday, January 12, 2011 12:22:00 PM, Anonymous EvanZ said...

Phil, the idea to use position counterpart is simply that it is the most fair way I can think of doing it. Originally, I used league average rebounding rates at each position. Turns out there is a high correlation between the two approaches, anyway:

http://thecity2.com/2011/01/10/adding-counterpart-rebounds-to-ezpm/

Here is a spreadsheet with the rebounding data (among a lot of other stuff) for 2009-2010. ORB and DRB are rebounds acquired by the player. ORBM and DRBM are rebounds acquired by the position counterpart. There is also listed team rebounds acquired (and missed). Hope this helps:

https://spreadsheets.google.com/ccc?key=0Al6a2ecvJfTidGRUYVJJVGdrb2I4RWdPVHo4dDZWZFE&hl=en&authkey=CLK0mIoL

 
At Wednesday, January 12, 2011 3:33:00 PM, Blogger Phil Birnbaum said...

Update: changed one sentence in the post to this:

"Every additional rebound taken by the point guard reduces the rest of the team's rebound total by ... almost exactly one rebound: .96, to be exact."

Previously, I accidentally and incorrectly had "point guard" and "rest of the team" reversed.

 
At Thursday, January 13, 2011 10:37:00 AM, Anonymous Guy said...

It will be interesting to see the data broken down by oreb and dreb. However, I do think you will find some significant (though weaker) diminishing returns on orebs. Eli found the player SD for oreb% to be about .024 (controlling for position). However, the team SD is also about .024. If players' oreb% was independent of teammates (and I've done my math right), then we should expect a much larger team SD of about .054. So that implies a non-trivial level of DR on offense as well.
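The .054 figure can be reproduced with a short Python check, assuming the calculation was the usual one for independent quantities: variances add, so the SD of a sum of five equally variable, independent players is the square root of 5 times the player SD. That reading of the math is an assumption, though it does land on the same number.

```python
from math import sqrt

player_sd = 0.024       # SD of player oreb%, as quoted in the comment
players_on_court = 5

# Independent variances add, so the SD of the sum grows with sqrt(n).
team_sd_if_independent = sqrt(players_on_court) * player_sd
print(round(team_sd_if_independent, 3))   # 0.054
```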

 
At Thursday, January 13, 2011 10:56:00 AM, Anonymous DSMok1 said...

Phil, I've compiled the data for this year, for ORB%, DRB%, and TRB%. Some players play multiple positions, so I estimated playing time at each position and split their numbers accordingly.

The data is here: Excel Spreadsheet.

The results are rather interesting--nearly 0 correlation between PG, SG, SF, and team-level results. It looks like there's a bunch of noise, though.

 
At Thursday, January 13, 2011 11:40:00 AM, Blogger Phil Birnbaum said...

DSMok,

It looks like you're correlating the C to the entire team *including* the C. You need to correlate the C to the rest of the team *excluding* the C.

That would explain the low/positive correlations ...
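A toy Python example of why including the center in the team total matters. The numbers are hypothetical: five made-up teams where the rest of the team loses exactly half a rebound for every extra center rebound.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

center = [10.0, 12.0, 14.0, 16.0, 18.0]
rest = [32.0, 31.0, 30.0, 29.0, 28.0]    # drops 0.5 per extra center rebound
team = [c + r for c, r in zip(center, rest)]

print(round(pearson_r(center, rest), 2))   # -1.0
print(round(pearson_r(center, team), 2))   # 1.0
```

The same data gives a perfectly negative correlation against the rest of the team and a perfectly positive one against the team total, because the center's own rebounds are sitting on both sides of the second comparison.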

 
At Thursday, January 13, 2011 1:18:00 PM, Blogger Phil Birnbaum said...

DSMok1,

I couldn't resist, so I did one of the 10 corrected regressions from your spreadsheet. For OREB, the slope was -0.709 for centers, almost exactly as predicted. The correlation was -0.7.

I can do the rest too if you'd like ...

 
At Thursday, January 13, 2011 4:09:00 PM, Anonymous DSMok1 said...

Go ahead. My mistake on the correlation! I'm still learning these things... I just wanted to get you the data and let you go to work.

 
At Thursday, January 13, 2011 6:12:00 PM, Blogger Phil Birnbaum said...

Just put up a new post, analyzing DSMok1's numbers.

 
At Friday, January 14, 2011 2:31:00 PM, Anonymous Anonymous said...

I'm sort of shocked by the result showing that PGs are taking rebounds away from the rest of the team more often than the other way around.

In my day to day observation I see many occasions where the PG gets "long" rebounds off missed 3 pointers and other long jumpers and none of the bigs were in a position to get it.

So I always assumed that the value of rebounds by PGs was greater than for Cs.

 
At Monday, January 17, 2011 9:28:00 AM, Blogger G Wolf said...

Basketball-Value.com has regression-adjusted rebounding numbers for on/off. For instance, check out the last two sets of columns here: http://basketballvalue.com/topplayers.php?year=2010-2011

 
