Statistics in NBA
Ming Luo

Data: I downloaded the statistics of the 5 most important players on each of the 30 NBA teams from the official NBA website. For each team, the chosen players are 2 guards, 2 forwards and 1 center. Each player is described by 19 attributes.

1. How to evaluate a player in NBA?

Almost all NBA fans talk about a player’s PPG (points per game). But is it the best way to evaluate a player’s value? The answer is no, or at least probably not.

First, you must also consider his other contributions, such as rebounds, assists, steals and blocks.

Second, you must penalize his missed shots and turnovers. The official NBA site provides a comprehensive measure called “Efficiency”, which is calculated by the following formula:

Efficiency = (Points + Rebounds + Assists + Steals + Blocks) - ((Field Goals Attempted - Field Goals Made) + (Free Throws Attempted - Free Throws Made) + Turnovers)

Third, even if you consider only points, PP48M (points per 48 minutes) is a better measure than PPG, because a player should not be held responsible for the time he is not on the court. For example, in one game S. O’Neal was injured in the second minute and left the court. He should be credited with 0 points in 2 minutes, not 0 points in one game.

Figure 1 shows the top 10 players by efficiency together with their PPG. Figure 2 shows the top 10 players by PP48M together with their PPG. Note: one basketball game lasts 48 minutes.

[Bar chart. X-axis: Player (B. Miller, Webber, Bryant, S. O’Neal, Marion, Duncan, A. Stoudemire, Nowitzki, L. James, Garnett). Series: Efficiency, Points per game (PPG).]
Figure 1: Efficiency vs. Points per game (PPG)

From Figure 1, we can see that efficiency and PPG give different pictures. Garnett (last year’s MVP) has a far higher efficiency than all the other players, and he is widely regarded as the most versatile player in the league. B. Miller is not usually counted among the league’s stars, but his efficiency suggests that he is underrated.
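The efficiency formula and the per-48-minute scaling described in this section can be sketched in a few lines of Python. This is only an illustration: the function names and the sample stat line are invented here and do not come from the downloaded data.

```python
def efficiency(pts, reb, ast, stl, blk, fga, fgm, fta, ftm, tov):
    """Efficiency as defined in the text: contributions minus misses and turnovers."""
    positives = pts + reb + ast + stl + blk
    penalties = (fga - fgm) + (fta - ftm) + tov
    return positives - penalties


def points_per_48(points, minutes):
    """Scale points to a full 48-minute game; None if the player did not play."""
    if minutes <= 0:
        return None
    return points * 48 / minutes


# Hypothetical single-game stat line, invented for illustration only.
game = dict(pts=24, reb=10, ast=5, stl=2, blk=1,
            fga=18, fgm=9, fta=8, ftm=6, tov=3)

# (24 + 10 + 5 + 2 + 1) - ((18 - 9) + (8 - 6) + 3) = 42 - 14
print(efficiency(**game))

# The injury example from the text: 0 points in 2 minutes.
print(points_per_48(0, 2))

# Hypothetical reserve who scores 11 points in 24 minutes.
print(points_per_48(11, 24))
```

The last call shows why a reserve with a modest PPG can still rank high on PP48M: 11 points in half a game scales to 22 points per 48 minutes.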
Although Bryant has a high PPG, his efficiency is less impressive. His field goal percentage is only 0.411, much lower than the 0.599 of his rival S. O’Neal.

[Bar chart. X-axis: Player (Iverson, Nowitzki, Bryant, S. O’Neal, B. Gordon, Arenas, Carter, L. James, J. O’Neal, Stoudemire). Series: Points per 48 minutes (PP48M), Points per game (PPG).]
Figure 2: Points per 48 minutes (PP48M) vs. Points per game (PPG)

From Figure 2, we can see that PP48M is always higher than PPG, because no player is on the court for the whole game. The player rankings by PP48M and by PPG also differ. An interesting case is B. Gordon, whose PPG is very low but whose PP48M is very high. He is a rookie this year and has not yet earned much trust from his coach: in his 51 games he has started only 3 times, and he averages only 23.2 minutes per game. Nevertheless, he ranks among the top ten players by PP48M, which means he scores very efficiently.

2. Correlations between data

I also looked into the correlations between the data (attributes). The data are normalized column by column. Figures 3 and 4 show the results.

Figure 3: Correlations between data (the 3 most positively correlated pairs)

The top three positively correlated pairs of attributes are:

1) Total rebounds per game (TRPG) vs. Defensive rebounds per game (DRPG)
The score is 0.965. This is straightforward: in general, a player with many defensive rebounds will also have many total rebounds.

2) PP48M vs. PPG
The score is 0.877. It shows that PPG is a good approximation of PP48M.

3) Total rebounds per game (TRPG) vs. Offensive rebounds per game (ORPG)
The score is 0.869. This is as straightforward as 1).

What is interesting is why DRPG is more strongly correlated with TRPG than ORPG is. The reason is that, in general, a defensive rebound is much easier to get than an offensive one, and players collect more defensive rebounds than offensive rebounds.
The number of defensive rebounds therefore weighs more heavily than the number of offensive rebounds in determining a player’s rebounding ability.

Figure 4: Correlations between data (the 3 most negatively correlated pairs)

The top three negatively correlated pairs of attributes are:

1) ORPG vs. 3-point percentage (3-point goals made / 3-point goals attempted)
The score is -0.633. In general, a player with strong rebounding ability (a center or power forward) is a weak 3-point shooter, and vice versa.

2) Blocks per game vs. Free throw percentage
The score is -0.503. This is an interesting result. A plausible explanation is that the best shot blockers are mostly centers, who are worse free-throw shooters than guards, and guards rarely block shots. For example, 12 of the top 14 shot blockers are centers, and 9 of the top 14 free-throw shooters are guards.

3) 3-point percentage vs. Field goal percentage
The score is -0.486. This result seems strange at first glance, but the data explain it. In the official NBA statistics, if a player never attempts a 3-pointer, his 3-point percentage is recorded as 0. (Strictly speaking, it should be undefined.) As a result, many centers and power forwards have a 3-point percentage of 0. On the other hand, centers and power forwards generally have a better field goal percentage than players at other positions, because they shoot from closer to the basket. Their data therefore combine a low 3-point percentage with a high field goal percentage, and this appears to have a large effect on the overall correlation.

3. Player clusters

I clustered the players based on their statistics. The hierarchical result is very interesting and surprisingly meaningful. The following is what I saw step by step as I moved the “Minimum similarity” threshold bar from top to bottom in HCE 3.0. Figure 5 illustrates the clusters after step 5).

Figure 5: Player clusters (labeled clusters: Poor guys, Center/Power forward, Guard/Small forward, Monsters, Almighty guards)

1) The first player to stand out is B. Wallace, who is detected as the most unique (isolated) one.
This is very reasonable. B. Wallace is the leader of Detroit, last year’s champion. Yet he has only a 45% field goal percentage, a 42% free throw percentage (the lowest among the 150 players) and 9.3 points per game, all very low values for a starting center. At the same time, B. Wallace is a ferocious defender, with 11.9 rebounds (the second highest) and 2.52 blocks (the fifth highest) per game. With such an unusual profile, B. Wallace is naturally the most unique player.

2) The first cluster to emerge is one I call SUPER ALMIGHTY GUARDS, consisting of L. James, Hughes, Iverson and Nash. They have high 3-point percentage, efficiency, assists, steals, turnovers and points per game, and low offensive rebounds, blocks and fouls per game. Even more interesting, Hughes, L. James and Iverson are the top 3 players in steals in the NBA.

3) The second cluster to emerge is one I call MONSTERS, consisting of S. O’Neal, Garnett, Nowitzki, Webber, Marion, Duncan, J. O’Neal, Brand, Gasol, Yao Ming, A. Stoudemire, Kirilenko and Walker. They are nearly all of the best centers and power forwards in the NBA. They have high efficiency, rebounds, blocks and points per game.

4) The third cluster to emerge consists of 5 poor guys with low values for almost all attributes.

5) The fourth cluster to emerge acts like a milestone. Broadly, it divides the remaining players into two groups: center / power forward vs. guard / small forward.
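The step-by-step behavior described above can be imitated with a small agglomerative clustering sketch. Everything here is an assumption made for illustration: the text does not state which distance measure or linkage HCE 3.0 used, so Euclidean distance and average linkage are stand-ins, and the five players and their numbers (points, rebounds and assists per game) are invented. Lowering the “Minimum similarity” bar corresponds roughly to raising max_distance below.

```python
import math
from statistics import mean, pstdev


def zscore_columns(rows):
    """Normalize each column to zero mean and unit standard deviation."""
    cols = list(zip(*rows))
    stats = [(mean(c), pstdev(c) or 1.0) for c in cols]  # guard constant columns
    return [[(v - m) / s for v, (m, s) in zip(row, stats)] for row in rows]


def distance(a, b):
    """Euclidean distance between two normalized stat vectors (an assumption)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(c1, c2, points):
    """Mean pairwise distance between the members of two clusters."""
    return mean(distance(points[i], points[j]) for i in c1 for j in c2)


def cluster(points, max_distance):
    """Merge the two closest clusters until no pair is within max_distance."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > 1:
        pairs = [(average_linkage(clusters[a], clusters[b], points), a, b)
                 for a in range(len(clusters))
                 for b in range(a + 1, len(clusters))]
        d, a, b = min(pairs)
        if d > max_distance:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Hypothetical players: (points, rebounds, assists) per game, invented numbers.
names = ["guard_a", "guard_b", "big_a", "big_b", "defender"]
raw = [(25, 4, 8), (24, 5, 7), (20, 11, 2), (21, 12, 3), (9, 12, 1)]
points = zscore_columns(raw)

for limit in (1.0, 2.5):
    groups = cluster(points, limit)
    print(limit, [[names[i] for i in g] for g in groups])
```

With these invented numbers, the tight threshold leaves the low-scoring, high-rebounding player in a cluster of his own, much like B. Wallace in step 1), while the looser threshold folds him into the group of big men.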