Part 1: RBI

[Figure: histogram of RBI. Horizontal axis: RBI (35 to 115); vertical axis: frequency (0 to 8).]

The solution is unique, since the value of the mean is affected by every individual value in the sample. The team with the highest mean RBI is 5, 7, 11, 13, 14, 16, 17, 18, 22, 24. They have a team mean RBI of 94.2.

Part 2: Strikeouts

[Figure: stem-and-leaf plot of strikeouts.]

This solution is also unique, for the same reason as in Part 1. This team has a team mean of 55.6. The team is 1, 4, 8, 9, 12, 15, 19, 21, 23, 25.

Part 3: Homeruns

In this case, the solution is not unique. In fact, we can replace, for example, the best or worst member of the team with anyone else on the same side of the median, and the median is not affected. That is because the median depends only on the relative order of the observations. Hence, a team with the best team median homeruns is 24, 17, 5, 14, 13, 7, 2, 22, 18, 23. They have a median of 32.

Part 4: Batting Average

The calculated batting averages are shown below:

Number  Player            Batting Average
23      Ian Kinsler       0.319
12      Ryan Theriot      0.307
4       Brian Giles       0.306
7       Aubrey Huff       0.304
6       Shane Victorino   0.293
5       Jermaine Dye      0.292
18      Matt Kemp         0.290
11      James Loney       0.289
22      Garrett Atkins    0.286
8       Ivan Rodriguez    0.276
17      Prince Fielder    0.276
16      David Murphy      0.275
1       Carl Crawford     0.273
10      Aaron Rowand      0.271
24      Carlos Delgado    0.271
15      Edgar Renteria    0.270
25      Jeff Keppinger    0.266
2       Cody Ross         0.260
20      Kosuke Fukudome   0.257
14      Pat Burrell       0.250
9       Pedro Feliz       0.249
13      Jason Giambi      0.247
21      Jason Kendall     0.246
19      Emil Brown        0.244
3       Jeff Francoeur    0.239

Again, the team with the highest team median is not unique. One team with the highest median is 23, 12, 4, 7, 6, 5, 18, 11, 22, 8. The team median is 0.292.

Part 5: Stolen Bases

This data is right skewed with large outliers. To find the team with the least variation, we observe that most of the data in this distribution are located at the lower values.
Hence we choose the ten players with the fewest stolen bases to make up the team with the least variability in the number of bases stolen. The team is

Part 6: Probability of Base Stealing

Players 6, 18, 23, 1, 12, 20, and 8 are the ones who have stolen more than 10 bases. Hence, we choose these 7 players to be on the team, and the other three players don't matter. The probability of choosing a player with more than 10 stolen bases is 0.7. For the remaining three players (though this choice doesn't matter) we choose 14, 3, and 9.

Part 7: Confidence interval

We draw numbers from 1 to 25 out of a hat (without replacement) in order to select our team. The following team results: 4, 7, 8, 10, 15, 19, 21, 22, 24, 25.

Sample mean batting average: 0.274
Sample standard deviation: 0.0208
Critical value for the CI: 2.262

Hence: 0.274 +/- 2.262*0.021/sqrt(10) = (0.258, 0.289)

Since we noted earlier that the distribution of batting averages is skewed to the right, we choose the players with the lowest batting averages to make the narrowest confidence interval. That is because in right-skewed data the lowest values are the most tightly grouped. Hence, they will have a lower standard deviation, and the confidence interval will be narrower.

The team is: 15, 25, 2, 20, 14, 9, 13, 21, 19, 3. The sample mean is 0.253 and the sample standard deviation is 0.010. The confidence interval in this case is:

0.253 +/- 2.262*0.010/sqrt(10) = (0.246, 0.260)

This interval, though, is not useful or interpretable, because the sample is not random. So we can't interpret it in the same way as a regular confidence interval.

Part 8: p-value

I'm assuming here that the hypotheses for this test are:

H0: p >= 0.5
Ha: p < 0.5

The sample proportion for the random team is 0.2. The test statistic is:

z = (0.2 - 0.5)/sqrt(0.5*0.5/10) = -1.897

The p-value for this test is P(z < -1.897) = 0.0289. Hence we would reject the null hypothesis at the 0.05 level.
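As a rough numerical check of the Part 7 interval for the random team and of the Part 8 test statistic, the calculations can be sketched in Python. This is only an illustration under stated assumptions: the batting averages are the rounded values from the Part 4 table, the critical value 2.262 is taken from the text, a batting average of at least 0.300 is assumed to count as a "success", and the helper names t_interval and one_sided_z_test are made up for this sketch. Because the inputs are rounded to three decimals, the last digit of the interval may differ slightly from the hand-computed values.

```python
import math
from statistics import NormalDist, mean, stdev

# Batting averages of the randomly drawn team
# (players 4, 7, 8, 10, 15, 19, 21, 22, 24, 25), rounded values from Part 4.
random_team = [0.306, 0.304, 0.276, 0.271, 0.270,
               0.244, 0.246, 0.286, 0.271, 0.266]

# t critical value for 95% confidence with 9 degrees of freedom (from the text).
T_CRIT = 2.262


def t_interval(sample, t_crit=T_CRIT):
    """Return (lower, upper) for mean +/- t_crit * s / sqrt(n). Hypothetical helper."""
    n = len(sample)
    half_width = t_crit * stdev(sample) / math.sqrt(n)
    centre = mean(sample)
    return centre - half_width, centre + half_width


def one_sided_z_test(successes, n, p0=0.5):
    """Lower-tailed one-proportion z-test; returns (z, p_value). Hypothetical helper."""
    p_hat = successes / n
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    return z, NormalDist().cdf(z)


low, high = t_interval(random_team)

# Assumption: a "success" is a batting average of at least 0.300.
hits = sum(1 for avg in random_team if avg >= 0.300)
z, p_value = one_sided_z_test(hits, len(random_team))

print(f"95% CI: ({low:.3f}, {high:.3f})")
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```

The normal CDF comes from the standard library's NormalDist, so no third-party package is needed; the t critical value is hard-coded because the standard library has no t distribution.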
We conclude that the proportion of players with a batting average of 0.300 or higher is less than 0.5.

Part 9: Correlation

[Figure: scatterplot of Runs v. Homeruns. Horizontal axis: Runs (38 to 98); vertical axis: Homeruns (0 to 40).]

The red highlighted squares represent the chosen players. They were chosen because they seem to fall on the same line. The team that is represented by these points is 21, 25, 1, 15, 20, 3, 4, 11, 18, 22. The correlation coefficient of these points is 0.9126.

Part 10: The best team!

The best team that I can find to maximize (or minimize) the characteristics:

Number  Player           AB   H    R    SB  SO   HR  RBI  Batting Average
24      Carlos Delgado   598  162  96   1   124  38  115  0.271
5       Jermaine Dye     590  172  96   3   104  34  96   0.292
17      Prince Fielder   588  162  86   3   134  34  102  0.276
22      Garrett Atkins   611  175  86   1   100  21  99   0.286
7       Aubrey Huff      598  182  96   4   89   32  108  0.304
13      Jason Giambi     458  113  68   2   111  32  96   0.247
2       Cody Ross        461  120  59   6   116  22  73   0.260
11      James Loney      595  172  66   7   85   13  90   0.289
23      Ian Kinsler      518  165  102  26  67   18  71   0.319
4       Brian Giles      559  171  81   2   52   12  63   0.306

Their team stats are:

Average RBI                                               91.300
Average Strikeouts                                        98.200
Median HR                                                 27.000
Median Batting Average                                    0.288
St. Dev. of SB                                            7.472
r of runs v. homeruns                                     0.357
P(SB<10)                                                  0.900
Width of confidence interval for mean batting average     0.031
p-value for hypothesis test                               0.1188
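The team statistics above can be recomputed from the Part 10 table. The following Python sketch is one way to do that, assuming the rows are read exactly as printed in the table; the variable names and the pearson_r helper are made up for this illustration, and the batting average is taken as H divided by AB.

```python
import math
from statistics import mean, median, stdev

# Rows copied from the Part 10 table:
# (number, player, AB, H, R, SB, SO, HR, RBI)
team = [
    (24, "Carlos Delgado", 598, 162, 96, 1, 124, 38, 115),
    (5, "Jermaine Dye", 590, 172, 96, 3, 104, 34, 96),
    (17, "Prince Fielder", 588, 162, 86, 3, 134, 34, 102),
    (22, "Garrett Atkins", 611, 175, 86, 1, 100, 21, 99),
    (7, "Aubrey Huff", 598, 182, 96, 4, 89, 32, 108),
    (13, "Jason Giambi", 458, 113, 68, 2, 111, 32, 96),
    (2, "Cody Ross", 461, 120, 59, 6, 116, 22, 73),
    (11, "James Loney", 595, 172, 66, 7, 85, 13, 90),
    (23, "Ian Kinsler", 518, 165, 102, 26, 67, 18, 71),
    (4, "Brian Giles", 559, 171, 81, 2, 52, 12, 63),
]


def pearson_r(xs, ys):
    """Sample correlation coefficient, written out by hand (hypothetical helper)."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


runs = [row[4] for row in team]
stolen = [row[5] for row in team]
strikeouts = [row[6] for row in team]
homeruns = [row[7] for row in team]
rbi = [row[8] for row in team]
batting = [row[3] / row[2] for row in team]  # batting average = H / AB

mean_rbi = mean(rbi)
mean_so = mean(strikeouts)
median_hr = median(homeruns)
median_ba = median(batting)
sd_sb = stdev(stolen)  # sample standard deviation (n - 1 in the denominator)
r_runs_hr = pearson_r(runs, homeruns)
share_sb_under_10 = sum(1 for sb in stolen if sb < 10) / len(stolen)

print(f"Average RBI            {mean_rbi:.3f}")
print(f"Average Strikeouts     {mean_so:.3f}")
print(f"Median HR              {median_hr:.3f}")
print(f"Median Batting Average {median_ba:.3f}")
print(f"St. Dev. of SB         {sd_sb:.3f}")
print(f"r of runs v. homeruns  {r_runs_hr:.3f}")
print(f"P(SB<10)               {share_sb_under_10:.3f}")
```

The interval width and the p-value are left out here because they depend on the critical value and the success threshold chosen in Parts 7 and 8.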