STAT 210: Midterm - Takehome
Fall 2016
Points: 50

Name(s): ________________________________
         ________________________________
         ________________________________

Consider the following statistics from Blair Walsh, the recently released Minnesota Vikings kicker.

Let us first consider Walsh’s field goal statistics from 2016. He attempted a total of 16 field goals and made a total of 12. Walsh’s career percentage for field goals made is 84.2%. [See information from the boxes labeled A.]

Research Question: Is there statistical evidence to suggest Blair Walsh was under-performing in 2016 relative to his career percentage of 84.2%?

1. Identify the smallest possible value, largest possible value, location of the reference distribution, and the outcome from the study for this situation on the number line below. (4 pts)

   [Number line: # Field Goals Made. Label the following on it: Smallest possible value, Largest possible value, Location of pyramid, Outcome from study.]

2. Use the binomial probability model to complete the following statistical test.

   Research Question: Is there statistical evidence to suggest Blair Walsh was under-performing this year relative to his career percentage of 84.2%?

   Testable Hypothesis
   𝐻𝑂 : 𝜋 = 84.2%
   𝐻𝐴 : 𝜋 < 84.2%

   a. What is the p-value for this test? (2 pts)
   b. Write a final conclusion for this research question. (3 pts)

3. Repeat Problem #2, but this time evaluate whether or not Blair Walsh was under-performing on Point After Touchdowns, i.e. PATs. [Use information from the boxes labeled B.]

   Research Question: Is there statistical evidence to suggest Blair Walsh was under-performing on PATs this year relative to his career percentage for PATs?

   a. What is the p-value for this test? (3 pts)
   b. Write a final conclusion for this research question. (3 pts)

4. Between the 2014 and 2015 seasons, the NFL made a rule change for Point After Touchdowns (PATs). The distance of a PAT was 20 yards before the 2015 season; this distance was increased to 33 yards beginning in 2015.

   a. Review the PAT statistics for Blair Walsh for the 2015 and 2016 seasons. Does it appear that this rule change affected Blair Walsh? Briefly discuss. (2 pts)
   b. If the rule change affected Walsh, then it is not really fair to compare his 2016 PAT performance against 94.5%, his career PAT percentage. What would be a fairer percentage to use instead? Briefly discuss. (3 pts)

For the remaining problems, we will consider the (unofficial) results from the 2016 US Elections as provided by Townhall.com. I have augmented the election results with several other demographic variables, e.g. % Bachelor’s degree, % Born in US, Median Household Income, etc., from the American Community Survey provided by the US Census Bureau.

Note: Election results are not available for Alaska on the Townhall.com website and hence are not included in this data.

Data Sources:
2016 Raw Election Data: https://github.com/tonmcg/County_Level_Election_Results_12-16
2016 Election Results: http://townhall.com/election/2016/president/
ACS Data Dictionary: http://api.census.gov/data/2014/acs5/profile/variables.html
Example API pull for ACS Data: http://api.census.gov/data/2014/acs5/profile?get=NAME,DP03_0062E&for=county:*

To begin, create a new variable called Winner. This new variable simply indicates whether or not Trump gained more votes than Clinton in each county.

New Variable: Winner

5. Use Analyze > Distribution to determine the proportion of counties where Trump received more votes than Clinton. Briefly discuss. Provide the JMP output. (2 pts)

Next, pick three demographic variables of interest in this dataset. The candidates are the variables from Percent_Households_Married_Family through Percent_Hispanic in the dataset. Use Cols > Utilities > Make Binning Formula to create a binned version of the three demographic variables you’ve decided to investigate.
You should set the offset to a reasonable value, set the width so that you have about 4-6 bins, and select Range Labels in the upper right corner. Click Make Formula Columns to create the new variable. An example is shown here for Percent_BachelorsDegree.

Next, use Analyze > Fit Y by X to explore the relationship between Winner and each of the demographic variables you’ve selected to investigate. Again, an example setup is provided here for Percent_BachelorsDegree Binned.

Note: The default color scheme in JMP has Clinton as red and Trump as blue in the plots. Use Value Orderings to flip the order of the response variable so that Clinton is blue and Trump is red, which is the common color scheme used for the Democratic and Republican parties, respectively.

6. Which of the three demographic variables appears to be most indicative of the outcome of the 2016 US Election? Explain how you made this determination. Provide your JMP output as supporting evidence. (6 pts)

7. Consider the demographic variable identified in the previous problem (the most indicative demographic variable). Use a local data filter to investigate whether or not the relationship discovered above is influenced in some way by the other demographic variables you’ve decided to investigate. Explain in detail any relevant findings. Again, provide your JMP output as supporting evidence. (6 pts)

Next, we will use JMP to do some mapping. Mapping will be easier if yet another new variable is created. This will be a numeric version of the previously created Winner variable.

New Variable: Winner_Number

After this variable is created, select Graph > Graph Builder. Place Name in the Map Shape box in the lower left corner and drag the newly created Winner_Number variable onto the map. Click Done and a map should be produced.

Note: You can ignore the fact that LaSalle County, IL and Oglala County, SD are not being plotted on our map.

8. Use your map to determine which geographic areas of the country tend to vote Democratic. Which areas tend to vote Republican? Briefly discuss. (3 pts)

9. You are able to use a local data filter on your map (just like any other output in JMP). Apply a local data filter to investigate how the map changes when conditioning on another predictor variable of your choice. (2 pts)

Consider the following summary done in JMP using Tables > Summary.

10. What is the interpretation of the Mean(Winner_Number) value computed here? Briefly discuss. (2 pts)

11. Use Tables > Summary to answer the following. Provide a screenshot of relevant JMP output for each.

   a. Were there any states where Trump won the majority of votes in all counties? If so, which states were these? (2 pts)
   b. Were there any states where Clinton won the majority of votes in all counties? If so, which states were these? (2 pts)
   c. What states tend to be most evenly divided between Trump and Clinton? Briefly discuss how you made this determination. (2 pts)

12. Use Tables > Summary to verify that Clinton actually gained a majority of the popular vote. Provide your JMP output as evidence. Briefly discuss. (3 pts)

   Hint: You should use Votes_Democrat and Votes_Republican for this problem.
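
For readers who want to cross-check the Tables > Summary steps in Problems 10-12 outside of JMP, the following is a minimal Python sketch of the same grouping logic. It is not part of the JMP workflow and makes several assumptions: the rows are toy values invented for illustration (not the real election data), the column names State and Name are hypothetical, and the coding Winner_Number = 1 for Trump and 0 for Clinton is assumed, since the handout does not state which coding was used. Only Votes_Democrat and Votes_Republican come from the hint above.

```python
from collections import defaultdict

# Toy rows invented purely for illustration; they are NOT the real election data.
# "State" and "Name" are assumed column names; Votes_Democrat and
# Votes_Republican are the column names given in the Problem 12 hint.
counties = [
    {"State": "AA", "Name": "County 1", "Votes_Democrat": 1200, "Votes_Republican": 800},
    {"State": "AA", "Name": "County 2", "Votes_Democrat": 300, "Votes_Republican": 700},
    {"State": "AA", "Name": "County 3", "Votes_Democrat": 100, "Votes_Republican": 400},
    {"State": "BB", "Name": "County 4", "Votes_Democrat": 500, "Votes_Republican": 900},
    {"State": "BB", "Name": "County 5", "Votes_Democrat": 50, "Votes_Republican": 60},
]


def add_winner_columns(rows):
    """Add Winner and Winner_Number to each row (modifies the rows in place).

    Assumed coding: Winner_Number is 1 when Trump received more votes
    than Clinton in the county, and 0 otherwise.
    """
    for row in rows:
        trump_won = row["Votes_Republican"] > row["Votes_Democrat"]
        row["Winner"] = "Trump" if trump_won else "Clinton"
        row["Winner_Number"] = 1 if trump_won else 0
    return rows


def summarize_by_state(rows):
    """Group rows by state and compute the count, mean, and vote sums."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["State"]].append(row)

    summary = {}
    for state, members in groups.items():
        summary[state] = {
            "N Rows": len(members),
            "Mean(Winner_Number)": sum(r["Winner_Number"] for r in members) / len(members),
            "Sum(Votes_Democrat)": sum(r["Votes_Democrat"] for r in members),
            "Sum(Votes_Republican)": sum(r["Votes_Republican"] for r in members),
        }
    return summary


rows = add_winner_columns(counties)
summary = summarize_by_state(rows)
for state, stats in sorted(summary.items()):
    print(state, stats)
```

Summing the two vote columns over all rows, rather than within each state, gives the overall totals that Problem 12 asks you to compare.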