Project: Predicting the 2016 US Presidential election
Due in Week 10.
This is an Individual Project
Your project is to develop a statistical model to predict the 2016 US presidential election.
The dataset project.txt on the course web-page contains information on US presidential
elections¹ from 1888 to 2014. Based on this data on the last 25 elections going back to 1916,
there are three main conditions that affect voting patterns. The first is whether the president
is running again. If so, this has a positive effect on votes for the president. The other variables
are based on the current state of the economy, a good economy clearly being a positive for the
incumbent party. For example, the price inflation variable measures the growth rate of the GDP
deflator in the first 15 quarters of the second Obama administration's term. The third variable
is the Goodnews (Z) variable, which is defined as the number of quarters during which GDP
per capita growth has exceeded 3.2%. The slow growth of the US economy since the financial
crisis of 2008 is a negative for the incumbent party.
The election2016.txt dataset contains the following variables:
1. VP: Democratic share of the two-party Presidential vote.
2. VC: Democratic share of the two-party House vote.
3. I: 1 if there is a Democratic incumbent at the time of the election and −1 if there is a
Republican incumbent.
4. DPER: 1 if the Democratic presidential incumbent is running again, −1 if the Republican
presidential incumbent is running again, and 0 otherwise.
5. DUR: 0 if either party has been in the White House for one term, 1 [-1] if the Democratic
[Republican] party has been in the White House for two consecutive terms, 1.25 [-1.25]
if the Democratic [Republican] party has been in the White House for three consecutive
terms, 1.50 [-1.50] if the Democratic [Republican] party has been in the White House for
four consecutive terms, and so on.
6. WAR: 1 for the elections of 1920, 1944, and 1948, and 0 otherwise.
7. GROWTH, G: growth rate of real per capita GDP in the first three quarters of the
election year (annual rate).
8. PRICE INFLATION, P: absolute value of the growth rate of the GDP deflator in
the first 15 quarters of the administration (annual rate) except for 1920, 1944, and 1948,
where the values are zero.
9. GOODNEWS, Z: number of quarters in the first 15 quarters of the administration in
which the growth rate of real per capita GDP is greater than 3.2 percent at an annual rate
except for 1920, 1944, and 1948, where the values are zero.
¹ The econometrician Ray Fair has done extensive work on this data and shown how to use statistics to predict
elections. His website provides background reading material.
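The DUR coding in item 5 is the least obvious of the definitions, so it may help to see it as a small function. This is only a sketch: the name dur_value and the "D"/"R" party labels are made up for illustration, and the reading of "and so on" as a further 0.25 per additional term is an assumption.

```python
def dur_value(party: str, consecutive_terms: int) -> float:
    """Return DUR for the party currently holding the White House.

    party: "D" or "R" (hypothetical labels chosen for this sketch).
    consecutive_terms: number of consecutive terms that party has held office.
    """
    if party not in ("D", "R"):
        raise ValueError("party must be 'D' or 'R'")
    if consecutive_terms < 1:
        raise ValueError("consecutive_terms must be at least 1")
    if consecutive_terms == 1:
        return 0.0
    sign = 1.0 if party == "D" else -1.0
    # Two terms -> 1, three -> 1.25, four -> 1.50; "and so on" is assumed
    # to mean a further 0.25 for every additional term.
    return sign * (1.0 + 0.25 * (consecutive_terms - 2))
```

For the 2016 election the Democrats have held the White House for two consecutive terms, so dur_value("D", 2) gives 1.0.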
For this project, the key variables are: Growth (G), Inflation (P) and Goodnews (Z). The current
values for the economic variables prior to the November 2016 election are:
Date                 G      P      Z
January 31, 2015     3.04   1.86   3
April 29, 2015       3.22   1.14   5
July 31, 2015        3.03   1.33   3
You have to build a model to predict the Presidential vote, VOTE (VP), for the incumbent (Democratic) party in the November 2016 election. Your analysis should build on the following:
• Build a regression model to predict the election outcome Vote as a function of the three
key variables Growth, Inflation, Goodnews, and explain the results.
• The intercept of the above base model might differ depending on the incumbent party, and so might the slope coefficients. Expand the model by allowing
different effects for I, the incumbent party, and explain the results.
• Try adding control variables to your model to increase its predictive power.
• You should quantify how well your final model predictions have worked in the past.
• Given your forecast and its prediction interval, calculate your estimate of the probability
that the Democrats will win in 2016.
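The first two bullets can be sketched as an ordinary least-squares fit, first on Growth, Inflation and Goodnews alone and then with the incumbent-party indicator I shifting both the intercept and the slopes. This is a minimal sketch, assuming NumPy is available and that the columns have already been read into arrays. The function names design_matrix and fit_ols are hypothetical, and interacting I with every slope is one possible specification, not a prescribed one.

```python
import numpy as np

def design_matrix(G, P, Z, I=None):
    """Columns: intercept, G, P, Z and, if I is given, I, I*G, I*P, I*Z."""
    G, P, Z = (np.asarray(a, dtype=float) for a in (G, P, Z))
    cols = [np.ones_like(G), G, P, Z]
    if I is not None:
        I = np.asarray(I, dtype=float)
        # I is +1 or -1, so these columns let the intercept and each slope
        # differ between Democratic and Republican incumbency.
        cols += [I, I * G, I * P, I * Z]
    return np.column_stack(cols)

def fit_ols(X, y):
    """Return least-squares coefficients, residual standard error and R^2."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    n, k = X.shape
    sigma = np.sqrt(resid @ resid / (n - k))  # uses n - k degrees of freedom
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return beta, sigma, r2
```

With arrays G, P, Z, I and VP taken from the dataset, fit_ols(design_matrix(G, P, Z), VP) gives the base model and fit_ols(design_matrix(G, P, Z, I), VP) the expanded one. The expanded model has eight coefficients, so with about 25 elections the remaining degrees of freedom are limited, which is worth weighing when explaining the results.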
Keep in mind the data analytic tools that we’ve covered. The following is a list of techniques;
not all will necessarily be central to your analysis:
1. Descriptive statistics
2. Regression analytics
3. Diagnostics and Outliers
4. Prediction intervals
5. Normal distribution calculation
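Items 4 and 5, together with the request to quantify past performance, can be sketched as follows. This is a sketch under stated assumptions: NumPy is available, X is a design matrix that already contains an intercept column, and VP is measured in percent so that a win means a share above 50. The function names are hypothetical. The interval uses the normal quantile 1.96, which is an approximation; a t quantile would give a somewhat wider interval with this few observations.

```python
import math
import numpy as np

def predict_with_se(X, y, x_new):
    """Point forecast and standard error of prediction for one new row."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_new = np.asarray(x_new, dtype=float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    n, k = X.shape
    s2 = resid @ resid / (n - k)
    xtx_inv = np.linalg.inv(X.T @ X)
    # Prediction variance = noise variance + coefficient uncertainty at x_new.
    se_pred = math.sqrt(s2 * (1.0 + x_new @ xtx_inv @ x_new))
    return float(x_new @ beta), se_pred

def interval_95(forecast, se_pred):
    """Approximate 95% prediction interval (normal quantile, an assumption)."""
    return forecast - 1.96 * se_pred, forecast + 1.96 * se_pred

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def win_probability(forecast, se_pred, threshold=50.0):
    """P(vote share > threshold), treating the forecast error as normal."""
    return 1.0 - normal_cdf((threshold - forecast) / se_pred)

def leave_one_out_errors(X, y):
    """Forecast each past election from a model fit on all the others."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    errors = []
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        forecast, _ = predict_with_se(X[keep], y[keep], X[i])
        errors.append(y[i] - forecast)
    return np.array(errors)
```

The leave-one-out errors give one way to summarize how the model would have done in past elections, for example through their root mean square or the number of elections where the forecast fell on the wrong side of 50. If VP is stored as a fraction rather than a percentage, pass threshold=0.5.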
Present your analysis in a paper describing your approach with the above in mind. The maximum length is fifteen pages including exhibits.
Good Luck!