Problem Set No. 1 Introduction to Econometrics

Professor Adriana Kugler
Due on Thursday, September 25, 2008
1. Suppose that a random sample of 200 twenty-year-old men is selected from
a population and that these men’s heights and weights are recorded. A
regression of weight on height yields the following:
$$\widehat{Weight} = -99.41 + 3.94 \times Height, \quad R^2 = 0.81, \quad SER = 10.2,$$
where Weight is measured in pounds and Height is measured in inches.
(a) What is the regression’s weight prediction for someone who is 70
inches tall? 65 inches tall? 60 inches tall?
(b) A teenager has a growth spurt and grows 4 inches over the course of
a year. What is the regression’s prediction for the increase in this
teenager’s weight?
(c) The average height in this sample is 67 inches. What is the average
weight for this sample?
(d) What is the fraction of the variance of weight explained by height?
Does height explain a lot or little of the variation in weight?
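The arithmetic behind parts (a) through (c) can be sketched in a few lines of Python. This is only an illustration built from the coefficients reported above; the function names `predict_weight` and `predicted_change` are hypothetical helpers, not part of the problem set. Part (c) relies on the standard fact that an OLS regression line passes through the sample means, so the prediction at the average height equals the average weight.

```python
# Coefficients taken from the estimated regression in the problem statement.
INTERCEPT = -99.41  # pounds
SLOPE = 3.94        # pounds per inch of height


def predict_weight(height_in):
    """Predicted weight in pounds for a given height in inches."""
    return INTERCEPT + SLOPE * height_in


def predicted_change(delta_height_in):
    """Predicted change in weight (pounds) for a change in height (inches).

    The intercept cancels when two predictions are differenced,
    so only the slope matters.
    """
    return SLOPE * delta_height_in


if __name__ == "__main__":
    for h in (70, 65, 60):
        print(f"height {h} in -> predicted weight {predict_weight(h):.2f} lb")
    print(f"4 in growth -> predicted gain {predicted_change(4):.2f} lb")
    # OLS line passes through the sample means (assumption stated in the lead-in).
    print(f"average weight at 67 in -> {predict_weight(67):.2f} lb")
```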
2. On my website (http://www.uh.edu/~adkugler/ProblemSets.html) you will
find a file called CPS04 that contains data for full-time, full-year workers,
age 25-34, with a high school diploma or B.A./B.S. as highest degree.
Here, I have attached a detailed description of the data. In this exercise
you will investigate the relation between workers’ age and earnings.
(a) Construct a scatterplot of earnings on age. Does there appear to be
a relationship between the two variables?
(b) Run a regression of average hourly earnings (AHE) on age (Age).
What is the estimated intercept? What is the estimated slope?
(c) Jennifer is a 30-year-old worker. Predict Jennifer’s earnings using
the estimated regression.
(d) What is the standard error of the regression? What are the units in
which SER is measured?
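As a rough guide to the quantities requested in parts (b) through (d), the sketch below computes the OLS intercept, slope, and standard error of the regression by hand. It makes several assumptions: the helper `ols_fit` is hypothetical, the SER uses the n - 2 degrees-of-freedom correction, and the five observations are made-up toy numbers, not values from CPS04. To answer the questions, the `Age` and `AHE` columns of the actual file would have to be loaded in place of the toy lists.

```python
import math


def ols_fit(x, y):
    """Return (intercept, slope, ser) for a regression of y on x.

    ser is the standard error of the regression, sqrt(SSR / (n - 2)),
    and is measured in the same units as y.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    ssr = sum(u * u for u in residuals)
    ser = math.sqrt(ssr / (n - 2))
    return intercept, slope, ser


# Hypothetical toy data for illustration only (NOT taken from CPS04).
toy_age = [25, 27, 29, 31, 33]
toy_ahe = [14.0, 15.5, 16.0, 18.5, 19.0]  # dollars per hour

if __name__ == "__main__":
    b0, b1, ser = ols_fit(toy_age, toy_ahe)
    print(f"intercept = {b0:.3f}, slope = {b1:.3f}, SER = {ser:.3f}")
    print(f"predicted AHE at age 30 = {b0 + b1 * 30:.3f}")
```

Because the residuals are differences between actual and fitted earnings, the SER carries the units of the dependent variable, here dollars per hour.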
Documentation for CPS04 Data
Each month the Bureau of Labor Statistics in the U.S. Department of Labor
conducts the “Current Population Survey” (CPS), which provides data on labor force
characteristics of the population, including the level of employment, unemployment, and
earnings. Approximately 65,000 randomly selected U.S. households are surveyed each
month. The sample is chosen by randomly selecting addresses from a database
comprised of addresses from the most recent decennial census augmented with data on
new housing units constructed after the last census. The exact random sampling scheme
is rather complicated (first small geographical areas are randomly selected, then housing
units within these areas are randomly selected); details can be found in the Handbook of
Labor Statistics and are described on the Bureau of Labor Statistics website
(www.bls.gov).
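The two-stage idea described above (select areas first, then housing units within the selected areas) can be illustrated with a deliberately simplified sketch. It assumes equal selection probabilities at both stages, which the actual CPS design does not necessarily use; the function `two_stage_sample` and the address frame `frame` are hypothetical and exist only for illustration.

```python
import random


def two_stage_sample(units_by_area, n_areas, n_units_per_area, seed=None):
    """Draw a simplified two-stage sample.

    Stage 1: pick n_areas areas at random, without replacement.
    Stage 2: within each picked area, pick up to n_units_per_area
             housing units at random, without replacement.
    Returns a list of (area, unit) pairs.
    """
    rng = random.Random(seed)
    # sorted() gives a stable ordering so a fixed seed is reproducible.
    chosen_areas = rng.sample(sorted(units_by_area), n_areas)
    sample = []
    for area in chosen_areas:
        units = units_by_area[area]
        k = min(n_units_per_area, len(units))
        sample.extend((area, unit) for unit in rng.sample(units, k))
    return sample


# Hypothetical address frame: area name -> list of housing-unit identifiers.
frame = {
    "area_A": ["A1", "A2", "A3", "A4"],
    "area_B": ["B1", "B2", "B3"],
    "area_C": ["C1", "C2", "C3", "C4", "C5"],
    "area_D": ["D1", "D2"],
}

if __name__ == "__main__":
    for area, unit in two_stage_sample(frame, n_areas=2, n_units_per_area=2, seed=0):
        print(area, unit)
```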
The survey conducted each March is more detailed than in other months and asks
questions about earnings during the previous year. The file CPS04 contains the data for
2004 (from the March 2005 survey). These data are for full-time workers, defined as
workers employed more than 35 hours per week for at least 48 weeks in the previous
year. Data are provided for workers whose highest educational achievement is either (1) a high
school diploma or (2) a bachelor’s degree.
Series in Data Set:
FEMALE: 1 if female; 0 if male
YEAR: Year
AHE: Average Hourly Earnings
BACHELOR: 1 if worker has a bachelor’s degree; 0 if worker has a high school degree