Making Graphs with Excel

CENTER FOR FAMILY & DEMOGRAPHIC RESEARCH
Making Graphs with Excel
Summer 2014 Workshop Series
WHY-
CHARTS?
Picture Superiority Effect
Information is better remembered in
tests of recall and item recognition
when presented as pictures rather
than words
Fruit <
Why is it so
difficult for-
SOCIOLOGISTS?
How do you process?
•
•
•
•
•
•
•
•
•
•
Analytical
Logical
Precise
Repetitive
Organized
Details
Scientific
Detached
Literal
Sequential
•
•
•
•
•
•
•
•
•
•
Creative
Imaginative
General
Intuitive
Conceptual
Big picture
Heuristic
Empathetic
Figurative
Irregular
I Propose we Marry the Two
The pun is intended!
Organization of Presentation
•
•
•
•
•
Structure of an Excel Chart
Different Types of Excel Charts
Basic Principles of Chart Design
Graphing Interaction Effects
Creating a Chart with a Double Axis
What makes upTHE STRUCTURE OF
AN EXCEL CHART?
Let’s Dissect…
Predicted Marriage & Divorce Rates
Chart Title
Upper Bounds (UB) & Lower Bounds (LB)
Rates per 1,000 at Risk Population
Legend
50
45
40
35
30
25
20
15
10
5
0
UB
Mar Rt
LB
UB
Div Rt
Data Labels
LB
36.0
24.2
Gridlines
Axis
$0
Axis Titles
$5
$10
$15
$20
$25
$30
HMI Spending Per Population at Risk of Marriage or Divorce
Sources: U.S. Census Bureau, American Community Survey, 2008-2011; HMI spending data– Hawkins et al., 2013.
$35
$40
Source
What areTHE DIFFERENT
TYPES OF CHARTS?
Histograms
A vertical bar chart that depicts the
distribution of a set of data
Histograms, example
Pie Charts
Generally used to show percentage or
proportional data classified into nominal or
ordinal categories
Pie Charts, examples
Simple Pie
Pie-of-Pie
Top Reasons for Fathers Leaving
the Workforce in 2008
Percent of births by informal
marital status of mother, 20052010
Childcare
15%
Other
46%
School/
Training
20%
Married
57%
Unmarried
43%
Layoff
19%
Source: Survey of Income and Program
Participation, 2008 March Supplement
Source: NSFG 2006-2010
Cohabiting
25%
Single
18%
Pie Charts, examples
Simple Pie
Doughnut
College experiences of young
adults (by age 25)
Percent of young adults who
enroll in a 4-year program by
degree earned by age 25
Bachelor's
degree
21%
Associate's
degree
6%
Didn't
enroll
40%
No degree
33%
Did not
finish
44%
Bachelor's
degree
49%
Associate's
degree
7%
Source: National Longitudinal Survey of Youth 1997, Rounds 1-13: 1997-2009 weighted. U.S. Department of Labor, Bureau of
Labor Statistics, NCFMR analyses of valid cases.
Bar Chart, example
Prevalence of Pre-union First Birth
across Demographic Characteristics
Women
Pre-union First Birth
Hispanics
7%
Hispanics
GED
None
Yes
13%
13%
Blacks
H.S.
Yes
7%
No
93%
9%
Whites
Assoc. Deg.
Whites
11%
Men
B.A.+
Prevalence of Pre-union First Birth by
Race/Ethnicity:
33%
2%
No
87%
Blacks
8%
Yes
33%
12%
20%
19%
No
67%
Source: National Longitudinal Survey of Youth 1997 (NLSY97), Rounds 1-13: 1997-2009 (weighted). U.S. Department of Labor, Bureau
of Labor Statistics, NCFMR analyses of valid cases.
Column & Bar Charts
Useful for showing data changes over a
period of time or for illustrating
comparisons among items
Column Charts, examples
Simple
Side-by-Side
Fathers Living with All of Their Children
Race, Ethnicity & Nativity
Percentage of Same-Sex Couple
Households with Minor Children by Sex of
Couple and Race/Ethnicity of Household
Head
80%
74%
70%
62%
80%
Male-Male
70%
Female-Female
60%
49%
50%
40%
44%
38%
30%
20%
10%
25%
25%
20%
37%
22%
11%
0%
All fathers
White
Source: NSFG 2006-2010
Black
NB Hispanic FB Hispanic
White
Black
Asian
Hispanic
Source: U.S. Census Bureau, American Community Survey, 1Year Estimates, 2012
Column Charts, examples
Percent Change in Share of Aggregate Income from 1970-2009
Lowest fifth
Second fifth
Third fifth
Fourth fifth
Highest fifth
16%
-8%
-12%
-18%
-25%
Source: U.S. Census Bureau, Current Population Survey, Annual Social and Economic Supplements
Column Charts, examples
Public Assistance Participation among U.S. Children in Poverty
by Family Structure, 2010
SNAP
TANF
76%
65%
55%
53%
52%
31%
6%
Married Couple
Households
11%
1%
Different Sex
Couples
Male
Same Sex
Couples
6%
Female
Same Sex
Couples
Cohabiting Households
Source: U.S. Census Bureau, American Community Survey, 1-Year Estimates, 2010
11%
Father
Only
19%
Mother
Only
Single Parent Households
Column Charts, examples
Changes in the Shares of Births to Single and Cohabiting Mothers
Under Age 40
Single
Cohabiting
Total Non-Marital
100%
80%
60%
35%
40%
20%
0%
21%
6%
27%
11%
18%
42%
24%
15%
16%
17%
18%
1980-1984
1990-1994
1997-2001
2005-2009
Sources: 1980-1984 data, Bumpass & Lu (2000) using NSFH, 1987/1988; 1990-1994 & 1997-2001 data, Kennedy & Bumpass
(2008) using NSFG 1995 & NSFG 2002; 2005-2009, NCFMR analyses using NSFG 2006-2010.
Line Charts
Ideal for showing trends over time
Line Charts, examples
Share of Married Mothers Experiencing a Premarital Birth, by Race
and Marriage Cohort
70
Overall
60
White
50
57.1
Black
40
30.9
30
20
22.1
20.4
10
0
4.4
2.4
1955-59 1960-64 1965-69 1970-74 1975-79 1980-84 1985-89 1990-94 1995-99 2000-04 2005-09
Source: The Integrated Fertility Survey Series (IFSS) is a project of the Population Studies Center and the Inter-university
Consortium for Political and Social Research at the University of Michigan, with funding from the Eunice Kennedy Shriver National
Institute of Child Health and Human Development (NICHD, grant 5R01 HD053533; Pamela J. Smock, PI).
Line Charts, examples
Young Adults Living in a Parent's Household and Economic Recession
Years by Sex and Ages, 1940-2010
100%
Recessions
80%
18-24 Men
67%
60%
40%
51%
51%
47%
32%
20%
19%
10%
17%
1940
25-34 Men
42%
24%
0%
18-24 Women
1950
1960
7%
1970
15%
1980
1990
2000
25-34 Women
gf
2 per. Mov. Avg.
(18-24 Women )
2 per. Mov. Avg.
(25-34 Men)
2010
Source: U.S. Census Bureau, Decennial Census, 1940-2000 (IPUMS); U.S. Census Bureau, American Community Survey, 1-year
estimates 2010 (IPUMS)
Line Charts, examples
Annual HMI Spending and Marriage & Divorce Rates, 2000 - 2010
Marriage Rate
HMI Spending
$160
46.1
45
$122
33.9
Rates per 1,000 at Risk
40
35
$140
$120
30
$100
25
$80
20
18.4
18.4
15
$40
10
5
$60
Dollar Amounts in Millioms
50
Divorce Rate
$20
$5
0
$0
2000
2001
2002
2003
2004
2005
2006
2007
2008
2009
2010
Sources: CDC/NCHS, National Vital Statistics System, 2000; Glass & Levchak, 2010, NCFMR County-Level Marriage & Divorce
Data, 2000; U.S. Census Bureau, Decennial Census, 2000; U.S. Census Bureau, American Community Survey, 1-Year Estimates,
2008 – 2010; HMI Spending data – Hawkins et al., 2013.
Line Charts, examples
Crossover in median age at first marriage and first birth: Rising
proportion of births to unmarried women, 1980-Present
28
100
27
90
80
25
60
24
50
23
40
Percent
70
30
22
20
10
0
2010
2008
2006
2004
2002
1994
1992
1990
1988
1986
1984
1982
1980
20
2000
21
1998
Proportion of births to unmarried women
Women's median age at first marriage
Women's median age at first birth
1996
Age in years
26
Sources:
1. U.S. Census Bureau, Current Population Survey, March and Annual Social and Economic Supplements, 2012 and earlier.
2. Centers for Disease Control and Prevention. National Center for Health Statistics. Vital Stats. http://www.cdc.gov/nchs/vitalstats.htm. [March 2013].
3. Martin JA, Hamilton BE, Ventura SJ, et al. Births: Final data for 2009. National vital statistics reports; vol 60 no 1. Hyattsville, MD: National Center for Health Statistics.
2011.
4. Hamilton BE, Martin JA, Ventura SJ. Births: Preliminary data for 2010. National vital statistics reports web release; vol 60 no 2. Hyattsville, MD: National Center for
Health Statistics. 2011.
Scatter Plots
Commonly used to show the relationship
between two variables e.g. correlation
Scatter Plots, example
State Math Scores and Students' TV Viewing Habits
1992 State 8th-grade NAEP Math Score
290
MN
ND
280
IA ME
NE
WI
NH
CT
UT
270
ID MA
IN
CO
WY PA
MI
NJ
MO
OK OH
TX
RI
AZ
CA
260
NY
VA
NM
MD
NC
KY
WV
TN
DE
FL
SC
GA
HI
AR
AL
LA
250
MS
240
DC
230
0
5
10
15
20
% Students watching TV 6 hrs+ per day
Source: National Center for Educational Statistics, 1994
25
30
Area Charts
Show percentage or proportional data
classified into nominal or ordinal categories
over time
Area Charts, example
Marital Status of U.S. Population Aged 15 and Older, 1970-2012
100%
80%
60%
Married
Never Married
Divorced
40%
Widowed
20%
0%
1970
1980
1990
2000
2008
2012
Source: 1970-2000 data, U.S. Census Bureau, Current Population Survey, March and Annual Social and Economic Supplements.
2008 and 2012 data, U.S. Census Bureau, American Community Survey, (IPUMS)
What are someBASIC PRINCIPLES OF
CHART DESIGN?
1. Simplify
• Minimize ink-to-data ratio
• Remove unneeded chart
elements
Gridlines
Chart borders
Axis titles
Legends
Markers & data labels
Decimal points (in axis &
data labels)
– Trend lines
– NO 3D charts!!!!!!!!!!!!!!!!!!!
–
–
–
–
–
–
• Sort data in a meaningful
way
Example of a 3D Chart:
Fathers Living with All of Their Children
Race, Ethnicity & Nativity
74%
80%
62%
70%
49%
All
White
fathers
Black
NB
Hispanic
FB
Hispanic
2. Color vs. Black & White
• When in doubt  black & white
• Color can help tell a story
• Color = branding (e.g. CFDR, NCFMR,
BGSU)
– Use a cohesive and consistent color palette
– Be mindful of how audience will view
• Excel vs. Word vs. PDF
• Color vs. B&W print copy
3. Do NOT Use Distorted Charts
• Do NOT misrepresent your data!
• Use appropriate and consistent axis and
scales
4. Present Related Charts
Simultaneously
• One-after-another or side-by-side if
possible
• Emphasizes importance of appropriate
axis and scales
5. Know Your Audience
• Academics vs. lay folks
• Undergraduate students vs. graduate
students
• Graduate students vs. professors
• PAA presentation vs. job talk
6. TMC = TMI
• Too many charts (TMC) is as bad as too
much information (TMI)  Do NOT
overload your audience!
Let’s apply some principles:
Which is easier to understand?
Predicted Marriage & Divorce Rates
Predicted Marriage & Divorce Rates
Given State-level HMI Spending
Rates per 1,000 at Risk Population
Upper Bounds (UB) & Lower Bounds (LB)
UB
Mar Rt
LB
UB
Div Rt
LB
50
45
40
35.97
35
30
25
24.25
20
15
10
5
0
$0.00
$10.00 $20.00 $30.00 $40.00
HMI Spending Per Population at Risk of
Marriage or Divorce
Mar Rt
Div Rt
45
40
35 39
36
30
25
24***
20
15 19
10
5
0
$0
$8
$16
$24
Sources: U.S. Census Bureau, American Community Survey, 2008-2011; HMI spending data– Hawkins et al., 2013.
$32
7. Do you need a chart?
million
$117
The amount annual
HMI spending in the
U.S. increased from
2000 – 2010
Sources: U.S. Census Bureau, American Community Survey, 2008-2011; HMI
spending data– Hawkins et al., 2013.
How do ICHART INTERACTION
EFFECTS?
Logistic Regression Predicting Ever
Marrying
• An interaction between a categorical and
continuous predictor (DeMaris 2004, p 143):
E(Y) = β0 + δ1Black + β1Parity + ϒ1Black*Parity
–
–
–
–
The subpop consists of only White and Black women
Black is a dummy variable
Parity indicates number of live births, range 0-15
Analyses is weighted
Logistic Regression Predicting Ever
Marrying, cont.
• Stata Output for Full Model:
. svy, subpop(blkwht): logistic evermar black PARITY PARITYblk, coef
(running logistic on estimation sample)
Survey: Logistic regression
Number of strata
Number of PSUs
=
=
56
152
Number of obs
=
12279
Population size
= 61754741
Subpop. no. of obs =
8568
Subpop. size
= 45835139
Design df
=
96
F(
3,
94)
=
186.25
Prob > F
=
0.0000
-----------------------------------------------------------------------------|
Linearized
evermar |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------black | -.4698438
.1172022
-4.01
0.000
-.7024885
-.2371992
PARITY |
1.458909
.0707637
20.62
0.000
1.318444
1.599374
PARITYblk | -.9253343
.0978554
-9.46
0.000
-1.119576
-.7310928
_cons | -.8652098
.0616793
-14.03
0.000
-.9876423
-.7427772
------------------------------------------------------------------------------
Logistic Regression Predicting Ever
Marrying, cont.
• Table of Results
Logistic Regression Predicting Ever Marrying
Model 1
(Zero-Order)
Black
Parity
Black X Parity
Constant
Coef.
SE
-0.854 0.325 ***
1.040 0.054 ***
Model 2
Coef.
SE
-1.589 0.113 ***
1.150 0.053 ***
-0.679 0.06 ***
Model 3
(Full)
Coef.
-0.470
1.459
-0.925
-0.865
SE
0.117 ***
0.071 ***
0.098 ***
0.062 ***
Logistic Regression Predicting Ever
Marrying, cont.
• Equation for Full Model
E(Y) = β0 + δ1Black + β1Parity + ϒ1Black*Parity
• Equation for Black Women
E(Y) = β0 + δ1 + β1Parity + ϒ1*Parity
• Equation for White Women
E(Y) = β0 + β1Parity
• Now, Plug and Play in Excel!
Logistic Regression Predicting Ever
Marrying, cont.
Unformatted
Formatted
25.000
Effect of Parity on Ever Marrying for
Black and White Women
20.000
Black Women
White Women
25
15.000
20
Black Women
10.000
White Women
15
10
5
5.000
0
0.000
1 2 3 4 5 6 7 8 9 10111213141516
-5
-10
-5.000
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Parity