SOC-GA 2312

SOC-GA 2312-001. Advanced Multivariate Statistics
NYU Sociology Fall 2011
Time: Monday, 6:20pm – 9:00pm
Location: Room C-6, 25 West 4th Street
Instructor: Dohoon Lee
E-mail: [email protected]
Office: Room 4113, Puck 4th Floor
Office hours: Tuesday, 10:00am – noon
Course Description
This course provides a structured approach towards the development of an analytic toolkit for use in
quantitative sociological research. The two overarching goals of this course are: (1) to understand the
statistical/sociological underpinnings of advanced quantitative research methods and (2) to make
practical applications of such methods. In this course, we will cover generalized linear models,
growth curve/multilevel models, and propensity score models: (1) models for categorical and limited
dependent variables; (2) analyses of repeated measures over time and of clustered data; and (3)
potential outcomes approaches. While enough time will be devoted to discussions about why we
need these methods and what they do, the emphasis will also be placed on how we can use them to
analyze data, interpret results and critically review published research. Therefore, we will allocate
time for a short data analysis session in each class. The prerequisite is the first two semesters of
quantitative methods (SOC-GA 2331-001. Methods & Statistics I and SOC-GA 2330-001. Intro to
Methods) or the equivalent as approved by the instructor.
Course Requirements
1. 3 memos
All students are required to submit 3 memos (up to 3 pages each) that summarize and critique the
published articles of their choice (see Class Schedule for the list [M]). The due dates are October
10, November 21, and December 12. Each memo addresses one of the three topics covered in this
course, with a focus on methodological critique. Specifically, the memos should identify existing or
potential problems and provide viable alternatives on a methodological ground.
2. Final project
There are three types of final projects that students can consider (see below). Regardless of the type
chosen, the final project should employ at least one statistical method discussed in this course,
demonstrating the potential to be published in peer-reviewed journals after revision in regard to
substance and methodology. That being said, each student should convince me that his/her project is
manageable enough to be submitted on time. A one- or two-page memo outlining the topic, data, and
method is due October 17; the final paper is due December 15. All students are encouraged to meet
with me to talk about their project.
2.1. Writing a research paper (up to 15 pages): the most desirable option
2.2. Replicating a published study (up to 15 pages): e.g., Mouw and Sobel (2001); Young (2009)
2.3. Writing 2 research proposals (up to 7 pages each)
3. Class presentation
All students will have 15 minutes to present their final project and to discuss it with the class on the
last day of the course.
Software
We will use Stata unless otherwise notified.
Course Web Page
The course website is available through Blackboard. If you are enrolled in this course, you should access
the site by going to http://home.nyu.edu. Once there, go to Academics and click on the link to
Advanced Multivariate Statistics under Blackboard Courses. You will see all announcements and
course materials there, so visit frequently.
Texts
Note: * more substantive; † more practical; ‡ both
1. Required
†Long, J. Scott and Jeremy Freese. 2006. Regression Models for Categorical Dependent Variables
Using Stata, Second Edition. College Station, TX: Stata Press. [LF]
‡Singer, Judith D. and John B. Willett. 2003. Applied Longitudinal Data Analysis: Modeling Change
and Event Occurrence. New York, NY: Oxford University Press. [SW]
*Morgan, Stephen L. and Christopher Winship. 2007. Counterfactuals and Causal Inference:
Methods and Principles for Social Research. New York, NY: Cambridge University Press.
[MW]
2. Recommended
*Long, J. Scott. 1997. Regression Models for Categorical and Limited Dependent Variables.
Thousand Oaks, CA: Sage Publications.
*Raudenbush, Stephen W. and Anthony S. Bryk. 2001. Hierarchical Linear Models: Applications
and Data Analysis Methods, Second Edition. Thousand Oaks, CA: Sage Publications. [RB]
‡Gelman, Andrew and Jennifer Hill. 2006. Data Analysis Using Regression and
Multilevel/Hierarchical Models. New York, NY: Cambridge University Press.
†Guo, Shenyang Y. and Mark W. Fraser. 2010. Propensity Score Analysis: Statistical Methods and
Applications. Thousand Oaks, CA: Sage Publications.
3. Additional reading materials
Supplemental articles [R] are recommended for some classes (see Class Schedule). They are
either available through e-journals or posted on our Blackboard web page.
Class Schedule
Week 1: 9/12
Generalized Linear Models
Overview
What are GLMs?; When do we need them?
Binary Outcomes
Linear probability, logit, and probit models
[LF] Ch. 4
[M] Brooks and Manza (1997); Browne (1997)
Week 2: 9/19
Ordinal Outcomes
Ordered logit and ordered probit models
[LF] Ch. 5
[M] Espenshade and Fu (1997)
Week 3: 9/26
Nominal Outcomes
Multinomial logit models
[LF] Ch. 6
[M] Lewis and Oppenheimer (2000)
Week 4: 10/3
Count Outcomes
Poisson and negative binomial regressions
[LF] Ch. 7
[M] Kalleberg, Reskin, and Hudson (2000); Minkoff (1997)
Week 5: 10/10
NO MEETING
> 1st memo due
Week 6: 10/17
Growth Curve/Multilevel Models
Overview
What are GCM/HLM?; When do we need them?
[R] Karney and Bradbury (1995)
Growth Curve Models 1
Longitudinal data on change; Within- and between-individual change
[SW] Chs. 1, 2, and 3
> Final project memo due
Week 7: 10/24
Growth Curve Models 2
Longitudinal data analysis of individual change; Flexible treatment of
time; Nonlinear change
[SW] Chs. 4, 5, and 6
Week 8: 10/31
Growth Curve Models 3
Latent growth curve modeling
[SW] Chs. 7 and 8
[M] Cherlin, Chase-Lansdale, and McRae (1998); Downey, von
Hippel, and Broh (2004)
Week 9: 11/7
Multilevel Models 1
History of HLM; Random coefficients model
[RB] Chs. 1, 2, 3, and 4
Week 10: 11/14
Multilevel Models 2
Applications of HLM
[RB] Ch. 5
[M] Gamoran (1992); Grodsky and Pager (2001)
Week 11: 11/21
Propensity Score Models
Counterfactual Approaches
Causality; Counterfactual model
[MW] Chs. 1, 2, and 3; [R] Holland (1986); LaLonde (1986)
> 2nd memo due
Week 12: 11/28
Matching
Origins and motivations; Matching as what?
[MW] Ch. 4; [R] Rosenbaum and Rubin (1983); D’Agostino (1998); Ho et al. (2007)
[M] Bingenheimer, Brennan, and Earls (2005); Brand and Xie (2010); Dehejia and
Wahba (1999)
Week 13: 12/5
Weighting
Origins and motivations; Marginal structural models
[R] Kurth et al. (2005)
[M] Crosnoe (2009); Mincy, Hill, and Sinkewicz (2009)
Week 14: 12/12
Class Presentation
> 3rd memo due
> Final project due by 5pm, 12/15 (please drop your paper in my mailbox)