Before the class starts:
• Log in to a computer
• Read the Data analysis assignment 1 on MyCourses

If you use Stata:
• Start Stata
• Start a new do file
• Open the PDF documentation about regression

If you use RStudio:
• Start RStudio
• Start a new R script
• Open R in Action, chapter 8

Maximum likelihood estimation

A better approach

Fit a curve instead of a line
• The example is the logit function

Interpretation stays the same:
• Expected value of Menarche given Age
• i.e. probability

[Figure: Menarche against Age (ages 10 to 18), with the fitted curve]

Model

Linear regression model
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + u$

Nonlinear regression model
$y = g(\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k) + u$
$g(x) = 1 / (1 + e^{-x})$

Remarks
• The inverse of g(x), f(x), is called the link function
• $f(x) = \ln(x / (1 - x))$ is the logit function
• f(x) = x reduces the model to the linear regression model

[Figure: Menarche against Age (ages 10 to 18)]

Wooldridge, J. M. (2009). Introductory econometrics: A modern approach (4th ed.). Mason, OH: South-Western, Cengage Learning. Section 17.1

Basic principle

Sample of 9 observations

Population has a Bernoulli distribution
• Only 0 and 1
• Relative frequencies of 0 and 1 unknown
• The population is very large

The estimation principle:
• Find the relative frequency that maximizes the likelihood of the sample

Observed value   Probability   Cumulative probability
0                ?             ?
0                ?             ?
0                ?             ?
0                ?             ?
0                ?             ?
0                ?             ?
0                ?             ?
1                ?             ?
1                ?             ?

Maximum likelihood estimate
Likelihood of the sample

Example

Menarche = g(-20.0 + 1.54 Age) + u

          Age    Menarche   Fitted P(Menarche = 1)   p (observed outcome)   ln(p)
Girl 1    13.6   1          73.6%                    73.6%                  -0.306
Girl 2    11.4   0          8.0%                     92.0%                  -0.083
Girl 3    12.6   1          35.2%                    35.2%                  -1.045
Girl 4    13.1   1          56.2%                    56.2%                  -0.576
Girl 5    12.6   0          34.6%                    65.4%                  -0.425
Girl 6    10.3   0          1.5%                     98.5%                  -0.015
Girl 7    10.2   0          1.3%                     98.7%                  -0.013
Girl 8    15.4   1          97.8%                    97.8%                  -0.022
Girl 9    15.2   1          96.9%                    96.9%                  -0.031
Girl 10   13.8   1          79.2%                    79.2%                  -0.233

Likelihood (product): 6.4%
Log-likelihood (sum): -2.749

Example data

A researcher is interested in how variables, such as
1. GRE (Graduate Record Exam scores),
2. GPA (grade point average) and
3. prestige of the undergraduate institution,
affect admission into graduate school. The response variable, admit/don't admit, is a binary variable. The variable rank takes on the values 1 through 4. Institutions with a rank of 1 have the highest prestige, while those with a rank of 4 have the lowest.

http://www.ats.ucla.edu/stat/stata/dae/logit.htm

Excel example

Normally distributed example

Sample of 9 observations

Population has normal distribution
• Mean and SD are estimated

Observed value   Probability density   Cumulative probability density
-0.897           0.205                 0.205
0.185            0.407                 0.084
1.588            0.16                  0.013
-1.13            0.151                 0.002
-0.08            0.386                 0.001
0.132            0.405                 0.0003
0.708            0.366                 0.0001
-0.24            0.36                  0.00004
1.984            0.085                 0.000003

Likelihood of the sample
Cumulative probability and probability density

[Figure: probability density against value]

Excel example

Data analysis assignment 2

Task

Do a moderation analysis and a mediation analysis with a statistical software package of your choice. Follow the approaches presented by Baron and Kenny (1986) and use the Prestige dataset used in class. Answer the following two research questions:
1. Are women-dominated professions rewarded less for prestigiousness than men-dominated professions?
2. To what extent can the positive relationship between education and income be mediated by prestigiousness?

You can explain either income or, if you see it necessary, the logarithm of income.

How to get your analysis file started

Stata
• Load the data following the instructions
• Explore the data using e.g. describe, summarize, inspect, codebook, graph matrix, and stem

RStudio
• Load the data following the instructions
• Load the psych, car, effects, and texreg packages by adding library commands to the start of the R file. (If a package is not found, you need to install it)
• Explore the data using e.g. describe, lowerCor, corr.test, and scatterplotMatrix

How to submit your answer

Stata
• Set your working directory
• Start your do file with log using assignment1, replace text
• End your do file with log close
• After each graph add graph export plotX.pdf
• Open the Word document template from MyCourses
• Copy-paste the content of assignment1.log to the document template and insert the exported figures into the right places
• In Word, write comments in normal style and use headings where appropriate

RStudio
• Compile a notebook in MS Word format
• In Word, write comments in normal style and use headings where appropriate

Are women-dominated professions rewarded less for prestigiousness than men-dominated professions?

Workflow of the analysis
1. Fit a model with direct effects only (done already as a part of the previous assignment)
2. Add the interaction term to the model and compare the models with a nested model test (F test)
3. Do an interaction plot
4. Interpret the results, paying particular attention to interpreting effect sizes

Stata
• Use nestreg, test, or ftest for the nested model test and margins and marginsplot for marginal effects

RStudio
• Load the effects package
• Use anova for the nested model test and effect and plot for marginal effects

To what extent can the positive relationship between education and income be mediated by prestigiousness?

Workflow of the analysis
1. Fit a model of Y on X and controls
2. Fit a model of M on X and controls
3. Fit a model of Y on X, M, and controls
4. Calculate the Sobel test (http://quantpsy.org/sobel/sobel.htm)
5. Interpret the results, paying particular attention to interpreting effect sizes

Stata
• Use the user-written sgmediation command or the online calculator

RStudio
• Calculate the Sobel test manually by calculating the z statistic and testing it with pnorm, or use the online calculator

Simulation demonstration: heteroskedasticity
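
The estimation principle from the maximum likelihood slides can be sketched in a few lines of code. The sketch below is in Python rather than Stata or R, purely as an illustration. It assumes the 9-observation Bernoulli sample shown on the slide (seven 0s and two 1s) and searches a grid of candidate relative frequencies for the one that maximizes the likelihood. It then recomputes the log-likelihood of the Menarche example from the rounded probabilities in the table, so the result should agree with the slide only up to rounding. The names bernoulli_log_likelihood, candidates, and p_observed are made up for this sketch.

```python
import math

# Sample from the "Basic principle" slide: seven 0s and two 1s.
sample = [0, 0, 0, 0, 0, 0, 0, 1, 1]


def bernoulli_log_likelihood(p, observations):
    """Sum of ln P(observation) when P(1) = p and P(0) = 1 - p."""
    return sum(math.log(p if y == 1 else 1.0 - p) for y in observations)


# Grid search over candidate relative frequencies 0.001, 0.002, ..., 0.999.
candidates = [i / 1000 for i in range(1, 1000)]
p_hat = max(candidates, key=lambda p: bernoulli_log_likelihood(p, sample))

# For a Bernoulli sample the maximum is at the sample proportion of 1s.
sample_proportion = sum(sample) / len(sample)

# Probabilities of the observed outcomes, copied from the Menarche table
# (rounded to one decimal of a percent, so sums are approximate).
p_observed = [0.736, 0.920, 0.352, 0.562, 0.654,
              0.985, 0.987, 0.978, 0.969, 0.792]
log_likelihood = sum(math.log(p) for p in p_observed)  # sum of ln(p)
likelihood = math.exp(log_likelihood)                  # product of p

print(f"grid-search estimate: {p_hat:.3f}")
print(f"sample proportion:    {sample_proportion:.3f}")
print(f"log-likelihood:       {log_likelihood:.3f}")
print(f"likelihood:           {likelihood:.3f}")
```

Working with the sum of logarithms instead of the product of probabilities gives the same maximizer and avoids very small numbers when the sample is large.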
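
Step 4 of the mediation workflow, calculating the Sobel test manually, can be sketched as follows. This is an illustrative Python version of the calculation (the slides suggest pnorm in R; here math.erfc plays that role). It assumes a is the coefficient of X in the model of M on X (step 2), b is the coefficient of M in the model of Y on X and M (step 3), and sa and sb are their standard errors. The function name sobel_test and the input numbers are hypothetical, not results from the Prestige data.

```python
import math


def sobel_test(a, sa, b, sb):
    """Return the Sobel z statistic and its two-tailed p-value.

    a, sa: coefficient and standard error of X in the model of M on X.
    b, sb: coefficient and standard error of M in the model of Y on X and M.
    """
    z = (a * b) / math.sqrt(b**2 * sa**2 + a**2 * sb**2)
    # Two-tailed p-value under the standard normal distribution:
    # 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical coefficients and standard errors, for illustration only.
z, p_value = sobel_test(a=0.5, sa=0.1, b=0.4, sb=0.1)
print(f"z = {z:.3f}, two-tailed p = {p_value:.4f}")
```

Replace the hypothetical inputs with the estimates from your own step 2 and step 3 models; the online calculator linked in the workflow can be used to cross-check the result.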