6.1.1 Deriving OLS

OLS is obtained by minimizing the sum of squared errors. This is done by setting the partial derivatives to zero:

$$\min_{\hat{\beta}_1,\hat{\beta}_2} \sum_{i=1}^{N} \hat{e}_i^2, \qquad \text{where } \hat{e}_i = Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i$$

$$\frac{\partial \sum \hat{e}_i^2}{\partial \hat{\beta}_1} = -2\sum_{i=1}^{N}\left(Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i\right) = 0$$

$$\frac{\partial \sum \hat{e}_i^2}{\partial \hat{\beta}_2} = -2\sum_{i=1}^{N}\left(Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i\right)X_i = 0$$

These simplify to:

$$\sum \left(Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i\right) = 0 \quad\text{or}\quad \sum \hat{e}_i = 0 \tag{1a}$$

$$\sum \left(Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i\right)X_i = 0 \quad\text{or}\quad \sum \hat{e}_i X_i = 0 \tag{2a}$$

These can be expressed in their normal-equation form:

$$\sum Y_i = n\hat{\beta}_1 + \hat{\beta}_2 \sum X_i \tag{1b}$$

$$\sum Y_i X_i = \hat{\beta}_1 \sum X_i + \hat{\beta}_2 \sum X_i^2 \tag{2b}$$

Notice that we have two equations in two unknowns ($\hat{\beta}_1$ and $\hat{\beta}_2$); all other components come from our data set. Solving this system gives the OLS estimates:

$$\hat{\beta}_2 = \frac{\sum_{i=1}^{N}(X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{N}(X_i - \bar{X})^2} = \frac{\sum_{i=1}^{N}(X_i - \bar{X})\,Y_i}{\sum_{i=1}^{N}(X_i - \bar{X})\,X_i}$$

$$\hat{\beta}_1 = \bar{Y} - \hat{\beta}_2 \bar{X}$$

Finally, we check the second derivative to confirm that this is a minimum.

6.2 OLS Estimators and Goodness of Fit

On average, OLS works well:
- The average of the estimated errors is zero.
- The average of the fitted Y's is always the average of the observed Y's.

Proof: From equation (1a) in our derivation of OLS, $\sum \hat{e}_i = 0$, so $\sum (Y_i - \hat{Y}_i) = 0$, and therefore

$$\frac{\sum Y_i}{N} = \frac{\sum \hat{Y}_i}{N}, \quad\text{i.e.}\quad \bar{Y} = \bar{\hat{Y}}.$$
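To make the derivation concrete, here is a minimal numerical sketch (not part of the original notes) that computes $\hat{\beta}_1$ and $\hat{\beta}_2$ from the closed-form expressions above and checks the properties from Section 6.2. The variable names and simulated data are illustrative assumptions.

```python
# Minimal sketch, assuming simulated data; names (x, y, b1_hat, b2_hat)
# are illustrative, not from the original notes.
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: Y = 2 + 3*X + noise
x = rng.uniform(0, 10, size=100)
y = 2.0 + 3.0 * x + rng.normal(0, 1, size=100)

x_bar, y_bar = x.mean(), y.mean()

# Slope: sum of cross-deviations over sum of squared deviations of X
b2_hat = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)

# Intercept: forces the regression line through (x_bar, y_bar)
b1_hat = y_bar - b2_hat * x_bar

# Fitted values and residuals
y_hat = b1_hat + b2_hat * x
e_hat = y - y_hat

# Property from (1a): residuals sum to zero (up to floating-point error)
print(np.isclose(e_hat.sum(), 0.0))        # True

# Hence the mean of the fitted Y's equals the mean of the observed Y's
print(np.isclose(y_hat.mean(), y.mean()))  # True

# Property from (2a): residuals are orthogonal to X
print(np.isclose(np.sum(e_hat * x), 0.0))  # True
```

The two printed checks at the end are exactly the first-order conditions (1a) and (2a): they hold by construction for any data set, not just this simulated one.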