THE LINE OF BEST FIT VIA TRANSFORMATIONS DAVID G RADCLIFFE Given a sequence of n points in the plane (X1 , Y1 ), . . . , (Xn , Yn ) we seek the linear equation y = a + bx that approximates the P points as closely as possible, in that the sum of the squared errors E = ni=1 (Yi − a − bXi )2 is minimized. We assume that not all of the points lie on a single horizontal or vertical line. InPthat case, wePcan apply P P a transformation to the points so that xi = yi = 0 and x2i = yi2 = 1. The transformation is defined by Xi − X Yi − Y xi = qP and yi = qP . 2 2 (Xi − X) (Yi − Y ) Let r = P xi yi . Then X E= (yi − a − bxi )2 X = (yi2 + a2 + b2 x2i − 2ayi − 2bxi yi + abxi ) X X X X X X = yi2 + a2 + b2 x2i − 2ayi − 2bxi yi + abxi = 1 + na2 + b2 − 2br = (1 − r2 ) + na2 + (b − r)2 . The sum is minimized when a = 0 and b = r, so the line of best fit is y = rx. What a simple equation! Unfortunately, the equation is a bit messier when expressed in terms of the original variables. P (X − x − y−Y X)(Y − Y ) X i i q qP = qP P P 2 2 2 2 (Yi − Y ) (Yi − Y ) (Xi − X) (Xi − X) P (Xi − X)(Yi − Y ) y−Y = (x − X) . P (Xi − X)2 Date: December 25, 2014.
© Copyright 2026 Paperzz