MATHEMATICS AND COMPUTER EDUCATION, Vol. 26, No. 2, Spring 1992

BOUNDED POPULATION GROWTH: A CURVE FITTING LESSON

by John H. Mathews
California State University
Fullerton, California

The logistic curve is presented as a mathematical model for population growth in calculus, differential equations and mathematical modeling textbooks. However, there is seldom a development of how to obtain the coefficients for this model. The purpose of this article is to present two methods for fitting the logistic curve to data supplied by the U.S. Census Bureau. The computer algebra software Mathematica is used in this article to carry out the computations and plot graphs. However, it is not essential to use such sophisticated software. It is easy for students to obtain the solution using the "data linearization" method on a pocket calculator such as the Hewlett-Packard 20S or Texas Instruments 30SLR.

Let $p(t)$ denote the population at time $t$. Assume that $\lim_{t \to \infty} p(t) = L$. The differential equation relating $p(t)$ and $p'(t)$ is:

$$p'(t) = r\,p(t)\left[1 - \frac{p(t)}{L}\right] \tag{1}$$

The parameter $r$ is the initial rate of increase when $p(t)$ is small, and $L$ is the carrying capacity or limiting population as $t \to \infty$. The form of the solution to (1) is known to be:

$$p(t) = \frac{L}{1 + c\,e^{r(t - t_0)}} \tag{2}$$

The value $t_0$ is usually chosen to be a convenient starting date such as 1900. We want to fit the curve of the form (2) to a given set of data $(t_0, p_0), (t_1, p_1), \ldots, (t_n, p_n)$. The model (2) involves three unknown parameters $r$, $c$ and $L$. However, by judiciously selecting a value for $L$, we can reduce the problem to finding just two parameters $r$ and $c$. This will permit us to use the technique known as "data linearization," which reduces the problem to finding a least squares line. First, rearrange (2) in the form:

$$\frac{L}{p} - 1 = c\,e^{r(t - t_0)}, \quad \text{where } p = p(t). \tag{3}$$

Now take the logarithm of both sides of (3) and obtain:

$$\ln\!\left(\frac{L}{p} - 1\right) = \ln(c) + r\,(t - t_0). \tag{4}$$
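Before the change of variables, it may help to check the sign convention used in (2). The short calculation below is a sketch added for illustration and is not taken from the article. It differentiates (2) directly, using the auxiliary function $u(t) = c\,e^{r(t - t_0)}$, a name introduced here only for convenience.

```latex
% Let u(t) = c e^{r (t - t_0)}, so that (2) reads p = L / (1 + u) and u' = r u.
\[
\begin{aligned}
p'(t) &= -\frac{L\,u'(t)}{\bigl(1+u(t)\bigr)^{2}}
       = -r\,\frac{L}{1+u}\cdot\frac{u}{1+u}, \\
1-\frac{p(t)}{L} &= 1-\frac{1}{1+u} = \frac{u}{1+u}, \\
\text{hence}\quad p'(t) &= -r\,p(t)\left[1-\frac{p(t)}{L}\right].
\end{aligned}
\]
```

Read this way, the curve (2) with exponent $r(t - t_0)$ satisfies (1) with $r$ replaced by $-r$. A growing population therefore corresponds to a negative value of $r$ in (2), which is the sign obtained for $r$ in both examples.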
Next, we introduce the change of variables

$$T = t - t_0, \quad P = \ln\!\left(\frac{L}{p} - 1\right), \quad B = \ln(c) \quad \text{and} \quad A = r. \tag{5}$$

This will transform equation (4) into the linear form:

$$P = F(T) = A\,T + B. \tag{6}$$

The data points to be used in (6) are the transformed pairs:

$$(T_k, P_k) = \left(t_k - t_0,\ \ln\!\left(\frac{L}{p_k} - 1\right)\right) \quad \text{for } k = 0, 1, \ldots, n. \tag{7}$$

When the least squares line is fit to the data (7), the coefficients $A$ and $B$ in equation (6) are obtained. This computation is carried out automatically on almost any pocket calculator, by statistical software such as Minitab, by a spreadsheet, or by a computer algebra system such as Mathematica. Finally, the coefficients $c$ and $r$ are calculated:

$$c = e^{B} \quad \text{and} \quad r = A. \tag{8}$$

The 1990 U.S. census has been taken; however, the final figures will not be available until the fall of 1991 or later. We shall show how to fit data to the above model and use formula (2) to estimate the population in the years 1990 through 2050. One must be cautious about extrapolating too far into the future, because immigration into the U.S., wars, advances in medical technology, and so forth all affect population growth (see references [1], [2] and [4]). Numerical methods for fitting data to a logistic curve are discussed in [3].

For the "data linearization" method, presented in Example 1, we have assigned the limiting population as $\lim_{t \to \infty} p(t) = L = 1000$. For the "least squares fit" method, presented in Example 2, the computer determines the parameter $L = 1481.53$. The former requires only a working knowledge of the least squares line, which is programmed into almost every modern pocket calculator. The latter method involves the more sophisticated problem of minimizing the sum of the squares of the residuals and requires a minimization subroutine for functions of several variables. The solutions obtained from the two methods will differ, but their values for the near future ($1990 \le t \le 2000$) are almost the same.

Example 1.
Fit the curve $p(t) = \dfrac{L}{1 + c\,e^{r(t - t_0)}}$ to the points (1900, 75.995), (1910, 91.972), (1920, 105.711), (1930, 122.755), (1940, 131.699), (1950, 150.697), (1960, 179.323), (1970, 203.212) and (1980, 226.505), which are the census figures for the population of the U.S. (in millions). Assume that $L = 1000$ and use the method of "data linearization."

Solution. Enter the nine data points into the two-dimensional array tps:

    tps = {{1900, 75.995}, {1910, 91.972}, {1920, 105.711},
           {1930, 122.755}, {1940, 131.699}, {1950, 150.697},
           {1960, 179.323}, {1970, 203.212}, {1980, 226.505}}

Set L = 1000 and proceed with the change of variables. In Mathematica this is accomplished by first forming the transpose of the data:

    L = 1000
    trans = Transpose[tps]

Then the list of first coordinates trans[[1]] is selected and placed in the variable ts:

    ts = trans[[1]]
    {1900, 1910, 1920, 1930, 1940, 1950, 1960, 1970, 1980}

Similarly, the list of second coordinates trans[[2]] is placed in ps:

    ps = trans[[2]]
    {75.995, 91.972, 105.711, 122.755, 131.699, 150.697, 179.323, 203.212, 226.505}

Now the change of variables indicated in (5) is performed. The value $t_0 = 1900$ is used:

    Ts = ts - 1900
    {0, 10, 20, 30, 40, 50, 60, 70, 80}

    Ps = Log[L/ps - 1]
    {2.49805, 2.28979, 2.13532, 1.9666, 1.88602, 1.72914, 1.52094, 1.36634, 1.22815}

The transformed pairs $(T_k, P_k)$ are then placed in the two-dimensional array TPs:

    TPs = Transpose[{Ts, Ps}]
    {{0, 2.49805}, {10, 2.28979}, {20, 2.13532}, {30, 1.9666}, {40, 1.88602},
     {50, 1.72914}, {60, 1.52094}, {70, 1.36634}, {80, 1.22815}}

Mathematica's subroutine Fit is used to obtain the least squares line in (6) in the transformed $(T, P)$ plane:

    F[T_] = Fit[TPs, {1, T}, T]
    2.46778 - 0.0155269 T

A plot of the transformed data (7) and the line (6) is useful in understanding the process, and is shown in Figure 1.
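The line returned by Fit can also be checked by hand. The following sketch is not part of the original article. It evaluates the standard closed-form formulas for the slope and intercept of a least squares line, which is essentially the computation performed by the linear regression key on a pocket calculator. The names n, sT, sP, sTT, sTP, slopeA and interceptB are illustrative, and the use of Total and of the part specification All assumes a more recent version of Mathematica than the one available when the article was written.

```mathematica
(* Census data from Example 1: {year, population in millions} *)
tps = {{1900, 75.995}, {1910, 91.972}, {1920, 105.711},
       {1930, 122.755}, {1940, 131.699}, {1950, 150.697},
       {1960, 179.323}, {1970, 203.212}, {1980, 226.505}};
L = 1000;                        (* limiting population assumed in Example 1 *)

(* Change of variables (5) with t0 = 1900 *)
Ts = tps[[All, 1]] - 1900;       (* T = t - t0 *)
Ps = Log[L/tps[[All, 2]] - 1];   (* P = ln(L/p - 1) *)

(* Sums that appear in the normal equations for P = A T + B *)
n   = Length[Ts];
sT  = Total[Ts];
sP  = Total[Ps];
sTT = Total[Ts^2];
sTP = Total[Ts*Ps];

(* Closed-form slope and intercept of the least squares line *)
slopeA     = (n*sTP - sT*sP)/(n*sTT - sT^2);
interceptB = (sP - slopeA*sT)/n;

{slopeA, interceptB}
```

Evaluating the last line should give values close to A = -0.0155269 and B = 2.46778, the coefficients reported by Fit above.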
The Mathematica commands used to create the plot are:

    graph1 = Plot[F[T], {T, 0, 80}, PlotRange -> {0, 4.5}];
    dots = ListPlot[TPs, PlotStyle -> {PointSize[0.02]}];
    Show[graph1, dots, AxesLabel -> {"T", "P"}];

Figure 1. The least squares line for the transformed data.

Set $c = e^{B}$ and $r = A$ to get the coefficients of $p(t) = \dfrac{L}{1 + c\,e^{r(t - t_0)}}$ back in the original $(t, p)$ plane.

    c = Exp[2.46778]
    11.7962

    r = -0.0155269
    -0.0155269

Now the function p(t) is defined with the Mathematica command:

    p[t_] = L/(1 + c Exp[r (t - 1900)])
    1000/(1 + 11.7962/E^(0.0155269 (-1900 + t)))

A plot of the logistic curve p(t) is given in Figure 2. This was accomplished by typing:

    graph2 = Plot[p[t], {t, 1890, 2000}, PlotRange -> {0, 300}];
    dots = ListPlot[tps, PlotStyle -> {PointSize[0.02]}];
    Show[graph2, dots, AxesLabel -> {"t", "p"}];

Figure 2. The logistic curve fit p(t) for the population data.

The population estimate p(1990) = 255.335 is obtained with this formula. Furthermore, it is also worthwhile to graph the curve p(t) over a larger range of values so that the inherent "S" shape is visible. This is given in Figure 3.

Figure 3. Extrapolation using the logistic curve p(t).

We now investigate how well our guess $L = 1000$ actually fits the data. The next example shows how to find $L$ by solving the "least-squares" problem when all three parameters $r$, $c$ and $L$ are determined numerically by the computer.

Example 2. Fit the curve $f(t) = \dfrac{L}{1 + c\,e^{r(t - t_0)}}$ to the data points in Example 1 by finding the parameters $c$, $r$ and $L$ which minimize the quantity:

$$E(r, c, L) = \sum_{k=1}^{n} \left[\frac{L}{1 + c\,e^{r(t_k - t_0)}} - p_k\right]^2 \tag{9}$$

Solution.
Clear the variables r, c, L and f and form f(t):

    Clear[f, r, c, L];
    f[t_] = L/(1 + c Exp[r (t - 1900)])
    L/(1 + E^(r (-1900 + t)) c)

The quantity E(r, c, L) is formed and stored in the variable sum with the command:

    sum := Sum[(f[ts[[i]]] - ps[[i]])^2, {i, 1, 9}]
    (-75.995 + L/(1 + c))^2 + (-91.972 + L/(1 + E^(10 r) c))^2 +
    (-105.711 + L/(1 + E^(20 r) c))^2 + (-122.755 + L/(1 + E^(30 r) c))^2 +
    (-131.699 + L/(1 + E^(40 r) c))^2 + (-150.697 + L/(1 + E^(50 r) c))^2 +
    (-179.323 + L/(1 + E^(60 r) c))^2 + (-203.212 + L/(1 + E^(70 r) c))^2 +
    (-226.505 + L/(1 + E^(80 r) c))^2

Then the parameters are found by invoking Mathematica's minimization routine:

    sol = FindMinimum[sum, {r, -0.0146982}, {c, 17.8357}, {L, 1481.53},
            AccuracyGoal -> 19, WorkingPrecision -> 19, MaxIterations -> 250]
    {80.1745, {r -> -0.0146898, c -> 17.8261, L -> 1481.53}}

Notice that the value $L = 1481.53$ is involved in the solution to the "least-squares" problem. The population function q(t) corresponding to these parameters is:

    q[t_] = f[t] /. sol[[2]]
    1481.53/(1 + 17.8261/E^(0.0146898 (-1900 + t)))

Graphs of q(t) over the intervals [1900, 2000] and [1900, 2030] are shown in Figures 4 and 5, respectively. For the near future ($1990 \le t \le 2000$) the functions p(t) and q(t) agree within 1.5%. As mentioned in the introduction, it is difficult to extrapolate too far into the future. The values p(2030) = 389.531 and q(2030) = 406.953 differ by about 4.5%.

Figure 4. The logistic curve fit q(t) for the population data.

Figure 5. Extrapolation using the logistic curve q(t).

In conclusion, a comparison of the original data and the logistic curves p(t) and q(t) is given in Table 1. These examples illustrate how curve fitting is used in a simple mathematical model. If the limiting value of L is known, or can be determined by other means, then the first method should be used and p(t) is used to fit the data and to extrapolate.
If L is not known, but can be estimated, then the second method can be used to determine L and the function q(t) is used to model the data.

Percent errors are computed as 100 (p_k - p(t_k))/p_k and 100 (p_k - q(t_k))/p_k.

    Year    Observed      Prediction   Percent error   Prediction   Percent error
    t_k     population    p(t_k)       in p(t_k)       q(t_k)       in q(t_k)
            p_k
    ----    ----------    ----------   -------------   ----------   -------------
    1900      75.995        78.148        -2.83          78.696        -3.55
    1910      91.972        90.092         2.04          90.388         1.72
    1920     105.711       103.656         1.94         103.690         1.91
    1930     122.755       118.996         3.06         118.782         3.24
    1940     131.699       136.260        -3.46         135.854        -3.16
    1950     150.697       155.587        -3.25         155.101        -2.92
    1960     179.323       177.093         1.24         176.716         1.45
    1970     203.212       200.865         1.15         200.887         1.14
    1980     226.505       226.948        -0.20         227.788        -0.57
    1990                   255.335                      257.565
    2000                   285.959                      290.335
    2010                   318.685                      326.162
    2020                   353.303                      365.056
    2030                   389.531                      406.953
    2040                   427.021                      451.714
    2050                   465.369                      499.112

Table 1. Comparison of the census data p_k and values of p(t_k) and q(t_k).

REFERENCES

1. M. Braun, Differential Equations and Their Applications: An Introduction to Applied Mathematics, 2nd Ed., Springer-Verlag, New York (1978).
2. F. R. Giordano and M. D. Weir, A First Course in Mathematical Modeling, Brooks/Cole Pub. Co., Monterey, CA (1985).
3. J. H. Mathews, Numerical Methods for Computer Science, Engineering and Mathematics, Prentice-Hall, Inc., Englewood Cliffs, NJ (1987).
4. W. J. Meyer, Concepts of Mathematical Modeling, McGraw Hill, New York (1984).