Geog410 Modeling of Environmental Systems
Lab3 Introduction to Matlab III
Date Due: 5pm Sep, 18th, 2008

1. Goals
(1) To learn how to import data into and export data from Matlab.
(2) To learn how to do Numerical Integration with Matlab.
(3) To learn how to perform simple linear regression analysis with Matlab.

2. Import/Export
The basic data arrangement in Matlab is in columns and/or rows, called arrays in computer language. The numbers of rows and columns are called the data dimension. In many other types of software, you have to define the data dimension before data import/export, but you don't have to do this in Matlab.

In the first Lab we learned how to provide data to Matlab by typing it on screen. However, manual input only works for very small data volumes. If we have thousands of input values saved in a file, it is much more convenient and efficient to read the data directly from the file instead of punching the keyboard. The function in Matlab for data import is

>> matlab_variable=dlmread('text_filename');

Note: dlmread() is the function for reading an external text file. The name inside the single quotes is the name of a text file in which the numbers are delimited by either commas (,) or spaces. The file MUST be located in the current directory, as indicated at the top of the Matlab window. "matlab_variable" is the name of the variable that holds the data in Matlab. The semicolon at the end suppresses the output on screen. If your files are big, it is important to include it.

Example:
>>x=dlmread('myfile.txt');

The function for data export is

>>dlmwrite('text_filename', matlab_variablename);

Note: "text_filename" is the name of the external file to which the contents of "matlab_variablename" will be written. It is a comma-delimited text file. I recommend that you name the file filename.txt. Adding .txt as the extension helps you recognize the data type later on.

Copy the file "theta.txt" in the data/ directory to your geog410 student folder.
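Before working with the lab data, the two functions described above can be tried on a tiny array. This is only a minimal sketch: the matrix A and the file name demo_data.txt are made up for illustration and are not part of the lab data.

```matlab
% Write a small matrix to a comma-delimited text file, then read it back.
% 'demo_data.txt' is a hypothetical file name used only for this sketch.
A = [1 2 3; 4 5 6];              % 2 rows, 3 columns
dlmwrite('demo_data.txt', A);    % default delimiter is a comma
B = dlmread('demo_data.txt');    % dimensions are inferred from the file
disp(size(B))                    % number of rows and columns read back
disp(isequal(A, B))              % 1 if the round trip preserved the values
delete('demo_data.txt');         % remove the demo file again
```

Note that the dimension of B is never declared; dlmread works it out from the rows and delimiters in the file.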
Start Matlab, and set the current directory to your student folder. Then import theta.txt into Matlab by issuing the following command:

>>theta=dlmread('theta.txt')

You should see the contents of theta on the screen, as we did not put a semicolon at the end. Use wordpad to open theta.txt from your student folder; you should see that theta.txt has the same content as you have on the screen. "theta.txt" contains 100 angles in radians, equally spaced between 0 and 2π. Now issue the following commands:

>> x=cos(theta);   % take the cosine of the angles and assign the output to x
>> y=sin(theta);   % take the sine of the angles and assign the output to y

Now we can export x and y as follows:

>>dlmwrite('x.txt',x);
>>dlmwrite('y.txt',y);

You should see two new files named x.txt and y.txt in your student folder. If you type

>>x
>>y

you should see the content of x and y. Please use wordpad to open x.txt and y.txt from your student folder. You should see the same content as you have in the Matlab window.

The contents of x and y are now in two separate files. Sometimes we want them to be two columns in a single file. We can combine them into an ARRAY with two columns of data, with column 1 being x and column 2 being y.

>>xy(:,1)=x;
>>xy(:,2)=y;

The above commands assign x to the first column of xy (xy is the name of a new Matlab variable), and y to the second column of xy. Please note the colon and the comma in the parentheses. For any array in Matlab, we can reference any element of the array by its row and column as

>>variable(row,col)

For example, xy(50,1) finds the element in the 50th row and the first column of the xy array.

>> xy(50,1)
ans =
   -0.9995

If you want to reference an entire column, use

>> xy(:,1)

or

>>xy(:,2)

If you want to reference an entire row, use

>>xy(50,:)

Now the Matlab variable "xy" has two columns, x and y.
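The indexing rules above can be checked on an array small enough to read at a glance. The following is a minimal sketch, assuming 5 angles instead of the 100 in theta.txt; the use of linspace and the variable names first_col, third_row and one_elem are illustrative choices, not part of the lab.

```matlab
% Build a small two-column array and index into it.
theta = linspace(0, 2*pi, 5)';   % 5 equally spaced angles as a column (assumed size)
x = cos(theta);
y = sin(theta);

xy = [];                         % start from an empty array
xy(:,1) = x;                     % column 1 is x
xy(:,2) = y;                     % column 2 is y

first_col = xy(:,1);             % entire first column (all rows)
third_row = xy(3,:);             % entire third row (all columns)
one_elem  = xy(3,1);             % row 3, column 1, i.e. cos(pi)

disp(size(xy))                   % rows and columns of the combined array
disp(one_elem)
```

Here the colon stands for "all rows" or "all columns", so xy(:,1) returns a 5-by-1 column and xy(3,:) returns a 1-by-2 row.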
We can export the two columns at one time into a single file as

>>dlmwrite('xy.txt',xy);

You should see a new file named xy.txt in your student folder. Please use wordpad to open it and see its content.

Exercises 1
1. Data Export: for the line $y = 2x\ [?]\ 1$, $0 \le x \le 10$, take 100 linearly spaced points in the given interval for x (refer to the last lab instruction for how to do this) and calculate y for each x. Combine x and y into an array, and save the coordinates of x and y to a file named 'array_line.txt' in your geog410 student folder. (Tips: x=0:0.1:10 gives 101 points; linspace(0,10,100) gives exactly 100.)
2. Data Import: Import the data in the text file 'xy.txt' you just created and plot the first column as the horizontal axis and the second column as the vertical axis. Save your figure in your geog410 student folder as lab3-figure.tif. Include the figure in your lab report.

3. Integration
We define $f(x)$ as a continuous function on the interval $[a, b]$. The area enclosed in Figure 1 by the curve $y = f(x)$, the x-axis, and the two vertical lines at $x = a$ and $x = b$ is called a trapezium. Definite integration of $f(x)$ over $[a, b]$, $\int_a^b f(x)\,dx$, calculates the area of the enclosed region.

We can divide the interval $[a, b]$ into n small intervals. The dividing points are

$$a = x_0 < x_1 < x_2 < \dots < x_n = b.$$

The whole trapezium is thus divided into n small trapeziums. For the kth trapezium, the length of the base is $\Delta x_k = x_k - x_{k-1}$, and the area of the small trapezium can be approximated by the area of a rectangle, calculated as

$$S_k = f(x_k)\,\Delta x_k.$$

The area of the whole trapezium is approximated by the sum of the areas of all the small trapeziums:

$$S = \sum_{k=1}^{n} S_k = \sum_{k=1}^{n} f(x_k)\,\Delta x_k.$$

Fig. 1 The basic principle of numerical integration

As the number of dividing points increases, the sum $S = \sum_{k=1}^{n} S_k$ gets increasingly close to the area of the whole enclosed region.

Let's work through an example using the function $y = f(x) = e^x$, $0 \le x \le 3$. The process to calculate the integration of the function on the interval [0, 3] is below:

>> x=0:0.01:3;

The above command assigns to x the values starting from 0 and increasing in steps of 0.01 all the way to 3. This assigns 301 values to x (300 steps plus the starting point).

>>y=exp(x);

If we plot x and y,

>> plot(x,y)

this is what we get:

[Figure: plot of y = exp(x) for x from 0 to 3]

If we divide the x axis into 300 equal intervals with a step Δx=0.01, then the area of the small trapezium enclosed between any x and x+Δx is approximately 0.01*y. The total area below the entire curve would be

>>A=sum(0.01*y)

This provides a numerical integration of $\int_0^3 e^x\,dx$. Because x holds 301 points, this sum contains one more rectangle than the 300 intervals; sum(0.01*y(2:end)) counts exactly one rectangle per interval.

Note: You can repeat the above example with Δx=0.1, or 0.001, etc. The smaller the step size, the more accurate the integration.

Exercises 2
1. Calculate the integration: $y = f(x) = \ln(x\ [?]\ 1)$, $1 \le x \le 10$ (Tip: ln(x) in MATLAB is log(x)). Include all the MATLAB commands and the final result in your report.
2. Calculate the integration: $y = f(x) = \dfrac{x^2 - 1}{x^2 + 1}$, $2 \le x \le 6$. Include all the MATLAB commands and the final result in your report. (Tips: y=(x.^2-1)./(x.^2+1);)

4. Simple Linear Regression Analysis
In regression analysis, the variable which is to be estimated is called the dependent variable, usually denoted as y. The variable which is used to estimate the dependent variable is called the independent variable, usually denoted as x. The simple linear regression model between x and y can be expressed as

$$y = a + bx + \varepsilon,$$

where a and b are called regression coefficients, and $\varepsilon$ is the error term. Our goal is to estimate a and b as well as the coefficient of determination ($R^2$). For notational convenience, we usually denote

$$\hat{y} = a + bx.$$

The regression coefficients, a and b, are estimated such that the sum of the squared differences between the observations ($y_i$) and the estimates ($\hat{y}_i$) is the smallest.
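The least-squares criterion just described can be written out explicitly. This is a sketch of the standard derivation, not taken from the lab text; the symbol $Q$ for the sum of squared differences is a name introduced here for illustration.

```latex
% Q is a hypothetical name for the sum of squared differences.
Q(a, b) = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2
        = \sum_{i=1}^{n} (y_i - a - b x_i)^2

% Setting the partial derivatives of Q to zero gives the two normal equations:
\frac{\partial Q}{\partial a} = 0
  \;\Rightarrow\; \sum_{i=1}^{n} y_i = n a + b \sum_{i=1}^{n} x_i

\frac{\partial Q}{\partial b} = 0
  \;\Rightarrow\; \sum_{i=1}^{n} x_i y_i = a \sum_{i=1}^{n} x_i + b \sum_{i=1}^{n} x_i^2
```

Solving these two equations for a and b yields the closed-form estimates of the regression coefficients.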
Mathematically, the solution for the regression coefficients is

$$b = \frac{n\sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}, \qquad a = \bar{y} - b\bar{x}$$

and the coefficient of determination is

$$R^2 = \frac{\sum (\hat{y}_i - \bar{y})^2}{\sum (y_i - \bar{y})^2}, \quad \text{where } 0 \le R^2 \le 1.$$

The closer $R^2$ is to 1, the stronger the linear relationship between x and y. The closer $R^2$ is to 0, the weaker the linear relationship.

We do not have to remember how to calculate the coefficients; MATLAB provides functions which can calculate them easily.

First, please copy the file "mgt-ndvi0082.txt" from the data/ directory to your geog410 student folder. This file contains the change in vegetation index measured from satellites (first column) and the times of increase in migration (second column) from 1982 to 2000 for a dozen eastern provinces in China. These data are from Dr. Song's research (Song, C., Lord, W. J., Zhou, L. and Xiao, J. 2008. Empirical Evidence for Impacts of Internal Migration on Vegetation Dynamics in China from 1982 to 2000. Sensors, 8: 5069-5080; DOI: 10.3390/s8085069).

Second, let's import the data into Matlab as

>>mgtndvi=dlmread('mgt-ndvi0082.txt')

You should be able to see the data on the screen since we did not put a semicolon at the end. We will use the first column as the dependent variable (y), and the second column as the independent variable (x). In Matlab, the dependent variable has to be in a single-column matrix, and the independent variable (x) has to be in a two-column matrix, with the first column being ones and the second column being the actual x values. This seems weird, but it is how the matrix operation for simple linear regression works.

$$\mathbf{Y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} \quad \text{and} \quad \mathbf{X} = \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix}$$

In matrix notation, the regression between X and Y can be written as

$$\mathbf{Y} = \mathbf{X}\mathbf{b} + \boldsymbol{\varepsilon}$$

I put everything in bold font to indicate that every letter is a matrix. The matrix b contains the regression coefficients a and b as

$$\mathbf{b} = \begin{bmatrix} a \\ b \end{bmatrix}$$

Now let's create X and Y for the simple linear regression analysis using the data in mgtndvi.
>>Y(:,1)=mgtndvi(:,1)
>>X(:,1)=ones(12,1)
>>X(:,2)=mgtndvi(:,2)

Note there is a Matlab function "ones(#1,#2)", which creates an array of ones with #1 rows and #2 columns. In the above, ones(12,1) creates a matrix of ones with 12 rows and one column, which becomes the first column of X. Then we assign the real x values to the second column of X.

The function in Matlab to perform regression is "regress(dep, indep)". Its output looks weird again.

>>[b,bint,r,rint,stats]=regress(Y,X);

Here b is the vector b above, bint contains the confidence intervals for a and b (95% by default), r contains the residuals, i.e. $(y_i - \hat{y}_i)$, and rint contains the corresponding interval for each residual. "stats" contains $R^2$, the F value, the P value and the variance of the errors.

>> [b,bint,r,rint,stats]=regress(Y,X)

b =
    0.5829
   -0.0843

bint =
    0.4303    0.7355
   -0.1106   -0.0581

r =
   -0.0163
   -0.0471
   -0.0027
   -0.0753
   -0.0530
    0.0470
    0.0814
    0.0342
   -0.0036
   -0.0936
    0.1933
   -0.0644

rint =
   -0.1976    0.1649
   -0.2129    0.1187
   -0.1795    0.1741
   -0.2466    0.0959
   -0.2360    0.1299
   -0.1371    0.2311
   -0.0899    0.2528
   -0.1538    0.2222
   -0.1835    0.1764
   -0.2666    0.0794
    0.0699    0.3168
   -0.2171    0.0883

stats =
    0.8365   51.1600    0.0000    0.0071

Based on the above results, the regression equation is $\hat{y} = 0.5829 - 0.0843x$, with $R^2 = 0.8365$ and P = 0.0000 (zero to the four decimals displayed).

Exercise 3
Copy the file mgt-ndvi0082-2.txt in the data/ directory to your geog410 student folder. This is the same kind of data as used in the example above, but for another set of provinces. Please import the data into Matlab and run a simple linear regression analysis with the first column as the dependent variable and the second column as the independent variable. Copy the model output into your lab report, and identify the regression coefficients, a and b, and $R^2$. Describe how x and y are related.
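As a cross-check on Section 4, the closed-form formulas for a, b and R2 can be compared with the matrix form Y=Xb+ε on a tiny data set. This is a minimal sketch under stated assumptions: the five data points are made up for illustration (they are not the lab data), and the backslash operator is used here as a stand-in for regress() to obtain the least-squares solution.

```matlab
% Made-up data for illustration only (not from the lab's data files).
x = [1; 2; 3; 4; 5];
y = [2.1; 3.9; 6.2; 7.8; 10.1];
n = numel(x);

% Closed-form least-squares estimates of slope b and intercept a.
b = (n*sum(x.*y) - sum(x)*sum(y)) / (n*sum(x.^2) - sum(x)^2);
a = mean(y) - b*mean(x);

% Same fit in matrix form: X has a column of ones and a column of x.
X = [ones(n,1), x];
coef = X \ y;                    % least-squares solution, coef = [a; b]

% Coefficient of determination from the fitted values.
yhat = X*coef;
R2 = sum((yhat - mean(y)).^2) / sum((y - mean(y)).^2);

fprintf('a = %.4f, b = %.4f, R2 = %.4f\n', a, b, R2);
fprintf('matrix form: a = %.4f, b = %.4f\n', coef(1), coef(2));
```

Both routes should print the same a and b, which shows why the column of ones is needed: it is the column that multiplies the intercept a.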