DSCI 325: Handout 6 – More on Manipulating Data in SAS
Spring 2017

CREATING VARIABLES IN SAS: A WRAP-UP

As you have already seen several times, SAS variables can be created with an assignment statement in a DATA step. The assignment statement evaluates the expression on the right side of the equal sign and stores the result in the variable whose name is specified on the left side of the equal sign. For example, consider the following statements.

    DATA employees;
       Salary = 40000;
       Gender = 'M';
       Hire_Date = '29JAN2017'D;   /* date constant, stored as a SAS date value */
    RUN;

    PROC PRINT;
       FORMAT Hire_Date DATE9.;
    RUN;

The resulting data set contains a single observation: Salary is 40000, Gender is M, and Hire_Date prints as 29JAN2017 because of the DATE9. format.

Review of Arithmetic Operators:

    Operator   Definition        Priority
    **         Exponentiation    1
    -          Negative prefix   1
    *          Multiplication    2
    /          Division          2
    +          Addition          3
    -          Subtraction       3

Operations of Priority 1 are performed before operations of Priority 2, etc. Consecutive operations with the same priority are performed from right to left within priority 1 and from left to right within priorities 2 and 3. For example, 2**3**2 is evaluated as 2**(3**2) = 512, whereas 8/4/2 is evaluated as (8/4)/2 = 1. Parentheses can be used to control the order of operations.

Review of Date Functions:

SAS date functions can be used either to create SAS date values or to extract information from existing SAS date values. For example, consider the following:

    Function            Description
    YEAR(SAS-date)      Extracts the year and returns a four-digit value for year.
    QTR(SAS-date)       Extracts the quarter and returns a number from 1 to 4.
    MONTH(SAS-date)     Extracts the month and returns a number from 1 to 12.
    DAY(SAS-date)       Extracts the day of the month and returns a number from 1 to 31.
    WEEKDAY(SAS-date)   Extracts the day of the week and returns a number from 1 to 7,
                        where 1 represents Sunday, etc.

THE DROP AND KEEP STATEMENTS

You can use the DROP statement to specify variables you want to omit from the output data set(s). Conversely, the KEEP statement specifies the names of the variables you want to be written to the output data set(s).

    DATA employees2;
       SET employees;
       KEEP Salary Hire_Date;
    RUN;

    PROC PRINT data=employees2;
    RUN;

Complete the DROP statement in the code below so that it yields the same result for the employees2 data set:

    DATA employees2;
       SET employees;
       DROP ____________ ;
    RUN;

    PROC PRINT data=employees2;
    RUN;

Alternatives to the DROP and KEEP statements are the DROP= and KEEP= data set options placed in the DATA statement.

    DATA employees2 (KEEP = Salary Hire_Date);
       SET employees;
    RUN;

    DATA employees2 (DROP = Gender);
       SET employees;
    RUN;

    PROC PRINT data=employees2;
    RUN;

    PROC CONTENTS DATA=employees2;
    RUN;

Note that the DROP= and KEEP= data set options can be used in situations where the DROP and KEEP statements cannot. In particular, the DROP= and KEEP= data set options can be used in any PROC step to control which variables are used in the procedure:

    DATA employees2;
       SET employees;
    RUN;

    PROC PRINT DATA=employees2 (KEEP = Salary Hire_Date);
    RUN;

    PROC CONTENTS DATA=employees2;
    RUN;

SUBSETTING OBSERVATIONS

We can subset observations in SAS using the WHERE, subsetting IF, or IF-THEN-DELETE statements.

The WHERE statement

This statement subsets observations that meet a particular condition. For example, it is used below to create a new data set (GradeA) that contains only the students who earned an A.

    DATA Grades3;
       SET Hooks.Grades_missing;
       TotalQuiz = SUM(Quiz1,Quiz2,Quiz3,Quiz4,Quiz5,Quiz6,Quiz7,Quiz8,Quiz9,Quiz10,Quiz11,Quiz12);
       TotalExam = SUM(Exam1,Exam2,Exam3);
       FinalPercent = (TotalQuiz + TotalExam + EC + Final)/640;
       IF FinalPercent=. THEN Grade='Incomplete';
       ELSE IF FinalPercent >= 0.90 THEN Grade='A';
       ELSE IF FinalPercent >= 0.80 THEN Grade='B';
       ELSE IF FinalPercent >= 0.70 THEN Grade='C';
       ELSE IF FinalPercent >= 0.60 THEN Grade='D';
       ELSE Grade='F';
    RUN;

    DATA GradeA;
       SET Grades3;
       WHERE Grade='A';
    RUN;

    PROC PRINT Data=GradeA;
       VAR FirstName LastName Final FinalPercent Grade;
    RUN;

Note that the WHERE statement selects observations before they are brought into the program data vector.
As a result, the following code would produce an error because the data set Hooks.Grades_missing does not contain the variable Grade.

    DATA GradeA;
       SET Hooks.Grades_missing;
       TotalQuiz = SUM(Quiz1,Quiz2,Quiz3,Quiz4,Quiz5,Quiz6,Quiz7,Quiz8,Quiz9,Quiz10,Quiz11,Quiz12);
       TotalExam = SUM(Exam1,Exam2,Exam3);
       FinalPercent = (TotalQuiz + TotalExam + EC + Final)/640;
       IF FinalPercent=. THEN Grade='Incomplete';
       ELSE IF FinalPercent >= 0.90 THEN Grade='A';
       ELSE IF FinalPercent >= 0.80 THEN Grade='B';
       ELSE IF FinalPercent >= 0.70 THEN Grade='C';
       ELSE IF FinalPercent >= 0.60 THEN Grade='D';
       ELSE Grade='F';
       WHERE Grade='A';   /* error: Grade is not a variable in Hooks.Grades_missing */
    RUN;

The Subsetting IF statement

This statement continues processing only those observations that meet the specified condition. For example, consider the following.

    DATA GradeA;
       SET Hooks.Grades_missing;
       TotalQuiz = SUM(Quiz1,Quiz2,Quiz3,Quiz4,Quiz5,Quiz6,Quiz7,Quiz8,Quiz9,Quiz10,Quiz11,Quiz12);
       TotalExam = SUM(Exam1,Exam2,Exam3);
       FinalPercent = (TotalQuiz + TotalExam + EC + Final)/640;
       IF FinalPercent=. THEN Grade='Incomplete';
       ELSE IF FinalPercent >= 0.90 THEN Grade='A';
       ELSE IF FinalPercent >= 0.80 THEN Grade='B';
       ELSE IF FinalPercent >= 0.70 THEN Grade='C';
       ELSE IF FinalPercent >= 0.60 THEN Grade='D';
       ELSE Grade='F';
       IF Grade='A';      /* subsetting IF: Grade has already been computed above */
    RUN;

    PROC PRINT Data=GradeA;
       VAR FirstName LastName Final FinalPercent Grade;
    RUN;

Note that the subsetting IF statement is not processed before observations are brought into the program data vector. Instead, it determines whether an observation continues to be processed: when the IF expression is false, the observation is not written to the output data set.

The IF-THEN-DELETE statement

This can be used as an alternative to the subsetting IF statement. For example, consider the following.

    DATA GradeA;
       SET Grades3;
       IF Grade NE 'A' THEN DELETE;
    RUN;

    PROC PRINT Data=GradeA;
       VAR FirstName LastName Final FinalPercent Grade;
    RUN;
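
The three subsetting approaches described in this handout can be compared side by side on a small example. The following is only a sketch and is not part of the original handout: the data set scores, its variables Name and Points, and the four records read with DATALINES are hypothetical, and Percent stands in for FinalPercent. The grading cutoffs and the 640-point denominator are borrowed from the examples above.

```sas
/* Hypothetical data: the names and point totals are invented for illustration. */
DATA scores;
   INPUT Name $ Points;
   Percent = Points / 640;              /* missing Points gives missing Percent */
   IF Percent = . THEN Grade = 'Incomplete';
   ELSE IF Percent >= 0.90 THEN Grade = 'A';
   ELSE IF Percent >= 0.80 THEN Grade = 'B';
   ELSE IF Percent >= 0.70 THEN Grade = 'C';
   ELSE IF Percent >= 0.60 THEN Grade = 'D';
   ELSE Grade = 'F';
   DATALINES;
Ann 600
Bob 520
Cy .
Dee 590
;
RUN;

/* WHERE: allowed here because Grade already exists in the input data set scores. */
DATA gradeA_where;
   SET scores;
   WHERE Grade = 'A';
RUN;

/* Subsetting IF: processing continues only when the condition is true. */
DATA gradeA_if;
   SET scores;
   IF Grade = 'A';
RUN;

/* IF-THEN-DELETE: the observation is discarded when the condition is true. */
DATA gradeA_delete;
   SET scores;
   IF Grade NE 'A' THEN DELETE;
RUN;

PROC PRINT DATA=gradeA_where; RUN;
PROC PRINT DATA=gradeA_if; RUN;
PROC PRINT DATA=gradeA_delete; RUN;
```

Under these assumptions, each of the three output data sets should contain the same two observations (Ann and Dee), since 600/640 and 590/640 both exceed 0.90. The approaches differ only in when the condition is applied: WHERE filters observations as they are read from scores, while the two IF forms act on observations that are already in the program data vector.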