Reading SAS Data Sets and Creating Variables

Objectives
Create a SAS data set using another SAS data set as input.
Create SAS variables.
Use operators and SAS functions to manipulate data values.
Control which variables are included in a SAS data set.
Reading a SAS Data Set
Create a temporary SAS data set named onboard from the permanent SAS data set named ia.dfwlax, and create a variable that represents the total passengers on board.
Sum the FirstClass and Economy values to compute Total.

ia.dfwlax (Date holds SAS date values; Total is the new variable)
Flight  Date   Dest  FirstClass  Economy  Total
439     14955  LAX   20          137      157
921     14955  DFW   20          131      151
114     14956  LAX   15          170      185

To create a SAS data set using a SAS data set as input, you must use a
• DATA statement to start a DATA step and name the SAS data set being created (output data set: onboard)
• SET statement to identify the SAS data set being read (input data set: ia.dfwlax).
To create a variable, you must use an assignment statement to add the values of the variables FirstClass and Economy and assign the sum to the variable Total.

General form of a DATA step:
DATA output-SAS-data-set;
   SET input-SAS-data-set;
   additional SAS statements
RUN;
By default, the SET statement reads all of the
• observations from the input SAS data set
• variables from the input SAS data set.

Assignment Statements
An assignment statement
• evaluates an expression
• assigns the resulting value to a variable.
General form of an assignment statement:
variable=expression;
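The DATA, SET, and assignment statements described above can be tried without access to the ia library. The sketch below is a minimal, self-contained version: work.dfwlax_demo is a hypothetical stand-in for ia.dfwlax built from in-stream data, and the data set names are assumptions made for illustration only.

```sas
/* Hypothetical stand-in for ia.dfwlax, built from in-stream data */
data work.dfwlax_demo;
   input Flight $ Date : mmddyy10. Dest $ FirstClass Economy;
   datalines;
439 12/11/2000 LAX 20 137
921 12/11/2000 DFW 20 131
114 12/12/2000 LAX 15 170
;
run;

/* Read the data set with SET and create Total with an assignment statement */
data work.onboard;
   set work.dfwlax_demo;
   Total=FirstClass+Economy;
run;

/* Date is stored as a number; the format only changes how it is displayed */
proc print data=work.onboard;
   format Date date9.;
run;
```

The listing should show the three flights with Total values of 157, 151, and 185, matching the table above.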
SAS Expressions
An expression contains operands and operators that form a set of instructions that produce a value.
Operands are
• variable names
• constants.
Operators are
• symbols that request arithmetic calculations
• SAS functions.

Using Operators
Selected operators for basic arithmetic calculations in an assignment statement:
Operator  Action           Example        Priority
+         Addition         Sum=x+y;       III
-         Subtraction      Diff=x-y;      III
*         Multiplication   Mult=x*y;      II
/         Division         Divide=x/y;    II
**        Exponentiation   Raise=x**y;    I
-         Negative prefix  Negative=-x;   I
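The Priority column determines the order in which operators in one expression are evaluated: priority I first, then II, then III, unless parentheses say otherwise. A small sketch, using made-up values for x, y, and z, writes three results to the log:

```sas
data _null_;
   x=2; y=3; z=4;
   NoParens=x+y*z;      /* multiplication (II) before addition (III): 2+12 */
   WithParens=(x+y)*z;  /* parentheses force the addition first: 5*4       */
   Power=x*y**2;        /* exponentiation (I) before multiplication: 2*9   */
   put NoParens= WithParens= Power=;
run;
```

The log line should read NoParens=14 WithParens=20 Power=18. When in doubt about priority, adding parentheses makes the intended order explicit.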
Compiling the DATA Step
libname ia 'SAS-data-library';
data onboard;
   set ia.dfwlax;
   Total=FirstClass+Economy;
run;

PDV after the SET statement is compiled
Flight  Date  Dest  FirstClass  Economy

PDV after the assignment statement is compiled
Flight  Date  Dest  FirstClass  Economy  Total
c07s1d1
Executing the DATA Step
ia.dfwlax
Flight  Date      Dest  FirstClass  Economy
439     12/11/00  LAX   20          137
921     12/11/00  DFW   20          131
114     12/12/00  LAX   15          170

data onboard;
   set ia.dfwlax;
   Total=FirstClass+Economy;
run;

The SET statement reads the first observation into the PDV. Total is still missing.
PDV
Flight  Date      Dest  FirstClass  Economy  Total
439     12/11/00  LAX   20          137      .

The assignment statement computes Total.
PDV
Flight  Date      Dest  FirstClass  Economy  Total
439     12/11/00  LAX   20          137      157

Automatic output writes the contents of the PDV to onboard, and automatic return sends control back to the top of the DATA step.
onboard
Flight  Date      Dest  FirstClass  Economy  Total
439     12/11/00  LAX   20          137      157

Total is reinitialized to missing, and the SET statement reads the second observation.
PDV
Flight  Date      Dest  FirstClass  Economy  Total
921     12/11/00  DFW   20          131      .

The assignment statement computes Total, and automatic output writes the observation to onboard.
PDV
Flight  Date      Dest  FirstClass  Economy  Total
921     12/11/00  DFW   20          131      151

The same steps repeat for the third observation.
PDV
Flight  Date      Dest  FirstClass  Economy  Total
114     12/12/00  LAX   15          170      185

onboard
Flight  Date      Dest  FirstClass  Economy  Total
439     12/11/00  LAX   20          137      157
921     12/11/00  DFW   20          131      151
114     12/12/00  LAX   15          170      185
c07s1d1

Assignment Statements
proc print data=onboard;
   format Date date9.;
run;

The SAS System
Obs  Flight  Date       Dest  FirstClass  Economy  Total
  1  439     11DEC2000  LAX   20          137      157
  2  921     11DEC2000  DFW   20          131      151
  3  114     12DEC2000  LAX   15          170      185
  4  982     12DEC2000  dfw    5           85       90
  5  439     13DEC2000  LAX   14          196      210
  6  982     13DEC2000  DFW   15          116      131
  7  431     14DEC2000  LaX   17          166      183
  8  982     14DEC2000  DFW    7           88       95
  9  114     15DEC2000  LAX    .          187        .
 10  982     15DEC2000  DFW   14           31       45
Why is Total missing in observation 9?
c07s1d1
Using SAS Functions
A SAS function is a routine that returns a value that is determined from specified arguments.
General form of a SAS function:
function-name(argument1,argument2, . . .)
Example
Total=sum(FirstClass,Economy);

SAS functions
• perform arithmetic operations
• compute sample statistics (for example: sum, mean, and standard deviation)
• manipulate SAS dates and process character values
• perform many other tasks.
Sample statistics functions ignore missing values.
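The difference between the + operator and the SUM function shows up only when an operand is missing. The sketch below reproduces the values from observation 9 of the earlier listing (FirstClass missing, Economy 187); the variable names PlusTotal and SumTotal are made up for the comparison.

```sas
data _null_;
   FirstClass=.;                      /* missing, as in observation 9          */
   Economy=187;
   PlusTotal=FirstClass+Economy;      /* a missing operand makes the result missing */
   SumTotal=sum(FirstClass,Economy);  /* SUM ignores the missing argument      */
   put PlusTotal= SumTotal=;
run;
```

The log should show PlusTotal=. and SumTotal=187, along with a note that missing values were generated by an operation on missing values.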
Using the SUM Function
data onboard;
   set ia.dfwlax;
   Total=sum(FirstClass,Economy);
run;

proc print data=onboard;
   format Date date9.;
run;

The SAS System
Obs  Flight  Date       Dest  FirstClass  Economy  Total
  1  439     11DEC2000  LAX   20          137      157
  2  921     11DEC2000  DFW   20          131      151
  3  114     12DEC2000  LAX   15          170      185
  4  982     12DEC2000  dfw    5           85       90
  5  439     13DEC2000  LAX   14          196      210
  6  982     13DEC2000  DFW   15          116      131
  7  431     14DEC2000  LaX   17          166      183
  8  982     14DEC2000  DFW    7           88       95
  9  114     15DEC2000  LAX    .          187      187
 10  982     15DEC2000  DFW   14           31       45
c07s1d2
Using Date Functions
You can use SAS date functions to
• create SAS date values
• extract information from SAS date values.

Date Functions: Create SAS Dates
TODAY()              obtains the date value from the system clock.
MDY(month,day,year)  uses numeric month, day, and year values to return the corresponding SAS date value.

Date Functions: Extracting Information
YEAR(SAS-date)     extracts the year from a SAS date and returns a four-digit value for year.
QTR(SAS-date)      extracts the quarter from a SAS date and returns a number from 1 to 4.
MONTH(SAS-date)    extracts the month from a SAS date and returns a number from 1 to 12.
WEEKDAY(SAS-date)  extracts the day of the week from a SAS date and returns a number from 1 to 7, where 1 represents Sunday, and so on.

Using the WEEKDAY Function
Add an assignment statement to the DATA step to create a variable that shows the day of the week that the flight occurred.
data onboard;
   set ia.dfwlax;
   Total=sum(FirstClass,Economy);
   DayOfWeek=weekday(Date);
run;
Print the data set, but do not display the variables FirstClass and Economy.
proc print data=onboard;
   var Flight Dest Total DayOfWeek Date;
   format Date weekdate.;
run;

The SAS System
Obs  Flight  Dest  Total  DayOfWeek  Date
  1  439     LAX   157    2          Monday, December 11, 2000
  2  921     DFW   151    2          Monday, December 11, 2000
  3  114     LAX   185    3          Tuesday, December 12, 2000
  4  982     dfw    90    3          Tuesday, December 12, 2000
  5  439     LAX   210    4          Wednesday, December 13, 2000
  6  982     DFW   131    4          Wednesday, December 13, 2000
  7  431     LaX   183    5          Thursday, December 14, 2000
  8  982     DFW    95    5          Thursday, December 14, 2000
  9  114     LAX   187    6          Friday, December 15, 2000
 10  982     DFW    45    6          Friday, December 15, 2000
c07s1d3
What if you do not want the variables FirstClass and Economy in the data set?

Selecting Variables
You can use a DROP or KEEP statement in a DATA step to control what variables are written to the new SAS data set.
General form of DROP and KEEP statements:
DROP variables;
KEEP variables;
Selecting Variables
Do not store the variables FirstClass and Economy in the data set.
data onboard;
   set ia.dfwlax;
   drop FirstClass Economy;
   Total=FirstClass+Economy;
run;
Equivalent:
   keep Flight Date Dest Total;

PDV (D marks the variables flagged to be dropped)
Flight  Date  Dest  FirstClass  Economy  Total
                    D           D

proc print data=onboard;
   format Date date9.;
run;

The SAS System
Obs  Flight  Date       Dest  Total
  1  439     11DEC2000  LAX   157
  2  921     11DEC2000  DFW   151
  3  114     12DEC2000  LAX   185
  4  982     12DEC2000  dfw    90
  5  439     13DEC2000  LAX   210
  6  982     13DEC2000  DFW   131
  7  431     14DEC2000  LaX   183
  8  982     14DEC2000  DFW    95
  9  114     15DEC2000  LAX     .
 10  982     15DEC2000  DFW    45
c07s1d4
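A point worth checking by hand is that a dropped variable is still available for calculations inside the DATA step; it is only left out when the observation is written. The sketch below assumes a small in-stream data set (work.flights, a hypothetical stand-in for ia.dfwlax) and builds the same result once with DROP and once with KEEP.

```sas
/* Hypothetical in-stream stand-in for ia.dfwlax */
data work.flights;
   input Flight $ Dest $ FirstClass Economy;
   datalines;
439 LAX 20 137
921 DFW 20 131
;
run;

data work.onboard_drop;
   set work.flights;
   drop FirstClass Economy;       /* flagged at compile time, applied at output */
   Total=FirstClass+Economy;      /* dropped variables can still be used here   */
run;

data work.onboard_keep;
   set work.flights;
   keep Flight Dest Total;        /* lists what to keep instead of what to drop */
   Total=FirstClass+Economy;
run;

proc print data=work.onboard_drop; run;
proc print data=work.onboard_keep; run;
```

Both listings should contain only Flight, Dest, and Total, with Total values of 157 and 151.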
Summary
The SET statement reads a SAS data set as input in a DATA step.
Simple algebraic expressions can be used to create SAS variables in a DATA step.
The DROP and KEEP statements control which variables are included in a SAS data set.

Conditional Processing
Objectives
Execute statements conditionally using IF-THEN logic.
Control the length of character variables explicitly with the LENGTH statement.
Select rows to include in a SAS data set.
Use SAS date constants.

Conditional Execution
International Airlines wants to compute revenue for Los Angeles and Dallas flights based on the prices in the table below.
DESTINATION  CLASS    AIRFARE
LAX          First    2000
             Economy  1200
DFW          First    1500
             Economy   900
Conditional Execution
General form of IF-THEN and ELSE statements:
IF expression THEN statement;
ELSE statement;
Expression contains operands and operators that form a set of instructions that produce a value.
Operands are
• variable names
• constants.
Operators are
• symbols that request
  – a comparison
  – a logical operation
  – an arithmetic calculation
• SAS functions.
Only one executable statement is allowed on an IF-THEN or ELSE statement.

Conditional Execution
Compute revenue figures based on flight destination, using the airfare table above.
data flightrev;
   set ia.dfwlax;
   Total=sum(FirstClass,Economy);
   if Dest='LAX' then
      Revenue=sum(2000*FirstClass,1200*Economy);
   else if Dest='DFW' then
      Revenue=sum(1500*FirstClass,900*Economy);
run;
Conditional Execution
data flightrev;
   set ia.dfwlax;
   Total=sum(FirstClass,Economy);
   if Dest='LAX' then
      Revenue=sum(2000*FirstClass,1200*Economy);
   else if Dest='DFW' then
      Revenue=sum(1500*FirstClass,900*Economy);
run;

PDV (First Observation)
Flight  Date   Dest  FirstClass  Economy  Total  Revenue
439     14955  LAX   20          137      157    .
The expression Dest='LAX' is TRUE, so the first Revenue assignment executes.
Flight  Date   Dest  FirstClass  Economy  Total  Revenue
439     14955  LAX   20          137      157    204400

PDV (Fourth Observation)
Flight  Date   Dest  FirstClass  Economy  Total  Revenue
982     14956  dfw   5           85       90     .
The expression Dest='LAX' is FALSE, and the expression Dest='DFW' is also FALSE, so neither assignment executes and Revenue remains missing.
c07s2d1
Conditional Execution
proc print data=flightrev;
   format Date date9.;
run;

The SAS System
Obs  Flight  Date       Dest  FirstClass  Economy  Total  Revenue
  1  439     11DEC2000  LAX   20          137      157    204400
  2  921     11DEC2000  DFW   20          131      151    147900
  3  114     12DEC2000  LAX   15          170      185    234000
  4  982     12DEC2000  dfw    5           85       90         .
  5  439     13DEC2000  LAX   14          196      210    263200
  6  982     13DEC2000  DFW   15          116      131    126900
  7  431     14DEC2000  LaX   17          166      183         .
  8  982     14DEC2000  DFW    7           88       95     89700
  9  114     15DEC2000  LAX    .          187      187    224400
 10  982     15DEC2000  DFW   14           31       45     48900
Why are two Revenue values missing?
c07s2d1

The UPCASE Function
You can use the UPCASE function to convert letters from lowercase to uppercase.
General form of the UPCASE function:
UPCASE(argument)
Conditional Execution
Use the UPCASE function to convert the Dest values to uppercase for the comparison.
data flightrev;
   set ia.dfwlax;
   Total=sum(FirstClass,Economy);
   if upcase(Dest)='LAX' then
      Revenue=sum(2000*FirstClass,1200*Economy);
   else if upcase(Dest)='DFW' then
      Revenue=sum(1500*FirstClass,900*Economy);
run;
c07s2d2

PDV (Fourth Observation)
Flight  Date   Dest  FirstClass  Economy  Total  Revenue
982     14956  dfw   5           85       90     .
The expression upcase(Dest)='LAX' is FALSE.
The expression upcase(Dest)='DFW' is TRUE, because upcase('dfw')='DFW', so the second Revenue assignment executes.
Flight  Date   Dest  FirstClass  Economy  Total  Revenue
982     14956  dfw   5           85       90     84000
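The effect of UPCASE on the comparison can be checked side by side. The following sketch uses two in-stream rows taken from the listing (observations 1 and 4); the variable names RevExact and RevUpcase are invented for the comparison and are not part of the course programs.

```sas
data _null_;
   input Dest $ FirstClass Economy;
   /* Case-sensitive comparison: 'dfw' matches neither 'LAX' nor 'DFW' */
   if Dest='LAX' then RevExact=sum(2000*FirstClass,1200*Economy);
   else if Dest='DFW' then RevExact=sum(1500*FirstClass,900*Economy);
   /* UPCASE changes the value only for the comparison; Dest is not modified */
   if upcase(Dest)='LAX' then RevUpcase=sum(2000*FirstClass,1200*Economy);
   else if upcase(Dest)='DFW' then RevUpcase=sum(1500*FirstClass,900*Economy);
   put Dest= RevExact= RevUpcase=;
   datalines;
LAX 20 137
dfw 5 85
;
run;
```

For the LAX row both variables should be 204400. For the dfw row RevExact should be missing and RevUpcase should be 84000, and Dest is still written as dfw.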
Conditional Execution
proc print data=flightrev;
   format Date date9.;
run;

The SAS System
Obs  Flight  Date       Dest  FirstClass  Economy  Total  Revenue
  1  439     11DEC2000  LAX   20          137      157    204400
  2  921     11DEC2000  DFW   20          131      151    147900
  3  114     12DEC2000  LAX   15          170      185    234000
  4  982     12DEC2000  dfw    5           85       90     84000
  5  439     13DEC2000  LAX   14          196      210    263200
  6  982     13DEC2000  DFW   15          116      131    126900
  7  431     14DEC2000  LaX   17          166      183    233200
  8  982     14DEC2000  DFW    7           88       95     89700
  9  114     15DEC2000  LAX    .          187      187    224400
 10  982     15DEC2000  DFW   14           31       45     48900
c07s2d2

Conditional Execution
You can use the DO and END statements to execute a group of statements based on a condition.
General form of the DO and END statements:
IF expression THEN DO;
   executable statements
END;
ELSE DO;
   executable statements
END;
Conditional Execution
Use DO and END statements to execute a group of statements based on a condition.
data flightrev;
   set ia.dfwlax;
   Total=sum(FirstClass,Economy);
   if upcase(Dest)='DFW' then do;
      Revenue=sum(1500*FirstClass,900*Economy);
      City='Dallas';
   end;
   else if upcase(Dest)='LAX' then do;
      Revenue=sum(2000*FirstClass,1200*Economy);
      City='Los Angeles';
   end;
run;

proc print data=flightrev;
   var Dest City Flight Date Revenue;
   format Date date9.;
run;

The SAS System
Obs  Dest  City    Flight  Date       Revenue
  1  LAX   Los An  439     11DEC2000  204400
  2  DFW   Dallas  921     11DEC2000  147900
  3  LAX   Los An  114     12DEC2000  234000
  4  dfw   Dallas  982     12DEC2000   84000
  5  LAX   Los An  439     13DEC2000  263200
  6  DFW   Dallas  982     13DEC2000  126900
  7  LaX   Los An  431     14DEC2000  233200
  8  DFW   Dallas  982     14DEC2000   89700
  9  LAX   Los An  114     15DEC2000  224400
 10  DFW   Dallas  982     15DEC2000   48900
Why are City values truncated?
c07s2d3
Variable Lengths
At compile time, the length of a variable is determined the first time the variable is encountered.
data flightrev;
   set ia.dfwlax;
   Total=sum(FirstClass,Economy);
   if upcase(Dest)='DFW' then do;
      Revenue=sum(1500*FirstClass,900*Economy);
      City='Dallas';      /* 6 characters between the quotes: Length=6 */
   end;
   else if upcase(Dest)='LAX' then do;
      Revenue=sum(2000*FirstClass,1200*Economy);
      City='Los Angeles';
   end;
run;

The LENGTH Statement
You can use the LENGTH statement to define the length of a variable explicitly.
General form of the LENGTH statement:
LENGTH variable(s) $ length;
Example:
length City $ 11;
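The truncation and its fix can be seen in one step. In this sketch, CityShort gets its length from the first assignment that mentions it, while CityFixed is declared with a LENGTH statement first; both names are hypothetical, and the VLENGTH function is used only to report the length that the compiler assigned.

```sas
data _null_;
   input Dest $;
   length CityFixed $ 11;                   /* length declared before first use      */
   if Dest='DFW' then CityShort='Dallas';   /* first reference: length becomes 6     */
   else CityShort='Los Angeles';            /* stored value is cut to 6 characters   */
   if Dest='DFW' then CityFixed='Dallas';
   else CityFixed='Los Angeles';
   ShortLen=vlength(CityShort);             /* compile-time length of each variable  */
   FixedLen=vlength(CityFixed);
   put Dest= CityShort= ShortLen= CityFixed= FixedLen=;
   datalines;
DFW
LAX
;
run;
```

For the LAX row the log should show CityShort=Los An with ShortLen=6 and CityFixed=Los Angeles with FixedLen=11. The LENGTH statement has to appear before the first assignment to the variable to have this effect.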
The LENGTH Statement
data flightrev;
   set ia.dfwlax;
   length City $ 11;
   Total=sum(FirstClass,Economy);
   if upcase(Dest)='DFW' then do;
      Revenue=sum(1500*FirstClass,900*Economy);
      City='Dallas';
   end;
   else if upcase(Dest)='LAX' then do;
      Revenue=sum(2000*FirstClass,1200*Economy);
      City='Los Angeles';
   end;
run;

proc print data=flightrev;
   var Dest City Flight Date Revenue;
   format Date date9.;
run;

The SAS System
Obs  Dest  City         Flight  Date       Revenue
  1  LAX   Los Angeles  439     11DEC2000  204400
  2  DFW   Dallas       921     11DEC2000  147900
  3  LAX   Los Angeles  114     12DEC2000  234000
  4  dfw   Dallas       982     12DEC2000   84000
  5  LAX   Los Angeles  439     13DEC2000  263200
  6  DFW   Dallas       982     13DEC2000  126900
  7  LaX   Los Angeles  431     14DEC2000  233200
  8  DFW   Dallas       982     14DEC2000   89700
  9  LAX   Los Angeles  114     15DEC2000  224400
 10  DFW   Dallas       982     15DEC2000   48900
c07s2d4

Subsetting Rows
In a DATA step, you can subset the rows (observations) in a SAS data set with a
• WHERE statement
• DELETE statement
• subsetting IF statement.
The WHERE statement in a DATA step is the same as the WHERE statement you saw in a PROC step.

Deleting Rows
You can use a DELETE statement to control which rows are written to the SAS data set.
General form of the DELETE statement:
IF expression THEN DELETE;
The expression can be any SAS expression.
Deleting Rows
Delete rows that have a Total value that is less than or equal to 175.
data over175;
   set ia.dfwlax;
   length City $ 11;
   Total=sum(FirstClass,Economy);
   if Total le 175 then delete;
   if upcase(Dest)='DFW' then do;
      Revenue=sum(1500*FirstClass,900*Economy);
      City='Dallas';
   end;
   else if upcase(Dest)='LAX' then do;
      Revenue=sum(2000*FirstClass,1200*Economy);
      City='Los Angeles';
   end;
run;

proc print data=over175;
   var Dest City Flight Date Total Revenue;
   format Date date9.;
run;

The SAS System
Obs  Dest  City         Flight  Date       Total  Revenue
  1  LAX   Los Angeles  114     12DEC2000  185    234000
  2  LAX   Los Angeles  439     13DEC2000  210    263200
  3  LaX   Los Angeles  431     14DEC2000  183    233200
  4  LAX   Los Angeles  114     15DEC2000  187    224400
c07s2d5
Selecting Rows
You can use a subsetting IF statement to control which rows are written to the SAS data set.
General form of the subsetting IF statement:
IF expression;
The expression can be any SAS expression.
The subsetting IF statement is valid only in a DATA step.

Process Flow of a Subsetting IF
DATA Statement
Read Observation or Record
Subsetting IF: IF expression
   False: return to the DATA statement without writing the observation
   True: Continue Processing Observation
Output Observation to SAS Data Set
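A DELETE statement and a subsetting IF with the opposite condition select the same rows. The sketch below checks this on a few made-up Total values (work.totals is a hypothetical data set) and uses PROC COMPARE to confirm that the two results match.

```sas
/* Hypothetical sample rows */
data work.totals;
   input Flight $ Total;
   datalines;
439 157
114 185
982 90
431 183
;
run;

data work.over175_delete;
   set work.totals;
   if Total le 175 then delete;   /* remove rows that fail the test */
run;

data work.over175_if;
   set work.totals;
   if Total gt 175;               /* keep rows that pass the test   */
run;

/* The two data sets should be identical */
proc compare base=work.over175_delete compare=work.over175_if;
run;
```

Both data sets should hold flights 114 and 431. A missing Total is treated as smaller than any number, so such a row would be removed by either form.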
Selecting Rows
Select rows that have a Total value that is greater than 175.
data over175;
   set ia.dfwlax;
   length City $ 11;
   Total=sum(FirstClass,Economy);
   if Total gt 175;
   if upcase(Dest)='DFW' then do;
      Revenue=sum(1500*FirstClass,900*Economy);
      City='Dallas';
   end;
   else if upcase(Dest)='LAX' then do;
      Revenue=sum(2000*FirstClass,1200*Economy);
      City='Los Angeles';
   end;
run;
c07s2d6
Selecting Rows
proc print data=over175;
   var Dest City Flight Date Total Revenue;
   format Date date9.;
run;

The SAS System
Obs  Dest  City         Flight  Date       Total  Revenue
  1  LAX   Los Angeles  114     12DEC2000  185    234000
  2  LAX   Los Angeles  439     13DEC2000  210    263200
  3  LaX   Los Angeles  431     14DEC2000  183    233200
  4  LAX   Los Angeles  114     15DEC2000  187    224400
c07s2d6
What if you only wanted flights that were before a specific date, such as 14DEC2000?

Selecting Rows
The variable Date in the ia.dfwlax data set contains SAS date values (numeric values).
[Figure: number line of SAS date values as stored and as displayed. 01JAN1960 is stored as 0 and displayed as 01/01/1960; 01JAN1961 is stored as 366; 14DEC2000 is stored as ??? and displayed as 12/14/2000; an earlier date is displayed as 01/01/1959.]
Using SAS Date Constants
The constant 'ddMMMyyyy'd (example: '14dec2000'd) creates a SAS date value from the date enclosed in quotes.
dd    is a one- or two-digit value for the day.
MMM   is a three-letter abbreviation for the month (JAN, FEB, MAR, and so on).
yyyy  is a two- or four-digit value for the year.
d     is required to convert the quoted string to a SAS date.

Using SAS Date Constants
data over175;
   set ia.dfwlax;
   length City $ 11;
   Total=sum(FirstClass,Economy);
   if Total gt 175 and Date lt '14dec2000'd;
   if upcase(Dest)='DFW' then do;
      Revenue=sum(1500*FirstClass,900*Economy);
      City='Dallas';
   end;
   else if upcase(Dest)='LAX' then do;
      Revenue=sum(2000*FirstClass,1200*Economy);
      City='Los Angeles';
   end;
run;
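A date constant is just a number, which is why it can be compared with the Date variable. The sketch below prints the value behind '14dec2000'd and, as a cross-check, applies the MDY, YEAR, QTR, MONTH, and WEEKDAY functions from the earlier section; the variable names are chosen for illustration only.

```sas
data _null_;
   Cutoff='14dec2000'd;           /* date constant: days since 01JAN1960     */
   SameDate=mdy(12,14,2000);      /* MDY builds the same value from numbers  */
   Yr=year(Cutoff);
   Quarter=qtr(Cutoff);
   Mon=month(Cutoff);
   DayOfWeek=weekday(Cutoff);     /* 1=Sunday, so Thursday is 5              */
   put Cutoff= SameDate= Yr= Quarter= Mon= DayOfWeek=;
   put Cutoff= date9. Cutoff= mmddyy10.;   /* same value, two display formats */
run;
```

Cutoff and SameDate should both be 14958, with Yr=2000, Quarter=4, Mon=12, and DayOfWeek=5. The second PUT statement shows the same stored value displayed as 14DEC2000 and 12/14/2000.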
Using SAS Date Constants
proc print data=over175;
   var Dest City Flight Date Total Revenue;
   format Date date9.;
run;

The SAS System
Obs  Dest  City         Flight  Date       Total  Revenue
  1  LAX   Los Angeles  114     12DEC2000  185    234000
  2  LAX   Los Angeles  439     13DEC2000  210    263200
c07s2d7

Subsetting Data
What if the data were in a raw data file instead of a SAS data set?
data over175;
   infile 'raw-data-file';
   input @1 Flight $3. @4 Date mmddyy8.
         @12 Dest $3. @15 FirstClass 3.
         @18 Economy 3.;
   length City $ 11;
   Total=sum(FirstClass,Economy);
   if Total gt 175 and Date lt '14dec2000'd;
   if upcase(Dest)='DFW' then do;
      Revenue=sum(1500*FirstClass,900*Economy);
      City='Dallas';
   end;
   else if upcase(Dest)='LAX' then do;
      Revenue=sum(2000*FirstClass,1200*Economy);
      City='Los Angeles';
   end;
run;
c07s2d8
Subsetting Data
proc print data=over175;
   var Dest City Flight Date Total Revenue;
   format Date date9.;
run;

The SAS System
Obs  Dest  City         Flight  Date       Total  Revenue
  1  LAX   Los Angeles  114     12DEC2000  185    234000
  2  LAX   Los Angeles  439     13DEC2000  210    263200
c07s2d8

WHERE or Subsetting IF?
Step and Usage                        WHERE  IF
PROC step                             Yes    No
DATA step (source of variable)
   INPUT statement                    No     Yes
   Assignment statement               No     Yes
   SET statement (single data set)    Yes    Yes
   SET/MERGE (multiple data sets)
      Variable in ALL data sets       Yes    Yes
      Variable not in ALL data sets   No     Yes
WHERE or Subsetting IF?
Use a WHERE statement and a subsetting IF statement in the same step.
data over175;
   set ia.dfwlax;
   where Date lt '14dec2000'd;
   length City $ 11;
   Total=sum(FirstClass,Economy);
   if Total gt 175;
   if upcase(Dest)='DFW' then do;
      Revenue=sum(1500*FirstClass,900*Economy);
      City='Dallas';
   end;
   else if upcase(Dest)='LAX' then do;
      Revenue=sum(2000*FirstClass,1200*Economy);
      City='Los Angeles';
   end;
run;

proc print data=over175;
   var Dest City Flight Date Total Revenue;
   format Date date9.;
run;

The SAS System
Obs  Dest  City         Flight  Date       Total  Revenue
  1  LAX   Los Angeles  114     12DEC2000  185    234000
  2  LAX   Los Angeles  439     13DEC2000  210    263200
c07s2d9
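The division of labor between WHERE and the subsetting IF can be reproduced on a small scale. In the sketch below, work.flights is a hypothetical in-stream stand-in for ia.dfwlax with four of its rows; WHERE filters on Date, which exists in the input data set, and the subsetting IF filters on Total, which is created by an assignment statement in the same step.

```sas
/* Hypothetical in-stream stand-in for ia.dfwlax */
data work.flights;
   input Flight $ Date : date9. FirstClass Economy;
   format Date date9.;
   datalines;
114 12DEC2000 15 170
982 12DEC2000 5 85
439 13DEC2000 14 196
431 14DEC2000 17 166
;
run;

data work.over175;
   set work.flights;
   where Date lt '14dec2000'd;     /* Date is in the input data set: WHERE is allowed  */
   Total=sum(FirstClass,Economy);
   if Total gt 175;                /* Total is created here: a subsetting IF is needed */
run;

proc print data=work.over175;
run;
```

The listing should contain flights 114 (Total 185) and 439 (Total 210). Replacing the subsetting IF with a WHERE on Total would fail, because Total is not a variable in work.flights.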
Summary
IF-THEN logic can be used to execute SAS statements conditionally.
The length of character variables can be controlled explicitly with the LENGTH statement.
The subsetting IF, DELETE, and OUTPUT statements can be used to select which rows to include in a SAS data set.
SAS date constants can be used to indicate a specific date.

Dropping and Keeping Variables
Objectives
Compare DROP and KEEP statements to DROP= and KEEP= data set options.

Selecting Variables
You can use a DROP= or KEEP= data set option in a DATA statement to control what variables are written to the new SAS data set.
General form of the DROP= and KEEP= data set options:
SAS-data-set(DROP=variables)
or
SAS-data-set(KEEP=variables)
Selecting Variables
Do not store the variables FirstClass and Economy in the data set.
data onboard(drop=FirstClass Economy);
   set ia.dfwlax;
   Total=FirstClass+Economy;
run;
Equivalent:
data onboard(keep=Flight Date Dest Total);

PDV (D marks the variables flagged to be dropped)
Flight  Date  Dest  FirstClass  Economy  Total
                    D           D
c07s3d1

proc print data=onboard;
   format Date date9.;
run;

The SAS System
Obs  Flight  Date       Dest  Total
  1  439     11DEC2000  LAX   157
  2  921     11DEC2000  DFW   151
  3  114     12DEC2000  LAX   185
  4  982     12DEC2000  dfw    90
  5  439     13DEC2000  LAX   210
  6  982     13DEC2000  DFW   131
  7  431     14DEC2000  LaX   183
  8  982     14DEC2000  DFW    95
  9  114     15DEC2000  LAX     .
 10  982     15DEC2000  DFW    45
c07s3d1

Selecting Variables
DROP= and KEEP= data set options in a DATA statement are similar to DROP and KEEP statements.
Equivalent Steps
data onboard(drop=FirstClass Economy);
   set ia.dfwlax;
   Total=FirstClass+Economy;
run;
Equivalent:
data onboard(keep=Flight Date Dest Total);

data onboard;
   drop FirstClass Economy;
   set ia.dfwlax;
   Total=FirstClass+Economy;
run;
Equivalent:
   keep Flight Date Dest Total;
c07s3d2

Summary
DROP= and KEEP= data set options are an alternative to the DROP and KEEP statements.
The placement of the DROP= and KEEP= data set options can affect the availability of the variables for processing during the DATA step.
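The remark about placement can be made concrete. The examples in this section put DROP= on the output data set in the DATA statement; the sketch below contrasts that with putting DROP= on the input data set in the SET statement, which is an extension beyond what the slides show. The data set names are hypothetical.

```sas
/* Hypothetical in-stream stand-in for ia.dfwlax */
data work.flights;
   input Flight $ FirstClass Economy;
   datalines;
439 20 137
921 20 131
;
run;

/* DROP= on the output data set: the variables are read, used, then not written */
data work.out_option(drop=FirstClass Economy);
   set work.flights;
   Total=FirstClass+Economy;
run;

/* DROP= on the input data set: the variables are never read from work.flights */
data work.in_option;
   set work.flights(drop=FirstClass Economy);
   Total=FirstClass+Economy;   /* both operands are uninitialized, so Total is missing */
run;

proc print data=work.out_option; run;
proc print data=work.in_option; run;
```

In work.out_option Total should be 157 and 151. In work.in_option Total should be missing for both rows, and the log should note that FirstClass and Economy are uninitialized.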
Reading Excel Spreadsheets

Objectives
Create a SAS data set from an Excel spreadsheet that contains date fields.
Extract SAS date values from SAS datetime values.

Business Task
The flight data for Dallas and Los Angeles are in an Excel spreadsheet. The departure date is stored as a date field in the spreadsheet.
Excel Spreadsheet -> SAS Data Set
Flight  Date      Dest  FirstClass  Economy
439     12/11/00  LAX   20          137
921     12/11/00  DFW   20          131
114     12/12/00  LAX   15          170

The IMPORT Procedure
Use the IMPORT procedure to create a SAS data set from the spreadsheet (the SAS Learning Edition does not include PROC IMPORT).
proc import out=work.dfwlaxdates
            datafile='datefields.xls'
            dbms=excel2000;
   getnames=yes;
run;
The IMPORT Wizard from the FILE Menu
Use the IMPORT wizard to create a SAS data set (implementations vary from one version to another).

The Resulting SAS Data Set (SAS V9.1)
The IMPORT procedure stores date fields read from spreadsheets correctly.
proc print data=work.dfwlaxdates;
run;

The SAS System
Obs  Flight  Date       Dest  FirstClass  Economy
  1  439     11DEC2000  LAX   20          137
  2  921     11DEC2000  DFW   20          131
  3  114     12DEC2000  LAX   15          170
  4  982     12DEC2000  dfw    5           85
  5  439     13DEC2000  LAX   14          196
  6  982     13DEC2000  DFW   15          116
  7  431     14DEC2000  LaX   17          166
  8  982     14DEC2000  DFW    7           88
  9  114     15DEC2000  LAX    .          187
 10  982     15DEC2000  DFW   14           31
The Resulting SAS Data Set (Version 8.2)
The IMPORT procedure stores date fields read from spreadsheets as SAS datetime values, rather than dates.
proc print data=work.dfwlaxdates;
run;

The SAS System
Obs  Flight  Date                Dest  FirstClass  Economy
  1  439     11DEC2000:00:00:00  LAX   20          137
  2  921     11DEC2000:00:00:00  DFW   20          131
  3  114     12DEC2000:00:00:00  LAX   15          170
  4  982     12DEC2000:00:00:00  dfw    5           85
  5  439     13DEC2000:00:00:00  LAX   14          196
  6  982     13DEC2000:00:00:00  DFW   15          116
  7  431     14DEC2000:00:00:00  LaX   17          166
  8  982     14DEC2000:00:00:00  DFW    7           88
  9  114     15DEC2000:00:00:00  LAX    .          187
 10  982     15DEC2000:00:00:00  DFW   14           31

SAS Datetime Values
A SAS datetime value is interpreted as the number of seconds between midnight, January 1, 1960, and a specific date and time.
Datetime (read with an informat)   Stored value (seconds)
31DEC1959:23:00:00                 -3600
31DEC1959:23:59:00                 -60
01JAN1960:00:00:00                 0
01JAN1960:00:01:00                 60
01JAN1960:01:00:00                 3600
A format displays the stored value 0 as 01JAN1960:00:00:00.
The DATEPART Function
You can use the DATEPART function to extract the date portion of a SAS datetime value.
DATEPART(SASdatetime) returns the SAS date value from a SAS datetime value.
                                         Stored value   Formatted value
SAS datetime value (stored in seconds)   1291281300     01DEC2000:09:15:00
datepart(SASdatetimevalue)
SAS date value (stored in days)          14945          01DEC2000

The DATEPART Function
Use the DATA step to create a SAS data set that contains SAS date values instead of SAS datetime values.
data work.dfwlax;
   set work.dfwlaxdates(rename=(Date=OldDate));
   drop OldDate;
   Date=datepart(OldDate);
   format Date date9.;
run;
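The numbers in the DATEPART table can be verified directly. This sketch uses a SAS datetime constant (the 'ddMMMyyyy:hh:mm:ss'dt form, which the slides do not introduce) to build the datetime value and then extracts the date portion; the variable names are illustrative.

```sas
data _null_;
   DateTime='01dec2000:09:15:00'dt;   /* seconds since 01JAN1960:00:00:00 */
   DateOnly=datepart(DateTime);       /* days since 01JAN1960             */
   put DateTime= DateOnly=;
   put DateTime= datetime20. DateOnly= date9.;   /* same values, formatted */
run;
```

The first PUT statement should show DateTime=1291281300 and DateOnly=14945, matching the table; the second shows the same values formatted as 01DEC2000:09:15:00 and 01DEC2000.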
The DATEPART Function
proc print data=work.dfwlax;
run;

The SAS System
Obs  Flight  Dest  FirstClass  Economy  Date
  1  439     LAX   20          137      11DEC2000
  2  921     DFW   20          131      11DEC2000
  3  114     LAX   15          170      12DEC2000
  4  982     dfw    5           85      12DEC2000
  5  439     LAX   14          196      13DEC2000
  6  982     DFW   15          116      13DEC2000
  7  431     LaX   17          166      14DEC2000
  8  982     DFW    7           88      14DEC2000
  9  114     LAX    .          187      15DEC2000
 10  982     DFW   14           31      15DEC2000

The Menu Option TOOLS/IMPORT DATA …
The Tools/Import Data option in the SAS Learning Edition (Enterprise Interface) can be used to convert Excel spreadsheets to a SAS data set.
Using Dynamic Data Exchange (DDE)
DDE can be used to read data from the CLIPBOARD, or from various Windows applications.
The application (in this case, Excel) must be open to the data that you want to read.
This may be the only option available if you are using the Learning Edition of SAS in the Programming Interface.

The DDE Triplet
filename in1 DDE
   'Excel|C:\mysasdata\[datefields.xls]DFWLAX!R1C1:R11C5';
in1                Fileref
Excel              Application
C:\mysasdata\      Path
[datefields.xls]   File
DFWLAX             Spreadsheet
R1C1:R11C5         Cells

The DDE Sample Program
filename in1 DDE
   'Excel|C:\mysasdata\[datefields.xls]DFWLAX!R1C1:R11C5';
data dates;
   infile in1 notab dlm='09'x dsd    /* Excel stuff */
          missover firstobs=2;       /* start with 2nd row */
   input Flight Date Dest $ FirstClass Economy;
   informat date mmddyy10.;          /* dates, coming in ... */
   format date mmddyy10.;            /* ... and going out */
run;
proc print data=dates;
run;
The Resulting PROC PRINT
proc print data=work.dates;
run;

The SAS System
Obs  Flight  Date        Dest  FirstClass  Economy
  1  439     12/11/2000  LAX   20          137
  2  921     12/11/2000  DFW   20          131
  3  114     12/12/2000  LAX   15          170
  4  982     12/12/2000  dfw    5           85
  5  439     12/13/2000  LAX   14          196
  6  982     12/13/2000  DFW   15          116
  7  431     12/14/2000  LaX   17          166
  8  982     12/14/2000  DFW    7           88
  9  114     12/15/2000  LAX    .          187
 10  982     12/15/2000  DFW   14           31

Summary
A SAS data set can be created from an Excel spreadsheet using the IMPORT procedure, or the IMPORT wizard in the full implementations of SAS.
SAS date values can be extracted from SAS datetime values using the DATEPART function.
The DDE interface to Excel can be especially useful in accessing specific cells within a spreadsheet.
Minitab Statements for Creating Variables, Conditional Processing, and Erasing Variables

Objectives
Create new Minitab columns or constants using the LET command.
Conditionally execute blocks of Minitab commands with the IF command.
Eliminate columns or constants no longer needed with the ERASE command.
Eliminate rows in a Minitab worksheet with the DELETE command.
The LET Command
Command Syntax
LET C(K) = K (a constant, or a particular value)
LET expression
The expression may contain arithmetic operations, comparison operations, logical operations, and functions.
Arguments may be columns, stored constants, or numbers.

Examples of the LET Command
LET C1 = (C2 + C3)*10 - 60
LET C1 = C1 - MEAN(C1)
LET K1 = 5.3
LET K2 = MEAN(C10)/STDEV(C1)
LET C5 = (C1 < 5)
LET K2 = C1(28)
LET C1(28) = 2.35
LET C5(15) = "blue"
The IF Command
IF logical expression
   (a block of Minitab commands and macro statements)
ELSEIF logical expression
   (a block of Minitab commands and macro statements)
ELSE
   (a block of Minitab commands and macro statements)
ENDIF
Allows you to execute different blocks of code depending on a logical condition.

Logical Expressions
Comparison and Boolean operators:
=    or  EQ   equal to
~=   or  NE   not equal to
<    or  LT   less than
>    or  GT   greater than
<=   or  LE   less than or equal to
>=   or  GE   greater than or equal to
&    or  AND
|    or  OR
~    or  NOT
Example of the IF Command
LET K1 = MEAN(C1)
LET K2 = MEAN(C2)
LET K3 = MEAN(C3)
IF K1 < K2 AND K1 < K3
   PRINT C1
ELSEIF K2 < K1 AND K2 < K3
   PRINT C2
ELSEIF K3 < K1 AND K3 < K2
   PRINT C3
ELSE
   NOTE Note: There are ties.
ENDIF

The ERASE Command
Command Syntax
ERASE E...E
Erases any combination of columns (including their names), constants, and matrices.
It is a good practice to erase all columns, constants, and matrices you no longer need.
Example of the ERASE Command
ERASE C3
changes the worksheet as follows:
C2   C3   C4                  C2   C3   C4
23   4    154                 23        154
31   5    175   ---------->   31        175
22   4    143                 22        143
26   3    153                 26        153
32   6    167                 32        167
30   6    158                 30        158
24   4    155                 24        155

DELETE Command (rows)
Command Syntax
DELETE rows K...K of C...C
Deletes rows K...K from columns C...C, and moves the remaining rows up to close the gap.
DELETE works with both text and numeric columns.

Example of the DELETE Command
DELETE 2 5 6 C2-C4
changes the worksheet as follows:
C2   C3   C4                  C2   C3   C4
23   4    154                 23   4    154
31   5    175   ---------->   22   4    143
22   4    143                 26   3    153
26   3    153                 24   4    155
32   6    167
30   6    158
24   4    155

SUMMARY
• LET statements can be used to create new variables in Minitab.
• Conditional processing can be used to execute Minitab statements when certain conditions are true.
• Columns in a Minitab worksheet may be deleted using the ERASE command.