13. Transforming data with SAS functions

Giorgio Russolillo - Cours de préparation à la certification SAS « Base Programming »
SAS functions
FUNCTION-NAME(argument-1<, …, argument-n>);
Arguments can be:
-  Variables
-  Ex.: MEAN(var1, var2, var3)
-  Constants
-  Ex.: MEAN(456, 12, 5)
-  Expressions
-  Ex.: MEAN(456*2, 12/5, 5, MEAN(456, 12, 5))
-  For some functions, variable lists and arrays can also be used as arguments, preceded by the word OF
-  Ex.: MEAN(OF var1-var3)  MEAN(OF newarray{*})
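The argument forms listed above can be compared side by side. The following is a minimal sketch with hypothetical values (var1 to var3 and the array name scores are not part of the course data); it also shows what happens when the word OF is forgotten.

```sas
DATA _NULL_;
   var1 = 10; var2 = 20; var3 = 60;   /* hypothetical values */
   ARRAY scores{3} var1-var3;         /* hypothetical array over the same variables */
   a = MEAN(var1, var2, var3);        /* explicit list of variables: 30 */
   b = MEAN(OF var1-var3);            /* numbered range list: 30 */
   c = MEAN(OF scores{*});            /* all elements of the array: 30 */
   d = MEAN(var1-var3);               /* no OF: mean of the single value 10-60, i.e. -50 */
   PUT a= b= c= d=;
RUN;
```

Without OF, var1-var3 is read as a subtraction, so the function receives one argument instead of three.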
Target variables
A target variable is the variable to which the result of a function is assigned
-  Ex.: AvgVar = MEAN(var1, var2, var3);
-  Unless the length of the target variable has been previously defined, a default length is assigned
-  The default length depends on the function (for character functions it can be as long as 200)
-  The default length could take more space than necessary. To save storage space, you can add a LENGTH statement to specify a length for the character target variable before the statement that creates the values of that variable
Example data set
PROC SORT DATA=Lib9_3.Order_fact OUT=Order_fact;
   BY Customer_ID;
   WHERE year(Order_Date)=2007;
RUN;
DATA Orders (KEEP=Customer_ID Customer_FirstName Customer_Lastname
             Customer_Name Gender Customer_address Street_number Birth_Date
             Delivery_Date Order_Date CostPrice_Per_Unit);
   MERGE Lib9_3.Customer(IN=cust) Work.Order_fact(IN=order);
   BY Customer_ID;
   IF order=1 AND cust=1 AND Customer_ID<20;
RUN;
PROC CONTENTS DATA=Orders;
RUN;
PROC PRINT DATA=Orders;
RUN;
Automatic character-to-numeric conversion
-  If you reference a character variable in a numeric context (in an arithmetic operation, in a numeric comparison or as an argument of a function), SAS tries to convert the variable values to numeric.
-  SAS does not change the variable type, but it creates a temporary numeric value for each character value of the variable to convert.
-  When data is automatically converted, a message is written to the SAS log
-  The WHERE statement does not perform automatic conversion in comparisons
The automatic conversion
-  Uses the w. informat, where w is the width of the character value that is being converted
-  Produces a numeric missing value from any character value that does not conform to standard numeric notation
DATA prova (KEEP=Newvar Order_Date);
   SET orders;
   Newvar = street_number + costprice_per_unit;
   Order_Date = Customer_FirstName;
RUN;
PROC PRINT DATA=prova;
RUN;
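The two outcomes of automatic conversion (a usable number, or a missing value when the text is not a standard number) can be reproduced without the course data. This is a small sketch with hypothetical variables Code and Bad.

```sas
DATA _NULL_;
   Code = '0042';          /* hypothetical character value that holds digits */
   Bad  = 'N/A';           /* hypothetical character value that is not a number */
   Total = Code + 10;      /* Code is read as 42, so Total is 52 */
   Wrong = Bad + 10;       /* conversion fails, so Wrong is missing */
   PUT Total= Wrong=;
RUN;
```

In both assignments the log reports that character values have been converted to numeric values; the second one also reports invalid numeric data.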
Explicit character-to-numeric conversion
The INPUT function converts a character variable, constant or expression to a numeric value
INPUT(source, informat);
-  source: indicates the character variable, constant or expression to be converted to a numeric value
-  informat: a numeric informat
DATA prova (KEEP=Newvar Newvar2 street_number costprice_per_unit);
   SET orders;
   Newvar = INPUT(street_number, 4.) + costprice_per_unit;
   Newvar2 = INPUT(street_number, 3.) + costprice_per_unit;
RUN;
PROC PRINT DATA=prova;
RUN;
Automatic numeric-to-character conversion
-  If you reference a numeric variable in a character context (when assigning a numeric value to a character-type variable, when using a numeric value with an operator for character values or in a function which requires character values), SAS tries to convert the variable values into characters.
-  When data is automatically converted, a message is written to the SAS log
The automatic conversion
-  SAS writes the numeric value using the BEST12. format. This implies that if the original numeric value has fewer than 12 digits, the resulting character value will have leading blanks
DATA prova (KEEP=Customer_LastName2 Customer_LastName costprice_per_unit);
   SET orders;
   Customer_LastName2 = Customer_LastName;
   Customer_LastName = costprice_per_unit;
RUN;
PROC PRINT DATA=prova;
RUN;
PROC CONTENTS DATA=prova;
RUN;
Explicit numeric-to-character conversion
The PUT function converts a numeric variable, constant or expression to a character string
PUT(source, format);
-  source: indicates the numeric variable, constant or expression to be converted to a character value
-  format: a format matching the data type of the source
DATA prova (KEEP=Customer_LastName costprice_per_unit newCP_unit Street_Number);
   SET orders;
   Customer_LastName = costprice_per_unit;
   newCP_unit = PUT(costprice_per_unit, DOLLAR7.2);
RUN;
PROC PRINT DATA=prova;
RUN;
N.B.: numeric formats right-align the result; character formats left-align the result (see variables Street_Number and CostPrice_per_Unit)
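PUT and INPUT are mirror images of each other, which a round trip makes visible. The sketch below uses a hypothetical value and variable names; DOLLAR9.2 and COMMA9. are standard SAS format and informat names chosen here for illustration.

```sas
DATA _NULL_;
   price = 1234.5;                        /* hypothetical numeric value */
   price_txt = PUT(price, DOLLAR9.2);     /* character value '$1,234.50' */
   back = INPUT(price_txt, COMMA9.);      /* numeric again: the informat strips $ and commas */
   PUT price_txt= back=;
RUN;
```

PUT always returns a character value and INPUT with a numeric informat always returns a numeric value, so neither step writes a conversion note to the log.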
Manipulating SAS date values with functions
SAS date, time and datetime variables are numeric variables
SAS stores:
-  A date value as the number of days from January 1, 1960 to a given date
-  A time value as the number of seconds since midnight
-  A datetime value as the number of seconds from midnight on January 1, 1960 to a given date and time
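The storage rules above can be checked by printing a few date, time and datetime constants without a format. This is a minimal sketch; the variable names are arbitrary.

```sas
DATA _NULL_;
   d0 = '01JAN1960'd;             /* the origin of SAS dates: stored as 0 */
   d1 = '02JAN1960'd;             /* one day later: stored as 1 */
   t  = '00:01:00't;              /* one minute after midnight: stored as 60 */
   dt = '01JAN1960:00:01:00'dt;   /* one minute after the datetime origin: stored as 60 */
   PUT d0= d1= t= dt=;
RUN;
```

Because the values are plain numbers, they can be subtracted and compared directly; formats such as DATE9. only change how they are displayed.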
YEAR, QTR, MONTH, DAY and WEEKDAY functions
YEAR(date)
QTR(date)
MONTH(date)
DAY(date)
WEEKDAY(date)
-  date is a SAS date value that is specified either as a variable or as a SAS date constant
-  YEAR returns a four-digit numeric value that represents the year
-  QTR returns a value of 1, 2, 3 or 4 to indicate the quarter of the year in which the date value falls
-  MONTH returns a numeric value from 1 to 12 representing the month
-  DAY returns a numeric value from 1 to 31 representing the day of the month
-  WEEKDAY returns a numeric value from 1 (Sunday) to 7 (Saturday) representing the day of the week
DATA prova (KEEP=Birth_Date WeekDay Day Month Year);
   SET orders;
   WeekDay = WEEKDAY(Birth_Date);
   Day = DAY(Birth_Date);
   Month = MONTH(Birth_Date);
   Year = YEAR(Birth_Date);
RUN;
PROC PRINT DATA=prova;
RUN;
MDY function
This function returns the SAS date value corresponding to the month, day and year specified in the arguments
MDY(month, day, year)
-  month: a number from 1 to 12 or a variable representing the month
-  day: a number from 1 to 31 or a variable representing the day
-  year: a number that has 2 or 4 digits or a variable representing the year
DATA prova (KEEP=Birth_Date WeekDay Day Month Year myDATE);
   SET orders;
   Day = DAY(Birth_Date);
   Month = MONTH(Birth_Date);
   Year = YEAR(Birth_Date);
   myDATE = MDY(Month, Day, Year);
RUN;
PROC PRINT DATA=prova;
   FORMAT myDate DATE9.;
RUN;
-  If you specify an invalid date, SAS returns a missing value
-  If you specify only 2 digits for the year, pay attention to the YEARCUTOFF system option!
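The behaviour with an invalid date is easy to see on constants. The sketch below uses hypothetical variable names and a date that does not exist (30 February).

```sas
DATA _NULL_;
   ok  = MDY(2, 28, 2007);    /* valid date */
   bad = MDY(2, 30, 2007);    /* 30 February does not exist: missing value */
   PUT ok= DATE9. bad=;
RUN;
```

The second call also writes a note about an invalid argument to the log, which is a quick way to detect bad month, day and year combinations in real data.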
DATE and TODAY functions
-  These functions return the current date from the system clock as a SAS date value
-  They can be used interchangeably
DATE()
TODAY()
-  These functions do not require any arguments, but they must still be followed by parentheses
DATA prova (KEEP=Today);
   SET orders;
   Today = DATE();
RUN;
PROC PRINT DATA=prova (OBS=5);
   FORMAT Today DATE9.;
RUN;
INTCK function
This function counts the interval boundaries that lie between two values; intervals are counted from fixed interval beginnings
INTCK('interval', from, to)
-  'interval' specifies a character constant or variable. It can be one of: DAY, WEEKDAY, WEEK, TENDAY, SEMIMONTH, MONTH, QTR, SEMIYEAR, YEAR. The type of interval (time, date, datetime) must match the type of from
-  from specifies a SAS date, time or datetime value that indicates the beginning of the time span
-  to specifies a SAS date, time or datetime value that indicates the end of the time span
For example, WEEK intervals are counted by Sundays, MONTH intervals are counted from day 1 of each month and YEAR intervals are counted from January 1.
DATA prova (KEEP=Birth_Date Order_Date Delivery_Date Delai_Livr Sem1 Sem2 Mois An Age);
   SET orders;
   Delai_Livr = INTCK("DAY", Order_Date, Delivery_Date);
   Sem1 = INTCK("WEEK", '31dec2011'd, '01jan2012'd);    /* 1: a Sunday is crossed */
   Sem2 = INTCK("WEEK", '31dec2012'd, '01jan2013'd);    /* 0: no Sunday is crossed */
   Mois = INTCK("MONTH", '31dec2012'd, '01jan2013'd);   /* 1: the first day of a month is crossed */
   An = INTCK("YEAR", '31dec2012'd, '01jan2013'd);      /* 1: January 1 is crossed */
   Age = INTCK("YEAR", Birth_date, TODAY());            /* counts January 1 boundaries, not full years */
RUN;
PROC PRINT DATA=prova;
RUN;
The INTNX function
The INTNX function applies multiples of a given interval to a date, time or datetime value and returns the resulting value
INTNX('interval', start-from, increment <, 'alignment'>)
-  'interval' specifies a character constant or variable. It can be one of: DAY, WEEKDAY, WEEK, TENDAY, SEMIMONTH, MONTH, QTR, SEMIYEAR, YEAR. The type of interval must match the type of start-from
-  start-from specifies a SAS date, time or datetime value that indicates the beginning of the time span
-  increment is a negative or positive integer that represents the number of time intervals toward the past or the future
-  'alignment' specifies that the returned date is aligned to the beginning (BEGINNING or B, default value), middle (MIDDLE or M) or end (END or E) of the interval, or to the same day (SAMEDAY or S) as the input date
DATA prova (KEEP=Order_Date IntervMonthB IntervMonthM IntervMonthE IntervMonthS);
   SET orders;
   IntervMonthB = INTNX('MONTH', Order_Date, 2, 'B');
   IntervMonthM = INTNX('MONTH', Order_Date, 2, 'M');
   IntervMonthE = INTNX('MONTH', Order_Date, 2, 'E');
   IntervMonthS = INTNX('MONTH', Order_Date, 2, 'S');
RUN;
PROC PRINT DATA=prova (OBS=10);
   FORMAT IntervMonthB IntervMonthM IntervMonthE IntervMonthS date9.;
RUN;
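The effect of the alignment argument and of a negative increment is easier to follow on a single known date. This sketch assumes a hypothetical start date of 15 January 2007; the variable names are arbitrary.

```sas
DATA _NULL_;
   start = '15JAN2007'd;                     /* hypothetical start date */
   b = INTNX('MONTH', start, 2);             /* default alignment (beginning): 01MAR2007 */
   e = INTNX('MONTH', start, 2, 'E');        /* end of the interval: 31MAR2007 */
   s = INTNX('MONTH', start, 2, 'S');        /* same day of the month: 15MAR2007 */
   prev = INTNX('MONTH', start, -1, 'E');    /* negative increment: 31DEC2006 */
   PUT b= DATE9. e= DATE9. s= DATE9. prev= DATE9.;
RUN;
```

A negative increment combined with the END alignment is a common way to obtain the last day of the previous month.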
DATDIF and YRDIF
They return the difference in days and in years between two dates, respectively
DATDIF(start-date, end-date, 'basis')
YRDIF(start-date, end-date, 'basis')
-  start-date specifies the starting date as a SAS date value
-  end-date specifies the ending date as a SAS date value
-  basis specifies how SAS calculates the date difference. Four character strings are valid for basis in the YRDIF function:
•  '30/360': specifies a 30-day month and a 360-day year
•  'ACT/ACT': uses the actual number of days or years between dates
•  'ACT/360': uses the actual number of days between dates and divides it by 360 to calculate the number of years (valid only for YRDIF)
•  'ACT/365': uses the actual number of days between dates and divides it by 365 to calculate the number of years (valid only for YRDIF)
DATA prova (KEEP=Birth_Date Order_Date Delivery_Date Delai_Livr Delai_Livr1 Age Age1);
   SET orders;
   Delai_Livr = DATDIF(Order_Date, Delivery_Date, "ACT/ACT");
   Delai_Livr1 = INTCK("DAY", Order_Date, Delivery_Date);
   Age = YRDIF(Birth_date, TODAY(), "ACT/ACT");
   Age1 = INTCK("YEAR", Birth_date, TODAY());
RUN;
PROC PRINT DATA=prova;
RUN;
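YRDIF and INTCK can give different ages for the same person, because INTCK counts the January 1 boundaries crossed while YRDIF measures elapsed time. The sketch below uses two hypothetical fixed dates so that the result does not depend on the day it is run.

```sas
DATA _NULL_;
   birth = '15JUN1980'd;                               /* hypothetical birth date */
   ref   = '01JAN2013'd;                               /* hypothetical reference date */
   age_intck = INTCK('YEAR', birth, ref);              /* 33: counts January 1 boundaries */
   age_yrdif = INT(YRDIF(birth, ref, 'ACT/ACT'));      /* 32: completed years only */
   PUT age_intck= age_yrdif=;
RUN;
```

On the reference date the person has not yet had a 33rd birthday, so the truncated YRDIF result is the one that matches the usual notion of age.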
SCAN function
This function separates a character value into words and returns the n-th word
SCAN(argument, n <, delimiters>)
-  argument specifies the character variable or expression to scan
-  n specifies which word to return
-  delimiters are special characters that must be enclosed in single quotation marks
-  If you specify multiple delimiters, SAS uses any of the delimiters, singly or in any combination
-  Default delimiters: blank . < ( + & ! $ * ) ; ^ - / , %
DATA prova (KEEP=Customer_Name Nom Prenom);
   SET orders;
   LENGTH Nom $ 20 Prenom $ 20;
   Nom = SCAN(Customer_Name, 2);
   Prenom = SCAN(Customer_Name, 1);
RUN;
PROC PRINT DATA=prova;
RUN;
N.B.: This function assigns a length of 200 to each target variable. To save storage space, you can add a LENGTH statement to specify a length for the character target variable before the statement that contains the SCAN function
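When the delimiters are given explicitly, only those characters separate words. The following sketch uses a hypothetical "Last, First" value and passes both the comma and the blank as delimiters.

```sas
DATA _NULL_;
   LENGTH Last $ 20 First $ 20;       /* set the lengths before SCAN creates the values */
   Name = 'Black, Avery';             /* hypothetical value in "Last, First" order */
   Last  = SCAN(Name, 1, ', ');       /* first word: Black */
   First = SCAN(Name, 2, ', ');       /* second word: Avery */
   PUT Last= First=;
RUN;
```

The comma followed by a blank counts as a single separator here, which is what "in any combination" means in practice.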
SUBSTR function
This function extracts or replaces a portion of a character variable
-  When the function is on the right side of an assignment statement, the function returns the requested string
-  When the function is on the left side of an assignment statement, the function replaces the string with the string indicated on the right side
SUBSTR(argument, position <, n>)
-  argument specifies the character variable or expression to scan
-  position is the character position to start from
-  n specifies the number of characters to extract. If it is omitted, all the remaining characters are included in the substring
DATA prova (KEEP=Customer_Name Initial);
   SET orders;
   Initial = SUBSTR(Customer_Name, 1, 1);
   IF (Customer_Name EQ 'Sandrina Stephano') THEN SUBSTR(Customer_Name, 6, 3) = 'ona';
RUN;
PROC PRINT DATA=prova (OBS=7);
RUN;
TRIM function
TRIM(argument)
-  argument specifies any character expression
This function removes trailing blanks from character values
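Trailing blanks matter mostly when values are concatenated, because a character variable is padded to its full length. The sketch below uses hypothetical names and lengths to compare a plain concatenation with one that uses TRIM.

```sas
DATA _NULL_;
   LENGTH First $ 10 Last $ 10;
   First = 'Ada';                            /* stored as 'Ada' plus 7 trailing blanks */
   Last  = 'Lovelace';
   Padded  = First || Last;                  /* the padding of First is kept between the names */
   Trimmed = TRIM(First) || ' ' || Last;     /* 'Ada Lovelace' */
   PUT Padded= / Trimmed=;
RUN;
```

TRIM only affects the expression in which it appears; the variable First itself keeps its length of 10.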
CATX function
This function concatenates character strings, removes leading and trailing blanks and inserts separators
CATX(separator, string-1<, …, string-n>)
-  separator specifies the character string that is used as a separator between concatenated strings
-  string specifies a SAS character string
DATA prova (KEEP=Customer_FirstName Customer_LastName myName Gender);
   SET orders;
   IF (Gender EQ 'F')
      THEN myName = CATX(' ', 'M.me', Customer_FirstName, Customer_LastName);
      ELSE myName = CATX(' ', 'M.', Customer_FirstName, Customer_LastName);
RUN;
PROC PRINT DATA=prova;
RUN;
INDEX function
This function searches for a string and returns the position of the string's first character; if the string is not found, it returns a value of 0
INDEX(source, 'excerpt')
-  source specifies the character variable or expression to search
-  excerpt specifies a character string enclosed in quotation marks
DATA prova (KEEP=Customer_Name Black_pos);
   SET orders;
   Black_pos = INDEX(Customer_Name, 'Black');
   IF Black_pos > 0;
RUN;
PROC PRINT DATA=prova;
RUN;
FIND function
This function searches for a substring and returns the position of the substring's first character; if the substring is not found, it returns a value of 0.
FIND is similar to INDEX, but it allows 'modifiers' and a 'starting position' to be specified. Two modifiers exist, 'i' and 't', written between quotation marks. The 'i' modifier tells SAS to ignore case; the 't' modifier trims trailing blanks from string and substring
FIND(string, substring <, 'modifiers'> <, startpos>)
-  string specifies the character constant, variable or expression that will be searched for substrings
-  substring is the character constant, variable or expression that specifies the substring of characters to search for in string
-  modifiers is the character constant, variable or expression that specifies one or more modifiers
-  startpos is an integer that specifies the position at which the search should start. Its sign specifies the direction of the search (- = left, + = right)
DATA prova (KEEP=Customer_Address);
   SET orders;
   IF FIND(Customer_Address, '1068', -6) > 0;
RUN;
PROC PRINT DATA=prova;
RUN;
DATA prova (KEEP=Customer_Address);
   SET orders;
   IF FIND(Customer_Address, '1068', 't', -6) > 0
      OR FIND(Customer_Address, 'bryant', "i", 1) > 0;
RUN;
PROC PRINT DATA=prova;
RUN;
UPCASE and LOWCASE
UPCASE(argument)
LOWCASE(argument)
-  argument can be any SAS expression
These functions convert all letters in a character expression to uppercase and lowercase, respectively
DATA prova (KEEP=Customer_LastName);
   SET orders;
   Customer_LastName = UPCASE(Customer_LastName);
RUN;
PROC PRINT DATA=prova;
RUN;
PROPCASE
This function converts character expressions so that the first letter of each word is capitalised
PROPCASE(argument <, delimiter(s)>)
-  argument can be any SAS expression
-  delimiters specifies one or more delimiters enclosed in quotation marks. The default delimiters are: blank / - ( . Tab
DATA prova (KEEP=Customer_Name C_N);
   SET orders;
   Customer_Name = UPCASE(Customer_Name);
   C_N = PROPCASE(Customer_Name);
RUN;
PROC PRINT DATA=prova;
RUN;
TRANWRD function
This function removes all occurrences of a pattern of characters from the source string and replaces them with a new substring
TRANWRD(source, target, replacement)
-  source specifies the source string that you want to translate
-  target specifies the string that SAS searches for
-  replacement specifies the string that replaces target
DATA prova (KEEP=Customer_Name Customer_Name2);
   SET orders;
   Customer_Name2 = TRANWRD(Customer_Name, 'Sandrina', 'Laura');
RUN;
PROC PRINT DATA=prova (OBS=7);
RUN;
Modifying numeric values with functions
INT(argument)
Returns the integer portion of a numeric value
-  argument can be a numeric variable, constant or expression
ROUND(argument <, round-off-unit>)
Rounds values to the nearest specified unit
-  argument can be a numeric variable, constant or expression
-  round-off-unit is numeric and positive. A unit smaller than 1 is written with a leading period (for example, .1 rounds to the nearest tenth); if the unit is omitted, the value is rounded to the nearest integer
DATA prova (KEEP=CostPrice_Per_Unit RoundedPrice IntegerPrice);
   SET orders;
   RoundedPrice = ROUND(CostPrice_Per_Unit, .1);
   IntegerPrice = INT(CostPrice_Per_Unit);
RUN;
PROC PRINT DATA=prova;
   FORMAT RoundedPrice IntegerPrice DOLLAR7.2;
RUN;
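The role of the round-off unit, and the difference between rounding and truncating, can be seen on a single hypothetical value. The variable names below are arbitrary.

```sas
DATA _NULL_;
   x = 1234.567;                  /* hypothetical value */
   tenth   = ROUND(x, .1);        /* nearest tenth: 1234.6 */
   cent    = ROUND(x, .01);       /* nearest hundredth: 1234.57 */
   tens    = ROUND(x, 10);        /* nearest ten: 1230 */
   nearest = ROUND(x);            /* unit omitted, nearest integer: 1235 */
   whole   = INT(x);              /* integer portion: 1234 */
   neg     = INT(-3.7);           /* truncation toward zero: -3 */
   PUT tenth= cent= tens= nearest= whole= neg=;
RUN;
```

INT drops the decimals without rounding, so INT(x) and ROUND(x) differ whenever the decimal part is .5 or more.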