CMG 2007

To Compress or Not to Compress?
Chuck Hopf
What is your precious?
• Gollum says every data center has
something that is precious or hard to come
by
– CPU Time
– DASD Space
– Run Time
– IO
– Memory
Lots of talk
• On the LISTSERV – does compression
use more CPU? Does it save DASD
space?
• On the LISTSERV – what is the best
BUFNO= to use with MXG?
Testing the theories
• Built two tests
– COMPRESS=NO, varying BUFNO over 2, 10,
15, and 20
– COMPRESS=YES, again varying the BUFNO
An Epiphany!
• What if you run with COMPRESS=NO and
send the output to PDB as a temporary
dataset and then at the end, turn on
COMPRESS=YES and do a PROC COPY
INDD=PDB OUTDD=PERMPDB
NOCLONE; ? That would eliminate all of
the compression during the reading and
writing of all of the interim datasets but still
create a compressed PDB.
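The idea on this slide can be sketched in SAS roughly as follows. This is a minimal sketch, not the MXG-supplied job: it assumes the librefs PDB (temporary) and PERMPDB (permanent) are already allocated, and the middle comment stands in for whatever BUILDPDB processing the site runs. NOCLONE is what lets the output library take the COMPRESS= setting in effect at copy time instead of inheriting the attributes of the input datasets.

```sas
/* Build phase: interim datasets and the temporary PDB are       */
/* written uncompressed, so no CPU is spent on compression here. */
OPTIONS COMPRESS=NO;

/* ... BUILDPDB processing writes its output to libref PDB ...   */

/* Final phase: turn compression on and copy once.               */
/* PDB and PERMPDB are assumed to be allocated already.          */
OPTIONS COMPRESS=YES;
PROC COPY INDD=PDB OUTDD=PERMPDB NOCLONE;
RUN;
```

Only the final copy pays the compression cost, which is why this variant was tested as a third case.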
So there are now 3 Tests!
• TEST=NO - COMPRESS=NO
• TEST=NO/YES - COMPRESS=NO but
final PDB is compressed
• TEST=YES – COMPRESS=YES
[Chart slides: CPU Time, Elapsed Time, Low Memory, High Memory, EXCP DASD, DASD IO Time, DASD Space, DASD Space by DDNAME]
Conclusions?
• Running with COMPRESS=NO and then
copying to a compressed PDB optimizes
permanent DASD space and uses very
little additional CPU.
• Even better, use the LIBNAME OPTION to
turn it on where you want:
– LIBNAME PDB COMPRESS=YES; /* zOS
only */
• Memory requirements increase with
BUFNO but are not really that bad.
Caveats!
• BLKSIZE matters. SAS JCL procedures are
sometimes built with a BLKSIZE of 6160
on WORK. This radically affects the IO
counts. Use the recommended
BLKSIZE(DASD)=OPT and leave the DCB
attributes off of SAS datasets.
• REGION may have to be increased – use
REGION=0M and be sure you are using
the MXG defaults for MEMSIZE.
• This all applies to zOS not to ASCII
platforms
So What About ASCII?
• Using the same data, tests run with SAS
9.2 on Win 7 system
• 1.5GB memory
• Dell 4600 – P4 2.7GHz
ASCII Results
Test          BUFNO    Elapsed  CPU      Memory
COMPRESS=NO   DEFAULT  06:45.4  03:25.0  95713K
COMPRESS=YES  DEFAULT  06:12.5  02:51.6  95721K
COMPRESS=NO   16K      07:35.1  03:56.8  275537K
COMPRESS=YES  16K      05:57.8  02:49.1  179769K
COMPRESS=NO   32K      07:39.2  02:58.0  275537K
COMPRESS=YES  32K      06:05.4  02:51.0  179679K
COMPRESS=NO   40K      08:28.6  04:17.0  275537K
COMPRESS=YES  40K      06:20.4  02:59.1  179769K
COMPRESS=NO   80K      07:44.1  04:02.8  275537K
COMPRESS=YES  80K      05:59.2  02:54.5  179769K
COMPRESS=NO   16M      07:42.1  04:01.1  275537K
COMPRESS=YES  16M      06:09.2  02:53.0  179769K
COMPRESS=NO   32M      07:43.4  03:54.9  275537K
COMPRESS=YES  32M      05:57.4  02:51.1  179769K
COMPRESS=NO   64M      08:02.7  03:58.7  275537K
COMPRESS=YES  64M      06:37.8  02:55.5  179769K
COMPRESS=NO   128M     08:14.2  03:55.0  275537K
COMPRESS=YES  128M     06:30.0  02:58.0  179769K
COMPRESS=NO   10       07:11.5  03:16.1  96259K
COMPRESS=YES  10       05:56.2  02:37.1  96649K
COMPRESS=NO   40       07:17.5  03:20.9  97603K
COMPRESS=YES  40       06:00.1  02:41.1  98892K
COMPRESS=NO   80       07:13.0  03:24.1  99529K
COMPRESS=YES  80       05:57.6  02:36.2  102095K
COMPRESS=NO   160      07:16.1  03:24.0  103379K
COMPRESS=YES  160      05:44.6  02:26.5  108825K
Wow!
• COMPRESS=YES outperforms
COMPRESS=NO!
• BUFNO makes some difference but not a
lot and BUFNO=10 looks to be optimal
– Difference is in seconds not minutes
– But… there is something we don’t understand
in the memory numbers
• Runs faster under Win 7 than under zOS
– But does not include download time
So What Should You Do?
• It Depends on what your ‘precious’ is
– Running zOS
• Optimal for CPU and DASD is COMPRESS=NO
with a copy to a compressed dataset at the end or
by setting the compress=YES option with a
LIBNAME
• Optimal for CPU is COMPRESS=NO
• Optimal for DASD is COMPRESS=YES
• BUFNO=10 is optimal for run time
– Running ASCII
• Optimal for CPU and DASD is COMPRESS=YES
JCL
//* SAMPLE JCL TO RUN BUILDPDB WITH COMPRESS=NO AND COMPRESS AT
//* THE END USING PROC COPY
//S1       EXEC MXGSASV9
//PDB      DD DSN=MXG.PDB(+1),SPACE=(CYL,(500,500)),
//            DISP=(,CATLG,DELETE)
//SPININ   DD DSN=MXG.SPIN(0),DISP=OLD
//SPIN     DD DSN=MXG.SPIN(+1),SPACE=(CYL,(500,500)),
//            DISP=(,CATLG,DELETE)
//CICSTRAN DD DSN=MXG.CICSTRAN(+1),SPACE=(CYL,(500,500)),
//            DISP=(,CATLG,DELETE)
//DB2ACCT  DD DSN=MXG.DB2ACCT(+1),SPACE=(CYL,(500,500)),
//            DISP=(,CATLG,DELETE)
//SMF      DD DSN=YOUR.SMF.DATA,DISP=SHR
//SYSIN    DD *
 OPTIONS COMPRESS=NO BUFNO=10;
 LIBNAME PDB COMPRESS=YES;
 LIBNAME SPIN COMPRESS=YES;
 %LET SPININ=SPININ;
 %UTILBLDP(
   MACKEEPX=
     MACRO _LDB2ACC DB2ACCT.DB2ACCT %
     MACRO _KDB2ACC COMPRESS=YES %
     MACRO _KCICTRN COMPRESS=YES %
   ,
   SPINCNT=7,
   SPINUOW=2,
   OUTFILE=INSTREAM);
 %INCLUDE INSTREAM;
JCL is in the 27.10 SOURCLIB as JCLCMPDB
Why UTILBLDP?
• Allows you to add data sources to
BUILDPDB without having to edit the
macros in the SOURCLIB.
• Allows you to suppress data sources like
110 and DB2 and TYPE74 and process
them in other jobs again without editing
the macros.
• Flexibility
Example
OPTIONS COMPRESS=NO BUFNO=10;
LIBNAME PDB COMPRESS=YES;
LIBNAME SPIN COMPRESS=YES;
%LET SPININ=SPININ;
%UTILBLDP(
USERADD=42,
SUPPRESS=110 DB2,
SPINCNT=7,
OUTFILE=INSTREAM);
%INCLUDE INSTREAM;
RUN;
MXG User Experience
• Running MXG with WPS instead of SAS
• Data from multiple platforms
• Processed under two Virtual products
• Also, Comparison of SAS/PC and WPS on
zLinux
PC/SAS VMWARE/Windows versus PC/SAS Hyper-V/Windows:
(four platforms’ data, three installation “groups” PROD/QA/DEV)
Processing of performance data collected from Unix, zVM/Linux, zOS and Windows.

Group  Data From        VMWARE    Hyper-V
PROD   Unix             00:05:30  00:10:56
PROD   zOS              00:01:30  00:04:54
PROD   zVM/Linux        00:03:07  00:08:08
PROD   Windows Servers  02:43:08  09:32:57
QA     Unix             00:00:31  00:04:18
QA     zOS              00:01:27  00:02:46
QA     zVM/Linux        00:01:02  00:07:06
QA     Windows Servers  00:41:24  02:34:19
DEV    Unix             00:00:43  00:02:42
DEV    zOS              00:00:21  00:01:42
DEV    zVM/Linux        00:01:08  00:03:34
DEV    Windows Servers  00:09:06  00:38:47
PC/SAS versus LNX/WPS
• PC/SAS VMWARE/Windows versus WPS zVM/Linux
• PC/SAS VMWARE takes 2:43:08 to process the data
from “Windows Servers”; the WPS zVM/Linux
environment can do the same work in 1:30:00 (hh:mm:ss).
• That is, the mainframe WPS zVM/Linux run is a 45%
reduction in elapsed time over PC/SAS VMWARE/WIN.
• This is most likely due to the extra bandwidth the
mainframe has for I/Os compared to the Windows
environment.
• The results for Windows would probably be better if
WIN2008 had been used.
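The 45% figure can be checked from the two elapsed times quoted above: 2:43:08 is 9788 seconds and 1:30:00 is 5400 seconds, so the reduction is (9788 - 5400) / 9788, or about 44.8%. A minimal SAS sketch of the same arithmetic follows; the variable names are made up for illustration.

```sas
/* Check the quoted improvement from the two elapsed times.   */
/* SAS time constants are stored as seconds since midnight.   */
DATA _NULL_;
  PCSAS = '02:43:08'T;   /* PC/SAS VMWARE, Windows Servers data */
  WPS   = '01:30:00'T;   /* WPS zVM/Linux, same data            */
  PCT   = 100 * (PCSAS - WPS) / PCSAS;
  PUT 'Elapsed time reduction (percent): ' PCT 5.1;
RUN;
```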
PC/SAS versus WPS on z
• PC/SAS under Hyper-V
• WPS under zVM/Linux on z-10
Z10: SAS versus WPS
• zOS/SAS versus zOS/WPS to run MXG
• 30% more I/Os for SAS
• TCB for WPS = 551,423
• TCB for SAS = 551,273
• NOTES:
– WPS version 2.4.0.1 and SAS 9.1.3
– MXG from FEB 2009