TECHNICAL BRIEF
Primary Storage Compression
with Storage Foundation 6.0
Contents
Introduction
What is Compression?
Differentiators
Compression Targets
Tools/Reporting
Performance
Introduction
Primary storage, the first line in maintaining your company’s mission-critical data, is under increasing pressure
in today’s IT environment. Unstructured data generated by both end users and applications is growing exponentially:
files grow larger as applications become more complex, and they are stored longer to meet more stringent retention
requirements. This data can include anything from the CFO’s spreadsheets and sales managers’ PowerPoint presentations
to archive log files from Oracle databases.
With Storage Foundation 6.0, Symantec enables customers to get the most out of the primary storage they already have,
reducing complexity and improving optimization through host-based compression. Because compression is enabled at the
file system layer, the storage savings apply on top of any storage, and the complex and expensive appliances associated
with primary storage compression are no longer needed.
What is Compression?
Storage Foundation 6.0 File Compression requires no application changes and adds minimal overhead. Compression is
controlled through a CLI at the directory or file level and is executed out-of-band, that is, after the write. Once
compression is enabled, directories and files will contain a mix of compressed and uncompressed data blocks. This is
managed automatically by the file system, and uncompressed data will be compressed during the next sweep.
When dealing with compressed files, there are four main actions to consider:
1) Compression: When initiated, the compression engine examines the data in 1 MB blocks, determines which extents
can be compressed, and compresses those extents using the gzip compression algorithm.
2) Reading: Compression in Storage Foundation does not change the file extension, allowing users and applications
to use files normally. When a file is opened, only the blocks being read are uncompressed, and they are uncompressed
in memory, not on disk. While a performance impact will be seen on the initial read, subsequent reads may perform
better because the data comes directly from cache.
3) Writing: When files are modified, only the blocks with changed data are expanded on disk, not the entire file.
4) Backup: Backup programs that read files through the namespace are not aware that a file is compressed. When the
file is read, the file system recognizes any compressed extents and uncompresses them so they can be backed up.
Figure 1 - File extents compressed in blocks
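The block-wise behavior described above can be illustrated with a small Python sketch. It is not the Storage Foundation implementation: the function names (compress_blocks, read_range, write_range) and the in-memory list of blocks are hypothetical, and zlib's DEFLATE stands in for the gzip algorithm named in this brief. The sketch assumes writes stay within the existing file length.

```python
import zlib

BLOCK_SIZE = 1024 * 1024  # 1 MB units, as described in the brief


def compress_blocks(data, block_size=BLOCK_SIZE):
    """Split data into blocks; keep a block compressed only if that saves space."""
    blocks = []
    for offset in range(0, len(data), block_size):
        raw = data[offset:offset + block_size]
        packed = zlib.compress(raw)
        if len(packed) < len(raw):
            blocks.append((True, packed))   # stored compressed
        else:
            blocks.append((False, raw))     # stored as-is
    return blocks


def _expand(block):
    """Return the uncompressed bytes of one (is_compressed, payload) block."""
    is_compressed, payload = block
    return zlib.decompress(payload) if is_compressed else payload


def read_range(blocks, start, length, block_size=BLOCK_SIZE):
    """Read bytes [start, start + length), expanding only the overlapping blocks."""
    if length <= 0:
        return b""
    first = start // block_size
    last = (start + length - 1) // block_size
    out = bytearray()
    for block in blocks[first:last + 1]:
        out += _expand(block)               # expanded in memory only
    skip = start - first * block_size
    return bytes(out[skip:skip + length])


def write_range(blocks, start, new_bytes, block_size=BLOCK_SIZE):
    """Overwrite bytes in place; only the touched blocks end up uncompressed."""
    pos, done = start, 0
    while done < len(new_bytes):
        index = pos // block_size
        if index >= len(blocks):
            raise ValueError("write extends past end of file")
        raw = bytearray(_expand(blocks[index]))
        within = pos - index * block_size
        n = min(len(new_bytes) - done, len(raw) - within)
        if n <= 0:
            raise ValueError("write extends past end of file")
        raw[within:within + n] = new_bytes[done:done + n]
        blocks[index] = (False, bytes(raw))  # left expanded until the next sweep
        pos += n
        done += n
```

Blocks that do not shrink are stored unchanged, which is one way a file ends up with a mix of compressed and uncompressed blocks. A later pass over the blocks marked False would play the role of the next sweep.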
Differentiators
- No hardware dependencies: Compression within Storage Foundation 6.0 works in-place on any supported
storage. It is supported on Storage Foundation Standard and Enterprise licenses.
- Inode maintenance: Unlike gzip or other compression tools, inode numbers and file extensions are
maintained. This allows users and applications to continue without changing their behavior.
- Metadata consistency: Compression does not modify the metadata of files and directories, allowing for
consistent and accurate life-cycle management.
Compression Targets
- Database archive logs
  o Oracle best practices recommend archive logs for database recovery.
  o As databases age and are used strictly for reads, their log files go stale and remain unchanged.
- Unstructured data
  o Studies indicate 90% of user-created data is never accessed after creation.
  o Increased regulatory restrictions lead to longer-term storage.
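One way to locate candidates such as stale archive logs is to list files that have not been modified for some time. The Python sketch below is an illustration only and is not part of Storage Foundation. The function name find_stale_files and the age threshold are assumptions, and modification time is used because access times are often not recorded on production mounts.

```python
import os
import stat
import time


def find_stale_files(root, min_age_days, now=None):
    """Yield (path, size_bytes) for regular files not modified in min_age_days."""
    now = time.time() if now is None else now
    cutoff = now - min_age_days * 86400
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path, follow_symlinks=False)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode) and st.st_mtime < cutoff:
                yield path, st.st_size
```

For example, list(find_stale_files("/path/to/archive", 30)) would return the files under a hypothetical archive directory that have been unchanged for at least 30 days, together with their logical sizes.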
Tools/Reporting
To the end user, files and folders look exactly the same. For the storage administrator, a new CLI introduced in
6.0 reports physical versus logical consumption, blocks used, the percentage of extents compressed, and the
compression ratio.
In Figure 2, bigfile is shown using both the ls command and the vxcompress command. The ls command shows
the number of blocks used (1797) and the logical size of the file (5.6 MB). The vxcompress command reports the
compression ratio, the compression type, and the percentage of compressed extents.
Figure 2 - Compressed File
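The relationship between blocks used and logical size can be worked through with a short Python sketch. It does not call vxcompress. The function names space_savings and report are hypothetical, and the block unit is an assumption, since the brief does not state the unit of the ls block count. The report function relies on st_blocks, which POSIX systems express in 512-byte units and which is not available on Windows.

```python
import os


def space_savings(logical_bytes, allocated_blocks, block_bytes):
    """Fraction of the logical size saved on disk (0.0 if nothing is saved)."""
    if logical_bytes <= 0:
        return 0.0
    physical_bytes = allocated_blocks * block_bytes
    return max(0.0, 1.0 - physical_bytes / logical_bytes)


def report(path):
    """Compare the logical size with the allocated size of one file (POSIX only)."""
    st = os.stat(path)
    physical_bytes = st.st_blocks * 512  # st_blocks is counted in 512-byte units
    return {
        "logical_bytes": st.st_size,
        "physical_bytes": physical_bytes,
        "savings": space_savings(st.st_size, st.st_blocks, 512),
    }
```

If the 1797 blocks in Figure 2 are taken to be 1 KB each, space_savings(5_600_000, 1797, 1024) gives roughly two-thirds of the logical size saved. With a different block unit the result changes accordingly.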
Performance
How well a system handles the compression and uncompression of files is a key metric in deciding which data
types can be compressed and at which stage in their lifecycle. Compression is CPU-intensive, and the CPU load should
be considered carefully. Reading from compressed files can also result in performance degradation due to the increased
I/O. However, with multi-core CPUs becoming standard in large enterprises, CPU time should be readily available. The
total space savings and the time to compress and uncompress will vary greatly depending on server type, server load,
file type, and compression settings. Figure 3 shows results from sample configurations; individual investigation is
recommended for specific use cases.
Figure 3 - Compression Ratio Examples
Data Type                     Platform           Original Size   Savings   CPU Usage
Unstructured (80,000 files)   Solaris Sparc 10   5 GB            70%       1 CPU: 6%
Oracle archive log            Linux RHES         18 GB           60%       4 CPU: 20%
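For a first estimate on your own data before testing on a real file system, a sample can be compressed in memory while measuring savings and CPU time. The Python sketch below is an approximation under stated assumptions: it uses zlib's DEFLATE rather than the Storage Foundation engine, the function name measure is hypothetical, and the numbers it produces are not comparable to the figures above.

```python
import time
import zlib


def measure(data, level=6):
    """Compress and uncompress data once; return savings and CPU seconds used."""
    if not data:
        raise ValueError("data must not be empty")
    start = time.process_time()
    packed = zlib.compress(data, level)
    compress_cpu_s = time.process_time() - start

    start = time.process_time()
    restored = zlib.decompress(packed)
    uncompress_cpu_s = time.process_time() - start

    if restored != data:
        raise RuntimeError("round trip did not reproduce the input")
    return {
        "level": level,
        "savings": 1.0 - len(packed) / len(data),
        "compress_cpu_s": compress_cpu_s,
        "uncompress_cpu_s": uncompress_cpu_s,
    }
```

Running measure on representative samples at several levels (1 through 9 are valid for zlib) gives a rough view of how file type and compression settings trade savings against CPU time.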
About Symantec
Symantec is a global leader in providing security, storage and systems management solutions to help consumers and
organizations secure and manage their information-driven world. Our software and services protect against more risks
at more points, more completely and efficiently, enabling confidence wherever information is used or stored.
Headquartered in Mountain View, Calif., Symantec has operations in 40 countries. More information is available at
www.symantec.com.
For specific country offices and contact numbers, please visit our website. For product information in the U.S.,
call toll-free 1 (800) 745 6054.
Symantec Corporation
World Headquarters
350 Ellis Street
Mountain View, CA 94043 USA
1 (800) 721 3934
www.symantec.com