EPA Wins the Game by Sliding into Home Plate with FDR/UPSTREAM

THE U.S. EPA MAKES A SMOOTH
TRANSITION TO INNOVATION DATA
PROCESSING’S FDR/UPSTREAM
BYLINE: JIM UTSLER
It’s most every IT manager’s nightmare: a crucial piece of
operational software that’s going to be discontinued or has
become obsolete. They could continue using it, but it may not
be supported in the future and upgrades—and, therefore,
state-of-the-art technologies—are unlikely to follow.
Alternatively, they’ll probably have to bite the bullet and find a
new solution—no matter how expensive. But this may mean
dealing with an unfamiliar vendor and its equally unfamiliar
product. And then, there are the testing, training and deployment costs. Hardly a heart-warming proposition.
One organization facing just such a situation was
the U.S. Environmental Protection Agency (EPA). The
backup software it had relied on for many years was reaching
end of life, and it had to find another application to take its place.
Fortunately, the agency did its homework when evaluating replacements and
decided upon just the right solution. After deploying
Innovation Data Processing’s FDR/UPSTREAM, which is a
distributed-backup tool using the z/OS and OS/390 mainframe
as the backup server, it now has a highly functional and robust
storage environment that will take it well into the future.
PLAY BALL!
Although it may seem that the EPA has been around for
a century or more, its origins go back only to 1970, when
then-President Nixon and the U.S. Congress established the
agency to help monitor and manage the pollutants that impact
both human health and the environment. With an increasingly
focused mandate, it now enforces Congressional environmental laws; ensures that environmental protection standards are
consistent with national environmental goals; conducts
research into the effects of pollutants; recommends policy
changes as a result of these studies; and assists other organizations to help stem pollution.
something up almost all the time,” VanBenthuysen adds.) This
rigorous backup regimen allows the organization to rest
assured that in the event of a complete server meltdown, the
data residing on it can be easily restored from an up-to-date
backup. Prior to deploying FDR/UPSTREAM, which allows
for automated backup operations, the EPA had been using
another backup solution. While the application largely
worked well, it did have its limitations, had an end-of-life
date and did not support a future direction that included
zLinux. With the writing on the wall, the organization had to find
a suitable, if not more robust, replacement solution.
BATTER UP!
VanBenthuysen, who oversaw the FDR/UPSTREAM
implementation with David Wilson, also of IBM’s
Programming Systems Group, was already familiar with Innovation Data
Processing, and the EPA had used Innovation’s FDR and CPK
products on its mainframe DASD volumes for more than 20 years. These relationships made the
decision to go with FDR/UPSTREAM relatively simple,
with VanBenthuysen calling Innovation Data Processing
“very responsive, with great support and products.”
Prior to the choice, however, the EPA put
FDR/UPSTREAM, which is used by the EPA’s operations
staff, through its paces, testing it for approximately three
months. Although VanBenthuysen was convinced that the
Innovation solution would work for all of its varied servers—
currently represented by 126 boxes on seven different
platforms with 6TB of data to be backed up—the system
administrators (SAs) had to be persuaded. As VanBenthuysen
explains, “The SAs wanted to install FDR/UPSTREAM and
make sure that it cohabitated well with our standard configuration, say, on Linux or on Sun. So there was a great deal of
testing that went into the implementation before we could get
FDR/UPSTREAM approved to be added to those platforms.”
EPA’s comprehensive testing prompted Innovation to add
the Rescuer to FDR/UPSTREAM. This was in response to
calls from the EPA SAs for what they call “bare-metal
restores” to the Sun Solaris environment. What the Rescuer
essentially does is allow for full file-system recovery. The
SAs simply boot the server using the single-user or emergency boot facilities
provided by the UNIX or Linux vendor, create
a RAM disk environment and then run a disaster-recovery
(DR) script. FDR/UPSTREAM then checks its mainframe-based database and displays all the file systems that were
backed up. The SAs then choose the file system to restore to
the server. “This really makes life a lot easier for our system
administrators,” VanBenthuysen says. “They don’t have to
reinstall everything all over again.”
Before taking the leap, however, the EPA sent several key
data-management employees to Innovation’s headquarters
in Little Falls, N.J., for training. This product-familiarity
session lasted three days and was held in a classroom
setting, where the participants were introduced
to every facet of the product, including installation,
configuration, problem determination and operation.
As one might expect, the agency needs a near-bulletproof
IT infrastructure to keep up with all of these activities. To a
large extent, it does indeed have just that, even though it’s
using a host of servers representing many different platforms,
including an IBM S/390 with 3.3TB of DASD running z/OS
1.4 that hosts “the bulk of the EPA’s critical applications,
including finance, payroll, etc.,” according to Dan
VanBenthuysen, an enterprise server system manager
contracted to the EPA from IBM’s Systems Programming
Group. “In addition, we’re also working to develop
mainframe Linux as a viable platform.”
The mainframe and a multitude of other servers, including
Windows, UNIX and Linux systems, are located and
managed at the EPA’s National Computer Center. Its backup
silo, which employs five StorageTek robotic tape libraries, is
situated at an offsite location. While many of the agency’s
servers are backed up locally, the agency doesn’t, as
VanBenthuysen explains, “consider a backup a backup until
it’s in a different building.”
Sound advice, indeed, especially when one takes into
account the vital nature of the organization’s data, which
includes not only operational data, such as financials and human
resources, but also, for example, complex atmospheric models.
In order to keep pace with its backup needs of more
than 6TB of data and growing, the EPA runs both full and
incremental backups. The former are typically run over the
weekend, when user activity is low, and the latter during
the week, typically at night, but also during the day if the
backups aren’t very CPU intensive. (“We’re generally backing
only one first full. When you do your next full, it merges all
the incrementals into the full and creates a new full without
moving all the data again. It’s very efficient, because we’re
not consuming excess bandwidth or CPU.”
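The full-merge idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how FDR/UPSTREAM is actually implemented: the function name `merge_full`, the dictionary-based catalog and the use of `None` to mark a deleted file are all hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical model: a backup maps a file path to the stored version id.
# In an incremental, a value of None marks a file deleted on the client.
Backup = Dict[str, Optional[str]]


def merge_full(baseline: Backup, incrementals: List[Backup]) -> Backup:
    """Build a new full from a baseline full plus incrementals (oldest first)."""
    merged = dict(baseline)
    for inc in incrementals:
        for path, version in inc.items():
            if version is None:
                merged.pop(path, None)  # drop files that no longer exist
            else:
                merged[path] = version  # newer version replaces the old one
    return merged


baseline = {"/etc/hosts": "v1", "/data/model.dat": "v1", "/tmp/old.log": "v1"}
monday = {"/data/model.dat": "v2"}
tuesday = {"/tmp/old.log": None, "/home/new.txt": "v1"}

new_full = merge_full(baseline, [monday, tuesday])
print(new_full)
```

Only the changed entries ever cross the network in this model; the new full is assembled on the server side from data it already holds.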
To speed up backups, FDR/UPSTREAM
offers five levels of compression that allow the
data to be compressed before it’s sent to the mainframe. The
EPA has made several recommendations on compression,
and in most cases, the data is compressed in order to increase
throughput. If, however, a user is concerned that the
compression process will negatively impact local CPU usage,
as in the case of the systems responsible for the creation of the
atmospheric models, which typically run around the clock, only
light compression is necessary. “It’s a tradeoff between CPU
consumption on the client machine during compression versus
the impact on the network if there is no compression,” says
Wilson. “In either case, it’s nice that the product offers us
the choice.”
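The tradeoff Wilson describes can be seen in a rough sketch using Python's standard `zlib` module. The zlib levels (0 through 9) are a stand-in chosen for illustration and are not the five levels FDR/UPSTREAM offers; the sample payload and the helper name `measure` are likewise assumptions.

```python
import time
import zlib

# Sample payload: repetitive text compresses well; real data will differ.
payload = b"atmospheric model output row 0001,0.123,0.456\n" * 20000


def measure(level: int) -> tuple:
    """Return (compressed size in bytes, CPU seconds used) for one zlib level."""
    start = time.process_time()
    compressed = zlib.compress(payload, level)
    return len(compressed), time.process_time() - start


# Level 0 stores the data uncompressed; higher levels spend more CPU.
for level in (0, 1, 6, 9):
    size, cpu = measure(level)
    print(f"level {level}: {size} bytes, {cpu:.4f}s CPU")
```

On a client that is already CPU-bound, a low level keeps most of the network savings while spending the least CPU time on compression.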
AND HERE’S THE WINDUP—THE PITCH—
AND A FASTBALL STRIKE!
The deployment of FDR/UPSTREAM has been so
successful that other operating units within the EPA are
subscribing to the backup service, with billing based on
a per-GB transmission rate. Other backup products, including
another still in operation at the EPA, hadn’t allowed for this
or would have been too expensive for the units to afford.
“Previously, this was pretty much a free service the data
center offered,” VanBenthuysen remarks. “But we were
encouraged to implement backup services as a chargeable
benefit, and it’s now a money-making operation.” As of
October 2003, 126 servers (including those used by the EPA’s
atmospheric modeling group, which, with files of
1/2TB in size, is the largest EPA backup customer) were part
of this new paradigm, and “getting more business is one of
the data center’s priorities,” VanBenthuysen adds.
As implemented at the EPA, the FDR/UPSTREAM
solution consists of server software that resides on the
mainframe—or “the catcher,” as VanBenthuysen calls it—
and OS-specific client software that resides on the
servers—or “pitchers”—that are to be backed up. Most of the
configuration takes place on the mainframe side, including
the construction of the server profiles (rule-based policies
that tell UPSTREAM which functions each server is allowed
to perform). On the client side, the SAs simply have to either
download the software from the EPA mainframe Web server
or install it via a CD, pick a profile that defines the rules
for the data, determine a schedule (daily incrementals and
weekly full merges, for example) and then decide upon a start
time. The mainframe will then initiate the automated
backups. As VanBenthuysen succinctly explains, “The
regularly scheduled backups are started by the catcher, and
then the pitcher begins sending the data.”
VanBenthuysen has found full merges, which allow users
to conduct only one actual full-system backup per system, to
be particularly helpful. To use the full-merge backup, a
first-time baseline full backup of the file server is performed,
with all the specified files being backed up and transmitted to
the mainframe. When subsequent full-merge backups are conducted,
FDR/UPSTREAM sends a picture of the file systems to be
backed up to the UPSTREAM MVS database, which
then uses previously existing backups plus changed files to
construct the full-merge backups. Because relatively few files have to be read or transmitted, the benefits of this function are
greatly reduced network traffic, less complexity and
dramatic time savings. As VanBenthuysen explains, “There’s
A HOME RUN!
Another compelling feature built into FDR/UPSTREAM
is its ability to automatically restart interrupted transmissions.
Prior to using UPSTREAM, operations or data-management
staff would have to manually intervene if a backup was
interrupted—and the backup would have to be started from
the beginning.
“It was enough to make you cry if you were near the end
of a 2TB backup and it failed,” remarks VanBenthuysen.
“You’re not going to meet your backup window, and if you
had been using a homegrown notification system to send reports to
SAs so they could go over the previous night’s backup activity.
It has now switched to the notification built into
FDR/UPSTREAM, with VanBenthuysen adding that the
organization is “always in favor of eliminating the
homegrown aspects of our backup procedures.”
The EPA is also running proofs of concept on the
FDR/UPSTREAM database agents. These are essentially
APIs that interface with application databases, including, for
example, Lotus Notes and Oracle databases. In the case of the
Oracle API, it interfaces with the Oracle Recovery Manager
(RMAN), allowing database administrators (DBAs) to
continue to use the recovery manager while FDR/UPSTREAM
acts as the data mover. “It’s seamless, and the DBAs don’t have
to be concerned with the underlying technologies. They just
keep doing what they’re used to doing,” VanBenthuysen notes.
have to start again from the top, you’re surely going to run on
into Monday or Tuesday, which interferes with the network
during the week.” With FDR/UPSTREAM, interrupted
backups and restores are automatically re-launched, with the
transmission beginning where it had left off—no tears needed.
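The restart behavior can be pictured with a small simulation. This sketch assumes the receiver's byte count serves as the checkpoint; the function `transfer`, its parameters and the simulated failure are hypothetical and say nothing about the protocol FDR/UPSTREAM actually uses.

```python
def transfer(data: bytes, received: bytearray, chunk_size: int = 4,
             fail_after_chunks: int = -1) -> None:
    """Send `data`, resuming from however many bytes `received` already holds."""
    offset = len(received)  # checkpoint: bytes the receiver already has
    sent_chunks = 0
    while offset < len(data):
        if sent_chunks == fail_after_chunks:
            # Simulated network drop (never triggers with the default of -1).
            raise ConnectionError(f"link dropped at byte {offset}")
        chunk = data[offset:offset + chunk_size]
        received.extend(chunk)
        offset += len(chunk)
        sent_chunks += 1


data = b"0123456789ABCDEF"
received = bytearray()

try:
    transfer(data, received, fail_after_chunks=2)  # drops after two chunks
except ConnectionError:
    pass

resume_point = len(received)  # where the restarted transfer will begin
transfer(data, received)      # restart continues from the checkpoint
print(resume_point, bytes(received) == data)
```

The second call never resends the bytes that arrived before the failure, which is the property that keeps a late failure from costing the whole backup window.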
FDR/UPSTREAM also supports Linux (SuSE and
Red Hat) on the mainframe. Because the EPA is vigorously
headed in that direction, supporting Linux-based Oracle
databases and, perhaps in the future, Lotus Notes-based
e-mail on its S/390, it needed to ensure that it could easily
back up that data. Fortunately, FDR/UPSTREAM was the
first distributed-backup product to offer that type of seamless
Linux/mainframe functionality. “We just wouldn’t go into
Linux on the mainframe production without a valid backup
for that,” VanBenthuysen says. “But with FDR/UPSTREAM,
we already have the tool in place.”
The agency is also beginning to use an UPSTREAM
migration tool to merge backups on disk and then migrate
them to tape. This data consolidation allows for much more
efficient use of tape, because the organization has more
servers than it has tape drives to facilitate backups. As Wilson
explains, “You can put everything on disk and then have
FDR/UPSTREAM move it to tape as a backend process. So
what you’re able to do is stack multiple servers onto the same
tape, and UPSTREAM will then update the catalog to point
to where that data is on tape. Once this is done and you need
to do a file restore, you can do it directly from tape without
having to first dump it back onto disk.”
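The stacking process Wilson outlines amounts to moving disk-resident backups onto a shared tape and repointing the catalog. The following is a minimal sketch assuming a simple in-memory catalog; the `Location` class, the `stack_to_tape` function and the volume label `TAPE01` are invented for illustration and do not reflect UPSTREAM's real catalog format.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Location:
    medium: str    # "disk" or a tape volume label
    position: int  # file sequence number on that medium (0 for disk)


def stack_to_tape(catalog: Dict[str, Location], volume: str) -> List[str]:
    """Move every disk-resident backup onto one tape and update the catalog."""
    # Continue numbering after whatever is already stacked on this volume.
    next_pos = 1 + max(
        (loc.position for loc in catalog.values() if loc.medium == volume),
        default=0,
    )
    moved = []
    for name in sorted(catalog):
        if catalog[name].medium == "disk":
            catalog[name] = Location(volume, next_pos)  # repoint the entry
            moved.append(name)
            next_pos += 1
    return moved


catalog = {
    "serverA.full": Location("disk", 0),
    "serverB.full": Location("disk", 0),
    "serverC.full": Location("TAPE01", 1),
}
moved = stack_to_tape(catalog, "TAPE01")
print(moved, catalog["serverA.full"])
```

Because the catalog records the tape and position for each backup, a restore can look up the entry and read straight from tape.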
Also related to tape storage is FDR/UPSTREAM’s vaulting capability, which allows users to make copies of their
tapes for disaster-recovery purposes, such as DR testing.
Because copies of the tapes are being used during DR drills,
operational backups can continue unhindered. “Because we
want to be able to conduct drills in parallel with ongoing
operations, vaulting works very well,” Wilson says. “We can
take our vault tapes to our DR site without disturbing our
daily operations.”
FDR/UPSTREAM also offers an automated
notification system that sends e-mails to operations personnel.
These notifications might contain, for example, immediate
information on failures or comprehensive job logs. The EPA
A WINNING TEAM
This type of simplicity is one of FDR/UPSTREAM’s
many key differentiators. Its “Director” interface, for example,
is both Java-based and browser-enabled, allowing DBAs to
monitor backups and perform inquiries and restore requests with
a simple Web browser (although there’s also a client-based
application that can be used, as the EPA is doing). Using this
tool, SAs and others can look at job logs and perform both
pre- and post-processing analysis. “It offers them a single
interface from which they can get all of the information they
need very quickly and at a glance,” says Wilson. “So on a
day-to-day basis, they’ll look for slowdowns, check to see if any
configurations have changed on the client end and notice if
performance has degraded.”
The FDR/UPSTREAM Director has paid off in more
significant ways than simple efficiency. It has also helped
pinpoint backup bottlenecks, especially as they pertain to the
client side. For example, data managers have been known to
contact SAs to notify them if an Ethernet card has been
misconfigured. “These cards seem to slip into simplex out
of duplex, which has a lot less throughput,” notes
VanBenthuysen. “So a data manager might call an SA and
say, ‘It looks like your IP card isn’t in duplex mode.’ And sure
enough, they’re probably right.”
While the EPA could have been left dumbfounded by
the end-of-life product announcement, it instead turned the
challenge into an opportunity and real triumph. And already, the
move to Innovation’s FDR/UPSTREAM is paying off, in terms
of both time and costs, with increased efficiencies, lower
administrative overhead and improved automation making all
the difference. As VanBenthuysen puts it, “Thanks to
Innovation’s excellent software, we have a winning team.”
For more information on FDR/UPSTREAM contact
Innovation Data Processing at 973-890-7300,
e-mail [email protected] or visit us at
http://www.innovationdp.fdr.com