
Wide Area Data Replication for
Scientific Collaborations
Ann Chervenak, Robert Schuler, Carl Kesselman
USC Information Sciences Institute
Scott Koranda
Univa Corporation
Brian Moe
University of Wisconsin Milwaukee
Motivation

- Scientific application domains spend considerable effort managing large amounts of experimental and simulation data
- Have developed customized, higher-level Grid data management services
- Examples:
  - Laser Interferometer Gravitational Wave Observatory (LIGO) Lightweight Data Replicator System
  - High Energy Physics projects: EGEE system, gLite, LHC Computing Grid (LCG) middleware
  - Portal-based coordination of services (e.g., Earth System Grid)
Motivation (cont.)

- Data management functionality varies by application
- Share several requirements:
  - Publish and replicate large datasets (millions of files)
  - Register data replicas in catalogs and discover them
  - Perform metadata-based discovery of datasets
  - May require ability to validate correctness of replicas
  - In general, data updates and replica consistency services not required (i.e., read-only accesses)
- Systems provide production data management services to individual scientific domains
  - Each project spends considerable resources to design, implement & maintain its data management system
  - Typically cannot be re-used by other applications
Motivation (cont.)

- Long-term goals:
  - Generalize functionality provided by these data management systems
  - Provide suite of application-independent services
- Paper describes one higher-level data management service: the Data Replication Service (DRS)
- DRS functionality based on publication capability of the LIGO Lightweight Data Replicator (LDR) system
  - Ensures that a set of files exists on a storage site
  - Replicates files as needed, registers them in catalogs
- DRS builds on lower-level Grid services, including:
  - Globus Reliable File Transfer (RFT) service
  - Replica Location Service (RLS)
Outline

- Description of LDR data publication capability
- Generalization of this functionality
  - Define characteristics of an application-independent Data Replication Service (DRS)
- DRS Design
- DRS Implementation in GT4 environment
- Evaluation of DRS performance in a wide area Grid
- Related work
- Future work
A Data-Intensive Application Example:
The LIGO Project

- Laser Interferometer Gravitational Wave Observatory (LIGO) collaboration
- Seeks to measure gravitational waves predicted by Einstein
- Collects experimental datasets at two LIGO instrument sites in Louisiana and Washington State
- Datasets are replicated at other LIGO sites
- Scientists analyze the data and publish their results, which may be replicated
- Currently LIGO stores more than 40 million files across ten locations
The Lightweight Data Replicator

- LIGO scientists developed the Lightweight Data Replicator (LDR) System for data management
- Built on top of standard Grid data services:
  - Globus Replica Location Service
  - GridFTP data transport protocol
- LDR provides a rich set of data management functionality, including:
  - a pull-based model for replicating necessary files to a LIGO site
  - efficient data transfer among LIGO sites
  - a distributed metadata service architecture
  - an interface to local storage systems
  - a validation component that verifies that files on a storage system are correctly registered in a local RLS catalog
LIGO Data Publication and Replication

Two types of data publishing:
1. Detectors at Livingston and Hanford produce data sets
   - Approx. a terabyte per day during LIGO experimental runs (rough consistency check below)
   - Each detector produces a file every 16 seconds
   - Files range in size from 1 to 100 megabytes
   - Data sets are copied to the main repository at Caltech, which stores them in a tape-based mass storage system
   - LIGO sites can acquire copies from Caltech or from one another
2. Scientists also publish new or derived data sets as they perform analysis on existing data sets
   - E.g., data filtering or calibration may create new files
   - These new files may also be replicated at LIGO sites
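As a quick, hedged consistency check (assuming exactly two detectors producing files continuously and decimal units, neither of which is stated on the slide), the quoted figures fit together:

```python
# Rough consistency check of the LIGO data rates quoted above.
# Assumptions (not stated on the slide): exactly two detectors producing
# files continuously, and decimal units (1 TB = 10**12 bytes).
SECONDS_PER_DAY = 24 * 60 * 60

files_per_detector_per_day = SECONDS_PER_DAY / 16      # one file every 16 seconds
total_files_per_day = 2 * files_per_detector_per_day   # two detector sites

avg_file_mb = 1e12 / total_files_per_day / 1e6          # average MB/file for ~1 TB/day
print(f"~{total_files_per_day:.0f} files/day, ~{avg_file_mb:.0f} MB average")
# -> ~10800 files/day at ~93 MB each, within the 1-100 MB range above
```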
Some Terminology

- A logical file name (LFN) is a unique identifier for the contents of a file
  - Typically, a scientific collaboration defines and manages the logical namespace
  - Guarantees uniqueness of logical names within that organization
- A physical file name (PFN) is the location of a copy of the file on a storage system (illustrative example below)
  - The physical namespace is managed by the file system or storage system
- The LIGO environment currently contains:
  - More than six million unique logical files
  - More than 40 million physical files stored at ten sites
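As an illustration of the two namespaces (the logical names and URLs below are invented, not taken from LIGO's catalogs), a Local Replica Catalog can be viewed as a mapping from each LFN to the PFNs of its known copies:

```python
# Hypothetical LFN -> PFN mappings, as a Local Replica Catalog might store them.
# Logical names and URLs are invented for illustration only.
local_replica_catalog = {
    "H-R-8157760-16.gwf": [
        "gsiftp://storage.site-a.example.org/data/H-R-8157760-16.gwf",
        "file:///mnt/frames/H-R-8157760-16.gwf",
    ],
    "L-R-8157776-16.gwf": [
        "gsiftp://storage.site-b.example.org/data/L-R-8157776-16.gwf",
    ],
}

def lookup_pfns(lfn: str) -> list[str]:
    """Return every known physical location (PFN) for a logical file name."""
    return local_replica_catalog.get(lfn, [])
```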
Components at Each LDR Site

- Local storage system
- GridFTP server for file transfer
- Metadata Catalog: associations between logical file names and metadata attributes
- Replica Location Service:
  - Local Replica Catalog (LRC) stores mappings from logical names to storage locations
  - Replica Location Index (RLI) collects state summaries from LRCs
- Scheduler and transfer daemons
- Prioritized queue of requested files

[Diagram of LDR site components: scheduler daemon, transfer daemon, prioritized list of requested files, Local Replica Catalog, Replica Location Index, Metadata Catalog (backed by a MySQL database), GridFTP server, and site storage system]
LDR Data Publishing

- A scheduling daemon runs at each LDR site
  - Queries the site's metadata catalog to identify logical files with specified metadata attributes
  - Checks the RLS Local Replica Catalog to determine whether copies of those files already exist locally
  - If not, puts the logical file names on a priority-based scheduling queue
- A transfer daemon also runs at each site
  - Checks the queue and initiates data transfers in priority order
  - Queries the RLS Replica Location Index to find sites where desired files exist
  - Randomly selects a source file from among the available replicas
  - Uses the GridFTP transport protocol to transfer the file to the local site
  - Registers the newly copied file in the RLS Local Replica Catalog

(A simplified sketch of this publishing loop follows below.)
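Below is a simplified, hypothetical sketch of that loop. The metadata catalog, replica catalog, queue, and GridFTP objects are stand-ins with invented methods; the real LDR daemons use the RLS and GridFTP client APIs.

```python
# Simplified sketch of LDR's pull-based publishing loop (hypothetical interfaces).
import random

def scheduler_pass(metadata_catalog, local_replica_catalog, queue, query):
    """Queue logical files that match the metadata query but are missing locally."""
    for lfn in metadata_catalog.find(query):           # metadata-based discovery
        if not local_replica_catalog.has(lfn):         # copy already here?
            queue.put(lfn, priority=query.priority)    # schedule for transfer

def transfer_pass(queue, replica_index, local_replica_catalog, gridftp, local_url_for):
    """Drain the queue in priority order, pulling files from remote replicas."""
    while not queue.empty():
        lfn = queue.get()                              # highest priority first
        sources = replica_index.locate(lfn)            # sites holding a replica (RLI)
        if not sources:
            continue                                   # retry on a later pass
        source_url = random.choice(sources)            # random source selection
        dest_url = local_url_for(lfn)
        gridftp.transfer(source_url, dest_url)         # GridFTP copy to local site
        local_replica_catalog.add(lfn, dest_url)       # register the new replica
```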
Generalizing
the LDR Publication Scheme

- Want to provide a similar capability that is:
  - Independent of LIGO infrastructure
  - Useful for a variety of application domains
- Capabilities include:
  - Interface to specify which files are required at the local site
  - Use of Globus RLS to discover whether replicas exist locally and where they exist in the Grid
  - Use of a selection algorithm to choose among available replicas
  - Use of Globus Reliable File Transfer service and GridFTP data transport protocol to copy data to the local site
  - Use of Globus RLS to register new replicas
Relationship to
Other Globus Services

At the requesting site, deploy:
- WS-RF services:
  - Data Replication Service
  - Delegation Service
  - Reliable File Transfer Service
- Pre-WS-RF components:
  - Replica Location Service (Local Replica Catalog, Replica Location Index)
  - GridFTP server

[Diagram of the local site: a Web service container hosting the Data Replication Service (Replicator resource), Delegation Service (delegated credential), and Reliable File Transfer Service (RFT resource), with the Local Replica Catalog, Replica Location Index, and GridFTP server alongside]
DRS Functionality

- Initiate a DRS request
- Create a delegated credential
- Create a Replicator resource
- Monitor the Replicator resource
- Discover replicas of desired files in RLS, select among replicas
- Transfer data to the local site with the Reliable File Transfer Service
- Register new replicas in RLS catalogs
- Allow client inspection of DRS results
- Destroy the Replicator resource

DRS is implemented in Globus Toolkit Version 4 and complies with the Web Services Resource Framework (WS-RF). (A client-side sketch of this request lifecycle follows below.)
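The sketch below walks through that lifecycle from the client's point of view. The delegation_service, drs_service, and replicator objects are hypothetical stand-ins, as is the "Result" resource property; the real client drives WS-RF SOAP operations on the GT4 services.

```python
# Client-side view of a DRS request (hypothetical Python stand-ins for the
# GT4 Delegation, DRS, and notification interfaces).
import time

def run_drs_request(delegation_service, drs_service, request, poll_interval_s=30):
    # 1-2. Delegate a credential, then create the Replicator resource with it
    credential_epr = delegation_service.delegate(lifetime_s=3600)
    replicator = drs_service.create_replicator(
        request, credential_epr, termination_s=7200)

    # 3. Monitor: subscribe to the "Status" and "Stage" resource properties
    replicator.subscribe(["Status", "Stage"], callback=print)

    # 4-7. The service discovers replicas in RLS, transfers them via RFT, and
    #      registers the new copies; the client simply waits for completion.
    while replicator.get_rp("Status") != "Finished":
        time.sleep(poll_interval_s)

    # 8-9. Inspect per-file results, then destroy the resource
    results = replicator.get_rp("Result")   # "Result" is an illustrative RP name
    replicator.destroy()
    return results
```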
WSRF in a Nutshell

- State management: a Web service manages state in separate Resources, each described by Resource Properties (RPs)
- State identification: a Resource is addressed by an Endpoint Reference (EPR)
- State interfaces: GetRP, GetMultipleRPs, QueryRPs, SetRP
- Lifetime interfaces: SetTerminationTime, ImmediateDestruction (Destroy)
- Notification interfaces: Subscribe, Notify
- ServiceGroups

(A minimal illustrative sketch of this pattern follows below.)
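To make the pattern concrete, here is a minimal in-memory sketch of the WS-RF idea: a service that manages EPR-addressed resources carrying resource properties and a termination time. It is purely illustrative and is not the GT4 container implementation.

```python
# Minimal illustration of the WS-RF pattern: EPR-addressed stateful resources
# with resource properties (RPs), lifetime, and destruction.
import time
import uuid

class Resource:
    def __init__(self, termination_time):
        self.rps = {}                              # resource properties
        self.termination_time = termination_time   # when the resource may be reaped

class WsrfService:
    def __init__(self):
        self._resources = {}                       # EPR -> Resource

    def create_resource(self, lifetime_s):
        epr = str(uuid.uuid4())                    # stand-in for an endpoint reference
        self._resources[epr] = Resource(time.time() + lifetime_s)
        return epr

    def get_rp(self, epr, name):                   # GetResourceProperty
        return self._resources[epr].rps.get(name)

    def set_rp(self, epr, name, value):            # SetResourceProperties
        self._resources[epr].rps[name] = value

    def set_termination_time(self, epr, when):     # SetTerminationTime
        self._resources[epr].termination_time = when

    def destroy(self, epr):                        # ImmediateDestruction
        self._resources.pop(epr, None)
```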
Create Delegated Credential

- Client initializes a user proxy certificate
- Client creates a delegated credential resource in the Delegation Service and sets its termination time
- The credential EPR is returned to the client

(This and the following step slides are illustrated against the site architecture shown in the diagram above.)
Create Replicator Resource

- Client creates a Replicator resource in the Data Replication Service, passing the delegated credential EPR and setting a termination time
- The Replicator EPR is returned to the client
- The Replicator accesses the delegated credential resource
Monitor Replicator Resource

- Client subscribes to ResourceProperty changes for the "Status" RP and the "Stage" RP
- The Replicator resource is added to the MDS Index information service
- The MDS Index periodically polls the Replicator RPs via GetRP or GetMultipleRPs
- Conditions may trigger alerts or other actions (Trigger service not pictured)

(A polling sketch follows below.)
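A hedged sketch of the polling side of this step, reusing the toy WsrfService above; the real system relies on WS-Notification subscriptions plus the MDS Index and Trigger services.

```python
# Illustrative polling of a Replicator's "Status" and "Stage" RPs (hypothetical;
# the real mechanism is WS-Notification plus MDS Index polling via GetRP).
import time

def watch_replicator(service, replicator_epr, on_change, poll_interval_s=10):
    """Report RP changes until the Replicator reports Status == "Finished"."""
    last = {}
    while last.get("Status") != "Finished":
        for rp in ("Status", "Stage"):
            value = service.get_rp(replicator_epr, rp)   # GetRP
            if value != last.get(rp):
                on_change(rp, value)                     # e.g. raise an alert
                last[rp] = value
        time.sleep(poll_interval_s)
    return last
```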
Query Replica Information

- Client is notified that the "Stage" RP value changed to "discover"
- The Replicator queries the RLS Replica Index to find catalogs that contain the desired replica information
- The Replicator queries the RLS Replica Catalog(s) to retrieve mappings from logical names to target names (URLs)

(A sketch of this two-level lookup follows below.)
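A sketch of the two-level lookup described above, with hypothetical catalog objects standing in for the RLS client API:

```python
# Two-level RLS discovery: the Replica Location Index (RLI) indicates which
# Local Replica Catalogs (LRCs) know an LFN; each LRC is then queried for the
# concrete LFN -> PFN mappings. Catalog objects are hypothetical stand-ins.
def discover_replicas(lfns, replica_index, catalogs_by_site):
    """Return {lfn: [pfn, ...]} for every requested logical file name."""
    mappings = {lfn: [] for lfn in lfns}
    for lfn in lfns:
        for site in replica_index.sites_with(lfn):       # RLI lookup
            lrc = catalogs_by_site[site]
            mappings[lfn].extend(lrc.lookup(lfn))        # LRC lookup -> PFNs
    return mappings
```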
Transfer Data

- Client is notified that the "Stage" RP value changed to "transfer"
- The Replicator creates a Transfer resource in the RFT service, passing the credential EPR and setting a termination time; the Transfer resource EPR is returned
- The Transfer resource accesses the delegated credential resource
- RFT sets up GridFTP transfers of the file(s); data is transferred between GridFTP servers at the source and destination sites
- The Replicator periodically polls the "ResultStatus" RP via GetRP; when it reaches "Done", it gets state information for each file transfer

(A sketch of this step follows below.)
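A sketch of the transfer step with hypothetical objects; in the real system the Replicator submits source/destination URL pairs to an RFT Transfer resource, which drives third-party GridFTP transfers.

```python
# Illustrative transfer step: build source/destination URL pairs, hand them to
# an RFT-like transfer service, and poll until done (hypothetical interfaces;
# "ResultStatus" is the RP named on the slide, the rest is invented).
import time

def transfer_replicas(selected_sources, local_base_url, rft, credential_epr,
                      poll_interval_s=30):
    """selected_sources: {lfn: source_pfn} chosen during the discover stage."""
    pairs = [(src, f"{local_base_url}/{lfn}")
             for lfn, src in selected_sources.items()]
    transfer = rft.create_transfer(pairs, credential=credential_epr)
    while transfer.get_rp("ResultStatus") != "Done":     # periodic GetRP polling
        time.sleep(poll_interval_s)
    return transfer.get_rp("FileTransferStates")         # illustrative RP name
```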
Register Replica Information

- Client is notified that the "Stage" RP value changed to "register"
- The Replicator registers the new file mappings in the RLS Replica Catalog
- The RLS Replica Catalog sends an update of the new replica mappings to the Replica Index

(A sketch of this step follows below.)
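A sketch of the registration step with hypothetical catalog objects; in RLS the catalog-to-index propagation actually happens through periodic soft-state updates rather than the direct call shown here.

```python
# Illustrative registration of new replicas: add LFN -> PFN mappings to the
# Local Replica Catalog, then let the Replica Location Index learn which LFNs
# this site now holds (RLS does this via periodic soft-state updates).
def register_replicas(new_mappings, local_replica_catalog, replica_index, site):
    """new_mappings: {lfn: local_pfn} for the files just transferred."""
    for lfn, pfn in new_mappings.items():
        local_replica_catalog.add(lfn, pfn)              # new local replica
    replica_index.update(site, list(new_mappings))       # summary of held LFNs
```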
Client Inspection of State

- Client is notified that the "Status" RP value changed to "Finished"
- Client inspects the Replicator state information for each replication in the request
Resource Termination

- The termination times set by the client eventually expire
- The resources (Credential, Transfer, Replicator) are then destroyed
Performance Measurements:
Wide Area Testing

- The destination for the pull-based transfers is located in Los Angeles
  - Dual-processor, 1.1 GHz Pentium III workstation with 1.5 GBytes of memory and 1 Gbit Ethernet
  - Runs a GT4 container and deploys services including RFT and DRS as well as GridFTP and RLS
- The remote site where the desired data files are stored is located at Argonne National Laboratory in Illinois
  - Dual-processor, 3 GHz Intel Xeon workstation with 2 gigabytes of memory and 1.1 terabytes of disk
  - Runs a GT4 container as well as GridFTP and RLS services
DRS Operations Measured

- Create the DRS Replicator resource
- Discover source files for replication using the local RLS Replica Location Index and remote RLS Local Replica Catalogs
- Initiate a Reliable File Transfer operation by creating an RFT resource
- Perform RFT data transfer(s)
- Register the new replicas in the RLS Local Replica Catalog
Experiment 1: Replicate
10 Files of Size 1 Gigabyte

Component of Operation        Time (milliseconds)
Create Replicator Resource    317.0
Discover Files in RLS         449.0
Create RFT Resource           808.6
Transfer Using RFT            1186796.0
Register Replicas in RLS      3720.8

- Data transfer time dominates
- Wide area data transfer rate of 67.4 Mbits/sec
Experiment 2: Replicate
1000 Files of Size 10 Megabytes

Component of Operation        Time (milliseconds)
Create Replicator Resource    1561.0
Discover Files in RLS         9.8
Create RFT Resource           1286.6
Transfer Using RFT            963456.0
Register Replicas in RLS      11278.2

- Time to create Replicator and RFT resources is larger
  - Need to store state for 1000 outstanding transfers
- Data transfer time still dominates
- Wide area data transfer rate of 85 Mbits/sec (rate calculations below)
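As a sanity check on the reported rates (assuming decimal units, 1 GB = 10^9 bytes and 1 Mbit = 10^6 bits; the slides do not state the convention, which likely accounts for the small difference in Experiment 2):

```python
# Recompute the wide-area transfer rates from the two tables above,
# assuming decimal units (1 GB = 10**9 bytes, 1 Mbit = 10**6 bits).
def rate_mbits_per_s(total_bytes, transfer_ms):
    return total_bytes * 8 / 1e6 / (transfer_ms / 1000.0)

print(rate_mbits_per_s(10 * 1e9, 1186796.0))    # Experiment 1: ~67.4 Mbit/s
print(rate_mbits_per_s(1000 * 10e6, 963456.0))  # Experiment 2: ~83 Mbit/s (85 reported)
```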
Future Work

- We will continue performance testing of DRS:
  - Increasing the size of the files being transferred
  - Increasing the number of files per DRS request
- Add and refine DRS functionality as it is used by applications
  - E.g., add a push-based replication capability
- We plan to develop a suite of general, configurable, composable, high-level data management services