
Future Proof Storage with DPM
Oliver Keeble
(on behalf of the CERN IT-GT-DMS section)
EMI is partially funded by the European Commission under Grant Agreement RI-261611
DPM today
• Disk storage for the grid
• 36PB
– 10 sites with > 1PB
• Over 200 sites in 50 regions
• Over 300 VOs have access to DPMs
Future Proof Storage with DPM, EGI TF Prague
EMI INFSO-RI-261611
DPM today
• Current production version is 1.8.3
–https://svnweb.cern.ch/trac/lcgdm/blog/official-release-lcgdm-183
–In EPEL, EMI1 and EMI2
–No gLite – see tutorial on Wed on how to upgrade
• What’s new
–EPEL compliance
–HTTP/DAV frontend (old dpm httpd is gone)
–NFSv4.1 frontend (read only)
–Thread safe clients
–Synchronous get requests
–…
1.8.4
• Next core release: 1.8.4
–First DMLITE release (see later…)
• DAV frontend will start using the new libraries
• Other components not using dmlite for now
–Improved replication mechanism
–32bit client support
–Lots of other small fixes
–Feature complete… finishing certification
• Change in the release process
–Independent component releases
–Faster, lighter releases
Refactoring & DMLITE
• DMLITE is the result of a significant refactoring effort to make DPM modular
• Better separation between frontends and backend
• Improved integration with standard building blocks
–Hadoop, Memcache, S3, Lustre, …
–Cleaner, more open, much improved performance
Improved Frontends
• Standard protocols, standard clients
–HTTP/DAV
• + extras, WAN transfers, 3rd party copy…
–NFS 4.1/pNFS (r/o)
–Xrootd (rewritten)
• Forthcoming
–GridFTP
• Legacy interfaces remain untouched
–No large scale revalidation required
–Ubiquitous access to grid storage
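To illustrate "standard protocols, standard clients": a minimal sketch of a partial read over HTTP using only the Python standard library. The host name and file path are hypothetical, the helper `build_range_request` is invented for this example, and it assumes the frontend honours standard HTTP byte-range requests. Authentication, which a grid endpoint would normally require, is left out. The request is built but not sent.

```python
import urllib.request


def build_range_request(url, offset, length):
    """Build (but do not send) a GET request for `length` bytes starting at `offset`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    last_byte = offset + length - 1  # HTTP byte ranges are inclusive
    return urllib.request.Request(
        url,
        headers={"Range": f"bytes={offset}-{last_byte}"},
        method="GET",
    )


# Hypothetical endpoint and path, for illustration only.
EXAMPLE_URL = "https://dpm.example.org/dpm/example.org/home/myvo/data.bin"

request = build_range_request(EXAMPLE_URL, offset=0, length=16 * 1024)
print(request.get_method(), request.full_url)
print("Range:", request.get_header("Range"))

# Sending it would be: urllib.request.urlopen(request).read()
```

Nothing here is specific to DPM, which is the point of the slide: any client that speaks plain HTTP can issue the same request.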
Improved Backends
• This is where DMLITE shines
–Plugin based, open for constant evolution
• Improved nameserver performance
–Connection pooling, improved SQL, memcache layer, …
• Support for multiple pool types
–Legacy DPM, Hadoop/HDFS, S3, …
–Sharing a single namespace if desired
–Possibility for opportunistic pools
• Federation
–See the following presentation
• And this is the beginning, much more coming
–Python bindings, Lustre, VFS, …
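A toy sketch of the "plugin based, multiple pool types, single namespace" idea. This is not the dmlite API: every class and method name below is hypothetical, and the pools are in-memory stand-ins for real backends.

```python
from abc import ABC, abstractmethod


class PoolDriver(ABC):
    """Hypothetical interface that each pool type would implement."""

    @abstractmethod
    def write(self, replica_id, data):
        """Store `data` under `replica_id`."""

    @abstractmethod
    def read(self, replica_id):
        """Return the bytes stored under `replica_id`."""


class InMemoryPool(PoolDriver):
    """Toy stand-in for a real backend (legacy DPM disk, HDFS, S3, ...)."""

    def __init__(self):
        self._blobs = {}

    def write(self, replica_id, data):
        self._blobs[replica_id] = data

    def read(self, replica_id):
        return self._blobs[replica_id]


class Namespace:
    """One logical namespace mapping paths to (pool name, replica id)."""

    def __init__(self):
        self._pools = {}
        self._entries = {}

    def register_pool(self, name, driver):
        self._pools[name] = driver

    def put(self, path, data, pool):
        replica_id = f"{pool}:{len(self._entries)}"
        self._pools[pool].write(replica_id, data)
        self._entries[path] = (pool, replica_id)

    def get(self, path):
        pool, replica_id = self._entries[path]
        return self._pools[pool].read(replica_id)

    def pool_of(self, path):
        return self._entries[path][0]


ns = Namespace()
ns.register_pool("legacy", InMemoryPool())
ns.register_pool("s3", InMemoryPool())
ns.put("/home/myvo/a.dat", b"alpha", pool="legacy")
ns.put("/home/myvo/b.dat", b"beta", pool="s3")
print(ns.pool_of("/home/myvo/b.dat"), ns.get("/home/myvo/b.dat"))
```

The caller only sees paths; which pool type holds the data is a detail of the registered driver, so a new backend can be added without touching the namespace code.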
I/O performance
Protocol   N. Reads   Read Size    Read Time
HTTP       500        22,773,112   0.43
HTTP       1000       46,027,143   0.87
XROOT      500        22,773,112   0.39
XROOT      1000       46,027,143   0.78
RFIO       500        22,773,112   136.35
RFIO       1000       46,027,143   274.89
LAN / Chunk Size: 10k-20k / File Size: 2G
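A small worked calculation on the read times listed above, to make the gap concrete. The values are copied from the slide. The ratio is unitless, so no assumption about the time unit is needed. The function name `ratio_to_rfio` is just for this example.

```python
# Read times copied from the slide: (protocol, number of reads) -> read time.
READ_TIMES = {
    ("HTTP", 500): 0.43,
    ("HTTP", 1000): 0.87,
    ("XROOT", 500): 0.39,
    ("XROOT", 1000): 0.78,
    ("RFIO", 500): 136.35,
    ("RFIO", 1000): 274.89,
}


def ratio_to_rfio(protocol, n_reads):
    """How many times longer RFIO took than `protocol` for the same number of reads."""
    return READ_TIMES[("RFIO", n_reads)] / READ_TIMES[(protocol, n_reads)]


for protocol in ("HTTP", "XROOT"):
    for n_reads in (500, 1000):
        ratio = ratio_to_rfio(protocol, n_reads)
        print(f"RFIO / {protocol} at {n_reads} reads: {ratio:.0f}x")
```

In all four comparisons RFIO takes a few hundred times longer than the newer frontends for the same workload.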
Performance, performance, performance
https://cdsweb.cern.ch/record/1458022?ln=en
Easy administration
• Puppet for configuration
–Popular among large data centers
–Lots of modules for popular tools (which we now rely on – apache, memcache, nagios, …)
• Nagios for monitoring
–We reuse as much as possible
–Added specific plugins for detailed status and performance monitoring
• And there’s a lot already available
Why “future proof”?
• Standards
–Leverage existing components & clients
–Mature ecosystem
–Less maintenance work
• Community
–Guidance by the stakeholders
–Independence from funding cycles
–Now in talks for the creation of a “DPM Collaboration” to drive the project post-EMI
The DPM Collaboration
• DPM is the most numerous SE on the WLCG infrastructure
• The project has never been in better shape
• We are soliciting statements of support for a collaboration
• In discussions with
–GridPP (UK)
–WLCG France
–Taipei (WLCG Tier 1 using DPM)
DPM around the world
[Bar chart: DPM instances per region, y-axis 0 to 30. Regions shown: fr, it, uk, es, nl, gr, ro, pl, ru, my, de, bg, br, rs, ie, tw, pt, kr, ua, tr, by, hr, mk, org, sk, hu, jp, eu, th]
Summary
• DPM has received a lot of investment thanks to EMI
–These developments are now being released
• This investment has gone into
–Making it modular
–Supporting standards (HTTP, NFSv4.1,…)
–Profiting from existing technology
–Performance
–Manageability
Recent Issues
• Issue with LFC API / Py26 (#84716)
– Tracked down to issue with EPEL5 Swig version
– Currently considering the best solution
• Cleaning up invalid LFC replicas (#83335)
– Solved… bulk requests using API/CLI tools
• DAV EMI2 install failure (#85141)
– Related to dependencies on gridsite, solved with the EPEL update
• EMI2 LFC dies regularly (#85161)
– Happens to any of lfc/dpm/dpns daemons
– Documented as a known issue with 1.8.3
• https://svnweb.cern.ch/trac/lcgdm/blog/official-release-lcgdm-183
• 32 bit support (#81508)
– Available in EMI2 and EPEL repositories