
xrootd
Andrew Hanushevsky
Stanford Linear Accelerator Center
30-May-03
Goals
High Performance File-Based Access

Scalable, extensible, usable
Fault tolerance
Server failures handled in a natural way
 Servers may be dynamically added and removed

Flexible Security

Allowing use of almost any protocol
Rootd Compatibility
May 30, 2003
Achieving High Performance
Scalable request/response protocol
Multi-threaded multi-process architecture
Architecture sensitive polling
MRU scheduling
Sticky sockets
Adaptive reconfiguration
Versatile sfs layer (based on proven oofs)
Scalable Protocol I
Connection multiplexing

One connection per client/host

Multiple logically independent streams
Request redirection supported

Similar to http redirection

Supports dynamic load balancing and fail-over
Uses an intentional request header

Can better optimize request processing
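A minimal sketch of connection multiplexing, assuming a made-up fixed header (2-byte stream id, 2-byte request code, 4-byte payload length). The layout and the request codes 10 and 11 are illustrative assumptions, not the actual xrootd wire format. It shows how several logically independent streams can share one connection and be separated again by stream id.

```python
import struct

# Hypothetical header: stream id, request code, payload length (big-endian).
# This is NOT the real xrootd layout; it only illustrates the idea.
HEADER = struct.Struct(">HHI")


def pack_request(stream_id, code, payload=b""):
    """Prefix a payload with the fixed-size header."""
    return HEADER.pack(stream_id, code, len(payload)) + payload


def demultiplex(buffer):
    """Split back-to-back messages into {stream_id: [(code, payload), ...]}."""
    streams = {}
    offset = 0
    while offset < len(buffer):
        stream_id, code, length = HEADER.unpack_from(buffer, offset)
        offset += HEADER.size
        payload = buffer[offset:offset + length]
        offset += length
        streams.setdefault(stream_id, []).append((code, payload))
    return streams


# Two streams interleaved on one connection.
wire = (pack_request(1, 10, b"open /a")
        + pack_request(2, 10, b"open /b")
        + pack_request(1, 11, b"read"))
by_stream = demultiplex(wire)
```

Because every message carries its length in a fixed position, the receiver can route a message without parsing its body.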
Scalable Protocol II
Asynchronous mode allowed
Multiple processing-order-independent requests
 Optional application-directed pre-read

I/O segmenting

Able to naturally deal with very large transfers

Better use of server resources
Request deferral

Client waits for resources without using server resources
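I/O segmenting can be pictured as splitting one large request into bounded pieces, so that no single transfer monopolizes a buffer. The sketch below is an illustration only; the function name and the 4 MiB segment size are assumptions, not values taken from xrootd.

```python
def segments(offset, length, max_segment):
    """Yield (offset, length) pieces that together cover the requested range."""
    if max_segment <= 0:
        raise ValueError("max_segment must be positive")
    end = offset + length
    while offset < end:
        size = min(max_segment, end - offset)
        yield offset, size
        offset += size


# A 10,000,000-byte read served in segments of at most 4 MiB (assumed size).
plan = list(segments(0, 10_000_000, 4 * 1024 * 1024))
```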
Scalable Protocol III
Unsolicited Reverse Request Mode

Allows server to manage client for recovery

Asynchronous redirect, deferral, and messages
Protocol may be compatibly extended

Mechanism to send opaque information

Accommodate things that were “forgotten”: messaging interface, cache group, request priority, and so on
MT/MP Architecture
Normally one multi-threaded server per host

Should be able to utilize available resources

Easy to administer
Optionally, multiple servers per host

Fully utilize large machines
Architecture Sensitive Polling
All POSIX systems support poll()

Used by default

Not always an efficient I/O “interrupt” mechanism
Alternate polling mechanisms allowed

/dev/poll


Available on Solaris and patched RH Linux
Up to an order of magnitude reduction in CPU

Essential to reduce latency
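The same idea, picking the best readiness mechanism the platform offers and falling back to poll() otherwise, exists in Python's standard `selectors` module. The sketch below uses it as an analogy; it is not xrootd code, and the `client-1` label is a made-up tag.

```python
import selectors
import socket

# DefaultSelector resolves to epoll, devpoll, kqueue, poll, or select,
# whichever is the most efficient mechanism available on this platform.
selector = selectors.DefaultSelector()

reader, writer = socket.socketpair()
reader.setblocking(False)
selector.register(reader, selectors.EVENT_READ, data="client-1")

writer.sendall(b"ping")

received = []
for key, events in selector.select(timeout=1.0):
    if events & selectors.EVENT_READ:
        received.append((key.data, key.fileobj.recv(16)))

selector.unregister(reader)
selector.close()
reader.close()
writer.close()
```

The calling code stays the same whichever mechanism is chosen, which is what allows the polling layer to be swapped per architecture.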
MRU Scheduling
Connections processed in most recently used order
Gives priority to active connections
 Reduces polling overhead
 Essentially a fair scheduling algorithm

Starvation cannot occur
 Longer running tasks tend to get started first


Assuming all other things being equal
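One way to picture most-recently-used ordering is a structure that moves a connection to the front each time it is serviced. This is a sketch under that assumption only; the class and method names are hypothetical and the slides do not describe the actual scheduler data structure.

```python
from collections import OrderedDict


class MRUOrder:
    """Tracks connections so the most recently used one is examined first."""

    def __init__(self):
        self._order = OrderedDict()  # last entry = most recently used

    def add(self, conn):
        self._order[conn] = True
        self._order.move_to_end(conn)

    def touch(self, conn):
        # Called whenever a connection has just been serviced.
        self._order.move_to_end(conn)

    def scan(self):
        # Most recently used first, least recently used last.
        return list(reversed(self._order))


mru = MRUOrder()
for name in ("a", "b", "c"):
    mru.add(name)
mru.touch("a")          # "a" was just active
order = mru.scan()
```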
Sticky Sockets
Connection temporarily binds to a thread
Avoids polling and scheduling overhead
 Significantly reduces latency

Connection automatically unbinds
Client is not sufficiently active
Number of other requests approaches available threads

Adaptive Reconfiguration
Server dynamically adjusts configuration

Number of threads


Kept proportionate to number of active requests
Pre-allocated buffers

Sizes track actual usage profile


Pre-allocated objects


Recomputed periodically
Number tracks recent needs
High latency connections rescheduled
Versatile sfs Layer I
Integrates multiple performance features
Dynamic load balancing: client redirected to “best” server of the moment
File descriptor partitioning: reduces socket polling overhead
File system interface reuse: same file opened in same mode shared by multiple clients, which prevents open file proliferation and attendant overhead
File system interface timeout: reduces overhead caused by idle opened files
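File system interface reuse can be sketched as a table keyed by (path, mode) with a reference count, so that a second client opening the same file in the same mode gets the existing handle. The class below is a hypothetical illustration, not the sfs layer's implementation.

```python
import os
import tempfile


class SharedFileTable:
    """Hands out one OS-level open per (path, mode), shared by reference count."""

    def __init__(self):
        self._open = {}  # (path, mode) -> [file object, reference count]

    def acquire(self, path, mode="rb"):
        key = (os.path.abspath(path), mode)
        entry = self._open.get(key)
        if entry is None:
            entry = [open(path, mode), 0]
            self._open[key] = entry
        entry[1] += 1
        return entry[0]

    def release(self, path, mode="rb"):
        key = (os.path.abspath(path), mode)
        entry = self._open[key]
        entry[1] -= 1
        if entry[1] == 0:          # last user gone: really close the file
            entry[0].close()
            del self._open[key]

    def open_count(self):
        return len(self._open)


with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"data")
    demo_path = tmp.name

table = SharedFileTable()
first = table.acquire(demo_path)
second = table.acquire(demo_path)      # same file, same mode: shared
same_handle = first is second
count_while_shared = table.open_count()
table.release(demo_path)
table.release(demo_path)
count_after_release = table.open_count()
os.remove(demo_path)
```

A shared handle also shares its file position, so a real server would read with explicit offsets rather than rely on the current position.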
Dynamic Load Balancing
[Figure: Dynamic Selection]
DLB Implementation
[Figure: DLB message flow between a client, xrootd servers, and dlbd daemons (any number of each). Labels in the figure: subscribe, open, who has the file?, I do, wait, try host:port, open again.]
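From the client's side, the flow in the figure (open, wait, try host:port, open again) amounts to a loop that follows redirect and wait replies. The sketch below simulates that loop; the reply tuples, host names, and `max_hops` limit are assumptions made for illustration and do not reflect the real protocol encoding.

```python
import time


def open_with_redirect(path, first_host, servers, max_hops=8, sleep=time.sleep):
    """Follow 'redirect' and 'wait' replies until some server accepts the open.

    `servers` maps a host name to a callable(path) returning one of
    ("ok", None), ("redirect", host), or ("wait", seconds).
    """
    host = first_host
    for _ in range(max_hops):
        kind, value = servers[host](path)
        if kind == "ok":
            return host
        if kind == "redirect":
            host = value           # "try host:port"
        elif kind == "wait":
            sleep(value)           # client waits, then asks the same host again
        else:
            raise ValueError(f"unexpected reply {kind!r}")
    raise RuntimeError("too many redirects or waits")


# Simulated servers (hypothetical names): the first host answers "wait",
# then redirects to a data server that accepts the open.
replies = iter([("wait", 0), ("redirect", "data2:1094")])
servers = {
    "redirector:1094": lambda path: next(replies),
    "data2:1094": lambda path: ("ok", None),
}
chosen = open_with_redirect("/databases/mydbfile", "redirector:1094", servers)
```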
Versatile sfs Layer II
Dynamic disk cache integration
Allows unlimited file system size
 Provides superior internal load balancing

Mass Storage Integration

HPSS, Castor, Enstore, etc
RFIO Integration
Scalable authorization

From file sub-trees to single files
Cache File System
[Figure: cache file system layout]
Index Area: /databases/mydbfile is a symlink into a data area
Data Area: /cache1/databases:mydbfile
Multiple independent filesystems (/cache1, /cache2, /cache3): any number, any size
Data area chosen based on free space in LRU order
Optional data cache; default data area
Naming convention allows for audit and index recovery
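The naming convention in the figure (/databases/mydbfile stored as /cache1/databases:mydbfile) can be sketched as a reversible mapping plus a symlink from the index area. Everything below is an assumption built from that single example: the function names are hypothetical, and the selection policy is simplified to "most free space" rather than the LRU-ordered choice the slide mentions.

```python
import os
import tempfile


def data_name(logical_path):
    # "/databases/mydbfile" -> "databases:mydbfile"
    return logical_path.strip("/").replace("/", ":")


def logical_name(data_file):
    # Inverse mapping, usable to rebuild the index from a data area.
    # Assumes logical names never contain ":" themselves.
    return "/" + data_file.replace(":", "/")


def pick_cache(free_bytes):
    # Simplified policy for this sketch: the file system with the most free space.
    return max(free_bytes, key=free_bytes.get)


def place(index_root, logical_path, caches, free_bytes):
    """Create an empty data file in a cache and symlink it from the index area."""
    cache = pick_cache(free_bytes)
    target = os.path.join(caches[cache], data_name(logical_path))
    open(target, "wb").close()
    link = os.path.join(index_root, logical_path.lstrip("/"))
    os.makedirs(os.path.dirname(link), exist_ok=True)
    os.symlink(target, link)
    return link, target


# Demo in a temporary directory standing in for the index and data areas.
root = tempfile.mkdtemp()
index_root = os.path.join(root, "index")
caches = {name: os.path.join(root, name) for name in ("cache1", "cache2")}
os.makedirs(index_root)
for directory in caches.values():
    os.makedirs(directory)

link, target = place(index_root, "/databases/mydbfile", caches,
                     {"cache1": 500, "cache2": 100})
```

Because the data file name encodes the full logical path, an audit can walk a data area and recreate any missing index symlinks.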
Fault Tolerance I
Servers may come and go

Uses load balancing to effect recovery
New servers can be added at any time
 Servers may be brought down for maintenance
 Files can be moved around in real-time

Client simply adjusts to the new configuration

XTNetFile object handles recovery protocol
Fault Tolerance II
Whenever client loses r/o connection

Back to distinguished xrootd(s) for reselection
Whenever client loses r/w connection

Limited wait/retry loop on the same server

We will be working to improve this next year!
All handled in the XTNetFile class

Disruptions merely delay the client
Flexible Security
Negotiated Security Protocol

Allows client/server to agree on protocol

E.g., Kerberos, GSI, AFS Kerberos, etc.
Can be easily extended

Multi-protocol authentication support
Security Architecture
[Figure: security handshake between client and xrootd. Labels in the figure: login, client-specific security configuration, protocol selection, authenticate, security token, self configuration, libooseccl.so (shown twice).]
Multiple handshakes allowed during authentication phase (required by some PKI protocols)
Heterogeneous Security Support
• Servers have one or more protocol objects
• Server protocol objects created at server initialization time
• Client selects which protocol to use when security context created
• Protocol object created based on configuration returned by xrootd
• One security context object per physical xrootd connection
• Protocol objects may be shared by one or more contexts
• Each “pass” through a security context object may generate credentials to be passed to xrootd
protocols
Simple & Effective Interface
For each login that requires authentication

XrdSecCreateSecurityContext(ipaddr, config)
Returns security protocol object XrdSecClientSecurity
Based on server ipaddr and server-supplied config
XrdSecClientSecurity::getCredentials()
Returns credentials to be sent to the server
Done via authenticate request and possible authmore response
Based on well tested and documented oofs security
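The two calls above suggest a simple client loop: create a context from the server-supplied configuration, then keep sending credentials while the server answers authmore. The Python below mirrors that shape only. `ToyProtocol`, the space-separated config string, and the reply tuples are assumptions for illustration; they are not the XrdSec C++ API.

```python
class ToyProtocol:
    """Hypothetical multi-step protocol standing in for Kerberos, GSI, etc."""

    def __init__(self, steps):
        self._steps = list(steps)

    def get_credentials(self, server_reply=None):
        # Each pass may produce another token until the steps run out.
        return self._steps.pop(0) if self._steps else None


def create_security_context(ipaddr, config, available):
    """Pick the first protocol in the server-supplied config the client supports."""
    for name in config.split():
        if name in available:
            return available[name]()
    raise LookupError(f"no common protocol with {ipaddr}")


def authenticate(context, send):
    """Send credentials until the server stops answering 'authmore'."""
    reply = None
    while True:
        credentials = context.get_credentials(reply)
        if credentials is None:
            raise PermissionError("protocol ran out of credentials")
        status, reply = send(credentials)
        if status != "authmore":
            return status


# Demo: the server lists two protocols; the client only supports "toy".
available = {"toy": lambda: ToyProtocol([b"hello", b"proof"])}
context = create_security_context("192.0.2.10", "krb5 toy", available)

sent = []


def fake_server(credentials):
    sent.append(credentials)
    return ("authmore", b"challenge") if len(sent) == 1 else ("ok", None)


result = authenticate(context, fake_server)
```

The caller never needs to know which protocol was negotiated, which is what keeps the interface small.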
Optional Scalable Authorization
libooseccl.so
libooacc.so
Authentication
Authorization
u abh rw /slac/rootfiles/usr/abh
r /cern/rootfiles
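Reading the example above as a single record of the form `<idtype> <id> <privs> <path> [<privs> <path> ...]` (an assumption based only on this slide), a capability check could look like the sketch below. The longest-prefix rule is also an assumption, chosen to illustrate how grants can range from sub-trees to single files.

```python
def parse_rule(line):
    """Parse '<idtype> <id> <privs> <path> [<privs> <path> ...]'."""
    fields = line.split()
    if len(fields) < 4 or len(fields) % 2 != 0:
        raise ValueError(f"malformed rule: {line!r}")
    idtype, ident, rest = fields[0], fields[1], fields[2:]
    grants = {path: set(privs) for privs, path in zip(rest[0::2], rest[1::2])}
    return idtype, ident, grants


def allowed(grants, path, privilege):
    """Apply the grant whose path is the longest prefix of `path`."""
    best = None
    for prefix in grants:
        inside = path == prefix or path.startswith(prefix.rstrip("/") + "/")
        if inside and (best is None or len(prefix) > len(best)):
            best = prefix
    return best is not None and privilege in grants[best]


idtype, ident, grants = parse_rule(
    "u abh rw /slac/rootfiles/usr/abh r /cern/rootfiles")
can_write_own = allowed(grants, "/slac/rootfiles/usr/abh/run1.root", "w")
can_write_cern = allowed(grants, "/cern/rootfiles/x.root", "w")
can_read_cern = allowed(grants, "/cern/rootfiles/x.root", "r")
```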
Security Summary
Multi-protocol Authentication

Supports distributed heterogeneous environments
Scalable Authorization

Open-ended capability based model
Integrated Auditing

To keep the security hard hats happy
Well defined, proven interfaces

Trivially replaceable for a plug & play architecture
rootd Compatibility
Bilateral compatibility
XTNetFile reverts to TNetFile for rootd servers
xrootd reverts to rootd protocol for TNetFile

Allows for transparent introduction
Can run mixed mode
 Binary is multi-environment compatible

Compatibility Modes
[Figure: compatibility modes]
Client-Side Compatibility: Application, XTNetFile, TNetFile, rootd, xrootd
Server-Side Compatibility: Application, TNetFile, xrootd with rootd compatibility
xrootd Architecture
Protocol Manager
Protocol Layer
Filesystem Logical Layer
Filesystem Physical Layer
Filesystem Implementation
xrootd Internals
Dynamically loaded
(can also be static)
Conclusion
xrootd provides high performance file access

Improves over afs, ams, nfs, etc.

Unique performance, usability, scalability, security,
compatibility, and recoverability characteristics
xrootd can provide a firm server foundation for native file system implementations

E.g. alienfs, gridfs, slashgrid, etc
For now, aim is to support BaBar