Managing Data from High-Performance Lustre to Deep Tape Archives
Dr. Thomas Schoenemeyer
Senior HPC Solution Architect
COMPUTE | STORE | ANALYZE
CRAY INC - PROPRIETARY
Enable scientists to solve the world’s largest
computational problems
22500 nodes
… also at smaller scale
Cray CS300 Cluster
LUSTRE
STORE
ARCHIVE
Simulated data & Processed data
Experimental data & Observational data
Observation (1)
● (Almost) all Cray systems connect to a Lustre file system
● 150+ Cray-supported Lustre deployments
● 120+ Petabytes of deployed storage
● Write and read data across hundreds of storage servers
● BlueWaters with an aggregated I/O of over one TB/s
● Organizational data grows by 30% each year
[Chart: Lustre Capacity [Petabyte], axis from 0 to 100; labels: Plan 2015, Trinity, ORNL, BlueWaters, ECMWF]
Observation (2)
● Huge user base
● 70% of Top100 sites use Lustre
● Community-driven
● Solutions available from many vendors
● Many companies supply code changes
● Lustre is Open Source (GPLv2)
● OpenSFS is sponsoring Lustre development*
● Main development done by Intel’s High Performance Data Division (HPDD)
● Scalable, failover capabilities, open bug tracking
* 4 promoters
How to connect to Deep Tape Archives?
● Until Lustre 2.4
● Difficult; no automated method for moving data across storage tiers was included
● Purging files or moving them to other storage tiers was the admin's or users' responsibility
● HSM functionality was introduced with Lustre 2.5
● GA end of October 2013
● Cray Tiered Adaptive Storage (TAS)
● Announced in November 2013
● Connection to Cray HSM product
● Cray TAS Connector June 2014
Lustre 2.5 HSM functionality (RBH)
● Admin-defined rules
● Built-in policies
● Purge
● Directory removal
● Deferred removal
● Archiving and Releasing
● Policy definition
● Attribute-based
● Using fileclass definitions

fileclass BigLogFiles {
    definition { type == file and size > 100MB
                 and (path == /fs/logdir/*
                      or name == *.log) }
    …
}

Purge_policies {
    ignore_fileclass = my_fileclass;
    policy purge_logs {
        target_fileclass = BigLogFiles ;
        condition { last_mod > 15d }
    }
}

http://sourceforge.net/p/robinhood/news/
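The matching logic of the BigLogFiles fileclass and the purge_logs policy can be sketched in Python. This is only an illustration of the rule semantics, not Robinhood code: the `Entry` record and both function names are hypothetical, the path pattern is read as `/fs/logdir/*`, 100MB is assumed to mean 100 * 1024 * 1024 bytes, and Python's `fnmatchcase` may treat `*` differently from Robinhood's own matcher (here it also matches across `/`).

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase

MB = 1024 * 1024      # assumption: binary megabytes
DAY = 24 * 3600       # seconds per day


@dataclass
class Entry:
    """Hypothetical per-file record, as a policy engine database might hold it."""
    path: str
    is_file: bool
    size: int          # bytes
    last_mod: float    # seconds since the epoch


def in_big_log_files(e: Entry) -> bool:
    """Mirror of: type == file and size > 100MB and (path == /fs/logdir/* or name == *.log)."""
    name = e.path.rsplit("/", 1)[-1]
    return (
        e.is_file
        and e.size > 100 * MB
        and (fnmatchcase(e.path, "/fs/logdir/*") or fnmatchcase(name, "*.log"))
    )


def purge_logs_candidates(entries, now):
    """Mirror of: target_fileclass = BigLogFiles; condition { last_mod > 15d }."""
    return [
        e.path
        for e in entries
        if in_big_log_files(e) and now - e.last_mod > 15 * DAY
    ]
```

Passing `now` explicitly instead of calling the clock inside the function keeps the sketch deterministic and easy to check against sample data.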
Why do we need RobinHood Policy Engine?
● Time critical, millions of files
● Policy Engine generates file list and actions based upon
database data
● MySQL as backend, near real-time data
● Filled by reading Lustre changelogs provided by the MDS
● Only one initial scan is needed
● The Lustre coordinator communicates with the policy engine (PE) and the migration agents (MA)
● Migration agents inform the Lustre MDS and the HSM metadata manager
● On open() of a file whose data is no longer in Lustre, the coordinator initiates a restore immediately
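The idea of a database that is kept current from a change stream, so that file lists come from a query rather than a new scan, can be sketched as follows. This is a simplified model under stated assumptions: SQLite stands in for the MySQL backend, the record layout (`type`, `path`, `size`, `mtime`) is hypothetical and much simpler than real Lustre changelog records, and the function names are invented for the example.

```python
import sqlite3


def open_db():
    """Create an in-memory stand-in for the policy engine database."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE entries (path TEXT PRIMARY KEY, size INTEGER, last_mod REAL)"
    )
    return db


def apply_record(db, rec):
    """Apply one simplified change record (hypothetical format) to the database."""
    kind = rec["type"]
    if kind in ("create", "modify"):
        db.execute(
            "INSERT OR REPLACE INTO entries VALUES (?, ?, ?)",
            (rec["path"], rec["size"], rec["mtime"]),
        )
    elif kind == "unlink":
        db.execute("DELETE FROM entries WHERE path = ?", (rec["path"],))


def older_than(db, cutoff):
    """Return paths last modified before `cutoff`, without touching the file system."""
    rows = db.execute(
        "SELECT path FROM entries WHERE last_mod < ? ORDER BY path", (cutoff,)
    )
    return [r[0] for r in rows]
```

After the initial population, only `apply_record` runs, and `older_than` answers policy questions from the database alone.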
Cray TAS Connector for Lustre File System
[Diagram: Clients, MDS, Coordinator, OSS, Policy Engine, Agents, and Copy Tools, spanning Lustre Space and HSM Space]
● Policy engine
● Robinhood policy engine to manage Lustre namespace activity
● Coordinator
● Communicates with policy engine and agents to manage data movement
● Agent (HSM client) and Copy Tool
● Lustre clients with copy tool software to migrate data between Lustre and TAS
Additional features with HSM support
● Purge policies to release disk space in Lustre OSTs when
needed (file remains visible in Lustre for users)
● Garbage collection (GC) of deleted files
● Disaster recovery: to rebuild a Lustre file system from the
archive
● Purge Use case
● Robinhood monitors free space per OST
● Avoid “ENOSPC” errors caused by full OSTs
● Admin defines high/low OST usage thresholds and purge policy
rules
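The high/low threshold idea in the purge use case can be sketched as follows. This is a minimal model, not Robinhood's implementation: the `FileRec` record, the function name, and the least-recently-accessed ordering are assumptions made for the example. Only files that already have an archive copy are considered, since releasing a file's disk space presupposes that its data is in the archive.

```python
from dataclasses import dataclass


@dataclass
class FileRec:
    """Hypothetical record for a file with data on one OST."""
    path: str
    size: int            # bytes held on the OST
    last_access: float   # seconds since the epoch
    archived: bool       # True if an archive copy exists


def select_for_release(files, used, capacity, high_pct, low_pct):
    """Pick files to release once usage reaches high_pct, until it drops to low_pct."""
    if used * 100 < high_pct * capacity:
        return []                      # below the high threshold: nothing to do
    target = low_pct * capacity / 100  # usage to reach after purging
    chosen = []
    candidates = sorted(
        (f for f in files if f.archived), key=lambda f: f.last_access
    )
    for f in candidates:               # oldest access first (assumed ordering)
        if used <= target:
            break
        chosen.append(f.path)
        used -= f.size
    return chosen
```

Using two thresholds rather than one avoids triggering a purge run on every small write once the OST is near its limit.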
Complete Solution
[Diagram, XC30 as an example: Cray TAS Connector for Lustre; tiers Primary, Nearline, Deep Tape Archive; SANbox 5600 switches in the rack drawings]
User Perspective?
● Lustre mount point /data
# df -h .
Filesystem                Size  Used Avail Use% Mounted on
192.168.100.10@tcp:/data   12G  895M   11G   8% /data
● Directory listing
# ls -l
total 101376
-rw-r--r-- 1 root root 10485760 Dec  2 16:29 test.dat.0
-rw-r--r-- 1 root root 10485760 Dec  2 16:29 test.dat.1
● Display archive status before archiving
# lfs hsm_state test.dat.*
test.dat.0: (0x00000000)
test.dat.1: (0x00000000)
User Perspective - Lustre HSM Examples
● Archive files (manually initiated)
# lfs hsm_archive test.dat.*
● Display archive status after archiving
# lfs hsm_state test.dat.*
test.dat.0: (0x00000009) exists archived, archive_id:1
test.dat.1: (0x00000009) exists archived, archive_id:1
● Release files and show status
# lfs hsm_release test.dat.*
# lfs hsm_state test.dat.*
test.dat.0: (0x0000000d) released exists archived, archive_id:1
test.dat.1: (0x0000000d) released exists archived, archive_id:1
● Restore files and show status
# lfs hsm_restore test.dat.*
# lfs hsm_state test.dat.*
test.dat.0: (0x00000009) exists archived, archive_id:1
test.dat.1: (0x00000009) exists archived, archive_id:1
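The hexadecimal values in the `lfs hsm_state` output above are bit masks. A short Python sketch shows how 0x00000009 and 0x0000000d map to the flag names printed next to them. The bit values below are my reading of the Lustre user header and are consistent with the output shown on the slides, but they should be checked against the installed Lustre version; the helper names `decode` and `parse_line` are hypothetical.

```python
import re

# Assumed HSM state bits (verify against the Lustre headers in use).
HS_EXISTS = 0x01
HS_DIRTY = 0x02
HS_RELEASED = 0x04
HS_ARCHIVED = 0x08

# Same order in which the flag names appear in the slide output.
_FLAG_NAMES = [
    (HS_RELEASED, "released"),
    (HS_EXISTS, "exists"),
    (HS_DIRTY, "dirty"),
    (HS_ARCHIVED, "archived"),
]

_LINE = re.compile(r"^(.+): \(0x([0-9a-fA-F]+)\)")


def decode(flags: int) -> list:
    """Return the names of the HSM state bits set in `flags`."""
    return [name for bit, name in _FLAG_NAMES if flags & bit]


def parse_line(line: str):
    """Split one `lfs hsm_state` output line into (file name, decoded flags)."""
    m = _LINE.match(line)
    if m is None:
        raise ValueError("unrecognized hsm_state line: %r" % line)
    return m.group(1), decode(int(m.group(2), 16))
```

For example, `decode(0x0000000d)` yields released, exists and archived, which matches the state shown after `lfs hsm_release`.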
Cray Tiered Adaptive Storage
Deployment-ready open archive system for Big Data and Supercomputing
● Preserve data indefinitely
● Optimized for scale
● Data fully protected
● Upgrade with technology
● Up to 5 tiers with Lustre - flexible media choices
● Non-disruptively upgrade storage and media
● Familiar tools and commands to SAM-QFS users
● Data protection and accessibility at scale
● Expert design for maximum scalability
● Single point of support by Cray
● Simplified management and implementation
● Access data forever
Cray and Oracle - Tiered Storage Solutions
Summary
Lustre = Fast
• Transparently manage data within Lustre
• Up to 5 storage tiers
• 5 copies per file protected over multiple tiers
• Efficiently utilizing Lustre storage
• On- and off-site copy options for disaster recovery
• Open systems and formats
Primary
Nearline
Deep Tape Archive
Questions?
Dr. Thomas Schoenemeyer
Senior HPC Solution Architect
[email protected]