FROM RESEARCH TO INDUSTRY

Lustre HSM integration
Project update
HUF 2015 | Thomas Leibovici <[email protected]>
CEA, DAM, DIF, F-91297 Arpajon, France
September 29th, 2015

SUMMARY
Principle
Architecture and components
Project status
Vendor integration
HPSS integration
Future work

BIG PICTURE
Principle: take the best of both worlds for seamless HSM integration.
[Diagram: groups of Lustre clients in front of a Lustre filesystem backed by an HSM]
Lustre: a high-performance disk cache in front of the HSM
- Parallel filesystem
- High I/O performance
- POSIX access
HSM backend: long-term data storage
- Manages large numbers of cheaper disks and tapes
- Huge storage capacity
Ideal for a center-wide Lustre filesystem.

FEATURES AND COMPONENTS
Features
- Copy data to the HSM (Archive)
- Free disk space when needed (Release)
- Bring data back on cache miss (Restore)
- Support for multiple backends
- Policy management (migration, purge, removal, ...)
- Import from an existing backend
- Undelete
Required components
- Lustre (2.5+)
- Copy tool: a backend-specific user-space daemon; a POSIX copy tool is shipped with Lustre
- Policy Engine: a user-space daemon; RobinHood Policy Engine (open source)

ARCHITECTURE (1/2)
Coordinator, agents and copy tools
[Diagram: Lustre clients, an MDS running the coordinator, OSSs, and agent clients running copy tools that speak the HSM protocols to the HSM backend]
The coordinator gathers archive requests and dispatches them to agents.
An agent is a Lustre client that runs a copy tool to transfer data between Lustre and the HSM.

ARCHITECTURE (2/2)
The Policy Engine manages archive and release policies.
[Diagram: the Policy Engine alongside the Lustre clients, the MDS/coordinator and the OSSs]
It is a user-space tool that communicates with the MDT and the coordinator, watches filesystem changes, and triggers actions such as archive, release and removal in the backend.

EXAMPLES (1/3)
Command-line tools
Sysadmins and users can manage file HSM states:

ARCHIVE
$ lfs hsm_archive /mnt/lustre/foo
$ lfs hsm_state /mnt/lustre/foo
/mnt/lustre/foo: (0x00000009) exists archived, archive_id:1

RELEASE
$ lfs hsm_release /mnt/lustre/foo
$ lfs hsm_state /mnt/lustre/foo
/mnt/lustre/foo: (0x0000000d) released exists archived, archive_id:1

AUTOMATIC RESTORE
$ md5sum /mnt/lustre/foo
ded5b0680e566aa024d47ac53e48cdac /mnt/lustre/foo
$ lfs hsm_state /mnt/lustre/foo
/mnt/lustre/foo: (0x00000009) exists archived, archive_id:1

EXAMPLES (2/3)
Example RobinHood policy: migration
Migrate files older than 12 hours, with a different behavior for small ones.

Filesets {
    FileClass small_files {
        definition { tree == "/mnt/lustre/project" and size < 1MB }
        ...
    }
}
Migration_Policies {
    ignore { size == 0 or xattr.user.no_copy == 1 }
    ignore { tree == "/mnt/lustre/logs" and name == "*.log" }
    policy migrate_small {
        target_fileclass = small_files;
        condition { last_mod > 6h or last_archive > 1d }
        migration_hints = "cos=12";
    }
    ...
    policy default {
        condition { last_mod > 12h }
        migration_hints = "cos=3";
    }
}
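The slides do not show how such a policy configuration is actually run. As a hedged sketch only, assuming the RobinHood 2.5 "lhsm" flavor (the binary name rbh-lhsm and a configuration file under /etc/robinhood.d/lhsm/ are assumptions; check your installation and rbh-lhsm --help for the exact option names), a one-shot run that updates the RobinHood database from Lustre changelogs and then applies the migration policies could look like:

# Sketch only: binary name, options and paths are assumptions.
# Update the RobinHood database from MDT changelogs, then exit:
$ rbh-lhsm --readlog --once -f /etc/robinhood.d/lhsm/lustre.conf
# Apply the migration (archive) policies once, then exit:
$ rbh-lhsm --migrate --once -f /etc/robinhood.d/lhsm/lustre.conf

In steady state the policy engine would rather run as a daemon, continuously reading changelogs and triggering archive and release actions according to its configuration, as described in the architecture slides above.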
EXAMPLES (3/3)
Example RobinHood policy: release
Release archived files when filesystem usage goes above 90%, but ignore some files.

Purge_trigger {
    trigger_on = ost_usage;
    high_watermark_pct = 90%;
    low_watermark_pct = 80%;
}
Purge_Policies {
    ignore { size < 1KB or owner == "root" }
    policy purge_quickly {
        target_fileclass = class_foo;
        condition { last_access > 1min }
    }
    ...
    policy default {
        condition { last_access > 1h }
    }
}

PROJECT STATUS
Initially developed by CEA in collaboration with CFS/Sun/Oracle/Whamcloud/Intel...
The HSM feature was first released in Lustre 2.5.0 (October 2013).
Now supported and maintained by the main Lustre vendors: Intel, Cray, Seagate, Bull, DDN, SGI...
HSM support is very active:
- Lustre 2.5.1: 19 patches (1 improvement, 1 feature)
- Lustre 2.5.2: 7 patches (1 improvement)
- Lustre 2.5.3: 6 patches
- Lustre 2.5.4 (Intel Foundation Edition): 8 patches (1 improvement)
- Lustre 2.6: 35 patches (3 improvements)
- Lustre 2.7: 18 patches
- Lustre 2.8 (under development): 14 patches

VENDOR INTEGRATION
Intel: support for the HSM feature, the POSIX copytool and the RobinHood Policy Engine is part of the Intel Enterprise Edition for Lustre solution.
SGI: developed a specific copytool for DMF; now in production on large systems: NCI/ANU (Australia), CINES (France), ...
Cray: developed a specific copytool (enhanced POSIX copytool + Versity support); active developer of the RobinHood Policy Engine; first customer: KAUST (Saudi Arabia).
Seagate: working on HSM support (including RobinHood).
Grau Data: developed a parallel copy tool for OpenArchive.
Support for other backends is under development.

HPSS INTEGRATION
CEA developed a copytool for HPSS based on the HPSS client API.
Available as open source to HPSS sites: http://lustrehpss.sourceforge.net
In production at CEA.
Advanced (successful) testing at SLAC.
Other sites have downloaded it (no feedback yet).
Mailing list (questions, new releases...): [email protected]
[Diagram: Lustre clients and the RobinHood Policy Engine on the Lustre side; agent nodes running HPSS copytools on top of lustreapi and the HPSS client API, talking to the HPSS servers on the HPSS side]

HPSS SPECIFIC FEATURES
Admin-friendly HPSS namespace
Lustre namespace:
/fs/proj1/grp/user/foo
/fs/proj/grp/user1/dir1/bar
/fs/proj/grp/user1/dir2/save
/fs/mylogs/tool/20131011.log
/fs/mylogs/tool/20131012.log
HPSS namespace:
/fs/proj/grp/user/foo__0x200000201:0x1a43f5:0x0
/fs/proj/grp/user1/dir1/bar__0x200100201:0x2f320:0x0
/fs/proj/grp/user1/dir2/save__0x200450201:0x74320:0x0
/fs/mylogs/tool/20131011.log__0x201210201:0x43112:0x0
/fs/mylogs/tool/20131012.log__0x200300201:0x24120:0x0
Relies on HPSS UDAs; otherwise the namespace is based on file identifiers.
This is just for admin convenience (no path update on rename).

Lustre/HSM configurations
One HPSS system can serve several Lustre filesystems (constraint: distinct directories).
One Lustre filesystem can use several storage systems (HPSS, POSIX, ...); this is policy driven, e.g. small files to an NFS filer, big files to HPSS... (see the sketch below).
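The "several storage systems behind one Lustre filesystem" case relies on HSM archive IDs: each copytool registers for an archive number and requests carrying that number are routed to it. Below is a minimal sketch using the POSIX copytool shipped with Lustre; the mount points, paths and archive numbers are placeholders, and the HPSS copytool has its own options (see share/doc/hpss_ct.pdf in its tarball).

# Agent node: POSIX copytool serving archive ID 1 (e.g. an NFS filer mounted at /nfs/filer).
agent# lhsmtool_posix --daemon --archive=1 --hsm-root /nfs/filer/lustre_archive /mnt/lustre
# Another copytool (e.g. the HPSS one) would register with a different archive ID, say 2.
# The archive ID can then be chosen per request, either manually:
client$ lfs hsm_archive --archive 1 /mnt/lustre/small_file
client$ lfs hsm_archive --archive 2 /mnt/lustre/big_file
# ...or by the policy engine according to its file classes (policy driven, as above).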
FAQ
"If a top directory is renamed, does this trigger millions of renames in HPSS?"
The only operations in the backend are the copy operations "archive" and "restore", plus "remove" (cleaning up deleted files). The path in HPSS is only there for admin convenience; it is not used operationally. The metadata replica used for disaster recovery is maintained by the Policy Engine (RobinHood). No massive metadata update is performed in HPSS.

"How long does an import operation take?"
Import is a metadata-only operation (it creates a "released" file in Lustre). File data is restored on first access (or by an explicit "hsm_restore" command).

"How is disaster recovery performed? Are files recovered by copying them from HPSS back to Lustre?"
The Policy Engine replicates filesystem metadata in near real time (thanks to Lustre changelogs). The contents of its database can be used to recreate files in the "released" state with the right paths and attributes. This is a metadata-only operation.

LIMITATIONS AND FUTURE WORK
Whole files
- The current implementation only supports whole-file copies.
- Support for partial file copies is on the HSMv2 TODO list.
DNE support (multiple Lustre MDSs)
- HSM is compatible with DNE phase 1 (static namespace partitioning).
- HSM is not compatible with later phases (on the HSMv2 TODO list).
Scalability
- Distributed policy engine database (work in progress).
Disaster recovery process
- All the information needed to restore a Lustre filesystem from the current backend contents is available in the RobinHood database.
- Disaster recovery can be implemented using the 'import' command plus information from the RobinHood database.
- However, a more automated/integrated command would be better; Intel is working on it.

GETTING STARTED
Enable the HSM feature on your Lustre filesystem: see the "Hierarchical Storage Management (HSM)" chapter of the Lustre Manual.
Get the HPSS copytool at http://lustrehpss.sourceforge.net (download); the documentation is in the tar file: share/doc/hpss_ct.pdf.
Then you can test manual actions (archive, release, restore...); a quick-start sketch is given after the contact information below.
To trigger automatic actions at scale, get the policy engine at http://robinhood.sourceforge.net -> "Download latest version" (lhsm flavor).
Lustre/HSM related documentation: "Online documentation" -> "robinhood-lhsm tutorial".

Thanks for your attention! Questions?

Commissariat à l'énergie atomique et aux énergies alternatives
CEA / DAM Île-de-France | Bruyères-le-Châtel, 91297 Arpajon Cedex
T. +33 (0)1 69 26 40 00
Établissement public à caractère industriel et commercial | RCS Paris B 775 685 019
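Quick-start sketch referenced in the "Getting Started" section above. The filesystem name, mount point and archive directory are placeholders, and the commands follow the HSM chapter of the Lustre Manual; check the manual for your Lustre version before use.

# On the MDS: enable the HSM coordinator ("lustre" is a placeholder filesystem name).
mds# lctl set_param mdt.lustre-MDT0000.hsm_control=enabled
# On an agent node (a Lustre client): start the POSIX copytool shipped with Lustre.
agent# lhsmtool_posix --daemon --archive=1 --hsm-root /tmp/archive /mnt/lustre
# From any client: test manual actions.
client$ lfs hsm_archive /mnt/lustre/foo
client$ lfs hsm_release /mnt/lustre/foo
client$ lfs hsm_restore /mnt/lustre/foo
client$ lfs hsm_state /mnt/lustre/foo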