LOSF Storage Optimization in Swift
Jeff Li
Senior Software Engineer, Technology and Products Center, iQiyi.com

Outline
• Background
  – Introduction
  – Motivation
• Blob Engine
  – Persist objects
  – Locate objects
  – Replicate objects
  – Volume compaction
• Performance
• Future
• Q&A

Background

Who are we

Why Swift
• Simple
• Low cost
• In use since 2012
• Serves video, image, text, etc. at iQiyi

Video Transcoding
[Architecture diagram: transcoding clients and the transcoding system read and write (W/R) through Proxy Node 1 ... Proxy Node M, which run standard Swift with customized middlewares. Storage Node 1 ... Storage Node N run standard Swift alongside customized services. Entry: /srv/node/]

Other Use Cases
• Video snapshots
• Archive with Swift EC
• Social product
Massive small-file storage matters!

Our Problem with LOSF
• Write performance degradation
The replication storage engine contributes most of the latency.

Storage Engine
• Erasure coding
• Replication Engine
  – Every replica is saved as a file
  – Metadata is saved as extended attributes

Write Pipeline of Replication Engine
[Flow diagram from Begin to End; boxes: Check if object exists, Create temp file, Make dirs, Write data, Write metadata, Drop cache, fsync, Rename, Invalidate hash]

Why the Replication Engine Is Inadequate for LOSF
• Heavy inode usage
• Heavy random IO
• Synchronous pipeline

Our Attempts
• Expand the cluster
• PyPy
• Hummingbird
None of these resolves the issue completely.

Blob Engine

Blob Store System
• Mainly designed for binary object storage
• Small files are stored in a big file
• File handle with encoded metadata
• Reduces random IO as much as possible
• FastDFS, Haystack, SeaweedFS, Ambry, TFS

Blob Store Architecture
• Distributed and fault tolerant
• Central lightweight metadata server
• Data servers
• File handle with encoded information
[Diagram: clients, a metadata server, and data servers with Disk i ... Disk n; interactions numbered 1 to 3]

Challenges in Swift
• No centralized metadata servers
• No file handle
• File path based replication
• Customized object metadata
• WSGI's multi-worker model

Persist Objects
• Volume files to save needles (objects)
• Embedded key value database
[Diagram: Disk A holds a KV database and Volume 0 ... Volume n]

Locate Objects
• Replication Engine
  – /account/container/object -> Partition
  – Partition -> Disk
• Blob Engine
  – Partition
  – Disk
  – Volume
  – Offset
  – Size

Locate Objects (cont.)
[Diagram: objects o1 and o2 placed on Disk A, Disk B, and Disk C. Each disk holds Partition 0 and one further partition (x, y, or z), backed by Volume 0 and a further volume (x, y, or z)]

Replicate Objects
• Based on Object Replicator
• Path mock in key value database
DB Key: /3/63c/3e19cafe6fc6d71c6ee3fe814ef4d63c/

Compact Volumes
• In-place copy
• Punch contiguous file holes in volume files
[Diagram comparing the original volume with the compacted volume; labels: Superblock, Hole, Needle 4, Deleted Needle, Needle 6, Needle 7]

Implementation
• Based on Hummingbird
• RocksDB as key value database
• Leverage Python Swift code
• gRPC

Performance
[Charts: Average Write Latency, 95th Percentile Write Latency, Average Read Latency, 95th Percentile Read Latency]

Future
• Full Go stack
• Operation tools
• Better large file support
• System performance observation

Summary
• Motivation
• Blob Engine
• Roadmap

THANK YOU!
@ffejfd
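
Illustrative sketch (not from the talk)
The Persist Objects and Locate Objects slides describe small objects packed into volume files, with a key value database recording where each one lives. The Python sketch below shows that idea in miniature. Everything in it is an assumption made for illustration: the names Volume and BlobStore, the 4-byte length header, the one-volume-per-partition mapping, the in-memory BytesIO standing in for a volume file, and the dict standing in for RocksDB. The key format imitates the DB key shown on the Replicate Objects slide, on the assumption that its middle component is the last three characters of the object hash.

```python
import io
import struct

# Hypothetical needle layout: a 4-byte big-endian length, then the payload.
HEADER = struct.Struct(">I")


class Volume:
    """Append-only container that packs many small objects (needles)."""

    def __init__(self, volume_id):
        self.volume_id = volume_id
        self.file = io.BytesIO()  # stands in for one large on-disk file

    def append(self, data):
        # Sequential append: one write at the tail, no new inode per object.
        offset = self.file.seek(0, io.SEEK_END)
        self.file.write(HEADER.pack(len(data)) + data)
        return offset, len(data)

    def read(self, offset, size):
        self.file.seek(offset)
        (stored_size,) = HEADER.unpack(self.file.read(HEADER.size))
        if stored_size != size:
            raise ValueError("index entry does not match needle header")
        return self.file.read(size)


class BlobStore:
    """Maps a path-like key to (volume, offset, size)."""

    def __init__(self):
        self.volumes = {}  # partition -> Volume (assumed one volume each)
        self.index = {}    # dict standing in for the embedded KV database

    def put(self, partition, object_hash, data):
        volume = self.volumes.get(partition)
        if volume is None:
            volume = self.volumes[partition] = Volume(partition)
        offset, size = volume.append(data)
        # Path-like key so that path-based tooling can still list objects.
        key = f"/{partition}/{object_hash[-3:]}/{object_hash}/"
        self.index[key] = (volume.volume_id, offset, size)
        return key

    def get(self, key):
        volume_id, offset, size = self.index[key]
        return self.volumes[volume_id].read(offset, size)


store = BlobStore()
key = store.put(3, "3e19cafe6fc6d71c6ee3fe814ef4d63c", b"thumbnail bytes")
print(key)
print(store.get(key))
```

A read costs one index lookup plus one seek into a file that is already open, which is the property the blob store slides rely on to cut inode usage and random IO. A real engine would also need deletion markers, crash recovery, and compaction, none of which this sketch attempts.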