Reliability of MEMS-Based Storage Enclosures
Bo Hong, Thomas J. E. Schwarz, S.J.*, Scott A. Brandt, Darrell D. E. Long
Storage Systems Research Center
University of California, Santa Cruz
*Also Santa Clara University, Santa Clara, CA

MEMS Storage Technology
Micro-Electro-Mechanical Systems (MEMS) storage
• A promising alternative secondary storage technology
• Hardware Research: IBM, HP, CMU, Nanochip
• Magnetic storage, but very different mechanics
[Figure: MEMS storage device with its spring labeled]

MEMS Storage Technology
MEMS-based storage vs. Magnetic Disk
• Provides non-volatile storage, too
• Delivers 10× faster access time (< 1 ms)
• Delivers higher bandwidth (100 MB/s – 1 GB/s)
• Small (about the size of a penny)
• Consumes 100× less power
• Costs ~10 USD per device
• Expected to be more reliable
• Stores limited amount of data per device (3-10 GB)
A serious alternative to disk drives, in particular for mobile computing applications

Reliability Implication of MEMS-based Storage
Storage systems built from MEMS-based storage …
• Require more MEMS devices
  - At least 10 times the number of disks to meet capacity requirements
• Require more connection components
Reliability implication
• More components, hence (?) lower reliability

MEMS Storage Enclosure
Our proposal: MEMS Enclosures
• A device with dozens of MEMS
• Single interface to rest of system
• Might be serviceable, but service calls during economic lifetime should be very rare
[Figure: MEMS enclosure with its single interface labeled]

MEMS Storage Enclosures
Reliability is an issue:
• MTTF 1-2 years without redundant data storage
Uses RAID Level 5 technology with distributed sparing
• Additional k spares
Calls for service when necessary
• i.e., when we run out of spares
Organization and number of spares can
• Decrease the data recovery time and thus improve reliability
• Reduce human interference
  - No servicing errors
  - Reduced maintenance costs

MEMS Enclosure Reliability
Measure MTBF for enclosures
• Without replacing spares
• With replacing spares (service calls)
Determine number of failures that trigger a service call
• Mandatory replacement: no redundancy left
• Preventive replacement: no spare left

MEMS Enclosure Reliability without Replacement
[Figure: reliability over time of enclosures without replacement, compared with disks. Curve labels: No spare (2.3 yrs), 1 spare (3.5 yrs), 2 spares (4.6 yrs), 3 spares (5.8 yrs), 4 spares (6.9 yrs), 5 spares (8.1 yrs), Disk 11.5 yrs, Disk 23 yrs]
• MTTF_DISK = 11.5 or 23 yrs; MTTF_MEMS = 23 yrs
• 19 data + 1 parity + k dedicated spares
• 15-minute data recovery
MTTF is not enough to measure reliability of enclosures without repairs
Instead: focus on data reliability during the economic lifetimes (3-5 years) of enclosures

MEMS Enclosures with Replacement
Markov model for a MEMS enclosure with N data, one parity, and one dedicated spare device
• N – Normal; D – Degraded; DL – Data Loss
• 1/λ – MTTF_MEMS (in tens of years)
• 1/µ – Mean Time Between Recovery (in minutes)
• 1/ν – Mean Time Between Replacement (in days, weeks)
[Figure: Markov state diagrams for preventive replacement, mandatory replacement, and preventive and mandatory replacement]

MEMS Enclosure Reliability with Replacement
[Figure: enclosure reliability with replacement. Curves for 1, 2, and 3 spares under the preventive + mandatory policy and under the mandatory policy, plus a no-spare curve]
Preventive replacement increases reliability and reduces replacement urgency

MEMS Enclosure Reliability
Dedicated Sparing
• Rebuilds all data from a failed MEMS onto a single spare MEMS
Distributed Sparing
• Every spare contains
  - Client data
  - Parity data
  - Spare space

Distributed Sparing [Menon and Mattson 1992]
[Figure: data layout before failure and after MEMS 4 fails]
• Shorter data recovery time
• More devices can fail

Reliability Comparison: Dedicated Sparing vs. Distributed Sparing
[Figure: dedicated sparing. Curves for 1 and 2 spares under the preventive + mandatory policy and under the mandatory policy, plus a no-spare curve]
Compare with following slide

Reliability Comparison: Dedicated Sparing vs. Distributed Sparing
[Figure: distributed vs. dedicated sparing. Curves for 1 and 2 spares under the preventive + mandatory policy and under the mandatory policy, plus a no-spare curve]
Distributed sparing is better only at short replacement times when using preventive replacement

Durability of MEMS Storage Enclosures
All about economy
• How long can MEMS enclosures work without repairs?
• How often do they need repairing in the first 3-5 years?
• How do replacement policies affect maintenance frequency?
Number of failures an enclosure with k spares can tolerate before the (m+1)th repair is scheduled (m >= 0):
• (m + 1) × k, under the preventive replacement policy
• (m + 1) × (k + 1), under the mandatory replacement policy

Durability of MEMS Storage Enclosures
[Figure: probabilities that a MEMS storage enclosure has up to k failures during (0, t]. Curve labels: No failure, 1 failure, 2 failures, 4 failures, 6 failures, 8 failures, 10 failures, Disk 23 yrs]
First-year survivability: 95.7% of disks vs. 98.8% of MEMS enclosures with two spares
Chance that a MEMS enclosure with four spares requires more than one service in five years: 3.5% (preventive) vs. 0.6% (mandatory)

Related Work
MEMS-based storage technology development
• IBM, HP, CMU CHI2PS, Nanochip
Digital Micromirror Devices by TI
• Reported Mean Time Between Failure: 650,000 hours [Douglass]
RAID reliability
• Dedicated sparing [Dunphy et al.]
• Distributed sparing [Menon and Mattson]
• Parity sparing [Reddy and Banerjee]
Disk failure prediction
• S.M.A.R.T. (Self-Monitoring, Analysis and Reporting Technology)

Summary
Reliability of MEMS storage enclosures
• Can be more reliable than disks even without failed device replacement
• Highly reliable when using preventive replacement
• Dedicated sparing and distributed sparing provide comparable or almost identical reliability
Economy of MEMS storage enclosures
• Preventive replacement trades more maintenance services for higher reliability

Thank You!
Acknowledgements
• Dave Nagle, Greg Ganger, CMU PDL
• The rest of the UCSC SSRC
More information:
• http://ssrc.cse.ucsc.edu
• http://ssrc.cse.ucsc.edu/mems.shtml
Questions?

Backup Slides

MEMS Storage Technology
Micro-Electro-Mechanical Systems (MEMS) storage
• A promising alternative secondary storage technology
• Hardware Research: IBM, HP, CMU, Nanochip
Radical differences between MEMS storage and magnetic disk technologies

                      Disk           MEMS
Recording media       Magnetic       Magnetic or physical (non-volatile)
Recording technique   Longitudinal   Orthogonal (higher density)
R/W head              Single         Thousands – tip array (higher bandwidth and parallelism)
Media movement        Rotation       Media sled moves in X and Y independently (no rotation delay)

MEMS Storage Device Characteristics
Physical size: 1 – 2 cm²
Recording density: 250 – 750 Gb/in²
[Figure: Predicted Performance in 2005. Throughput (1 GB/s to 7 GB/s) vs. access latency (1 ns to 10 ms) for DRAM (0.5–2 GB, $100-$200/GB), MEMS (3–10 GB, $5-$50/GB), and DISK (100–500 GB, $1-$2/GB)]

MEMS Storage Device
[Figure: MEMS storage device with its spring and the X and Y movement axes labeled]

Durability of MEMS Storage Enclosures
[Figure: probabilities that a MEMS storage enclosure has up to k failures during (0, t]; same curves as on the earlier durability slide]
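
Worked example (not part of the original slides)
The numbers on the "without replacement" and durability slides can be approximated with a very simple model. The sketch below is an assumption of ours, not the authors' Markov model: it treats the 20 active devices (19 data + 1 parity) as failing independently at a constant rate of 1/MTTF_MEMS each, treats recovery as instantaneous, and assumes idle spares do not fail. Under those assumptions the failure count in (0, t] is Poisson, and data is lost at failure number k + 2 for an enclosure with k spares. The function names are hypothetical. This simplification appears to match the MTTF labels (2.3 to 8.1 yrs) and the first-year survivability figures (95.7% and 98.8%) quoted in the slides; it is not claimed to reproduce the five-year service-call percentages.

```python
import math

MTTF_MEMS_YEARS = 23.0  # per-device MTTF quoted in the slides
ACTIVE_DEVICES = 20     # 19 data + 1 parity, as in the slides


def failure_rate(active=ACTIVE_DEVICES, mttf=MTTF_MEMS_YEARS):
    """Aggregate failure rate (failures per year) of the active devices."""
    return active / mttf


def mttf_without_replacement(spares):
    """Mean time to data loss (years) when failed devices are never replaced.

    Assumption: data is lost at failure number spares + 2 (every spare is
    used up, then the parity redundancy, then one more device fails).
    """
    return (spares + 2) / failure_rate()


def prob_at_most(failures, years, rate=None):
    """Poisson probability of at most `failures` failures during (0, years]."""
    lam = (failure_rate() if rate is None else rate) * years
    terms = (lam ** i / math.factorial(i) for i in range(failures + 1))
    return math.exp(-lam) * sum(terms)


def failures_before_repair(spares, m, policy):
    """Failures tolerated before the (m+1)th repair is scheduled (m >= 0)."""
    if policy == "preventive":
        return (m + 1) * spares
    if policy == "mandatory":
        return (m + 1) * (spares + 1)
    raise ValueError("policy must be 'preventive' or 'mandatory'")


if __name__ == "__main__":
    for k in range(6):
        print(f"{k} spares: MTTF ~ {mttf_without_replacement(k):.2f} years")
    # A single disk survives one year if it has zero failures.
    print(f"disk, 1 year: {prob_at_most(0, 1.0, rate=1 / 23.0):.3f}")
    # An enclosure with 2 spares survives if it has at most 3 failures.
    print(f"enclosure with 2 spares, 1 year: {prob_at_most(3, 1.0):.3f}")
```

For instance, `failures_before_repair(4, 0, "preventive")` gives the number of failures after which the first preventive service call is scheduled for an enclosure with four spares, following the (m + 1) × k rule from the durability slide.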