How Many Spares Does One Really Need?
Tony Frisch, Xtera Communications

Possible criteria for selecting quantity of spares

More spares = more confidence that they will be available when needed, but also = more cost

1. To ensure 99.999% availability: links to a network requirement
2. To be X% confident of having spares available when needed: what value for X?
3. Some other criterion, e.g. cost-effectiveness: balance the costs of SLA payments against the cost of buying spares

Calculation methods

Analyse by number of spares being repaired
[Diagram: chain of states 0, 1, 2, 3, 4 = number of spares in repair, linked by Failure and Repair transitions; runs from "Spares available" (short / no outage) to "Out of spares" (long outage)]

Calculate using:
1. Simple formula, e.g. Poisson: easy to use, but lacks flexibility
2. Monte-Carlo simulation: rigorous and flexible, but hard to check
3. Steady-state Markov: flexible and not difficult to check

Can be done on a site basis or for the whole network; looking at the whole network is often a good way to reduce spares needed
Main weakness is the assumption of a constant, known failure rate and a well-defined return time

Uncertainty of FIT derivation

Rapid development makes it difficult to test large numbers of units for long periods of time
Only a few failures during testing means uncertainty of failure rate
One approach is to use 95% Upper Confidence Limit (UCL) values; typically these are 2-3x larger than the true values and will generally result in requiring more spares
[Figure: relative probability vs. normalised FIT value, one curve per number of failures observed in testing]

Example network (4 sites)

Some units are more significant than others
Common units affect multiple wavelengths
Wavelength units are highest FIT and cost, so they are most significant for the spares calculation
[Diagram: 8 WL SLTE blocks connected through MUX units; FIT labels include WL 10,000 and NPE (Ring) 300 FIT]

Failure probability vs. FIT value

Cumulative probability of having a spare available:

Spares   FIT 10,000 (95% UCL)   FIT 5,000    FIT 5,000
         x 16 units             x 16 units   x 64 units
1        97.720%                99.385%      92.144%
2        99.828%                99.977%      98.841%
3        99.990%                99.999%      99.870%
4        100.000%               100.000%     99.988%
5        100.000%               100.000%     99.999%

Actual FIT value = 5,000; MTTR = 90 days
95% UCL FIT value ≈ 10,000
More spares per site are needed using the 95% UCL than using the "expected" FIT value

Network topology and protection

Network topology and protection are also important
Different protection schemes (e.g. N+1) tolerate more failures before an outage occurs
Important not to base the calculation on a point-to-point SLTE computation, as this also increases the spare requirement for no real benefit

Sharing spares

More savings by sharing spares between sites
Still need at least one per site
Depends on being able to move spares easily; may not be practical in some cases, e.g. due to Customs

Economic Analysis

Cost of outage = SLA payment
Details of Service Level Agreements vary, so for simplicity assume cost of outage is:
- proportional to the length of outage
- proportional to the number of circuits affected
May be somewhat simplistic, but could be improved by use of a "utility" function which includes additional factors such as:
- Reputation loss
- Risk of losing existing customers

Overall cost during warranty period

Aim to find lowest total cost, analysed over the warranty period
Graph shows the example of a network of 4 sites, each with 16 units with a true FIT value of 5,000 and a repair time of 90 days
[Figure: relative cost (0 to 1400) vs. number of spares (0 to 10); curves for Spares, Outage and Total]
The calculation is not difficult and the minimum is not very sensitive to precise input values
Worth considering as an additional analytic tool?

Conclusions / ways to reduce costs

1. Calculations should be done with "expected" FITs
2. If practical, share spares between sites
3. Seek faster return times
4. Consider overall costs of operation
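
Worked example: simple Poisson calculation

The "simple formula" option can be sketched in a few lines of Python. This is a minimal illustration, not the author's tool: it assumes the number of units away for repair is Poisson-distributed with mean = FIT x 1e-9 x units x repair time in hours, and takes the confidence for n spares as the probability that at most n units are in repair. The function names are hypothetical. Under these assumptions the figures in the "Failure probability vs. FIT value" table are reproduced with a 60-day repair time rather than the 90 days quoted alongside it; the 60-day value is an inference from the numbers, not something the slides state.

```python
import math

HOURS_PER_DAY = 24
FIT_TO_PER_HOUR = 1e-9  # 1 FIT = 1 failure per 10^9 device-hours


def expected_in_repair(fit, units, repair_days):
    """Mean number of units away for repair at any one time."""
    return fit * FIT_TO_PER_HOUR * units * repair_days * HOURS_PER_DAY


def confidence(spares, fit, units, repair_days):
    """Poisson probability that at most `spares` units are in repair."""
    mean = expected_in_repair(fit, units, repair_days)
    return sum(
        math.exp(-mean) * mean**k / math.factorial(k)
        for k in range(spares + 1)
    )


def spares_needed(target, fit, units, repair_days):
    """Smallest number of spares whose confidence reaches `target`."""
    n = 0
    while confidence(n, fit, units, repair_days) < target:
        n += 1
    return n


if __name__ == "__main__":
    # Assumption: 60-day repair time, chosen because it matches the table.
    for fit, units in [(10_000, 16), (5_000, 16), (5_000, 64)]:
        row = [f"{100 * confidence(s, fit, units, 60):.3f}%" for s in range(1, 6)]
        print(f"FIT {fit} x {units}:", "  ".join(row))
```

Changing repair_days shows directly how a faster return time reduces the spares needed, and pooling units across sites (a larger units value served by one stock) shows the benefit of sharing.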