Report on Network Down event in Cyberjaya Data Center on 4th

INCIDENT REPORT
Confidential
INCIDENT REPORT DATA CENTRE POWER FAILURE
Date: 17th May 2014
Time: 16:45-18:20 (1 hour 35 minutes)
Location: VADS DC, Level 2 & 3, CBJ2
Summary of the Event
Planned power maintenance and upgrading work by Telekom Malaysia
Property Operations (TM PO) was scheduled to start at 09:00 (17th May
2014) and to be completed at 23:59 (18th May 2014).
The load from the TNB utility supply was successfully transferred to the
two building generator sets (Gensets) at 10:20. The power maintenance
exercise ran into difficulty when the two Gensets shut down at 14:15 and
14:30 respectively.
The shutdown of both Gensets interrupted both DC cooling and the power
supply to customer racks (once the UPS batteries had drained flat).
Cause of Problem
Both building Gensets shut down after running for almost 4 hours. The
failure was due to a faulty fuel level switch, which did not trigger the fuel
auto pump to pump diesel from the main tank to the service tank. As a result,
the static Gensets shut down because no diesel was being fed to them from
the service tank.
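The failure mode described above can be illustrated with a minimal simulation. This is only a sketch: the tank capacity, burn rate, low-level mark and the function name run_genset are hypothetical values chosen for illustration and are not taken from this report, and the top-up from the main tank is treated as instantaneous.

```python
def run_genset(hours, switch_ok, service_tank_l=800.0, burn_l_per_h=200.0,
               low_mark_l=300.0, refill_to_l=800.0, step_h=0.25):
    """Simulate one service tank feeding one Genset.

    All figures are hypothetical. Returns the number of hours the Genset
    ran before the service tank ran dry, or `hours` if it never did.
    """
    level = service_tank_l
    t = 0.0
    while t < hours:
        # A healthy level switch triggers the auto pump at the low mark;
        # a faulty switch never does, so the tank is never topped up.
        if switch_ok and level <= low_mark_l:
            level = refill_to_l  # assumed instantaneous top-up from main tank
        burn = burn_l_per_h * step_h
        if level < burn:
            return t  # service tank empty: Genset shuts down
        level -= burn
        t += step_h
    return hours


print(run_genset(12, switch_ok=True))   # runs for the full period
print(run_genset(12, switch_ok=False))  # stops once the service tank is empty
```

With these assumed figures the Genset with the faulty switch stops once the service tank alone is exhausted, while the one with a working switch keeps running for as long as the main tank can supply it.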
Chronology of the Event and Action Plans
17th May 2014
Time   Event/Action
10:20  TM PO maintenance work started by shutting down the 11kV
       dual feeder to the CBJ2 HT panel. Both 2MVA Gensets (Genset#1
       and Genset#2) were running on load, at approximately
       800 A per Genset.
14:15  Genset#1, which is connected to MSB#1, shut down. All
       supplies to MSB#1 were cut off. UPS Socomec (300 kVA x 2 nos)
       and all connected rectifiers switched to Battery-on-load mode.
14:30  Genset#2, which is connected to MSB#2, shut down. All
       supplies to MSB#2 were cut off. UPS APC (240 kVA x 2 nos),
       UPS Liebert (160 kVA x 2 nos) and the connected rectifiers
       switched to Battery-on-load mode.
15:00  TM PO Contractor was called to mobilize 2 nos of mobile
       Gensets to Cyberjaya 2.
16:00  TM PO’s Mobile Genset was up and running to supply load for
       MSB#1 and UPS Socomec.
16:34  UPS APC and UPS Liebert shut down because their batteries
       were drained, and all load was transferred to UPS Socomec.
       (Both UPS Liebert and UPS APC had battery back-up for 2 hours.)
16:45  TM PO’s Mobile Genset became unstable (due to high load),
       causing UPS Socomec to switch to Battery-on-load mode. Because
       its batteries had already been drained by the earlier
       battery-on-load period, UPS Socomec shut down, causing a
       total shutdown of all load connected to the UPS.
16:45  Most power supplies to level 2 and level 3 customer racks were
       interrupted. The exception was UPS APC 200KVA, which continued
       to support several customer racks at level 3.
17:00  2 nos of Mobile Gensets from TM PO Contractor reached
       Cyberjaya 2 and were immediately readied for connection to
       both MSBs.
18:00  TM PO Contractor’s Mobile Genset was connected to MSB#2,
       which was re-energized. UPS APC and UPS Liebert back on.
18:05  Genset#1 back on and MSB#1 re-energized. UPS Socomec
       back on.
18:05  Cooling system at the DC immediately restored and DC
       temperature returned to normal.
18:20  All output breakers to STS/PDU at level 2 were switched ON
       and power supply to level 2 customer racks was being
       normalized.
18:20  Power supply fully resumed to all racks at CBJ2 DC level 2.
18:20  DC network services restored.
18:40  All output breakers to STS/PDU at level 3 were switched ON
       and power supply to level 3 customer racks was being
       normalized.
18:40  Power supply fully resumed to all racks at CBJ2 DC level 3.
18th May 2014
Time   Event/Action
03:20  TM PO normalized the incoming power back to TNB
       supply.
       TM PO halted and postponed the building power
       upgrading/maintenance work until further notice.
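The intervals implied by the chronology can be cross-checked with a short calculation. The sketch below uses only timestamps stated in this report (all on 17th May 2014). The helper name elapsed and the interval labels are illustrative, not part of the original report.

```python
from datetime import datetime


def elapsed(start: str, end: str) -> str:
    """Return the time between two same-day HH:MM stamps as 'XhYYm'."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    minutes = int(delta.total_seconds()) // 60
    return f"{minutes // 60}h{minutes % 60:02d}m"


# (start, end) pairs taken from the chronology of 17th May 2014
intervals = {
    "Genset#1 on load before shutdown": ("10:20", "14:15"),
    "Genset#2 on load before shutdown": ("10:20", "14:30"),
    "UPS APC/Liebert on battery before shutdown": ("14:30", "16:34"),
    "Reported outage window": ("16:45", "18:20"),
}

for label, (start, end) in intervals.items():
    print(f"{label}: {elapsed(start, end)}")
```

The results agree with the figures quoted elsewhere in the report: Genset#1 ran for just under 4 hours, UPS APC and UPS Liebert held for about 2 hours on battery, and the reported outage window is 1 hour 35 minutes.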
Solutions
i) TM PO temporarily used Mobile Gensets to supply power to the building
(17th May 2014 at 18:00)
ii) TM PO reverted to the TNB utility power source (18th May 2014 at 03:20)
Mitigation Plans
Proposed mitigation plans by TM PO:
i) To build a separate service tank, complete with its own fuel auto pump,
for each static Genset in Cyberjaya 2
ii) To replace the faulty fuel level switch for the existing service tank
iii) To ensure that every manual changeover at each MSB is equipped with a
mobile Genset termination box
iv) To ensure that all diesel motor pump systems function properly during
Genset on-load tests and prolonged TNB power supply failures (more than 1
hour)
Customers Affected
Power Issue: All customers at Level 2 and Level 3, CBJ2 DC
Network Issue: All customers at CBJ1, CBJ2, CBJ5 and CBJ6 DC affected
for 2 hours. Network restored at 18:20.
Status
The power supply to the building was fully restored, powered by mobile
Gensets, at 18:00. All breakers/PDUs were normalized in stages between 18:20
and 18:40 (17th May 2014).
TM PO successfully normalized the incoming power back to TNB supply
at 03:20 (18th May 2014).
TM PO postponed the power upgrading/maintenance work until further notice.
All DC power supply and cooling have resumed normal operation.
** End of Report **
Prepared by,
Data Centre Operation CBJ2