OPNFV Summit 2015
Doctor: Failure Detection and Notification for NFV
Gerald Kunzmann, DOCOMO
Carlos Goncalves, NEC
Ryota Mibu, NEC

Doctor Overview
• Goal
  – Build fault management and maintenance framework
• Approach
  – Identify requirement
  – Gap Analysis
  – Implementation work in Upstream (OpenStack)
  – Integration and testing
• Status
  – Initial Requirement study, architecture design, Gap analysis: Done
  – Collaborative Development: On-going (3 merged Blueprints in OpenStack Liberty)
  – Standardization Sync: On-going (by NFV member efforts, joint meeting)

Key Requirements as VIM
• Consistent Resource State Awareness
• Immediate Notification
• Extensible Monitoring
• Fault Correlation

Doctor Demo Overview
[Figure: demo architecture. Labels: Video Player, Switch, Application, Streaming Server (ACT / SBY), Act-Sby Manager, Virtualized Infrastructure (Virtual Compute, Virtual Storage, Virtual Network), Virtualization Layer, Hardware Resources, Virtualized Infrastructure Manager (VIM) = OpenStack. Events: Host-A Down, VM-1 Down, Detection, Reaction, Quick Recovery.]
• Detection without Doctor: few minutes
• Detection with Doctor: 1 second

Fault Management Sequence
[Figure: sequence diagram. Labels: Application (App Manager + Viewer, Streaming Server), Manager, Controller (Nova, Resource Map), Notifier (Ceilometer, Alarm Conf.), Monitor, Inspector (Failure Policy), State Reflector, Log, Virtualized Infrastructure (Resource Pool), Liberty.]
0. Set Alarm
1. Raw Failure
2. Find Affected
3. Update State
4. Notify all
5. Notify Error
6-. Action

Service Healing Process
[Figure: service healing across two hosts. Labels: Alarm Notification, Host A, Host B, VM9, VM0, VM1, App Manager, Streaming Server (x2), vNIC, vSwitch, NIC, Switch, Video Player. Legend: Control, Data Flow (Before), Data Flow (After).]

Doctor Demo Screen
• App Manager Service Control
• VM List (Horizon)
• App Manager Event/Action Log
• VM Egress Stats (Zabbix)
• Demo Operation Console
• Video Player (with Doctor)
• Video Player (without Doctor)

Doctor Demo

Doctor Blueprints in OpenStack Liberty Cycle
(✓ = Using in This Demo)
• Ceilometer: Event Alarm Evaluator ✓
  – Spec Drafter: Ryota Mibu (NEC)
  – Developer: Ryota Mibu (NEC)
  – Status: Completed (Liberty)
• Nova: New nova API call to mark nova-compute down ✓
  – Spec Drafter: Tomi Juvonen (Nokia)
  – Developer: Roman Dobosz (Intel)
  – Status: Completed (Liberty)
• Nova: Support forcing service down ✓
  – Spec Drafter: Tomi Juvonen (Nokia)
  – Developer: Carlos Goncalves (NEC)
  – Status: Completed (Liberty)
• Nova: Get valid server state
  – Spec Drafter: Tomi Juvonen (Nokia)
  – Status: Spec approved (Mitaka)
• Nova: Add notification for service status change
  – Spec Drafter: Balazs Gibizer (Ericsson)
  – Developer: Balazs Gibizer (Ericsson)
  – Status: Waiting for spec approval (Mitaka)

Doctor BP Detail: Nova – Mark Nova-Compute Down
[Figure: Nova service state update paths. Labels: Monitoring, External Monitoring Service, Client, Force-down API, nova api, nova conductor, nova scheduler, nova DB, queue, service state, Host / Machine (VM, Hypervisor, nova compute, vSwitch, BMC).]
• NEW: API to update nova-compute service state
• EXISTING: periodic update

Doctor BP Detail: Ceilometer - Event Alarm
[Figure: Ceilometer alarm paths. Labels: Nova, Neutron, Cinder, notification, event, sample, stats, Manager, Audit Service, Notification-driven alarm evaluator.]
• EXISTING: polling-based
• NEW: Shortcut (notification-based)

Who made this demo?
• Upstream OSS Community & Developer
  – OpenStack Contributors including Doctor Developers
• OPNFV Doctor Team
  – Doctor contributors who worked on requirement study, gap analysis and implementation design
• Doctor PoC Demo Team
  – NTT DOCOMO
  – NEC: Toshiaki Takahashi, Takahiro Suzuki, Ryuji Ishikawa, ...

Visit DOCOMO Booth, PoC Demo Zone
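
Appendix: Fault Management Sequence as a sketch

The numbered steps on the Fault Management Sequence slide (0. Set Alarm through 6-. Action) could be sketched roughly as below. This is a toy in-memory simulation, not the Doctor or OpenStack implementation: every class, method and the VM-2 / Host-B placement is hypothetical, and the assignment of steps to components (Inspector, Controller, Notifier) is an assumption read off the slide labels.

```python
from typing import Callable, Dict, List

Event = Dict[str, str]


class Notifier:
    """Hypothetical stand-in for the Notifier (Ceilometer in the demo)."""

    def __init__(self) -> None:
        self._alarms: List[Callable[[Event], None]] = []

    def set_alarm(self, callback: Callable[[Event], None]) -> None:
        # Step 0: the manager registers interest in error events.
        self._alarms.append(callback)

    def publish(self, event: Event) -> None:
        # Step 5: forward the error to every registered alarm.
        for callback in self._alarms:
            callback(event)


class Controller:
    """Hypothetical stand-in for the Controller (Nova in the demo)."""

    def __init__(self, resource_map: Dict[str, List[str]], notifier: Notifier) -> None:
        self.resource_map = resource_map  # host name -> VMs running on it
        self.notifier = notifier
        self.vm_state = {vm: "active" for vms in resource_map.values() for vm in vms}

    def update_state(self, vm: str, state: str) -> None:
        # Step 3: record the new state. Step 4: emit a notification.
        self.vm_state[vm] = state
        self.notifier.publish({"vm": vm, "state": state})


class Inspector:
    """Hypothetical Inspector: maps a raw host failure to affected VMs."""

    def __init__(self, controller: Controller) -> None:
        self.controller = controller

    def on_raw_failure(self, host: str) -> None:
        # Step 1 arrives from a Monitor; step 2 looks up affected VMs.
        for vm in self.controller.resource_map.get(host, []):
            self.controller.update_state(vm, "error")


actions: List[str] = []


def app_manager(event: Event) -> None:
    # Step 6: the application manager reacts, e.g. by switching to standby.
    if event["state"] == "error":
        actions.append(f"switch to standby: {event['vm']} failed")


notifier = Notifier()
notifier.set_alarm(app_manager)
controller = Controller({"Host-A": ["VM-1"], "Host-B": ["VM-2"]}, notifier)
Inspector(controller).on_raw_failure("Host-A")
print(actions)
```

In this sketch the failure is pushed through each stage rather than discovered by polling, which is the property the demo attributes to the short detection time with Doctor.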