OPNFV Summit 2015
Doctor: Failure Detection and Notification for NFV
Gerald Kunzmann, DOCOMO
Carlos Goncalves, NEC
Ryota Mibu, NEC
Doctor Overview
• Goal
– Build fault management and maintenance framework
• Approach
– Identify requirements
– Gap analysis
– Implementation work in upstream (OpenStack)
– Integration and testing
• Status
– Initial requirement study, architecture design, gap analysis: Done
– Collaborative development: On-going (3 merged blueprints in OpenStack Liberty)
– Standardization sync: On-going (by NFV member efforts, joint meeting)
Key Requirements as VIM
• Consistent Resource State Awareness
• Immediate Notification
• Extensible Monitoring
• Fault Correlation
Doctor Demo Overview
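[Figure: a Video Player receives a stream from a Streaming Server application that runs as an active (ACT) and a standby (SBY) instance. A Manager switches Act-Sby for quick recovery. The Virtualized Infrastructure (virtual compute, virtual storage, virtual network on a virtualization layer over hardware resources) is managed by the Virtualized Infrastructure Manager (VIM) = OpenStack. Fault shown: Host-A Down, which results in VM-1 Down.]
• Reaction: Switch Act-Sby (Quick Recovery)
• Detection without Doctor: few minutes
• Detection with Doctor: 1 second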
Fault Management Sequence
[Figure: fault management sequence on OpenStack Liberty. Labels: App Manager + Viewer, Streaming Server Application, Manager, Controller (Nova) with Resource Map, Notifier (Ceilometer) with Alarm Conf., Monitors on the Virtualized Infrastructure (Resource Pool), Inspector, State Reflector, Failure Policy, Log.]
0. Set Alarm
1. Raw Failure
2. Find Affected
3. Update State
4. Notify all
5. Notify Error
6-. Action
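The numbered sequence can be sketched as a small in-memory simulation. This is a minimal sketch, not the Doctor implementation: the class names, method names, and the assignment of each step to a component (Manager sets the alarm, Monitor reports to the Inspector, Inspector updates the Controller, Controller notifies the Notifier, Notifier calls the Manager) are assumptions made for illustration. Only the step names and the Host-A / VM-1 example come from the slides.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Controller:
    """Hypothetical stand-in for the VIM controller (Nova in the demo)."""
    vm_state: Dict[str, str]
    listeners: List[Callable[[str, str], None]] = field(default_factory=list)

    def update_state(self, vm: str, state: str) -> None:
        # Step 3 (Update State), then step 4 (Notify all).
        self.vm_state[vm] = state
        for listener in self.listeners:
            listener(vm, state)


@dataclass
class Notifier:
    """Hypothetical stand-in for the alarm service (Ceilometer in the demo)."""
    alarms: Dict[str, Callable[[str, str], None]] = field(default_factory=dict)

    def set_alarm(self, vm: str, callback: Callable[[str, str], None]) -> None:
        # Step 0 (Set Alarm): the Manager registers interest in a VM.
        self.alarms[vm] = callback

    def on_state_change(self, vm: str, state: str) -> None:
        # Step 5 (Notify Error): fire only for VMs that have an alarm.
        if state == "error" and vm in self.alarms:
            self.alarms[vm](vm, state)


@dataclass
class Inspector:
    """Hypothetical Inspector holding a host-to-VM resource map."""
    resource_map: Dict[str, List[str]]
    controller: Controller
    log: List[str] = field(default_factory=list)

    def report_raw_failure(self, host: str) -> None:
        # Step 1 (Raw Failure): a Monitor reports a failed host.
        self.log.append(f"raw failure: {host}")
        # Step 2 (Find Affected): look up the VMs on that host.
        for vm in self.resource_map.get(host, []):
            self.controller.update_state(vm, "error")


def run_demo() -> Tuple[Dict[str, str], List[Tuple[str, str]]]:
    actions: List[Tuple[str, str]] = []
    controller = Controller(vm_state={"VM-1": "active"})
    notifier = Notifier()
    controller.listeners.append(notifier.on_state_change)
    inspector = Inspector(resource_map={"Host-A": ["VM-1"]}, controller=controller)
    # Step 6 (Action) is whatever the Manager does inside its callback.
    notifier.set_alarm("VM-1", lambda vm, state: actions.append((vm, "switch Act-Sby")))
    inspector.report_raw_failure("Host-A")
    return controller.vm_state, actions
```

Calling `run_demo()` walks steps 0 to 6 once for a Host-A failure and returns the final VM states together with the actions the Manager recorded.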
Service Healing Process
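[Figure: Host A and Host B each run a Streaming Server VM connected through vNIC, vSwitch and NIC to a Switch and on to the Video Player. Other VMs shown: VM0, VM1, VM9. The App Manager receives the Alarm Notification and controls the Streaming Servers. Legend: Control, Data Flow (Before), Data Flow (After).]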
Doctor Demo Screen
• App Manager Service Control
• VM List (Horizon)
• App Manager Event/Action Log
• VM Egress Stats (Zabbix)
• Demo Operation Console
• Video Player (with Doctor)
• Video Player (without Doctor)
Doctor Demo
Doctor Blueprints in OpenStack Liberty Cycle
✓ = Using in This Demo

Project | Blueprint | Spec Drafter | Developer | Status | Demo
Ceilometer | Event Alarm Evaluator | Ryota Mibu (NEC) | Ryota Mibu (NEC) | Completed (Liberty) | ✓
Nova | New nova API call to mark nova-compute down | Tomi Juvonen (Nokia) | Roman Dobosz (Intel) | Completed (Liberty) | ✓
Nova | Support forcing service down | Tomi Juvonen (Nokia) | Carlos Goncalves (NEC) | Completed (Liberty) | ✓
Nova | Get valid server state | Tomi Juvonen (Nokia) | | Spec approved (Mitaka) |
Nova | Add notification for service status change | Balazs Gibizer (Ericsson) | Balazs Gibizer (Ericsson) | Waiting for spec approval (Mitaka) |
Doctor BP Detail: Nova – Mark Nova-Compute Down
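[Figure: a Host / Machine runs the VM, Hypervisor, vSwitch, BMC and nova-compute, with Monitoring attached. nova api, nova conductor, nova scheduler and nova DB communicate through the queue.]
• EXISTING (periodic update): nova-compute updates its service state periodically
• NEW API (Force-down API): an External Monitoring Service acts as a client of nova api to update the nova-compute service state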
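A minimal sketch of what an external monitoring service might send to mark a nova-compute service as down. The slide names only a "Force-down API"; the request shape used here (`PUT /os-services/force-down`, compute API microversion 2.11, a body carrying `host`, `binary` and `forced_down`) is an assumption based on the Liberty-era Nova API and should be checked against the Nova API reference. The function name, endpoint, and token are hypothetical, and the code only builds the request without sending it.

```python
import json
from typing import Any, Dict


def build_force_down_request(
    compute_endpoint: str, host: str, token: str, forced_down: bool = True
) -> Dict[str, Any]:
    """Build (but do not send) a force-down request for nova-compute on `host`.

    Assumed, not taken from the slides: URL path, microversion header
    and body fields.
    """
    return {
        "method": "PUT",
        "url": f"{compute_endpoint.rstrip('/')}/os-services/force-down",
        "headers": {
            "X-Auth-Token": token,
            # Assumed microversion that exposes the forced_down attribute.
            "X-OpenStack-Nova-API-Version": "2.11",
            "Content-Type": "application/json",
        },
        "body": json.dumps(
            {"host": host, "binary": "nova-compute", "forced_down": forced_down}
        ),
    }


# Hypothetical endpoint and token, for illustration only.
request = build_force_down_request("http://controller:8774/v2.1/", "Host-A", "TOKEN")
```

Passing `forced_down=False` would build the request that clears the flag once the host is healthy again.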
Doctor BP Detail: Ceilometer - Event Alarm
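[Figure: Nova, Neutron and Cinder emit notifications that Ceilometer turns into events and samples (stats). Also shown: Manager, Audit Service.]
• EXISTING (polling-based): alarms are evaluated by polling stats
• NEW Shortcut (notification-based): notification-driven alarm evaluator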
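The notification-based shortcut can be illustrated with a minimal sketch of an event alarm and a matcher that is run once per incoming event, instead of on a polling interval. This is not the Ceilometer evaluator: the helper names are hypothetical, and the field names (`type`, `event_rule`, `event_type`, `query`, `alarm_actions`) and the example event type are assumptions modeled on the general shape of an event alarm definition.

```python
from typing import Any, Dict


def make_event_alarm(
    name: str, event_type: str, trait: str, value: str, action_url: str
) -> Dict[str, Any]:
    """Describe an alarm that fires when an event's trait equals `value`."""
    return {
        "name": name,
        "type": "event",
        "event_rule": {
            "event_type": event_type,
            "query": [
                {"field": f"traits.{trait}", "op": "eq", "type": "string", "value": value}
            ],
        },
        "alarm_actions": [action_url],
    }


def matches(alarm: Dict[str, Any], event: Dict[str, Any]) -> bool:
    """Evaluate one event against one alarm (equality conditions only)."""
    rule = alarm["event_rule"]
    if event["event_type"] != rule["event_type"]:
        return False
    for cond in rule["query"]:
        trait_name = cond["field"].split(".", 1)[1]  # strip the "traits." prefix
        if event["traits"].get(trait_name) != cond["value"]:
            return False
    return True


# Hypothetical alarm set by the Manager, and a hypothetical incoming event.
alarm = make_event_alarm(
    "vm1_error", "compute.instance.update", "state", "error",
    "http://app-manager.example/failure",
)
event = {"event_type": "compute.instance.update", "traits": {"state": "error"}}
fired = matches(alarm, event)
```

Because the check runs when the event arrives, the delay between the state change and the alarm action does not depend on a polling period.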
Who made this demo?
• Upstream OSS Community & Developer
– OpenStack Contributors including Doctor Developers
• OPNFV Doctor Team
– Doctor contributors who worked on requirement study, gap analysis and implementation design
• Doctor PoC Demo Team
– NTT DOCOMO
– NEC: Toshiaki Takahashi, Takahiro Suzuki, Ryuji Ishikawa, ...
Visit DOCOMO Booth, PoC Demo Zone