Nova Austin Summit Update

Nova Austin Design Summit Update
Yingxin Cheng, Nova Developer, Intel OTC
Alex Xu, Nova Core, Intel OTC
Summit schedule
Date-time | Topic | Decisions
09:00 - 10:30, Wed, April 27 | Scheduler and resource tracking evolution | https://etherpad.openstack.org/p/newton-nova-scheduler
11:00 - 11:40, Wed, April 27 | Neutron cross-project | https://etherpad.openstack.org/p/newton-nova-neutron
11:50 - 12:30, Wed, April 27 | Performance VMs CI and technical debt | https://etherpad.openstack.org/p/newton-nova-performance-vms
14:40 - 16:10, Wed, April 27 | Cells v2 | https://etherpad.openstack.org/p/newton-nova-cells
16:30 - 17:10, Wed, April 27 | Low-hanging fruit / getting started in Nova | https://etherpad.openstack.org/p/newton-nova-getting-started
17:20 - 18:00, Wed, April 27 | Live Migration | https://etherpad.openstack.org/p/newton-nova-live-migration
09:00 - 10:30, Thu, April 28 | API discoverability and policy | https://etherpad.openstack.org/p/newton-nova-api
11:00 - 11:40, Thu, April 28 | Cinder cross-project | https://etherpad.openstack.org/p/newton-nova-cinder
11:50 - 12:30, Thu, April 28 | Feature classification and testing | https://etherpad.openstack.org/p/newton-nova-feature-classification
14:20 - 16:50, Thu, April 28 | Glance/Ironic/Keystone cross-project | https://etherpad.openstack.org/p/newton-nova-glance, https://etherpad.openstack.org/p/newton-nova-ironic, https://etherpad.openstack.org/p/newton-nova-keystone
17:00 - 17:40, Thu, April 28 | Priorities and schedule for Newton | https://etherpad.openstack.org/p/newton-nova-summit-priorities
Wed-Thu, April 27-28 | Nova: Unconference #1, #2 | https://etherpad.openstack.org/p/newton-nova-summit-unconference
All day, Fri, April 29 | Contributors Meetup | https://etherpad.openstack.org/p/newton-nova-meetup
Newton Priorities
Item | Primary Contacts | Area
Cells V2 | Andrew Laski | Cells
Scheduler | Jay Pipes | Scheduler
API Improvements | Andrew Laski, Sean Dague | API
os-vif Integration | Daniel Berrange | Refactor
Libvirt Storage Pools | Matthew Booth | Live Migration
Get Me a Network | Matt Riedemann | Cross Project (Neutron)
Glance v2 Integration | Mike Fedosin | Cross Project (Glance)
Main content
• Cells V2*
• Scheduler**
• API
• Live migration
• Versioned notifications
• Cross-project: Neutron, Cinder, Ironic, Glance
• Unconference
• Other topics
Cells v2
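Cells background
• Bottlenecks:
Number of compute nodes (400)
Scheduler capacity (1000)
Database, message queue
Neutron, Cinder
• Existing scaling approaches: multiple OpenStack deployments, multi-region deployments
Functional limitations
• Cells:
Let OpenStack scale easily, tolerate failures, and be partitioned without sacrificing functionality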
Cells v1
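• Data management
Periodic synchronization between cells, with redundant data
• Message handling
Messages are relayed between cells; the handling is transparent to Nova
• Scheduling: two-level scheduling
• Deployment and usage
Inflexible to deploy, complex to use, with functional limitations
Hard to upgrade, limited scalability
• Experimental status: testing, development, maintenance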
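Cells v2
• Data management: real-time updates, no redundant data
• Message handling: messages are switched to the target cell's queue and forwarded
• Scheduling: global scheduling, full functionality
• Deployment and usage:
Cells are the only deployment mode
No functional limitations, one-step upgrade
• From a patched-on mode to official built-in support
[Figure: Cells v2 architecture. API Cell: nova api, nova api DB (minimal global data and links), nova scheduler, nova conductor. Cell0: nova DB. Cell Foo and Cell Bar: each with nova DB, nova conductor, MessageQueue, nova compute.]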
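The "minimal global data and links" kept in the API database can be pictured as a lookup from an instance to the cell that owns it. The following is a minimal sketch only: the table contents, connection strings, and the names CellMapping and target_cell are hypothetical and are not taken from the Nova schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CellMapping:
    """Link from the API database to one cell (all values are illustrative)."""
    name: str
    database_connection: str  # hypothetical URL of the cell's nova DB
    transport_url: str        # hypothetical URL of the cell's message queue


# Hypothetical API database contents: known cells and instance -> cell links.
CELLS = {
    "cell0": CellMapping("cell0", "mysql://api-host/nova_cell0", "none:///"),
    "foo": CellMapping("foo", "mysql://foo-db/nova", "rabbit://foo-mq/"),
    "bar": CellMapping("bar", "mysql://bar-db/nova", "rabbit://bar-mq/"),
}
INSTANCE_TO_CELL = {"instance-1": "foo", "instance-2": "bar"}


def target_cell(instance_id: str) -> CellMapping:
    """Return the cell whose database and message queue serve this instance."""
    try:
        return CELLS[INSTANCE_TO_CELL[instance_id]]
    except KeyError:
        raise LookupError(f"no cell mapping for {instance_id}") from None


if __name__ == "__main__":
    cell = target_cell("instance-1")
    # The API layer would now talk to this cell's DB and message queue.
    print(cell.name, cell.database_connection, cell.transport_url)
```

Keeping only the links globally is what lets per-instance data live in exactly one cell database, which matches the "no redundant data" point above.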
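Cells v2 progress and future
• Mitaka
API database
Single cell, database switching, cells v2 upgrade tool
• Newton
Data migration: flavor, aggregate, quota, keypair
Message queue switching, scheduler upgrade, upgrade tooling
Cell0 implementation, multi-cell support
• Ocata+: cell-aware scheduling, cells v1-to-v2 upgrade tool
• Challenges: database failure handling, cross-service support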
Scheduler
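Scheduler: current status and problems
• nova-scheduler is tightly coupled to Nova
It can only schedule Nova-related resources
It only accepts scheduling requests from Nova
It is currently semi-independent
• Performance problems: see "Dive into nova scheduler performance"
• Extensibility: only the filter scheduler and the caching scheduler are currently supported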
Towards generic scheduler
• Handling shared resources: shared storage, routed networks, FPGA
• Generic handling of quantitative resources: resource-provider, inventories, allocations, generic resource pool
• Generic handling of qualitative resources: resource-providers-standardize-extra-specs (sys, dev, hypervisor)
• Splitting out the scheduler API/endpoint: scheduler-rest-api, scheduler-separate-endgoal
• A standalone scheduler service that no longer depends on Nova: generic scheduler
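The quantitative model (resource providers, inventories, allocations) can be illustrated with a small data structure. This is a sketch under stated assumptions: the class names, the resource class strings, and the capacity rule (total minus reserved, scaled by an overcommit ratio) are illustrative choices, not code from the specs listed above.

```python
from dataclasses import dataclass, field


@dataclass
class Inventory:
    """Quantity of one resource class that a provider offers."""
    total: int
    reserved: int = 0
    allocation_ratio: float = 1.0

    @property
    def capacity(self) -> float:
        # Assumed rule: usable amount scaled by the overcommit ratio.
        return (self.total - self.reserved) * self.allocation_ratio


@dataclass
class ResourceProvider:
    """A compute node or a shared pool that provides resources."""
    name: str
    inventories: dict = field(default_factory=dict)  # resource class -> Inventory
    allocations: dict = field(default_factory=dict)  # consumer -> {class: amount}

    def used(self, resource_class: str) -> int:
        return sum(a.get(resource_class, 0) for a in self.allocations.values())

    def can_fit(self, request: dict) -> bool:
        return all(
            rc in self.inventories
            and self.used(rc) + amount <= self.inventories[rc].capacity
            for rc, amount in request.items()
        )

    def allocate(self, consumer: str, request: dict) -> None:
        if not self.can_fit(request):
            raise ValueError(f"{self.name} cannot fit {request}")
        self.allocations[consumer] = dict(request)


if __name__ == "__main__":
    node = ResourceProvider("compute-1", {
        "VCPU": Inventory(total=8, allocation_ratio=2.0),
        "MEMORY_MB": Inventory(total=16384, reserved=512),
    })
    shared_disk = ResourceProvider("nfs-pool", {"DISK_GB": Inventory(total=1000)})
    node.allocate("vm-1", {"VCPU": 4, "MEMORY_MB": 4096})
    shared_disk.allocate("vm-1", {"DISK_GB": 40})
    print(node.used("VCPU"), shared_disk.used("DISK_GB"))
```

Because a shared storage pool is just another provider with its own inventory, the same bookkeeping covers both per-node and shared resources.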
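Scheduler performance improvements
[Figure: message flow of a request: REQ, API, Conductor, Scheduler, Conductor, Compute, VM]
• Scheduler profiling: the scheduler refreshes all node information from the database
• Bottleneck: database, 90%+
• Solution 0: caching-scheduler
• Solution 0.1: multiple filter schedulers
• Solution 1: resource-provider scheduler
• Solution 2: shared-state scheduler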
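The difference between refreshing all node information from the database and the caching approach can be sketched by counting database reads. Everything here is hypothetical (FakeDatabase, the class names, the 60-second maximum cache age), and the sketch assumes the refresh happens once per scheduling request; it does not reproduce Nova's scheduler code.

```python
import time


class FakeDatabase:
    """Stand-in for the Nova database; counts full host-state reads."""

    def __init__(self, hosts):
        self._hosts = hosts
        self.reads = 0

    def load_all_hosts(self):
        self.reads += 1
        return {name: dict(state) for name, state in self._hosts.items()}


class RefreshingScheduler:
    """Reloads every host from the database on each request."""

    def __init__(self, db):
        self.db = db

    def host_states(self):
        return self.db.load_all_hosts()

    def consume(self, hosts, host, ram_mb):
        pass  # nothing to update: the next request re-reads the database

    def select_host(self, ram_mb):
        hosts = self.host_states()
        fitting = [n for n, s in hosts.items() if s["free_ram_mb"] >= ram_mb]
        if not fitting:
            raise RuntimeError("no valid host")
        chosen = max(fitting, key=lambda n: hosts[n]["free_ram_mb"])
        self.consume(hosts, chosen, ram_mb)
        return chosen


class CachingScheduler(RefreshingScheduler):
    """Keeps host states in memory and reloads them only when they are stale."""

    def __init__(self, db, max_age_s=60.0, clock=time.monotonic):
        super().__init__(db)
        self.max_age_s = max_age_s
        self.clock = clock
        self._cache = None
        self._loaded_at = 0.0

    def host_states(self):
        now = self.clock()
        if self._cache is None or now - self._loaded_at > self.max_age_s:
            self._cache = self.db.load_all_hosts()
            self._loaded_at = now
        return self._cache

    def consume(self, hosts, host, ram_mb):
        # Record our own placement so the cached view stays roughly accurate.
        hosts[host]["free_ram_mb"] -= ram_mb


if __name__ == "__main__":
    data = {"a": {"free_ram_mb": 4096}, "b": {"free_ram_mb": 2048}}
    for cls in (RefreshingScheduler, CachingScheduler):
        db = FakeDatabase(data)
        scheduler = cls(db)
        for _ in range(3):
            scheduler.select_host(1024)
        print(cls.__name__, "database reads:", db.reads)
```

The trade-off is staleness: a cached view can diverge from the database until the next reload, which is one reason the slide lists further solutions beyond caching.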
Scheduler feature enhancements
• Filtering and weighing
• Congress: policy engine
• Watcher: policy-based scheduler
• Machine learning
• Network/storage infrastructure awareness
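Filtering and weighing can be sketched in a few lines: filters discard hosts that cannot serve the request, then weighers rank the survivors. The host fields, filter names, and the plain sum of weights are assumptions made for illustration; a real scheduler would also normalize and scale the weights.

```python
def ram_filter(host, request):
    """Keep hosts with enough free memory."""
    return host["free_ram_mb"] >= request["ram_mb"]


def disk_filter(host, request):
    """Keep hosts with enough free disk."""
    return host["free_disk_gb"] >= request["disk_gb"]


def ram_weigher(host):
    """Prefer hosts with more free memory (spreading behavior)."""
    return host["free_ram_mb"]


def schedule(hosts, request, filters, weighers):
    """Return the best host for the request, or None if no host passes."""
    candidates = [h for h in hosts if all(f(h, request) for f in filters)]
    if not candidates:
        return None
    return max(candidates, key=lambda h: sum(w(h) for w in weighers))


if __name__ == "__main__":
    hosts = [
        {"name": "a", "free_ram_mb": 2048, "free_disk_gb": 10},
        {"name": "b", "free_ram_mb": 8192, "free_disk_gb": 5},
        {"name": "c", "free_ram_mb": 4096, "free_disk_gb": 100},
    ]
    request = {"ram_mb": 1024, "disk_gb": 20}
    best = schedule(hosts, request, [ram_filter, disk_filter], [ram_weigher])
    print(best["name"] if best else "no valid host")
```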
API
Nova REST API: current status
• /v1.1: deprecated and removed
• /v2.0: supported, legacy, frozen
deprecated in Mitaka, removed in Newton
• Third party API: supported, legacy
AWS EC2
GCE
• /v2.1: current since Liberty, v2.0 compatible
Nova REST API v2.1 features
• One Compute API (in progress): no extensions
• Compatibility & Evolution: microversion support
• Validation:
json-schema
Explicit exception handling
• policy.json
Nova REST API v2.1: microversion
HTTP request header:
• Not provided:
Act as the minimum supported version
• Provided:
Respond with the API at the specified microversion
Respond “406 Not Acceptable” if outside the range of supported microversions
• Latest:
Respond at the maximum microversion
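The three cases above can be sketched as a server-side negotiation function. This is an illustration, not Nova's implementation: the header name X-OpenStack-Nova-API-Version is the one used by Nova in this period, while the maximum version constant, the function name, and the 400 response for a malformed value are assumptions of this sketch.

```python
HEADER = "X-OpenStack-Nova-API-Version"
MIN_VERSION = (2, 1)
MAX_VERSION = (2, 25)  # illustrative maximum; the real value depends on the release


def negotiate(headers):
    """Return (http_status, version) for a request's microversion header.

    `headers` is a plain dict here, so the lookup is case-sensitive.
    """
    raw = headers.get(HEADER)
    if raw is None:
        # Not provided: act as the minimum supported version.
        return 200, MIN_VERSION
    raw = raw.strip()
    if raw.lower() == "latest":
        # Latest: respond at the maximum microversion.
        return 200, MAX_VERSION
    try:
        major, minor = (int(part) for part in raw.split("."))
    except ValueError:
        # Assumption of this sketch: malformed values are a client error.
        return 400, None
    if not MIN_VERSION <= (major, minor) <= MAX_VERSION:
        # Outside the supported range.
        return 406, None
    return 200, (major, minor)


if __name__ == "__main__":
    for value in (None, "latest", "2.10", "2.99", "not-a-version"):
        request_headers = {} if value is None else {HEADER: value}
        print(value, negotiate(request_headers))
```

Comparing versions as integer tuples avoids the string-comparison trap where "2.9" would sort after "2.10".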
Nova REST API v2.1: Newton plan
• Policy default in code
• Policy discovery from API
• API documentation within Nova project
• Deprecation:
v2.0 code
proxy API
API extension
Live Migration
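Live migration: current status
• Cold migration
• Live migration: pre-copy migration
Copy memory -> update dirty pages -> migrate the VM
Problem 1: memory is dirtied too fast relative to the network speed
Problem 2: migration time is unpredictable
Problem 3: added network load
Solutions:
Cancel the migration -> cold migration -> force completion (pause the VM) ->
extend the migration time -> auto-converge (throttles VM performance) -> post-copy migration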
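Problem 1 can be made concrete with a toy model of pre-copy: each round resends the pages dirtied during the previous round, so the migration converges only if the link is faster than the dirtying rate. The model, the function name, and all parameters are simplifying assumptions (constant rates, no compression), not a description of the hypervisor's algorithm.

```python
def precopy_rounds(memory_mb, bandwidth_mb_s, dirty_rate_mb_s,
                   max_downtime_s, max_rounds=30):
    """Simulate iterative pre-copy; return (converged, rounds, total_sent_mb)."""
    remaining = float(memory_mb)
    sent = 0.0
    for round_no in range(1, max_rounds + 1):
        # Final stop-and-copy is acceptable once the remaining data fits
        # into the allowed downtime.
        if remaining / bandwidth_mb_s <= max_downtime_s:
            return True, round_no - 1, sent + remaining
        copy_time = remaining / bandwidth_mb_s
        sent += remaining
        # Pages dirtied while copying must be sent again, capped by RAM size.
        remaining = min(float(memory_mb), dirty_rate_mb_s * copy_time)
    return False, max_rounds, sent


if __name__ == "__main__":
    # Link faster than the dirtying rate: the remaining set shrinks each round.
    print(precopy_rounds(4096, bandwidth_mb_s=1000, dirty_rate_mb_s=100,
                         max_downtime_s=0.5))
    # Dirtying rate above the link speed: the remaining set never shrinks.
    print(precopy_rounds(4096, bandwidth_mb_s=1000, dirty_rate_mb_s=1200,
                         max_downtime_s=0.5))
```

In the non-converging case the total data sent keeps growing with every round, which is the extra network load named as Problem 3.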
Live migration: post-copy
http://review.openstack.org/301509 Adds post-copy live migration support to Nova
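• Migrate the VM -> request memory pages on demand -> copy the remaining memory
• Advantages:
No dependence on the ratio of network speed to memory dirtying rate
Each memory page is transferred exactly once, so the amount of data transferred is fixed
The remaining data and the completion time are predictable
• Drawbacks:
Performance fluctuation, no fault tolerance
Functional limitations: RDMA migration / multithread compression / xbzrle compression / file-backed memory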
Live migration: automated decision making
• Define the availability requirements of a VM
tolerate_migration_risk = True
down_time_max = 10
cpu_throttling_max = 50
tolerate_suspend/force_complete = True
• Bill according to the availability level
• Combined pre-copy and post-copy technique
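One way such per-VM availability settings could drive an automated decision is sketched below. The escalation order, the units (seconds, percent), the reading of tolerate_migration_risk as "post-copy is allowed", and the names AvailabilityPolicy and next_action are all assumptions for illustration; the slides list only the setting names and example values.

```python
from dataclasses import dataclass


@dataclass
class AvailabilityPolicy:
    tolerate_migration_risk: bool = True  # assumed meaning: post-copy allowed
    down_time_max: int = 10               # assumed unit: seconds
    cpu_throttling_max: int = 50          # assumed unit: percent
    tolerate_force_complete: bool = True  # pausing the VM to finish is allowed


def next_action(policy, converging, current_throttle_pct, estimated_downtime_s):
    """Pick the next step for a running pre-copy migration (hypothetical order)."""
    if converging and estimated_downtime_s <= policy.down_time_max:
        return "continue"
    if current_throttle_pct < policy.cpu_throttling_max:
        return "auto-converge"
    if policy.tolerate_migration_risk:
        return "post-copy"
    if policy.tolerate_force_complete:
        return "force-complete"
    return "abort"


if __name__ == "__main__":
    strict = AvailabilityPolicy(tolerate_migration_risk=False,
                                cpu_throttling_max=0,
                                tolerate_force_complete=False)
    relaxed = AvailabilityPolicy()
    print(next_action(strict, False, 0, 30))
    print(next_action(relaxed, False, 50, 30))
```

A stricter policy leaves fewer escalation steps, which is what would justify billing it at a different rate.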
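Live migration: miscellaneous
• Cold and Live Migration Scheduling
The scheduler selects the destination automatically
The operator selects the destination and the scheduler checks the constraints
The operator forces the destination
• Live Migration orchestrated via conductor
Remove direct compute-to-compute communication
The migration is monitored and controlled by nova-conductor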
• PCI/NUMA claims
Versioned Notifications
• Real-time notifications pushed by Nova
• Public interface
• Envelope (oslo.messaging) + Payload (versioned)
• instance.update, instance.delete
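The "envelope plus versioned payload" structure can be sketched as nested dictionaries. The envelope keys follow the usual oslo.messaging notification fields and the payload keys follow the way versioned objects are commonly serialized, but treat every field name, the payload class name InstanceUpdatePayload, and the sample data as assumptions of this sketch rather than the exact wire format.

```python
import json
import uuid
from datetime import datetime, timezone


def versioned_payload(name, version, data):
    """Wrap payload data with an object name and version (assumed layout)."""
    return {
        "nova_object.namespace": "nova",
        "nova_object.name": name,
        "nova_object.version": version,
        "nova_object.data": data,
    }


def notification(event_type, publisher_id, payload, priority="INFO"):
    """Build the outer envelope around a versioned payload."""
    return {
        "message_id": str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "priority": priority,
        "publisher_id": publisher_id,
        "event_type": event_type,
        "payload": payload,
    }


if __name__ == "__main__":
    payload = versioned_payload(
        "InstanceUpdatePayload",  # hypothetical payload class name
        "1.0",
        {"uuid": "hypothetical-instance-uuid", "state": "active"},
    )
    message = notification("instance.update", "nova-compute:host1", payload)
    print(json.dumps(message, indent=2))
```

Carrying an explicit version inside the payload lets consumers detect schema changes instead of guessing from the fields present.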
Cross-project
Neutron, Cinder, Ironic, Glance
Cross-project
Neutron
• Neutron routed networks https://review.openstack.org/263898
“A routed network, in this context, is a physical network infrastructure that implements scaled
networks by routing instead of large L2 broadcast domains.”
• Get me a network https://review.openstack.org/283206
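Running “nova boot” automatically creates the network topology (network, subnet, router, and port)
• Deprecate nova-network https://review.openstack.org/310539
Simplifies the cells v2 implementation
No plan to split out nova-net
To be completed in Ocata at the earliest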
Cross-project
Contd.
• Cinder
Volume multi-attach
Hot standby for services
Application support: Oracle RAC, etc.
• Ironic
Reuse nova-scheduler
Refactor the networking code (nova, neutron)
• Glance
Glance v2 integration
Nova unconference
Nova unconference
• Scheduler debugging: https://review.openstack.org/306647
• Supporting ephemeral storage backed by ScaleIO: https://review.openstack.org/304715
• Injecting credentials when booting an instance: https://review.openstack.org/305455
• Scoped policies for server groups: https://review.openstack.org/247654
• Attach/detach filesystem shares created by Manila to Nova instances: https://review.openstack.org/310050
• Disabling local/ephemeral disk in flavors: https://review.openstack.org/267673
• Mix and match resource federation: https://blueprints.launchpad.net/nova/+spec/mix-and-match-resource-federation
• Direct download/upload of images from/to the RBD glance backend: https://review.openstack.org/308481
Other topics
How to start contributing code
Nova feature classification
Low hanging fruit
• Python3 integration
• Test:
mox -> mock
fake UUID usage
clean up random stacktraces
• Remove NovaObjectDictCompat
• Api-ref documentation
• Bug triage
• Full list with contacts: https://etherpad.openstack.org/p/nova-low-hanging-fruit
Nova feature classification
• Feature classification:
Incomplete
Experimental
Complete: API doc, admin doc, feature Tempest test, reliable, fully supported
Complete and required: Supported by all drivers
Deprecated
• Doc: http://docs.openstack.org/developer/nova/feature_classification.html
• Support matrix: http://docs.openstack.org/developer/nova/support-matrix.html
Q&A