Let Me Contain That For You
Victor Marmol ([email protected])
Rohit Jnagal ([email protected])
LPC 2013
Google Confidential and Proprietary

Containers @ Google
● Early users: Scaling process management and isolation.
● What: Linux cgroups + user-space policies and monitoring.
● Everywhere: SaaS, PaaS, IaaS; Private and Public clouds.
● Containerizing shared machines
  ○ Asymmetric workloads: Latency, bandwidth, and priority
  ○ Asymmetric Isolation
  ○ High churn
● Goals:
  ○ Performance guarantees.
  ○ High utilization across resources.
  ○ Shared resources.
  ○ Overcommitment: Invisible workload from reclaimed resources.
  ○ Near zero overhead.
● Other use cases: ChromeOS et al

A Shared Google Machine
[Diagram labels: I/O:CPU:Mem Sensitive; Front End Job; Back End Job; Allocation; BACKGROUND; System Daemons; Batch workload; TASKS; Soaker workload]

Resource Isolation
● Quality of service
  ○ Bandwidth - Fair share, progress guarantees, availability.
  ○ Latency - wakeup, allocation, access times
  ○ Priority - Order of importance.
  ○ Performance: Microarchitecture interference (CPI2); Locality
● Solution:
  ○ Scheduling a good mix.
  ○ Hierarchical resource management for effective sharing.
  ○ Maximize utilization across all dimensions.
  ○ Cgroup-aware tasks:
    ■ User subcontainers [eg. Query management]
    ■ User schedulers.
    ■ Self-correcting tasks: Notifications

Scalability
● Churn
  ○ 1 Creation/Deletion per 10 seconds
● Per Container
  ○ Read: O(10) cgroup-based stats per second
  ○ Write: O(1) cgroup-based param per second
● Per Machine
  ○ O(100) containers
  ○ Looks to grow dramatically
● Overall
  ○ Read: 1000’s per second
  ○ Write: 100’s per second
● Users can do a lot more.
● Precise accounting for chargeback
● Monitoring built in at multiple layers
● Extremely low overhead

Let Me Contain That For You
● Revised container management
  ○ Separate cgroup abstraction from policies.
  ○ Configuring cgroups with an intent-based resource specification.
● Built for scalability and parallel access.
● Also includes extra kernel patches for:
  ○ Improving resource isolation.
  ○ Providing tighter performance guarantees.
  ○ Precise accounting in face of sharing.
  ○ Cap for global resources.
● Allow users to create subcontainers with restrictions.
● Open-source: Sharing use-cases, problems, and benchmarks.
● Implement policies in a higher layer:
  ○ Continuous monitoring and fine-tuning.
  ○ No critical loops [Remember LPC2011?]
  ○ Machine-level utilization and isolation management.
  ○ Isolated from system APIs.

Hierarchical Sharing
An allocation A1 with two tasks T1 and T2:
  /dev/cgroup/cpu/A1 [2048]
    T1 [1536]
    T2 [512]
  /dev/cgroup/mem/A1 [4G]
    T1 [2G]
    T2 [3G]
Task running in an allocation sharing resources with co-located siblings.

Managing priority across resources
  Block I/O: Default [0.1]; T1 [0.1]; T2 [0.8]; T3 [0.1]
  Cpu: Default [2]; T1 [512]; T2 [1024]; T3 [256]
  Memory: T1 [1G]; T2 [2G]; T3 [1G]
[Legend: Cgroups for low-priority batch tasks; Cgroups for a latency sensitive task]

Managing priority across resources
  Block I/O: Default [0.1]; T1 [0.8], with subcontainers T1 [0.3] and T1 [PRIO] [0.5]; T2 [0.1]
  Cpu: Default [2]; T1 [2048]; T2 [1024]
  Memory: T1 [4G]; T2 [2G]
[Legend: Cgroups for a high I/O priority latency sensitive task; Cgroups for a low priority task]
A task may require multiple containers for the same resource to balance its workload priorities. I/O server T1 uses two subcontainers to differentiate incoming I/O requests and moves threads to the right subcontainer.
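
As a rough illustration of the Hierarchical Sharing example above, the sketch below works through the numbers for allocation A1. It assumes the bracketed cpu values are proportional weights in the style of cgroup `cpu.shares` (siblings split CPU in proportion to their weights only when all of them are runnable and contending), and that the bracketed mem values are memory limits in GiB. Neither interpretation is stated on the slide, and the helper names are hypothetical, not part of any API from the talk.

```python
def sibling_cpu_fractions(shares):
    """Split CPU among contending siblings in proportion to their weights.

    Assumption: every sibling is runnable and competing for CPU; an idle
    sibling's portion would be redistributed to the others.
    """
    total = sum(shares.values())
    return {name: weight / total for name, weight in shares.items()}


def memory_overcommit_ratio(parent_limit_gib, child_limits_gib):
    """Ratio of the children's summed limits to the parent's limit.

    A value above 1.0 means the children cannot all reach their own
    limits at once (assuming the parent's limit is enforced on them).
    """
    return sum(child_limits_gib.values()) / parent_limit_gib


# Numbers from the A1 example: cpu A1 [2048] -> T1 [1536], T2 [512];
# mem A1 [4G] -> T1 [2G], T2 [3G].
cpu = sibling_cpu_fractions({"T1": 1536, "T2": 512})
ratio = memory_overcommit_ratio(4, {"T1": 2, "T2": 3})

print(cpu)    # T1: 0.75 and T2: 0.25 of A1's CPU under contention
print(ratio)  # 1.25: the task limits together exceed A1's 4G
```

Under these assumptions, T1 and T2 split A1's CPU 3:1 when both are busy, and the two memory limits together exceed the allocation's limit by 25%, so the siblings share headroom inside A1 rather than each holding a fixed reservation.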

Splitting hierarchies for performance
  Block I/O: Default [0.1]; T1 [0.8], with subcontainers T1 [0.5|P] and T1 [0.3]; T2 [0.1]; T3 [0.1]
  Cpu: Default [2]; T1 [2048]; T2 [1024]; T3 [1024]
  Memory: T1 [4G]; T2 [2G]; T3 [2G]
[Legend: Cpu, Memory and I/O sensitive task; Cpu & Memory sensitive task with low I/O priority; Low priority batch task]
Splitting hierarchies reduces stranded resources and improves performance for highly sensitive tasks.

User Subcontainers
[Diagram labels: App Engine Task; Protected server app; Server; Subcontainers with tailored spec and priority; Instances; OOM; Instance1; Instance2; Instance3]
App Engine uses on-demand container creation: fair sharing, notifications, and isolation of misbehaving apps

Takeaways
● Cgroups support goes beyond containerized VMs.
● Sharing and overcommitment is a key to higher utilization.
● Managing each resource separately helps fine-tune utilization and performance.
● More power to users means better flexibility and scalability.

Come find us for chat, discussions, BoF, and drinks.
Or virtually: [email protected] [email protected]