CLOUD COMPUTING AND DATA CENTER NETWORKING
Zilong Ye, Ph.D.
[email protected]

WHAT IS CLOUD COMPUTING?
- Cloud computing is a general term for a class of network-based computing that takes place over the Internet, essentially a step on from grid computing.
- A cloud is a collection of integrated and networked hardware, software, and Internet infrastructure (called a platform) that uses the Internet for communication and transport, providing hardware, software, and networking services to clients.
- These platforms hide the complexity and details of the underlying infrastructure from users and applications by providing a very simple graphical interface or API (Application Programming Interface).

WHAT IS CLOUD COMPUTING?
- In addition, the platform provides on-demand services that are always on: anywhere, anytime, and any place.
- Pay for use and as needed; elastically scale up and down in capacity and functionality.
- The hardware and software services are available to the general public, enterprises, corporations, and business markets.

CLOUD SUMMARY
- Cloud computing is an umbrella term used to refer to Internet-based development and services.
- A number of characteristics define cloud data, applications, services, and infrastructure:
  - Remotely hosted: services or data are hosted on remote infrastructure.
  - Ubiquitous: services or data are available from anywhere.
  - Commodified: the result is a utility computing model similar to that of traditional utilities such as gas and electricity: you pay for what you use.

CLOUD ARCHITECTURE
[Figure: cloud architecture diagram]

CLOUD SERVICE MODELS
- Software as a Service (SaaS), e.g., Salesforce CRM, LotusLive
- Platform as a Service (PaaS), e.g., Google App Engine
- Infrastructure as a Service (IaaS)
Adapted from: "Effectively and Securely Using the Cloud Computing Paradigm" by Peter Mell and Tim Grance.

BASIC CLOUD CHARACTERISTICS
- The "no need to know": users need not know the underlying details of the infrastructure; applications interface with the infrastructure via the APIs.
- The "flexibility and elasticity": these systems scale up and down at will, utilizing resources of all kinds (CPU, storage, server capacity, load balancing, and databases).
- The "pay as much as used and needed" type of utility computing, and the "always on, anywhere, and any place" type of network-based computing.

BASIC CLOUD CHARACTERISTICS
- Clouds are transparent to users and applications; they can be built in multiple ways: branded products, proprietary or open-source software, hardware or software, or just off-the-shelf PCs.
- In general, they are built on clusters of PC servers and off-the-shelf components, plus open-source software, combined with in-house applications and/or system software.

CLOUD COMPUTING CHARACTERISTICS
Common characteristics:
- Massive scale
- Resilient computing
- Homogeneity
- Geographic distribution
- Virtualization
- Service orientation
- Low-cost software
- Advanced security
Essential characteristics:
- On-demand self-service
- Broad network access
- Rapid elasticity
- Resource pooling
- Measured service

VIRTUALIZATION
- Virtual workspaces: an abstraction of an execution environment that can be made dynamically available to authorized clients by using well-defined protocols; resource sharing (e.g., CPU and memory shares); software configuration (e.g., OS, provided services).
- Implemented on virtual machines (VMs): a VM is an abstraction of a physical host machine; the hypervisor intercepts and emulates instructions from VMs and allows management of VMs; examples include VMware and Xen.
- Provides an infrastructure API: plug-ins to hardware/support structures.
[Figure: virtualized stack: applications on guest OSes, on a hypervisor, on the hardware]

VIRTUAL MACHINES
[Figure: applications on guest OSes (Linux, NetBSD, Windows), each inside its own VM, on a Virtual Machine Monitor (VMM)/hypervisor over the hardware; example hypervisors: Xen, VMware, UML, Denali]

VIRTUALIZATION IN GENERAL
Advantages of virtual machines:
- Run operating systems where the physical hardware is unavailable
- Easier to create new machines, back up machines, etc.
- Software testing using "clean" installs of operating systems and software
- Emulate more machines than are physically available
- Timeshare lightly loaded systems on one host
- Debug problems (suspend and resume the problem machine)
- Easy migration of virtual machines (with or without a shutdown)
- Run legacy systems!
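To make the hypervisor-management point concrete, here is a minimal sketch of driving a hypervisor through an infrastructure API. It assumes (beyond what the slides state) that the libvirt Python bindings are installed and a local QEMU/KVM hypervisor is running; it connects and lists the VMs under management.

```python
# Sketch: inspecting VMs through libvirt (assumes the libvirt-python
# package is installed and a local QEMU/KVM hypervisor is running).
import libvirt

conn = libvirt.open("qemu:///system")      # connect to the local hypervisor
try:
    for dom in conn.listAllDomains():      # every defined VM, running or not
        # info() returns [state, max memory KiB, memory KiB, vCPUs, CPU time ns]
        _state, _max_kib, mem_kib, vcpus, _cpu_ns = dom.info()
        status = "running" if dom.isActive() else "stopped"
        print(f"{dom.name():20s} {status:8s} vcpus={vcpus} mem={mem_kib // 1024} MiB")
finally:
    conn.close()
```

The same API also creates, suspends, resumes, and migrates VMs, which is the control surface a cloud platform automates at scale.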
WHAT IS HADOOP?
- At Google, MapReduce operations are run on a special file system called the Google File System (GFS) that is highly optimized for this purpose. GFS is not open source.
- Doug Cutting and others at Yahoo! reverse-engineered GFS and called their version the Hadoop Distributed File System (HDFS).
- The software framework that supports HDFS, MapReduce, and other related entities is called the Hadoop project, or simply Hadoop. It is open source and distributed by Apache.

FAULT TOLERANCE
- Failure is the norm rather than the exception.
- An HDFS instance may consist of thousands of server machines, each storing part of the file system's data.
- Since there is a huge number of components, and each component has a non-trivial probability of failure, some component is always non-functional.
- Detection of faults and quick, automatic recovery from them is a core architectural goal of HDFS.

HADOOP DISTRIBUTED FILE SYSTEM
[Figure: a client application on a local file system (block size about 2 KB) talks to the HDFS server (master/name node); HDFS uses a 128 MB block size, and blocks are replicated]

HDFS ARCHITECTURE
[Figure: the NameNode holds the metadata (file name and replica count, e.g., /home/foo/data, 6); clients send metadata ops to the NameNode and block read/write ops directly to DataNodes; blocks are replicated across DataNodes on different racks (Rack 1, Rack 2)]

MAPREDUCE
- MapReduce is a programming model Google has used successfully to process its big-data sets (on the order of 20 petabytes per day).
- A map function extracts some intelligence from raw data.
- A reduce function aggregates, according to some guide, the data output by the map.
- Users specify the computation in terms of a map and a reduce function.
- The underlying runtime system automatically parallelizes the computation across large-scale clusters of machines.
- The underlying system also handles machine failures, efficient communication, and performance issues.
[Figure: large-scale data is split; map tasks parse the splits and emit <key, value> pairs (e.g., <word, 1>); pairs are partitioned by hash; reducers (say, Count) produce output partitions P-0000, P-0001, P-0002 with counts]
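A minimal word-count sketch of the model follows (illustrative only; a real Hadoop job would use the Hadoop MapReduce API, and the framework would run map and reduce tasks in parallel across the cluster and handle the shuffle): the map emits <word, 1> pairs, the runtime groups pairs by key, and the reduce aggregates each group.

```python
# Word count in the MapReduce style; the "runtime" here is a serial loop.
from collections import defaultdict

def map_fn(document):
    # Map: extract intelligence from raw data, emitting <key, value> pairs.
    for word in document.split():
        yield word.lower(), 1

def reduce_fn(key, values):
    # Reduce: aggregate all values emitted for one key.
    return key, sum(values)

def run_job(documents):
    groups = defaultdict(list)          # the shuffle: group pairs by key
    for doc in documents:               # map phase
        for key, value in map_fn(doc):
            groups[key].append(value)
    return dict(reduce_fn(k, v) for k, v in groups.items())  # reduce phase

print(run_job(["the cloud", "the data center"]))
# {'the': 2, 'cloud': 1, 'data': 1, 'center': 1}
```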
CLOUD COMPUTING RESEARCH TOPICS
- Virtual machine placement:
  - One-dimensional VM placement
  - Multi-dimensional VM placement
  - VM placement in a single DC
  - VM placement across multiple DCs
- Virtual machine live migration
- Availability-aware virtual machine placement

VIRTUAL MACHINE PLACEMENT
- A single physical machine can host multiple VMs as long as the capacity of the physical machine is not exceeded.
- VM placement: determine how to allocate VMs to physical machines.
- Objective: minimize the number of physical machines used.
- Constraints: physical machine capacity, other QoS requirements, SLA requirements.
- The one-dimensional VM placement problem is similar to the bin-packing problem.

BIN PACKING (1-D)
- Bins have capacity 1.
- Items to be packed: 0.5, 0.7, 0.5, 0.2, 0.4, 0.2, 0.5, 0.1, 0.6.
- An optimal packing uses N0 = 4 bins, e.g., {0.5, 0.5}, {0.7, 0.2, 0.1}, {0.6, 0.4}, {0.5, 0.2}.

NEXT FIT PACKING ALGORITHM
- Next fit keeps only the current bin open: if the next item does not fit, it closes that bin and opens a new one.
- On the example items, next fit uses N = 6 bins: {0.5}, {0.7}, {0.5, 0.2}, {0.4, 0.2}, {0.5, 0.1}, {0.6}.

FIRST FIT PACKING ALGORITHM
- First fit places each item into the first open bin that still has room, opening a new bin only if no existing bin fits.
- On the example items, first fit uses N = 5 bins: {0.5, 0.5}, {0.7, 0.2, 0.1}, {0.4, 0.2}, {0.5}, {0.6}.

OTHER APPROACHES
- Best-fit bin packing
- Load-balanced bin packing
- First fit with decreasing demand (sort the items in decreasing order, then run first fit); see the sketch below.
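A minimal sketch of three of these heuristics (function names are mine), run on the item list from the slides; it reproduces the bin counts above: next fit uses 6 bins, first fit 5, and first fit with decreasing demand finds the optimal 4.

```python
# Bin-packing heuristics on capacity-1 bins.
EPS = 1e-9  # tolerance for floating-point sums

def next_fit(items, cap=1.0):
    # Keep one bin open; start a new bin whenever the item does not fit.
    bins = []
    for x in items:
        if bins and sum(bins[-1]) + x <= cap + EPS:
            bins[-1].append(x)
        else:
            bins.append([x])
    return bins

def first_fit(items, cap=1.0):
    # Put each item in the first bin with enough residual capacity.
    bins = []
    for x in items:
        for b in bins:
            if sum(b) + x <= cap + EPS:
                b.append(x)
                break
        else:                       # no open bin fits: open a new one
            bins.append([x])
    return bins

def first_fit_decreasing(items, cap=1.0):
    # First fit after sorting the demands in decreasing order.
    return first_fit(sorted(items, reverse=True), cap)

items = [0.5, 0.7, 0.5, 0.2, 0.4, 0.2, 0.5, 0.1, 0.6]
print(len(next_fit(items)))              # 6
print(len(first_fit(items)))             # 5
print(len(first_fit_decreasing(items)))  # 4 (optimal for this instance)
```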
MULTI-DIMENSIONAL VM PLACEMENT
- Not only CPU capacity is considered: memory and storage capacity are considered as well, so the problem becomes multi-dimensional bin packing.
- Possible solutions:
  - Map each VM to a physical machine that can satisfy all of its requirements.
  - Prefer the physical machine with the highest remaining capacity in terms of the average over all the requirement dimensions.
- A sketch of the multi-dimensional capacity check appears at the end of this section.

VM PLACEMENT IN MULTI-DATACENTERS
- Considers not only the VM placement but also the network consumption between different VMs.
- Similar to the virtual infrastructure mapping problem.
[Figure: a virtual request graph (numbered VMs with compute demands and inter-VM bandwidth demands, e.g., 10G to 60G) is mapped onto a physical substrate graph (servers a to e and links with capacities, e.g., 25G to 140G)]

VM MIGRATION
- A dynamic VM placement problem.
- Why VM migration? To recover from failures, or to migrate VMs to save energy.
- Cons: additional network consumption and overhead (the new VM must be initiated first and receive all the replicated data from the old VM before the old VM shuts down).
- Objective: minimize disruption time and SLA violations.
- Questions: when to migrate? how to migrate?

OTHER DYNAMIC VM PLACEMENT PROBLEMS
- VM splitting to increase the utilization of physical machines:
  - Splitting adds computing overhead, e.g., a VM with a computing demand of 100 units may be split into two VMs of 55 units each.
  - Splitting also adds network load between the VMs that are split from the original one.
- The VMs' computing demands change over time (so best fit may not stay the best).
- New VMs may join the application over time, and they may need to be provisioned close to where the original VMs are located.

RESILIENCY IN VM PLACEMENT
- One working VM typically has around 3 to 5 backup VMs.
- Data are replicated from the working VM to the backup VMs.
- The data flow can follow unicast or multicast.
- How to choose the multicast routing so as to save bandwidth?
- How to make the data flow reliable against failures?
- Dynamic VM resiliency enhancement, considering the application state and the mean time to repair.
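As a closing sketch (hypothetical capacities, demands, and function names), the code below combines the multi-dimensional capacity check referenced earlier with an anti-affinity rule motivated by resiliency: a working VM and its backup replicas are never placed on the same physical machine.

```python
# First-fit placement with multi-dimensional (CPU, memory, storage) checks
# and anti-affinity: at most one replica of a VM per physical machine.
# All capacities and demands below are hypothetical.

def fits(used, cap, demand):
    # A VM fits only if every dimension has enough residual capacity.
    return all(u + d <= c for u, d, c in zip(used, demand, cap))

def place_replicas(hosts, demand, n_replicas):
    # Scan hosts once, so each host receives at most one replica.
    chosen = []
    for i, (used, cap) in enumerate(hosts):
        if len(chosen) == n_replicas:
            break
        if fits(used, cap, demand):
            for dim, d in enumerate(demand):   # commit the resources
                used[dim] += d
            chosen.append(i)
    return chosen if len(chosen) == n_replicas else None  # None = reject

# (used, capacity) per host; dimensions are (vCPUs, memory GB, disk GB).
hosts = [([0, 0, 0], [16, 64, 500]),
         ([0, 0, 0], [16, 64, 500]),
         ([0, 0, 0], [8, 32, 250])]
# One working VM plus two backups, each needing 4 vCPUs, 16 GB, 100 GB.
print(place_replicas(hosts, demand=[4, 16, 100], n_replicas=3))  # [0, 1, 2]
```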