OPEN DATA CENTER ALLIANCE℠ Master Usage Model: Compute Infrastructure as a Service REV 1.0

Open Data Center Alliance: Compute Infrastructure as a Service Rev 1.0

Table of Contents

Legal Notice
Acknowledgments
Terminology and Provenance
1.0 Executive Summary
2.0 Purpose
3.0 Taxonomy
    Table 3.1–Terms and Definitions
4.0 Defining CIaaS
    Figure 4.0.1 ODCA Conceptual Framework
  4.1 CIaaS Scope
    Figure 4.1.1 CIaaS in Context
  4.2 CIaaS Workloads
  4.3 Deployment Models
    Table 4.3.1 Cloud Models
  4.4 CIaaS Service Attributes
    Table 4.4.1 Service Tier Attributes
  4.5 General Capabilities
  4.6 CIaaS Key Performance Indicators
5.0 Interoperability
6.0 Business Drivers and Usage Scenarios
  6.1 Business Usage Scenarios
    Figure 6.1.1 Sample CIaaS Usage Scenarios
  6.2 Development/Test
    Table 6.2.1 Requirements for Development and Test
  6.3 Load Stress Test Environment
    Table 6.3.1 Load Stress Test
  6.4 Grid and High-Performance Computing
    Table 6.4.1 Requirements for Grid and High Performance Computing
  6.5 Standalone Traditional Production Environment
    Table 6.5.1 Standalone Cloud-Enabled Production
  6.6 Enterprise Traditional Production Environment
    Table 6.6.1 Enterprise Traditional Production
  6.7 Enterprise Cloud-Aware Production Environment
    Table 6.7.1 Enterprise Cloud-Aware Production
  6.8 Cloud Brokering and Federation
7.0 Service Attribute Details
  7.1 Functionality
  7.2 Availability
    7.2.1 Service Tier Summary: Availability
  7.3 Recoverability
    7.3.1 Service Tier Summary: Recoverability
  7.4 Security
    7.4.1 Service Tier Summary: Security
  7.5 Elasticity
    7.5.1 Service Tier Summary: Elasticity
  7.6 Manageability Services
    7.6.1 Service Tier Summary: Monitoring
    7.6.2 Service Tier Summary: Reported Sampling Interval
    7.6.3 Service Tier Summary: Number of Active Automated Tasks
    7.6.4 Service Tier Summary: Reporting
  7.7 Performance
    7.7.1 Performance SLA
    7.7.2 Performance Measuring
    7.7.3 Performance Reporting
    7.7.4 Performance Monitoring
    7.7.5 Performance Analysis
    7.7.6 Performance Definition and Interface
8.0 Service Interface and Reference Model
  8.1 ODCA Conceptual Architecture
    Figure 8.1.1 ODCA Conceptual Framework
  8.2 Basic Cloud Lifecycle
  8.3 Service Interface Requirements
  8.4 Specific Required Interfaces
    Table 8.4.1 Specific Required Interfaces
  8.5 Services Orchestration
    Figure 8.5.1 Interfaces
  8.6 Usage Scenario Example: Burst Capacity at a Specified SLA
9.0 Operations and Management
  9.1 Overview
  9.2 Motivations
  9.3 CIaaS Operations Usage Scenarios
    9.3.1 Usage Scenario 1 – Access and Control Configuration
    9.3.2 Usage Scenario 2 – Provisioning/Deprovisioning Capabilities
    9.3.3 Usage Scenario 3 – SLA or Service Fault Identification by Provider
    9.3.4 Usage Scenario 4 – SLA or Service Fault Identification by Subscriber
    9.3.5 Usage Scenario 5 – Service Change or Outage Notification
    9.3.6 Usage Scenario 6 – Service Monitoring
    9.3.7 Usage Scenario 7 – Subscriber Billing and Usage
  9.4 Operations and Management Service Tiering
    Table 9.4.1 Operations and Management Service Tiering
10.0 Technical Architecture
  10.1 Assumptions and Context
  10.2 Components
    Figure 10.2.1 CIaaS Architecture Components
  10.3 Compute Layer
    10.3.1 Service Tier Summary: Compute Instance Attributes
  10.4 Storage Layer
    10.4.1 Block Storage Requirements
      10.4.1.1 Service Tier Summary: Block Storage Attributes
    10.4.2 Object Storage Requirements
    10.4.3 Storage Fabric
      10.4.3.1 Service Tier Summary: Storage Fabric Attributes
  10.5 Network Fabrics
    10.5.1 Service Tier Summary: Network Fabric Attributes
  10.6 Management
    10.6.1 Service Tier Summary: Management
11.0 Security Considerations
  11.1 Security Requirements
  11.2 Implementation Guidelines
    11.2.1 Assumptions
    11.2.2 General Guidelines
    11.2.3 Service Tier Specific Guidelines
  11.3 Security Service Catalog
  11.4 Malware Protection
  11.5 Admission Control
  11.6 Security Audit and Governance
    11.6.1 Security Governance Usage Scenario
12.0 Commercial Considerations
13.0 Regulatory Considerations
14.0 RFP Requirements
15.0 Next Steps and Summary of Industry Actions Required
  15.1 Industry Actions Required
  15.2 Future CIaaS Requirements Development
16.0 References
  ODCA Usage Models
  Other Sources
  Endnotes

Legal Notice

Copyright © 2012 Open Data Center Alliance, Inc. ALL RIGHTS RESERVED.

This “Open Data Center Alliance℠ Master Usage Model: Compute Infrastructure as a Service” is proprietary to the Open Data Center Alliance, Inc.
NOTICE TO USERS WHO ARE NOT OPEN DATA CENTER ALLIANCE PARTICIPANTS: Non-Open Data Center Alliance Participants only have the right to review, and make reference to or cite, this document. Any such references or citations to this document must give the Open Data Center Alliance, Inc. full attribution and must acknowledge the Open Data Center Alliance, Inc.’s copyright in this document. Such users are not permitted to revise, alter, modify, make any derivatives of, or otherwise amend this document in any way.

NOTICE TO USERS WHO ARE OPEN DATA CENTER ALLIANCE PARTICIPANTS: Use of this document by Open Data Center Alliance Participants is subject to the Open Data Center Alliance’s bylaws and its other policies and procedures.

OPEN DATA CENTER ALLIANCE℠, ODCA℠, and the OPEN DATA CENTER ALLIANCE logo® are trade names, trademarks, service marks and logotypes (collectively “Marks”) owned by Open Data Center Alliance, Inc. and all rights are reserved therein. Unauthorized use is strictly prohibited.

This document and its contents are provided “AS IS” and are to be used subject to all of the limitations set forth herein. Users of this document should not reference any initial or recommended methodology, metric, requirements, or other criteria that may be contained in this document or in any other document distributed by the Alliance (“Initial Models”) in any way that implies the user and/or its products or services are in compliance with, or have undergone any testing or certification to demonstrate compliance with, any of these Initial Models.

Any proposals or recommendations contained in this document including, without limitation, the scope and content of any proposed methodology, metric, requirements, or other criteria do not mean the Alliance will necessarily be required in the future to develop any certification or compliance or testing programs to verify any future implementation or compliance with such proposals or recommendations.
This document does not grant any user of this document any rights to use any of the Alliance’s Marks. All other service marks, trademarks and trade names referenced herein are those of their respective owners.

Published November, 2012

Acknowledgments

ODCA would like to acknowledge the substantial contributions of content and prior art from the Enterprise Cloud Leadership Council of the TM Forum, CloudScaling and Atos.

Terminology and Provenance

Some of the content of this document has been sourced with permission from work product outside the ODCA. Every effort has been made to reconcile terminology and nomenclature. Where there is a conflict, however, ODCA terms take precedence.

1.0 Executive Summary

Given the broad range of cloud consumers and their compute infrastructure requirements, there is a wide spectrum of capabilities that service providers could offer to meet compute service needs, and ultimately to deliver excellent, cost-effective end user application usage experiences. Clearly it is not possible for service providers to meet all possible permutations of demand and capabilities, particularly for edge cases where lack of scale or volume limits service providers’ ability to achieve economies of scale.

To meet the compute infrastructure requirements of the broad range of service consumers, a common framework is required around which infrastructure as a service can be defined, provisioned, monitored and managed. A common set of principles, metrics and architectural frameworks can be defined, resulting in consistent capabilities, service levels and service attributes across multiple providers, while still allowing the individual providers to innovate and differentiate.
By 2014, the Open Data Center Alliance (ODCA) and its members would like to see a robust marketplace, with full coverage for all of the usage scenarios contemplated herein. Furthermore, most providers should offer service for at least half of the usage scenarios. This ODCA Master Usage Model: Compute Infrastructure as a Service (CIaaS) is intended to help bring this about by establishing a requirements framework for open, interoperable compute infrastructure services.

To date, the efforts of the ODCA have focused on the top concerns of service consumers and providers. The resulting original usage models focused on specific topics, such as measurement and identity management, among others. The purpose of this newest round of usage models is to continue to develop these specific focus topics and to provide a platform on which to bring the topics together in a more holistic manner. The master usage models will reference and support the previously published usage models.

This document serves a variety of audiences. Business decision makers looking for specific solutions and enterprise IT groups involved in planning, operations, and procurement will find this document useful. Solution providers and technology vendors will benefit from its content to better understand customer needs and tailor service and product offerings. Standards organizations will find the information helpful in defining end-user relevant and open standards.

2.0 Purpose

This document and its referenced supporting usage models describe the requirements for complete compute infrastructure as a service. There are aspects of this usage model where requirements are more stringent than those found in popular public clouds today. It is important to understand that this document specifies enterprise requirements, sufficient to displace incumbent enterprise data centers.
This common framework is required so that CIaaS services can be evaluated, acquired and disposed of by enterprises in a way that reflects the ODCA member firms’ vision of a robust and vibrant market by the end of 2014.

The IaaS area is quite broad and can be segmented into three different “as a service” offerings: compute, storage and network, each offered in a variety of usage models. While the storage and network service areas are essential to the overall cloud services model, the foundation of cloud computing is fundamentally based on “compute as a service” capabilities. Thus, this document addresses the compute portion of IaaS in greater depth than the storage and network areas. ODCA will address these other areas of IaaS in future work.

3.0 Taxonomy

Table 3.1–Terms and Definitions

Actor/Term | Definition
Cloud-Aware Application | Cloud-aware applications have been designed and built with the sole intent of running in a cloud environment.
Cloud Broker | A cloud broker is an entity that manages the use, performance, and delivery of cloud services and negotiates relationships between cloud providers and cloud subscribers. In general, a cloud broker can provide services in three categories: Service Intermediation, Service Aggregation, and Service Arbitrage. [1]
Cloud Federation | A concept of service aggregation characterized by interoperability features, addressing the economic problems of vendor lock-in and provider integration. [2]
Cloud Provider | An organization providing cloud services and charging cloud subscribers. A cloud provider provides services over the Internet. A cloud subscriber could be its own cloud provider, such as for private clouds.
Cloud Standards Body | An entity responsible for setting and maintaining the cloud orchestration standards contemplated in this usage model.
Cloud Subscriber | A person or organization that has been authenticated to a cloud and maintains a business relationship with a cloud provider.
Maintenance Window | A period of time designated in advance by the cloud provider, during which preventative maintenance is performed that could otherwise cause disruption of service. [3]
Recovery Consistency Objective (RCO) | RCO defines a measurement for the consistency of distributed business data after a disaster or other business continuity event. [6]
Recovery Point Objective (RPO) | The maximum tolerable period in which data might be lost from an IT service due to a major incident. [4]
Recovery Time Objective (RTO) | The duration of time and a service level within which a business process must be restored after a disruption in order to avoid unacceptable consequences. [5]
Traditional Application | A program or system that has not been specifically designed (or remediated) to transparently leverage the unique capabilities of cloud computing.
Workload | A machine image or virtual machine instance, together with the needed information about the technical layout (e.g., number of cores, RAM), network configuration and the data store directly associated with the VM. The VM is the abstraction of all the workload’s constituent elements.

4.0 Defining CIaaS

In order to work to a common framework, we use the NIST definition for Infrastructure as a Service:

“Infrastructure as a Service (IaaS). The capability provided to the consumer is to provision processing, storage, networks, and other fundamental computing resources where the consumer is able to deploy and run arbitrary software, which can include operating systems and applications.
The consumer does not manage or control the underlying cloud infrastructure but has control over operating systems, storage, deployed applications, and possibly limited control of select networking components (e.g., host firewalls).” [7]

We specifically position the work in this master usage model as a general-purpose cloud compute container, including the supporting network and storage capabilities needed to make it useful. However, the emphasis of this usage model is on compute capabilities. As illustrated in the conceptual framework below, this foundation will allow us to consider higher-level IaaS, PaaS, and SaaS solutions separately. Also, while storage and network requirements are addressed in this document, distinct usage model requirements for storage as a service and network as a service will be addressed in the future.

[Figure 4.0.1 - ODCA Conceptual Framework v1.0. Labels recoverable from the figure: SaaS (IT Ops, Cloud Aware Apps); PaaS (Web, Database, Traditional & Cloud Aware Apps); IaaS (Compute, Storage, Network); Facility (Power, Cooling, Physical Space); Dynamic Management & Orchestration of End-to-End Services; Application Development; Actionable Service Catalog (UI and API); Business Processes; End User; Web & Data Service; Interoperability; Capacity & Performance; SW Delivery; OS and Apps; Configuration; Event; Security; Consume / Subscribe; Provide.]

The ODCA Conceptual Framework illustrated above shows that CIaaS services may be used for both “traditional” and “cloud-aware” applications. Indeed, this document includes sample usage scenarios for both types of applications. First, however, it is worth defining what we mean by cloud-aware and traditional applications.

• Cloud-Aware: Cloud-aware applications have been designed and built with the sole intent of running in a cloud environment. They are free of the dependencies and assumptions that burden traditional or legacy applications, and they are able to fully exploit the inherent advantages of cloud.
Other terms have been used for these as well, including cloud-native, cloud-architected, born-cloud, or cloud-enabled. However, the attributes used to describe such architectures are generally agreed to include the following [8,9]:
  ◦ Composable: Applications are distributed and dynamically wired.
  ◦ Elastic: The ability to scale up but also to scale down based on the load.
  ◦ Evolvable: This is related to portability, and suggests the ability to replace existing underlying technology or vendor decisions with others, as the needs of the business or market change, with minimal impact to the business.
  ◦ Extensible: Applications are incrementally deployed and tested. That is, there is the ability to easily grow the application over time.
  ◦ Granular metering and billing
  ◦ Multi-tenant: Multiple cloud subscribers may be using the same underlying resources and infrastructure from the cloud provider, with reliability, security, and consistent performance.
  ◦ Portable: Applications can run almost anywhere, on any cloud provider and from any device.
  ◦ Self-service
• Traditional: Simply put, a program or system that has not been specifically designed (or remediated) to transparently leverage the unique capabilities of cloud computing. Such applications may be migrated to run in a cloud context, but the value realized from such instances will be limited.

4.1 CIaaS Scope

CIaaS requires the following elements:
• Compute instance (may or may not be virtual, although virtual machines (VMs) are typical)
• CPU and memory resources
• Network components
• Storage

These will be deployed in different configurations to meet a range of service capabilities and attributes. We do not seek to define technical implementation in this usage model. Instead, we seek to define the capabilities required in common terms and measures.
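As an illustration only, the four required elements listed above could be captured in a provider-neutral request record. The following Python sketch is not part of the usage model, and it does not prescribe an implementation. The class name `ComputeRequest`, its field names, and the units (GiB) are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComputeRequest:
    """Hypothetical, provider-neutral description of one CIaaS compute instance."""
    virtual: bool            # the compute instance may or may not be a virtual machine
    cpu_cores: int           # CPU resources
    memory_gib: int          # memory resources
    networks: tuple = ()     # names of attached network components (illustrative)
    storage_gib: int = 0     # storage associated with the instance

    def __post_init__(self):
        # Reject requests that omit a meaningful amount of CPU or memory.
        if self.cpu_cores < 1 or self.memory_gib < 1:
            raise ValueError("cpu_cores and memory_gib must be at least 1")
        if self.storage_gib < 0:
            raise ValueError("storage_gib must not be negative")


request = ComputeRequest(virtual=True, cpu_cores=4, memory_gib=16,
                         networks=("frontend",), storage_gib=100)
print(request)
```

Describing a request in such common terms, rather than in one provider's instance types, is one possible way to keep requests comparable across cloud providers.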
[Figure 4.1.1 - CIaaS in Context. Labels recoverable from the figure: Cloud Subscriber; Cloud Broker; Cloud Provider 1, Cloud Provider 2, and Cloud Provider 3, each showing Server, Storage, Network, Container, and Orchestration.]

4.2 CIaaS Workloads [10]

Generically, a workload is an encapsulation of the following:
• Application processes
• Data
• Configuration information
• State

This also includes metadata that describes the relationships among those elements. For Infrastructure as a Service (IaaS) services, the workload encapsulation is usually the virtual machine.

Best practices [11] dictate that service descriptions and levels should be consistent in order to ensure transparency and to allow cloud providers to be compared fairly against each other. Additionally, billing models and similar commercial terms should be comparable across environments. These are key practical realizations of cloud interoperability. See also the ODCA Master Usage Model: Commercial Framework [12] and the ODCA Usage Model: Regulatory Framework [13] for more on this broad topic.

4.3 Deployment Models

Unless specified otherwise, the requirements described in this document are assumed to apply to all potential cloud deployment and procurement models:

Table 4.3.1 Cloud Models

Model | Definition
Private Cloud | The cloud infrastructure is operated solely for an organization (cloud subscriber). It may be managed by that organization itself (an Enterprise Internal Private Cloud) or a third party and may exist on premise (Managed Internal Private Cloud) or off premise (Managed External Private Cloud). (based on NIST definition [11])
Community Cloud | The cloud infrastructure is shared by several organizations and supports a specific community or industry vertical that has shared concerns (e.g., mission, security requirements, policy, and compliance considerations).
It may be managed by the organizations or a third party and may exist on premise or off premise. Members may have similar or correlated utilization profiles. (based on NIST definition)
Sector Cloud | Demand may be highly correlated among the members of a community cloud, undermining its economics. Thus, a Sector Cloud is a multi-industry community cloud where peaks and troughs in demand may be smoothed out, allowing for greater optimization opportunities.[14]
Public Cloud | The cloud infrastructure is made available to the general public or a large industry group and is owned by an organization selling cloud services. (based on NIST definition)
Hybrid Cloud | The cloud infrastructure is a composition of two or more clouds that remain unique entities but are bound together by technology that enables data and application portability (e.g., cloud bursting for load-balancing between clouds). (based on NIST definition)
Cloud Marketplace | Professional exchanges of cloud services by members (cloud providers and cloud subscribers) who agree on common rules, along with being certified and audited by third-party cloud auditors to ensure consistency and quality.

4.4 CIaaS Service Attributes

A compute IaaS offering will be defined using the service assurance attributes below, most of them as per the ODCA Usage Model: Standard Units of Measure for IaaS.[15] We additionally include two further terms, functionality and interoperability.
• Availability: The degree of uptime for the solution, taking into account factors such as contention probabilities.
• Performance: The extent to which the solution is assured to deliver a level of output.
• Recoverability: The solution’s recovery point and recovery time objectives.
• Security: The extent of the solution’s protection (e.g., encryption, tripwires, virtual local area network or VLAN, port filters, etc.).
• Manageability: The degree of automation and control available for managing the solution.
• Client SLA priority: The service contention design for handling peak demand.
• Functionality: The essential services provided by the cloud provider to the cloud subscriber.
• Interoperability: The degree to which a cloud subscriber can do all of the following [16,17]:
  ◦ Migrate workloads from one cloud to another (including across cloud providers).
  ◦ Link disparate clouds.
  ◦ Compare cloud providers based on cost and capabilities.
  ◦ Utilize consistent management interfaces.

The service attributes can be described in terms of multiple service tiers. Specifically, each of these attributes can be defined at the bronze, silver, gold and platinum service levels, described in the table below. The intent of this master usage model is that service tiers can be mixed and matched across the different service attributes, but not within them. That is, it is possible to have a cloud service with bronze availability but gold performance. However, all of the elements that comprise a given attribute’s service tier must be met. For example, all of the subrequirements for gold performance must be met for it to be deemed gold-level service tier for performance (see section 7.0 Service Tier Details).

Table 4.4.1 Service Tier Attributes

Service Tier | Positioning | Description
Bronze | Basic | Representing the lower-end corporate requirement, possibly equating to a reasonably high level for a small to medium business customer
Silver | Enterprise Equivalent | Representing higher quality than bronze, and a trade-off with higher costs, within the SLA range
Gold | Critical market or business sector equivalent | Representing a preference for a further higher quality of service within the SLA range.
Platinum | Military or safety-critical equivalent | Representing the maximum contemplated corporate requirement, stretching towards the lower end of military or safety-critical needs

It is assumed that lower service tiers correspond with lower cloud provider pricing. The service levels are specifically defined using qualitative and quantitative measures aligned with the ODCA Usage Model: Standard Units of Measure for IaaS.[15]

Note: service levels for “Interoperability” are defined in the ODCA Usage Model: Guide to Interoperability Across Clouds,[17] and CIaaS “functionality” is defined here for the first time.

Given the broad range of CIaaS requirements and capabilities, a good way to understand specific service attributes for CIaaS is to use examples. These are introduced in a subsequent section.

4.5 General Capabilities

CIaaS must include the following general management and operational capabilities, as also highlighted by the TM Forum’s Enterprise Cloud Leadership Council.[18] ODCA has extended these requirements to enhance security and fit within the enterprise.

IT Operations Management:
Note: Item 4 below is an ODCA requirement, beyond those of the TM Forum.
1. The service must support a wide range of x86-based operating systems, including Windows (server and desktop OS), Solaris x64 and Linux (leading distributions) in 32-bit and 64-bit versions.
2. The service must support network isolation controls for inbound and outbound traffic.
3. The service must support the deployment of Web, Application, Database and Infrastructure Service components, such as LDAP components.
4. The service must align with Information Technology Infrastructure Library (ITIL) processes for change, incident and configuration management.

Network Management:
1.
The service provider must provide options for consumer network connectivity, such as Internet VPN and leased lines. In addition, the service provider must articulate any other network requirements, stipulations and constraints, such as NAT, IP address overlays, and latency controls.
2. The service must include instrumentation to provide the consumer with a view of bandwidth, performance, and latency. These should be available via the service interfaces (including the service portal user interface and the API).

Security Management:
Note: Items 3 through 5 below are ODCA requirements, beyond those of the TM Forum.
1. The service provider must provide architectural, design, policy and other artifacts that demonstrate the degree to which the cloud subscriber’s service is being segregated from other subscribers. This applies to single-tenant and multi-tenant cloud services.
2. The cloud provider must formally and explicitly affirm that the storage, network, and processing security meet the requirements of the cloud subscriber’s contracted tier (bronze, silver, gold, or platinum). Additionally, the cloud provider must support independent verification.
3. The subscriber must adhere to the security policies and procedures implemented by the provider in order to comply with the actual assurance level of the cloud.
4. In a managed cloud environment, certain software used by the subscriber within the cloud should integrate with the monitoring tools provided in the cloud that ensure the cloud’s integrity. This includes software for intrusion detection and prevention, thresholds on access logs, and so on.
5. Application monitoring is under the control of the subscriber and can run locally in the cloud as well as be connected to the subscriber’s monitoring infrastructure. Application-level security monitoring is the subscriber’s responsibility (Note: in higher-value service types an application firewall might be part of the provider’s offering).
Workload Management:
Note: Item 4 below is an ODCA requirement, beyond those of the TM Forum.
1. The service must provide volume flexibility, and allow the consumer to dial up or down the resources being consumed.
2. The service must be capable of integrating with the consumer’s cloud management tools programmatically and through standard APIs.
3. The service should allow the consumer to change workload policy rules and parameters at will within specific criteria.
4. The service must provide a cloud service management portal.

Compliance Management:
1. The provider must agree to, adhere to, and permit enforcement of governing frameworks and policies, internal and external audits, minimum standards/certifications and consequence management. Lack of controls may subject providers to penalties.
2. There may be technical and procedural requirements based on the cloud subscriber’s industry or the countries in which it operates or has customers. These may include requirements that data stay within the country of origin, or that regular, prescriptive disaster recovery testing be performed. The cloud subscriber may be required to provide evidence of compliance, and thus may need the provider’s assistance to produce it.

Problem Management:
1. Each party must have established an effective root cause analysis process for incidents related to contracted or consumed services, to prevent recurrence of negative service impacts.

Service Continuity Management:
Note: Item 2 below is an ODCA requirement, beyond those of the TM Forum.
1. Each party must have effective processes to ensure that IT services can recover and continue even after a serious incident occurs. This will also include the business continuity of material suppliers.
2.
The cloud provider must ensure that a third-party cloud subscriber cannot impact the cloud subscriber, such as in “noisy neighbor” situations.

Vulnerability Management:
Note: Items 2 through 4 below are ODCA requirements, beyond those of the TM Forum.
1. The cloud provider must establish a regular practice of identifying, classifying, remediating, and mitigating vulnerabilities, including patch management. Furthermore, the provider must notify the cloud subscriber of any actions or incidents, known or suspected, that may risk the cloud subscriber’s assets or data via the provider’s service.
2. The cloud provider works closely with its ISPs and with regional and international security organizations in order to prevent Internet-driven attacks against its clients or its infrastructure. The provider has a risk response team and a security operations team that are trained to respond quickly to attacks and to protect its clients from being impacted. Furthermore, for the gold and platinum services, DoS and DDoS attacks are contained by filtering the attacking servers as early as possible on the ISP’s infrastructure.
3. For service tiers silver, gold and platinum, the subscriber has a vulnerability management system in place and applies security patches in a timely manner, as defined in the ODCA Usage Model: Provider Assurance.[19]
4. For service tiers gold and platinum, the subscriber performs analysis of its access and application logs and communicates identified patterns to the provider, in order to improve the accuracy of the provider’s filters.

Monitoring Service:
Note: Item 2 is an ODCA requirement, beyond those of the TM Forum.
1. The cloud provider must monitor the environment, including event, capacity, security and utilization, to ensure SLAs are met. The provider’s monitoring data must be provided over standardized APIs.
2.
If application-layer monitoring and analytics point to the infrastructure, then the infrastructure metrics should be easily accessible and available for root cause analysis, troubleshooting, and to provide early warnings of issues that may be preventable.

Incident Management:
Note: Items 2 through 4 below are ODCA requirements, beyond those of the TM Forum.
1. Each party must inform the other of incidents that may affect the other. Pre-defined agreements must be established on prioritization of an incident and level of effort required by the cloud provider during an incident. Automated and standardized interfaces are to be established to manage incidents.
2. Note that incidents to be communicated also include those where the cloud subscriber must inform the cloud provider of incidents that may affect other third-party subscribers.
3. For major incidents, as agreed in a contract between the provider and the subscriber, the cloud provider must notify the affected customers within 48 hours.
4. Incident responses must be agreed between both parties in order to make them as effective as possible. Coordinated activities help prevent service degradation by avoiding conflicting actions.

Change Management:
1. Each party will notify the other when a change in configuration or other operational aspect may affect the service capabilities of the other party. Proactive management is required to ensure a stable environment.

Governance:
Note: Items 2 and 3 below are ODCA requirements, beyond those of the TM Forum.
1. The provider must adhere to and permit enforcement of governing frameworks and policies, internal and external audits, minimum standards and certifications, and security controls. Penalties and termination of contracts may be established where requirements are not met.
2.
For gold and platinum service tiers, the contract between the subscriber and the provider will typically stipulate geographic or jurisdictional limitations on where the subscriber’s data can be stored and processed, including secondary site and backup tape locations.
3. For gold and platinum tiers, the provider must disclose its parent company or legal jurisdiction to the subscriber if it has a US parent company governed by the USA PATRIOT Act.

Provisioning of Services:
1. The provider must have effective automated mechanisms to request, provision, manage, and meter usage of services wherever possible.

4.6 CIaaS Key Performance Indicators

A key performance indicator (KPI) is an IT term of art for a type of performance measurement. A very common way of choosing KPIs is to apply a management framework (for example, the balanced scorecard), and consolidate a number of SLA perspectives and metrics into a smaller set of overall indicators. Some KPI types include [20]:
• Quantitative indicators: potentially anything numeric and relevant to business objectives and the service contract.
• Practical indicators that interface or align with enterprise processes.
• Directional indicators specifying whether a service or an organization is improving or not.
• Actionable indicators that are sufficiently within an organization’s control to effect change.
• Financial indicators that could include spend, savings, service credits for SLA failure, etc.

Key performance indicators, in practical terms and for strategic development, are objectives to be targeted that will add the most value to the business.

KPI Principles:
• KPIs should define specific and unambiguous measure titles or labels.
• The parameters of the KPI help constitute the SLA.
• KPIs have a high and low water mark, against which SLAs are set.
• KPIs can have multiple dimensions, some of which are shared, such as:
  ◦ Cloud subscriber view to gauge quality of service.
  ◦ Cloud provider view to manage overall services.
  ◦ Shared view on some items.

The key performance indicators for CIaaS correspond closely with the service attributes. They are defined in order to provide an effective way to measure the service. They are also important to cloud subscribers when making service purchases, and to cloud providers when benchmarking their services. KPIs should focus on a small number of core and meaningful statistics that are cost-effective and useful.

KPIs Associated with the Service Attributes

These are the KPIs that will be used on a day-to-day basis by the cloud subscriber to ensure that the service is being delivered to requirements, and to track deviations from the norm. These can be easily described using the elements and desired service tiers in the “Service Attributes” and “Commercial Considerations” sections of this document. The major KPI themes include [21]:
• Provisioned and contracted capacity
• Capacity usage and utilization
• Performance parameters
• Invocations of defined automated actions
• Service availability
• Supported service tiers (bronze, silver, gold and platinum)
• Service continuity parameters (RTO, RPO, RCO)
• Data classification and retention metrics
• Security controls
• Defined reports and auditing metrics

Example KPI: Service is available as expected
May include:
• System Uptime
• Network Uptime
• Storage Uptime
• Incident Response Time
Aggregated SLA Committed
Low Water Mark: Minimum acceptable level
Actual Achievement
High Water Mark: Target achievement

Suggested Additional Business Value KPIs for Cloud Subscribers

Cloud subscribers should consider developing internal KPIs which can be used to gauge the value derived by their business from adoption of CIaaS.
These are optional, but may include [22]:
• Metrics measuring performance of the service against the strategic business and IT plans.
• Metrics on risks and compliance against regulatory, security, and corporate governance requirements for the service.
• Metrics measuring financial contributions of the service to the business.

Suggested Additional Relationship KPIs for Cloud Providers

Below are some suggested KPIs that cloud providers should track to ensure customer satisfaction and to benchmark services against their peers.[22] If possible, cloud providers should maintain an open and ongoing dialogue with their enterprise cloud subscribers regarding the above “business value” KPIs as well. These are optional, but may include:
• Metrics monitoring the key IT processes supporting the service.
• Metrics measuring customer satisfaction.

5.0 Interoperability

Interoperability is concerned with portability of workloads, interconnectivity of clouds, and the ability to integrate systems. Interoperability is important to cloud subscribers because it helps avoid provider lock-in while ensuring flexibility for the subscriber. It allows compute service decisions to be less tightly coupled with business priorities. Interoperability is also important to cloud providers because it can help prevent disqualification by potential customers fearing lock-in.

As discussed in the ODCA Usage Model: Guide to Interoperability Across Clouds,[17] there are two key aspects of interoperability: portability of workloads and interconnectability of systems. Each of these aspects comes with its own set of unique requirements.
• Portability is required to be triggered as needed, as an event, including maintaining access to data and control over the workload, and allowing dynamic configuration and reconfiguration.
• Interconnectability addresses the need for applications and services to establish and preserve, on an ongoing basis, complex connections that occur between different systems. Consistent manageability across the connected clouds is implicitly included.

6.0 Business Drivers and Usage Scenarios

The purpose of this master usage model is not to define in detail what the service providers deliver in terms of precise technical architecture. Some consumers want providers to provide “enterprise” grade infrastructure, meaning compute infrastructure that can replace internal enterprise-provided and managed infrastructure, and can be used interchangeably. However, the real benefits and economies from the cloud model stem from approaches where applications are designed and built from the ground up to take advantage of the commodity common services available from multiple providers.

There are two approaches to consider: should the cloud adapt to legacy enterprise requirements, as in the case of traditional applications, or should the enterprise adapt to the cloud, such as with cloud-aware applications? The answer is both. Large enterprises have correspondingly large application portfolios, many of which have been designed and implemented within traditional distributed computing environments. These enterprises have neither the desire nor the economic justification to refactor their entire application estate to optimally leverage the cloud delivery model. Over time, it can be expected that enterprises will adapt to the cloud delivery model. However, this will be an extended journey.

There is an opportunity for service providers to deliver a wide range of service levels and capabilities, ranging from high-grade “enterprise class” compute infrastructure through to fully commoditized capacity.
As cloud computing penetration increases, it will be key to ensure that service providers deliver environments, services, and associated tools that enable enterprises to operate effectively and efficiently across these hybrid environments, while enterprises also continue to make the most of investments they already have in place.

6.1 Business Usage Scenarios

In this section, we propose some example business usage scenarios for deployment to Compute IaaS. For each one, we characterize the general service attributes of the usage scenarios. These general attributes are elaborated into detailed elements elsewhere in this document. See the “Technical Architecture” and “Service Tiers” sections for more detail.

The usage scenarios contemplated herein are not intended to be comprehensive. They are, rather, intended to be representative of the kinds of business problems for which CIaaS services are suitable. There are innumerable other potential combinations of requirements. The CIaaS framework put forth by this master usage model should allow the reader to describe other usage scenarios in a similar manner.

Below is a simple diagram illustrating the hierarchy of usage scenarios described in this usage model. Traditional and cloud-aware applications are described elsewhere.

[Figure 6.1.1 - Sample CIaaS Usage Scenarios. Labels recoverable from the figure: CIaaS Usage Scenarios; Non-Production; Production; Traditional; Cloud-Aware; Standalone; Enterprise; Grid (Base, Variable); Dev / Test; Load Testing; Strategic Dev; Ad Hoc Dev / Test; QA.]

Hybrid cloud environments will also be around for a while. There is a requirement for IT operations professionals to be able to manage, measure and report on systems running behind the firewall in a manner consistent and compatible with those in the cloud, traditional or cloud-aware.
Industry can and should provide this ability for cloud subscribers operating in hybrid environments.

Note that the usage scenarios below do not differentiate between internal or employee-facing scenarios and external or client/public-facing scenarios. However, it is reasonable to assume that where multiple service tiers are indicated for an attribute, public and client-facing requirements will often exceed those of internal and employee-facing scenarios. Business criticality is also indicated for each example.

Most of the usage scenarios below cite and expand on scenarios previously documented by the TM Forum’s Enterprise Cloud Leadership Council (ECLC).

For the production usage scenarios below, Client SLA priority was derived using the following scenario assumptions:
• For high priority events and requests, as agreed between cloud subscriber and cloud provider
• Time to respond: bronze, 4 hours; silver, 1 hour; gold, 10 minutes; platinum, immediate/instant.

6.2 Development/Test

This is a subclass of non-production CIaaS use. For this example, we are assuming traditional, or non-cloud-aware, workloads.

The ECLC description follows: This is one of the most accessible and obvious usage scenarios for CIaaS, as development and test environments are typically temporary and disposable. By adopting CIaaS for development and test, environments are more easily segregated, or logically isolated, from production, helping to eliminate interdependencies and negative consequences from inadvertent interaction with production systems. These environments may also be provisioned and deployed much more rapidly than would be feasible with physical systems, allowing for improvements in business agility and time to market.
Requirements for development and test on the cloud can be further subdivided into three sub-cases: Strategic Development, Ad Hoc Development and Test, and Quality Assurance.
• Strategic Development: Major, long-term software development efforts core to the cloud subscriber’s business.
• Ad Hoc Development and Test: Informal, short term or experimental software development that is not core to the cloud subscriber’s business.
• Quality Assurance: Formal testing of software by cloud subscriber as part of quality assurance, systems integration or other similar formal testing of software that is core to the cloud subscriber’s business. This will necessitate greater resilience and performance consistency from the cloud service.

Table 6.2.1 – Requirements for Development and Test

Requirement (O-Optional, M-Mandatory) | Strategic Development | Ad Hoc Dev / Test | Quality Assurance
Business Criticality | Low or Medium | Low | Low or Medium
Service Attribute | Tier | Tier | Tier
Availability | Bronze or Silver | Bronze | Bronze or Silver
Performance | Bronze | Bronze | Bronze or Silver
Recoverability | Bronze or Silver | Bronze | Bronze or Silver
Security | Silver or Gold | Silver | Silver or Gold
Elasticity | Bronze | Bronze | Bronze
Manageability | Silver | Bronze or Silver | Silver
Client SLA priority | Bronze | Bronze | Bronze
Interoperability | Silver | Bronze | Silver

6.3 Load Stress Test Environment

This is also a subclass of non-production CIaaS use. For this example, we are assuming traditional, or non-cloud-aware, workloads.

From the ECLC description of the usage scenario with the same name: “This usage scenario is emerging as a key enabler for online businesses and other businesses that require massive scaling. As an extension of development and test, load testing is a specialized domain that can be used to detect algorithmic weaknesses that only surface at scale.
While an application may run as expected as a singleton or as part of a small cluster, doubling or exponentially increasing the number of processing nodes may yield unexpected results. CIaaS allows a developer to simulate this scaling using a shared internal or external resource, eliminating the capital outlay that would be required to provision capacity for hypothetical load levels. Additionally, client-server systems may benefit from the cloud by leveraging CIaaS capacity to simulate large numbers of clients to drive load against servers.”

Additionally, test scripts can now reflect real user transaction patterns that occur concurrently and stress the underlying IaaS services in differing ways. Load stress tests should draw on the usage patterns captured by real user monitoring systems.

Table 6.3.1 – Load Stress Test
Requirement (O-Optional, M-Mandatory) | Load Stress Test
Business Criticality | Low
Service Attribute | Tier
Availability | Bronze
Performance | Gold (fidelity with the target production environment is a key consideration)
Recoverability | Bronze
Security | Bronze or Silver
Elasticity | Silver or Gold
Manageability | Silver
Client SLA priority | Bronze
Interoperability | Bronze

6.4 Grid and High-Performance Computing

Grid and high-performance computing (HPC) applications may be either traditional or cloud-aware. Due to their inherently parallel, distributed compute pattern, grid applications are often a natural fit for cloud architectures. For this example, we are assuming the grid application is cloud-aware. However, a traditional, non-cloud-aware grid system should not have substantively different requirements. From the ECLC description for Grid computing applications:

“While Grid computing predates cloud computing, the cloud can be used to optimize Grid-based usage models.
Whereas a classic compute Grid was traditionally housed in a static farm of physical compute servers, CIaaS offers dynamic capacity management, allowing the logical Grid to expand and contract based on the business demand and marginal benefit or cost to enabling additional parallel processing nodes.”

Requirements for grid and high performance computing in the cloud can be further subdivided into two sub-cases: base grid capacity and variable grid capacity.

Table 6.4.1 – Requirements for Grid and High Performance Computing
Requirement (O-Optional, M-Mandatory) | Base Grid Capacity | Variable Grid Capacity
Business Criticality | High | Medium or High
Service Attribute | Tier | Tier
Availability | Gold or Platinum | Bronze
Performance* | Bronze to Gold | Bronze to Gold
Recoverability | Bronze | Bronze
Security | Gold or Platinum | Gold or Platinum
Elasticity | Bronze | Gold or Platinum
Manageability | Silver | Silver
Client SLA priority | Gold or Platinum | Bronze
Interoperability | Silver or Gold | Silver or Gold

Additional assumptions: In both usage scenario subtypes above, we explicitly assume that the operating model has the cloud subscriber doing their own management and support from the operating system up. Both usage scenario subtypes assume production enterprise workloads.

*Note that performance may go as low as bronze for low-cost economy grids, depending upon cloud subscriber requirements and price sensitivity.

6.5 Standalone Traditional Production Environment

This may be thought of as a monolithic deployment. It is a subtype of traditional or legacy workloads, migrated over to the cloud. An example is a team or departmental production application that is important, but not critical, to day-to-day business operations.
The ECLC description for standalone production is appropriate:

“Applications with minimal external dependencies can run more or less isolated from the corporate environment. While a bridge solution may be required to integrate with identity or other directory services, or to connect into other business workflows, these applications may function equally well in any location. These types of applications may be well-suited for deployment in a CIaaS environment, allowing them to benefit from migration or extension to other geographies in support of cost or productivity (follow-the-moon/follow-the-sun).”

This is a specific type of production requirement. However, as with other enterprise-production scenarios, the service in this case is deemed mission critical. Loss of the compute service will have material negative impacts on the cloud subscriber, such as significant loss of revenue or reputational damage.

Standalone production environments are intended to cover group and departmental applications that are considered production for the purposes of support and recovery, but have a limited impact on the business. Examples of these are internal file servers, web servers, and collaboration sites. Loss of these services is considered an inconvenience rather than a disruption to business operations, and manual workarounds, while feasible, might be undesirable.

Table 6.5.1 – Standalone Cloud-Enabled Production
Requirement (O-Optional, M-Mandatory) | Standalone Cloud-Enabled Production
Business Criticality | Low - Medium
Service Attribute | Tier
Availability | Silver or higher
Performance | Silver or higher
Recoverability | Silver or higher
Security | Silver or higher
Elasticity | Bronze
Manageability | Silver
Client SLA priority | Silver
6.6 Enterprise Traditional Production Environment

These are usually legacy distributed compute systems that have been migrated to the cloud but not fully remediated to be cloud-aware. This is a subclass of production use.

Enterprise applications differ from standalone production applications in their scope and importance to an organization. These are mission critical applications that affect the reputation and revenue stream of an organization. Loss of the enterprise production environment would result in significant loss of revenue and/or damage to the reputation of the subscriber. Examples are high volume transaction systems (ERP), customer facing web sites, partner integration sites and financial data processing systems where manual processing options are not feasible due to the volume or nature of the transactions.

These applications have external dependencies and integration requirements. Bridging solutions may be required to integrate with identity or other directory services, or to connect into other business workflows. They may not be able to fully exploit cloud elasticity. Compared to “standalone” applications, these are larger and more complex and may not be well-suited for migration from one geographic location to another due to the size or volume of data required.

Table 6.6.1 – Enterprise Traditional Production
Requirement (O-Optional, M-Mandatory) | Enterprise Traditional Production
Business Criticality | Medium – High
Service Attribute | Tier
Availability | Gold or Platinum
Performance | Gold or Platinum
Recoverability | Gold or Platinum
Security | Gold or Platinum
Elasticity | Bronze or Silver
Manageability | Gold
Client SLA priority | Gold or Platinum
Interoperability | Silver or Gold
6.7 Enterprise Cloud-Aware Production Environment

This may be thought of as compute systems that, although not fully PaaS-based, have been architected for the cloud from the outset. Such systems may include newer web-style applications, exploit higher automation, and can query the infrastructure server about available capacity, services, and so on. This is a subclass of production use.

Infrastructure-aware applications are newer applications built to tolerate failure of underlying infrastructure components. These applications make use of services provided by the infrastructure providers to recover from component outages or scale as needed to support application workload demands. As with the traditional distributed computing variant, loss of the production application would result in significant loss of revenue and/or reputational damage to the subscriber. The difference is that the application is architected to run on infrastructure with a lower level of service, reliability and availability. These applications may also be distributed across the resources of multiple providers. These applications have internal/external dependencies and integration requirements.

Table 6.7.1 – Enterprise Cloud-Aware Production
Requirement (O-Optional, M-Mandatory) | Enterprise Cloud-Aware Production
Business Criticality | Medium or High
Service Attribute | Tier
Availability | Silver or higher
Performance | Bronze or higher
Recoverability | Gold or Platinum
Security | Gold or Platinum
Elasticity | Gold or Platinum
Manageability | Gold or Platinum
Client SLA priority | Gold or Platinum
Interoperability | Gold or Platinum

6.8 Cloud Brokering and Federation

Cloud brokering and cloud federation services are an important part of the cloud ecosystem, and will be addressed in a future update to the ODCA Master Usage Model: Compute Infrastructure as a Service (CIaaS).
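The requirement tables in Section 6 can be read as per-attribute tier profiles against which a provider offering is compared. The Python sketch below illustrates one possible reading, using the values from Table 6.6.1. It rests on two assumptions that the usage model does not state explicitly: that the four tiers are strictly ordered (bronze lowest, platinum highest), and that a range such as "Gold or Platinum" means "at least Gold". The names `Tier`, `unmet_attributes`, and `example_offering` are hypothetical and exist only for this illustration.

```python
from enum import IntEnum


class Tier(IntEnum):
    """Service tiers, assumed here to be strictly ordered."""
    BRONZE = 1
    SILVER = 2
    GOLD = 3
    PLATINUM = 4


# Lower bound of each range in Table 6.6.1 (Enterprise Traditional Production).
# Treating the lower bound as a minimum is an assumption of this sketch.
ENTERPRISE_TRADITIONAL_MINIMUMS = {
    "availability": Tier.GOLD,
    "performance": Tier.GOLD,
    "recoverability": Tier.GOLD,
    "security": Tier.GOLD,
    "elasticity": Tier.BRONZE,
    "manageability": Tier.GOLD,
    "client_sla_priority": Tier.GOLD,
    "interoperability": Tier.SILVER,
}


def unmet_attributes(offering, minimums):
    """Return, sorted by name, the attributes that are absent from the
    offering or offered below the required minimum tier."""
    return sorted(
        name
        for name, required in minimums.items()
        if offering.get(name, 0) < required
    )


# Hypothetical provider offering, invented for illustration.
example_offering = {
    "availability": Tier.GOLD,
    "performance": Tier.PLATINUM,
    "recoverability": Tier.SILVER,
    "security": Tier.GOLD,
    "elasticity": Tier.SILVER,
    "manageability": Tier.GOLD,
    "client_sla_priority": Tier.GOLD,
    "interoperability": Tier.BRONZE,
}

print(unmet_attributes(example_offering, ENTERPRISE_TRADITIONAL_MINIMUMS))
```

In this invented example the offering falls short on recoverability and interoperability, so a subscriber would either negotiate those attributes upward or consider a different offering.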
7.0 Service Attribute Details

Each of the service attributes introduced above is explained in further detail below. Each attribute is subdivided into one or more elements. Together the values for the elements determine that attribute’s service tier. All of the elements that comprise a given attribute’s service tier must be met. For example, all of the sub-requirements for gold performance must be met for it to be deemed gold-level service tier for performance.

The following assumptions apply for each service attribute section below:

Standardization: Services from the cloud provider are standardized and consistent. Documentation is available.

7.1 Functionality

The infrastructure must support basic functional cloud services, as defined by NIST. The provider should support at least version n-1 of the latest version of current and future cloud service standards where specified herein or delivered in response to ODCA requirements by SDOs.

7.2 Availability

In the ODCA Usage Model: Standard Units of Measure for IaaS [15], ODCA has defined availability as the degree of uptime for the solution, for example, taking into account contention probabilities. This should be construed as applying to the service as a whole.

There are certain general aspects that availability of CIaaS must address. The scope of CIaaS availability includes:
• Overall availability number, for example, the number of nines.
• Connection from the POPs (points of presence) or internet connection points of the provider.
• How maintenance windows are defined, for example, fixed (chosen by cloud provider) or flexible (chosen by cloud subscriber).
• The possibility to define “critical time windows” would be available in the flexible versions.
• Note that contractually agreed planned maintenance windows do not count against availability targets.
• Target outage duration vs. number of outages: Some workloads can better deal with short outages, even if there are many of them. Others need to have as few as possible, allowing longer downtime for each of them.
• Existence of contractually-specified penalties imposed for breach of SLA, as in the table below.

7.2.1 Service Tier Summary: Availability
Elements | Bronze | Silver | Gold | Platinum
Overall availability | 99% | 99.9% | 99.9% | 99.99%
Maintenance windows | Fixed | Fixed | Flexible | Flexible
Permissible unplanned outage frequency* | 2 | 1 | 0 | 0
Per-incident RTO** | 60 min | 30 min | 0 min | 0 min
Target cumulative unplanned outage duration | 120 min | 30 min | 0 min | 0 min
Permissible unplanned internet connectivity disruptions* | 3 | 2 | 0 or 1 | 0
Penalty for SLA breach | None | Yes | Yes | Yes

* Number of outages within a specified time window aligned with billing periods. Any penalty credits will be applied to the corresponding billing cycle. The intent here is to address transient or intermittent behavior where disruptions or outages are very brief but may materially impact workload performance and/or service quality.
** Total time duration per outage incident

7.3 Recoverability

ODCA has defined recoverability as the solution’s recovery point and recovery time objectives [15]. This should be construed as applying to the recovery of the service as a whole.

There are certain general aspects that recoverability of CIaaS must address.
The scope of CIaaS recoverability includes:
• Backup and restore with or without incremental revisions
• Mean time to recovery
• Number or frequency of recovery points

7.3.1 Service Tier Summary: Recoverability
Elements | Bronze | Silver | Gold | Platinum
Data Backup | No backup | One copy | Two copies | Three copies
Geographic dispersion of data backups | None | One site | Two sites | Three sites, and off-site
RTO for data restoration | N/A | Within 12 hours of service restoration | Within nine hours of service restoration | Within six hours of service restoration
Data Replication | None | None | Snapshot, at least four times daily | Synchronous and asynchronous to remote sites
Backup frequency | N/A | Weekly full | Daily incremental, weekly full | Daily full
RTO for compute instance restoration | Subscriber responsibility | 60 minutes | 15 minutes | 10 minutes
RCO for distributed data | None | 80% | 90% | 100%

7.4 Security

ODCA has defined security as the extent of the cloud solution’s or cloud provider’s protection.
Below are the detailed security elements that must be addressed:
• Antivirus and malware protection (with definition updates within 24 hours)
• Vulnerability management process exists and is fully tested to ensure no impact to target hosts
• Network and firewall isolation of cloud subscriber systems
• Physical access control into cloud data center
• Secure protocols used for remote administration (for example, SSL, SSH, and RDP)
• All default passwords and guest access removed
• Use of non-disclosure agreements (NDAs) for cloud provider staff
• Use of Information Technology Infrastructure Library (ITIL) processes for change, incident and configuration management
• Identity management for subscriber assets
• Data retention and deletion managed
• Security incident and event monitoring
• Network intrusion prevention
• Event logging for all administration-level events (requires controlled access to logs)
• Four-eye principle for key administrator changes
• Cloud provider has an implemented and tested technical continuity plan
• Fully documented and controlled network
• Option to perform penetration testing on hosted systems
• Physical segmentation of hardware (server, storage, network, etc.) to ensure isolation from all other systems
• Encrypted communication between cloud provider and cloud subscriber for management
• Multi-factor authentication
• Ability for cloud subscriber to define geographic limits for hosting
• Storage encryption at logical unit number (LUN) level
• No administrative access for cloud provider staff
• Strong encryption for all data in-flight and at rest
7.4.1 Service Tier Summary - Security
Elements (O-Optional, M-Mandatory) | Bronze | Silver | Gold | Platinum
Antivirus and malware protection, with definition updates within 24 hours | M | M | M | M
Vulnerability management process exists and is fully tested to ensure no impact to target hosts | M | M | M | M
Network and firewall isolation of cloud subscriber systems | M | M | M | M
Physical access control into cloud data center | M | M | M | M
Secure protocols used for remote administration (for example, SSL, SSH, and RDP) | M | M | M | M
All default passwords and guest access removed | M | M | M | M
Use of non-disclosure agreements (NDAs) for cloud provider staff | M | M | M | M
Use of Information Technology Infrastructure Library (ITIL) processes for change, incident and configuration management | M | M | M | M
Identity management for subscriber assets (Refer to ODCA Usage Model: Identity Management Interoperability Guide) | M | M | M | M
Data retention and deletion managed | M | M | M | M
Security incident and event monitoring | M | M | M | M
Network intrusion prevention | O | M | M | M
Event logging for all administration-level events (requires controlled access to logs) | O | M | M | M
Four-eye principle for key administrator changes | O | M | M | M
Cloud provider has an implemented and tested technical continuity plan | O | M | M | M
Fully documented and controlled network | O | M | M | M
Option to perform penetration testing on hosted systems | O | O | M | M
Physical segmentation of hardware (server, storage, network, etc.) to ensure isolation from all other systems | O | O | M | M
Encrypted communication between cloud provider and cloud subscriber for management | O | O | M | M
Multi-factor authentication for the cloud provider’s administrative access | M | M | M | M
Offer the capability for multi-factor authentication for the cloud subscriber’s administrative access | O | O | M | M
Ability for cloud subscriber to define geographic limits for hosting | O | O | M | M
Storage encryption at logical unit number (LUN) level | O | O | M | M
No administrative access for cloud provider’s staff | O | O | O | M
Strong encryption for all data in-flight and at rest | O | O | O | M

7.5 Elasticity

ODCA defines elasticity as the configurability and expandability of the solution (consistent with NIST taxonomy [7, 23]). Centrally, it is the ability to scale up and scale down capacity based on subscriber workload.

There are certain general aspects that elasticity of CIaaS must address. The scope of CIaaS elasticity includes:
• Ability to scale both up and down
• The definition of one or more policies that control how the cloud subscriber’s application should be scaled
• Responsiveness, such as speed of dynamic scaling
• Configuration of clones
• Configuration of environment such as network and security elements
• Execution of additional tasks on trigger
• Possible ratio, up to X times the initial number of instances
• Automatable, via an API
• Notification
• Exception handling

Service Tiering

Capacity to scale depends on compute, network, storage quota and lease term.

Increase (responsiveness and rate of increase)
• Both subscribers and providers have responsibilities to plan for capacity needs and to forecast, respectively, but the focus here is on provider responsibilities.
• An area for potential provider differentiation is how to arbitrate subscriber contention.
That is, what if multiple customers are competing for a finite number of resources? Who should be able to acquire the resources? Should there be other market economics at play here? Maybe those paying for platinum or gold have priority?
• Cloud providers have additional opportunities for differentiation through combinations of options, such as:

7.5.1 Service Tier Summary: Elasticity
Elements | Bronze | Silver | Gold | Platinum
Supports horizontal scaling | Yes or No | Yes | Yes | Yes
Supports vertical scaling | Yes or No | Yes or No | Yes or No | Yes or No
Automatable via API | Yes or No | Yes | Yes | Yes
Growth (horizontal scaling) and Responsiveness | Growth <=10%; within 5 business days | Growth <=10%; within 1 business day | 25% within 2 hours; 50% within 24 hours; 300-1000% within one month | 100% within 2 hours; > 1000% within one month

7.6 Manageability Services

ODCA defines manageability as the degree of automation and control available for managing the solution.

There are certain general aspects that manageability of CIaaS must address. Note that this section addresses the management of these different service characteristics below, not the characteristics themselves. The scope of CIaaS manageability includes:

Availability, event, and performance monitoring of:
• Infrastructure
  ◦ Compute
  ◦ Storage
  ◦ Network
• Virtual machines, when used
• Regions and availability zones
• Applications and services
• Remediation
This covers the management of automated behavior, triggered by problems in the cloud services, such as scripts that run in response to unacceptably high packet loss or automatic capacity bursting if utilization hits a designated threshold. To be clear, this does not include the given remediation functionality itself, just the management of said functionality.
• Automated and scheduled tasks
• Automated scaling (elasticity of demand and supply)
• Self-recovery of applications and services
• Notification
  ◦ Should utilize a publish-and-subscribe model
  ◦ Must support various channels for receiving messages, such as SMS, email, pager, message queue, and so on
• Availability, event, and performance reporting
• Management of data backup and recovery

Web services API for integration and data retrieval: The cloud subscriber should be able to script what they need. This includes enablement of application layer monitoring, wherein the infrastructure monitoring should be extensible or otherwise accommodate application layer monitoring and management.

Service Tiering

Below are the suggested functionality and capability levels for specific manageability elements. For the purposes of calibration and comparability, thresholds have been specified where possible for each of the service levels. Unless specified otherwise, these thresholds are defined at a per-compute instance level. Note that a cloud provider may consolidate events per overall service. For the purposes of comparison, they are indicated here per instance.

Monitoring

Event and Performance Monitors:
• Note that basic monitoring encompasses core system performance statistics as typically reported by modern operating systems: CPU and memory utilization, network and disk I/O, etc.
• For each level, there may be an unlimited number of defined, but inactive monitors. However, only a fixed maximum may be active and reporting at any point in time, as per below.
7.6.1 Service Tier Summary: Monitoring
Elements | Bronze | Silver | Gold | Platinum
Basic | Yes | Yes | Yes | Yes
Custom monitors – defined | Unlimited | Unlimited | Unlimited | Unlimited
Custom monitors – active | 5 | 10 | 30 | 50

Frequency or interval of data collection:
• This refers to the data that is exposed, or made available, to the cloud subscriber. The actual data resolution maintained by the cloud provider “behind the scenes” is at the discretion of the cloud provider.
• When specifying data collection frequency, ensure that it reconciles properly with service availability and recoverability thresholds.

7.6.2 Service Tier Summary: Reported Sampling Interval
Service Tier | Reported Sampling Interval (minutes)
Bronze | 15
Silver | 10
Gold | 5
Platinum | 1 minute or better

Notification:
• Standard functions available to all service tiers

Remediation:
• Remediation in this context refers simply to the management of automated service recovery or remediation actions. This is not about the remediation itself, but rather about the management of remediation capabilities.

Automated or scheduled tasks:

7.6.3 Service Tier Summary: Number of Active Automated Tasks
Service Tier | Number of active automated tasks
Bronze | 5
Silver | 10
Gold | 30
Platinum | 50

Reporting:
• To be clear, this is about reporting of event and performance data, not the cloud subscriber’s actual application data.
• ODCA anticipates that storage and cost considerations may limit the period for which full-fidelity, high-resolution reporting data may be cost-effectively retained. So, our general guide is that the most recent 10%-20% (approximate) of each total duration period below must have full resolution. The remainder may have reduced or aggregate resolution.
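As a rough illustration of the 10%-20% guideline above, the Python sketch below splits a total retention period into a full-resolution window and a reduced-resolution remainder. The helper name `split_retention`, the default of 10%, and the choice to round the full-resolution window up to a whole day are all assumptions of this sketch; the usage model describes the guideline only as approximate.

```python
def split_retention(total_days: int, full_percent: int = 10) -> tuple[int, int]:
    """Split a total retention period (in days) into
    (full_resolution_days, reduced_resolution_days).

    full_percent is the share of the most recent data kept at full
    resolution; the guideline suggests roughly 10 to 20 percent.
    """
    if total_days <= 0:
        raise ValueError("total_days must be positive")
    if not 10 <= full_percent <= 20:
        raise ValueError("full_percent is expected to be between 10 and 20")
    # Ceiling division on integers, so the full-resolution window is never
    # shorter than the requested share (rounding up is an assumption).
    full_days = -(-total_days * full_percent // 100)
    return full_days, total_days - full_days


# A 180-day total retention period with the most recent 20% at full resolution.
full, reduced = split_retention(180, full_percent=20)
print(full, reduced)
```

Integer arithmetic is used deliberately so that the split does not depend on floating-point rounding.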
7.6.4 Service Tier Summary: Reporting
Elements | Bronze | Silver | Gold | Platinum*
High resolution (days) | 1 | 20 | 90 | 1 year or better
Reduced resolution (days; total duration period) | 10 | 180 | 365 | 5 years or better

*Note that there may be considerable differentiation at the platinum level due to regulatory requirements in different industries and different jurisdictions.

Backup:
• There is an opportunity for provider differentiation, such as on storage quota.
• Interval of backup: see the Recoverability section
• Data retention period: see the Recoverability section

Availability:
• Same as for overall uptime targets in the Availability section of this document.

7.7 Performance

7.7.1 Performance SLA

Performance is referenced within several ODCA documents. These sections define the requirements from a provider and subscriber perspective.

Service Catalog: Performance parameters are part of the product attributes and defined thereby, and an API is required to permit programmatic access. A user interface is required for convenient human access.

The architecture must provide elements that report on these values, as described below. To handle these activities, the performance values are divided into the following process areas:
• Performance measuring
• Performance reporting
• Performance monitoring
• Performance analysis
• Performance definition and interface

The architecture must provide interfaces for all of the above.

7.7.2 Performance Measuring

All layers (server, storage, and networking) must provide interfaces to measure performance at a fine granularity. The CIaaS platform should also make application layer performance measurement easy to implement or accommodate, if it is not an integrated aspect of the offering. The analysis function must support intervals down to at least one millisecond.
The granularity of measurement that needs to be exposed by the cloud provider may be coarse for lower service levels and increasingly fine for higher service levels. The higher the service level, the larger the scope of measurement has to be. Simple uncorrelated component measures are adequate for lower service levels; end-to-end and correlated measures are expected for higher service levels. The higher the service level, the more metrics have to be exposed by the cloud provider.

7.7.3 Performance Reporting

The architecture must provide an interface to report on the performance. Due to the variable nature of performance, it may not be possible to definitively report whether the system can provide the contracted maximum values without observing the subscriber’s application hitting these values. Therefore, the architecture must provide a method to probe all elements and all layers for their performance values. Such probes should come in two types: a probe that runs in off-hours against the system in an “IO-meter-like manner,” and a drone that a subscriber or provider can program with an API, that is stamped at all I/O layers, and that reports back roundtrip and latency values, and so on.

The provider’s architecture must have central elements, such as a data warehouse, where customizable reports can be generated and from which data can be downloaded. A presentation layer is available for subscribers to generate these reports. The degree of reporting is related to the service tier of the given performance attribute and is contractually agreed. This is an area of possible differentiation for cloud providers.

7.7.4 Performance Monitoring

Monitoring interfaces are required to send events to the subscriber, as well as to integrate into the subscriber’s monitoring tools. Given the importance of these interfaces, monitoring their continued health is essential.
7.7.5 Performance Analysis

The architecture must support or provide tools to analyze performance on a deep level. For this purpose it must be possible for the cloud provider to start probes and drones within the cloud subscriber’s capacity and only within contractually agreed security and operational parameters.

7.7.6 Performance Definition and Interface

The architecture must provide an interface in the form of an API that a subscriber can query to verify capabilities.

As every subscriber workload is unique, it may be difficult to define concrete universal SLA terms. For example: One application with 80% random reads, 20% random writes at 8 KB block size isn’t that different from one with 20% sequential reads, 80% sequential writes at 32 KB block size. These kinds of values are not usually practical for defining complex, real-world applications with diverse execution profiles.

In order to create a comparable performance value, there must be some type of comparison or benchmark. Probes allow bespoke benchmarking that is specific for an application, but they cannot run when the application is up and operational, such as with handling transactions, and they only show a simple point in time. Generic benchmarks, on the other hand, can suggest broader performance characteristics, but often do not represent real-world application behavior, and similarly cannot run concurrent with the application itself. Examples of general benchmarks are SPC, SPECint, IOMeter, NetBench, and IOZone. Keep in mind that general benchmarks are not specific to customer needs, but they enable the subscriber to compare different offerings from different providers.
8.0 Service Interface and Reference Model

8.1 ODCA Conceptual Architecture

[Figure 8.1.1 - ODCA Conceptual Framework. Labels recoverable from the figure: SaaS, PaaS, IaaS (Compute, Storage, Network), Facility (Power, Cooling, Physical Space); Traditional and Cloud Aware Apps; Actionable Service Catalog (UI and API); Dynamic Management and Orchestration of End-to-End Services; Consume / Subscribe; Provide.]

8.2 Basic Cloud Lifecycle

Below is a summary of the essential lifecycle steps associated with compute infrastructure services. These are elaborated in greater detail in the ODCA Master Usage Model: Service Orchestration [21].
1. Discovery: The cloud subscriber obtains a list of available services from the CIaaS provider. This step may optionally be facilitated by a cloud broker.
2. Negotiation: Cloud subscriber and cloud provider negotiate contract and terms for CIaaS services. (Note: See ODCA Usage Models: Commercial Framework [12] and Regulatory Framework [13] for in-depth coverage of relevant requirements and considerations). This step may optionally be intermediated by a cloud broker.
3. Provision: The cloud subscriber submits requirements to the cloud provider for the service that they need. The cloud service provider then fulfills the service request.
4. Instantiation: This step may or may not be required in a given situation. It entails the cloud subscriber taking any manual steps necessary to facilitate their use of the CIaaS service.
5. Use: Cloud subscriber’s use of the service, until said service is modified or terminated. This includes management of the service, including start and stop, monitoring, reporting, and so on.
6. Modify: This is an optional step wherein the cloud subscriber re-evaluates their requirements and may negotiate to alter the service. An example may be elastically bursting capacity beyond already agreed levels.
7. Terminate: Termination of service in accordance with the negotiated contract.

8.3 Service Interface Requirements

As with any services-oriented architecture, the CIaaS services interface is to provide a separation between the interface of a service and its underlying implementation. This is to ensure that cloud subscribers (and their applications) can interoperate across the widest possible set of cloud providers. Such interfaces will also facilitate the easy swapping of cloud providers with minimal to no modification to application code, etc.

Simply put, the price of entry for cloud providers, to even be considered as a candidate by enterprises, is to build their offerings on open and interoperable standards. Where these standards don’t exist or are found lacking, service and solution providers are expected to collaborate on developing them.

To preserve the investment in development, application and system logic are separated from the underlying infrastructure through the use of software interfaces, each of which defines a contract between a service consumer and a service provider. This separation is the basis of any valid SOA. These steps will effectively insulate the cloud subscriber from provider-specific protocols, server identities, utility libraries, and the service provider, resulting in software that is easier to develop, longer lasting, and usable across a wider array of computing environments.

As a general framework, we will default to focus on open service interfaces, in order to enable ease of adoption and portability of services. The importance of open interfaces cannot be overstated. As discussed by Buyya, et al., a standard interface for CIaaS will allow [22]:
• Consumers to interact with cloud computing infrastructure on an ad hoc basis.
• Integrators to offer advanced management services.
• Aggregators to offer a single common interface to multiple providers.
• Providers to offer a standard interface that is compatible with the available tools.
• Vendors of clouds to offer standard interfaces for dynamically scalable service delivery in their products.

Goals
CIaaS services interfaces must support the aims of interoperability:
• Portability of workloads across clouds and cloud providers.
• Interconnectability of different cloud infrastructures and services.
• Consistent management interfaces, both human and automated.
• Interoperability between and amongst disparate cloud providers.

Scope
The perspective of the cloud subscriber (also known as the service receiver) is taken initially in this version of the ODCA Master Usage Model: CIaaS. The perspective of the cloud provider and solution providers will be added in later revisions.

Assumptions
The ODCA Master Usage Model: CIaaS is used as the initial focus to define basic services orchestration, and the ODCA usage models Service Catalog [24] and Standard Units of Measure for IaaS [15] are used as the foundation. References and usable definitions, therefore, are drawn from those usage models.
• Compatible with DMTF’s Open Virtualization Format (OVF) [25]
• Compatible with SNIA’s Cloud Data Management Interface (CDMI) [26]

General Interface Requirements
In order to dynamically and flexibly adopt and operate cloud based services, these service interfaces must be capable of integration into the cloud subscriber’s automation systems, preferably without unnecessary human involvement (also considering the workflow and approval stages).
This means that the service interfaces need to be open, so as not to limit potential cloud users by means of special license or cost limitations on the use of the interface, and to enable automated working of systems and services.
The interfaces need to interact with standardized workflows and service orchestration triggers in a consistent and predictable fashion, globally, and according to defined service qualities. This affects each of the configuration items (CI) in the Service Catalog, the ordering and operations processes, and the security and compliance of each CI.
Should any interfaces be restricted or have special licensing or other prerequisites, then the openness and adoptability of the overall cloud service is undermined. Any limitations could restrict the potential user base of the cloud service, and represent lock-in or deviation from open standards, which is a disincentive to users because it makes it more difficult to transport their services out of that cloud environment.

Additional Requirements:
• The ability to deliver compute, data and network services over a standardized interface.
• A RESTful approach.
• Extensibility for integration with and support of other XaaS cloud services.
• Act as a complete front-end to a CIaaS provider’s internal infrastructure management.
• Provide a commonly understood set of syntax, semantics and management methods.
• Coverage of the entire CIaaS life cycle.
• Extensible support for common entities and entity collections, such as [27]:
  º System Templates
  º Systems
  º Machine Configuration
  º Machine Template
  º Machine Image
  º Machine Admin
  º Machine
  º Volume, Network, Job, Meter, Event
• Maintenance and ongoing development of the service interfaces must be non-intrusive, enabling sustained operations without impact or requirement for downtime.
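One way to reason about "coverage of the entire CIaaS life cycle" is to model the steps from Section 8.2 as a small state machine and check that an interface exposes an operation for every transition. The sketch below is illustrative only: the step names come from Section 8.2, but the `Step` enum, the `ALLOWED` table, and in particular the transitions out of the optional Instantiation and Modify steps are assumptions rather than anything mandated by this usage model.

```python
from enum import Enum


class Step(Enum):
    DISCOVERY = "discovery"
    NEGOTIATION = "negotiation"
    PROVISION = "provision"
    INSTANTIATION = "instantiation"  # optional step
    USE = "use"
    MODIFY = "modify"                # optional step
    TERMINATE = "terminate"


# Assumed transition table: optional steps may be skipped, and a
# modified service is assumed to return to use or be terminated.
ALLOWED = {
    Step.DISCOVERY: {Step.NEGOTIATION},
    Step.NEGOTIATION: {Step.PROVISION},
    Step.PROVISION: {Step.INSTANTIATION, Step.USE},
    Step.INSTANTIATION: {Step.USE},
    Step.USE: {Step.MODIFY, Step.TERMINATE},
    Step.MODIFY: {Step.USE, Step.TERMINATE},
    Step.TERMINATE: set(),
}


def is_valid_path(path):
    """Return True if the sequence starts at DISCOVERY and follows ALLOWED."""
    if not path or path[0] is not Step.DISCOVERY:
        return False
    return all(nxt in ALLOWED[cur] for cur, nxt in zip(path, path[1:]))


# A lifecycle that skips instantiation and bursts once before terminating.
burst_path = [Step.DISCOVERY, Step.NEGOTIATION, Step.PROVISION,
              Step.USE, Step.MODIFY, Step.USE, Step.TERMINATE]
print(is_valid_path(burst_path))
```

A table-driven model of this kind keeps the lifecycle rules in one place, so a subscriber's automation can reject out-of-order requests before they reach a provider.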
8.4 Specific Required Interfaces Table 8.4.1 Specific Required Interfaces System Sub-System Consumer Order from Catalog Shop Navigator Reporter Cloud Portal User Manager Orchestrator Cost Calculator Technical Admin Interface Service Portal Credential Manager User Manager Token or Claim Manager Security System Privileges and Rights Manager Identity Manager Intrusion Manager Security Incident Event Manager Directory Services Service Catalog Service Configurator Consumer Catalog Rules Engine Work Package Creation Work Package Distributer Work Package Monitor Workflow System Work Package Reporter Enterprise Service Bus Rates Tables Asset Rate Tables including License Models Actual Service Consumption Tables Billing System SLA Tables Contract Tables Calculation Engine Incident Management Service Desk & Knowledge Management Problem Management Reporting Knowledge Database Asset Database IP Pool Database Service Catalog Configuration Management Configuration Templates Actual Configuration records License Pools (HW, SW, CAL) Consumption Records Work Package Deployment Engine –Compute Queue Creator - Management and Monitoring Queue Creator - Service Desk Capacity Record Creation SLA Report Triggering Orchestration Work Package Deployment Engine - Management and Monitoring Work Package Deployment Engine –Hypervisor Work Package Deployment Engine –Storage Work Package Deployment Engine –Network Hardware License Pool Software License Pool License Models License Management Allocated Licenses Available Licenses Returned Licenses
System Configuration Templates for Automated Deployments System Connectable Configuration Certificate Management Replication between High Availability Sites System Integration between Consumer & Service Provider Client Security Certificate Management Service Provider Certificate Management Incident Manager Problem Manager Event Monitor Configuration Monitor Capacity Management Management & Monitoring Availability Monitor SLA Monitor Network Monitor Storage Monitor Hypervisor Manager License Reporting and Tracking Cloud Broker Service Mapper Service Integration Module

8.5 Services Orchestration
The illustration below represents a number of interfaces. It also contains three planes: the solution provider, cloud provider and cloud subscriber planes. These contain the same elements, but in differing depth of detail. The planes are shown in no particular order, since the cloud subscriber may be federating services between various cloud service providers (including their own IT department and one or more cloud providers). Alternatively, the cloud provider themselves may federate such services from third-party providers. Therefore the planes’ relationships can vary based on the particular circumstances. What is important is that there are interfaces between all three planes that must be orchestrated and that, depending on the particular scenario, ownership and responsibility may vary.
Figure 8.5.1 Interfaces
[Figure labels: three planes (Solution Provider, Service Subscriber, Service Provider), each showing the conceptual framework elements: End User Services; SaaS Services; PaaS (Web, Data, Message); IaaS (Compute, Storage, Network); Facility (Power, Cooling, Physical Space); Application Development; Business Processes; Actionable Service Catalog (UI and API); Orchestration of End-to-End Services; Web & Data Service Interoperability; Resource Allocation; Monitoring; Remediation; Tuning & Optimization; Capacity & Performance; Application Operations; Service Management; Configuration; Integration; Service Desk; SW Delivery; OS and Apps; Event; Security.]

The service orchestration must encompass the following:
• Service discovery
• Service implementation
• Configuration
• Capacity
• Systems management and monitoring
• Ordering
• Billing
• Reporting
• Identity management

There are certain components common to each service type, as listed below. These help standardize what the cloud consumer sees for the service, orchestration, measurement, management and end-user interfaces.
• Each service will have a schema that describes the parameters associated with the service itself, specifically the required and optional attributes associated with the request for the service.
• Each service component should have a standardized SLA description associated with it. This uniquely identifies the service and describes parameters associated with the provisioning of a request for it.
• Each service should have compliant methods of calculating, reporting, and presenting standard units of measurement that describe the services and the performance of those services.
• Upon requesting a service, there will be an industry standard set of response codes for reporting on service request / provisioning status.
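The first and last of these common components can be sketched in a few lines. The example below assumes a hypothetical `ServiceSchema` with required and optional attributes and a hypothetical `RequestStatus` set of response codes. None of these names, attribute keys, or values is defined by this usage model or by any industry standard; they only show how a subscriber's automation might validate a request before submitting it and interpret a status afterwards.

```python
from dataclasses import dataclass, field
from enum import Enum


class RequestStatus(Enum):
    """Hypothetical provisioning status codes (not an industry standard)."""
    ACCEPTED = "accepted"
    IN_PROGRESS = "in_progress"
    COMPLETED = "completed"
    FAILED = "failed"


@dataclass
class ServiceSchema:
    """Hypothetical schema: attribute name -> list of allowed values."""
    service_id: str
    required: dict
    optional: dict = field(default_factory=dict)

    def validate(self, request: dict) -> list:
        """Return a list of problems; an empty list means the request is valid."""
        errors = []
        for name in self.required:
            if name not in request:
                errors.append(f"missing required attribute: {name}")
        allowed = {**self.required, **self.optional}
        for name, value in request.items():
            if name not in allowed:
                errors.append(f"unknown attribute: {name}")
            elif value not in allowed[name]:
                errors.append(f"value not allowed for {name}: {value}")
        return errors


# Illustrative schema; which attributes are required is an assumption.
schema = ServiceSchema(
    service_id="linux-vm",
    required={"memory_gb": [1, 2, 4, 8], "vcpu": [1, 2, 4, 8]},
    optional={"storage_gb": [10, 50, 100, 500, 1000]},
)

print(schema.validate({"memory_gb": 4, "vcpu": 2}))
print(schema.validate({"memory_gb": 3}))
```

Returning a list of problems rather than raising on the first one lets an ordering workflow report every issue with a request in a single round trip.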
It is worth highlighting the two core descriptions that can be applied to a service; both are equally important to usage scenarios for cloud subscribers. The first is a definition of the core attributes of a service using the common terminology described herein. The second is the set of SLA parameters that describe how quickly a service may be provisioned. The following example illustrates the latter.

8.6 Usage Scenario Example: Burst Capacity at a Specified SLA
A common provisioning status object is important. Cloud subscribers often manage their own workflow and provisioning scripts for requesting external services. These need common status codes, notifications and other semantics for actions on the cloud provider side, so that build processes can be automated in a way that is independent of the service provider.
Consider the usage scenario for “burst mode” cloud use. A cloud subscriber has a business application that is experiencing high demand. They wish to burst to an external provider and need additional capacity within 10 minutes. They may have contractual relationships set up with five cloud providers to burst in this manner. By querying the SLA response times for each cloud provider, they can choose the provider whose response time best fits their needs, rather than risk bursting from one overloaded data center to another that is suffering the same temporary capacity issues.
To work through this example, we could look at provisioning a basic virtual machine running an enterprise-grade version of Linux. First we query the service provider to see if it offers a service matching that description, and receive an SLA object associated with that service. So for our Linux VM, the SLA object might be described as below, in pseudo-code:
<Service Provider>ACME Cloud Corp</>
Each service provider must be uniquely identified.
<Service>VMware VM</> <Service>RHEL 6.1</>
A way to uniquely describe each service offering.
<Service Location></>
Locations where this service is offered.
<Service Price Model></>
Price of the service. This would be a reference to agreed pricing units.
<Service Provisioning Time>10 minutes</>
This is a key attribute: the time taken to provision from request, useful for workflow planning, etc.
This describes the provisioning SLA from the cloud provider to the cloud subscriber. A descriptive object is also needed to describe the service itself and all the attributes that are available for customization, for example:
<Service>VMware VM</> <Service>RHEL 6.1</>
<Allowed Memory>1GB, 2GB, 4GB, 8GB</>
<Allowed VCPU>1, 2, 4, 8</>
<Allowed Storage>10GB, 50GB, 100GB, 500GB, 1000GB</>
These attributes are just an example. So, we now have a service that has an SLA object that describes how a service provider will provision it, and a configuration object that describes which options a customer may request.
Assume now that the cloud subscriber places an order. They reference the service by its unique ID, and request the service using the available configuration attributes. The service provider then accepts that message and responds with a confirmation of the SLA object and a unique service request reference that allows this process to be tracked. At any point, the requestor can query the service provider with the service request reference and receive a commonly understood status code that indicates how the provisioning process is progressing and whether there are any issues.

9.0 Operations and Management

9.1 Overview
Use of a CIaaS provider requires capabilities that enable the consumer to integrate, automate, monitor, and measure the resources in a completely self-service manner while being assured of a certain level of responsiveness and consistency from the provider.
While a user interface providing access to these capabilities is always welcome, the ability to access these capabilities programmatically quickly becomes more important than a user interface. This section covers the high-level capabilities for CIaaS operations and management in the areas of integration, automation, notification, monitoring, measurement, and response.

9.2 Motivations
Integration: The ability to incorporate the operations and management workflow of a CIaaS provider into the workflow processes of cloud subscribers. This includes requirements for authentication, authorization, other access control, notifications, management, compliance/governance, change control, and service incident management functions, all of which must be able to mesh at some level with the workflow of the consumer, manual or automated.
Automation: The ability to programmatically access and automate capabilities provided by the CIaaS provider. This includes eliminating any gating human interactions during operation and management of the consumed CIaaS capabilities, as well as during any provisioning and decommissioning of CIaaS capabilities.
Notification: The ability to be notified when changes in provisioning, configuration, access control, billing, service difficulties or outages, and other activities occur. This enables the awareness, reporting, and auditing of operational activities that occur within the CIaaS provider on behalf of the consumer beyond what is detectable by monitoring.
Monitoring: The ability to inquire about and understand the status and operating characteristics of the CIaaS capabilities provided, from the general and high level (“Your capability is active”) to the specific and detailed level (“Here is how your capability is performing”). Monitoring usage scenarios provide for both polling and event-driven models for status and incident communication.
Measurement: The ability to measure the current and historical performance and cost of the provisioned CIaaS capabilities being consumed. Measurement differs from monitoring by adding the dimensions of cost and SLA reporting against monitored performance.
Response: The ability of the supplier to provide a well-understood, repeatable, reliable, and, preferably, preemptive response or interaction regarding service delivery issues, as well as the ability of both the supplier and consumer to investigate and identify service delivery issues.

9.3 CIaaS Operations Usage Scenarios
The following operations usage scenarios are presented:
• Access and control configuration
• Provisioning and deprovisioning capabilities
• SLA or service fault identification by provider
• SLA or service fault identification by consumer
• Service change or outage notification
• Service monitoring
• Subscriber billing and usage
• Business driver mapping

Assumptions
The following assumption applies to all usage scenarios described in this section:
• Facilities management is opaque to the consumer of CIaaS capabilities.

9.3.1 Usage Scenario 1 – Access and Control Configuration
Business Drivers: Integration, automation
Goal: Enable the cloud subscriber to add or remove user accounts and assign various levels of provisioning capability to them in order to promote self-service consumption of CIaaS capabilities without requiring direct interaction by subscriber operations staff.
Assumption 1: Cloud subscriber has signed up for the cloud service and the provider has provisioned an initial administration account.
Success Scenario 1: Cloud subscriber administrator is able to add a new account and assign access permissions and subscriber metadata to the account. Subscriber is able to change the permissions of an existing account.
Steps:
1. Cloud provider provides clear documentation and tools which define the scope of available configuration and customization features for configuring an account.
2. Cloud subscriber identifies specific credentials and optional metadata for a new account. Items may include:
a. Username
b. Email
c. Password
d. Metadata (such as subscriber billing code)
3. Cloud subscriber configures or changes the account’s permissions to consume, control, and monitor CIaaS capabilities provided by supplier, or changes metadata for the account.
4. Account owner is notified by email of account availability or configuration changes.
5. Account owner is immediately able to consume supplier services.
Failure Condition 1: New account cannot be created or desired permissions cannot be applied.
Failure Handling 1: Cloud subscriber administrative account is notified. Remediate according to support agreement.
Success Scenario 2: Cloud subscriber administrator removes a user account.
Steps:
1. Cloud provider provides clear documentation and tools that define the scope of available configuration and customization features for removing an account.
2. Cloud subscriber administration account removes user account.
3. User account is immediately unavailable.
4. User account may remain intact for historical purposes.
Failure Condition 1: Removal of user account fails.
Failure Handling 1: Subscriber administrative account is notified. Remediate according to support agreement.

9.3.2 Usage Scenario 2 – Provisioning and Deprovisioning Capabilities
Business Drivers: Integration, automation
Goal: Self-service provisioning and deprovisioning of CIaaS capabilities.
Assumption 1: An appropriate user account with relevant permissions has been created, configured, and authenticated against.
Success Scenario 1: User requests or removes CIaaS capability (such as a compute instance), and the capability is provided or removed per the relevant SLA.
Steps:
1. Cloud subscriber user account requests provisioning or removal of CIaaS capability.
2. Cloud provider creates resource(s) and provides access to subscriber user account, or removes resources from availability to subscriber user account.
3. Resources are made available or unavailable to user account as appropriate.
4. User account is notified of change of service.
5. Cloud provider begins or stops billing for resources as appropriate.
Failure Condition 1: Failed to create or remove capability.
Failure Handling 1: Cloud provider and cloud subscriber are notified of failure per SLA. Interfaces are used to enable clear analysis of the reason for failure.
Success Scenario 2: User changes service configuration of an already provisioned capability, such as stopping or starting a compute instance, and the change occurs to the provisioned resource per SLA.
Steps:
1. Cloud subscriber user account requests modification of provisioned CIaaS resources.
2. Cloud provider applies change to appropriate resources.
3. Resources are now modified as requested by user account.
4. User account is notified of resource modification.
5. Cloud provider modifies billing for resources as appropriate.
Failure Condition 1: Failed to modify capability.
Failure Handling 1: Cloud provider and cloud subscriber are notified of failure per SLA. Interfaces are used to enable clear analysis of the reason for failure.

9.3.3 Usage Scenario 3 – SLA or Service Fault Identification by Provider
Business Drivers: Integration, automation, monitoring, response
Goal: Clear identification and notification to subscriber of problems with capabilities that may impact subscriber, and setting expectations for remediation of problems.
Assumption 1: Subscriber and supplier have agreed upon SLA and communication plan for service and SLA exceptions.
Success Scenario 1: SLA or service fault is identified by supplier before or immediately upon the problem impacting the subscriber. Problem details, scope of impact, and remediation plan are provided to subscriber.
Steps:
1. Cloud provider identifies problems with service offering.
2. Cloud provider identifies impact to subscriber.
3. Cloud provider notifies subscriber of problem, impact or potential impact to subscriber, and remediation plan.
4. Cloud subscriber takes appropriate business action based on impact.
5. Cloud provider remediates problem.
6. Cloud provider notifies cloud subscriber of problem resolution.
7. Cloud subscriber takes appropriate business action based on resolution.

9.3.4 Usage Scenario 4 – SLA or Service Fault Identification by Subscriber
Business Drivers: Integration, automation, monitoring, response
Goal: Notify cloud provider of problems with subscribed capabilities that are impacting subscriber. Obtain expectations for remediation of problems.
Assumption 1: Cloud subscriber and cloud provider have agreed upon SLA and communication plan for service and SLA exceptions.
Success Scenario 1: SLA or service fault is identified by cloud subscriber. Problem details and scope of impact are provided to cloud provider. Cloud subscriber receives expectation for remediation of problem.
Steps:
1. Cloud subscriber identifies problems with service offering.
2. Cloud subscriber identifies and communicates impact to cloud provider.
3. Cloud provider notifies cloud subscriber of action plan for addressing the problem.
4. Cloud subscriber takes appropriate business action based on impact.
5. Cloud provider remediates problem.
6. Cloud provider notifies cloud subscriber of problem resolution.
7. Cloud subscriber takes appropriate business action based on resolution.
9.3.5 Usage Scenario 5 – Service Change or Outage Notification
Business Drivers: Integration, monitoring, response
Goal: Provide cloud subscriber with sufficient notice of cloud provider service changes or outages that may affect cloud subscriber.
Assumption 1: Cloud subscriber and cloud provider have agreed upon SLA and communication plan for service and SLA exceptions.
Success Scenario 1: Cloud provider provides notice to cloud subscriber of planned service change or outage. Cloud subscriber has sufficient time to prepare business for service change event or outage. Cloud provider service change or outage proceeds as planned with expected impact to cloud subscriber.
Steps:
1. Cloud provider notifies cloud subscriber of planned service change or outage indicating, at a minimum:
a. Date and time to occur
b. Duration of event
c. Expected impact to subscriber
d. Rollback plan if event fails
e. Escalation path for expected and unexpected impacts to subscriber during an event
2. Cloud subscriber prepares business for planned event.
3. Cloud provider proceeds with planned event.
4. During event, cloud subscriber evaluates impact on business and escalates issues per Usage Scenario 4.
5. Cloud provider completes planned event.
a. If not successful, service capabilities are restored to previous state for cloud subscriber.
6. Cloud provider notifies subscriber of event outcome.

9.3.6 Usage Scenario 6 – Service Monitoring
Business Drivers: Monitoring, response
Goal: Cloud provider and cloud subscriber are able to monitor in detail the state of resources provided by the cloud provider to assure operations and identify or debug potential problems with subscribed resources.
Assumption 1: Cloud subscriber and cloud provider have agreed upon SLA and communication plan for service and SLA exceptions.
Success Scenario 1: Cloud subscriber may access internal and external state of the subscribed resource.
Internal and external state can be captured and communicated between cloud subscriber and cloud provider.
Steps:
1. Cloud provider provides documentation on accessing and monitoring internal and external resource state as appropriate. This may include, but is not limited to:
a. Overall health, state, and history of subscribed resources.
b. Key exceptions or events occurring to a subscribed resource.
c. Detailed activity logs for each subscribed resource.
2. Cloud subscriber accesses internal and external resource state as needed.
3. As needed, cloud subscriber or cloud provider captures and communicates resource state to the other party.
Failure Condition 1: External or internal state of resource not available.
Failure Handling 1: Cloud subscriber notifies cloud provider to remedy.
Success Scenario 2: Cloud provider automatically monitors internal and external state of subscriber resources, and notifies cloud subscriber of potential problems as they are discovered.
Steps:
1. Cloud provider monitors internal and external state of resource for potential problems.
2. If cloud provider action is deemed necessary, cloud subscriber is notified per Usage Scenario 5.
3. If cloud subscriber action may be indicated, provider notifies subscriber of monitoring discovery and makes internal and external state of the subscribed resource available to subscriber.
4. Cloud subscriber determines if action is required and acts accordingly.
5. As needed, cloud subscriber or cloud provider captures and communicates resource state to the other party.
Failure Condition 1: Cloud provider is unable to monitor cloud subscriber resources.
Failure Handling 1: Cloud provider notifies cloud subscriber of inability to monitor and expected remediation.
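The two success scenarios above correspond to the polling and event-driven monitoring models mentioned in Section 9.2. The sketch below shows both on one hypothetical in-memory `ResourceMonitor`: a subscriber can poll the last known state of a resource, or register a listener that is called only when the state changes. The class, its method names, and the state strings are assumptions for illustration; a real provider would expose equivalent operations through its own documented interface.

```python
from typing import Callable, Dict, List

# A listener receives (resource_id, old_state, new_state).
Listener = Callable[[str, str, str], None]


class ResourceMonitor:
    """Hypothetical monitor supporting polling and change notifications."""

    def __init__(self) -> None:
        self._state: Dict[str, str] = {}
        self._listeners: List[Listener] = []

    def subscribe(self, listener: Listener) -> None:
        """Event-driven model: register a callback for state changes."""
        self._listeners.append(listener)

    def poll(self, resource_id: str) -> str:
        """Polling model: return the last recorded state, or 'unknown'."""
        return self._state.get(resource_id, "unknown")

    def record(self, resource_id: str, new_state: str) -> None:
        """Store an observation and notify listeners only on a change."""
        old_state = self.poll(resource_id)
        self._state[resource_id] = new_state
        if new_state != old_state:
            for listener in self._listeners:
                listener(resource_id, old_state, new_state)


events = []
monitor = ResourceMonitor()
monitor.subscribe(lambda rid, old, new: events.append((rid, old, new)))

monitor.record("vm-1", "healthy")
monitor.record("vm-1", "healthy")    # unchanged, so no event is emitted
monitor.record("vm-1", "degraded")

print(monitor.poll("vm-1"))
print(events)
```

Emitting events only on a change of state keeps notification volume low, while polling remains available for subscribers who need to confirm state on their own schedule.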
9.3.7 Usage Scenario 7 – Subscriber Billing and Usage
Business Drivers: Monitoring, response
Goal: Cloud subscriber is able to access up-to-date resource billing and usage data in a self-service manner per SLA and commercial terms. Cloud provider is able to bill subscriber at agreed-upon periodicity for consumed resources per commercial terms.
Assumption 1: Cloud subscriber and cloud provider have agreed upon SLA and commercial terms.
Success Scenario 1: Cloud subscriber accesses current and historical billing and usage information.
Steps:
1. Cloud subscriber authenticates with cloud provider using an account with access permissions to request usage and billing information.
2. Cloud subscriber accesses billing and usage information as required.
Failure Condition 1: Billing or usage information not accessible.
Failure Handling 1: Cloud subscriber notifies supplier to remedy.
Success Scenario 2: Cloud provider bills cloud subscriber per agreed commercial terms. Cloud subscriber pays supplier.
Steps:
1. Cloud provider calculates appropriate billing for cloud subscriber consumed resources per commercial terms.
2. Cloud provider delivers bill to cloud subscriber for payment.
3. Cloud subscriber pays bill per commercial terms.
4. Cloud provider provides escalation path for billing discrepancies.
5. Cloud provider retains history of usage and billing for cloud subscriber.

9.4 Operations and Management Service Tiering
Table 9.4.1 Operations and Management Service Tiering
Requirement (O – Optional, M – Mandatory) | Bronze | Silver | Gold | Platinum
S.I.1 Formal documentation of all service interfaces, GUIs and command lines in cloud provider’s choice of documentation standard. This documentation must provide sufficient detail to permit cloud subscriber to operate and manage the usage scenarios above. | M | M | M | M
S.I.2 As S.I.1, but with a formally documented programmatic web-service interface in cloud provider’s choice of standard, permitting cloud subscriber to interconnect via programs using web-service API calls. | O | M | M | M
S.I.3 As S.I.2, but with provision to replicate security credentials (provide a duplicate set of username, password, or group membership credentials with a regular synchronization process) to improve user sign-on experience. | O | M | M | M
S.I.4 As S.I.3, but with provision to fully federate security credentials to make a seamless single-sign-on capability. | O | O | O | M
S.I.5 As S.I.4, but formally documenting the service interface in a recognized industry standard format fully aligned with the concepts included in the ODCA Usage Models Standard Units of Measure for IaaS [15] and Service Catalog [24]. | O | O | O | M
S.I.6 As S.I.5, but with the ability to support fully-automated service interface hookup and reconnection, orchestration and interconnection at large scale. | O | O | M | M
S.I.7 As S.I.6, but demonstrating the ability to interconnect seamlessly between multiple cloud providers. | O | O | O | M

10.0 Technical Architecture

10.1 Assumptions and Context
This section contains the content for the technical architecture requirements for CIaaS. Wherever possible, the content is both technology and technology generation agnostic.
The service selection and provisioning process is assumed to be complete to the point where the cloud subscriber is entitled to create and manage one or more containers. Service requests can originate directly from an individual in a cloud subscriber organization via a web portal, or may originate from some sort of orchestration layer.

10.2 Components
The figure below represents a logical description of the components in scope. The components collectively make up the compute container. This compute container is likely to be a virtual machine, but this is not mandated.
Figure 10.2.1 CIaaS Architecture Components
[Figure: Management; Network Fabric(s); Compute Container; Storage Fabric(s); Storage Layer (Block Storage, Object Storage)]

10.3 Compute Layer
The compute container, or compute instance, must be capable of hosting a subscriber-specified OS image. This image can be supplied by the service consumer, the service provider, or a third party. We do not mandate that this has to be a virtual machine. The container is specified in units described below and has attributes such as:
• CPU execution threads (seen as the number of host CPUs by the guest operating system)
• Allocated memory (in gigabytes)
Required Services/Attributes:
• Ability to start, stop and suspend at any arbitrary point in time.
• Portability (see also ODCA Usage Model: Long Distance Workload Migration [10])
• Must be able to query the state of the container: created, non-existent, started, stopped, suspended.
• Must be able to query the allocated compute resources (CPU and allocated memory).
Recommended Services/Attributes:
• Ability to query the over-subscription limits of CPU and memory resources.
• Ability to query the over-subscription status or contention of CPU and memory resources.
Optional Services/Attributes:
• Report on the geographical and jurisdictional location of the container. In practice, geographical location considerations apply mostly to data where there are regulatory requirements. Sometimes they also apply to the compute container as well as the data. Lastly, it may be necessary to manage a compute container’s location in order to keep it close to its data stores for performance and latency reasons. While included here as “optional,” this will be required for some industries such as capital markets.
• A standard unit of measure for physical location.
This will be especially important for highly regulated industries, or where cloud subscribers are sensitive to European privacy requirements and/or USA PATRIOT Act implications. The most relevant location information is the jurisdiction; while it generally equals a geographic location, it is not always the same. Therefore, the unit to use must refer to the jurisdiction. The ISO 3166-1 country code refers to the jurisdiction and can be used to specify where the container or storage is located.
• The ability to scale CPU and memory independently. In most scenarios, service providers want to provide compute containers in fixed sizes, as this simplifies capacity management, provisioning and charging. A service provider can optionally provide the ability to scale CPU and memory independently of each other. This capability might command a price premium. The SLA between cloud subscriber and provider may specify how additional capacity will be provided (for example, horizontal or vertical scaling).

10.3.1 Service Tier Summary: Compute Instance Attributes
Compute Instance Attributes (O = Optional, M = Mandatory)
Attribute | Bronze | Silver | Gold | Platinum
Ability to start, stop, suspend | M | M | M | M
Portability | M | M | M | M
Query container state | M | M | M | M
Query compute resources | M | M | M | M
Query over-subscription limits | O | M | M | M
Query over-subscription status | O | M | M | M
Container location reporting | O | O | O | M
Standard physical location unit | O | O | O | M
Independent CPU & memory scaling | O | O | O | M

10.4 Storage Layer
10.4.1 Block Storage Requirements
A container or virtual machine must have at least one of non-persistent or persistent block storage. It may have both.
• Non-persistent storage is not persisted across compute container power cycle events but is persisted across machine image restarts, that is, a reboot of the hosted operating system.
• Persistent storage is retained until explicitly deleted or destroyed.
• Storage can exist and be provisioned independently of a compute container.
• Multiple storage allocations can have independent service levels and attributes.
• Block storage allocations can be dedicated to a single machine container or boot device, or can be mapped to other compute containers (for example, in clusters), not necessarily for simultaneous access.

10.4.1.1 Service Tier Summary: Block Storage Attributes
Block Storage Attributes (O = Optional, M = Mandatory)
Attribute | Bronze | Silver | Gold | Platinum
Provides block storage | M | M | M | M
Non-persistent storage persisted across image restarts | M | M | M | M
Persistent storage retained until deleted/destroyed | M | M | M | M
Storage provisionable independent of container | M | M | M | M
Multiple allocations with independent service tiers | M | M | M | M
Configurable mapping to compute instances | M | M | M | M

10.4.2 Object Storage Requirements
Object storage is out of scope of the ODCA Master Usage Model: Compute Infrastructure as a Service (CIaaS).
10.4.3 Storage Fabric
Storage fabrics are fabrics that link components. Fabrics are logical constructs that map onto physical constructs. Fabrics provide connectivity and host higher-level services. Note that data protection requirements must still be met. A fabric spanning several physical devices must still allow for the same level of access protection as a single physical device.
The storage fabric layer may include one or more technologies implemented and managed by the service provider. Examples include Fibre Channel and iSCSI.
Required Services
Services that must be supported within the storage layer include:
• Persistence
• Ability to query service levels, attributes and capabilities
• Isolation
• Access control
Recommended Services
Services that should be provided but are not mandatory include:
• Reliable isolation (linked to access control). Note that this is mandatory for gold and platinum levels. The isolation need not be physical if logical isolation provides sufficient reliability.
• Identification (access control)
Optional Services
Optional services provided within the storage layer provide additional capabilities beyond basic transport and persistence. In general, capabilities fall into the following categories:
• Data availability (replication, mirroring, snapshot and so on)
• Data security and encryption
• Capacity optimization (compression, de-duplication, thin provisioning, and so on)
  • De-duplication can be a security issue. In security-critical environments, it should be explicitly mentioned in the SLA. A cloud subscriber must be able to opt out of it (in gold and platinum tiers).
  • Thin provisioning can be a security issue. It must be explicitly stated in the SLA and the subscriber must be able to opt out of it (in gold and platinum tiers).
• Migration capabilities
It is highly desirable that some mechanism exists to query what optional services have been applied to a specific storage allocation. For example, if a storage service offers in situ compression, the service consumer may wish to know this, as it might make compression of storage objects further up the stack redundant. Conversely, the provider wants to know whether compression makes sense at all (it does not if the subscriber encrypts all data before storage).
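The query mechanism described above can be illustrated with a minimal Python sketch. It is not a defined ODCA interface: the `StorageAllocation` record, the service names, and the three helper functions are hypothetical names introduced here to show how a subscriber and a provider might act on the answers.

```python
from dataclasses import dataclass
from typing import FrozenSet

# Hypothetical record describing one storage allocation, as a provider's
# query interface might return it. Field and service names are assumptions.
@dataclass(frozen=True)
class StorageAllocation:
    allocation_id: str
    applied_services: FrozenSet[str] = frozenset()

# Per this section, subscribers must be able to opt out of these services
# in the gold and platinum tiers.
OPT_OUT_SERVICES = frozenset({"de-duplication", "thin provisioning"})
OPT_OUT_TIERS = frozenset({"gold", "platinum"})

def subscriber_compression_redundant(allocation: StorageAllocation) -> bool:
    """Compression further up the stack may be redundant when the
    storage service already compresses in situ."""
    return "compression" in allocation.applied_services

def provider_compression_worthwhile(subscriber_encrypts_before_storage: bool) -> bool:
    """Provider-side compression is pointless when the subscriber
    encrypts all data before storage."""
    return not subscriber_encrypts_before_storage

def opt_out_required(tier: str, service: str) -> bool:
    """True when the provider must let the subscriber opt out of the service."""
    return tier in OPT_OUT_TIERS and service in OPT_OUT_SERVICES
```

For example, a subscriber holding an allocation whose `applied_services` contains `"compression"` could skip its own compression step, while a provider told that the subscriber encrypts before storage could disable compression for that allocation.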
10.4.3.1 Service Tier Summary: Storage Fabric Attributes
Storage Fabric Attributes (O = Optional, M = Mandatory)
Attribute | Bronze | Silver | Gold | Platinum
Persistence | M | M | M | M
Query service details | M | M | M | M
Isolation | M | M | M | M
Access control | M | M | M | M
Reliable isolation | O | M | M | M
Identification | O | M | M | M
Query optional services | O | O | O | M

10.5 Network Fabrics
Like storage fabrics, network fabrics are fabrics that link components. Fabrics are logical constructs that map onto physical constructs. Fabrics provide connectivity and host higher-level services.
An external network fabric comprises one or more network connections presented to the machine container and provides data transport between end points. End points are machines or service consumers, such that a transport may be machine-to-machine or machine-to-cloud subscriber or broker. External network fabrics are likely to be IP-based fabrics, but this is not mandated. The cloud provider can optionally provide higher-level capabilities, such as load balancers, in order to meet the desired service characteristics.
Providers need scalability, rightsizing, and quality of service (QoS), and cloud subscribers need the ease and familiarity of simple Ethernet networking. Network virtualization provides this. It is truly virtualization in that it provides a clean abstraction layer that creates a separation of concerns between the cloud provider and cloud subscriber. [28]
Network virtualization is a critical enabler of large cloud data centers. It simplifies networking for all, and avoids complex bolt-on technologies which are controlled by network administrators today. Best of all, it plays inherently to the strengths of commodity systems.
Instead of buying increasingly expensive networking equipment and appliances, it is much less expensive to scale out using L3 networking techniques on inexpensive equipment, and then use the abstraction layers to hide it all beneath. [28] Those characteristics align virtual networking well with the Alliance vision and its members’ strategic cloud requirements.
Required Services/Attributes:
• Transport services: bandwidth, latency, QoS, bursting (scale up and down)
• Access control: define access in terms of who can access the container
Recommended Services/Attributes:
• Firewall management by the service consumer. The service provider will have a mandatory set of configuration and firewall rules they will apply to the deployment.
• Per-machine or per-tenant container SLAs
Optional Services/Attributes:
• Load balancers: mask servers and applications; extend performance for applications.
• Resilience/redundancy services: bandwidth on demand, transport path protection
• VPN services: IPsec-based VPN (secure tunnel for VPN connection through the Internet)
• Security service: ability to protect a server based upon ports and protocol
• Monitoring service: ability to monitor on demand (packet-level capture, conversation, and so on)
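The services listed above, combined with the tier assignments in the 10.5.1 tier summary, lend themselves to a simple lookup. The following Python sketch is illustrative only; the names `NETWORK_FABRIC_ATTRIBUTES`, `mandatory_attributes`, and `missing_mandatory` are hypothetical and not part of any ODCA interface. Each row string holds the Bronze, Silver, Gold, and Platinum entries in that order.

```python
from typing import List, Set

TIERS = ("bronze", "silver", "gold", "platinum")

# One entry per row of the 10.5.1 tier summary.
# "M" = mandatory, "O" = optional, in TIERS order.
NETWORK_FABRIC_ATTRIBUTES = {
    "transport services": "MMMM",
    "access control": "MMMM",
    "firewall management": "OMMM",
    "per-container SLAs": "OMMM",
    "load balancers": "OOOM",
    "resilience services": "OOOM",
    "VPN services": "OOOM",
    "monitoring services": "OOOM",
}

def mandatory_attributes(tier: str) -> List[str]:
    """Return the attributes that are mandatory at the given tier."""
    column = TIERS.index(tier)
    return [name for name, row in NETWORK_FABRIC_ATTRIBUTES.items()
            if row[column] == "M"]

def missing_mandatory(tier: str, offered: Set[str]) -> List[str]:
    """Return mandatory attributes for the tier that an offering lacks."""
    return [name for name in mandatory_attributes(tier) if name not in offered]
```

A subscriber evaluating a silver-tier offering that lists only transport services and access control would see firewall management and per-container SLAs reported as missing.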
10.5.1 Service Tier Summary: Network Fabric Attributes
Network Fabric Attributes (O = Optional, M = Mandatory)
Attribute | Bronze | Silver | Gold | Platinum
Transport services | M | M | M | M
Access control | M | M | M | M
Firewall management | O | M | M | M
Per-container SLAs | O | M | M | M
Load balancers | O | O | O | M
Resilience services | O | O | O | M
VPN services | O | O | O | M
Monitoring services | O | O | O | M

10.6 Management
The management layer includes:
• Resource and configuration management
• Resource pooling
• Resource allocation
• APIs: standards-based, proprietary
• Resource state management: control and state reporting
• Resource performance monitoring and usage metering
• Resource security: access control, grouping/mapping
In order to achieve a consistent approach across service providers, a federated view of system management information is required. This layer aggregates the management, monitoring and reporting capabilities against the service attributes. A purpose of this layer is to provide a consistent information format within and across service providers, using the service interfaces described elsewhere herein.
Mandatory supported interfaces
The cloud subscriber requires a set of interfaces to manage the compute infrastructure. There need to be interfaces to perform the full range of management tasks required, as well as to instrument and view the operation of the infrastructure elements. All of these are required for all service tiers.
• Service discovery
• Orchestration (service implementation)
• Configuration
• Capacity
• Systems management and monitoring
• Ordering
• Billing
• Reporting
• Identity management

10.6.1 Service Tier Summary: Management
Management (O = Optional, M = Mandatory)
Capability | Bronze | Silver | Gold | Platinum
All service orchestration capabilities within a single cloud | M | M | M | M
All service orchestration capabilities across all clouds of a single cloud provider | O | O | M | M
All service orchestration capabilities across disparate cloud providers | O | O | O | M
See also the separate section Service Interface and Reference Model in this document for more information.

11.0 Security Considerations
Note: All references to the platinum service tier are under the assumption that the definition in the ODCA Usage Model: Provider Assurance [19] will be updated to reflect the NIST definition.
11.1 Security Requirements
The following requirements align with the ODCA Usage Model: Provider Assurance [19]:
Antivirus, Malware, and Rootkit with definition updates within 24 hours. Protect against typical attacks such as “Blue Pill” and some low-level network attacks, such as spanning-tree attacks and others. This is mandatory even at the bronze tier, as this is an integral part of the infrastructure. This is relevant for parts of the cloud control infrastructure and would be solely the responsibility of the cloud provider.
Vulnerability management process exists and is fully tested to ensure no impact to target and applications: This is required only for some platform components, such as infrastructure, hypervisor, and storage. The cloud provider should provide evidence to the cloud subscriber about the current status of the platform (reporting details are dependent on the service tier).
Network and firewall isolation of cloud subscriber systems with management: This is mandatory for all service tiers. Isolation should, however, be automatically assured by the platform.
The cloud subscriber may further segregate its network and define firewall rules. This can be different for different SLA tiers. The firewall rule set should be managed by the cloud subscriber at gold and platinum levels and by the cloud provider at bronze and silver levels. The cloud provider will ensure a secure default rule set.
Physical access control into cloud data center: Required for all tiers. Different data center types are possible for different service tiers (1, 2, 3, and 4).
Secure protocols used for remote administration, such as SSL, SSH and RDP: Required for all service tiers.
All default passwords and guest access removed: Required for all service tiers.
Mandatory use of non-disclosure agreements (NDAs) for cloud provider staff: Required for all service tiers. This will also include a data privacy agreement in the EU, for example.
Mandatory alignment with Information Technology Infrastructure Library (ITIL) processes for change, incident and configuration management: Required for all service tiers.
Identity management for subscriber assets: Only applicable for cloud management API/services. Authentication mechanisms for the bronze, silver, gold and platinum tiers are as described in the ODCA Usage Model: Provider Assurance [19].
Data retention and deletion management: Same as described in ODCA Usage Model: Provider Assurance [19], plus end and delete cloud service (ODCA Master Usage Model: Service Orchestration [21]).
Security incident and event monitoring: Limited to cloud provider’s infrastructure, below the operating system. Mechanisms for the bronze, silver, gold and platinum tiers are as described in the ODCA Usage Model: Provider Assurance [19].
Network Intrusion Prevention (NIPS): In IaaS, the NIPS is only required at the physical layer, network and/or in the hypervisor and not in the VM layer.
As such, this is the responsibility of the cloud provider.
Event logging for all administration-level events: Mandatory for all service tiers on the cloud infrastructure. Should be implemented as defined in the ODCA Usage Model: Provider Assurance [19] for cloud subscriber administrative events, such as via the cloud management API service. [19]
Four-Eye Principle for key administrative changes: This should be implemented as defined in the ODCA Usage Model: Provider Assurance [19]. A key administrative change is defined as any change the cloud provider makes which may impact one or more cloud subscribers’ service or security.
Cloud provider has an implemented and tested continuity plan: Should be implemented as defined in the ODCA Usage Model: Provider Assurance [19], and needs to address technology, processes and personnel.
Fully documented and controlled network: This should be implemented as defined in the ODCA Usage Model: Provider Assurance [19]. High-level documentation and a network map should be mandatory. Detailed documentation should exist containing details about the operational procedures of the security certification and processes, as well as information about cloud management infrastructure security, such as a customer security concept. The level of detail for this documentation may vary by service tier.
Where custom code is used, systems must be developed using a secure software development lifecycle coding standard [29]: For CIaaS, most of the software will be COTS software, so the cloud provider has limited control over the software. A defined engineering process should exist for the design, implementation and development of the cloud infrastructure on the cloud provider side. This process must include security.
Option to perform penetration testing on hosted systems and applications: Any penetration testing conducted by the provider or by the subscriber must be known in advance to all parties involved.
The subscriber may penetration test only the VM components (compute instance), and not directly other components such as LUNs or underlying physical network devices, even if purchasing those services. Only the cloud provider may pen test other components, including management components such as service catalog and service orchestration, and so on. On request, the cloud provider must supply either the results or certification of testing to the cloud subscriber under NDA.
Physical segmentation of hardware, such as server, storage, network, and so on, to ensure isolation from all other systems: Should be implemented as defined in the ODCA Usage Model: Provider Assurance [19].
Encrypted communication between cloud provider and cloud subscriber: This would be implemented as defined in the ODCA Usage Model: Provider Assurance [19]. This will most likely be implemented for external private clouds independent of the service tier, such as through site-to-site VPNs.
Multi-factor authentication: Should be implemented as defined in the ODCA Usage Model: Provider Assurance [19]. Different methods may be necessary for the cloud infrastructure itself, as well as the cloud management API which is also available to the cloud subscriber.
Ability for cloud subscriber to define geographic limits for hosting: Should be implemented as defined in the ODCA Usage Model: Provider Assurance [19].
Storage encryption at logical unit number (LUN) level: Should be implemented as defined in the ODCA Usage Model: Provider Assurance [19]. Technical limitations may currently exist which make this difficult in some scenarios. In that case, the cloud subscriber must be able to encrypt the file system of the individual VM.
No administrative access for cloud provider staff: Relevant to the VM content only.
Cloud provider has administrative access to all other components, such as for migration and maintenance. Platinum service tier: See general assumption at the beginning of this section.
Strong encryption is mandatory for all data in flight and at rest: For the VMs and persistent data, this is the responsibility of the cloud subscriber. For the cloud management API, this is already covered by the secure protocols requirement. For non-persistent data, for example in flight or inside the provider’s network devices/storage, the requirements are as defined in the ODCA Usage Model: Provider Assurance [19]. Platinum service tier: See general assumption at the beginning of this section.
In addition to these security requirements, the cloud provider may provide additional value-added services, such as preconfigured security appliance VMs, virtual HSMs, and vTPMs that can be used by the cloud provider.

11.2 Implementation Guidelines:
11.2.1 Assumptions:
Bronze and silver service tiers can be hosted on the same hardware and infrastructure software, with logical separation.
Gold can be hosted on the same hardware and infrastructure only with other gold-level customers. No sharing of hardware and infrastructure between gold and silver or bronze.
Platinum has separate hardware and infrastructure software for each cloud subscriber as follows:
• Cloud provider has no administrative access at all: this is “unmanaged.”
• Cloud provider has administrative access for the hardware and infrastructure software, or “managed.”
11.2.2 General Guidelines:
Logical separation (access control) and service hardening are applied to all components.
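Logical separation operates within the hosting boundaries set by the assumptions in 11.2.1, which reduce to a small placement rule. The following Python sketch is illustrative only: the `Tenant` record and the `may_share_hardware` function are hypothetical names, and treating two platinum workloads of the same subscriber as co-locatable is an assumption that this document does not state explicitly.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Tenant:
    subscriber: str
    tier: str  # "bronze", "silver", "gold", or "platinum"

# Bronze and silver may share hardware with each other; gold only with gold.
_SHARED_POOL = {
    "bronze": "bronze/silver",
    "silver": "bronze/silver",
    "gold": "gold",
}

def may_share_hardware(a: Tenant, b: Tenant) -> bool:
    """Apply the 11.2.1 hosting assumptions to a pair of tenants."""
    if a.tier == "platinum" or b.tier == "platinum":
        # Platinum: separate hardware and infrastructure per subscriber.
        # Assumption: one subscriber's own platinum workloads may co-reside.
        return a.tier == b.tier and a.subscriber == b.subscriber
    return _SHARED_POOL[a.tier] == _SHARED_POOL[b.tier]
```

A scheduler could call such a check before placing a new workload on a host that already carries other tenants.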
Identity provisioning (life cycle) will be implemented according to the usage scenarios in the ODCA Usage Model: Cloud Based Identity Provisioning [30].
Privileged access for administrative, service provisioning, audit, reporting services, and so on will be implemented according to the usage scenarios in the ODCA Usage Model: Infrastructure as a Service (IaaS) Privileged User Access [31].
SIEM resources for personnel, management consoles, and so on may be shared across all service tiers; network probes (NIPS/NIDS, etc.) must be separate for bronze/silver, gold, and platinum levels. Bronze and silver may share network probes.
Network Security: Layer 2 and Layer 3 mitigations should be implemented for every physical and logical firewall, router, and switch for storage and the network.

11.2.3 Service Tier Specific Guidelines:
Bronze and Silver:
VM: Can share the same hardware and hypervisor instance with other VMs.
Storage: All LUNs can share the same hardware, such as switches, gateways, spindles, and SSDs.
Network: Shared hardware, physical and virtual. Different VLANs are configured for each customer.
Virtual firewall: Shared by all customers and managed by the cloud provider.
Management tools: Can share the same instance.
Gold:
VM: Can share the same hardware and hypervisor instance only with VMs at the gold service tier.
Storage: Can share the same hardware, such as switches, gateways, spindles, and SSDs, only with other gold-level subscribers.
Network: Shared hardware, physical and virtual, only with gold-level subscribers. Different VLANs are configured for each customer.
Virtual firewall: Every customer manages their own virtual firewall instance.
Management tools: Shared only with other gold-level subscribers.
Platinum:
VM: Separate hardware and hypervisor instance for every customer.
Storage: Separate hardware for switches, gateways, spindles, SSDs, and so on for every customer.
Network: Separate hardware for every customer.
Management tools: Separate instance for every customer.

11.3 Security Service Catalog
For the basic elements of the service catalog, see also the ODCA Master Usage Model: Service Orchestration [21]. The more specific security attributes are covered here.
For service tiers gold and platinum, the provider’s service catalog must list where the services are located geographically and jurisdictionally, in order for the subscriber to comply with regulatory restrictions on location.
For service tier gold, the catalog entry for backup service must show whether key management for backup tapes can be done by the subscriber. For platinum level, key management for backup tapes must be under subscriber governance; the service catalog shall show whether backup tapes can be delivered to the subscriber for storage on its own premises.

11.4 Malware Protection
Malicious software (malware) represents one of the more significant threats to cloud computing, specifically due to the rate at which an infected system may be able to propagate malware throughout a highly virtualized and dense compute pool. This could lead to a denial of service for legitimate traffic caused by these malware-infected hosts either attempting to further propagate or launching attacks on other services.
Malware in the compute hypervisor layer is not as prolific as in the VM layer. This is due to the limited attack surface. However, the spread of malware may cause service outages of the underlying infrastructure due to network or I/O contention.
The ODCA Usage Model: Provider Assurance [19] recommends malware and antivirus protection at all levels of security from bronze through platinum.
However, this may be difficult to achieve at the compute or hypervisor layer, as malware and antivirus platforms and products do not typically cover this segment of the stack.
As a consequence, controls must be applied to protect the underlying infrastructure from attack. Firewalls are already mandatory at all tiers of security. This provides basic protection to the underlying infrastructure. At the silver level and above, intrusion prevention systems must apply appropriate signatures to detect and prevent signature-based attacks. Furthermore, the cloud provider must assure that they provide appropriate segregation and protection for their clients in the event of a malware outbreak in another subscriber’s environment.
Malware protection, such as web application firewalls, should be deployed to protect publicly available management services. Underlying hypervisor and back-end infrastructure should provide a means to ensure the integrity of the platform.

11.5 Admission Control
Every management interface that is exposed to the cloud subscriber must at least fulfill the security requirements defined by the bronze, silver, gold, and platinum service tiers. In addition, exposure of these interfaces is different for public and private clouds, and may require additional security measures, for example when an interface is publicly exposed. User management for these interfaces is defined in the ODCA Usage Model: Infrastructure as a Service (IaaS) Privileged User Access [31], and must follow the appropriate assurance level definitions. Necessary authentication mechanisms for bronze, silver, gold, and platinum levels are further described in the ODCA Usage Model: Provider Assurance [19]. This section specifies the requirements for these interfaces in more detail.
Interface types:
Service Portal (Web based):
Bronze: Secure protocols, minimal password complexity checks (such as no forced renewals or no reuse checks), public interface, shared system.
Silver: Secure protocols, best-practice password complexity rules, optional soft-token-based two-factor authentication, virtual private interface, shared system.
Gold: Secure protocols, two-factor authentication and virtual private or private interface, such as a dedicated, direct connection.
Platinum: Secure protocols, two or more factor authentication (including biometrics), private interface, such as a dedicated, direct connection.
The system must support SAML to support federation between the cloud provider and the cloud subscriber systems, as well as internally. The service portal in this instance provisions and deprovisions VMs, extends storage, memory, and other devices, and provides billing, incident reporting, and service metrics that aren’t polled via APIs.
Cloud Management APIs: Same as above. All web service-based interfaces must use XML security or mutual authentication between peers. Authentication and authorization should be manageable through the cloud service portal.
Virtual System Access: SSH, RDP, or other access to your hosts. The cloud subscriber should use secure protocol controls such as SSH and RDP when accessing their virtual machines.
VM Service Console: Same as above for platinum as well as for non-secure protocols or protocols without strong encryption. A bastion host must be used to access the service console. The bastion must support the logging of authentication events, and it should support keystroke logging. Access to the service console must use the same IDM backend, and should be integrated via SSO.
Other Interfaces: All other interfaces that cannot be integrated into the cloud service portal must support the same security level as the service portal.
Access to this service must use the same IDM backend, and should be integrated via SSO.

11.6 Security Audit and Governance
Effective and efficient security audit and governance requires a proactive focus from both the subscriber and the provider of cloud services, irrespective of sector, which includes:
• Management that is accountable and responsible, and that fully supports, guides, and mandates IT governance covering personnel, processes, procedures, systems, technologies, networks, and information.
• Security is viewed as a business requirement aligned with strategic goals, enterprise objectives, risk management plans, compliance requirements and policies.
• Risk-based decisions for risk management, regulatory obligations, and commerciality, where a comprehensive risk assessment is undertaken, followed by a risk management plan.
• Security requirements are implemented via policies and procedures.
• Staff with access to information are aware and trained. They understand their daily responsibilities to protect and preserve the confidentiality, integrity, and availability of the information. Security awareness training is conducted routinely and consistently.
• Ensure compliance with applicable regulatory requirements (such as privacy, security, business continuity), and cloud subscriber policy requirements. This compliance should be auditable and consistent upon application.
See the ODCA Usage Model: Regulatory Framework [13], which provides a process flow for engaging with regulators and managing requirements associated with governance and compliance through the cloud service lifecycle.
11.6.1 Security Governance Usage Scenario
The following usage model demonstrates how security audit and governance can be achieved in a CIaaS based on the ODCA Usage Model: Regulatory Framework [13].
Actors: Cloud subscriber, cloud provider, regulator, external auditor
Goals:
1. To ensure that cloud subscribers have the ability to efficiently assess the implementation of their own policies and local, federal, international, and industry regulatory obligations, using a standardized, repeatable approach when engaging and acquiring services from cloud providers.
2. To ensure that cloud subscribers can mandate or specify key requirements of their own policies and regulatory obligations to be met by cloud providers for the industry verticals they wish to service.
3. To ensure that cloud providers can efficiently demonstrate their ability to meet cloud subscriber policies and local, federal, international and industry regulatory obligations from geographical, jurisdictional and industry perspectives in an auditable manner.
Considerations:
1. Assumes primary industry, local, federal and international regulators, regulatory obligations, and standards can be readily identified, noting that there will be some necessary ongoing monitoring and maintenance of regulatory requirements.
2. Ultimately, the onus is on the cloud subscriber to ensure compliance with its own policies and all geographical and industry-based regulation.
Success Scenario 1: The cloud provider shall efficiently demonstrate compliance with applicable geographical and industry-based regulations for their cloud subscribers’ needs. This compliance should be auditable and consistent upon application. The cloud provider is able to deploy and adhere to changes and new regulatory requirements with a minimum impact to existing cloud subscribers.
The cloud subscriber or cloud provider is notified of any material changes to regulations, laws and compliance requirements applicable to their own policies or geographic location and industry in a formal and timely manner so that the partnership of the cloud subscriber and cloud provider can agree what service changes, if any, are needed to become compliant with the material changes in regulation or law. Copyright © 2012 Open Data Center Alliance, Inc. ALL RIGHTS RESERVED. 59 Open Data Center Alliance: Compute Infrastructure as a Service Rev 1.0 Note: Industry efficiency will be improved if cloud providers have a legal statement that cloud subscribers can rely on with respect to compliance with specific local and national laws and regulations. Failure Condition 1: The cloud provider is unable to demonstrate or maintain the applicable regulatory or standards compliance requirements, such as privacy, security, and business continuity, or meet cloud subscriber policy requirements. Failure Handling: For all failure conditions, both the cloud provider and the cloud subscriber should assess their inability to meet applicable regulatory and standards compliance requirements and take remedial actions. Requirements: • Cloud subscribers should develop an ongoing corporate compliance and risk management program. • Cloud subscribers should understand the implications of the geo-location of data, data ownership considerations, access restrictions and provisions, as well as regulatory obligations driving data protection, privacy, ownership, and data flows. • Cloud providers should develop an understanding of the legal, regulatory and compliance needs of each sector of their cloud subscribers’ target market in order to be able to tailor services to meet the specific needs of that sector. 
To the extent that cloud providers can assist cloud subscribers to better meet their obligations, cloud providers will be better positioned to attract and retain business, and develop a strong reputation with applicable regulators.

Good practice regulatory requirements on cloud subscriber institutions include obligations to:
• Have a policy relating to outsourcing of material business activities.
• Have an adequate risk management plan to meet obligations, and manage risk posed by the outsourcing arrangement.
• Have sufficient monitoring processes in place to manage the outsourcing of material business activities.
• Have a legally binding agreement in place for all outsourcing of material business activities, unless otherwise agreed upon by the relevant regulators.
• Ensure compliance with all applicable laws and statutes governing the location and type of business being transacted, such as data privacy laws, banking secrecy laws, and the Gramm-Leach-Bliley Act.
• Consult with relevant regulators prior to entering into agreements to outsource material business activities to cloud providers who conduct their activities outside the cloud subscriber’s country.
• Notify the relevant regulators before entering into agreements to outsource material business activities.

12.0 Commercial Considerations

Please refer to ODCA Master Usage Model: Commercial Framework [12] for more information.

13.0 Regulatory Considerations

Please refer to ODCA Master Usage Model: Commercial Framework [12] for more information.

14.0 RFP Requirements

The following are requirements that the Alliance believes should be included in requests for proposal to cloud providers to ensure that proposed services support CIaaS.

• ODCA Principle Requirement–Service is open and is standards-based. Describe how the service meets this principle and any limitations towards the ODCA principle.
• ODCA CIaaS Usage Model 1.0–IT Operations Management: The service must support a wide range of x86-based operating systems, including Windows (server and desktop OS), Solaris x64, and Linux (leading distributions) in 32-bit and 64-bit versions.
• ODCA CIaaS Usage Model 1.0–IT Operations Management: The service must support network isolation controls for inbound and outbound traffic.
• ODCA CIaaS Usage Model 1.0–IT Operations Management: The service must support the deployment of Web, Application, Database, and Infrastructure Service components, such as LDAP components.
• ODCA CIaaS Usage Model 1.0–IT Operations Management: The service must support alignment with Information Technology Infrastructure Library (ITIL) processes for change, incident, and configuration management.
• ODCA CIaaS Usage Model 1.0–Network Management: The service provider must provide options for consumer network connectivity, such as internet VPN and leased lines. In addition, the service provider must articulate any other network requirements, stipulations, and constraints, such as NAT, IP address overlays, and latency controls.
• ODCA CIaaS Usage Model 1.0–Network Management: The service must include instrumentation to provide the consumer with a view of bandwidth, performance, and latency. These should be available via the service interfaces (including the service portal user interface and the API).
• ODCA CIaaS Usage Model 1.0–Security Management: The service provider must provide architectural, design, policy, and other artefacts that demonstrate the degree to which the cloud subscriber’s service is being segregated from other subscribers. This applies to single-tenant and multi-tenant cloud services.
• ODCA CIaaS Usage Model 1.0–Security Management: The cloud provider must formally and explicitly affirm that the storage, network, and processing security meet the requirements of the cloud subscriber’s contracted tier (Bronze, Silver, Gold, or Platinum). Additionally, the cloud provider must support independent verification.
• ODCA CIaaS Usage Model 1.0–Security Management: In a managed cloud environment, certain software used by the subscriber within the cloud should integrate with the monitoring tools provided in the cloud that ensure the cloud’s integrity. This includes software for intrusion detection and prevention, thresholds on access logs, and so on.
• ODCA CIaaS Usage Model 1.0–Workload Management: The service must provide volume flexibility, and allow the consumer to dial up or down the resources being consumed.
• ODCA CIaaS Usage Model 1.0–Workload Management: The service must be capable of integrating with the consumer cloud management tools programmatically.
• ODCA CIaaS Usage Model 1.0–Workload Management: The service should allow the consumer to change workload policy rules and parameters at will within specific criteria.
• ODCA CIaaS Usage Model 1.0–Workload Management: The service must provide a cloud service management portal.
• ODCA CIaaS Usage Model 1.0–Compliance Management: The provider must agree to, adhere to, and permit enforcement of governing frameworks and policies, internal and external audits, minimum standards/certifications, and consequence management. Lack of controls may subject providers to penalties.
• ODCA CIaaS Usage Model 1.0–Compliance Management: There may be technical and procedural requirements based on the cloud subscriber’s industry or the country in which they operate or have customers. This may also include requirements such as data that must stay within the country of origin, or regular, prescriptive disaster recovery testing. The cloud subscriber may be required to provide evidence of compliance, and thus may need the provider’s assistance to produce that evidence.
• ODCA CIaaS Usage Model 1.0–Problem Management: The cloud provider must have established an effective root cause analysis process for incidents related to contracted or consumed services, to prevent recurrence of negative service impacts.
• ODCA CIaaS Usage Model 1.0–Service Continuity Management: The cloud provider must have effective processes to ensure that IT services can recover and continue even after a serious incident occurs. This also includes the business continuity of material suppliers.
• ODCA CIaaS Usage Model 1.0–Service Continuity Management: The cloud provider must ensure that a third-party cloud subscriber cannot impact the cloud subscriber, such as in “noisy neighbor” situations.
• ODCA CIaaS Usage Model 1.0–Vulnerability Management: The cloud provider must establish a regular practice of identifying, classifying, remediating, and mitigating vulnerabilities, including patch management. Furthermore, the provider must notify the cloud subscriber of any actions or incidents, known or suspected, that may put the cloud subscriber’s assets or data at risk via the provider’s service.
• ODCA CIaaS Usage Model 1.0–Vulnerability Management: The cloud provider works closely with its ISPs and with regional and international security organizations in order to prevent Internet-driven attacks against its subscribers or its infrastructure. The provider has a risk response team and a security operations team that are trained to respond quickly to attacks, and to prevent its clients from being impacted. Furthermore, for the Gold and Platinum services, DoS and DDoS attacks are contained by filtering the attacking servers as early as possible on the ISP’s infrastructure.
• ODCA CIaaS Usage Model 1.0–Vulnerability Management: For service tiers Silver, Gold, and Platinum, the subscriber has a vulnerability management system in place and applies security patches in a timely manner, as defined in the ODCA Usage Model: Provider Assurance.
• ODCA CIaaS Usage Model 1.0–Vulnerability Management: For service tiers Gold and Platinum, the subscriber may analyze its access and application logs and communicate identified patterns to the provider, in order to improve the accuracy of the provider’s filters.
• ODCA CIaaS Usage Model 1.0–Monitoring Service: The cloud provider must monitor the environment, including event, capacity, security, and utilization, to ensure SLAs are met. The provider’s monitoring data must be provided over standardized APIs.
• ODCA CIaaS Usage Model 1.0–Monitoring Service: If application layer monitoring and analytics point to the infrastructure, then the infrastructure metrics should be easily accessible and available for root cause analysis, troubleshooting, and early warning of issues that may be preventable.
• ODCA CIaaS Usage Model 1.0–Incident Management: The cloud provider must inform the cloud subscriber of incidents that may affect it. Pre-defined agreements must be established on the prioritization of an incident and the level of effort required by the cloud provider during an incident. Automated and standardized interfaces are to be established to manage incidents.
• ODCA CIaaS Usage Model 1.0–Incident Management: For major incidents, as agreed in a contract between the provider and the subscriber, the cloud provider must notify the affected customers within 48 hours.
• ODCA CIaaS Usage Model 1.0–Incident Management: Incident responses must be agreed between subscriber and provider in order to make them as effective as possible. Coordinated activities help prevent service degradation by avoiding conflicting actions.
• ODCA CIaaS Usage Model 1.0–Change Management: The cloud provider must notify the cloud subscriber when a change in configuration or other operational aspect may affect the service capabilities of the other party. Proactive management is required to ensure a stable environment.
• ODCA CIaaS Usage Model 1.0–Governance: The provider must adhere to and permit enforcement of governing frameworks and policies, internal and external audits, minimum standards and certifications, and security controls. Penalties and termination of contracts may be established where requirements are not met.
• ODCA CIaaS Usage Model 1.0–Governance: For Gold and Platinum service tiers, the contract between the subscriber and the provider will typically stipulate geographic or jurisdictional limitations on where the subscriber’s data can be stored and processed, including the location of secondary sites and backup tapes.
• ODCA CIaaS Usage Model 1.0–Governance: For Gold and Platinum tiers, the provider must notify the subscriber of the parent company or legal jurisdiction if they have a US parent company governed by the US PATRIOT Act.
• ODCA CIaaS Usage Model 1.0–Provisioning of services: The provider must have effective automated mechanisms to request, provision, manage, and meter usage of services wherever possible.

15.0 Next Steps and Summary of Industry Actions Required

In the interest of giving guidance on how to create and deploy solutions that are open, multi-vendor, and interoperable, we have identified specific areas where the Alliance believes there should be open specifications, formal or de facto standards, or common intellectual property-free (IP-free) implementations. Where the Alliance has a specific recommendation on the specification, standard, or open implementation, it has been called out elsewhere in this master usage model. In other cases, ODCA will be working with the industry to evaluate and recommend specifications in future releases of this document.

15.1 Industry Actions Required

This master usage model references and incorporates requirements from many other ODCA usage models (see References section). Those requirements should be considered as embedded into the overall CIaaS requirements. As a consequence, industry actions required for all referenced ODCA documents are inherited here, and are thus industry actions required for CIaaS.

Specific industry actions worth highlighting:
• Solution and service providers need to ensure that their service and product offerings clearly delineate between core capabilities and differentiators. Providers should reference and map their offerings to this and other ODCA documents. They should also clearly outline which capabilities are common with the industry at large and which are unique differentiators for that provider.
• Standards development organizations (SDOs) need to develop and aggressively promote interfaces and processes that will enable seamless end-to-end CIaaS offerings that are vendor-agnostic.
• Service providers need to clearly communicate how their offerings map against the ODCA Master Usage Model: Commercial Framework [12], and how the different architectural elements work together.
• Cloud consumers need to proactively prepare their organizations for the operating model changes and other potential disruptions that may arise from introducing CIaaS services from an outside provider.
• SDOs need to develop common, industry-standard semantics for cloud provisioning.
• The industry should develop object storage standards, in cooperation with SDOs.
• Network virtualization needs to mature, and standards for network virtualization need to be developed.
• There needs to be a way for solution and service providers to enable metrics sharing across monitoring systems and tools in standardized and open ways, whether through APIs, adapters, extensions, or the like.
• Industry efficiency will be improved if cloud providers have a legal statement that cloud subscribers can rely on with respect to compliance with specific local and national laws and regulations.

Service and solution providers should take into account real user monitoring insights to drive load test scripts, so that those scripts better reflect the combinations of transactions that occur in real user usage.

Transparency is a concept worth specific attention. Service providers and solution providers should enable incident reporting, metrics, and so on in easier and more accessible ways, even publishing the metrics themselves [32]. Such transparency, however, needs to go beyond infrastructure components that are not very actionable or directly related to user experience. Transparency that illuminates end-user experience and business impact will be most valuable to cloud subscribers.

15.2 Future CIaaS Requirements Development

Version 1.0 of this master usage model has focused on requirements for a single cloud subscriber engaged with a single cloud provider. While acknowledging the role of cloud brokers, cloud federation, and cloud marketplaces, there has been little elaboration on how they impact CIaaS requirements. Future versions of the ODCA Master Usage Model: Compute Infrastructure as a Service (CIaaS) should address those issues.

Object storage was beyond the scope of v1.0. That may be revisited in future revisions.
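As an illustration only, the tier-conditional RFP requirements in Section 14 could be captured in a small machine-readable checklist, so that a subscriber can list which items apply to a contracted tier and check the 48-hour major-incident notification window. The following Python sketch is hypothetical: the names `TIER_REQUIREMENTS`, `requirements_for`, and `notified_in_time` are assumptions made for this example and are not defined by ODCA, the requirement texts are paraphrased, and measuring the 48 hours from the start of the incident is also an assumption (the requirement text does not state the reference point).

```python
from datetime import datetime, timedelta

# Service tiers named in this usage model.
TIERS = ("Bronze", "Silver", "Gold", "Platinum")

# Tier-conditional requirements paraphrased from Section 14.
# The dictionary layout is an illustrative assumption, not an ODCA format.
TIER_REQUIREMENTS = {
    "Subscriber operates vulnerability management and timely patching":
        {"Silver", "Gold", "Platinum"},
    "DoS/DDoS attacks filtered as early as possible on the ISP infrastructure":
        {"Gold", "Platinum"},
    "Subscriber may share log-analysis patterns to tune provider filters":
        {"Gold", "Platinum"},
    "Contract stipulates geographic or jurisdictional data limitations":
        {"Gold", "Platinum"},
    "Provider discloses a US parent company governed by the US PATRIOT Act":
        {"Gold", "Platinum"},
}

# Major incidents: the provider must notify affected customers within 48 hours.
MAJOR_INCIDENT_NOTIFICATION_WINDOW = timedelta(hours=48)


def requirements_for(tier):
    """Return the tier-conditional requirements that apply to a tier, sorted."""
    if tier not in TIERS:
        raise ValueError(f"unknown service tier: {tier!r}")
    return sorted(
        name for name, tiers in TIER_REQUIREMENTS.items() if tier in tiers
    )


def notified_in_time(incident_at, notified_at):
    """Check a notification against the 48-hour window.

    Assumption: the window is measured from the incident start time.
    """
    elapsed = notified_at - incident_at
    return timedelta(0) <= elapsed <= MAJOR_INCIDENT_NOTIFICATION_WINDOW


if __name__ == "__main__":
    for tier in TIERS:
        print(tier, len(requirements_for(tier)))
    start = datetime(2012, 11, 1, 9, 0)
    print(notified_in_time(start, start + timedelta(hours=47)))
```

Requirements that apply to every tier, such as incident notification, are deliberately kept out of the tier table in this sketch; a real checklist would also need the attribute definitions from the service tier tables earlier in this document.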
16.0 References

ODCA USAGE MODELS

ODCA Usage Model: Cloud Based Identity Governance and Auditing http://www.opendatacenteralliance.org/document-sections/category/71docs?download=677:HW_ODCA_Identity_Gov_Auditing_Rev1.0_final
ODCA Usage Model: Cloud Based Identity Provisioning http://www.opendatacenteralliance.org/document-sections/category/71docs?download=679:ODCA_Identity_Provisioning_Rev1%200_final
ODCA Developing Cloud-Capable Applications White Paper http://www.opendatacenteralliance.org/index2.php?option=com_productsearch&view=lightbox&proid=17&ie=UTF-8&oe=UTF-8&q=prettyphoto&iframe=true&width=60%&height=90%
ODCA Master Usage Model: Commercial Framework http://www.opendatacenteralliance.org/docs/ODCA_Commercial_Framework_MasterUM_v1.0_Nov2012.pdf
ODCA Usage Models: Conceptual Overview and Document Map http://www.opendatacenteralliance.org/document-sections/category/71docs?download=454:conceptual_overview_and_document_map
ODCA Usage Model: Guide to Interoperability Across Clouds http://www.opendatacenteralliance.org/docs/ODCA_Interop_Across_Clouds_Guide_Rev1.0.pdf
ODCA Usage Model: Identity Management Interoperability Guide http://www.opendatacenteralliance.org/document-sections/category/71docs?download=676:HODCA_%20IdM_%20InteropGuide_Rev1%200_final
ODCA Usage Model: Infrastructure as a Service (IaaS) Privileged User Access http://www.opendatacenteralliance.org/document-sections/category/71docs?download=678:HW_ODCA_%20IdM_PrivAccess_Rev1.0_final
ODCA Usage Model: Long Distance Workload Migration http://www.opendatacenteralliance.org/docs/Long_Distance_Workload_Migration_Rev1.0_b.pdf
ODCA Usage Model: Provider Assurance http://www.opendatacenteralliance.org/docs/Security_Provider_Assurance_Rev%201.1_b.pdf
ODCA Usage Model: Regulatory Framework http://www.opendatacenteralliance.org/document-sections/category/71-docs?download=455:regulatory_framework
ODCA Master Usage Model: Service Orchestration http://www.opendatacenteralliance.org/docs/ODCA_Service_Orch_MasterUM_v1.0_Nov2012.pdf
ODCA Usage Model: Single Sign On Authentication http://www.opendatacenteralliance.org/document-sections/category/71docs?download=680:ODCA_idM_SingleSign_Rev1.0_final
ODCA Usage Model: Standard Units of Measure for IaaS http://www.opendatacenteralliance.org/document-sections/category/71docs?download=458:standard_units_of_measure
ODCA Usage Model: VM Interoperability in a Hybrid Cloud Environment http://www.opendatacenteralliance.org/docs/ODCA_VMInteroperability_Rev.1.1_Final.pdf

Other Sources

Cloud Computing: Principles and Paradigms; Buyya, Broberg, Goscinski; John Wiley & Sons; 2011; pg. 130
Cloud Federation; Kurze, et al; http://www.aifb.kit.edu/images/0/02/Cloud_Federation.pdf
Cloudscaling Infrastructure-as-a-Service Builder’s Guide, Network Edition: The Case for Network Virtualization (v1.0.4, Q4 2010)
DMTF’s Cloud Management for Communications Service Providers http://www.dmtf.org/sites/default/files/standards/documents/DSP2029%20_1.0.0a.pdf
NIST Cloud Computing Reference Architecture http://www.nist.gov/customcf/get_pdf.cfm?pub_id=909505
NIST Cloud Specific Terms and Definitions http://collaborate.nist.gov/twiki-cloud-computing/pub/CloudComputing/ReferenceArchitectureTaxonomy/Taxonomy_Terms_and_Definitions_version_1.pdf
NIST Definition of Cloud Computing http://csrc.nist.gov/publications/nistpubs/800-145/SP800-145.pdf
ODCA Proposal Engine Assistant Tool (PEAT) http://www.opendatacenteralliance.org/ourwork/proposalengineassistant
TM Forum’s TR174 Addendum A-Cloud Business Models http://www.opendatacenteralliance.org/docs/tmforum_tr174addendum_cloudbusinessmodels.pdf
TM Forum’s TR174 Enterprise-Grade External Compute IaaS Requirements (Virtual Private Cloud) http://www.opendatacenteralliance.org/docs/tmforum_tr174enterprisegrade_computeiaasrequirements.pdf
TM Forum’s TR174 Addendum C Enterprise-Grade Virtual Private Cloud from a State-of-the-Art Reference Implementation http://www.opendatacenteralliance.org/docs/tmforum_tr174enterprisegrade_referenceimplementation.pdf

Endnotes

1. NIST Cloud Computing Reference Architecture http://www.nist.gov/customcf/get_pdf.cfm?pub_id=909505
2. Cloud Federation http://www.aifb.kit.edu/images/0/02/Cloud_Federation.pdf
3. Maintenance Window http://en.wikipedia.org/wiki/Maintenance_window
4. Recovery Point Objective http://en.wikipedia.org/wiki/Recovery_point_objective
5. Recovery Time Objective http://en.wikipedia.org/wiki/Recovery_time_objective
6. Recovery Consistency Objective http://en.wikipedia.org/wiki/Recovery_consistency_objective
7. NIST Definition of Cloud Computing http://csrc.nist.gov/publications/nistpubs/800-145/SP800-145.pdf
8. Paul Fremantle Blog-Cloud Native http://pzf.fremantle.org/2010/05/cloud-native.html
9. ODCA Developing Cloud-Capable Applications White Paper http://www.opendatacenteralliance.org/index2.php?option=com_productsearch&view=lightbox&proid=17&ie=UTF-8&oe=UTF-8&q=prettyphoto&iframe=true&width=60%&height=90%
10. ODCA Usage Model: Long Distance Workload Migration http://www.opendatacenteralliance.org/docs/Long_Distance_Workload_Migration_Rev1.0_b.pdf
11. The NIST Definition of Cloud Computing http://www.nist.gov/itl/cloud/upload/cloud-def-v15.pdf
12. ODCA Master Usage Model: Commercial Framework http://www.opendatacenteralliance.org/docs/ODCA_Commercial_Framework_MasterUM_v1.0_Nov2012.pdf
13. ODCA Usage Model: Regulatory Framework http://www.opendatacenteralliance.org/document-sections/category/71docs?download=455:regulatory_framework
14. ODCA Development Program and Priorities http://www.opendatacenteralliance.org/docs/ODCA_Development%20and%20Program%20Priorities_final.pdf
15. ODCA Usage Model: Standard Units of Measure for IaaS http://www.opendatacenteralliance.org/document-sections/category/71docs?download=458:standard_units_of_measure
16. ODCA Usage Model: VM Interoperability in a Hybrid Cloud Environment http://www.opendatacenteralliance.org/docs/ODCA_VMInteroperability_Rev.1.1_Final.pdf
17. ODCA Usage Model: Guide to Interoperability Across Clouds http://www.opendatacenteralliance.org/docs/ODCA_Interop_Across_Clouds_Guide_Rev1.0.pdf
18. TM Forum’s TR174 Enterprise-Grade External Compute IaaS Requirements (Virtual Private Cloud) http://www.opendatacenteralliance.org/docs/tmforum_tr174enterprisegrade_computeiaasrequirements.pdf
19. ODCA Usage Model: Provider Assurance http://www.opendatacenteralliance.org/docs/Security_Provider_Assurance_Rev%201.1_b.pdf
20. “Performance Indicators” http://en.wikipedia.org/wiki/Performance_indicator
21. ODCA Master Usage Model: Service Orchestration http://www.opendatacenteralliance.org/docs/ODCA_Service_Orch_MasterUM_v1.0_Nov2012.pdf
22. Cloud Computing: Principles and Paradigms; Buyya, Broberg, Goscinski; John Wiley & Sons; 2011; pg. 130
23. NIST Cloud Specific Terms and Definitions http://collaborate.nist.gov/twiki-cloud-computing/pub/CloudComputing/ReferenceArchitectureTaxonomy/Taxonomy_Terms_and_Definitions_version_1.pdf
24. ODCA Model: Service Catalog http://www.opendatacenteralliance.org/document-sections/category/71-docs?download=445:service-catalog
25. Compatible with DMTF’s Open Virtualization Format (OVF) http://dmtf.org/standards/ovf
26. Compatible with SNIA’s Cloud Data Management Interface (CDMI) http://www.snia.org/cdmi
27. Cloud Management for Communications Service Providers http://www.dmtf.org/sites/default/files/standards/documents/DSP2029%20_1.0.0a.pdf
28. Cloudscaling Infrastructure-as-a-Service Builder’s Guide, Network Edition: The Case for Network Virtualization
29. There are many options for secure software coding standards, such as https://www.owasp.org, http://www.mcafee.com/us/resources/data-sheets/foundstone/ds-secure-software-dev-life-cycle.pdf, and http://csrc.nist.gov/publications/nistpubs/800-64-Rev2/SP800-64Revision2.pdf
30. ODCA Usage Model: Cloud Based Identity Provisioning http://www.opendatacenteralliance.org/docs/Cloud_Based_Identity_Provisioning_%20b.pdf
31. ODCA Usage Model: Infrastructure as a Service (IaaS) Privileged User Access http://www.opendatacenteralliance.org/docs/Infrastructure_as_a_Service_(Iaas)_Privileged_User_Access_Rev_1.0_b.pdf
32. Such as via http://trust.salesforce.com