CloudPlatform Deployment Reference Architecture
For Citrix CloudPlatform Version 3.0.x

© 2012 Citrix Systems, Inc. All rights reserved. Specifications are subject to change without notice. Citrix Systems, Inc., the Citrix logo, Citrix XenServer, Citrix XenCenter, and Citrix CloudPlatform are trademarks or registered trademarks of Citrix Systems, Inc. All other brands or products are trademarks or registered trademarks of their respective holders.

Contents

What's In This Guide
Workload-Driven Deployment Process
  Types of Cloud Workloads
  CloudPlatform Supports Both Workload Types
    Traditional Workload
    Cloud-Era Workload
Management Server Cluster Deployment
  What Type of Workload is the Management Server?
  Management Server Cluster Backup and Replication
  Management Server Cluster Hardware
    Primary Management Server Cluster
    Standby Management Server Cluster
  Management Server Cluster Configuration
    Primary Management Server Cluster Configuration
Cloud-Era Availability Zone Deployment
  Overview
  Network Configuration
  Cloud-Era Availability Zone Hardware
    Primary Storage Sizing
    Secondary Storage Sizing
  Cloud-Era Availability Zone Configuration
Traditional Availability Zone Deployment
  Overview
  Traditional Availability Zone Hardware
    Primary Storage Sizing
    Secondary Storage Sizing
  Choice of Hypervisor in Traditional Availability Zone
  Traditional Availability Zone Configuration (for vSphere)
  Traditional Availability Zone Configuration (for XenServer)

Disclaimer: Vendors and products mentioned in this document are provided as examples and should not be taken as endorsements or indications of vendor certification.

What's In This Guide

This guide is for cloud operators who are planning medium to large-scale production deployments of Citrix CloudPlatform. It is designed to work in conjunction with the CloudPlatform Installation Guide. This document offers high-level planning and architectural guidance, as opposed to detailed installation procedures, for production deployments. The reader should refer to the CloudPlatform Installation Guide for the detailed steps needed to install and configure Citrix CloudPlatform, and to the CloudPlatform Administration Guide for instructions on how to operate, maintain, and upgrade a CloudPlatform installation.

Citrix CloudPlatform supports a large number of hypervisor, network, and storage configurations. To simplify the planning of large-scale production deployments, this document provides guidance on selecting the proper architecture and configuration according to the target workload the cloud is designed to support.
Before we cover the details of the different deployment architecture options, we first establish the foundation of our methodology: workload-driven deployment.

Workload-Driven Deployment Process

Citrix CloudPlatform™ is an open source software platform that pools datacenter resources to build public, private, and hybrid Infrastructure as a Service (IaaS) clouds. CloudPlatform abstracts the network, storage, and compute nodes that make up a datacenter and enables them to be delivered as a simple-to-manage, scalable cloud infrastructure. These nodes or components of a cloud can vary greatly from datacenter to datacenter and cloud to cloud because they are defined by the unique workloads or applications that they support. With so many options for servers, hypervisors, storage, and networking, it is imperative that cloud operators design with a specific application in mind to ensure the infrastructure meets the scalability and reliability requirements of the application.

A cloud operator typically follows these steps to determine the appropriate deployment architecture for an IaaS cloud built on CloudPlatform:

1. Define target workloads.
2. Determine how that application workload will be delivered reliably.
3. Develop the deployment architecture.
4. Implement the cloud deployment.
5. Operate the cloud environment (e.g., monitor, upgrade, patch).

Types of Cloud Workloads

Two distinct types of application workloads have emerged in cloud operators' datacenters.

The first type is a traditional enterprise workload. The majority of existing enterprise applications fall into this category. They include, for example, applications developed by leading enterprise vendors such as Microsoft, Oracle, and SAP. These applications are typically built to run on a single server or on a cluster of front-end and application server nodes backed by a database. Traditional workloads typically rely on technologies such as enterprise middleware clusters and vertically-scaled databases.

Citrix commonly refers to the second type as a Cloud-Era workload. Internet companies such as Amazon, Google, Zynga, and Facebook long ago realized that traditional enterprise infrastructure was insufficient to serve the load generated by millions of users. These Internet companies pioneered a new style of application architecture that does not rely on enterprise-grade server clusters, but on a large number of loosely-coupled computing and storage nodes. Applications developed this way often utilize technologies such as MySQL sharding, NoSQL, and geographic load balancing.

There are two fundamental differences between traditional workloads and cloud-era workloads.

SCALE: The first difference is scale. Traditional enterprise applications serve tens of thousands of users and hundreds of concurrent sessions. Driven by the growth of the Internet and mobile devices, Internet applications serve tens of millions of users. This difference of several orders of magnitude translates into a significantly greater demand for computing infrastructure, making the need to reduce cost and improve efficiency paramount.

RELIABILITY: The difference in scale has an important side effect. Enterprise applications can be designed to run on reliable hardware.
Application developers do not expect the underlying enterprise-grade server or storage cluster to fail during the normal course of operation, and sophisticated backup and disaster recovery procedures can be set up to handle the unlikely scenario of hardware failure. Internet scale changed this paradigm: as the amount of hardware grows, it is no longer possible to deliver the same level of enterprise-grade reliability, backup, and disaster recovery at the scale needed to support Internet workloads in a cost-effective and efficient manner.

Traditional vs. Cloud-Era Workload Requirements

                 | Traditional Workload       | Cloud-Era Workload
  Scale          | Tens of thousands of users | Millions of users
  Reliability    | 99.999% uptime             | Assumes failure
  Infrastructure | Proprietary                | Commodity
  Applications   | SAP, Microsoft, Oracle     | Web content, web apps, social media

Cloud-era workloads assume that the underlying infrastructure can fail and will fail. Instead of implementing disaster recovery as an afterthought, multi-site geographic failover must be designed into the application. Once the application expects infrastructure failure, it no longer needs to rely on technologies such as network link aggregation, storage multipathing, VM HA or fault tolerance, or VM live migration. Instead the application is expected to treat servers and storage as "ephemeral resources": resources that can be used while they are available, but that may become unavailable after a short period of use. Some cloud-era applications, such as the Netflix streaming video service, have notably employed a mechanism called "Chaos Monkey" that randomly destroys infrastructure nodes to ensure that the application can continue to function despite infrastructure failure.

Common Cloud Workloads

Traditional Workload Candidates

  Communications / Productivity: Outlook, Exchange, or SharePoint
  CRM / ERP / Database: Oracle, SAP
  Desktop: Desktop-based computing, desktop service and support applications, and desktop management applications

Cloud-Era Workload Candidates

  Web Service: Static and dynamic web content, streaming media, RSS, mash-ups, and SMS
  Web Applications: Web service-enabled applications, eCommerce, eBusiness, Java application servers
  Rich Internet Applications: Videos, online gaming, and mobile apps (Adobe Flex, Flash, Air, Silverlight, iPhone)
  Disaster Recovery: Onsite/offsite backup and recovery, live failover, cloud bursting for scale
  Collaboration / Social Media: Web 2.0 applications for online sharing and collaboration (blog, CMS, file share, wiki, IM)
  Batch Processing: Predictive usage for processing large workloads, such as data mining, warehousing, analytics, and business intelligence
  Development and Test: Software development and test processes and image management
  HPC: Engineering design and analysis, scientific applications, high performance computing

CloudPlatform Supports Both Workload Types

Citrix CloudPlatform is the only product in the industry today that supports both traditional enterprise and cloud-era workloads. While cloud-era workloads represent an application architecture that will likely become more dominant in the future, the majority of applications that exist today are written as enterprise-style workloads. With CloudPlatform, a cloud operator may design for one style of workload and add support for the other style later.
Alternatively, a cloud operator may design to support both styles of workload from the beginning. The ability to support both lies in CloudPlatform's architectural flexibility: cloud operators can, for example, configure multiple availability zones using the different hypervisor, storage, and networking capabilities required by different types of workloads, to meet the security, compliance, and scalability needs of multiple cloud initiatives.

Traditional Workload

A CloudPlatform Traditional Availability Zone can be constructed to support a traditional enterprise-style workload. Traditional workloads in the cloud are typically designed with a requirement for high availability and fault tolerance, and they use common components of an enterprise datacenter to meet those needs. This starts with an enterprise-grade hypervisor, such as VMware vSphere or Citrix XenServer, that supports live migration of virtual machines and storage and has built-in high availability. Storage of virtual machine images leverages high-performance SAN devices. Traditional physical network infrastructure such as firewalls and layer 2 switching is used, and VLANs are designed to isolate traffic between servers and tenants. VPN tunneling provides secure remote access and site-to-site access through existing network edge devices. Applications are packaged using industry-standard OVF files.

Cloud-Era Workload

A CloudPlatform Cloud-Era Availability Zone can be constructed to support cloud-era workloads. Here the desire for cost savings can easily outweigh the need for enterprise features, making open source and commodity components such as XenServer and KVM a more attractive option. In this workload type, virtual machine images are stored in EBS-style volumes, and an object store can be used to hold data that must persist through availability zone failures. Because of VLAN scalability limitations, software-defined networks are becoming necessary in cloud-era availability zones; CloudPlatform meets this need by supporting Security Groups in L3 networking. Elastic Load Balancing (ELB) or Global Server Load Balancing (GSLB) is used to redirect user traffic to servers in multiple availability zones. Third-party tools developed for Amazon Web Services to manage applications in this type of environment are readily available and have proven integrations with CloudPlatform.

Management Server Cluster Deployment

The management server deployment does not depend on the underlying style of cloud workload. A single management server cluster can manage multiple availability zones across multiple datacenters, enabling cloud operators to create different availability zones to handle different workload types as needed. A single cloud can therefore contain both cloud-era availability zones and traditional availability zones, whether local or geographically dispersed.

What Type of Workload is the Management Server?

The CloudPlatform Management Server is designed to run as a traditional enterprise-grade application, that is, a traditional workload.
It is designed as a simple, lightweight, and highly efficient application, with the majority of the work running inside system VMs (see CloudPlatform Administration Guide, "Working with System Virtual Machines") and executed on computing nodes. There are two reasons for this design choice.

First, managing a cloud is not a cloud-scale problem. In CloudPlatform version 3.0.x, each management server node is certified to manage 10,000 computing nodes. This level of scalability is sufficient for today's production cloud deployments. As CloudPlatform deployments continue to grow, we expect to be able to tune the management server code so that each individual management server node can scale to many times more computing nodes.

The second reason for designing the management server as an enterprise application is a pragmatic one. Few people who deploy CloudPlatform will have a cloud-era infrastructure already in place. Without an existing IaaS cloud and 3rd-party management tools such as RightScale or enStratus in place, deploying a cloud workload is not an easy task. Building the CloudPlatform Management Server as a cloud-era workload would therefore lead to a bootstrap problem.

Management Server Cluster Backup and Replication

As a traditional-style enterprise application, the management server cluster is fronted by a load balancer and connects to a shared MySQL database. While the cluster nodes themselves are stateless and can be easily recreated, the MySQL database should be backed up and replicated to a remote site to ensure continuing operation of the cloud. A standby management server cluster is therefore set up in a remote datacenter.

During the normal course of operation, the primary management server cluster serves all UI and API requests. Individual server failures in the management server cluster are tolerated because the other servers in the cluster take over the load. To ensure the management server cluster can recover from a MySQL database failure, an identical database machine is set up as the backup MySQL server. All database transactions are replayed in real time on the backup MySQL server in an active-passive setup. If the primary MySQL server fails, the administrator can reconfigure the management server cluster to point to the backup MySQL server.

To ensure the system can recover from the failure of the entire availability zone that contains the primary management server cluster, a standby management server cluster can be set up in another availability zone. Asynchronous replication is configured between the backup MySQL server in the primary management server cluster and the MySQL server in the standby management server cluster. If the primary availability zone fails, a cloud administrator can bring up the standby management server cluster and then update the DNS server to redirect cloud API and UI traffic to it.
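The replication described above should be monitored so that a stale backup server is never promoted. The following is a minimal health-check sketch, not part of CloudPlatform itself; it assumes a MySQLdb-compatible Python driver (for example, the mysqlclient package) and uses placeholder host and credential values.

  # Minimal replication health check for the management server database.
  # Hostname and credentials below are illustrative placeholders.
  import MySQLdb
  import MySQLdb.cursors

  BACKUP_DB = dict(host="10.52.2.143", user="monitor", passwd="secret")

  def replication_lag(db):
      """Return Seconds_Behind_Master from the backup server, or None
      if replication is broken (a replication thread has stopped)."""
      conn = MySQLdb.connect(cursorclass=MySQLdb.cursors.DictCursor, **db)
      try:
          cur = conn.cursor()
          cur.execute("SHOW SLAVE STATUS")
          status = cur.fetchone()
          if (not status or status["Slave_IO_Running"] != "Yes"
                  or status["Slave_SQL_Running"] != "Yes"):
              return None
          return status["Seconds_Behind_Master"]
      finally:
          conn.close()

  if __name__ == "__main__":
      lag = replication_lag(BACKUP_DB)
      if lag is None:
          print("ALERT: replication to the backup MySQL server is broken")
      else:
          print("backup MySQL server is %s seconds behind the master" % lag)

A check like this can run from cron on a monitoring host; alerting on a broken or badly lagging slave before a failover is attempted avoids promoting a database that is missing recent transactions.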
Management Server Cluster Hardware

Primary Management Server Cluster

Citrix recommends a two-node management server cluster, which is capable of managing a cloud deployment totaling 10,000 computing nodes.

  Load Balancer: NetScaler VPX or MPX, chosen based on the number of concurrent active sessions.
  Management Server Node 1: Intel or AMD CPU server with at least 2 GHz, 1 socket, 4 cores, 16 GB of memory, and 250 GB of RAID 1 local disk storage.
  Management Server Node 2: Intel or AMD CPU server with at least 2 GHz, 1 socket, 4 cores, 16 GB of memory, and 250 GB of RAID 1 local disk storage.
  Primary MySQL Server: Intel or AMD CPU server with at least 2 GHz, 1 socket, 4 cores, 16 GB of memory, and 250 GB of RAID 1 local disk storage.
  Backup MySQL Server: Intel or AMD CPU server with at least 2 GHz, 1 socket, 4 cores, 16 GB of memory, and 250 GB of RAID 1 local disk storage.

As long as adequate performance is available, it is permissible to run the management server and MySQL server as virtual machines, and to run NetScaler VPX as a virtual appliance.

Standby Management Server Cluster

The standby management server cluster mirrors the primary management server cluster with one difference: a backup MySQL server is not required.

  Load Balancer: NetScaler VPX or MPX.
  Management Server Node 1: Intel or AMD CPU server with at least 2 GHz, 1 socket, 6 cores, 32 GB of memory, and 250 GB of RAID 1 local disk storage.
  Management Server Node 2: Intel or AMD CPU server with at least 2 GHz, 1 socket, 6 cores, 32 GB of memory, and 250 GB of RAID 1 local disk storage.
  Primary MySQL Server: Intel or AMD CPU server with at least 2 GHz, 1 socket, 6 cores, 32 GB of memory, and 250 GB of RAID 1 local disk storage.

Management Server Cluster Configuration

Primary Management Server Cluster Configuration

Database replication between the primary and standby clusters can be done using MySQL replication with the hot backup option. More information is available at http://www.innodb.com/wp/products/hot-backup/.

  CloudPlatform Internal DNS: CPMS-URL (example URL pointing at the CloudPlatform Management Server); management nodes 10.52.2.148 and 10.52.2.149.
  CloudPlatform Version: CloudPlatform 3.0.x
  MySQL Version: MySQL 5.1.61
  MySQL Database (Master) IP Address: 10.52.2.142
  MySQL Database (Slave) IP Address: 10.52.2.143

Management Server Node Configuration

  Number of Servers (VMs) for Management: 2. This is a redundant design for high availability.
  Name(s): CPMGSRV01, CPMGSRV02. These are sample names; no naming convention is mandated.
  IP Address(es): 10.52.2.148, 10.52.2.149. These IP addresses are for reference only and need to be changed to fit the network configuration of the datacenter.
  Deployment Hypervisor: XenServer 6.0.2, the latest version of XenServer, which is tested and entitled with CloudPlatform 3.0.x.
  Management Server VM Properties: 4 vCPUs, 16 GB RAM, 1 NIC, 250 GB HDD. The management server is memory intensive, and adequate RAM is needed to meet performance requirements.
  Operating System: RHEL 6.2 (64-bit). RHEL is the recommended OS for its available commercial support.

Management Servers - Load Balancing

  Load Balancing Used: Yes. Load balancing the management servers is a recommended practice to meet performance requirements.
  Load Balancer: NetScaler VPX. Considering the load, number of users, and SSL connections this cloud architecture needs to manage, NetScaler VPX suffices; NetScaler MPX is an option if the load requirement grows beyond what is described in this document.

Load Balancer (NetScaler) Configuration

The CloudPlatform UI is load balanced. CloudPlatform requires that ports 8080 and 8250 are configured on the LB VIP, with persistence/stickiness across multiple sessions:

  Source Port | Destination Port | Protocol | Persistence
  8080        | 8080             | HTTP     | Yes
  8250        | 8250             | TCP      | Yes
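Before pointing users and host agents at the VIP, it is worth confirming that both load-balanced ports answer. A minimal sketch using only the Python standard library; the VIP address is a placeholder to be replaced with the actual VIP.

  # Verify that the load balancer VIP accepts connections on the two
  # ports CloudPlatform requires (8080 for UI/API, 8250 for host agents).
  import socket

  LB_VIP = "10.52.2.150"   # placeholder: substitute the actual VIP

  for port in (8080, 8250):
      s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
      s.settimeout(5)
      try:
          s.connect((LB_VIP, port))
          print("port %d: reachable" % port)
      except socket.error as e:
          print("port %d: FAILED (%s)" % (port, e))
      finally:
          s.close()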
Master/Slave MySQL Configuration

CloudPlatform requires a MySQL database to store configuration information, VM staging data, and events related to every VM (every guest VM started as part of the cloud environment creates an associated event, which is stored in the database). The script provided with the CloudPlatform installation creates two databases, referred to as cloud and cloud_usage, and populates the initial data within each. The CloudPlatform Installation Guide details the scripts used for installing and preparing the databases.

CloudPlatform currently depends on the InnoDB engine in MySQL for foreign key support in both the cloud and cloud_usage databases; therefore MySQL Cluster cannot be used. The following describes a master/slave configuration of MySQL. MySQL replication works on a master/slave topology, so there is no requirement for shared storage; data is kept consistent between the two servers by asynchronous replication. The replication methodology used for CloudPlatform is row-based.

The MySQL Community Edition (GPL) is deployed on two separate virtual servers running Red Hat Enterprise Linux 6.2, with replication (master and slave) configured between them for high availability.

MySQL Database

  Number of MySQL Database Instances: 2, in a master/slave configuration.
  Virtual Machine Configuration: 2 vCPUs, 16 GB RAM, 250 GB disk. Use shared storage for the database storage.
  High Availability: MySQL master/slave replication. CloudPlatform does not support MySQL clustering; failover is manual.

The relevant settings, all located in /etc/my.cnf, are:

  innodb_rollback_on_timeout = 1          (enabled; this is a boolean switch, not a duration)
  innodb_lock_wait_timeout   = 600        (seconds)
  max_connections            = 700        (set to 350 * the number of management nodes; 700 for this two-node cluster)
  log-bin                    = mysql-bin  (enables the binary log and sets its location)
  binlog-format              = 'ROW'      (row-based binary log format, as required for replication here)
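As a quick cross-check that the running master actually carries the values above, they can be read back over the same MySQLdb-compatible driver used earlier. A minimal sketch with placeholder credentials:

  # Cross-check the running master against the /etc/my.cnf values above.
  # Credentials are placeholders; max_connections is 350 * node count.
  import MySQLdb

  EXPECTED = {
      "innodb_rollback_on_timeout": "ON",
      "innodb_lock_wait_timeout": "600",
      "max_connections": str(350 * 2),   # two management server nodes
      "binlog_format": "ROW",
  }

  conn = MySQLdb.connect(host="10.52.2.142", user="monitor", passwd="secret")
  cur = conn.cursor()
  for name, want in EXPECTED.items():
      cur.execute("SHOW VARIABLES LIKE %s", (name,))
      row = cur.fetchone()
      got = row[1] if row else "<unset>"
      flag = "ok" if str(got) == want else "MISMATCH"
      print("%-28s want=%-6s got=%-6s %s" % (name, want, got, flag))
  conn.close()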
Cloud-Era Availability Zone Deployment

Overview

In this section we describe how to design and configure a 3200-node cloud-era availability zone where all 3200 nodes reside in the same datacenter. These nodes are divided into 200 racks, or pods, with 16 hosts in each. The number of hosts in each pod is typically a function of the available power. If blade servers are used, 16 hosts constitute a typical blade chassis, and embedded networking switches would eliminate the need for TOR switches. Each pod also contains an NFS server for primary storage.

Network Configuration

The networking configuration in the cloud-era availability zone is summarized as follows:

1. A pair of NetScaler MPX appliances in HA configuration is connected directly to the public Internet on one side, and on the other side to the datacenter core switch on an RFC 1918 private network.
2. The datacenter core switch and aggregation switches create 200 pairs of RFC 1918 private IP networks. Each pod consumes one pair: a storage/management network and a guest network.
3. Each host in the pod is connected to 2 RFC 1918 private IP networks: a 10 Gbps network used for storage and management traffic, and a 1 Gbps network used to carry guest VM traffic.
4. There is one NFS server in each pod, connected to the storage/management network via a 10 Gbps Ethernet link.
5. Link aggregation may be used in the datacenter core and aggregation switches. Link aggregation is not used in TOR switches, hosts, or primary storage NFS servers.
6. A high-performance NFS server is directly connected to the datacenter aggregation switch layer and is used as the secondary storage server for this datacenter.

The datacenter core and aggregation switches set up the appropriate network ACLs to ensure that the various networks are properly isolated. The following table details best practices on whether access should be allowed or denied based on source and destination:

  Source \ Destination         | Storage/Mgmt Network | Guest Network | Secondary Storage NFS Server | Public Internet
  Storage/Mgmt Network         | Allowed              | Denied        | Allowed                      | NAT'ed
  Guest Network                | Denied               | Allowed       | Denied                       | NAT'ed
  Secondary Storage NFS Server | Allowed              | Denied        | Allowed                      | Denied
  Public Internet              | Denied               | Denied        | Denied                       | Allowed

The detailed network and IP address configuration is as follows:

  Storage/Management Network: Each host in the pod must have an IP address in the storage/management network. CloudPlatform will also use a small number of private IP addresses for system VMs, so a minimum of a /27 RFC 1918 private IP range must be allocated for each pod. These IP addresses are used exclusively by CloudPlatform. Each pod must have a different address range for storage/management.
  Guest Network: The number of guest IP addresses for each pod is determined by the profile of the VMs supported. For example, if VMs have 2 GB of memory on average, plan for 64 VMs on each host and therefore 1024 VMs in each pod. To be safe, allocate a /21 RFC 1918 private IP range for the guest network in each pod, allowing a maximum of 2048 VMs to be created. Guest network IP ranges in different pods must not overlap. Cloud operators may choose to create site-to-site VPN tunnels that enable VMs in different availability zones to communicate with each other via their private IP addresses; if that is a requirement, guest network IP ranges in different availability zones must not overlap either.
  Secondary Storage Server IP: One or more RFC 1918 IP addresses for the NFS server.
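The address-plan arithmetic above can be double-checked programmatically. A minimal sketch using Python's standard ipaddress module; the subnet values are placeholders, not recommended ranges.

  # Sanity-check the per-pod IP plan described above.
  import ipaddress

  HOSTS_PER_POD = 16
  AVG_VM_MEMORY_GB = 2
  HOST_MEMORY_GB = 128          # from the computing node specification below

  mgmt = ipaddress.ip_network("10.1.0.0/27")    # storage/management, per pod
  guest = ipaddress.ip_network("10.8.0.0/21")   # guest network, per pod

  vms_per_host = HOST_MEMORY_GB // AVG_VM_MEMORY_GB   # 64
  vms_per_pod = vms_per_host * HOSTS_PER_POD          # 1024

  print("mgmt network addresses:  %d" % mgmt.num_addresses)   # 32
  print("guest network addresses: %d" % guest.num_addresses)  # 2048
  print("expected VMs per pod:    %d" % vms_per_pod)
  assert vms_per_pod <= guest.num_addresses, "guest range too small"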
Cloud-Era Availability Zone Hardware

  Load Balancer: NetScaler MPX.
  Core Switch and Aggregation Switch: Follow established networking practices.
  TOR Switch: 2 per pod (one 10 G and one 1 G), 24 ports each. More ports are required if IPMI or iLO is used to manage the individual hosts.
  Computing Node: Intel or AMD CPU server with at least 2 GHz, 2 sockets, 6 cores per socket, 128 GB of memory, and 250 GB of RAID 1 local disk storage.
  Primary Storage NFS Server: Sized based on the profiles of VMs the cloud is designed to support. CloudPlatform supports thin provisioning, and primary storage sizing can take advantage of this to reduce the initial storage requirements (see the sizing calculation below).
  Secondary Storage NFS Server: Sized according to the number of hosts and VM profiles (see the sizing calculation below).

Primary Storage Sizing

Primary storage sizing is based on the VM profile. The formula for calculating the primary storage for each pod-specific NFS server is as follows:

  R = average size of the system/root disk
  D = average size of a data volume
  N = average number of data volumes attached per VM
  V = total number of VMs per pod

The size of the primary storage required per pod is V * (R + (N * D)).

Overprovisioning is supported on NFS storage devices in CloudPlatform and can be used to reduce the initial size requirement of the primary storage per pod.

Secondary Storage Sizing

For secondary storage sizing, use the following formula:

  N = number of VMs in the zone
  S = average number of snapshots per VM
  G = average size of a snapshot
  T = number of templates in the zone
  I = number of ISOs in the zone

The secondary storage size is ((N * S * G) + (I * average size of ISOs) + (T * average size of templates)) * 1.2.

A 20% spare capacity is built into the formula. The actual size could be further reduced by the following factors:

  Deduplication in the storage array.
  Thin provisioning.
  Compression.
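Both formulas are straightforward to script. A minimal sketch follows; all input values in the examples are illustrative assumptions, not sizing recommendations.

  # Sketch of the two sizing formulas above.
  def primary_storage_per_pod(root_gb, data_gb, vols_per_vm, vms_per_pod):
      """V * (R + N*D): raw primary storage needed per pod, in GB."""
      return vms_per_pod * (root_gb + vols_per_vm * data_gb)

  def secondary_storage_per_zone(vms, snaps_per_vm, snap_gb,
                                 templates, template_gb, isos, iso_gb):
      """((N*S*G) + (I*iso) + (T*template)) * 1.2, in GB."""
      raw = (vms * snaps_per_vm * snap_gb
             + isos * iso_gb + templates * template_gb)
      return raw * 1.2   # 20% spare capacity built in

  # Example: 1024 VMs per pod, 20 GB root disks, one 40 GB data volume
  # each: 1024 * (20 + 40) = 61440 GB raw, before overprovisioning.
  print(primary_storage_per_pod(20, 40, 1, 1024))

  # Example: 1000 VMs, 2 snapshots of 10 GB each, 20 templates of 8 GB,
  # 10 ISOs of 4 GB: (20000 + 40 + 160) * 1.2 = 24240 GB.
  print(secondary_storage_per_zone(1000, 2, 10, 20, 8, 10, 4))

Note that the primary figure is raw demand; overprovisioning, thin provisioning, deduplication, and compression reduce what must actually be purchased, as described above.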
Cloud-Era Availability Zone Configuration

We will configure CloudPlatform as follows:

1. Each pod consists of 2 XenServer pools.
2. There are 8 hosts in each pool.
3. Create 2 NFS exports in the primary storage NFS server for each pool.

Availability Zone(s): 1 (ZONE-01). A minimum of two availability zones is always recommended.

  Network Mode: Basic (L3 network model). This zone has two hundred pods and two clusters in each pod. The configuration is specified for one cluster and can be replicated for all other clusters in all the pods.

\ZONE-01\PODS\ (Z-01-POD01-Xen-CL01 through CL04)

  Name of Cluster(s): Z-01-POD01-Xen-CL01, Z-01-POD01-Xen-CL02, Z-01-POD02-Xen-CL03, Z-01-POD02-Xen-CL04. These pod and cluster names are specific to the implementation.
  Number of Hypervisors (compute nodes) per Cluster: 8 x XenServer 6.0.x

Storage Infrastructure

  Type / Make: NetApp FAS3270, with two controllers for availability. This is an example; calculate the capacity requirements for primary and secondary storage using the formulas in the sections above.
  Number of Controllers: 2
  Primary Protocol: NFS
  Available Capacity: 20 TB per pod

Primary Storage (two per cluster), Z-01-POD01-CL (replicate this for every cluster)

  Availability Zone: ZONE-01
  Pod: Z-01-POD01
  Cluster: Z-01-POD01-CL01
  Protocol: NFS
  Size: 4 TB. The zone/pod/cluster names are examples; refer to the computation in the section above to determine the exact size.
  Path / LUN: NFS:/PS/Z-01-CL01-PS01/ and NFS:/PS/Z-01-CL01-PS02/ (sample paths).

Secondary Storage (Z-01-SS01)

  Type / Make: NetApp FAS3270
  Number of Controllers: 2
  Primary Protocol: NFS
  Available Capacity: 50 TB per zone. Calculate the capacity requirement from the formula in the section above.

Host Configuration

  XenServer Version: 6.0.2, the latest version of XenServer.
  XenServer Edition: Advanced
  Server Hardware Specifications: HP DL360p Gen8
  Networking Configuration: one 1 G NIC and one 10 G NIC, plus 1 NIC for IPMI. The 1 G NIC is dedicated to the public network and the 10 G NIC to the private/storage network.
  Number of XenServer Hosts (computing nodes): 16 per pod.

Network Configuration

  Distribution (Core) Switch: Juniper EX4500
  Access Switch: Juniper EX4200 (4), 2 per pod, 48 10 G ports.

Traditional Availability Zone Deployment

Overview

In this section we describe how to design and configure a 64-node traditional server virtualization availability zone. The availability zone consists of 4 pods, each comprised of 16 nodes. Unlike the cloud-era setup, where each pod has its own NFS server, the entire zone shares a centralized storage server over a SAN. The availability zone is connected to 4 shared VLANs: public, DMZ, test-dev, and production. In addition, tenants can be allocated isolated VLANs from a pool of zone VLANs. A VM can be connected to one or more of these networks:

  An isolated VLAN, NAT'ed to the public Internet via the virtual router
  The DMZ VLAN
  The test-dev VLAN
  The production VLAN

In the physical network setup for a traditional availability zone, every host is connected to 3 networks:

1. A storage network that connects the host to primary storage. Storage multipath technology should be used to ensure reliability.
2. An untagged Ethernet network used for management and vMotion traffic. NIC bonding should be used to ensure reliability.
3. An Ethernet network used for shared and public VLAN traffic. NIC bonding should be used to ensure reliability. This network carries the 4 shared VLANs (public, DMZ, test-dev, and production) as well as the isolated zone VLANs.

Either a 1 Gbps or a 10 Gbps network can be used, depending on the workload and VM density requirements. The detailed network and IP address configuration is as follows:

  Storage Area Network: Apply the vendor's best practices for SAN setup.
  Management/vMotion Network: Each host needs 1 RFC 1918 private IP address. CloudPlatform consumes additional private IPs for system VMs such as CloudPlatform virtual routers. Reserve at least a /22 private IP address range to ensure plenty of private IPs (1024) are available for system VMs. Management/vMotion network IP ranges in different pods must not overlap.
  VLAN Network: Carries tagged VLAN traffic for shared and isolated VLANs.
  Secondary Storage Server IP: One or more RFC 1918 IP addresses for the NFS server.

Traditional Availability Zone Hardware

  Core Switching Fabric: Follow established networking practices.
  TOR Switch: 2 per pod, 48 ports each to allow NIC bonding.
  Computing Node: Intel or AMD CPU server with at least 2 GHz, 2 sockets, 6 cores per socket, 128 GB of memory, and 250 GB of RAID 1 local disk storage.
  Primary Storage Server: Sized based on the profiles of VMs the cloud is designed to support.
  Secondary Storage NFS Server: Sized according to VM profiles.

Primary Storage Sizing

Primary storage sizing is based on the VM profile, using the following formula:

  S = average size of the system/root disk
  D = average size of a data volume
  N = average number of data volumes attached per VM
  V = total number of VMs per pod
  R = number of pods in the zone

The size of the primary storage required for the zone is R * V * (S + (N * D)). If using tiered storage, which is quite common for traditional enterprise-style workloads, repeat the calculation for each tier, as shown in the sketch below.
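For the traditional zone, the only change from the cloud-era calculation is the pod multiplier R. A minimal sketch under example inputs (not sizing recommendations):

  # R * V * (S + N*D): zone-wide primary storage for the traditional zone.
  # Repeat per tier if tiered storage is used.
  def primary_storage_per_zone(pods, vms_per_pod, root_gb,
                               data_gb, vols_per_vm):
      return pods * vms_per_pod * (root_gb + vols_per_vm * data_gb)

  # Example: 4 pods, 16 hosts/pod at 60 VMs each = 960 VMs per pod;
  # 20 GB root disk and one 50 GB data volume per VM.
  print(primary_storage_per_zone(4, 960, 20, 50, 1))   # 268800 GB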
Secondary Storage Sizing

For secondary storage sizing, use the same formula as for the cloud-era zone:

  N = number of VMs in the zone
  S = average number of snapshots per VM
  G = average size of a snapshot
  T = number of templates in the zone
  I = number of ISOs in the zone

The secondary storage size is ((N * S * G) + (I * average size of ISOs) + (T * average size of templates)) * 1.2.

A 20% spare capacity is built into the formula. The actual size could be further reduced by deduplication in the storage array, thin provisioning, and compression.

Choice of Hypervisor in Traditional Availability Zone

There are a variety of hypervisor choices for a traditional availability zone. The following table lists the recommended configuration for each hypervisor type:

                  | XenServer               | vSphere
  Primary Storage | NFS                     | iSCSI or FC
  Storage Network | Link Aggregation (LACP) | Multipathing
  Cluster Size    | 8                       | 8

Traditional Availability Zone Configuration (for vSphere)

Availability Zone 1: ZONE-VMW-01

  Name of Zone: ZONE-VMW-01 (sample name).
  Network Mode: Advanced networking with VLANs. Advanced networking is a stipulation for using the VMware ESX hypervisor with vCenter.
  VLAN Type: Tagged VLANs
  Guest Networks CIDR: 10.2.1.0/24. This CIDR is a sample; the CloudPlatform administrator can choose one based on their networking best practices.
  Guest VLAN Range: 300-1000. These VLANs are allocated to each account and to any isolated/shared network created apart from a guest network. You can compute a range roughly at an average of 3 VLANs per customer.
  Guest Networks (VM traffic): VMware switch vSwitch0. The virtual switches specified are sample names/values and should be changed to fit actual configurations.
  Storage Network: VMware switch vSwitch3.
  Management Network (control plane traffic): VMware switch vSwitch1.
  Public Network: VMware switch vSwitch2.

POD (Z01-POD01); replicate this for each pod

  Pod Name: Z01-POD01 (sample name).
  Start Reserved System IPs: 10.144.53.201
  End Reserved System IPs: 10.144.53.235. These addresses are examples and need to be changed to suit your network configuration. They are used for the CloudPlatform hosts, storage, and network devices within the pod; for 64 hosts, enough IPs should be allocated for virtual routers, storage VMs, and console proxy VMs.
  Number of Clusters: 2. Citrix recommends 8 servers per cluster; this provides the optimum management-to-performance ratio.
  Cluster Names: Z01-POD1-VMW-CL01, Z01-POD1-VMW-CL02 (sample names).
  Hypervisor: VMware ESXi 5.0. vCenter must use port 443 (the default).

Compute Nodes in Cluster (replicate for each cluster)

  Number of Servers: 8 hosts per cluster.
  Make & Model: Cisco UCS B230 M1 blade servers.
  CPUs: 2 x 6-core Intel CPUs
  Memory: 128 GB RAM. 128 GB should be sufficient for most workloads but can be increased based on the target workload and hypervisor capacity.
  Target Number of VMs (guest VMs): 60 per server on the Cisco UCS B230 M1 blades.

Network Hardware

  Access Switches: 2 x Cisco Nexus 5548

Storage Hardware

  Shared Hypervisor Storage (Primary Storage): EMC VNX 7500; protocol VMFS; VMFS datastore Z1-P1-CL01PS01 (sample name). Citrix recommends a minimum of two primary storage volumes per cluster. Use the VMFS file system when storage is connected by iSCSI or FC.

Traditional Availability Zone Configuration (for XenServer)

Availability Zone 1: ZONE-XEN-01

  Name of Zone: ZONE-XEN-01 (sample name).
  Network Mode: Advanced networking with VLANs. Advanced networking is a stipulation for using the Citrix XenServer hypervisor.
  VLAN Type: Tagged VLANs
  Guest Networks CIDR: 10.2.1.0/24. This CIDR is a sample; the CloudPlatform administrator can choose one based on their networking best practices.
  Guest VLAN Range: 300-1000. These VLANs are allocated to each account and to any isolated/shared network created apart from a guest network. You can compute a range roughly at an average of 3 VLANs per customer.
  Guest Networks (VM traffic): XenServer bridge network label cloud-guest. These labels are examples and should be changed to match the XenServer configuration.
  Storage Network: XenServer bridge network label cloud-storage.
  VM Management Network (control plane traffic): XenServer bridge network label cloud-mgmt.
  Public Networks: XenServer bridge network label cloud-pub.

POD (Z01-POD01); replicate this for each pod

  Pod Name: Z01-POD01 (sample name).
  Start Reserved System IPs: 10.144.53.201
  End Reserved System IPs: 10.144.53.210. These addresses are examples and should be changed to suit your network configuration. They are used for the CloudPlatform hosts, storage, and network devices within the pod; XenServer uses link-local addresses for virtual routers and system VMs.
  Number of Clusters: 2. Citrix recommends 8 servers per cluster; this provides the optimum management-to-performance ratio.
  Cluster Names: Z01-POD1-XEN-CL01, Z01-POD1-XEN-CL02 (sample names).
  Hypervisor: Citrix XenServer 6.0.x

Compute Nodes in Cluster (replicate for each cluster)

  Number of Servers: 8 hosts per cluster.
  Make & Model: HP DL360 G8
  CPUs: 2 x 6-core Intel CPUs
  Memory: 128 GB RAM. 128 GB should be sufficient for most workloads but can be increased based on the target workload and hypervisor capacity.
  Target Number of VMs: 60 per server.

Network Hardware

  Access Switches: 2 x Cisco Nexus 5548

Storage Hardware

  Shared Hypervisor Storage (Primary Storage): NetApp FAS3240AE; protocol NFS; NFS mounts /Z1-P1-CL01-PS01/ and /Z1-P1-CL01-PS02/ (sample mount names). Citrix recommends a minimum of two primary storage volumes per cluster.
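As a closing sanity check, the capacity implied by the tables above can be tallied. A small sketch using only figures from this section; the 3-VLANs-per-customer average is the rough planning figure quoted earlier.

  # Capacity cross-check for the XenServer zone described above.
  PODS = 4
  CLUSTERS_PER_POD = 2
  HOSTS_PER_CLUSTER = 8
  VMS_PER_HOST = 60                  # target VM density per server
  VLAN_RANGE = range(300, 1001)      # guest VLANs 300-1000
  VLANS_PER_CUSTOMER = 3             # rough planning average

  hosts = PODS * CLUSTERS_PER_POD * HOSTS_PER_CLUSTER      # 64
  vm_capacity = hosts * VMS_PER_HOST                       # 3840
  customers = len(VLAN_RANGE) // VLANS_PER_CUSTOMER        # ~233

  print("hosts in zone:   %d" % hosts)
  print("VM capacity:     %d" % vm_capacity)
  print("tenant capacity: ~%d (limited by the guest VLAN pool)" % customers)

If the tenant count is the binding constraint rather than VM capacity, widening the guest VLAN range (or moving to the cloud-era, security-group-based zone design) is the lever to adjust.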