HC3 User Guide HyperCore v6.4.2 This document supports up to the version of product listed and all previous versions. Check for new editions of this document at http://www.scalecomputing.com/support/ login/. HC3 User Guide: HyperCore v6.4.2 You can find the most up-to-date technical documentation on the Scale Computing website and Portal at: http://www.scalecomputing.com/resources/technical-resources/ http://www.scalecomputing.com/support/login/ The Scale Computing site also provides the latest product updates and information: http://www.scalecomputing.com/ Provide feedback on this document at: [email protected] Document Version 3.0 Corporate Headquarters West Coast Office 5225 Exploration Drive 360 Ritch Street Indianapolis, IN 46241 Suite 300 P. +1 877-722-5359 San Francisco, CA 94107 www.scalecomputing.com EMEA Office Saunders House 52-53 The Mall London W5 3TA United Kingdom 1-877-SCALE-59 (877-722-5359) Content About this Document 5 1 HC3 Powered by HyperCore Overview 6 HC3 System Architecture6 Software Defined Storage 6 Software Defined Computing 6 Guest Virtual Machine Data Management 7 Virtual Machine Placement 7 Virtual Machine Live Migration 7 High Availability7 Failure Scenarios8 Node Failure8 Drive Failure8 Network Failure8 2 Accessing the User Interface10 Prerequisites10 Browser Security10 Log In11 Remote Support Assistance11 3 Overview of the User Interface 13 Layout13 Heads Up Display and Cluster Display Panel 14 Virtual Machine Management Panel 15 Control Center Panel16 4 System Registration and Setup18 Settings18 Add an ISO File19 Remove an ISO File20 5 Create, Organize, and Edit Virtual Machines 21 Prerequisites21 Tags and Virtual Machines21 Search for Virtual Machines23 Create a Virtual Machine24 Edit a Virtual Machine25 Change a VM Name25 Change a VM Description25 Add or Remove Tags25 Edit CPU Allocation26 Edit RAM Allocation26 Mount or Unmount ISO Files 26 Edit, Add, or Remove Drives Edit, Add, or Remove NICs Connect or Disconnect NICs 27 27 28 6 Export or Import a Virtual Machine 29 Prerequisites29 Export a Virtual Machine29 Import a Virtual Machine30 7 Manage Virtual Machines32 Move a Virtual Machine32 Clone a Virtual Machine32 Snapshot a Virtual Machine33 Restore a Virtual Machine from a Snapshot 33 Delete a Virtual Machine Snapshot 34 Delete a Virtual Machine34 Best Practices for Virtual Machine Management 34 8 HyperCore Replication36 Prerequisites36 Add or Remove a Remote Cluster Connection 37 Replicate a Virtual Machine37 Pause or Resume Replication of a Virtual Machine 38 Restore from a Replication Virtual Machine 38 Cancel Virtual Machine Replication 39 9 System Monitoring and Maintenance 40 Alerts and Log40 Conditions40 Alerts41 ITEMS41 Cluster Log41 Tasks42 Shutdown the Cluster43 Power on the Cluster43 Update the Cluster43 Remote Support44 Drive Failure45 Node Failure46 Capacity and Resource Management 47 Appendix A48 Appendix B51 About this Document Audience, feedback, and support Intended Audience This guide is intended for HC3 users to better understand the HyperCore Operating System (HCOS). It is assumed that the user has a general understanding of virtualization and is looking to better operate and manage HC3. Document Feedback Scale Computing welcomes your suggestions for improving our documentation. If you have any comments or suggestions please send your feedback to [email protected]. Technical Support and Resources There are many technical support resources available for use. 
Access this document, and many others, at http://www.scalecomputing.com/resources/technical-resources/ or http:// www.scalecomputing.com/support/login/. Online Support You can submit support cases and view account information online through the Portal at http://www.scalecomputing.com/support/login/. You can also Live Chat with support through www.scalecomputing.com or email support at [email protected] during standard hours Monday-Friday 8 AM to 6 PM ET. Telephone Support Support is available for critical issues 24/7 by phone at 1-877-SCALE-59 (877-722-5359) in the US and at 0808 234 0699 in Europe. Telephone support is recommended for the fastest response on priority issues. Professional Scale Computing offers many professional service options for both remote and Resources on-site assistance in getting your cluster up and running quickly and knowledgeably. Contact your Scale Computing sales representative today to discuss our service offerings. Find additional information at http://www.scalecomputing.com/products/support/. 1 HC3 Powered by HyperCore Overview How it works Scale Computing’s HC3 and the HyperCore architecture were designed to provide highly available, scalable compute and storage services while maintaining operational simplicity through highly intelligent software automation and architecture simplification. The following topics are included in this chapter: •HC3 System Architecture •Guest Virtual Machine Data Management •Failure Scenarios HC3 System Architecture HC3 and HyperCore are based on a 64-bit hardened and proven OS kernel – and leverage a mixture of patented proprietary and adapted open source components. With HC3 there is no separate management server (or VM appliance) to install or connect to, or a single “brain” that controls the overall system. You can manage the entire system simply by pointing a web browser to the LAN IP address of any node in the cluster and you will get the same cluster-wide management view. Also, all nodes of the HC3 system coordinate with each other to monitor the overall state of the cluster, nodes, hardware components, and virtual machines running across the entire system. Software Defined Storage A critical software component of HyperCore is the Scale Computing Reliable Independent Block Engine, known as SCRIBE. SCRIBE is an enterprise class clustered block storage layer, purpose built to be consumed by the HC3 embedded KVM based hypervisor directly. SCRIBE discovers and aggregates all block storage devices across all nodes of the system into a single managed pool of storage. All data written to this pool is immediately available for read or write by any and every node in the storage cluster, allowing for sophisticated data redundancy and load balancing schemes to be used by higher layers of the stack--such as the HC3 compute layer. SCRIBE is not simply a re-purposed file system with the overhead introduced by local file or file system abstractions such as “virtual hard disk files” that attempt to act like a block storage device. Performance killing issues such as disk “partition alignment” with external RAID arrays also go away. Arcane concepts like storing VM snapshots as “delta files” that later have to be merged through I/O killing brute force reads and re-writes are a thing of the past as well. 
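The short Python sketch below is included only to illustrate the mirroring and wide-striping idea described above; it is a simplified model written for this guide, not SCRIBE source code, and the node, disk, and chunk structures in it are invented for the example.

```python
# Illustrative model only (not SCRIBE source code). It shows the general idea of
# wide striping with mirroring: every chunk of data gets two copies, each placed
# on a different node, spread across the least-used disks in the pool.
from dataclasses import dataclass

@dataclass
class Disk:
    node: str
    name: str
    used_gb: float = 0.0

def place_chunk(disks, chunk_gb):
    """Pick two disks on two different nodes, favoring the least-used disks."""
    ordered = sorted(disks, key=lambda d: d.used_gb)
    primary = ordered[0]
    # The mirror copy must land on another node so a node failure leaves one copy intact.
    mirror = next(d for d in ordered[1:] if d.node != primary.node)
    primary.used_gb += chunk_gb
    mirror.used_gb += chunk_gb
    return primary, mirror

pool = [Disk(f"node{n}", f"disk{d}") for n in range(1, 4) for d in range(1, 4)]
for _ in range(6):  # write six 1 GB chunks into the example pool
    a, b = place_chunk(pool, 1.0)
    print(f"chunk copies -> {a.node}/{a.name} and {b.node}/{b.name}")
```

Because the two copies of every chunk always land on different nodes, the loss of a disk or an entire node still leaves a readable copy elsewhere in the pool.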
Software Defined Computing The HyperCore compute software layer is a lightweight, type 1 (bare metal) hypervisor that is directly integrated into the OS kernel, and leverages the virtualization offload capabilities 6 HC3 Powered by HyperCore Overview provided by modern CPU architectures. Specifically, it is based on components of the KVM hypervisor, which has been part of the Linux mainline kernel for many years and has been extensively field-proven in large-scale environments. HyperCore uses SCRIBE to integrate block storage objects directly into the KVM hypervisor. This means that VMs running on HC3 have direct access to SCRIBE “virtual storage device” (VSD) objects in the clustered storage pool without the complexity or performance overhead introduced by using remote storage protocols and accessing remote storage over a network. Unlike other seemingly “converged” architectures in the market, the storage layer is not a Virtual Storage Appliance (VSA) but instead interfaces directly with the hypervisor allowing data flows to benefit from zero copy shared memory performance. Guest Virtual Machine Data Management HyperCore utilizes various software and design tools to ensure creating a VM is simple and easy. Creating a VM not only persists those VM configuration parameters that will later tell the hypervisor how to create the VM container when it is started, it also physically creates storage objects using the SCRIBE distributed storage pool that will contain the virtual hard disks to present to the VM once it is started. HC3 VMs are able to access their virtual hard disk files directly as if they are local disks, without the use of any SAN or NAS protocols, regardless of where they are running at the time. HC3 virtual disks are thin provisioned so that virtual hard drive space is not fully allocated until it is actually used by the guest VM. Virtual Machine Placement All HC3 capable nodes have access to the entire pool of storage, which means that any VM can be started on any HC3 capable node in the cluster based on the availability of the compute resources that the VM requires. For example, if a new VM requires 16GB RAM there may only be certain nodes with that much RAM currently available and HC3 makes this node selection automatically. If more than one node has this amount of RAM available, the system can intelligently assign out the VM to the least utilized node. Virtual Machine Live Migration HC3 allows running VMs to be “live migrated” to other HC3 nodes without the VM being shutdown and with virtually no noticeable impact to the workload being run or clients connecting to it. Live migration can be used to organize VMs or manually balance load across nodes. Live migration is also an integral part of HyperCore rolling software updates. High Availability HC3 is designed to be fault tolerant. Whether it is a drive, a node, or a network link, the system will continue to run through a failure scenario. In the case of a network link or node failure, the 7 HC3 Powered by HyperCore Overview system will automatically recover from the failure once access is restored. If an HC3 node was running VMs prior to failure, those VM disks and data are still available to the remaining cluster nodes. This allows VMs to be automatically re-started on the remaining nodes with HyperCore selecting the ideal node placement for restarting, contingent on the system having the available resources to run the VMs while a single node is unavailable. 
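As a rough illustration of the placement behavior described above (start a VM on a node with enough free RAM, preferring the least utilized node), the sketch below shows one way such a selection could be expressed. It is a simplified Python example written for this guide; the actual HyperCore scheduler is internal to the system and may weigh additional factors, and the node names and RAM figures are example values.

```python
# Simplified illustration of the placement idea: a VM is started on a node that has
# enough free RAM for it, and when more than one node qualifies, the least-utilized
# node is preferred.
def pick_node(nodes, vm_ram_gb):
    candidates = [n for n in nodes if n["total_gb"] - n["used_gb"] >= vm_ram_gb]
    if not candidates:
        raise RuntimeError("No node currently has enough free RAM to start this VM")
    best = min(candidates, key=lambda n: n["used_gb"] / n["total_gb"])  # lowest utilization
    return best["name"]

nodes = [
    {"name": "node1", "total_gb": 64, "used_gb": 48},
    {"name": "node2", "total_gb": 64, "used_gb": 20},
    {"name": "node3", "total_gb": 64, "used_gb": 30},
]
print(pick_node(nodes, 16))  # prints "node2", the least-utilized node with 16 GB free
```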
Failure Scenarios HC3 systems are designed to sustain operation during a drive failure, node failure, or network link failure, making them highly available during the most common failure scenarios. Node Failure HyperCore was designed to sustain the temporary loss of an entire node while continuing to process data and running VMs using redundant resources from the remaining cluster nodes. Due to the number of different scenarios possible, we recommend that customers work with the Scale Support team on the best plan of action if the node failure is considered permanent. See the Node Failure section for information on how to assess a failed node. Drive Failure HyperCore was designed such that disk failures have little effect on the cluster beyond the temporary loss of the capacity and I/O resources of that disk. When a disk is determined to have failed, in addition to raising corresponding alerts in the user interface and remote alerting mechanisms, HyperCore will continue servicing I/O transparently using the mirrored data copies. HyperCore will also automatically use remaining available space to re-generate a second copy of any chunk that was previously located on the failed disk. Any new writes are mirrored across the remaining disks to ensure immediate protection of new and changed data and reduce future data movement once the failed disk is replaced. Note that unlike many RAID systems, this process does NOT need to perform time consuming and I/O intensive parity calculations to determine what data was lost. HyperCore simply reads the mirror blocks and writes a new copy using any available space on the cluster. Note that both read and write operations here leverage the same wide striping performance benefits as normal I/O. Not only does HyperCore not require a dedicated “hot spare” disk that sits idle most of the time, HyperCore does not even require a failed disk to be replaced or any manual steps before it automatically re-creates the lost mirror copies to regain full data redundancy. This ensures there is no critical need to replace a failed disk in the system for ease of system management. See the Drive Failure section for information on how to replace a failed drive. Network Failure HC3 nodes have two distinct physical networks in which they participate. A public network 8 HC3 Powered by HyperCore Overview provides a path to allow network access to virtual machines, as well as for end users accessing data from VM’s running on the cluster. A private backplane connection is for intra-cluster communication, such as the creation of redundant copies of data being written on one node or disk to a mirrored copy on another node. All current HC3 nodes offer redundant network ports for both the public LAN and private backplane network connections for a total of 4 network ports per node to allow for full network redundancy. These ports are in an active / passive bond for failover. If access to the primary port is lost on either the public or private node network, the secondary port will activate with little to no disruption to the cluster node. Once the primary port becomes available again the network will fail back to that port. All onboard network cards are used as the primary port for the HC3 nodes. 9 2 Accessing the User Interface Browsers, login, troubleshoot The HC3 Web Interface, known as the UI for the rest of this guide, is the single landing page for managing the entire HC3 system. 
The following topics are included in this chapter:
•Prerequisites
•Browser Security
•Log In
•Remote Support Assistance

Prerequisites
In order to access the UI you will need the following items:
•A machine that can access the LAN IPs of the Scale HyperCore cluster on your network
•A supported web browser; review the Scale Computing HC3 Support Matrix for the most current support details

Browser Security
Depending on your browser choice and the browser version, you will be prompted to confirm the security certificate of the Scale HC3 system. Chrome is the recommended web browser. In the case of the Chrome web browser you will see the page in Figure 2-1 below.
Figure 2-1
This warning is generated for the unsigned security certificate used by the UI. It is safe to proceed by clicking “Advanced” and then “Proceed to...” as shown in Figure 2-2.
Figure 2-2

Log In
You will need to receive the login credentials from Scale Support. These are normally provided as part of the installation service engagement.
1. In the address bar of your web browser, type the LAN IP address of any node in the cluster. NOTE: Be sure to use the HTTPS protocol when connecting to the LAN IP.
2. Follow the Browser Security section to proceed to the login prompt if you are warned about the security certificate.
3. Enter admin as the username and use the password provided to you by Scale Support.
4. Click Login.
Resolve any login errors you may experience using Table 2-1 below.

Table 2-1
Error: Certificate or connection warning when accessing the node LAN IP
Symptoms: (1) When initially accessing the node LAN IP you receive this message despite having previously accepted the security certificate. (2) You do not accept the browser security certificate, then try to log in and receive this message.
Resolution: (1) Ensure the node LAN IP you entered is accessible on the network and that there have been no cluster events that would have disabled the UI service on the node. (2) Accept the browser security certificate in order to access the UI normally.
Error: Login failure message
Symptoms: When trying to log in, this message is generated in the lower right corner of the browser even after accepting the security certificate.
Resolution: You are attempting to log in with incorrect login credentials. Verify the credentials and try again.

Remote Support Assistance
The Remote Support tool creates a secure tunnel that allows the Scale Support team to remotely access the HC3 cluster to diagnose and resolve cluster issues as if they were next to you at the cluster console. See the Remote Support section for further details. If you need Scale Support assistance but are unable to access your cluster UI for any reason, there is a Remote Support link available on the login prompt.
1. Click Remote Support on the login prompt.
2. Enter the Support Code number provided to you by Scale Support in the popup dialog.
3. Click Open.
4. A secure tunnel will be initiated with Scale Support.

3 Overview of the User Interface
Navigation
It is important to understand how the UI operates and how each panel and display can assist you in monitoring your system and managing day-to-day tasks quickly, easily, and intuitively.
The following topics are included in this chapter:
•Layout
•Heads Up Display and Cluster Display Panel
•Virtual Machine Management Panel
•Control Center Panel
Review Appendix A and Appendix B if you are unfamiliar with the VM icons or cues mentioned in this chapter.

Layout
Figure 3-1 shows the default layout of the UI before any VMs have been created. The four numbered areas in Figure 3-1 are described below.
Figure 3-1
1. Heads Up Display (HUD) - Displays at-a-glance system information such as CPU, RAM, capacity, action items, firmware version, and the cluster name. Used with the Cluster Display Panel to monitor system health.
2. Cluster Display Panel - Part display and part management panel, you can toggle between RAM (VM) and DISK (capacity) information about the system as well as discern node status details. From the RAM view, you can use the display panel to complete some VM management tasks.
3. Control Center Panel - Check cluster history, see cluster action items in detail, upload ISO files, shutdown the cluster, establish a remote cluster connection for replication, change cluster settings, and accomplish many other system tasks from this panel.
4. Virtual Machine Management Panel - Create, import, tag, clone, edit, organize, and complete many other day-to-day VM tasks on the system.

Heads Up Display and Cluster Display Panel
The HUD and the Cluster Display Panel are an important part of monitoring and managing system health, performance, and redundancy. For these reasons, various key items on the HUD and Cluster Display Panel are interactive to provide you quick access to common cluster tasks.
The Scale Computing logo in Figure 3-2 opens a new browser tab to www.scalecomputing.com/support for quick Scale Support contact information. The assigned cluster name, “d12c-Primary” in Figure 3-2, links to the Settings tab in the Control Center Panel.
Figure 3-2
LOGOUT in Figure 3-3 will log out of the UI. The HyperCore version number, v6.0.0.152503 in Figure 3-3, links to the Update tab in the Control Center Panel. When an update becomes available you will see Update Available appear as in Figure 3-4.
Figure 3-3
Figure 3-4
In Figure 3-5, the ITEMS gauge links to the Conditions tab in the Control Center Panel.
Figure 3-5
The Cluster Display Panel also uses interactive elements. A running VM is represented as a bar on the node. The size of the bar corresponds to the RAM footprint of that VM in relation to the overall RAM of the node. See the VM named “qa-Indy-win81” as an example in Figure 3-6 below.
Figure 3-6
You can single-click any VM running on one of the nodes in the RAM view to quickly refocus the browser on that VM and open the group housing the VM’s card if it is collapsed. You can also double-click any VM to access the VM’s console in a new browser tab.

Virtual Machine Management Panel
The VM Management Panel is the core of the UI. All VMs established on the cluster are accessed, monitored, and managed through this panel. Each individual VM is displayed visually as an informational, interactive “card.” Each aspect of the VM can be centrally managed and manipulated without the hassle of multiple screens or windows. Figure 3-7 shows an example of a series of VMs, some running and some powered down, and the three key VM management screens.
•The Summary screen, identified by the “gauge” icon, is displayed on VM AD1.
•The Devices screen, identified by the “gear” icon, is displayed on VM AD2 (the VM is currently powered down).
•The Snapshots and Replication screen, identified by the “layers” icon, is displayed on VM win2012r2-production.
Figure 3-7
Find further details on the icons, status color coding, and the tagging and grouping used in Figure 3-7 in Appendix B and the Tags and Virtual Machines section.
15 Overview of the User Interface Control Center Panel The Control Center Panel is used to monitor system logs, check system health, set system configurations, establish remote cluster sessions for replication, and more. The Control Center Panel is accessed through the “gear” icon to the left of the VM Management Panel icons as shown in Figure 3-8. Figure 3-8 Figure 3-9 shows the main screen of the Control Center Panel when it is opened. The Control Center Panel opens over the VM Management Panel in the lower portion of the UI. Figure 3-9 The Control Center Panel features tabs on the left-hand side to access various cluster settings, tools, and information. These tabs are listed from top to bottom below. 1. The Cluster Log tab is used to view cluster events and history. See the Alerts and Log section for more information. 2. The Conditions tab is used to view current cluster conditions. See the Alerts and Log section for more information. 3. The Control tab is used to gracefully shutdown the cluster. See the Shutdown the Cluster section for more information. 4. The Media tab is used to manage ISO files that have been uploaded to the cluster. See the Add an ISO File and Remove an ISO File sections for more information. 5. The Remote Clusters tab is used to add and manage connections to remote clusters for replication. See the Add or Remove a Remote Cluster Connection section for more information. 6. The Remote Support tab allows remote support tunnels to be opened on a per-node basis for support assistance. See the Remote Support section for more information. 16 Overview of the User Interface 7. The Settings tab should be completed during initial cluster setup and contains all system settings. See the Settings section for more information. 8. The Update tab is used to view and run any available cluster firmware updates. See the Update the Cluster section for more information. 9. The Exit tab will not bring up a new screen but will close out of the Control Center Panel and bring the VM Management Panel back into focus. This can also be accomplished by clicking the Control Center Panel icon again. 17 4 System Registration and Setup Initial configurations and media Once the HC3 nodes are working as a cluster, as part of the final steps of the installation it is important to set up initial system configurations and settings in the UI. You will also need to upload ISO file(s) to the cluster before creating any VMs or exploring the cluster tools for VM management. For assistance installing a HC3 cluster, refer to the Scale Computing HyperCore Installation Guide available online. If an installation service has been included with your HC3 cluster purchase you can also contact Scale Support for assistance. The following topics are included in this chapter: •Settings •Add an ISO File •Remove an ISO File Settings Settings should be completed from top to bottom in the Settings tab of the Control Center Panel. Some settings are optional. 1. Cluster Name - A name to identify this particular cluster. 2. Registration - Register the cluster to your account in the Scale Computing database with a Company Name, Technical Contact, Phone number, and Email address. 3. DNS - Configure DNS Server(s) and Search Domains for the cluster. Scale always recommends including an external physical or virtual DNS server for the cluster to use in conjunction with any VMs running on the cluster. 4. Time Server - Indicate a local or external time server for the cluster. This can be an IP address or a DNS name. 5. 
Time Zone - Select your timezone from the menu. This is important for cluster logging and the statemachine. 6. SMTP Settings - Set up an SMTP server to use for email alerts for the cluster. This is very important for monitoring cluster health and availability. 7. Email Alert Recipients - Manage and test cluster email alert recipient addresses. The default address [email protected] is an optional configuration that can be kept or removed; the address sends alerts to an unmonitored listing at Scale Computing for historical reference only. 8. Syslog Servers - (Optional) Configure a syslog server to use with the system. It will capture the standard cluster email alerts only. 9. Admin Password - (Optional) Change the UI password for the admin account. This does not impact any other cluster access, only the UI login. Scale Support cannot reclaim any forgotten UI passwords, only restore the access to the default login credentials. 18 System Registration and Setup Add an ISO File A newly initialized HyperCore cluster will not have any ISO files currently uploaded when you open the Media tab under the Control Center Panel. The Media tab will be empty, as seen in Figure 4-1 below. The Media screen also includes the simplified instructions to drag and drop ISO files into the browser to upload them onto the cluster. Figure 4-1 1. Ensure the file type is listed as *.iso. The cluster will not recognize any other formats. 2. Drag and drop the file from the original location to anywhere on the cluster UI. You do not need to have the Media tab open. The icon will change to a “move” image to indicate the behavior as shown in Figure 4-2. Figure 4-2 3. There will not be any confirmation for the upload. Monitor the upload progress through the Media tab in the Control Center Panel as shown in Figure 4-3. Figure 4-3 4. The cluster will inform you if you try to upload a duplicate ISO file, as shown in Figure 4-4. 19 System Registration and Setup Figure 4-4 Remove an ISO File Removing an ISO file from the cluster is easy through the Media tab. 1. Navigate to the Media tab. 2. Click delete next to the correct ISO file as shown in Figure 4-5. The selected line will change to brown before the ISO file is removed from the list and the cluster. See the Note for important details on deleting an ISO file. Figure 4-5 NOTE: The cluster will prevent an ISO that is currently mounted on a VM from being deleted to protect VM integrity. You can monitor how many times an ISO is mounted under the Mounts column in the Media tab. Any mounted ISO will show a grayed out , while a free ISO will show a black and 0 mounts, and can be removed as instructed. 20 5 Create, Organize, and Edit Virtual Machines Build and shape Ease of use and simplified management is the core to VM creation and day-to-day operations on the HyperCore cluster. The following topics are included in this chapter: •Prerequisites •Tags and Virtual Machines •Search for Virtual Machines •Create a Virtual Machine •Edit a Virtual Machine Review Appendix A and Appendix B if you are unfamiliar with the VM icons or cues mentioned in this chapter. 
Prerequisites You need to confirm you have the following items in place before creating a new VM on the cluster: •Verify the OS and/or application you are virtualizing is approved by the manufacturer for virtualization in general and specifically on KVM platforms •Verify that you have proper licensing for your environment and the OS and/or application you will be virtualizing •If the application you are virtualizing requires outside USB hardware functionality ensure proper USB-over-IP networking is in place •Verify an ISO file is uploaded to the cluster to use to boot and install the VM •Alternative to the ISO file, verify a network Preboot Execution Environment (PXE) server is available to use to boot and install the VM Tags and Virtual Machines Tags are an organizational feature that are used to group and quickly search for VMs on the cluster. Tags are an optional addition at the time of VM creation; tags can also be added or removed at any time after VM creation. There are two types of tags, a group tag--shortened to group for the rest of this guide--and an informational and search tag--shortened to tag for the rest of this guide. The group will always be the first tag assigned to the VM. Any additional tags after the group will be informational and also used in the VM search feature along with the group. Tags are displayed in a comma separated list below the VM name and description on the VM card, as shown in Figure 5-1. 21 Create, Organize, and Edit Virtual Machines Figure 5-1 Tags cannot contain spaces. When adding tags, a space or the return key will create the entry as a new tag. When removing tags, the first tag will always be the group. Deleting the group will cause the next tag in line to become the group. See Figure 5-2 and Figure 5-3 for an example. The “test” tag currently set as the group in Figure 5-2 is deleted and the next tag in line, “win7,” becomes the group in Figure 5-3. Figure 5-2 Figure 5-3 Groups are used to organize VMs into logical containers in the VM Management Panel. The VM container name will be the group that was assigned to the VM as shown in Figure 5-4. Figure 5-4 22 Create, Organize, and Edit Virtual Machines Each blue square in Figure 5-4 identifies a VM container displayed alphabetically by group name. The top blue square which does not have a name contains two untagged VMs, while the next three containers each have a single VM in the “Servers,” “Syslog,” and “Workstations” groups. A container group in the VM Management Panel can be collapsed by clicking on the top of the blue bar to the left of the VM card and expanded by clicking on the blue square to the left of the group name. See Figure 5-5 for an example. Figure 5-5 In Figure 5-5, the non-grouped VM container has been expanded, while the grouped VM containers are still minimized. Tags and groups are also used to populate the VM Management Panel search box. Select the dropdown from the search box to search by any currently assigned groups. Find more information on adding or removing tags see the Add or Remove Tags subsection. For more information on using tags to search for VMs see the next section. Search for Virtual Machines As more VMs are added to the cluster, it becomes necessary to quickly find a single VM. This can be accomplished using the search bar located to the top right of the VM Management Panel. When clicked, the search bar will offer a dropdown menu with default searches for RUNNING and SHUTOFF VMs as well as all of the existing groups on the cluster, all in alphabetical order. 
See Figure 5-6 for an example. Figure 5-6 Individual VMs can also be searched by name. However, individual VM searches must match the name exactly, both case and spelling. 23 Create, Organize, and Edit Virtual Machines Create a Virtual Machine Creating a VM is quick and easy on a HyperCore cluster. If you are creating a Windows VM, Scale recommends reviewing the HyperCore 6.1: Performance Driver Installation for Windows application note for using the new Scale Tools performance drivers and installer with Windows. 1. Select the Create VM icon from the bar on the VM Management Panel. 2. A popup dialog titled Create VM will appear as shown in Figure 5-7. Figure 5-7 3. Fill out the Create VM screen from top to bottom. Some fields are optional. NOTE: The bar to the left of the highlighted box will remain orange until you complete a valid entry. Valid entries will change the orange bar to gray. A gray bar by default indicates an optional or default field that may or may not need to be edited for your particular VM. a. Assign a Name for the VM. The VM name must be unique. b. Enter a Description (optional) of the VM. c. Add Tags (optional) to the VM. See the Tags section for information on how to use this feature. d. By default the OS is set to Windows but can be changed to Other if necessary. e. By default the Drivers are set to Performance (the recommended cluster VIRTIO drivers) but can be set to Compatible if necessary. f. Set the number of CPUs to be assigned to the VM; CPUs are available on a per-node basis. g. Set the amount of RAM to be assigned to the VM; RAM is available on a per-node basis. h. Set up an associated virtual Drive for the VM; up to four drives can be included at the time of VM creation and capacity is available on a cluster basis. i. Select a previously uploaded ISO file to Boot From, or select Network to boot 24 Create, Organize, and Edit Virtual Machines from the local network. 4. Click Create to finalize the settings and create the VM. Edit a Virtual Machine Once a VM is created, it has various properties that can be expanded, edited, and/or deleted. Many of these items will be dependent on the state of the VM, whether it is powered on or powered off. When editing a VM keep the current maximums in mind. These maximums can change with subsequent HyperCore software releases: •A VM not utilizing performance drivers can support 4 IDE devices •A VM utilizing performance drivers can support 28 virtual devices total, meaning the combined virtual drives and virtual NICs •A VM utilizing performance drivers is limited to 8 virtual NICs •A VM utilizing performance drivers is limited to 26 virtual drives •For example, a maximum of 26 virtual drives and 2 virtual NICs, a total of 28 virtual devices, can be created on a VM utilizing performance drivers Change a VM Name A VM name can be changed at any time, whether the VM is powered on or powered off. 1. Select the Edit VM icon from the Summary screen of the VM card. 2. A popup dialog will appear. 3. Change the Name box for the VM. A VM name must be unique. 4. Click Update to save and apply the change. Change a VM Description A VM description can be changed at any time, whether the VM is powered on or powered off. 1. Select the Edit VM icon from the Summary screen of the VM card. 2. A popup dialog will appear. 3. Change the Description box for the VM. 4. Click Update to save and apply the change. Add or Remove Tags VM tags can be changed at any time, whether the VM is powered on or powered off. 1. 
Select the Edit VM icon from the Summary screen of the VM card. 2. A popup dialog will appear. Add 3. Add tags by typing them into the Tags box for the VM or selecting previous tags from the dropdown menu. 4. Click Update to save and apply the change. Remove 3. Remove tags by clicking the “x” next to their colored box, as marked in Figure 5-8. 25 Create, Organize, and Edit Virtual Machines Figure 5-8 4.Click Update to save and apply the change. Edit CPU Allocation The allocated amount of CPU to a VM can only be edited while the VM is powered off. 1. Select the Edit VM icon from the Summary screen of the VM card. 2. A popup dialog will appear. 3. Click the CPUs field and select the amount of CPU to allocate from the dropdown menu. 4. Click Update to save and apply the change. Edit RAM Allocation The allocated amount of RAM to a VM can only be edited while the VM is powered off. 1. Select the Edit VM icon from the Summary screen of the VM card. 2. A popup dialog will appear. 3. Click the Memory field and enter the amount of RAM to allocate. The format of allocation can also be selected from a dropdown menu. 4. Click Update to save and apply the change. Mount or Unmount ISO Files ISO files can be mounted or unmounted at any time, whether the VM is powered on or powered off. Precautions should be taken to ensure the ISO is not in use on the VM before unmounting the file. 1. Open the Devices screen of the VM card. 2. Identify the ISO file to remove in one of the two CD trays. Unmount 3. Click eject next to the ISO file to finish unmounting the ISO file as shown in Figure 5-9. Figure 5-9 Mount 3.Click insert next to the ISO file you would like to mount as shown in Figure 5-10. Figure 5-10 4. A popup dialog will appear. 5. Select the ISO file to mount from the dropdown menu. 6. Click Insert to confirm the choice and finish mounting the ISO file. 26 Create, Organize, and Edit Virtual Machines Edit, Add or Remove Drives Virtual drives can only be edited, added, or removed when the VM is powered off. Removing a virtual drive is permanent; all data on that virtual drive will be erased. 1. Open the Devices screen of the VM card. Add 2. Click the Add Drive icon. 3. A popup dialog will appear. 4. Assign the drive Capacity and Type. 5. Click Create to save and create the new drive. Edit 2.Click edit next to the virtual drive you would like to modify. 3. A popup dialog will appear. 4. Click the Type field and select the drive type to use from the dropdown menu. 5.Click Update to save and apply the change. Delete 2.Click edit next to the virtual drive you would like to modify. 3. A popup dialog will appear. 4.Click Delete. 5. A confirmation dialog will appear over the popup. 6.Click Delete Device to confirm the deletion and remove the drive. Edit, Add or Remove NICs Virtual NICs can only be edited, added, or removed when the VM is powered off. Removing a virtual NIC is permanent; all connections utilizing that virtual NIC and virtual NIC VLAN will no longer be able to communicate through that network. 1. Open the Devices screen of the VM card. Add 2. Click the Add NIC icon. 3. A popup dialog will appear. 4. Assign the NIC VLAN (leave blank for default), Type, and MAC Address (leave blank to generate a new MAC). 5. Click Create to save and create the new NIC. Edit 2.Click edit next to the virtual NIC you would like to modify. 3. A popup dialog will appear. 4. Edit the MAC Address, VLAN or Type of NIC fields. 5.Click Update to save and apply the change. 
Delete 2.Click edit next to the virtual NIC you would like to modify. 3. A popup dialog will appear. 4.Click Delete. 5. A confirmation dialog will appear over the popup. 6.Click Delete Device to confirm the deletion and remove the NIC. 27 Create, Organize, and Edit Virtual Machines Connect or Disconnect NICs Virtual NICs can be connected or disconnected at any time, whether the VM is powered on or powered off. Precautions should be taken to ensure a NIC is not disconnected while it is in use on the VM. 1. Open the Devices screen of the VM card. 2. Identify the NIC to connect or disconnect using the virtual NIC icon. Disconnect 3. Click disconnect next to the NIC as shown in Figure 5-11. Figure 5-11 Connect 4. Click connect next to the NIC as shown in Figure 5-12. Figure 5-12 28 6 Export or Import a Virtual Machine Backups made easy It is easy to import or export HyperCore VM files using a local SMB share. The following topics are included in this chapter: •Prerequisites •Export a Virtual Machine •Import a Virtual Machine Review Appendix A and Appendix B if you are unfamiliar with the VM icons or cues mentioned in this chapter. Prerequisites You need to confirm you have the following items in place before exporting or importing a VM: •A server with a SMB share configured •Credentials with access to the server with the SMB share; a local account will work if the server is not joined to an Active Directory domain •Cluster network access to the server with the SMB share for the VM export feature •A previously exported HyperCore VM folder in the SMB share on the server with cluster network access for the VM import feature NOTE: Currently, the HyperCore cluster software is only able to import previously exported HyperCore VM files. Export a Virtual Machine Exporting a HyperCore VM to a SMB share is quick and easy. 1. Open the Snapshots and Replication screen of the VM card. 2. Click the Export VM icon. 3. A popup dialog will appear. 4. Enter the DNS name or IP of a local server with a SMB share configured in the SMB Server field. 5. Enter the Username for an account with access to the server. 6. Enter the Password for the account in Step 5. 7. Enter the server domain in the Domain field. The local server name will work if the server is not joined to an Active Directory domain. 8. Enter the name of the SMB share on the server in the Path field. Exported HyperCore VMs will create a folder in the SMB share using the VM name. See Figure 6-1 for an example of completed exportation information. 29 Export or Import a Virtual Machine Figure 6-1 9. Click Export to confirm and start the export process. Import a Virtual Machine Importing a previously exported HyperCore VM to create a new HyperCore VM is quick and easy. 1. Click the Import VM icon in the VM Management Panel. 2. A popup dialog will appear. 3. Enter a unique name for the VM in the New VM Name field. 4. Enter the DNS name or IP of a local server with a SMB share configured in the SMB Server field. This share will need to contain a previously exported HyperCore VM folder for import. 5. Enter a Username for an account with access to the server. 6. Enter the Password for the account in Step 5. 7. Enter the server domain in the Domain field. The local server name will work if the server is not joined to an Active Directory domain. 8. Enter the path to the previously exported HyperCore VM folder stored in the SMB share on the server in the Path field. Exported HyperCore VMs create a folder using the VM name by default. 
See Figure 6-2 for an example of completed VM import information. 30 Export or Import a Virtual Machine Figure 6-2 9. Click Import to confirm and start the import process. 31 7 Manage Virtual Machines Daily tasks and recommendations HyperCore does all the work to make managing your VMs simple, quick, and effortless. The following topics are included in this chapter: •Move a Virtual Machine •Clone a Virtual Machine •Snapshot a Virtual Machine •Restore a Virtual Machine from a Snapshot •Delete a Virtual Machine Snapshot •Delete a Virtual Machine •Best Practices for Virtual Machine Management Review Appendix A and Appendix B if you are unfamiliar with the VM icons or cues mentioned in this chapter. Move a Virtual Machine The HyperCore cluster is designed to always assign VMs to the node with the most free resources when the VM is powered on. You may wish to organize VMs after they are powered on for performance or testing, however. This can be done by live migrating the VM between nodes. There is no downtime and little to no disruption to VM operation when it is live migrated between cluster nodes. VMs can only be live migrated when they are powered on. 1. Open the Summary screen on the VM card. 2. Click the Move VM icon. 3. The VM in the Cluster Display Panel will turn blue, as seen in Figure 7-1. Figure 7-1 4. Click the destination node from the nodes in the Cluster Display Panel while the VM is still blue. 5. The VM will be live migrated to the destination node. Clone a Virtual Machine A VM can be cloned at any time, whether the VM is powered on or powered off. The clone feature on a HyperCore cluster will create an identical VM to the parent but with its own unique name and description. The clone VM will be completely independent from the parent VM 32 Manage Virtual Machines once it is created. For example, the parent VM of the clone could be deleted from the HyperCore cluster and the clone VM will not be affected in any way. 1. Open the Summary or Snapshots and Replication screen on the VM card. 2. Click the Clone VM icon. 3. A popup dialog will appear. 4. The HyperCore cluster will automatically populate the Name and Description fields for you, attaching a “-clone” to the VM name to ensure it is unique. 5. Edit the Name or Description field as needed. The Name field must contain a unique name for the new clone VM. 6. Click Clone to save the changes and create the clone VM. Snapshot a Virtual Machine A VM can have a snapshot taken at any time, whether the VM is powered on or powered off. A snapshot of a VM can be used to create a “restore point” of the VM from the moment in time that the snapshot was taken. Snapshots are meant to be used for local VM backup purposes on the HyperCore cluster. Snapshots cannot be moved from the HyperCore cluster as they are tied to specific VMs and their databases on the cluster. Currently, up to 5,000 snapshots can be stored for a VM. The realistic limitation on snapshot longevity is the cluster’s capacity to store the snapshot changes since VM snapshots are allocateon-write, meaning they grow with each subsequent change made on the VM from the time the snapshot is taken. The Diff Size shown in the Snapshots popup dialog does not indicate the size of the specific snapshot; the size only indicates the block changes at the cluster level that have taken place for the VM since the last snapshot has been taken. 1. Open the Summary or Snapshots and Replication screen on the VM card. 2. Click the Snapshot VM icon. 3. A popup dialog will appear. 4. 
Enter a name and/or description for the snapshot in the Label field. 5. Click Snapshot to save and create the snapshot. Restore a Virtual Machine from a Snapshot The snapshot restore feature works in the same fashion as the cloning feature and will create a VM identical to the parent VM at the time the snapshot was created, but the VM must have its own unique name. The VM restored from the snapshot will be completely independent from the parent VM once it is created. 1. Open the Snapshots and Replication screen on the VM card. 2. The Snapshots section of the screen will display the date and time of the Latest snapshot, if any, as well as the total snapshot Count, if any. 3. Click the Snapshots icon, the Latest snapshot date and time, or the snapshot Count. 4. A popup dialog will appear. 5. All VM snapshots will be recorded in this list by date, newest on top to oldest at the bottom, with their assigned label. Identify which snapshot you would like to restore the 33 Manage Virtual Machines VM from. 6. Click the Clone VM icon to the right of the snapshot you would like to restore. 7. A Clone VM popup dialog will appear over the Snapshots dialog. 8. The cluster will automatically populate the Name and Description fields for you, attaching a “-clone” to the VM name to ensure it is unique. 9. Edit the Name or Description field as needed. The Name field must contain a unique name for the newly restored VM. 10. Click Clone to save the changes and create the new VM from the snapshot image. Delete a Virtual Machine Snapshot As snapshots grow or become outdated it may become necessary to delete them. 1. Open the Snapshots and Replication screen on the VM card. 2. The Snapshots section of the screen will display the date and time of the Latest snapshot, if any, as well as the total snapshot Count, if any. 3. Click the Snapshots icon, the Latest snapshot date and time, or the snapshot Count. 4. A popup dialog will appear. 5. All VM snapshots will be recorded in this list by date , newest on top to oldest at the bottom, with their assigned label. Identify which snapshot you would like to restore the VM from. 6. Click the Delete VM icon to the right of the snapshot you would like to delete. 7. A confirmation dialog will appear over the Snapshots dialog. 8. Click Delete Snapshot to confirm the deletion and remove the snapshot from the VM. Delete a Virtual Machine VMs can only be deleted when the VM is powered off. 1. Open the Summary screen on the VM card. 2. Click the Delete VM icon. 3. A confirmation dialog will appear. 4. The VM, all associated data, and all associated snapshots will be deleted 5. Click Delete VM to confirm the deletion and remove the VM permanently. WARNING Any snapshots that are associated with the VM when it is deleted will be permanently removed as well. Ensure you have proper backups of the VM if necessary. Any VMs that have been cloned from the VM being deleted or any VMs that have been restored from the VM snapshots of the VM being deleted will not be affected. Best Practices for Virtual Machine Management Scale Computing recommends following the below guidelines and recommendations to ensure the stability and redundancy of the HyperCore cluster at all times: 34 Manage Virtual Machines •Leave a node’s worth of RAM available at any point in time to allow room for automatic failover in the event a node becomes unavailable; for example, roughly no more than 75% of RAM should be in use when viewing the HUD in a four node HC3 cluster. •Isolate CPU intensive VMs to separate nodes. 
Combining multiple CPU intensive VMs on a single node can lead to VM performance issues while running on that node. •The fresh installation of a new and manufacturer supported OS on a VM is recommended. This offers the most stable guest VM, particularly if a migration would include an OS no longer supported by the manufacturer or old applications not designed for virtualization, both of which can be top causes for OS BSODs or reboot issues on VMs. •Ensure the licensing you have available for the OS you will be installing is approved for virtualization as well as failover of the guest VM between the nodes. 35 8 HyperCore Replication Disaster recovery made easy Replication is offered in a one-to-one fashion between HyperCore clusters and is handled at the VM level so you can easily pick and choose VMs to replicate for disaster recovery purposes. One-to-one accounts not only for Source to Target replication, but also “cross replication,” allowing the Target to replicate back to the Source cluster as well. Replication is a near-continuous process that uses VM snapshots to estimate and send changes to the target cluster. Any changes that have not been sent by the time a new snapshot is scheduled to be taken will be completed before the replication process proceeds with the new snapshot. As of the release of HyperCore 6.2, replication now uses network compression by default. It is also possible to exclude VSDs (virtual harddrives) on a VM from the replication process as of the 6.2 release, although this is a feature that is still in testing and must currently be done by engaging Support and discussing the options and considerations required. This is convenient for transient or temporary data that would otherwise generate large changes for the replication job, such as Windows pagefiles, SQL temporary databases, log data, print queues, etc. Contact Scale Support at [email protected] for assistance in excluding a VSD. For more information on how Replication is implemented between HyperCore clusters--including more details on replication configuration, failover, failback, and DR testing--Scale recommends reviewing the HC3 Native Replication Feature Note available in the Knowledge base in the Scale Portal. The following topics are included in this chapter: •Prerequisites •Add or Remove a Remote Cluster Connection •Replicate a Virtual Machine •Pause or Resume Replication of a Virtual Machine •Restore from a Replicated Virtual Machine •Cancel Virtual Machine Replication Review Appendix A and Appendix B if you are unfamiliar with the VM icons, cues, or cards mentioned in this chapter. Prerequisites You need to confirm you have the following items in place before you can enable replication: •A local (Source) and remote (Target) HyperCore cluster that are accessible to each other over the LAN network •The LAN IP for one of the nodes in the Target cluster •The UI login credentials for the Target cluster •Port 10022 open between the Source and Target clusters; this is the default replication 36 HyperCore Replication port used by the Scale cluster Add or Remove a Remote Cluster Connection A Remote Cluster Connection must be set up with the Target cluster before you can initiate replication on the Source cluster VMs. WARNING All replicating VMs must have their replication paused (or replicating VMs must be removed from replication completely) on the Source cluster before removing the Target cluster’s Remote Cluster Connection. 
Follow the Cancel Virtual Machine Replication section for help with these processes. Add 1. Open the Control Center Panel or the Snapshots and Replication screen on the VM card on what will be the Source cluster. 2. Click the Remote Clusters tab in the Control Center Panel or click Setup Replication next to the Replication icon in the Snapshots and Replication screen. 3. Click + Add Cluster if you are working from the Remote Clusters tab. 4. A popup dialog will appear. 5. Enter the LAN IP of a node from the Target cluster in the Node field. 6. Enter the Username and Password for the UI credentials to the Target cluster. 7. Click Create to save the settings and initiate the replication connection. Remove 1. Follow the Cancel Virtual Machine Replication section for all VMs currently being replicated before removing a Remote Cluster Connection. 2. Open the Control Center Panel on the Source cluster. 3. Click the Remote Clusters tab. 4.Click Remove under the Actions column. 5. A confirmation dialog will appear. 6.Click Delete Connection to confirm the deletion and remove the Remote Cluster Connection. Replicate a Virtual Machine You can configure individual VMs on the HyperCore Source cluster to replicate to a connected Target cluster. 1. Open the Snapshots and Replication screen on the VM card. 2. Click Setup Replication next to the Replication icon. 3. A popup dialog will appear. 4. Select the connected remote cluster from the Target dropdown list. 5. Click Replicate to confirm and initiate the VM replication. Once initiated, replication uses the snapshot feature for the VM to estimate the rate of change and send new data appropriately to the Target cluster. Replication is a near-continuous process. You can monitor replication status in the Snapshots and Replication screen on the VM card on the Source cluster. Replication can be idle, active, or paused, as shown in Figure 37 HyperCore Replication 8-1, Figure 8-2, and Figure 8-3. Figure 8-1 Figure 8-2 Figure 8-3 Once replication is configured for a VM and the initial job is complete, you will see a VM replication card on the Target cluster, as seen in Figure 8-4 below. This VM card contains the replicated images of the Source VM stored in snapshot copies. Figure 8-4 Pause or Resume Replication of a Virtual Machine You can pause or resume individual VM replication once the replication job has been initiated for the VM. Pausing a VM during replication will halt replication, even jobs in the process of sending blocks, and stop all automated snapshots. Resuming replication will continue where the job was halted, even picking up in the middle of sending blocks (assuming that the replicated VM card was not deleted), and will start automated snapshots again. 1. Open the Snapshots and Replication screen on the VM card on the Source cluster. Pause 2. Click the Pause Replication icon. 3. VM replication will pause. Resume 2. Click the Resume Replication icon. 3. VM replication will resume. Restore from a Replicated Virtual Machine You can easily and quickly restore VMs from replicated VM cards on the Target cluster. 38 HyperCore Replication BEST PRACTICE TIP When performing a Disaster Recovery (DR) test, disconnect the NIC of the VM restored from the replicated virtual machine snapshot to prevent IP conflicts with the primary VM, or isolate the restored VM’s traffic with a VLAN. 1. Open the UI for the Target cluster. 2. Locate the replicated VM card you would like to restore from. 3. 
Click the Clone VM icon on the VM card to restore from the latest snapshot or click the VM Replication icon to see all possible snapshots available to use for the restoration and click Clone VM next to the appropriate snapshot. 4. A new VM will be cloned from the snapshot image. Cancel Virtual Machine Replication Individual VM replication can be paused indefinitely to be easily resumed at a later date as needed. Review the Note and/or contact Scale Support for details on permanently removing a VM from replication at this time. 1. Open the Snapshots and Replication screen on the VM card on the Source cluster. 2. Click the Pause Replication icon. 3. Replication will pause for the VM. 4. Open the UI of the Target cluster. 5. Click the Delete VM icon on the replicated VM card for the VM you wish to remove from replication. 6. The VM will cease replication to the Target cluster as long as replication is in the paused state. NOTE: The space consumed on the Target cluster by the VM replication will be released when the VM replication card is deleted. However, the Source cluster VM must remain in the paused state for replication indefinitely. As soon as the replication is resumed the VM will begin a new replication with the Target cluster. This is ideal for most environments that need to only temporarily free Target cluster capacity. It is possible to permanently remove a VM from replication if needed by cloning the Source cluster VM, deleting the original VM in the replication job, and powering on the clone VM to use as the primary VM going forward. This does require the short window of downtime to transition to the cloned VM. 39 9 System Monitoring and Maintenance Care and keeping Simplified management and maintenance is a key feature of HC3 and HyperCore. The following topics are included in this chapter: •Alerts and Log •Tasks •Shutdown the Cluster •Power On the Cluster •Update the Cluster •Remote Support •Drive Failure •Node Failure •Capacity and Resource Management Review Appendix A and Appendix B if you are unfamiliar with the VM icons or cues mentioned in this chapter. Alerts and Log HC3 is designed to intelligently monitor and report cluster health in real time. The cluster utilizes the intelligent HC3 monitoring service in order to assess and resolve hundreds of possible cluster scenarios. As assessments are made and cleared in real time, they can be viewed as conditions on the cluster. Conditions Active conditions are set on the cluster and displayed in the UI under the Conditions tab in the Control Center Panel. Conditions are also translated into cluster alerts and the “items” in the ITEMS display in the HUD. HC3 utilizes a simple tiered system to indicate the impact of a condition on the cluster, sent to you as an alert. From least severe to most severe the conditions are Informational, Notice, Warning, and Critical. Informational conditions are for your information only and require no action on the system. They are color coded green and abbreviated to INFO or INFORMATION. Notice conditions are generated dynamically when minor concerns are detected by the monitoring service on the cluster. Minor conditions should not impact VM availability or cluster stability but should be addressed as indicated by the alert instructions. They are color coded yellow and listed as NOTICE. Warning conditions are generated dynamically when major concerns are detected by the monitoring service on the cluster. 
Major conditions have the potential to impact VM availability, cluster stability, or cluster redundancy and should be addressed as soon as possible by following the alert instructions. They are color coded yellow and listed as WARNING.
Critical conditions are generated dynamically when critical concerns are detected by the monitoring service on the cluster. Critical conditions impact VM availability, cluster stability, or cluster redundancy and should be addressed immediately by following the alert instructions. Although most critical conditions can be resolved by the system, it is always recommended to contact Scale Support for a cluster health check if a critical condition has been reported. They are color coded orange and listed as CRITICAL.
When first logging into the UI, any active conditions will be listed in the lower right corner of the UI. You can dismiss these items individually using the orange "x" or by hovering over the list and selecting hide all messages, as seen in Figure 9-1.
Figure 9-1
Alerts
HyperCore translates cluster conditions detected by the monitoring service into user readable alerts. Alerts are reported by the monitoring service in real time and use the configured System tab settings to send alerts to the selected alert recipient email addresses. Alerts will explain any actions that need to be taken to resolve the set condition, if any.
ITEMS
The ITEMS display in the HUD is a counter used to show active conditions on the cluster that should be addressed. In the HUD, the color of the bar around the ITEMS display indicates the highest severity level of the current condition(s) and can be green, yellow, or orange to correspond with the condition and alert color coding. All conditions reported under the ITEMS display are set in real time. You can find details on the conditions by clicking the ITEMS display in the HUD to open the Conditions tab under the Control Center Panel, or by opening the Conditions tab directly.
Cluster Log
The Cluster Log tab is a comprehensive history of cluster events and conditions. This includes items like VMs being started or stopped, VMs migrated between nodes, remote cluster sessions for replication, cluster redundancy events, set and cleared conditions, and many other items. Each of these items is recorded with a SEVERITY level using the condition definitions, as well as the additional ALERT and ERROR levels. Alerts signify alerts that have been sent by the cluster, and errors are tasks that were unable to be completed on the system. Similar to the VM Management Panel, the log can be filtered using existing or predesignated search tags. Select a specific tag from the filter dropdown menu at the top right of the Cluster Log screen, as seen in Figure 9-2.
Figure 9-2
Tasks
When a change is applied to a VM or the cluster, you will see an informational alert appear in the lower right corner of the UI. These alerts can take different forms depending on the progress or status of the change being applied.
QUEUED, seen in Figure 9-3, is the initial state of a requested change to the cluster or VM settings. This shows that the request is being processed for inclusion in the database on the system. This state may not always appear; if the database is not currently busy with another task, the request will proceed straight to the RUNNING state.
Figure 9-3
RUNNING, seen in Figure 9-4, is the second state of a requested change to the cluster or VM settings and includes a progress bar for the status. This indicates that the change is being actively applied to the system. While the task is running, the database will be locked from any other changes.
Figure 9-4
COMPLETED, seen in Figure 9-5, is the final state of a requested change. The change is complete and has been successfully applied.
Figure 9-5
ERROR, seen in Figure 9-6, most commonly indicates that the database was locked for another change at the time a new one was requested. These errors can most often be ignored, as the system will simply wait for the current task to complete and then proceed to the task that initially generated the error. Ensure you see the COMPLETED alert appear for the request once the currently running task is complete.
Figure 9-6
Shutdown the Cluster
The system should always be shut down from the UI for any planned network or infrastructure maintenance that may impact power or network availability to the nodes. Always contact Scale Support to review your individual scenario if you would like to shut down individual nodes. Individual nodes should never be powered down manually unless instructed or approved to do so by Scale Support.
1. Open the Control tab in the Control Center Panel.
2. Click Shutdown Cluster.
3. A confirmation dialog will appear, as shown in Figure 9-7.
Figure 9-7
4. Select Yes or No from the Force dropdown menu. Selecting Yes for the Force option will forcibly power off any powered on VMs during the shutdown process.
5. Click Shutdown Cluster to confirm the choice and shut down the cluster.
Power on the Cluster
All cluster nodes should be powered on as closely together as possible. Any significant delay in powering on one or more nodes will trigger a recovery scan on the system and prolong the normal startup checks. The cluster will go through a 20-30 minute startup check process. Any conditions triggered during the startup process can be ignored. Only remaining conditions that have not cleared after the initial wait period should be addressed.
Update the Cluster
HyperCore software updates are an important part of regular system maintenance. Almost all HyperCore updates are rolling, meaning that they are non-disruptive to any running VMs on the system as long as there is sufficient RAM to live migrate the VMs between nodes while a node is taken into maintenance mode and updated. When an update is available it will appear in the Update tab in the Control Center Panel. It will also appear as a flag above the current HyperCore version in the HUD.
1. Open the Update tab in the Control Center Panel.
2. The update version and release date will be displayed, as shown in Figure 9-8.
Figure 9-8
3. Click Apply Update.
4. A confirmation dialog will appear.
5. Click Update to confirm and begin the HyperCore update.
The update process will begin with the first cluster node and work through to the last cluster node. Any VMs running on the node being updated will be live migrated to the nodes with the free resources to support the VMs. If there is insufficient RAM to live migrate the VMs, the system will display an alert and abort the update. Sufficient RAM must be made available to support a single node's downtime, and then the update can be started again.
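Whether a rolling update can proceed is essentially an arithmetic check: each node's running VM memory must fit in the free RAM of the remaining nodes. The sketch below illustrates that check only; it is not an HC3 tool or API, and the node names and GB figures are assumptions used for the example.

    # Illustrative sizing check only -- not an HC3 tool or API.
    # The node names and GB figures below are assumptions for the example.
    node_total_ram_gb = {"node1": 64, "node2": 64, "node3": 64, "node4": 64}
    node_vm_ram_gb    = {"node1": 40, "node2": 35, "node3": 48, "node4": 30}

    def can_update_rolling(total, used):
        """True if every node's running VM RAM fits in the free RAM of the
        other nodes, which is what a rolling update needs while one node at a
        time is in maintenance mode."""
        for node in total:
            free_elsewhere = sum(total[n] - used[n] for n in total if n != node)
            if used[node] > free_elsewhere:
                return False
        return True

    print(can_update_rolling(node_total_ram_gb, node_vm_ram_gb))  # True for these figures

If the check fails for any node, RAM must be freed before the update is applied, as described above.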
Once the update on a node is complete, the original VMs will be moved back to the node and the update process will proceed to the next node down the line until all nodes have been updated. Review the specific firmware release notes for details on an update's impact to the system.
Remote Support
HC3 clusters are configured with a remote access feature to allow Scale Support to interact with the cluster for customer assistance as if the technician were right in the room with you. This tunnel can only be opened outbound from the cluster and through enabled ports on the customer network. The connection utilizes various security procedures on the cluster and server side to ensure all commands passed between Scale Support and the cluster are completely private and secure. Each support tunnel code enables a specific port on the Remote Support server to create the secure connection. This means that each tunnel code, once entered on a cluster node, is unique to that node while the tunnel is open and cannot be used on any other cluster node in any HC3 system.
1. Open the Remote Support tab in the Control Center Panel.
2. Each node in the cluster will be listed along with any open tunnels, as shown in Figure 9-9.
Figure 9-9
Open a Tunnel
3. Identify a node that does not currently have a tunnel open.
4. Enter the number given by Scale Support in the CODE field next to the node you would like to open the tunnel on. Unless otherwise instructed by Scale Support, it should not matter which node the tunnel is opened on.
5. Click Open Connection next to that node.
6. The tunnel is opened.
Close a Tunnel
3. Identify the node with the open support tunnel you want to close.
4. Click Close Connection next to that node.
5. The tunnel is closed.
Drive Failure
As part of the intelligent monitoring of HC3, HyperCore is designed to scan and assess hard drive status reports at a regular interval on each node. By monitoring the drives, HyperCore can proactively remove a drive before it becomes problematic in the system. This ensures failing drives are removed quickly and cleanly from the cluster. See the Drive Failure subsection of the Failure Scenarios section for more information on how HyperCore handles drive failures.
Once a drive has failed and been removed from the cluster, there is no critical need to replace the drive. The system is completely stable and fully redundant with the drive removed; the cluster is simply short the capacity and I/O of that drive.
Follow the steps below to replace a drive on HC3. All HC3 drives are hot swappable and require no downtime to replace.
1. Contact Scale Support for a replacement drive when you are alerted that a drive has failed and has been removed on the cluster.
2. Identify the drive to replace from the DISK view of the Cluster Display Panel when you receive the replacement drive. The failed drive shown on the node in the UI corresponds to the physical slot on that same node.
3. Verify the node IP before replacing the drive if your nodes are not already labeled by IP.
4. Remove the failed drive from the node slot.
5. Wait 20 seconds for the bus to recognize that the drive has been removed.
6. Insert the new drive firmly.
7. HyperCore will recognize the new drive and incorporate it into the system. The drive will be immediately available for reads and writes. There is no rebuild process necessary.
8. Ship the failed drive back to Scale Computing using the included return label.
Node Failure
HC3 is designed to continue running through a single node failure. Whether the node lost power or the backplane network link was disrupted, the system will continue running through the loss and, if necessary, will automatically recover any VMs running on the node at the time of the loss.
If VMs were running on the node at the time the node became unavailable, they will become unavailable along with the node. However, HyperCore will automatically detect the loss of the VMs and will intelligently restart them on the remaining nodes if there are sufficient resources to do so. When access to the failed node is restored, the VMs that were previously running on the node will not be moved back. Instead, all VMs will remain running where they were placed to ensure the node's stability. You should always contact Scale Support to determine the cause of a node's failure if you are unsure of the reason. Once the node's health is verified, you can live migrate VMs onto the node as needed and proceed with normal operation. See the Node Failure subsection of the Failure Scenarios section for more information on how HyperCore handles node failures.
Follow the steps below to restore access to a node in the most common node failure scenarios.
Power Loss
1. Determine if the node is powered on or powered off.
2. If the node is powered off, power it back on.
a. If you are aware of the cause of the power loss, proceed with normal cluster operation.
b. If you are not aware of the cause of the power loss, contact Scale Support for assistance.
Software / Hardware Issue
1. If the node is powered on, hook a crash cart (monitor and keyboard) to the node and determine if it is responsive to keyboard input.
2. If the node is unresponsive to input, contact Scale Support for assistance.
3. If the node is responsive to input, review the network connections.
Networking Issue
1. Ensure all network cables for the node are firmly plugged in.
2. Ensure the network switches are powered on and functioning correctly.
3. Ensure all VLANs and interconnects, where applicable, are functioning correctly.
4. Determine if the node is accessible from the LAN network.
5. If the node is still not reachable from the LAN network, contact Scale Support for assistance.
6. If the node is accessible from the LAN network but not the backplane network, contact Scale Support for assistance.
Capacity and Resource Management
HC3 is designed to be highly available and self-healing. If necessary, HyperCore can recover any VMs that may become unavailable for various reasons once the issue that caused the unavailability is resolved. However, there must be enough capacity and free resources for the system to assess and heal itself.
As with any system, there is a critical threshold for capacity. HC3 will always send email alerts for Informational, Minor, Major, and Critical capacity concerns. An Informational alert is sent at 40% free capacity, Minor at 30%, Major at 20%, and Critical at 10%. At the very least, Scale recommends either freeing capacity or adding an additional node at 20% free capacity to avoid any potential issues. At 10% free capacity and below, regardless of what actual free space that may represent, there is the potential for stability concerns, as with any OS. Avoid any issues and contact Scale Support if you need assistance in identifying and freeing capacity.
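As a worked example of the thresholds above, the short sketch below prints the free-space point at which each capacity alert would be sent for a hypothetical storage pool. It is illustrative only and not an HC3 tool or API; the 30 TB pool size is an assumption for the example.

    # Illustrative only -- not an HC3 tool or API. The thresholds follow the
    # percentages in the text above; the pool size is an assumed example value.
    pool_size_tb = 30.0  # hypothetical usable pool capacity

    capacity_alerts = {
        "Informational": 0.40,  # sent when 40% of the pool is free
        "Minor": 0.30,
        "Major": 0.20,          # Scale recommends acting no later than this point
        "Critical": 0.10,
    }

    for level, free_fraction in capacity_alerts.items():
        print(f"{level}: sent when free space drops to {pool_size_tb * free_fraction:.1f} TB")

For a 30 TB pool, the recommended action point of 20% free corresponds to 6 TB of remaining capacity.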
RAM resources are important on the HC3 cluster. Scale always recommends keeping free RAM resources equal to at least one cluster node. This ensures the system can run all VMs while one node is unavailable for any reason.
NOTE: HC3 VMs use thin provisioning to conserve capacity on the cluster. This means that once space has been allocated on a VM drive, it remains consumed on the cluster, even if that data is later deleted in the VM OS. For example, a 200 GB VM drive has historically reached 150 GB allocated. 100 GB is then deleted on the VM drive so that the OS reports only 50 GB being used. 150 GB will still be allocated and consumed on the HC3 cluster.
Appendix A - Cluster Display Panel
Visual cues and icons
The UI uses many intuitive visual and color cues to easily identify actionable buttons, actionable icons, and VM statuses in the Cluster Display Panel.
Cluster Display Panel
Healthy node
Status: Normally operating node, fully available in the cluster.
Action, Use, Cause: Informational visual cue for node status.
Resolution: Not applicable.
Offline node
Status: Offline and unavailable node that is unable to host VMs.
Action, Use, Cause: Informational visual cue for node status. Can be caused by a variety of factors, including a network link disruption to the backplane network, the node being powered down, or a software service issue.
Resolution: Ensure the node is powered on and verify that there are no network issues on the backplane. Contact Scale Support if the node does not become available in the system again after 15 minutes.
VM selected for live migration
Status: A blue bar on a VM indicates it has been selected to be live migrated to a new node.
Action, Use, Cause: Informational visual cue for VM live migration; the Move VM icon has been selected.
Resolution: Use the Cluster Display Panel to complete the live migration, or deselect the Move VM icon on the VM.
Healthy drives
Status: Normally operating drives, fully available to the system.
Action, Use, Cause: Informational visual cue for drive status and current capacity.
Resolution: Not applicable.
Failing drive
Status: A drive has been marked for removal on the node and is currently having data migrated off.
Action, Use, Cause: Informational visual cue for drive status and current capacity.
Resolution: Call Scale Support to place an order for a new drive.
Failed drive
Status: A drive has been completely removed from the system, and possibly even from the physical slot. Most often the drive is still physically in the slot; rarely, the slot may be marked as "Slot Empty" if the drive is no longer physically in the slot or is not readable by the system.
Action, Use, Cause: Informational visual cue of a drive's complete removal from the system.
Resolution: Call Scale Support to place an order for a new drive.
Appendix B - VM Management Panel
Visual cues and icons
The UI uses many intuitive visual and color cues to easily identify actionable buttons, actionable icons, and VM statuses in the VM Management Panel.
VM Management Panel
Create VM
State: Always available.
Use: Create a new VM on the cluster.
Import VM
State: Always available.
Use: Import a previously exported HyperCore VM from an SMB share to create a new VM on the cluster.
VM powered off
Use: A VM that is currently powered off; some management options will be affected by the VM state.
VM powered on
Use: A VM that is currently powered on; some management options will be affected by the VM state.
VM working on a task
Use: A VM that is currently working on a task and temporarily locked from additional changes.
VM replication card
Use: An informational card stored on a Target cluster for VM replication management and restoration.
Power button of a powered off VM
State: Available when the VM is powered off.
Use: Power the VM on.
Power button of a powered on VM
State: Available when the VM is powered on.
Use: Reboot, Shut Down, or Power Off a running VM, dependent on the VM state.
VM Console
State: Always available.
Use: Open the VM management console in a new browser tab.
VM Summary
State: Always available.
Use: Open the Summary screen for the VM. The Summary screen offers access to the Edit, Move, Delete, Clone, and Snapshot icons.
Edit VM
State: Always available.
Use: Edit the VM's assigned Name, Description, Tags, CPUs, or Memory. The VM must be shut down to edit CPUs or Memory.
Move VM
State: Unavailable while the VM is powered off.
Use: Power on the VM to live migrate the VM between nodes in the cluster.
Move VM
State: Available while the VM is powered on.
Use: Live migrate the VM between nodes in the cluster.
Delete VM
State: Unavailable while the VM is powered on.
Use: Power off the VM to delete the VM from the cluster.
Delete VM
State: Available while the VM is powered down.
Use: Delete the VM from the cluster.
Clone VM
State: Always available.
Use: Create a clone of the VM on the cluster.
Snapshot VM
State: Always available.
Use: Create a snapshot of the VM on the cluster.
VM Devices
State: Always available.
Use: Open the Devices screen for the VM. The Devices screen offers access to manage the existing CD trays, virtual drives, and virtual NICs, as well as the use of the Add Drive, Add NIC, and Modify Boot Order icons.
CD Tray
State: Visual cue for management that identifies the VM CD tray.
Use: Eject or insert an ISO image into the CD tray for access on the guest OS of the VM.
Virtual Drive
State: Visual cue for management that identifies the VM virtual drive.
Use: Edit the virtual drive Type or delete the drive from the VM when the VM is powered off.
Add Virtual Drive
State: Unavailable while the VM is powered on.
Use: Power off the VM to add a virtual drive.
Add Virtual Drive
State: Available while the VM is powered off.
Use: Add a virtual drive to the VM.
Virtual NIC
State: Visual cue for management that identifies the VM virtual NIC.
Use: Connect or disconnect a virtual NIC whether the VM is powered on or off. Edit the virtual NIC MAC Address, VLAN tag, or Type when the VM is powered off.
Add Virtual NIC
State: Unavailable while the VM is powered on.
Use: Power off the VM to add a virtual NIC.
Add Virtual NIC
State: Available while the VM is powered off.
Use: Add a virtual NIC to the VM.
Modify Boot Order
State: Unavailable while the VM is powered on. The VM will use the first drive to boot by default.
Use: Power off the VM to modify the order of the virtual drives on the VM.
Modify Boot Order
State: Available while the VM is powered off. The VM will use the first drive to boot by default.
Use: Modify the order of the virtual drives on the VM.
VM Snapshots and Replication
State: Always available.
Use: Open the Snapshots and Replication screen for the VM. The Snapshots and Replication screen offers access to view and manage VM snapshots and replication.
Snapshots
State: Always available.
Use: View, manage, and clone from the list of VM snapshots.
VM Replication
State: Visual cue for management that identifies the VM replication status.
Use: Set up and monitor replication for the VM.
Pause Replication
State: The Pause Replication icon is absent when there is no replication configured for the VM.
Use: Pause replication for the VM whether the VM is powered on or powered off.
Resume Replication
State: The Resume Replication icon is absent when there is no replication configured for the VM.
Use: Resume replication for the VM whether the VM is powered on or powered off.
Export VM
State: Always available.
Use: Export a VM's .qcow2 and .xml files to an external SMB share.
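An exported VM is written to the SMB share as .qcow2 disk image files plus an .xml definition file. If you ever need to sanity-check an exported disk image off-cluster, a common tool such as qemu-img can read the qcow2 format. The sketch below is an off-cluster convenience only, not part of the HC3 UI; the mount point and file names are hypothetical, and it assumes qemu-img (qemu-utils) is installed on the machine where the share is mounted.

    import subprocess

    # Inspect an exported HC3 disk image from a machine where the SMB export
    # share is mounted. The path below is a hypothetical example location.
    result = subprocess.run(
        ["qemu-img", "info", "/mnt/vm-exports/ExampleVM/ExampleVM_disk1.qcow2"],
        capture_output=True, text=True, check=True,
    )
    print(result.stdout)  # reports the image format, virtual size, and space used on disk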