HP Helion CloudSystem Enterprise and Foundation Software

HP Helion CloudSystem 9.0: Monasca Agent

This white paper describes the Monitoring-as-a-Service (Monasca) agent in CloudSystem.

Overview

Monasca, the HP open source project, is a comprehensive cloud monitoring solution for OpenStack-based clouds. Monasca enables users to understand the operational effectiveness of the services and underlying infrastructure that make up their cloud, and provides actionable details when there is a problem. System status and supporting metrics are constantly monitored, readily available, and trackable, making system management tasks more timely and predictable. Monasca uses node-based agents to report metrics to a centralized collection point, where alarms are triggered.

The Monasca agent is based on Python and consists of several sub-components. The agent supports system metrics, such as CPU utilization and available memory. Monasca also supports StatsD and built-in checks for services such as MySQL and RabbitMQ.

The Monasca agent includes:
• System metrics, for example, CPU and memory utilization
• Integrated StatsD daemon that can be used by applications via a StatsD client library
• Process checks that return several metrics on a process, such as number of instances, memory, I/O, and threads
• Service checks, for example, MySQL and RabbitMQ
• OpenStack metrics
• Automatic checks that detect and set up checks on certain processes and resources

Deployment

The Monasca agent is installed on all of the CloudSystem virtual appliances and KVM compute nodes. Metrics are collected on all servers where the agent is installed. When the Monasca agent monitors ESXi compute nodes, it collects the required information from VMware vCenter using the VMware Infrastructure (VI) API. Once metrics are fetched, the agent pushes these metrics to the Monasca server. The following steps and diagram show the process.

1. Metrics are published to the Monasca API from each agent.
2. The Monasca API publishes the metrics to the Message Queue (Kafka).
3. The Persister consumes metrics from the Message Queue and publishes them to Vertica.
4. The Threshold Engine consumes metrics from the Message Queue and uses them to either evaluate current alarms or create new alarms. If an alarm transitions to a new state because of the incoming metrics, the Threshold Engine updates that state in the MySQL database and publishes an "alarm-state-transitioned-event" to the Message Queue.
5. The Persister consumes alarm-state-transitioned-events from the Message Queue and publishes them to Vertica as alarm state history.
6. The Notification Engine consumes alarm-state-transitioned-events from the Message Queue and sends out a webhook to the monasca-webhook-handler.

Alarm definitions

Alarm Definitions are policies that specify how Alarms are created. By using Alarm Definitions, you do not need to create individual alarms for each system or service. Instead, a small number of Alarm Definitions can be managed. Monasca creates Alarms for systems and services as they appear. An Alarm is created when metrics match the Alarm Definition. An Alarm Definition has an expression for evaluating metrics to determine whether one or more alarms need to be created.

Using the Monasca Agent

The Monasca agent shows the status and usage of resources on the various appliances and compute nodes where monasca-agent is installed.

To view the Monitoring Dashboard:
1. In the Operations Console, from the General menu, select Monitoring Dashboard.
2. Click Launch Monitoring Dashboard.
3. Log in using the user name and password you set for the Operations Console during First-Time Installation.
4. On the overview of the Monitoring Dashboard, the alarms for all the appliances and activated compute nodes are displayed.
5. The alarm states are:
   o OK [Green] - Metrics have been received and the Alarm Definition Expression evaluates to false for the given metrics
   o Alarm [Orange] - Metrics have been received and the Alarm Definition Expression evaluates to true for the given metrics
   o Undetermined [Gray] - No information has been received in (period + 2) time periods
6. To view alarms:
   o In the left navigation, click Alarms to see alarms for all services and appliances. From the Actions menu, to the right of each row, click Graph metrics, Show History, or Show Alarm Definition.
   o Click any service name from the Overview screen to view alarms.
   o Click any server name from the Overview screen to see alarms for a CloudSystem appliance.
7. In the left navigation, click Alarm Definitions to view and edit the types of alarms that are enabled. You can change the name, expression, and other details about the alarm. You might want to raise or lower alarm thresholds if you are receiving too many or not enough alarms. Do not edit the default alarm definitions, because they are used in the Operations Console.
8. Click Dashboard. The Grafana Dashboard opens. From this dashboard, you can view a graphical representation of the health of services, and the CPU and database usage of each CloudSystem appliance.
   o Click the graph title (for example, CPU), and then click Edit.
9. Click Monasca Health. The Monasca Service Dashboard opens. From this dashboard, you can view a graphical representation of the health of the Monasca services.

Compute Nodes

The Monasca agent supports KVM and ESXi compute node monitoring. Hyper-V monitoring is not supported in CloudSystem 9.0.

The Operations Console uses the Monasca agent to display the status and usage data of CPU, memory, and storage of the compute nodes as shown in the figure below.

• KVM Monitoring
   o When a KVM compute node is activated, a CloudSystem RPM installs monasca-agent on the compute node.
   o The Monasca agent collects the metrics from the compute node and sends them to the Monitoring appliance.
   o When the KVM compute node is deactivated, the RPM and monasca-agent are uninstalled.
• ESXi Monitoring
   o A custom plug-in for the Monasca agent exists in each of the Cloud controllers to monitor ESXi compute clusters.
   o When an ESXi cluster is activated, the configuration (vcenter.yaml) of the particular plug-in is modified to monitor the activated cluster.
   o When an ESXi cluster is deactivated, the cluster's details are removed from the plug-in configuration.

Notification Methods: Webhook

Webhook is the notification method used to report state changes for all of the alarms. When an alarm changes state, the Notification Engine on the Monitoring appliance sends a notification to the webhook handler service (monasca-webhook-handler), which converts this notification to the appropriate payload. This payload is processed by other services for display on the Activity Dashboard screen.

To view the Activity Dashboard:
1. In the Operations Console, from the main menu, select Monitoring Dashboard.
2. Click Launch Activity Dashboard. The Monitoring Activity dashboard in Horizon on the Management appliance opens.
3. Log in using the user name and password you set for the Operations Console during First-Time Installation.

Learn more about HP Helion CloudSystem:
http://www.hp.com/go/CloudSystem
http://www.hp.com/go/CloudSystem/docs

© Copyright 2014-2015 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein.

5900-0103, December 2015
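As a supplementary illustration, the three alarm states listed under Using the Monasca Agent (OK, Alarm, Undetermined) can be sketched in Python. This is not Monasca's implementation. The function name alarm_state, its parameters, and the reading of "(period + 2) time periods" as a count of elapsed periods without data are all assumptions made for this example.

```python
from enum import Enum
from typing import Optional


class AlarmState(Enum):
    OK = "OK"                      # shown green in the Monitoring Dashboard
    ALARM = "ALARM"                # shown orange
    UNDETERMINED = "UNDETERMINED"  # shown gray


def alarm_state(expression_result: Optional[bool],
                periods_without_data: int,
                period: int = 1) -> AlarmState:
    """Hypothetical helper, not part of Monasca.

    expression_result: outcome of the Alarm Definition expression for the
        latest metrics, or None if no metrics have been received at all.
    periods_without_data: number of time periods since the last metric.
    period: the "period" value from the text; its exact meaning is assumed.
    """
    # Assumed reading of "(period + 2) time periods": once that many
    # periods pass with no data, the state can no longer be determined.
    if expression_result is None or periods_without_data >= period + 2:
        return AlarmState.UNDETERMINED
    # Metrics were received: the expression result decides the state.
    return AlarmState.ALARM if expression_result else AlarmState.OK


if __name__ == "__main__":
    # Hypothetical threshold comparison standing in for an expression.
    cpu_percent = 95.0
    print(alarm_state(cpu_percent > 90.0, periods_without_data=0))
```

In this sketch a missing result or a long gap in data takes priority over the last expression result, which mirrors the text's distinction between states that require received metrics and the Undetermined state.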