White Paper

Get to the Root of Your Business Service Quality Issues

Business Service Quality Matters

Business services are the lifeblood of any modern organization. They provide business insights, empower employees, and enhance customer satisfaction. Agile delivery of high-quality business services creates competitive advantage and drives profitability. On the other hand, poorly performing business services result in radically reduced productivity, declining market position, financial loss and ultimately organizational failure.

To illustrate the critical importance of business service quality, consider a real-life example – an airline. The typical air carrier relies on dozens of business services – in fact, one leading international airline identified 230 such services, of which 60 were mission-critical. These services include seat inventory, yield management, departure control, cargo logistics, and many other industry-specific business systems. In addition, airlines rely on services that are found across industries, such as payroll, accounting and email.

The failure of just one of these services can have a devastating business impact. For instance, if an airline’s ticketing system is down, this can affect millions of dollars in bookings. In 2012, American Airlines boarded nearly 108 million passengers – the average number of flight legs ticketed per hour exceeded 12,000 and was significantly higher during peak periods. At a typical price of $200 per flight leg, 12,000 legs per hour translates into $2.4 million of affected bookings for a single hour’s downtime.

A business service is not simply the application that the end user sees – it is the entire chain that supports delivery of the service, including physical and virtualized servers, databases, middleware, storage and networks.
A failure in any of these can affect the service – and so it is crucial that IT organizations have an integrated, accurate and up-to-date view of all of these components and of how they work together to provide the service.

Why Diagnosing Service Issues Is Difficult

While it is critical to know how a business service is delivered, many IT organizations lack this visibility. Furthermore, they do not have all of the availability and performance information that they need to detect and diagnose service health issues – even if they could map this data to individual business services. Because of this, identifying and finding the root cause of service issues becomes an enormous challenge.

There are many reasons for this lack of visibility – and all of these need to be addressed to ensure that business service issues are detected quickly and resolved effectively.

Lack of an Integrated Service-Level Dashboard

IT organizations do not deliver a single business service – they deliver many. Therefore, IT personnel need to know instantly when any of these services starts to experience an issue – and have to be able to determine the root cause quickly and easily. While many IT departments have a dashboard based on service transaction monitoring, this often does not correlate service issues with underlying infrastructure problems. This leaves IT staff with the task of analyzing vast amounts of low-level infrastructure data in order to identify the particular IT components that are responsible for the service issue.

Complex, Siloed Infrastructure Configuration Data

IT service configuration data is typically collected from isolated technology domains, rather than from a top-down service perspective.
There is no integrated information about how services flow – only siloed data about applications, servers, middleware, networks and storage. At best, topological data is limited to relationships between adjacent components – there is no view of the total set of relationships that support an end-to-end service, forcing staff to map the service delivery path manually. In addition, much of this configuration data is infrastructure-oriented and irrelevant from a service perspective – adding further to the complexity of the mapping process.

Increasingly Dynamic IT Environments

In the past, IT services and infrastructures were both relatively static, with changes happening over periods of months. However, this is no longer the case – IT organizations are faced with an ever-accelerating pace of change. This is being driven by virtualization and cloud, increasing demands for new business services, and new operational approaches such as DevOps that shorten upgrade cycles dramatically. This makes it impossible to track how services are delivered using traditional manual methods – these are designed for changes that happen in weeks, not in minutes. As a result, IT staff lack the accurate and up-to-date service topology information that they need to identify, analyze and resolve service issues correctly and efficiently.

Change-Related Service Issues

Change is the single biggest cause of service issues, and the impact of change is only increasing as it becomes more frequent. Not only do IT staff lack a current and accurate view of how services are delivered to help them diagnose problems, they are also unable to see if anything has changed that could have caused a service impairment. This is the typical “What happened last night?” scenario, and IT personnel now find themselves struggling to answer the question.
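
To make the “What happened last night?” question concrete, the sketch below compares two snapshots of the components behind a service and reports what was added, removed or changed. It is a minimal illustration only: the snapshot format (a dictionary from component name to a version string), the function name `diff_snapshots` and the sample data are all assumptions made for this example, not part of any product.

```python
from typing import Dict, List, NamedTuple


class SnapshotDiff(NamedTuple):
    added: List[str]
    removed: List[str]
    changed: List[str]


def diff_snapshots(before: Dict[str, str], after: Dict[str, str]) -> SnapshotDiff:
    """Compare two hypothetical snapshots (component name -> version tag)."""
    # Components present only in the newer snapshot
    added = sorted(set(after) - set(before))
    # Components present only in the older snapshot
    removed = sorted(set(before) - set(after))
    # Components present in both whose version tag differs
    changed = sorted(
        name for name in set(before) & set(after) if before[name] != after[name]
    )
    return SnapshotDiff(added, removed, changed)


# Invented sample data for illustration
yesterday = {"web-01": "apache-2.4.10", "db-01": "mssql-2012", "lb-01": "haproxy-1.5"}
today = {"web-01": "apache-2.4.12", "web-02": "apache-2.4.12", "db-01": "mssql-2012"}

diff = diff_snapshots(yesterday, today)
print(diff)
```

With even this simple comparison, a service impairment can be lined up against the list of components that changed in the same window, which is the information the paragraph above says is usually missing.
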
Shared and Redundant Infrastructure

While tracing a service through dedicated infrastructure is difficult enough, the task becomes even more challenging with shared or redundant infrastructure. When a service flows across an enterprise bus or load balancer, the ingress point may be known, but there are multiple possible egress points. This makes it very difficult to determine which components are downstream of the shared component in the service flow. This in turn makes it hard to diagnose the root cause of service problems and to understand the service impact of infrastructure failures.

Similarly, when a service depends on redundant infrastructure such as server clusters, failures of individual servers within that cluster may just reduce the capacity of the service or not affect it at all. Without a clear model of the impact of these types of failures, IT staff run the risk of ignoring crucial problems and of overreacting to minor ones.

Inaccurate and Out-of-Date Service Maps

To identify, diagnose and resolve service issues accurately and efficiently, IT staff need a reliable map of each business service. This must identify all of the components that support that service, along with the relationships that trace the service topology across the end-to-end IT infrastructure. However, all of the service change and configuration issues already discussed lead to fragmented, obsolete and unreliable service maps. As a result, IT staff miss critical issues, take excessive amounts of time to resolve service problems, and waste time on infrastructure issues that have little or no service impact.
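
A service map of this kind can be thought of as a directed graph, with components as nodes and “depends on” relationships as edges. The sketch below walks such a graph from a service entry point and lists every component the service relies on, including those reached through a shared load balancer. The dictionary layout, the component names and the function `components_for` are hypothetical and chosen only for illustration.

```python
from collections import deque
from typing import Dict, List

# Hypothetical topology: each component maps to the components it depends on.
DEPENDS_ON: Dict[str, List[str]] = {
    "e-banking-url": ["load-balancer"],
    "load-balancer": ["web-01", "web-02"],
    "web-01": ["app-server"],
    "web-02": ["app-server"],
    "app-server": ["database"],
    "database": ["storage"],
    "payroll-url": ["payroll-app"],  # a different service sharing the database
    "payroll-app": ["database"],
}


def components_for(entry_point: str, graph: Dict[str, List[str]]) -> List[str]:
    """Breadth-first walk from the entry point; returns components in discovery order."""
    seen = {entry_point}
    order = [entry_point]
    queue = deque([entry_point])
    while queue:
        current = queue.popleft()
        for dependency in graph.get(current, []):
            if dependency not in seen:  # visit shared components only once
                seen.add(dependency)
                order.append(dependency)
                queue.append(dependency)
    return order


e_banking = components_for("e-banking-url", DEPENDS_ON)
print(e_banking)
```

Starting from the entry point keeps the result limited to what matters for that one service: the payroll application never appears in the E-Banking list, even though both services share the same database.
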
What is needed is an automated, real-time service discovery tool that spans technology domains, and that has the intelligence to trace services across complex shared and redundant IT components – including virtualized and cloud infrastructure. This tool must be able to update service maps as soon as changes occur, eliminating costly manual service-mapping approaches that depend on scarce expertise and tribal knowledge.

Multiple Infrastructure-Oriented Monitoring Tools

In the same way that IT configuration data is siloed and infrastructure-oriented, IT monitoring tools are not designed or configured with services in mind. These tools typically monitor the health of individual IT infrastructure domains, and deliver information that is defined by and intended for domain experts. As a result, much of the data they produce is irrelevant from a service perspective and is not easy for NOC operators to understand. Furthermore, the monitoring data that is relevant is not mapped back to specific business services, leading to highly inefficient manual “swivel-chair” root-cause analysis.

At the same time, deploying these monitoring tools is expensive since the process for doing this is not automated. Because of this, only a subset of IT components is usually monitored. This leads to gaps in monitoring data – and even if full coverage were possible, IT staff would be overwhelmed with the resulting flow of domain-specific information. Since monitors are not deployed with services in mind, this translates into gaps in service coverage. As a consequence, service issues go undetected – and even when they are identified, there is often not enough information to determine the root cause.

Lack of Cross-Domain Expertise

Managing service health effectively is not just a technical issue – it is also an organizational and cultural one. IT departments manage individual domains today, not services. This creates many domain experts, but cross-domain expertise is rare.
Because of this, IT organizations lack the skills needed to trace a service issue across domains to its root cause – and instead find themselves pointing fingers in the war room.

ServiceNow® Gets to the Root of the Problem

ServiceNow ServiceWatch gives IT staff a completely accurate and up-to-date map of how business services are delivered. It starts with what is important – the business service – and then drills down intelligently across domains to discover all of the IT components that deliver that service – along with the relationships that represent the service flow through these components. Its unique, top-down approach to service mapping is comprehensive and surgical – it discovers everything that matters from a service perspective while eliminating irrelevant and confusing infrastructure data.

Once ServiceWatch has discovered a business service, it then continues to monitor the service topology in real time – updating the corresponding service map as soon as changes are detected. As a result, IT personnel always have a correct, complete and current map of how each business service is delivered – and can also determine the precise service topology at any time in the past.

ServiceWatch collects service-related monitoring data from industry-standard monitoring tools. It also has its own monitoring system, which can either be used to augment existing monitoring data or to provide a complete business service monitoring solution. ServiceWatch correlates this monitoring data using its service maps to produce complete and up-to-date service health status, along with detailed information about how infrastructure issues propagate through the service topology.
This gives IT staff instant visibility of a business service impairment when it occurs, along with the data they need to isolate the problem rapidly and identify its root cause.

Figure 1.

Real-Time Service Health Dashboard

ServiceWatch’s service health dashboard lets IT staff see the status of all business services at a glance. Each service is represented by a tile that is color-coded to represent the health of the service. The size of the tile reflects the business impact of the service – which is calculated using completely configurable business metrics such as the number of users of the service. The dashboard is always up-to-date and accurate, since service health is calculated by combining monitoring data with ServiceWatch’s real-time service maps.

Figure 2.

In the example shown below, an E-Banking service has a critical problem, as do several other services. Simply clicking on the E-Banking tile displays a prioritized list of all of the events that are affecting the service, along with their severity. Similarly, clicking on one of the displayed events highlights all of the business services that are affected.

ServiceWatch also lets IT staff define configurable alerts when conditions occur that require attention – such as when a business service enters a critical state. This is done simply by selecting a trigger condition along with the set of services to which the trigger applies. The alerts can be refined further depending on the type of trigger – for example, the business service status change trigger shown on the right can be refined by specifying the severities that generate the alert.

Figure 3.

Alerts can be sent using a number of different mechanisms, including SNMP traps, command line invocations and emails.
This makes it easy to integrate ServiceWatch with service desks and other systems, and also lets ServiceWatch alert IT staff directly when problems occur.

Top-Down, Automated Service Map Discovery

Traditional approaches to building service maps involve collecting bottom-up infrastructure data from multiple domains. This siloed data is then combined manually into an overall service topology – a process that can take weeks and needs to be redone every time there is a change. ServiceWatch, on the other hand, discovers end-to-end service maps automatically in a matter of minutes, and keeps these maps up-to-date as changes occur.

To discover a business service, ServiceWatch simply needs the service entry point – such as a URL or MQSeries queue. It then drills down through the IT infrastructure, tracing the service across domains – including applications, load balancers, servers, middleware, storage and networks. It probes each component in turn, making intelligent decisions about which components to interrogate next based on the dependencies that it discovers. This allows it to streamline and accelerate the discovery process, focusing only on what matters from a service perspective. Irrelevant infrastructure data is eliminated, creating an accurate and concise service map that simplifies the tasks of problem isolation and root cause analysis.

An example of this discovery is shown here. The complete E-Banking service has been discovered from the initial service entry point, including applications, load balancers, web server clusters, enterprise buses, databases and network connections.

Figure 4.

Optimized For Dynamic Environments

ServiceWatch is designed for today’s dynamic IT environments.
It provides complete support for virtualization and cloud, connecting directly with management systems such as VMware vCenter and Citrix XenCenter. It tracks changes in dynamic infrastructure as they occur, keeping its service maps completely up-to-date as it does.

Figure 5.

The image above shows an HAProxy load balancer running in a virtualized environment. Users can see that the load balancer is connected to a web server cluster, and also have complete information about the hypervisor, physical server and virtual machine, as well as about the load balancer application itself.

Built-In Vendor and Technology Intelligence

ServiceWatch has a knowledge-driven approach to service mapping that makes it unique. It analyzes IT vendor infrastructure intelligently so that it is able to map end-to-end services without requiring expert input from IT personnel. This allows it to completely automate the mapping process – although IT staff are free to make changes to generated service maps once they have been discovered.

Because of this, ServiceWatch is also able to trace services automatically through shared infrastructure such as load balancers and enterprise buses. It can also determine the true service impact of problems in redundant infrastructure such as server clusters and parallel network links.

Efficient Root-Cause Analysis

ServiceWatch provides comprehensive analysis capabilities that let IT personnel diagnose the root cause of service issues quickly and accurately. Using the service map, IT staff can see how issues propagate across the service topology, and can quickly home in on the component causing the service issue. Each component is color-coded to indicate its current health status, and clicking on a component brings up a list of all of the issues that are affecting it.

For example, the E-Banking service shown below has an issue due to a Microsoft SQL Server – circled in orange.
Clicking on the server displays an event list that shows there is a free space problem. Selecting the event in turn highlights that an SSIS component and an SSAS component are also experiencing symptoms because of the root cause space issue – both of which are circled in white.

Figure 6.

Once the IT component causing the issue has been identified, users can then drill down into it to see which subcomponent(s) are causing the problem. This allows IT staff to pinpoint the exact root cause of the service issue so that they can immediately take the right corrective actions.

For example, returning to the same E-Banking service, this is now exhibiting the same free space issue on the MS SQL Server as before, but there is also a critical SCOM event on the SSIS component as shown below. Drilling into the SSIS component shows that all of its subcomponents are affected by the free space problem. However, one highlighted subcomponent – a Windows service – is the source of the critical SCOM event.

Figure 7.

In addition to isolating service issues to the subcomponent level, ServiceWatch also uses service topology to intelligently assess how infrastructure issues affect service health, including issues with redundant infrastructure and communications paths.

As shown below, the E-Banking service is now experiencing a major issue with a web server cluster consisting of two Apache instances.
While the cluster is shown as having a major problem, expanding the cluster shows that one of the instances has a critical file system issue. ServiceWatch has automatically downgraded the overall cluster status to major since the second Apache instance is still operating normally. At the same time, there is a failed port on a network device. In this case, ServiceWatch has determined which communication paths within the service are affected and has highlighted these in red to indicate a critical problem.

Figure 8.

Correlate Service Health Issues with Infrastructure Changes

Since change is one of the biggest reasons why services have issues, it is important to be able to correlate changes with service problems. ServiceWatch makes it easy to do this. It can show the topology of a service at any time in the past, and can also highlight components that have been added, deleted or changed between any two points in time.

In the example shown below, a user has compared the service maps for an intranet service over a six-day interval. The map clearly shows that an Apache instance and a Web server instance were added during this time period. ServiceWatch also displays a chronological list of all of the service health issues that occurred over the selected time period, and allows users to drill into individual changes to get component-level details.

Figure 9.

Integrated Service Monitoring

ServiceWatch integrates easily with industry-standard monitoring tools including Nagios and SolarWinds. Administrators simply need to configure a connection to the monitoring system using a user-friendly wizard interface.
Once the connection has been made, they can then select the desired monitors for any IT component from the set of monitors that are currently active in the third-party monitoring system.

ServiceWatch also comes with its own comprehensive set of business service monitors. These provide a complete cross-domain monitoring platform, and offer service-oriented metrics that track business service availability and business service performance. Unlike traditional monitoring solutions, ServiceWatch automatically deploys its monitors to manage appropriate IT components when specific monitors are selected. This vastly simplifies the task of configuring and deploying service-focused monitoring in the IT network, and ensures that all the information needed to track business service health is collected.

Figure 10.

Conclusion

High-quality business services are the cornerstone of an agile and efficient business. They underpin almost every aspect of business operations, ranging from back-office functions such as supply chain through to customer care and strategic planning. When they are robust and responsive, they create competitive differentiation and unlock new business opportunities.

However, delivering high-quality business services consistently is the single biggest challenge that IT organizations face. They struggle to gain visibility of the health of their business services, and lack the service-oriented information they need to isolate, diagnose and resolve service issues quickly. They are caught in a perfect storm of ever-increasing business demands, constant change and limited information to accomplish their mission.

ServiceNow ServiceWatch provides IT organizations with the service visibility they need to transform the way that they operate.
It tracks how services are delivered across the entire IT infrastructure – including virtualized and physical servers, applications, middleware, storage and networks. Its service maps are always up-to-date and accurate, providing a rock-solid foundation for detecting and isolating service problems, and for diagnosing their root cause. Unlike traditional mapping approaches that can take weeks, ServiceWatch automatically creates service maps in as little as minutes, and keeps them up-to-date as changes occur.

ServiceWatch gives a comprehensive top-down view of business service health, starting with high-level dashboards and extending all the way through to isolation of issues to individual subcomponents within the IT infrastructure. It helps IT organizations to detect and respond more quickly to critical service issues, and gives them the tools they need to resolve these issues quickly and cost-effectively. The result is vastly improved service quality levels, more responsive and agile IT organizations, and dramatically reduced costs.

www.servicenow.com

©2015 ServiceNow, Inc. All rights reserved. ServiceNow believes information in this publication is accurate as of its publication date. This publication could include technical inaccuracies or typographical errors. The information is subject to change without notice. Changes are periodically added to the information herein; these changes will be incorporated in new editions of the publication. ServiceNow may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time. Reproduction of this publication without prior written permission is forbidden. The information in this publication is provided “as is”. ServiceNow makes no representations or warranties of any kind, with respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose. ServiceNow is a trademark of ServiceNow, Inc.
All other brands, products, service names, trademarks or registered trademarks are used to identify the products or services of their respective owners. SN-WP-RCA-072015