Agentless or Agent-based Monitoring? The never-ending story

Document version: 1.2
Date: November 2007

Which one is best? Agentless or agent-based? There has been a lot of debate about this, and the discussion seems endless. Because we receive many questions on the subject, this paper examines it in the light of new developments, Tango/04 experience, and customer requests. Nevertheless, as we say at the end, remember that what you do with the collected data matters far more than the method used to collect it.

What is Agent-based?

Agent-based monitoring involves deploying a software program that runs natively on the monitored element. This is commonly a third-party program or script that is not part of the original element: for instance, the Windows Server Agent from Tango/04, which is a Windows program you need to install and properly configure on a Windows server.

What is Agentless?

Agentless monitoring refers to the ability to monitor an element without first installing a third-party program. Usually, agentless monitoring leverages some standard technology that is already included in the monitored element, for instance the SNMP [1] services that are found in most operating systems today.

Typically, you can manage the monitoring capabilities from a remote point of control. Tango/04 offers the ThinkServer Monitoring Engine, which is basically a manager of monitors. One immediate advantage of ThinkServer is that you can inspect or change the attributes of each monitor, including thresholds, actions, data collection, etc., from a common interface for all the monitored elements.

A bit of History

Agent-based monitoring was the initial way to monitor systems and infrastructure elements. This was due to several factors, in particular the lack of standards to interchange information across different platforms.
Another reason was the nature of IT infrastructure in the early days, typically composed of centralized hosts (such as large mainframes) and static applications (see Table 1). There were very few elements to monitor, they rarely changed, and applications were modified at a much slower pace. Security was also a concern, since the early designs of SNMP version 1 were not considered safe enough. Thus, agent-based monitoring was a necessity and probably the best solution until the nineties.

When centralized systems were rapidly displaced by “rightsizing”, “client/server computing” and “distributed applications”, the situation changed radically. There were no longer just a few elements to monitor: there were thousands, and they changed rapidly. Applications also entered a constant state of flux when Web applications and rapid development tools became popular. Agent-based monitoring started to show its disadvantages in the new scenario.

At the same time, monitoring standards started to evolve. The SNMP specification added versions 2 and 3, solving most of the early problems and adding new capabilities and security. Several other initiatives appeared or were improved, including both open source standards and proprietary specifications that became popular, such as RMON [2], Windows WMI [3], SMB/Samba [4], JMX [5], etc. Most hardware and software vendors started to include APIs to facilitate the remote monitoring of their products.

[1] SNMP stands for Simple Network Management Protocol. It is an Internet protocol defined by the Internet Engineering Task Force (IETF) that has been in use for about 20 years.

© 2007 Tango/04 Computing Group Page 2
Characteristic | Before | Now
Number of servers | Few | Thousands
Infrastructure rate of change | Static | In constant flux
Agentless monitoring standards | Unproven, few, unsafe | Plenty, proven, safer
Application rate of change | Few monthly changes | Web 2.0 pace: several changes per day
Diversity of elements | Mono-platform | Multiplatform
Architecture | Centralized | Distributed

Table 1 – Characteristics of IT environments in the past decade and the present.

Common Agent-based Disadvantages

Generally mentioned disadvantages for agent-based monitoring include:
• Intrusiveness (it may affect existing applications)
• High cost to deploy
• High cost to maintain
• Increased CPU usage

As you need to deploy a third-party piece of software on a monitored element, a conflict may be introduced in the environment. The fear of disrupting a server that is working properly was a very common reason for Tango/04 customers to demand agentless monitoring capabilities from us. The “if it works, don’t touch it” motto is often heard when you mention the need to install additional software in a fragile environment (such as Windows machines or critical applications in a constant state of flux). Usually, agentless monitoring leverages the fact that the data collection agent is already part of the monitored element: it has been thoroughly tested in a multitude of scenarios, its security issues have been openly discussed and fixed, etc.

A common example of the danger of intrusiveness is a native monitoring agent that generates too many log entries: the agent could fill up storage space and cause unavailability in the very element it was supposed to watch over.

[2] RMON stands for Remote Network MONitoring. The RMON MIB was developed by the IETF to support monitoring and protocol analysis of LANs.

[3] Windows Management Instrumentation (WMI) is an extension to the Windows Driver Model that provides an operating system interface through which instrumented components provide information and notification.
WMI is Microsoft's implementation of the Web-Based Enterprise Management (WBEM) and Common Information Model (CIM) standards from the Distributed Management Task Force (DMTF).

[4] SMB stands for Server Message Block, an application-level network protocol mainly applied to shared access to files, printers, serial ports, and miscellaneous communications between nodes on a network. Samba is an open source implementation of SMB.

[5] JMX stands for Java Management Extensions, a Java technology that supplies tools for managing and monitoring applications, system objects, devices (e.g. printers) and service-oriented networks.

Agent-based deployment and maintenance costs are considered higher because software must be installed manually on each element, and because it is far easier to configure everything remotely than to touch each agent on each monitored element. This is true in general, but we should note that standard monitoring protocols usually still need some kind of setup, especially in earlier releases of operating systems (for instance, SNMP services are not enabled by default in Windows 2000 or 2003 Server, and Windows WMI services were not even installed by default in earlier Windows versions). Deployment can be automated in some cases too, which mitigates this disadvantage [6].

The increased CPU usage stems from the fact that a local agent usually needs to incorporate program logic to set and control thresholds, store data, alert, and notify of exceptions. This is a valid concern, but agentless monitoring may in some cases generate more network traffic, since it would probably need to send information periodically to the controlling node.

We can add to the disadvantages the need to learn and use different user interfaces. As each agent will probably have its own GUI or command line interface, this adds to the general complexity of the IT environment and lengthens the learning curve for operators.
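As a rough illustration of the trade-off described in this section, the following Python sketch keeps the threshold logic on a controlling node instead of on each monitored element. It is not ThinkServer code: Monitor, poll_once, set_threshold and the canned samples are hypothetical names invented for this example, and the collect callable merely stands in for a real SNMP, WMI, or SSH query.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Monitor:
    """One remotely polled check; every name here is hypothetical."""
    element: str                  # monitored host or device
    metric: str                   # e.g. "cpu_percent"
    threshold: float              # alert when a sample exceeds this value
    collect: Callable[[], float]  # stand-in for an SNMP/WMI/SSH query


def poll_once(monitors: List[Monitor]) -> List[str]:
    """Collect one sample per monitor and return the alert messages.

    The comparison runs on the controlling node, so the monitored
    element only has to answer the query.
    """
    alerts = []
    for m in monitors:
        value = m.collect()
        if value > m.threshold:
            alerts.append(f"{m.element}: {m.metric}={value} exceeds {m.threshold}")
    return alerts


def set_threshold(monitors: List[Monitor], metric: str, threshold: float) -> int:
    """Change the threshold of every monitor of a metric from one place."""
    changed = 0
    for m in monitors:
        if m.metric == metric:
            m.threshold = threshold
            changed += 1
    return changed


# Canned samples stand in for real remote queries (assumed values).
samples = {"web01": 92.0, "db01": 40.0}
monitors = [
    Monitor(host, "cpu_percent", 90.0, collect=lambda h=host: samples[h])
    for host in samples
]
```

With the canned samples, poll_once(monitors) flags only web01, and after set_threshold(monitors, "cpu_percent", 95.0) no alert is raised. The price of this design is one remote query per monitor per polling cycle, which is the extra network traffic mentioned above.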
Welcome to the Agentless Era

As mentioned, with the explosion of IT infrastructure complexity and the evolution of standards, people saw opportunities in agentless monitoring. Its commonly cited advantages mirror the agent-based disadvantages discussed before:
• Reduced risk of conflict and undesired crashes
• Lower cost of deployment
• Lower cost of maintenance
• Reduced CPU usage
• Common interface for multiple agents in multiple platforms

The fact that you can use a similar interface for all of your agents across all your platforms significantly reduces the learning curve and allows knowledge to be reused across domains.

All of these advantages should yield a lower Total Cost of Ownership (TCO) and a higher Return On Investment (ROI). The larger the IT infrastructure, the more noticeable the differences should be.

Increased requirements for network bandwidth can be a concern, but only if bandwidth is scarce. Usually the communication between the monitored element and the controlling manager (the ThinkServer Monitoring Engine in the case of Tango/04 technology) is local, and customers usually don’t have a bottleneck in the local network.

Security, as mentioned, is less of a concern when using proven protocols such as SNMPv3, which adds authentication, privacy, and access control [7]. In fact, the facilities provided by these standards are generally considered safer than most features provided by third-party software vendors, which usually rely on the “security by obscurity” paradigm.

[6] Typically automated distribution, when feasible, covers only a part of the monitored elements, usually some servers. Note that there are generally several different kinds of elements to monitor, and not all of them share a common, automated software distribution facility.
[7] See SNMPv3: A Security Enhancement for SNMP, William Stallings, IEEE, http://www.comsoc.org/livepubs/surveys/public/4q98issue/stallings.html

Dealing with a high volume of data has been pointed out [8] as a disadvantage of the agentless approach, but this argument is less and less relevant as remote monitoring technology adds the same filtering capabilities as its agent-based counterpart.

So, the winner is…?

If you look at the summary in Table 2, it looks like agentless is the one with the most advantages. Why, then, is there continuous debate over which method is better? Well, simply put, every environment is different, so chances are that the best answer for one company is not the best for another. Usually, the correct answer to such a generic question is “it depends”, so reaching a scientific, definitive conclusion on the subject is far from feasible.

Attribute | Agent-based | Agentless
Installation | Requires local agent | Nothing
Maintenance | Requires updating for new versions | Nothing
Risks | Application or operating system conflict, performance degradation, crashes due to untested interferences | Reduced, since no extra software is deployed
Security | Could be good, but beware of “security by obscurity” and undetected security holes in third-party products that are not widely deployed | Good when using safe, open protocols
Controlling interface | Typically different for each agent | Same if a common management engine is used
Learning curve for common agent maintenance tasks | One for each agent | One for all agents, through the common management engine
Bandwidth usage | Maybe less | Maybe higher
Resource usage | Maybe higher | Maybe less
Usually best for | Small number of elements, extremely critical centralized elements (such as a big mainframe server) | Large IT infrastructure
TCO | Probably higher | Probably lower

Table 2 – Commonly cited attributes of each monitoring alternative. Please note that the facts are not as simple as they seem.
Most “advantages” are theoretical only, and can turn into the opposite in case of a bad implementation.

[8] The ability to get more data, and hence to perform in-depth monitoring, has also been cited as an advantage of agent-based approaches. This is sometimes true, but far from an absolute truth. In some cases the implementation of the agentless mechanism is as good as a local agent, or even better. Moreover, in some cases the agentless mechanism uses the same underlying technology as its agent-based counterpart. However, when doing very detailed data collection for highly technical purposes, such as capacity planning, in some environments an agent-based product can be more capable. But this opens a whole different discussion: what should you monitor, and why? In our practical experience with hundreds of successful monitoring projects, we have always preferred to favor the quality of controls over their mere quantity. For application monitoring (BSM and SLM) projects this has always been a winning value proposition. So discussing the richness of agent-based against agentless monitoring in theory is just a futile exercise. Moreover, end-to-end service level monitoring is performed through synthetic transactions, which are by definition agentless. And there are other cases where agentless is the only way to go (monitoring certain devices such as Internet appliances is a good example of this).

Since there has been so much interest in agentless technologies from our customers, Tango/04 is creating more and more monitors that can be deployed remotely. We rewrote our native Windows Server Agent as a WMI-based ThinkServer monitor, and we are also rewriting most of our existing iSeries agents as Java-based agentless ThinAgents [9]. Practically all the new monitoring functionality we are creating is based on the ThinkServer engine, ensuring its agentless capabilities.
Indeed, we strongly believe in the benefits of this approach and we made a strategic decision to support this type of monitoring mechanism.

But we are not dogmatic about it. In some cases we will offer hybrid alternatives, such as the ability to send a Unix/Linux script to a remote server, execute the script natively, and retrieve the data back to the ThinkServer (which is very similar to our SSH-based monitors), or to use a mix of remote and agent-based monitors. We can also create a native implementation of our ThinkServer monitoring engine on Unix/Linux platforms [10], so customers will have the choice to monitor either remotely or locally if they want, at the cost of installing the ThinkServer engine on each machine.

Conclusion

In the end, agentless monitoring is just agent-based monitoring. The only difference is that the local, native monitoring agent is pre-installed in the monitored element. For instance, SNMP or syslog services are already installed in most UNIX systems, but they basically act as any other monitoring agent, collecting data and reporting it back.

Probably the best monitoring alternative for you is the one that gives you the best chance of success. Unfortunately, millions of dollars have been spent on complex monitoring framework implementations with the simple goal of controlling and improving the performance of an IT infrastructure… all in vain. The best advice we can give you when selecting a monitoring mechanism is to look closely at the track record of the company that will ultimately be responsible for your project.

So, agentless or agent-based? It depends! Agentless is very tempting, as discussed, but both methods have their advantages and disadvantages, and you will find people advocating in both directions. But trust us: in the end, what matters more is the ability (and willingness) of your monitoring partner to turn you into a happy customer story instead of another sad tale of frustration.
[9] ThinAgent is what we call a ThinkServer agent. As most of the surrounding services are offered by the engine itself, the agent is “thin”: it just collects the data and does very little else. That strengthens Tango/04’s ability to rapidly create new monitors and agents, as the time required to implement a new one is dramatically reduced.

[10] The ThinkServer Monitoring Engine has been designed with portability in mind. It leverages cross-platform standards such as C++, XML, and SOAP, and we have versions of the engine running perfectly on Linux platforms in our labs.

About Tango/04 Computing Group

Tango/04 Computing Group is one of the leading developers of systems management and automation software. Tango/04 software helps companies maintain the operating health of all their business processes, improve service levels, increase productivity, and reduce costs through intelligent management of their IT infrastructure.

Founded in 1991 in Barcelona, Spain, Tango/04 is an IBM Business Partner and a key member of IBM’s Autonomic Computing initiative. Tango/04 has more than a thousand customers who are served by over 35 authorized Business Partners around the world.

Alliances and partnerships:
• IBM Business Partner
• IBM Autonomic Computing Business Partner
• IBM PartnerWorld for Developers Advanced Membership
• IBM ISV Advantage Agreement
• IBM Early code release
• IBM Direct Technical Liaison
• Microsoft Developer Network
• Microsoft Early Code Release

Legal notice

The information in this document was created using certain specific equipment and environments, and it is limited in application to those specific hardware and software products and version and release levels.

Any references in this document regarding Tango/04 Computing Group products, software or services do not mean that Tango/04 Computing Group intends to make these available in all countries in which Tango/04 Computing Group operates.
Any reference to a Tango/04 Computing Group product, software, or service may be used. Any functionally equivalent product that does not infringe any of Tango/04 Computing Group’s intellectual property rights may be used instead of the Tango/04 Computing Group product, software or service.

Tango/04 Computing Group may have patents or pending patent applications covering subject matter in this document. The furnishing of this document does not give you any license to these patents.

The information contained in this document has not been submitted to any formal Tango/04 Computing Group test and is distributed AS IS. The use of this information or the implementation of any of these techniques is a customer responsibility, and depends on the customer’s ability to evaluate and integrate them into the customer’s operational environment. Despite the fact that Tango/04 Computing Group could have reviewed each item for accuracy in a specific situation, there is no guarantee that the same or similar results will be obtained somewhere else. Customers attempting to adapt these techniques to their own environments do so at their own risk. Tango/04 Computing Group shall not be liable for any damages arising out of your use of the techniques depicted in this document, even if it has been advised of the possibility of such damages.

This document could contain technical inaccuracies or typographical errors.

Any pointers in this publication to external web sites are provided for your convenience only and do not, in any manner, serve as an endorsement of these web sites.

The following terms are trademarks of the International Business Machines Corporation in the United States and/or other countries: AS/400, AS/400e, iSeries, i5, DB2, e(logo)®Server, IBM®, Operating System/400, OS/400, i5/OS.

Microsoft, SQL Server, Windows, Windows NT, Windows XP and the Windows logo are trademarks of Microsoft Corporation in the United States and/or other countries.
Java and all Java-based trademarks and logos are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States and/or other countries.

UNIX is a registered trademark in the United States and other countries licensed exclusively through The Open Group.

Oracle is a registered trademark of Oracle Corporation.

Other company, product, and service names may be trademarks or service marks of other companies.