Splunk and Grafana - indico.jinr.ru – Indico

Grafana and Splunk as an example
of solving data visualization problems
in modern information
collection systems.
И.Н.Александров, Е.И.Александров,
М.А.Минеев
LIT, JINR
Outline
• Aim of this work
• Splunk
– Supported platforms, requirements
• Log Service
– Short description
– Log Manager
– ERS Browser (Splunk version of Log Manager)
• Grafana
• Conclusions
The aim
Monitoring the functionality and performance of large
distributed computing systems, such as data acquisition
(DAQ) systems in high-energy physics, requires continuous
analysis and storage of a large amount of operational data
from different sources.
There are two important aspects here:
– Online access to the monitored data.
– Logged data: retrieval from the DB (search), manipulation (statistics
calculation) and visualization (various kinds of graphs and tables).
The aim
• A new trend has appeared recently:
special tools have been developed to simplify work
with large volumes of machine-generated data.
• Here we discuss our experience of
using two of them: Splunk and Grafana.
Splunk
http://www.splunk.com
Splunk is a commercial product of Splunk Inc.
It provides a general-purpose search, analysis and reporting
engine and a distributed, non-relational, semi-structured
database for time-series text data (typically machine data in
large-scale data processing).
A free version is also available: the Free license lets you index up
to 500 MB per day.
Splunk
• Support for different platforms:
– Linux, Windows, MacOS, Solaris
• Different data sources:
– files, SNMP data, Windows Event Log data, Windows Registry data, etc.
• Different data formats
• Powerful search language support:
– SPL – Search Processing Language
• Built-in web server
• Rich visualization capabilities:
– Splunk has its own single-page application framework based on Backbone.js
Splunk GUI example (the script list)
A simple introduction (in Russian):
“Пример использование Splunk для анализа логов”
(“An example of using Splunk for log analysis”):
https://habrahabr.ru/post/160197/
Log Manager Interface
The Log Manager is the graphical user interface for browsing the log
messages produced by the TDAQ system.
The Log Manager has been developed in Java (AWT, Swing, JDBC).
It requires the TDAQ software environment and works inside the CERN network.
The ATLAS TDAQ experts need access to the information from outside CERN
to stay in touch during the run.
The ERS Browser: Splunk App Scheme
[Diagram: a Scripted Input of the Splunk App (ERS Browser) reads the Log Service DB (Oracle); Splunk indexes the data into its own DB and serves the app through its Web Server to the Browser.]
Developing Views and Apps for Splunk Web
http://docs.splunk.com/Documentation/Splunk/latest/AdvancedDev/Whatsinthismanual
(Chapters: Build apps, Build scripted inputs.)
ERS Browser
(Splunk Version of Log Manager)
A.Kazarov
Chained logs in Log Manager
Chained logs in ERS Browser
ERS statistics (example of diagrams)
Grafana
Grafana is a graph and dashboard builder for
visualizing time-series infrastructure and application
metrics. It provides a powerful way to create,
explore, and share dashboards.
http://grafana.org
Grafana
The first version was released in 2014. Version 3.1 of Grafana
is now available.
Advantages:
• open source;
• visual dashboard editor with rich possibilities (graphs,
triggers, HTML inserts). Grafana dashboards are based on
Angular.js;
• flexibility: everything can be adjusted to suit everyone;
• all changes can be made through the GUI;
• usability: it is as convenient as it is beautiful;
• scrolling, zooming and so on;
• sortable table of values (min, max, avg, current, total);
• mathematical and statistical functions can be applied to the
displayed metrics.
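The sortable table of values mentioned above can be illustrated with a minimal Python sketch. It is not Grafana code: the function name `legend_stats` and the sample series are hypothetical, and "current" is assumed here to mean the last value of the series.

```python
def legend_stats(values):
    """Summary values of the kind listed in the table (min, max, avg, current, total)."""
    return {
        "min": min(values),
        "max": max(values),
        "avg": sum(values) / len(values),
        "current": values[-1],  # assumption: "current" is the last sample
        "total": sum(values),
    }


# Hypothetical series, keyed by metric name.
series = {"cpu": [2.0, 4.0, 6.0], "net": [10.0, 30.0, 20.0]}
table = {name: legend_stats(points) for name, points in series.items()}

# Sort the rows by one column, here "max" in descending order.
for name in sorted(table, key=lambda n: table[n]["max"], reverse=True):
    print(name, table[name])
```

Sorting by another column only requires changing the key used in `sorted`.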
Grafana
Advantages (2):
• dashboards are rendered on the client side;
• easy installation;
• self-sufficient (no separate HTTP server is required);
• large set of supported database types (including REST DB
services over HTTP).
Disadvantage:
• poor documentation.
Grafana modes:
• static (the dashboard structure is predefined);
• scripted (the dashboard structure depends on the
request parameters).
Grafana: datasource
Name of database                                   Version of Grafana
Graphite                                           1 and above
InfluxDB                                           1 and above
OpenTSDB                                           1 and above
KairosDB                                           2 and above
Prometheus                                         2 and above
Elasticsearch                                      3 and above
CloudWatch                                         3 and above
Other databases (via a manually created plugin)    1 and above
Grafana: static
Grafana: scripting
URL for access:
http://grafana_url/dashboard/script/scripted.js?rows=3&name=myName
grafana_url – hostname of the Grafana server (including the port)
scripted.js – name of the script to run
Example of a script (JavaScript):
// ARGS (the URL query parameters) and _ (lodash) are provided by Grafana.
var dashboard = { title: 'Scripted dashboard', rows: [] };
var rows = 1;
var seriesName = 'argName';
if (!_.isUndefined(ARGS.rows)) { rows = parseInt(ARGS.rows, 10); }
if (!_.isUndefined(ARGS.name)) { seriesName = ARGS.name; }
// Add one row with a single graph panel per requested row.
for (var i = 0; i < rows; i++) {
  dashboard.rows.push({
    title: 'Scripted Graph ' + i,
    panels: [{
      title: 'Events', type: 'graph',
      targets: [{ target: "randomWalk('" + seriesName + "')" }]
    }]
  });
}
return dashboard;
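The access URL for a scripted dashboard can be assembled programmatically. The following Python sketch only builds the string in the form shown above; the helper name `scripted_dashboard_url` and the host `localhost:3000` are assumptions made for illustration.

```python
from urllib.parse import urlencode


def scripted_dashboard_url(grafana_url, script="scripted.js", **params):
    """Build the URL of a scripted dashboard from the host and query parameters."""
    url = f"http://{grafana_url}/dashboard/script/{script}"
    query = urlencode(params)  # e.g. rows=3&name=myName
    return f"{url}?{query}" if query else url


# Hypothetical host and port; the parameters are read by the script via ARGS.
print(scripted_dashboard_url("localhost:3000", rows=3, name="myName"))
```

Each keyword argument becomes one request parameter, so the same script can produce differently shaped dashboards without being edited.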
Example of a dashboard for ATLAS
[Screenshot annotations: partition, template, zoom area]
Netis: current implementation
Images
Zoom
Supports only Round-robin Database (RRD)
Netis: new Grafana version
The current implementation is based on InfluxDB.
A new database, PBEAST, may be used in the future.
In that case the dashboards will not need to be rewritten;
only the data source will change.
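A minimal Python sketch of what changing only the data source could look like on a dashboard definition. It assumes a row/panel layout in which each panel names its source in a `datasource` field, and that the panel queries remain valid for the new source; the dashboard content and the names `influxdb` and `pbeast` are hypothetical.

```python
import copy


def switch_datasource(dashboard, new_name):
    """Return a copy of the dashboard with every panel pointed at new_name."""
    updated = copy.deepcopy(dashboard)
    for row in updated.get("rows", []):
        for panel in row.get("panels", []):
            panel["datasource"] = new_name
    return updated


# Hypothetical dashboard definition with a single graph panel.
dashboard = {
    "title": "Netis",
    "rows": [
        {"title": "Traffic",
         "panels": [{"title": "Events", "type": "graph", "datasource": "influxdb"}]},
    ],
}

migrated = switch_datasource(dashboard, "pbeast")
print(migrated["rows"][0]["panels"][0]["datasource"])
```

The layout, titles and panel types are left untouched, which is the point of the migration described above.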
Conclusion
Splunk and Grafana:
– Rich visualization capabilities
– Templates for simple tasks
– Systems can be adapted on the fly in case of changes
Thank you!
Backup Slides
Log Service
The Log Service is the component in charge of collecting, saving and archiving all
information which needs to be logged in the TDAQ system.
Additional Information about Splunk
• A simple introduction (in Russian), “Пример использование Splunk для анализа логов”:
https://habrahabr.ru/post/160197/
• Yoshiji Yasu, Andrei Kazarov et al., “Performance of Splunk for the TDAQ Information Service at the ATLAS experiment”:
http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=7097473&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D7097473
• “Developing Views and Apps for Splunk Web” (chapters: Build apps, Build scripted inputs):
http://docs.splunk.com/Documentation/Splunk/latest/AdvancedDev/Whatsinthismanual
Splunk
• Splunk supported platforms:
– Windows 7, 8, 8.1 and 10
– Linux distributions with a 2.6+ kernel (64-bit)
– Solaris 10, 11
– OS X (Intel) 10.9, 10.10, 10.11
• Splunk runs on a virtual machine:
– RAM: 8 GB
– vCPU: 4
– HD: 40 GB
– OS: Scientific Linux CERN (SLC) 6
Dashboards
• Splunk has its own MVC-like framework, based on Backbone.js.
• Dashboards use the following libraries:
– Backbone.js http://backbonejs.org
– Underscore.js http://underscorejs.org
– jQuery https://jquery.com
Splunk Apps
Apps are made up of knowledge objects and configurations, anything from a custom
UI to custom input scripts.
Apps:
• Contain at least one navigable view.
• Can be opened from the Splunk Enterprise Home Page, from the App menu, or from the Apps
section of Settings.
• Focus on aspects of your data.
• Are built around use cases.
• Support diverse user groups and roles.
• Run in tandem.
• Contain any number of configurations and knowledge objects.
• Are completely customizable, from front end to back end.
• Can include Web assets, such as HTML, CSS and JavaScript.
Access to the Log Service DB
Scripted Input (Python)
• Runs at a regular interval
• Queries the Log Service database
• Converts the data into a format optimized for Splunk indexing
Example of a parsed log record (one line):
t=1455801117, rn=289479, part=GMTestPartition_lshi, uname=lshi, msgID=SFOng::InconsistentRunNumber,
host=pc-tdq-mon-32.cern.ch, app=SFO-1, sev=WARNING, text="The event run number (276952) does not match
the actual run number (289479)", context="PACKAGE_NAME: SFOng. FILE_NAME: ../src/Event.cxx.
FUNCTION_NAME: SFOng::Event::Event(SFOng::StatsCollector&, SFOng::Input&, uint64_t, uint32_t,
SFOng::BufferHandle, bool). LINE_NUMBER: 77. DATE_TIME: 1455801117.", params="gid: 276952. runno:
289479. ", quals="SFOng ", chained="2"
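The formatting step of such a scripted input can be sketched in Python as follows. This is an illustration under stated assumptions, not the actual ERS Browser script: the function name `format_event` and the sample record are hypothetical, the set of quoted fields is taken from the example record above, and replacing embedded double quotes is an assumed convention, since the example shows no escaping.

```python
import sys

# Fields that appear in double quotes in the example record; the rest are bare.
QUOTED_FIELDS = {"text", "context", "params", "quals", "chained"}


def format_event(row):
    """Render one log record (a dict) as a single key=value line."""
    parts = []
    for key, value in row.items():
        if key in QUOTED_FIELDS:
            # Assumption: embedded double quotes are replaced by single quotes.
            value = '"' + str(value).replace('"', "'") + '"'
        parts.append(f"{key}={value}")
    return ", ".join(parts)


# Hypothetical record; in a real input it would come from the database query.
row = {
    "t": 1455801117,
    "rn": 289479,
    "app": "SFO-1",
    "sev": "WARNING",
    "text": "The event run number (276952) does not match the actual run number (289479)",
    "chained": 2,
}

# A scripted input hands events to Splunk by writing them to standard output.
sys.stdout.write(format_event(row) + "\n")
```

Keeping every record on one line with explicit key=value pairs lets Splunk extract the fields at search time without a custom parser.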