
The evolution of monitoring services: RNP Case
Emmanuel Gomes Sanches
Paulo M. da Conceição Júnior
Rede Nacional de Ensino e Pesquisa (RNP)
Brazilian NREN
Engineering and Operations Directory (DEO/GTI)
RNP Introduction – The Brazilian NREN
Ipê Network
• RNP national backbone
• Network connections
• Status as of August 2016
• Composed of 27 PoPs (Points of Presence)
• Aggregated capacity: 325 Gbps
• International capacity: 116 Gbps
RNP Introduction – Network extension
Total size: 8,516,000 km²
[Map: size comparison with Australia, Europe and the USA]
Europe excluding Russia: 6,900,000 km²
RNP Introduction – NOC Team
Network Operation Center – 24x7 monitoring service
The NOC team alerts the technical teams, contacting them every 10 minutes
Motivation
Some difficulties
• Reactive action in some failure situations
• Lack of visibility in identifying affected services
• No statistics on service availability
• Dependence on PoP teams’ reports to track connectivity health
• Lack of monitoring information during connection failure events
Action plan
“Promote the evolution of monitoring to improve the quality of
management, inserting new tools and expanding the scope”
Proposal
Strategy of evolution
Implement a set of projects to improve monitoring:
• Project 1 : Monitoring tools evolution (2015 – concluded)
• Project 2 : Monitoring scope evolution (2015 – in progress)
• Project 3 : Monitoring scope expansion (2016 – in progress)
Project 1 : Monitoring solution – Initially
• Centralized architecture
• Single point of failure
Project 1 : Monitoring solution – Model
• Distributed architecture
• No single point of failure
• Better reliability
• Open source software
• Based on Nagios code
• Load distribution
Project 1 : Monitoring solution – Nowadays
RNP implementation
• 27 pollers (remote monitoring agents), one in each PoP
• Monitoring 1,037 hosts with 3,012 elements so far
• High resilience in case of failure
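One way a remote poller can stay useful when its link to the central server fails is to store results locally and forward them once the link returns. The sketch below illustrates that store-and-forward idea only. The `Poller` and `CheckResult` classes, their fields, and the `delivered` list standing in for the central server are all hypothetical and do not describe the tool RNP deployed.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class CheckResult:
    host: str
    status: str      # "UP" or "DOWN"
    timestamp: int   # seconds, supplied by the caller


@dataclass
class Poller:
    """Hypothetical remote agent for one PoP (illustration only)."""
    pop: str
    central_reachable: bool = True
    buffer: deque = field(default_factory=deque)
    delivered: list = field(default_factory=list)  # stands in for the central server

    def report(self, result: CheckResult) -> None:
        # Queue first so ordering is preserved, then try to deliver.
        self.buffer.append(result)
        self.flush()

    def flush(self) -> None:
        # Deliver buffered results only while the central server is reachable.
        while self.central_reachable and self.buffer:
            self.delivered.append(self.buffer.popleft())


if __name__ == "__main__":
    poller = Poller(pop="PoP-A")
    poller.report(CheckResult("router-1", "UP", 0))
    poller.central_reachable = False           # link to the central server fails
    poller.report(CheckResult("router-1", "DOWN", 60))
    poller.report(CheckResult("router-1", "UP", 120))
    poller.central_reachable = True            # link is restored
    poller.flush()
    print([(r.timestamp, r.status) for r in poller.delivered])
```

With this design the history recorded during the outage is not lost; it only arrives late.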
Project 2 : Monitoring scope evolution
Roadmap
• Availability monitoring
    • Connectivity monitoring (already done)
    • IT infrastructure monitoring (2015)
    • Corporate services monitoring (2016)
    • Advanced services monitoring (2017)
• Performance monitoring (2018)
• Quality monitoring (2019)
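As a rough illustration of what an availability statistic could look like, the sketch below computes the share of checks that found a service up. The `availability` function and the "UP"/"DOWN" labels are assumptions made for this example, not RNP's actual metric definition.

```python
def availability(statuses):
    """Percentage of checks that found the service up; None if there are no checks."""
    if not statuses:
        return None
    up = sum(1 for status in statuses if status == "UP")
    return 100 * up / len(statuses)


if __name__ == "__main__":
    # Four checks, one of which failed.
    print(availability(["UP", "UP", "DOWN", "UP"]))
```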
Project 2 : Monitoring scope evolution
Previous scope : Network and IT Infrastructure (based on SNMP)
RNP NOC
Project 2 : Monitoring scope evolution
New scope : Add customer connectivity and RNP services (functional tests)
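A functional test differs from an SNMP poll in that it exercises the service the way a user would and reports whether the whole operation worked. The sketch below shows the general shape of such a check, using the standard Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL). The `functional_check` function, its `probe` callable and the `warn_seconds` threshold are hypothetical names chosen for this example.

```python
import time
from typing import Callable, Tuple

OK, WARNING, CRITICAL = 0, 1, 2  # Nagios plugin exit codes


def functional_check(
    probe: Callable[[], bool],
    warn_seconds: float = 2.0,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[int, str]:
    """Run one end-to-end probe and map its outcome to a Nagios-style status."""
    start = clock()
    try:
        succeeded = probe()
    except Exception as exc:  # a probe that raises means the service is unusable
        return CRITICAL, f"CRITICAL - probe raised {type(exc).__name__}"
    elapsed = clock() - start
    if not succeeded:
        return CRITICAL, "CRITICAL - service answered but the test failed"
    if elapsed > warn_seconds:
        return WARNING, f"WARNING - test passed in {elapsed:.2f}s"
    return OK, f"OK - test passed in {elapsed:.2f}s"


if __name__ == "__main__":
    # A stand-in probe; a real one might log in to a service or upload a file.
    code, message = functional_check(lambda: True)
    print(code, message)
```

The probe is passed in as a callable so that the same wrapper can cover very different services, each with its own end-to-end test.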
Project 2 : Monitoring scope evolution
Inclusion of 16 Advanced Services
• CAFe (federation)
• Telepresence system
• TV Signal Transmission
• Live Video Transmission
• Videoclass@RNP
• Videoconference
• Video on Demand
• Edudrive (cloud storage)
• Compute (IaaS)
• Web Conference
• eduroam
• FileSender@RNP
• FIX
• fone@RNP (VoIP)
• ICPEdu (CA)
• IDC (colocation)
Project 3 : Monitoring scope expansion
Inclusion of customer connectivity
Network layers monitoring history
• CORE Layer: Ipê Network backbone, monitored since 2007
• DISTRIBUTION Layer: 27 Points of Presence, 10 PoPs monitored since 2016
• ACCESS Layer: 1,237 customers with 3.5 million users, 497 customers monitored since 2016
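The figures above imply the following monitoring coverage per layer. This is a small worked calculation using only the numbers on the slide; the `coverage` helper is introduced here for illustration.

```python
def coverage(monitored: int, total: int) -> float:
    """Return the monitored share as a percentage, rounded to one decimal place."""
    return round(100 * monitored / total, 1)


layers = {
    "DISTRIBUTION (PoPs)": (10, 27),
    "ACCESS (customers)": (497, 1237),
}

for name, (monitored, total) in layers.items():
    print(f"{name}: {monitored}/{total} = {coverage(monitored, total)}%")
```

That works out to about 37.0% of the PoPs and 40.2% of the customers.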
Benefits and conclusion
Results already achieved through the efforts of the last 2 years:
• Adoption of a new monitoring tool
    • Better availability of the monitoring service
    • Load distribution through remote monitoring agents (pollers)
    • Monitoring resilience in case of connection failure
    • Possibility of monitoring services through functional tests
• Construction of a services monitoring view
    • Quicker failure identification
    • NOC operators contact the appropriate technical teams
    • Shorter resolution time
    • Better service management and support
• Inclusion of customer connectivity monitoring
    • Proactive action on customers’ connectivity failures
    • Better customer satisfaction
    • Better network management and planning
Emmanuel Sanches – IT Manager
[email protected]
Paulo Júnior – IT Specialist
[email protected]