Time Series Data Repository (TSDR) Project Proposal
www.opendaylight.org

TSDR Functional Objectives

− To capture ODL data into a persistent time series data repository. This includes:
    − Statistics counters
    − Performance data
    − Health status information
    − Operational configuration data
− To facilitate various applications built on top of TSDR. Applications include:
    − Operational configuration optimization
    − Traffic engineering
    − Network analytics with automated intelligence
    − Security risk detection
    − Performance analysis
− Major functions:
    − Data Collection
    − Data Storage
    − Data Queries
    − Data Aggregation
    − Data Purge
− Lithium focus: TSDR functionalities on OpenFlow Statistics data

TSDR Design Objectives

− Generic and extensible architectural framework
    − Generic and extensible TSDR Data Model.
    − Abstract and generic TSDR Persistence Layer
        − Allows implementation of various data store plugins under the TSDR Persistence Layer through the TSDR Persistence APIs, with the HBase Plugin as an example TSDR Data Store implementation.
− Scalable with high performance
    − Providing both integrated and distributed architectures to handle different scales of time series data
    − Fully utilizing MD-SAL’s clustering capability to handle performance and scalability in large-scale deployment scenarios

TSDR Integrated Architecture

− TSDR Data Services, including Data Collection, Data Storage, Data Query, Data Purging, and Data Aggregation, are MD-SAL services.
− The Data Collection service receives time series data published on the MD-SAL messaging bus by MD-SAL southbound plugins.
− The Data Collection service communicates with the Data Storage service to store the data into TSDR.
− TSDR data services access a TSDR Data Store such as HBase through the generic TSDR Data Persistence Layer.
− Needs MD-SAL notification subsystem support.

TSDR Distributed Architecture

− In large data center deployment scenarios, the TSDR Distributed Architecture would be needed to handle the performance and scalability requirements.
− In the distributed architecture, TSDR data services are deployed in a separate MD-SAL instance.
− The data pushed onto the MD-SAL messaging bus by the ODL southbound plugin is propagated to the other MD-SAL instance, where the TSDR data services process it into the TSDR data repository.
− Needs ODL clustering support.

TSDR Data Flow with Multiple Data Models

− The TSDR data flow involves multiple data models: the source data model (OpenFlow statistics), the TSDR data model, and the TSDR plugin (HBase) data model.
− The Data Collection Service subscribes to receive OpenFlow Statistics data from the MD-SAL Notification Subsystem and passes the data to the Data Storage Service.
− The Data Storage Service converts the OpenFlow Statistics data model to the TSDR data model.
− The HBase TSDR Plugin converts the TSDR data model to the HBase-specific data model based on the HBase TSDR schema design.

Unstructured or Semi-Structured Data Consideration (for future release)

− For unstructured or semi-structured data such as syslog data, MD-SAL receives the data in a syslog-specific data model.
− Data Filtering and Preprocessing can be added to filter out noise and optionally extract structured information from the semi-structured data.
− A third-party-specific TSDR plugin, such as a Splunk Plugin, could be added under the TSDR Data Persistence Layer to work with proprietary data stores.
− The Data Aggregation Service is not needed when handling unstructured data.
− Third-party tools such as Splunk could leverage the Data Query Service to obtain the unstructured data from TSDR and add application-specific processing on top of it.
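
The Data Filtering and Preprocessing step mentioned above is not specified in this proposal. The following is only a minimal sketch of what such a step might look like, assuming BSD-style syslog lines of the form "<PRI>Mmm dd hh:mm:ss host tag: message". The names SYSLOG_PATTERN, MAX_SEVERITY, and preprocess are hypothetical and are not part of TSDR; the noise threshold is an arbitrary choice for illustration.

```python
import re

# Assumed input format (BSD-style syslog): "<PRI>Mmm dd hh:mm:ss host tag: message"
SYSLOG_PATTERN = re.compile(
    r"^<(?P<pri>\d{1,3})>"
    r"(?P<timestamp>[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<tag>[^:\s]+): "
    r"(?P<message>.*)$"
)

# Hypothetical noise threshold: drop informational (6) and debug (7) messages.
MAX_SEVERITY = 5


def preprocess(line):
    """Return a structured record, or None if the line is noise or unparseable."""
    match = SYSLOG_PATTERN.match(line)
    if match is None:
        return None
    # In syslog, PRI = facility * 8 + severity.
    facility, severity = divmod(int(match.group("pri")), 8)
    if severity > MAX_SEVERITY:
        return None
    return {
        "facility": facility,
        "severity": severity,
        "timestamp": match.group("timestamp"),
        "host": match.group("host"),
        "tag": match.group("tag"),
        "message": match.group("message"),
    }


if __name__ == "__main__":
    print(preprocess("<34>Oct 11 22:14:15 mymachine su: 'su root' failed for lonvick"))
    print(preprocess("<191>Oct 11 22:14:16 mymachine app: heartbeat"))
```

In this sketch, filtering and extraction happen in one pass, so only lines that survive the severity filter are turned into structured records before being handed to storage.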

TSDR Data Model

− The goals of the TSDR data model design:
    − Generic
    − Extensible
    − Scalable
    − Performance optimized
− The data model captures:
    − Statistics data
    − Log type of data
− Note: To add a new group, extend TSDRBaseRecord.
− DataCategory contains:
    − Flow Group Stats
    − Flow Stats
    − Flow Meter Stats
    − Interface Stats
    − Log Records
    − Queue Stats
− Note: More categories can be added to the above list.
− RecordKeys contains:
    − A list of composite keys
    − Different categories contain different sets of keys
    − Key set validation is needed based on the data category

TSDR Persistence APIs

Interface Name | Description/comments | Extends from ODL Common APIs? | Specific to TSDR Persistence API? | Will be implemented in HBase plugin in Lithium?
save() | Saves one or a list of objects | Yes | No | Yes
find() | Queries based on a list of IDs, with specified criteria and paging support | Yes | No | No
count() | - | Yes | No | Yes
delete() | Deletes one or a list of IDs, or deletes the entire table | Yes | No | No
exists() | Queries based on one or a list of IDs | Yes | No | Yes
min(), max(), avg() | For Data Aggregation purposes | No | Yes | No

HBase TSDR Schema – Raw Data

TableName | RowKey | Column Family: Column Qualifier = Cell Value
FlowMetrics | MetricID_NodeID_TableID(_FlowID)_timestamp | ‘raw’ = metric_value
InterfaceMetrics | MetricID_NodeID_TableID(_PortID)_timestamp | ‘raw’ = metric_value
QueueMetrics | MetricID_NodeID_TableID_PortID_QueueID_timestamp | ‘raw’ = metric_value
GroupMetrics | MetricID_NodeID_GroupID(_GroupBucketID)_timestamp | ‘raw’ = metric_value
MeterMetrics | MetricID_NodeID_GroupID(_MeterID)_timestamp | ‘raw’ = metric_value

Schema design considerations:

− General HBase schema design rules applied:
    − Keep the RowKey, Column Family key, and Column Qualifier as short as possible.
    − Design the RowKey so that rows are evenly distributed across multiple data nodes.
    − Keep the number of column families low.
− Other performance considerations:
    − Multiple tables are created based on the data categories in the TSDR data model.
    − Data storage and query operations run much faster on smaller data sets stored in HBase tables with structured keys.

HBase TSDR Schema – Aggregated Data

TableName | RowKey | Column Family: Column Qualifier = Cell Value
HourlyFlowMetrics | MetricID_NodeID_TableID(_FlowID)_timestamp | ‘min’ = metric_value, ‘max’ = metric_value, ‘avg’ = metric_value
HourlyInterfaceMetrics | MetricID_NodeID_TableID(_InterfaceID)_timestamp | ‘min’ = metric_value, ‘max’ = metric_value, ‘avg’ = metric_value
HourlyQueueMetrics | MetricID_NodeID_TableID_PortID_QueueID_timestamp | ‘min’ = metric_value, ‘max’ = metric_value, ‘avg’ = metric_value
HourlyGroupMetrics | MetricID_NodeID_GroupID(_GroupBucketID)_timestamp | ‘min’ = metric_value, ‘max’ = metric_value, ‘avg’ = metric_value
HourlyMeterMetrics | MetricID_NodeID_GroupID(_MeterID)_timestamp | ‘min’ = metric_value, ‘max’ = metric_value, ‘avg’ = metric_value

− For performance reasons, we design multiple aggregation tables with different granularities.
− Aggregation tables with other granularities will have a schema similar to the one displayed above.

HBase TSDR Data Model

− The TSDR HBase Plugin converts the generic TSDR data model into the HBase-specific data model based on the HBase schema design.
− The TSDR HBase Plugin leverages this HBase-specific data model to implement the generic TSDR Persistence APIs, including storage, query, purging, and aggregation, to complete the TSDR data services in HBase.
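
The row key layout and the hourly ‘min’/‘max’/‘avg’ cells described in the schema slides can be illustrated with a short sketch. This is not the plugin's implementation: the function names build_row_key and hourly_aggregate are hypothetical, the timestamp is assumed to be epoch seconds, and the example IDs are invented. No HBase client is used; the result is a plain dictionary shaped like the rows of an Hourly table.

```python
from collections import defaultdict


def build_row_key(metric_id, node_id, table_id, timestamp, flow_id=None):
    """Compose MetricID_NodeID_TableID(_FlowID)_timestamp (hypothetical helper).

    Assumes the individual IDs contain no underscore, otherwise the key
    could not be split back into its parts unambiguously.
    """
    parts = [metric_id, node_id, table_id]
    if flow_id is not None:
        parts.append(flow_id)
    parts.append(str(timestamp))
    return "_".join(parts)


def hourly_aggregate(samples):
    """Group raw samples by hour and compute min, max, and avg per row key.

    samples: iterable of (metric_id, node_id, table_id, flow_id, epoch_seconds, value)
    """
    buckets = defaultdict(list)
    for metric_id, node_id, table_id, flow_id, ts, value in samples:
        hour_start = ts - ts % 3600  # truncate the timestamp to the start of its hour
        key = build_row_key(metric_id, node_id, table_id, hour_start, flow_id)
        buckets[key].append(value)
    return {
        key: {"min": min(vals), "max": max(vals), "avg": sum(vals) / len(vals)}
        for key, vals in buckets.items()
    }


if __name__ == "__main__":
    raw = [
        ("PacketCount", "openflow:1", "0", "flow7", 3600, 10),
        ("PacketCount", "openflow:1", "0", "flow7", 3700, 30),
        ("PacketCount", "openflow:1", "0", "flow7", 7200, 5),
    ]
    for row_key, cells in hourly_aggregate(raw).items():
        print(row_key, cells)
```

Because the timestamp is the last key component, all samples of one metric on one node, table, and flow share a key prefix, which is what would let an hourly roll-up read a contiguous range of raw rows.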

TSDR Scope in Lithium

In the Lithium release, we will focus on the following deliverables:

− Architectural framework
    − Functionality implementation as specified in the architectural design
        − TSDR Integrated Architecture
        − Data Collection
        − Data Storage
    − Deployment scenarios support
        − HBase on Hadoop single-node deployment scenario
− Data Model implementation
    − TSDR Data Model to support OpenFlow Statistics
    − HBase Data Model for the HBase Plugin implementation
− Data Type Support
    − OpenFlow Statistics
− Data Collection mechanisms
    − Implement the Pub/Sub collection mechanism
− Data Persistence Layer
    − Complete TSDR Persistence APIs with interface definition
− TSDR Plugin
    − HBase plugin as an example implementation
    − Focus on the storage API implementation in the HBase plugin to support the Data Storage Service in Lithium
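
To make the Data Persistence Layer deliverable more concrete, the sketch below shows one way a generic persistence interface and a plugin under it could be arranged. It is an illustration only: the class names TSDRPersistence and InMemoryPlugin, the method signatures, and the (row_key, value) record shape are all assumptions, and a dictionary stands in for a real data store such as HBase. The aggregation operations min(), max(), and avg() are left out.

```python
from abc import ABC, abstractmethod


class TSDRPersistence(ABC):
    """Hypothetical generic persistence interface; method names follow the API list."""

    @abstractmethod
    def save(self, records): ...

    @abstractmethod
    def find(self, keys): ...

    @abstractmethod
    def count(self): ...

    @abstractmethod
    def delete(self, keys=None): ...

    @abstractmethod
    def exists(self, keys): ...


class InMemoryPlugin(TSDRPersistence):
    """Stand-in data store plugin backed by a dict (assumption, not HBase)."""

    def __init__(self):
        self._rows = {}

    def save(self, records):
        # Accept either one (row_key, value) tuple or a list of them.
        if isinstance(records, tuple):
            records = [records]
        for key, value in records:
            self._rows[key] = value

    def find(self, keys):
        # Return only the rows that are present; missing keys are skipped.
        return {k: self._rows[k] for k in keys if k in self._rows}

    def count(self):
        return len(self._rows)

    def delete(self, keys=None):
        # With no keys given, clear the whole table.
        if keys is None:
            self._rows.clear()
        else:
            for k in keys:
                self._rows.pop(k, None)

    def exists(self, keys):
        return all(k in self._rows for k in keys)


if __name__ == "__main__":
    store = InMemoryPlugin()
    store.save([("PacketCount_openflow:1_0_3600", 10)])
    print(store.count(), store.exists(["PacketCount_openflow:1_0_3600"]))
```

Keeping the data services coded against the abstract interface is what would allow another plugin to be swapped in under the persistence layer without changing the services themselves.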