ESSnet DWH

ESSnet DWH - Metadata in the S-DWH
Harry Goossens – Statistics Netherlands
Head Data Service Centre / ESSnet Coordinator
Questionnaire stocktaking
No NSI answers ‘YES’ to all four of these questions:
- Do you have a single coherent system which covers most of your data in the production of business statistics?
- Is your metadata currently integrated into your data systems?
- Is your data input for current needs integrated into your data systems?
- Are your current output requirements integrated into your data systems?
→ No NSI has a finished DWH and metadata system
Conclusion Stocktaking
Overall daily practice:
- All NSIs find metadata (highly) important
- Mostly NO metadata systems operational (yet some in development)
- Most NSIs struggle with metadata
- Often a capacity problem, ‘extra work’
→ Need for guidance on metadata
Metadata definitions
Data & Metadata
- Data are qualitative or quantitative information collected through observation
- Metadata are data about / describing data

Statistical Data & Metadata
- Statistical data are data from surveys and/or administrative sources, used to produce statistics
- Statistical metadata are data about / describing statistical data, or better: about STATISTICS
Metadata for a DWH
Technical metadata
- Structural information: how to physically find and use logical data
- Process descriptions: how data flows in the DWH
- Authentication rules: who may do what?
Business metadata
- Definitions and descriptions: help the end-user interpret and evaluate the data
Metadata for statistics production
Structural metadata
- Act as identifiers and descriptors of the data: identify, use, and process data matrixes and data cubes
- Examples: names of databases, columns, dimensions
Reference metadata
- Describe the contents and the quality of the data: include conceptual, methodological and quality metadata
- Examples: algorithms, definitions, Q-indicators
Source: METIS
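
The split between structural and reference metadata can be illustrated with a minimal Python sketch. It is not part of the METIS material: the class names, field names and the example dataset (database `sdwh`, table `turnover`, the response rate) are all hypothetical and chosen only to make the distinction concrete.

```python
from dataclasses import dataclass, field


@dataclass
class StructuralMetadata:
    """Identifiers and descriptors: where the data live and how they are shaped."""
    database: str
    table: str
    columns: list[str]
    dimensions: list[str] = field(default_factory=list)


@dataclass
class ReferenceMetadata:
    """Contents and quality: concepts, methods and quality indicators."""
    definition: str
    method: str
    quality_indicators: dict[str, float] = field(default_factory=dict)


@dataclass
class Dataset:
    name: str
    structural: StructuralMetadata
    reference: ReferenceMetadata


def locate(dataset: Dataset) -> str:
    """Use structural metadata to find the data physically."""
    return f"{dataset.structural.database}.{dataset.structural.table}"


def describe(dataset: Dataset) -> str:
    """Use reference metadata to help a user interpret the data."""
    return f"{dataset.name}: {dataset.reference.definition} ({dataset.reference.method})"


# Illustrative values only; none of these come from a real S-DWH.
turnover = Dataset(
    name="turnover_2012",
    structural=StructuralMetadata(
        database="sdwh",
        table="turnover",
        columns=["unit_id", "nace", "period", "turnover"],
        dimensions=["nace", "period"],
    ),
    reference=ReferenceMetadata(
        definition="Net turnover of the enterprise, excluding VAT",
        method="Survey data combined with an administrative register",
        quality_indicators={"response_rate": 0.87},
    ),
)

print(locate(turnover))
print(describe(turnover))
```

In this sketch a machine would mainly consume `StructuralMetadata`, while an end-user would mainly read `ReferenceMetadata`.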
Metadata categories
A metadata item is either
- Structural (technical) or Reference (business)
Other mutually exclusive categories:
- active / passive
- structured / free-form
- standardised / non-standardised
- centralised / local
The Statistical DWH
[Diagram: Data Warehouse + Statistics production = Statistical Data Warehouse]
A central ‘statistical data store’ for managing all available data of interest, regardless of its source, enabling the NSI to:
- produce necessary information (= statistics!)
- (re)use available data to create new data / new outputs
- execute analysis and perform reporting
Metadata for a S-DWH
Emphasis / focus on:
- Active, Structural and Structured metadata
- Reference metadata (common to all statistics production)
and
- Process metadata: describe the expected or actual outcome of one or more processes using evaluable and operational metrics
- Quality metadata: source quality, methods used, usability/restrictions
- Tracing information: which surveys/registers contributed to a specific output?
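
The tracing question can be sketched as a walk over lineage records. The example below is a minimal Python illustration under stated assumptions: the `LINEAGE` dictionary, its dataset names and the function `contributing_sources` are hypothetical, and a real S-DWH would read such records from its metadata repository.

```python
# Hypothetical lineage records: each dataset maps to the datasets it was derived from.
LINEAGE = {
    "sbs_output": ["turnover_integrated", "employment_integrated"],
    "turnover_integrated": ["sbs_survey", "vat_register"],
    "employment_integrated": ["sbs_survey", "social_security_register"],
}


def contributing_sources(output: str, lineage: dict[str, list[str]]) -> set[str]:
    """Return the original sources (datasets with no recorded inputs) behind an output."""
    sources: set[str] = set()
    seen: set[str] = set()
    stack = [output]
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # already visited; also guards against cycles in the records
        seen.add(node)
        inputs = lineage.get(node, [])
        if inputs:
            stack.extend(inputs)
        else:
            sources.add(node)  # nothing upstream is recorded: treat as a source
    return sources


print(sorted(contributing_sources("sbs_output", LINEAGE)))
```

A dataset with no lineage record at all is reported as its own source, so gaps in the tracing metadata show up in the result rather than being hidden.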
Metadata standards in a S-DWH

- What should be standardised? Contents, formats, repository, software
- Which level of standards should be used? International/Eurostat, National/NSI, DWH internal
- How should a standard be interpreted? Complete adherence, compatible
- How strictly should adherence be required? Mandatory, recommended
- Should some components be prioritised? Big bang, evolution
Metadata Quality



The more data, the more need for metadata
The S-DWH contains lots of data, making it dependent on its metadata
Correct, high-quality metadata are vital for its use and for metadata governance:
- No metadata → useless data
- Bad metadata → misused data
- Good metadata → useful data
Metadata as a design tool

Metadata is a complex issue and central to the concept and implementation of the data warehouse.
The project needs to consider how guidance can be given to ensure that metadata systems allow all the gains of the S-DWH to be exploited effectively.

Metadata has a role to play in the abstract design process, independent of any specific structure.
The S-DWH model has implications for the way metadata is collected, transmitted and used. If so, the process design could be determined entirely from the metadata requirements, providing automatic consistency between technical architecture and metadata needs.
SGA II: WP 1 - Metadata
Fitting the S-DWH into current metadata models and standards:
- Building a framework which defines metadata requirements and roles in the S-DWH context
- Study on the use of metadata models and standards: define the various functionalities of a metadata system to facilitate and support the operation of the S-DWH
- Provide recommendations and guidelines on the governance of metadata management in the S-DWH
→ Keep it manageable & practical!