NTTS 2017 presentation

Highlighting the added value of Statistical Linked Open Data
Monica Scannapieco, Raffaella Aracri, Andrea Pagano, Paolo Pizzo, Laura Tosco, Luca Valentino (Istat)
Giovanni Corcione (Oracle)
Istat’s Linked Open Data Portal
As of today, the LOD portal of ISTAT gives access to about 900 million RDF triples.
Traffic for 2016: 24,000 unique visitors and 750,000 hits.
Istat LOD Portal: http://datiopen.istat.it
English Version: http://datiopen.istat.it/index.php?language=eng
Highlighting the added value of Statistical Linked Open Data, Monica Scannapieco – Brussels, NTTS, 14-16 March 2017
Big investment in ontology modeling
Ontologies:
TERRITORY: describes the administrative and geographical features of the Italian territory
  Statistical, administrative and geographical boundaries: Regions, Provinces, Municipalities, Localities, Census sections
  Special areas
  Special units: Abbeys, Hospitals
POPULATION: describes the measures and dimensions of the indicators with respect to people
  Measures
  Dimensions: Sex, Age, Marital status
DWELLINGS: describes the measures and dimensions of the indicators with respect to dwellings
  Measures
  Dimensions
HOUSEHOLDS: describes the measures and dimensions of the indicators with respect to households
  Measures
  Dimensions
IT Architecture of the Portal
Open Source
Oracle 12C
First scenario: use case
A bookseller wants to open a new international bookshop.
While inspecting an available location, he wants to carry out a market
analysis of the potential customers resident in the areas adjacent to the
possible location of the store, broken down by age, country of origin,
educational level and employment status.
First scenario: workflow
(1) An app on the bookseller's smartphone detects the local GPS coordinates.
(2) It sends this information to the datiopen.istat.it SPARQL endpoint, which allows Istat's triple store to be queried over HTTP.
(3) The endpoint returns the required information.
(4) The data are visualized on the smartphone.
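Step (2) can be sketched as follows. This is a minimal illustration, assuming the endpoint accepts the standard SPARQL 1.1 Protocol form (a GET request with a `query` parameter); the endpoint path `/sparql` and the function name `build_sparql_request` are assumptions made for the example, since the slides only name the portal host.

```python
import urllib.parse
import urllib.request

# Assumption: the "/sparql" path is hypothetical; the slides only name the portal host.
SPARQL_ENDPOINT = "http://datiopen.istat.it/sparql"


def build_sparql_request(query: str, endpoint: str = SPARQL_ENDPOINT) -> urllib.request.Request:
    """Build (but do not send) an HTTP GET request that carries a SPARQL query."""
    # The SPARQL 1.1 Protocol passes the query text in the "query" URL parameter.
    params = urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        f"{endpoint}?{params}",
        # Ask for the standard JSON serialization of SPARQL results.
        headers={"Accept": "application/sparql-results+json"},
    )


request = build_sparql_request("SELECT * WHERE { ?s ?p ?o } LIMIT 1")
print(request.full_url)
```

Passing the request to `urllib.request.urlopen` would actually send it; the sketch stops before that so it can be read and run without network access.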
First scenario: SPARQL query
Query annotations: one part selects population indicators by census section; another identifies the census section nearest to the detected position.
• A GeoSPARQL query retrieves the results.
• The GeoSPARQL query identifies the census sections nearest to the detected position and returns, for each of them, the related WKT geometry and the resident population according to the specified profile.
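The query shown on the slide is not reproduced in this text, so the following is only a sketch of what a nearest-census-section query of this kind could look like. The `geo:`, `geof:` and `uom:` namespaces are the standard GeoSPARQL ones; the `ex:` namespace, its class and property names, and the helper names are hypothetical placeholders, not Istat's actual ontology. The profile filters (age, country of origin and so on) are left out.

```python
from string import Template

# geo:, geof: and uom: are standard GeoSPARQL namespaces.
# ex: and its terms are hypothetical placeholders, NOT Istat's real vocabulary.
GEOSPARQL_TEMPLATE = Template("""\
PREFIX geo:  <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
PREFIX uom:  <http://www.opengis.net/def/uom/OGC/1.0/>
PREFIX ex:   <http://example.org/census#>

SELECT ?section ?wkt ?population ?distance
WHERE {
  ?section a ex:CensusSection ;
           ex:residentPopulation ?population ;
           geo:hasGeometry ?geometry .
  ?geometry geo:asWKT ?wkt .
  BIND(geof:distance(?wkt, "$point"^^geo:wktLiteral, uom:metre) AS ?distance)
}
ORDER BY ?distance
LIMIT $limit
""")


def point_wkt(longitude: float, latitude: float) -> str:
    """Format a position as a WKT point (longitude first, the GeoSPARQL default order)."""
    return f"POINT({longitude} {latitude})"


def nearest_sections_query(longitude: float, latitude: float, limit: int = 5) -> str:
    """Return a query for the `limit` census sections closest to the given position."""
    if not (-180 <= longitude <= 180 and -90 <= latitude <= 90):
        raise ValueError("coordinates out of range")
    return GEOSPARQL_TEMPLATE.substitute(
        point=point_wkt(longitude, latitude), limit=int(limit)
    )


# Arbitrary example position, standing in for the coordinates detected by the app.
query = nearest_sections_query(12.4964, 41.9028, limit=3)
print(query)
```

Ordering by distance and limiting the result is one simple way to express "nearest"; a production query might instead filter by a maximum distance or rely on a spatial index offered by the triple store.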
First scenario: result
Example of visualization of the result on a smartphone.
Second scenario: use case
The head of a technical office in an Italian province has to analyse
the state of degradation of the buildings in the province's
municipalities in relation to land use.
Second Scenario: Federated Querying
With LOD, it is very easy to carry out analyses that compare data coming from different sources (linked, for example, at the territorial level).
Federated query on ISTAT and ISPRA (Italian National Institute for Environmental Protection and Research): a query submitted to one portal (ISTAT) dynamically retrieves results from both portals.
Second scenario: workflow
(1) The application builds a federated query across the data published by ISTAT and the data published by ISPRA.
(2) Results are retrieved from both triple stores (the data have been linked at the municipality level).
(3) The data obtained can be visualized on a chart.
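What "linked at the municipality level" amounts to in step (2) can be illustrated with a small join. In a real federated query the SPARQL engine performs this join itself; the sketch below only mimics it in Python. The function name, the field names and every value in the sample rows are made up for illustration and are not real ISTAT or ISPRA data.

```python
def join_by_municipality(istat_rows, ispra_rows, key="municipality"):
    """Pair rows from the two sources that refer to the same municipality identifier."""
    # Index the remote rows by the shared identifier for constant-time lookup.
    ispra_by_key = {row[key]: row for row in ispra_rows}
    joined = []
    for row in istat_rows:
        match = ispra_by_key.get(row[key])
        if match is not None:  # keep only municipalities present in both sources
            joined.append({**match, **row})
    return joined


# Placeholder rows: identifiers and numbers are invented for the example.
istat_rows = [
    {"municipality": "http://example.org/municipality/A", "degraded_buildings": 10},
    {"municipality": "http://example.org/municipality/B", "degraded_buildings": 4},
]
ispra_rows = [
    {"municipality": "http://example.org/municipality/A", "land_use_percent": 12.5},
]

for row in join_by_municipality(istat_rows, ispra_rows):
    print(row)
```

The join is only possible because both datasets identify a municipality with the same resource; that shared identifier is what the linking step provides.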
Second scenario: SPARQL query
Query annotations: query to the local endpoint (ISTAT); query to the remote endpoint (ISPRA).
For each municipality in the province, the federated query selects the name, the cadastral code, the resident population and an indicator of the number of buildings in a bad state of preservation (from the Istat triple store), together with a land-use indicator expressed as a percentage (from the ISPRA triple store).
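The slide's query is not reproduced in this text. As a hedged sketch of the general shape, SPARQL 1.1 Federated Query delegates part of a query to a remote endpoint with the `SERVICE` keyword. In the example below the remote endpoint URL, the `ex:` namespace, all property names and the helper `build_federated_query` are hypothetical placeholders, not the vocabularies actually used by ISTAT or ISPRA.

```python
import textwrap
from typing import Sequence

# Hypothetical namespace used only to keep the example readable.
PREFIXES = "PREFIX ex: <http://example.org/terms#>\n"


def build_federated_query(
    variables: Sequence[str],
    local_pattern: str,
    remote_endpoint: str,
    remote_pattern: str,
) -> str:
    """Combine a local graph pattern with a remote one wrapped in a SERVICE clause."""
    select_clause = " ".join(f"?{name}" for name in variables)
    local_block = textwrap.indent(textwrap.dedent(local_pattern).strip(), "  ")
    remote_block = textwrap.indent(textwrap.dedent(remote_pattern).strip(), "    ")
    return (
        PREFIXES
        + f"SELECT {select_clause}\n"
        + "WHERE {\n"
        + f"{local_block}\n"
        + f"  SERVICE <{remote_endpoint}> {{\n"
        + f"{remote_block}\n"
        + "  }\n"
        + "}\n"
    )


query = build_federated_query(
    variables=["name", "code", "population", "degraded", "landUse"],
    # Evaluated by the local endpoint (ISTAT in the scenario); terms are placeholders.
    local_pattern="""
        ?municipality ex:name ?name ;
                      ex:cadastralCode ?code ;
                      ex:residentPopulation ?population ;
                      ex:degradedBuildings ?degraded .
    """,
    # Hypothetical URL standing in for the remote endpoint (ISPRA in the scenario).
    remote_endpoint="http://example.org/ispra/sparql",
    remote_pattern="""
        ?municipality ex:landUsePercent ?landUse .
    """,
)
print(query)
```

Because `?municipality` appears in both patterns, the engine joins local and remote results on it, which matches the linking at municipality level described in the workflow.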
Second scenario: result
Example of the resulting chart.
The municipalities are represented by their cadastral code.
The details of all the information retrieved for a single municipality appear when hovering over its point.
Conclusions
• The next planned release is the National Italian Registry of Addresses (with civic numbers), recognized as a priority also by the Italian Agency for IT in Public Administration.
• The advanced services described in the use cases could also be particularly well suited to these upcoming releases.
• A dissemination strategy based on open data puts the users of Official Statistics at the center by:
  - Reaching them through different channels (e.g. apps)
  - Making it easier for them to retrieve data (e.g. federated queries that make the distribution of data across different portals transparent)
  - Providing richer services to them (e.g. spatial querying and dynamic visualizations)