Big data! Smaller models?
Dr. Georgia Aifandopoulou
Research Director
Center for Research & Technology Hellas, Hellenic Institute of Transport
Tel: 2310 498457
Email: [email protected]
Web: www.hit.certh.gr
Decision making in Transport policy
Problem
Solution
???
Public administration
What to?
Consultant (internal or external)
Scenario 1
Scenario 2
Scenario 3
Scenario 4
What if?
Decision making in Transport policy
• What to?
▫ High-level experts
▫ Experience
▫ Best practices
• What if?
▫ Models
Decision making in Transport policy
• What to? Maybe we could ask the users…
???
Decision making in Transport policy
• What if? 4-step Transportation modeling …
4-step Transportation modeling
• Need for rationalizing something that is not always
rational (decision making of individuals)
• Need for explaining human behavior using simple
mathematical expressions / formulas
▫ Trip distribution: Gravity model
▫ Modal split: Logit model
▫ Traffic assignment: Wardrop equilibrium
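The first two formulas above can be sketched in a few lines of Python. This is a minimal illustration, not the calibration used in any particular model: the exponential deterrence function, the `beta` value, the zone totals, the cost matrix and the mode utilities are all assumed for the example.

```python
import math

def gravity_trips(productions, attractions, cost, beta=0.1):
    """Singly constrained gravity model.

    T_ij = P_i * A_j * f(c_ij) / sum_k(A_k * f(c_ik)), with the
    deterrence function f(c) = exp(-beta * c) (an assumption here).
    """
    trips = []
    for i, p in enumerate(productions):
        weights = [a * math.exp(-beta * cost[i][j])
                   for j, a in enumerate(attractions)]
        total = sum(weights)
        trips.append([p * w / total for w in weights])
    return trips

def logit_shares(utilities):
    """Multinomial logit: P_m = exp(V_m) / sum_k exp(V_k)."""
    top = max(utilities.values())  # subtract the max for numerical stability
    exps = {mode: math.exp(v - top) for mode, v in utilities.items()}
    total = sum(exps.values())
    return {mode: e / total for mode, e in exps.items()}

# Hypothetical two-zone example
productions = [100.0, 200.0]           # trips produced per zone
attractions = [150.0, 150.0]           # relative attractiveness per zone
cost = [[5.0, 15.0], [15.0, 5.0]]      # generalized cost between zones
trips = gravity_trips(productions, attractions, cost)

# Hypothetical systematic utilities per mode
shares = logit_shares({"car": -1.0, "bus": -1.5})
```

Each row of `trips` sums to the production of its origin zone, and the mode shares sum to one. Traffic assignment (Wardrop equilibrium) needs an iterative solver and is left out of this sketch.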
4-step Transportation modeling
• Inspiration in other sciences (hydraulics)
• High aggregation of data
▫ Zones (land use, population, employment)
▫ Traffic counts, average speeds
▫ Volume-delay functions (BPR)
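As an illustration of the volume-delay functions mentioned above, the BPR curve gives link travel time as t = t0 * (1 + alpha * (v / c)^beta). The sketch below uses the commonly quoted defaults alpha = 0.15 and beta = 4; the link values in the example are hypothetical.

```python
def bpr_travel_time(free_flow_time, volume, capacity, alpha=0.15, beta=4.0):
    """BPR volume-delay function: t = t0 * (1 + alpha * (v / c) ** beta)."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return free_flow_time * (1.0 + alpha * (volume / capacity) ** beta)

# Hypothetical link: 10 minutes at free flow, capacity 1000 veh/h
empty_link = bpr_travel_time(10.0, 0.0, 1000.0)
at_capacity = bpr_travel_time(10.0, 1000.0, 1000.0)
over_capacity = bpr_travel_time(10.0, 1500.0, 1000.0)
```

With the default parameters, travel time at capacity is 15% above free flow and grows steeply beyond it.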
4-step Transportation modeling
• Since the mathematical framework was already
defined and delimited, the major problem was
how to collect the required data.
▫ Time constraints
▫ Limited resources
▫ Computation limitations
▫ When to update?
• High dependence on experience of transport
modelers
• IT capabilities always lagged behind the needs of transport modelers
The era of Data
• Nowadays the problem is not how to collect data,
but how to select the right datasets, how to clean
the data, how to process it...
▫ New keyword: Big Data
▫ New specialization: Data Scientists
• Now transport modeling is behind IT capabilities
• Work “intrusionism” (IT in all domains)
• Less theory, more application
• The end of theory?
Big Data
Linked Data
New Data Sources
• New datasets
▫ Bluetooth detections (improved ANPR)
▫ Floating Car Data
• Disaggregated data
▫ We have distributions, not average values
▫ Higher complexity at the back end (noisy data
must be handled there)
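The difference between a distribution and an average value can be shown with a small sketch. The travel times below are invented for illustration; one device is assumed to have stopped on the way, which is the kind of noise disaggregated data brings to the back end.

```python
import statistics

def summarize(travel_times):
    """Return the mean, the median and the 90th percentile of a sample."""
    deciles = statistics.quantiles(travel_times, n=10)  # 9 cut points
    return {
        "mean": statistics.mean(travel_times),
        "median": statistics.median(travel_times),
        "p90": deciles[8],
    }

# Hypothetical travel times in seconds on one route; 300 s is an outlier
times = [60, 62, 65, 63, 61, 64, 300]
summary = summarize(times)
```

Here the median stays at 63 s while the single outlier pulls the mean above 90 s, so an aggregated average alone would misrepresent typical conditions.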
New Data Sources
• Disaggregated data at the back end
▫ New capabilities (+)
▫ More processing (-)
• IT can program anything, but transport modelling
expertise still has to be taken into account
▫ Examples of data misuse
▫ Data can show what, but it cannot explain why
New Data Sources in Thessaloniki
• Stationary sensor network: point-to-point tracking of MAC IDs
along the network through 43 Bluetooth device detectors.
• Dynamic sensor fleet: Floating Car Data provided in real time
by professional fleets comprising 1,200 taxis and 600 buses.
• Social media (geolocated tweets & Facebook check-in events)
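Point-to-point tracking of MAC IDs can be sketched as matching detections of the same device at two detectors and taking the time difference. This is a simplified illustration, not the processing used in Thessaloniki: the MAC IDs, timestamps and the `max_seconds` filter are hypothetical.

```python
from datetime import datetime

def travel_times(upstream, downstream, max_seconds=3600):
    """Match (mac_id, timestamp) detections between two detectors.

    Returns {mac_id: travel time in seconds}. Matches that are negative
    or longer than max_seconds (an assumed filter) are discarded.
    """
    first_seen = {}
    for mac, ts in upstream:
        # keep the earliest upstream detection of each device
        if mac not in first_seen or ts < first_seen[mac]:
            first_seen[mac] = ts
    result = {}
    for mac, ts in downstream:
        start = first_seen.get(mac)
        if start is None:
            continue  # device never seen upstream
        seconds = (ts - start).total_seconds()
        if 0 < seconds <= max_seconds:
            # keep the shortest valid match per device
            result[mac] = min(seconds, result.get(mac, seconds))
    return result

# Hypothetical detections at two detectors
upstream = [
    ("aa:01", datetime(2016, 2, 22, 8, 0, 0)),
    ("bb:02", datetime(2016, 2, 22, 8, 1, 0)),
    ("cc:03", datetime(2016, 2, 22, 8, 2, 0)),
]
downstream = [
    ("aa:01", datetime(2016, 2, 22, 8, 5, 0)),
    ("bb:02", datetime(2016, 2, 22, 8, 7, 30)),
    ("dd:04", datetime(2016, 2, 22, 8, 6, 0)),
]
matched = travel_times(upstream, downstream)
```

Only devices seen at both detectors produce a travel time, so `cc:03` and `dd:04` are dropped. A production pipeline would also need outlier filtering and anonymization of the MAC IDs.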
Static sensors network
Dynamic sensors fleet
Social media
[Figure: three time-series panels titled BAR, CAFE and NIGHTLIFE; horizontal axis: date, 2/22/2016 to 3/2/2016]
THE HIT PORTAL
- State-of-the-art platform for data and
software related to the Transport sector
in Greece
- Content aggregator, observatory,
research infrastructure, services
- Data collection and processing
- Data standardization and certification
- Data management
- Data analytics
- Data merging / enrichment
- Optimization and routing algorithms
- Data analytics and visualization
- Transport modeling and simulation
- Commercial and research studies
Data
Algorithm
Specialized transport software
HIT PORTAL – functional architecture
HIT PORTAL becomes “Big Data”
Volume
- Conventional detectors (15MB / day)
- Bluetooth detectors (50MB / day)
- Floating Car Data (200MB / day)
Velocity
- Historical data
- Real-time
- Near real time
Variety
- Bluetooth detectors
- Floating Car Data
- Social media
- Conventional detectors (loops, radars and cameras)
- Transport pilot of the Big Data Europe project
New modeling in Thessaloniki
Bluetooth sensors measuring travel time
Fully equipped areas and corridors (homogeneous coverage)
Mostly isolated measurements (more homogeneous coverage)
Floating Car Data measuring instantaneous speed
?
Isolated measurements
No coverage
Fully equipped areas/corridors (heterogeneous coverage)
Conventional sensors (loops, radars and cameras) measuring traffic flow and speed
Conclusions
• Operational (light) models are mostly based on real-time
data for the provision of information services
▫ Data intensive (high processing capabilities)
▫ Less mathematics, fewer algorithms (low processing time)
• Strategic (heavy) models are mostly based on
traditional 4-step modeling for answering “what if”
questions
▫ No use of innovative data sources
▫ Not able to answer “what to” questions
• Both models should converge into a single model
▫ Using innovative data sources (ICT - IT)
▫ Mathematical and physical framework (transport
engineering)
Thank you for your attention