Contrasting typical SW and DB
approaches to semantic integration
Arnon Rosenthal
© 2008 The MITRE Corporation. All rights reserved.
Two versions of a common
problem
• Schema matching ≈ Align classes/properties in ontology
Two meta-models, similar core problem
• Start with either:
– Two domain models
– Two schemas (for systems)
– One domain model and one schema
• Goal: Identify the relationships between
– their concepts
– their instance sets
– same as, IS-A, “usable for” seem the main ones helpful to a
“customer”
• May need to transform to make things match
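One way to picture the goal above is as a small table of typed correspondences between two models. The sketch below is only an illustration: the class names, the schema and concept identifiers (`hr.EMP`, `Person.Employee`, and so on), and the `targets_for` helper are hypothetical and do not come from any particular tool.

```python
from dataclasses import dataclass
from enum import Enum


class Relation(Enum):
    """The three relationship kinds named on the slide."""
    SAME_AS = "same as"
    IS_A = "IS-A"
    USABLE_FOR = "usable for"


@dataclass(frozen=True)
class Correspondence:
    source: str          # concept in the first model or schema
    target: str          # concept in the second model or schema
    relation: Relation
    transform: str = ""  # optional note on a transformation needed to match


# Hypothetical example: a personnel schema aligned with a domain model.
correspondences = [
    Correspondence("hr.EMP", "Person.Employee", Relation.SAME_AS),
    Correspondence("hr.MANAGER", "Person.Employee", Relation.IS_A),
    Correspondence("hr.EMP.salary_usd", "Employee.annualPay", Relation.USABLE_FOR,
                   transform="convert currency if the target is not USD"),
]


def targets_for(source, items=correspondences):
    """Return (target, relation) pairs recorded for one source concept."""
    return [(c.target, c.relation) for c in items if c.source == source]


print(targets_for("hr.MANAGER"))
```

The same structure would serve whether the inputs are two domain models, two schemas, or one of each, since only the concept identifiers change.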
Decades have elapsed!
• Database side: Survey of schema matching
research (Batini et al., 1986)
– Target schema may be constructed from inputs
– Envisioned end product is a SQL view
– Focus is on “where can we find clues”
• Sem-Web precursors: ISI, MCC – domain model
(in logic) plus articulation axioms
– Constraints are within the logic
– Reasoning-based. Each project had its own formalism
Obvious question: Why no robust products yet?
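To make "the envisioned end product is a SQL view" concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The two source tables, their columns, and the `employee` view are invented for illustration; a real integration would start from existing schemas.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Two hypothetical source tables with different conventions.
    CREATE TABLE staff_a (emp_id INTEGER, full_name TEXT, salary_usd REAL);
    CREATE TABLE staff_b (id INTEGER, last TEXT, first TEXT, salary_kusd REAL);

    INSERT INTO staff_a VALUES (1, 'Ada Lovelace', 120000.0);
    INSERT INTO staff_b VALUES (7, 'Codd', 'Edgar', 95.0);

    -- The mapping itself: a view presenting both sources in one target shape.
    CREATE VIEW employee AS
        SELECT 'A' AS source, emp_id AS id, full_name AS name, salary_usd
        FROM staff_a
        UNION ALL
        SELECT 'B', id, first || ' ' || last, salary_kusd * 1000
        FROM staff_b;
""")

rows = conn.execute(
    "SELECT source, id, name, salary_usd FROM employee ORDER BY source"
).fetchall()
for row in rows:
    print(row)
```

The view carries both the matching (which columns correspond) and the transformations (name concatenation, unit conversion), which is why it can grow large and hard to edit.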
Leaping ahead to my conclusions
(SW competitor)
• For enterprise systems today, lean toward
DB and XML tools,
unless you really exploit ontologies’
greater expressive power (value
taxonomies, IS-A)
• Maturing sem-web environments will (by
definition) import knowledge from big data
integration products
Correspondence topology
• Direct approaches
• Neutral form approaches (can be multiple)
[Diagram: Domain model]
Emerging work – not associated
with systems
• Multiple intermediaries
– Which to use when creating? describing?
[Diagram: Domain model 1, Domain model 2, Domain model 3]
Compare typical DB vs. AI approaches (1)
Formalisms to describe concepts & relationships
DB (Schema)
• Basic unit: relation or tree scheme
– Record is a good chunk for storage or display
– Sets are present, implicit
• Describe a system or a physical message
AI (Ontology)
• Basic unit: atomic concept (object or property)
– Small chunks easy to relate & reuse
• Describe a domain model
– Robust for multiple uses
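The difference in basic units can be sketched by writing the same fact both ways: once as a record and once as a set of small subject-property-value statements. The `EmployeeRecord` type, the `emp:` prefix, and the `to_triples` helper are hypothetical names chosen for this illustration.

```python
from typing import NamedTuple


class EmployeeRecord(NamedTuple):
    """DB-style unit: one record, convenient to store or display whole."""
    emp_id: int
    name: str
    dept: str


record = EmployeeRecord(emp_id=1, name="Ada", dept="R&D")


def to_triples(rec, subject_prefix="emp:"):
    """Ontology-style units: one small statement per concept or property."""
    subject = f"{subject_prefix}{rec.emp_id}"
    triples = [(subject, "rdf:type", "Employee")]
    for field in ("name", "dept"):
        triples.append((subject, field, getattr(rec, field)))
    return triples


for triple in to_triples(record):
    print(triple)
```

Each triple can be related to another model on its own, while the record has to be mapped as a unit.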
Compare typical DB vs. AI approaches (2)
Formalisms to describe concepts & relationships
DB (Schema)
• Direct relationships and flows between systems
– Instant gratification (funding is usually for an application, not for integration)
– Differences in real data lead to improved definitions
• Tools examine the data
• $billion industry
– feature-rich, scalable tools
AI (Ontology)
• Relate via neutral defns
– Reuse is easier
– Will administrators understand “foreign” or abstract concept defns?
Compare typical AI vs. DB
approaches (5)
DB
• Homegrown logic
– Even simple Datalogs
won’t interoperate
– extensible
• Mappings are in
popular query
languages
– Efficient: parallel, query
optimizers
– Deployable e.g., change
management
AI
• OWL has both
theory and tool
communities
– extensible
• Execute by
inference engine?
– Not tuned to query
processor strengths
Compare typical AI vs. DB
approaches (3)
DB
AI
• Relationships among
sets, via {informal or
formal logic assrtns.} or
query language
• Relationships among
concepts: “Usable_for”
• Rel’ships use formalism
very similar to ontologies
– More powerful (data
exchange logicians)
– Terminology: TGD (tuple-generating dependency) = ∀∃
– View defns are big: hard
to edit (and to reuse)
– IS-A is “native”, i.e., part of
the regular model
– IS-A logically merges the
ontologies
– OWL is insufficient,
rule languages overkill
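For readers unfamiliar with the TGD terminology above, a tuple-generating dependency has the following general shape in the data exchange literature, where φ and ψ are conjunctions of relational atoms. The second formula is a hypothetical instance (the relations Emp and Dept are invented) stating that every department mentioned in Emp has some manager in Dept.

```latex
\[
  \forall \bar{x}\, \bigl( \varphi(\bar{x}) \rightarrow \exists \bar{y}\; \psi(\bar{x}, \bar{y}) \bigr)
\]
\[
  \forall n\, \forall d\, \bigl( \mathrm{Emp}(n, d) \rightarrow \exists m\; \mathrm{Dept}(d, m) \bigr)
\]
```

When ψ introduces no existential variables, the dependency reduces to a plain containment, which is close to what an IS-A or sub-property statement expresses.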
Compare typical AI vs. DB
approaches (4)
DB
• Exchange semantics
– examined from user
viewpoint, precise
– Hard to learn or
communicate
– Discards tuples
unnecessarily?
AI
• Exchange semantics:
Whatever my engine
infers !!!
– Is this tolerable? Why
(not)?
How can they combine?
• Formalism:
– OWL ontologies
– Need a standard construct for “can be used for”
super-property ≈ tuple-generating dependencies
• Direct or via neutral model:
– Mix and match, share info and infer over both
• Execution environment: DBMS
– Parallel, query optimization, deployment
– Already bilingual (SQL, XML), add RDF when it
reaches critical mass ($Bs)
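As a rough sketch of running ontology-style knowledge inside a DBMS, the example below stores IS-A edges in a table and lets a recursive SQL query do the subclass reasoning. It uses Python's built-in `sqlite3` module; the tables `is_a` and `instance` and all of their contents are hypothetical, and this is one possible encoding rather than an established method.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical IS-A edges taken from an ontology.
    CREATE TABLE is_a (child TEXT, parent TEXT);
    INSERT INTO is_a VALUES ('Manager', 'Employee'), ('Employee', 'Person');

    -- Hypothetical instance data, typed by its most specific class.
    CREATE TABLE instance (name TEXT, class TEXT);
    INSERT INTO instance VALUES
        ('Ada', 'Manager'), ('Edgar', 'Employee'), ('Grace', 'Person');
""")

# Recursive query: collect the root class and all of its descendants,
# then return every instance typed by one of those classes.
QUERY = """
    WITH RECURSIVE sub(class) AS (
        SELECT :root
        UNION
        SELECT is_a.child FROM is_a JOIN sub ON is_a.parent = sub.class
    )
    SELECT name FROM instance
    WHERE class IN (SELECT class FROM sub)
    ORDER BY name
"""

employees = [r[0] for r in conn.execute(QUERY, {"root": "Employee"})]
print(employees)
```

Here the query processor, not a separate inference engine, evaluates the IS-A hierarchy, so the usual optimization and deployment machinery of the DBMS applies.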
Why “Alignment” research is
hard to transfer
• Conspicuous lack of widely-used products,
from either community
• Aligners/matchers automate some work of
an integration engineer, but can’t solve 90+%
of a major “customer” problem
– Without a robust mediator, there ain’t no
customer!
Lesson: Touch the end users, downstream
(someone outside the IT dept)
• 95% reduction in their work as schemas evolve
• Generate code for end users 80% faster
Summary
• Two communities addressing similar
problems
– More standards, cleaner formalisms on
the Semantic Web side
– More pragmatics and richer suites on the
DB side
• Largely formalism independent, could be
imported, esp. “Instant gratification”