Diploma Thesis Presentation
Normal Forms for Non-Relational Data
Technische Universität Wien
Institut für Informationssysteme
Research Group: Database and Artificial Intelligence
Supervisor: Univ.Prof. Mag. Dr. Reinhard Pichler
Master's Program: Computational Intelligence
Wolfgang Fischl
Existing Normal Forms For Relational Data
Motivation
The same information can be stored in different data models
Figure: Information on conference articles stored in different data models.
(a) Relational Table
proc    article  year
PODS13  NF       2013
PODS13  DLNF     2013
(b) XML Tree
p1 : proc (@name = PODS13)
Definition. A relational schema R with a set of FDs Σ over R is
in Boyce-Codd Normal Form iff for every nontrivial FD X → Y
implied by Σ, X is a superkey, i.e. X → U. [3]
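The BCNF condition above can be sketched in Python. This is a minimal illustration, not the thesis's implementation: the helper names `closure` and `bcnf_violations` are hypothetical, and the FD proc → year is an assumption made for the conference table of the figure (the poster does not state it). For a single schema it is enough to test the FDs listed in Σ rather than every implied FD.

```python
from typing import FrozenSet, Iterable, List, Tuple

FD = Tuple[FrozenSet[str], FrozenSet[str]]  # (X, Y) stands for X -> Y


def closure(attrs: Iterable[str], fds: List[FD]) -> FrozenSet[str]:
    """Attribute closure: everything functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def bcnf_violations(universe: FrozenSet[str], fds: List[FD]) -> List[FD]:
    """Given FDs that are nontrivial and whose left side is not a superkey."""
    bad = []
    for lhs, rhs in fds:
        nontrivial = not rhs <= lhs
        is_superkey = closure(lhs, fds) == universe
        if nontrivial and not is_superkey:
            bad.append((lhs, rhs))
    return bad


# Attributes of the conference table from the figure.
U = frozenset({"proc", "article", "year"})
# Assumed FD (not stated on the poster): a proceedings volume fixes the year.
SIGMA = [(frozenset({"proc"}), frozenset({"year"}))]

violations = bcnf_violations(U, SIGMA)
print(violations)
```

Under the assumed FD, `proc` alone does not determine `article`, so it is not a superkey and the table is reported as violating BCNF. This matches the redundancy visible in the figure, where the year 2013 is stored once per article.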
XML Tree (b), children of p1 : proc:
a1 : article (@name = NF, @year = 2013)
a2 : article (@name = DLNF, @year = 2013)
Existing Normal Forms For XML Data [1]
(c) Graph Database
p1 : proc, name PODS13
p1 has_article a1 : article, name NF, appeared 2013
p1 has_article a2 : article, name DLNF, appeared 2013
A relational table R has attributes U := {A1, ..., An}
A functional dependency (FD) over R is an expression X → A, where X ⊆ U and A ∈ U
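To make the notion of an FD concrete, the following sketch checks whether a table instance satisfies X → A. The function name `satisfies` is hypothetical; the rows are those of the relational table in the figure, and the FDs tested are illustrative assumptions rather than constraints given on the poster.

```python
def satisfies(rows, lhs, rhs):
    """True if no two rows agree on all attributes in lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in sorted(lhs))
        value = row[rhs]
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Rows of the relational table shown in the figure.
ROWS = [
    {"proc": "PODS13", "article": "NF", "year": 2013},
    {"proc": "PODS13", "article": "DLNF", "year": 2013},
]

print(satisfies(ROWS, {"proc"}, "year"))     # proc -> year
print(satisfies(ROWS, {"proc"}, "article"))  # proc -> article
```

On this instance, proc → year holds while proc → article does not, because the same proceedings contains two different articles.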
Database design is important: “How to maintain consistency?”
Avoid redundancies arising from badly designed models
Normal Forms define well-designed models
Goal
An XML tree consists of nested elements of different type
Associated to each element type is a set of attributes
The structure is given by a DTD D
A DTD D specifies the set of allowed paths, denoted paths(D)
An XML functional dependency (XFD) over D is an expression S → p, where S ⊆ paths(D) and p ∈ paths(D)
Definition. A DTD D and a set Σ of XFDs over D is in XML Normal Form (XNF) iff for every nontrivial XFD ϕ ∈ (D, Σ)⁺ of the form S → p.@l, it is the case that S → p is in (D, Σ)⁺.
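The set paths(D) used in these definitions can be enumerated for a small, non-recursive DTD. The sketch below is a simplification under stated assumptions: the DTD structure (`CHILDREN`, `ATTRIBUTES`) is read off the XML tree in the figure rather than given on the poster, text nodes are ignored, the function `paths` is a hypothetical helper, and the example XFD is an assumed constraint.

```python
# Assumed DTD structure, read off the XML tree in the figure.
CHILDREN = {"proc": ["article"], "article": []}
ATTRIBUTES = {"proc": ["@name"], "article": ["@name", "@year"]}


def paths(root, children, attributes):
    """All paths from the root of a non-recursive DTD, as dotted strings."""
    result = []
    stack = [[root]]
    while stack:
        prefix = stack.pop()
        result.append(".".join(prefix))
        element = prefix[-1]
        for attr in attributes.get(element, []):
            result.append(".".join(prefix + [attr]))
        for child in children.get(element, []):
            stack.append(prefix + [child])
    return sorted(result)


PATHS = paths("proc", CHILDREN, ATTRIBUTES)

# Assumed XFD S -> p.@l: all articles of one proceedings share the year.
S, TARGET = {"proc"}, "proc.article.@year"
well_formed = S <= set(PATHS) and TARGET in PATHS
print(PATHS, well_formed)
```

The assumed XFD has the form S → p.@l with p = proc.article. Since one proc element can have several article children, S → proc.article would not be expected to follow, which is the situation the XNF condition rules out.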
Theorem. Let R[U] be a relational schema and FD a set of FDs over R[U]. Let D_R be the DTD and Σ_FD the set of XFDs obtained by a direct mapping of relational schemas to XML.
Then (R[U], FD) is in BCNF iff (D_R, Σ_FD) is in XNF.
Is there a Normal Form for Graph Databases?
A New Normal Form for Graph Databases
The Description Logic DL-Lite_A [4]
DLs are well-suited as a formal model for graph databases
DL-Lite_A is expressive enough for conceptual models
A DL-Lite_A knowledge base consists of
  intensional assertions in a TBox T, i.e. the structure of the graph
  extensional assertions in an ABox A, i.e. knowledge about individuals
Path-based identification constraints (pIdCs) [2]
Express that one or more nodes uniquely determine another:
(id C π1, ..., πn), where π → S | D? | π ∘ π,
C and D are basic concepts, each πi is a path, and S is a role.
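A pIdC can be checked against a small graph as follows. This is a sketch under assumptions: it reads (id C π1, ..., πn) as "two distinct instances of C must not share a filler on every path πi", it supports only paths made of plain roles (no tests D? and no inverse roles), and the edge directions of the graph in the figure are assumed. The names `fillers` and `violating_pairs` are hypothetical.

```python
# Graph database from the figure; edge directions are an assumption.
EDGES = {
    ("p1", "name", "PODS13"),
    ("p1", "has_article", "a1"),
    ("p1", "has_article", "a2"),
    ("a1", "name", "NF"),
    ("a1", "appeared", 2013),
    ("a2", "name", "DLNF"),
    ("a2", "appeared", 2013),
}
CONCEPTS = {"proc": {"p1"}, "article": {"a1", "a2"}}


def fillers(start, path):
    """Nodes reached from start by following the roles of path in order."""
    current = {start}
    for role in path:
        current = {t for (s, r, t) in EDGES if r == role and s in current}
    return current


def violating_pairs(concept, path_list):
    """Distinct instances of concept that share a filler on every path."""
    members = sorted(CONCEPTS[concept])
    return [
        (a, b)
        for i, a in enumerate(members)
        for b in members[i + 1:]
        if all(fillers(a, p) & fillers(b, p) for p in path_list)
    ]


print(violating_pairs("article", [["name"]]))      # (id article name)
print(violating_pairs("article", [["appeared"]]))  # (id article appeared)
```

On this graph the name identifies an article, whereas the year does not, because a1 and a2 both appeared in 2013.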
Unfortunately, pIdCs do not semantically capture FDs:
Theorem. There is a set FD of FDs over a relational schema R[U] and a pair of relational instances I1 and I2 of R[U] such that there does not exist a set Σ of pIdCs for which the following holds:
If I1 ⊨ FD and I2 ⊭ FD, then M_I1 ⊨ Σ and M_I2 ⊭ Σ.
Description Logic Normal Form (DLNF)
subtrees(τ) are the subtrees of τ at depth 1
neighbors(τ) are the concepts appearing at depth 1 in τ
Definition. A DL-Lite_A TBox T and a set of tIdCs Φ over T is in k-Description Logic Normal Form (k-DLNF) if and only if
  for every nontrivial tIdC ϕ with depth(τi) ≥ k
  such that ⟨T, Φ⟩ ⊨ ϕ,
  it is the case that for each concept C ∈ neighbors(ϕ)
  it holds that ⟨T, Φ⟩ ⊨ (id C Π′(C)), where
  Π′(C) = {subtrees(τi) | neighbors(τi) = C ∧ depth(τi) > 1}.
Theorem. Let R[U] be a relational schema and FD a set of FDs over R[U]. Let T_R[U] be the TBox and Σ the set of tIdCs obtained by a direct mapping of relational schemas to DL-Lite_A KBs.
Then (R[U], FD) is in BCNF iff ⟨T_R[U], Σ⟩ is in 2-DLNF.
Tree-based identification constraints
Extend pIdCs with trees τ:
(id C τ1, ..., τn), where τ → π | π ∘ (τ, ..., τ)
tIdCs are evaluated over a pair (o, ⟨o1, ..., on⟩), where o is an individual in A and ⟨o1, ..., on⟩ is a tuple of individuals in A at the leaves of a tree
This research is supported by the Austrian Science Fund (FWF): P25207-N23.
References
[1] M. Arenas and L. Libkin. A Normal Form for XML Documents. ACM TODS, 29(1):195–232, 2004.
[2] D. Calvanese et al. Path-Based Identification Constraints in DLs. In KR, pages 231–241, 2008.
[3] E. F. Codd. Recent Investigations in Relational DB Systems. In IFIP Congress, pages 1017–1021, 1974.
[4] A. Poggi et al. Linking Data to Ontologies. J. Data Semantics, 10:133–173, 2008.
Contact: [email protected]