K-D-B Tree

Multidimensional Index
Characteristics
• Multi-way branch
• Height-balanced tree
• Repeatedly divides the domain into
disjoint sub-areas
• A node in a tree corresponds to a (set of
consecutive) disk page(s)
Jaruloj Chongstitvatana
Example of Data Records
Table (stdntID, courseID, grade, year, smstr)
Table (accID, branchID, saving, name, addr)
Table (custID, age, gender, occupation, salary,
children, promotion, since)
Nodes = Pages
• Region pages
– Contain a set of <region, ptr. to page>
– Internal nodes
• Point pages
– Contain a set of <point, ptr. to data record>
– Leaf nodes
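The two page kinds could be modeled roughly as follows. This is a minimal sketch, not the layout of any particular implementation: the names `Region`, `PointPage`, and `RegionPage`, the half-open interval convention, and the use of in-memory references in place of disk page ids are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union

Point = Tuple[float, ...]


@dataclass(frozen=True)
class Region:
    """Axis-aligned box; assumed half-open: lo[i] <= x[i] < hi[i] on every axis."""
    lo: Point
    hi: Point

    def contains(self, p: Point) -> bool:
        return all(l <= x < h for l, x, h in zip(self.lo, p, self.hi))


@dataclass
class PointPage:
    """Leaf node: a set of <point, pointer to data record> entries."""
    entries: List[Tuple[Point, int]] = field(default_factory=list)


@dataclass
class RegionPage:
    """Internal node: a set of <region, pointer to page> entries."""
    entries: List[Tuple[Region, Union["RegionPage", PointPage]]] = field(
        default_factory=list
    )


# A two-level tree: one region page whose single region leads to a point page.
# The integers 101 and 102 stand in for record addresses.
leaf = PointPage([((1.0, 2.0), 101), ((3.0, 1.0), 102)])
root = RegionPage([(Region((0.0, 0.0), (4.0, 4.0)), leaf)])
```

Half-open regions are one way to make sure that a point lying exactly on a split line belongs to exactly one sub-area.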
Region Pages
[Figure: a region page holds entries of the form <Xmax, Xmin, Ymax, Ymin, PAGE>, where the four bounds define the region and PAGE points to the child page.]
The branching factor is determined by the
page size and the size of each entry.
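As a worked example of that relationship, the sketch below computes a branching factor from a page size and an entry size. Every number in it (4096-byte pages, a 16-byte page header, 8-byte coordinates and pointers, two dimensions) is an assumption chosen for illustration, and `fan_out` is a hypothetical helper name.

```python
def fan_out(page_size: int, header_size: int, entry_size: int) -> int:
    """Largest number of fixed-size entries that fit in one page."""
    return (page_size - header_size) // entry_size


# All sizes below are assumptions chosen for illustration.
PAGE_SIZE = 4096     # bytes per disk page
HEADER_SIZE = 16     # bytes of per-page bookkeeping
DIMS = 2             # x and y
COORD_SIZE = 8       # one coordinate stored as a 64-bit float
POINTER_SIZE = 8     # page id or record address

# Region entry: a (min, max) pair per axis plus a page pointer.
region_entry = 2 * DIMS * COORD_SIZE + POINTER_SIZE   # 40 bytes
# Point entry: one coordinate per axis plus a record pointer.
point_entry = DIMS * COORD_SIZE + POINTER_SIZE        # 24 bytes

region_fan_out = fan_out(PAGE_SIZE, HEADER_SIZE, region_entry)   # 102
point_fan_out = fan_out(PAGE_SIZE, HEADER_SIZE, point_entry)     # 170
```

Under these assumptions a point entry is smaller than a region entry, so more point entries than region entries fit on a page of the same size.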
Point Pages
[Figure: a point page holds entries of the form <X, Y, DATA RECORD>, where X and Y define the point and DATA RECORD points to the data record.]
The branching factor of a point page is
usually larger than that of a region page.
Example
[Figure: an example K-D-B tree; its leaves are labeled "Point page".]
Search
[Figure: a point query on the example tree; the labels are "Point query" and "Point page".]
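A point query can be sketched as a single descent from the root: because the regions in a region page are disjoint, at most one entry contains the query point at each level. The class names, the `(lo, hi, child)` entry layout, and the half-open region convention below are assumptions made for this sketch.

```python
class PointPage:
    def __init__(self, entries):
        self.entries = list(entries)   # [(point, record_location)]


class RegionPage:
    def __init__(self, entries):
        self.entries = list(entries)   # [(lo, hi, child_page)]


def contains(lo, hi, p):
    """True if p lies in the half-open box [lo, hi)."""
    return all(l <= x < h for l, x, h in zip(lo, p, hi))


def point_query(page, p):
    """Follow the one region containing p down to a point page, then scan it."""
    while isinstance(page, RegionPage):
        for lo, hi, child in page.entries:
            if contains(lo, hi, p):
                page = child
                break
        else:
            return None                # p lies outside every region
    for q, location in page.entries:
        if q == p:
            return location
    return None


# Hypothetical example tree over the domain [0, 10) x [0, 10).
left_low = PointPage([((1, 1), "r1"), ((4, 3), "r2")])
left_high = PointPage([((2, 7), "r3")])
right = PointPage([((6, 2), "r4"), ((9, 9), "r5")])
left = RegionPage([((0, 0), (5, 5), left_low), ((0, 5), (5, 10), left_high)])
root = RegionPage([((0, 0), (5, 10), left), ((5, 0), (10, 10), right)])

found = point_query(root, (2, 7))      # "r3"
missing = point_query(root, (5, 5))    # None: no such point is stored
```

The number of pages read equals the height of the tree, which is the same for every query because the tree is height-balanced.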
Insert
[Figure: a point is inserted into a point page, which then overflows.]
Split
• Split a region r with page id p along xi:
If r lies entirely to the left (right) of xi, then put <r, p> in the
left (right) page.
Otherwise:
For each child pc of p, split pc along xi.
Split r along xi into rleft and rright.
Create 2 new pages with page ids pleft and pright.
Move the children of p in the left region into pleft and
the children in the right region into pright.
Return <rleft, pleft> and <rright, pright>.
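The split procedure above can be sketched recursively. In this sketch the function name `split_entry`, the `((lo, hi), child)` entry layout, and the convention that a point lying exactly on the split value goes to the right are assumptions; the slides do not fix them.

```python
class PointPage:
    def __init__(self, entries):
        self.entries = list(entries)   # [(point, record_location)]


class RegionPage:
    def __init__(self, entries):
        self.entries = list(entries)   # [((lo, hi), child_page)]


def split_entry(region, page, axis, value):
    """Split <region, page> along `axis` at `value`.

    Returns (left_entry, right_entry); one of them is None when the
    region lies entirely on one side and the page is reused unchanged.
    """
    lo, hi = region
    if hi[axis] <= value:
        return (region, page), None
    if lo[axis] >= value:
        return None, (region, page)
    # The region straddles the split value: cut it in two.
    r_left = (lo, hi[:axis] + (value,) + hi[axis + 1:])
    r_right = (lo[:axis] + (value,) + lo[axis + 1:], hi)
    if isinstance(page, PointPage):
        p_left = PointPage(e for e in page.entries if e[0][axis] < value)
        p_right = PointPage(e for e in page.entries if e[0][axis] >= value)
    else:
        left_children, right_children = [], []
        for child_region, child in page.entries:
            l, r = split_entry(child_region, child, axis, value)
            if l is not None:
                left_children.append(l)
            if r is not None:
                right_children.append(r)
        p_left, p_right = RegionPage(left_children), RegionPage(right_children)
    return (r_left, p_left), (r_right, p_right)


# Hypothetical region page covering [0, 10) x [0, 10) with three children.
a = PointPage([((1, 1), "a1"), ((3, 8), "a2")])
b = PointPage([((5, 2), "b1"), ((8, 1), "b2")])
c = PointPage([((5, 7), "c1"), ((9, 9), "c2")])
page = RegionPage([
    (((0, 0), (4, 10)), a),
    (((4, 0), (10, 5)), b),
    (((4, 5), (10, 10)), c),
])
(left_region, left_page), (right_region, right_page) = split_entry(
    ((0, 0), (10, 10)), page, axis=0, value=6
)
```

Here child `a` lies entirely to the left of x = 6 and is moved as is, while `b` and `c` straddle the line and are themselves split, which is why a split can propagate downward through the tree.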
Split: Example
The page overflows, and is split.
This region is also split.
This region is split.
Split: Example
The region page is split.
The point page is also split.
Create a new region page.
Child pages are transferred.
How to find split axis
• Cyclic: x -> y -> x -> y -> …
• Priority: x -> x -> y -> x -> x -> y -> …
• Any possible axis
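The cyclic and priority orders can both be expressed as a repeating schedule of axes. The sketch below assumes the axis is picked from a counter such as the depth of the page being split; that driver, and the helper names `cyclic_axis` and `priority_axis`, are assumptions made for illustration.

```python
AXES = "xy"


def cyclic_axis(step: int, dims: int) -> int:
    """Cyclic order: 0, 1, ..., dims - 1, 0, 1, ..."""
    return step % dims


def priority_axis(step: int, schedule) -> int:
    """Priority order: repeat a schedule in which favored axes occur more often."""
    return schedule[step % len(schedule)]


cyclic = [AXES[cyclic_axis(s, 2)] for s in range(4)]
# ['x', 'y', 'x', 'y']

priority = [AXES[priority_axis(s, (0, 0, 1))] for s in range(6)]
# ['x', 'x', 'y', 'x', 'x', 'y']
```

A priority schedule splits more finely along the favored axis, which may suit workloads where queries constrain that attribute more often.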
Insert
• Insert a record with point a and location l in a tree with
root r:
If r is NIL, then create a point page p, insert the
record <a, l> in p, and return p.
Otherwise:
Search for a in the tree with root r until a point page,
say p, is reached.
Insert the record in the point page p.
Insert (cont’d)
• Insert a record with point a and location l in a tree with
root r (continued):
If the point page p overflows, then find an
appropriate axis along which to split p into pleft and pright.
If p is not the root, then replace p with pleft, and
insert pright into the parent of p.
If p is the root, then create a new root node with the two
children pleft and pright.
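The insertion steps above can be sketched as follows. This is a simplified illustration, not a complete K-D-B tree: it handles only the overflow of a point page, and it does not go on to split a parent region page that becomes too full. The capacity of 2, the median split value, the depth-based cyclic axis, and all names are assumptions.

```python
CAPACITY = 2   # hypothetical maximum number of entries in a point page


class PointPage:
    def __init__(self, entries=()):
        self.entries = list(entries)   # [(point, record_location)]


class RegionPage:
    def __init__(self, entries=()):
        self.entries = list(entries)   # [(lo, hi, child_page)]


def contains(lo, hi, p):
    return all(l <= x < h for l, x, h in zip(lo, p, hi))


def split_point_page(page, lo, hi, axis):
    """Split at the median coordinate along `axis` (assumes distinct coordinates)."""
    coords = sorted(pt[axis] for pt, _ in page.entries)
    value = coords[len(coords) // 2]
    left = PointPage(e for e in page.entries if e[0][axis] < value)
    right = PointPage(e for e in page.entries if e[0][axis] >= value)
    left_hi = hi[:axis] + (value,) + hi[axis + 1:]
    right_lo = lo[:axis] + (value,) + lo[axis + 1:]
    return (lo, left_hi, left), (right_lo, hi, right)


def insert(root, domain, a, l):
    """Insert record <a, l> and return the (possibly new) root."""
    if root is None:
        return PointPage([(a, l)])
    parent, index, depth = None, None, 0
    page, (lo, hi) = root, domain
    while isinstance(page, RegionPage):        # search for the point page
        for i, (clo, chi, child) in enumerate(page.entries):
            if contains(clo, chi, a):
                parent, index, page, lo, hi = page, i, child, clo, chi
                depth += 1
                break
        else:
            raise ValueError("point lies outside the domain")
    page.entries.append((a, l))
    if len(page.entries) <= CAPACITY:
        return root
    left, right = split_point_page(page, lo, hi, axis=depth % len(a))
    if parent is None:                          # the point page was the root
        return RegionPage([left, right])
    parent.entries[index] = left                # replace p with pleft
    parent.entries.append(right)                # parent overflow is NOT handled here
    return root


DOMAIN = ((0, 0), (10, 10))
root = None
for point, location in [((2, 3), "r1"), ((7, 8), "r2"), ((5, 1), "r3"), ((8, 2), "r4")]:
    root = insert(root, DOMAIN, point, location)
```

After the four insertions the root is a region page with three regions: the first split is along x at 5, and the second splits the right half along y at 2.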
Insert: Example
Search for the given point until the point page is found.
Insert here and split the point page if it overflows.
Divide the region.
Insert: Example
The parent page overflows, so split the page.
This region is split.
Insert: Example
The point page is split.
The region page is split.
Insert: Example
Insert the new region page in its parent.
The root node overflows, and is then split.
Insert: Example
Create the new root node
Delete
• Simple, if storage utilization is ignored.
• Otherwise, an underfull page should be merged
with another page.
• When 2 pages are merged, the region of the
new page must be a valid region.
• A number of regions are joinable if their union is
also a region.
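One way to test joinability, sketched below, relies on the regions of sibling pages being disjoint: their union is a rectangular region exactly when their volumes add up to the volume of their bounding box. The function names and the `(lo, hi)` representation are assumptions, and the volume comparison is only valid for non-overlapping regions with exactly comparable coordinates (integers here).

```python
from math import prod


def volume(region):
    lo, hi = region
    return prod(h - l for l, h in zip(lo, hi))


def bounding_box(regions):
    dims = range(len(regions[0][0]))
    lo = tuple(min(r[0][i] for r in regions) for i in dims)
    hi = tuple(max(r[1][i] for r in regions) for i in dims)
    return lo, hi


def joinable(regions):
    """True if the union of the given disjoint regions is itself a region."""
    return sum(volume(r) for r in regions) == volume(bounding_box(regions))


left_half = ((0, 0), (5, 10))
lower_right = ((5, 0), (10, 5))
upper_right = ((5, 5), (10, 10))

two_quarters = joinable([lower_right, upper_right])            # True
l_shape = joinable([left_half, lower_right])                   # False
whole = joinable([left_half, lower_right, upper_right])        # True
```

The L-shaped pair fails the test because its bounding box covers area that neither region contains.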
Joinable Regions
Unjoinable Regions
Delete (cont’d)
• If a page p is underfull, merge sibling pages of
p whose regions are joinable.
• If the newly created page overflows, then
split the page.
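The merge step can be sketched for point pages that share a parent. This sketch makes several simplifying assumptions: siblings are plain `(lo, hi, points)` tuples, only pairs of pages are merged (the slides allow joining more than two regions), and the thresholds `MIN_FILL` and `CAPACITY` are hypothetical. The follow-up split of an overflowing merged page is only signalled, not performed.

```python
from math import prod

CAPACITY = 4   # hypothetical maximum number of points per page
MIN_FILL = 2   # hypothetical minimum; fewer points means "underfull"


def volume(lo, hi):
    return prod(h - l for l, h in zip(lo, hi))


def merge_underfull(siblings, i):
    """Merge sibling i with the first sibling whose region is joinable with it.

    Returns (new_siblings, needs_split). The list is returned unchanged
    when page i is not underfull or no joinable sibling exists.
    """
    lo, hi, points = siblings[i]
    if len(points) >= MIN_FILL:
        return siblings, False
    for j, (lo2, hi2, points2) in enumerate(siblings):
        if j == i:
            continue
        box_lo = tuple(map(min, lo, lo2))
        box_hi = tuple(map(max, hi, hi2))
        # Disjoint regions are joinable when they exactly fill their bounding box.
        if volume(lo, hi) + volume(lo2, hi2) == volume(box_lo, box_hi):
            merged = (box_lo, box_hi, points + points2)
            rest = [s for k, s in enumerate(siblings) if k not in (i, j)]
            return rest + [merged], len(merged[2]) > CAPACITY
    return siblings, False


siblings = [
    ((0, 0), (5, 10), [(1, 1), (2, 8)]),
    ((5, 0), (10, 5), [(6, 1)]),              # underfull
    ((5, 5), (10, 10), [(6, 6), (8, 9)]),
]
merged_siblings, needs_split = merge_underfull(siblings, 1)
```

In this example the underfull page cannot be joined with the left half, since their union would be L-shaped, so it is merged with the page above it instead.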
Further Discussion
Splitting Criteria
• Value
• Axis
– Cyclic
– Priority
– Shape ?
– Area
– Number of data points
– Ratio ?
– Random ?
Combine the two decisions ?