CS 500: Database Theory

Object-relational databases

Supplementary material: class notes

Material of this lecture is based on Chapter 23 of
“Database Management Systems”, 3rd Ed., Ramakrishnan & Gehrke
Julia Stoyanovich ([email protected])
Motivating example
California Department of Water Resources: 500,000
photos, with captions
Find sunset pictures of landmarks within 20 miles of
Sacramento, CA.
create table Photos (
    id number primary key,
    ts date,
    caption document,
    picture photo_CD_image);

create table Landmarks (
    name varchar(64) primary key,
    location point);

select id
from Photos P, Landmarks L, Landmarks S
where sunset(P.picture)
and contains(P.caption, L.name)
and L.location |20| S.location
and S.name = 'Sacramento, CA'

Note: user-defined functions (UDFs) and operators

Note: query optimization must be reconsidered
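To make the roles of the UDFs and the distance operator concrete, here is a minimal Python sketch that evaluates the same where clause over a few toy rows. Everything in it is an assumption made for illustration: the rows and coordinates are made up, `is_sunset` is a precomputed flag standing in for running sunset() on the image, `contains` is reduced to a substring match, and the |20| operator is modeled as a Euclidean distance threshold on planar coordinates measured in miles.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Photo:
    id: int
    caption: str
    is_sunset: bool  # stand-in for the result of sunset(picture)


@dataclass(frozen=True)
class Landmark:
    name: str
    location: tuple  # assumed planar (x, y) coordinates, in miles


def contains(caption, name):
    """Stand-in for the contains() UDF: case-insensitive substring match."""
    return name.lower() in caption.lower()


def within(a, b, miles):
    """Stand-in for the |20| operator: Euclidean distance threshold."""
    return hypot(a[0] - b[0], a[1] - b[1]) <= miles


def sunset_landmark_photos(photos, landmarks, city, miles=20.0):
    """Nested-loop evaluation of the where clause, returning photo ids."""
    centers = [s for s in landmarks if s.name == city]
    return sorted({
        p.id
        for p in photos
        for l in landmarks
        for s in centers
        if p.is_sunset
        and contains(p.caption, l.name)
        and within(l.location, s.location, miles)
    })


# Toy data, invented for this sketch.
landmarks = [
    Landmark("Sacramento, CA", (0.0, 0.0)),
    Landmark("Tower Bridge", (1.0, 1.0)),
    Landmark("Golden Gate Bridge", (-75.0, -40.0)),
]
photos = [
    Photo(1, "Tower Bridge at dusk", True),
    Photo(2, "Tower Bridge at noon", False),
    Photo(3, "Golden Gate Bridge in evening light", True),
]

result = sunset_landmark_photos(photos, landmarks, "Sacramento, CA")
print(result)
```

Note that the sketch calls the (expensive) image predicate for every candidate combination; deciding when to evaluate such predicates is exactly why optimization must be reconsidered.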
Object-database systems
• Relational databases: relations are in first normal form. Clean and simple. But the world is more complex!
• There is sometimes a need to accommodate
  • complex data types / nesting
  • inheritance hierarchies
• Two directions, conceptually very similar but implementations differ
  • Object-Oriented Database Systems (OODBMS)
  • Object-Relational Database Systems (ORDBMS) is the currently accepted model, part of the SQL:1999 standard. Extends the relational model, borrows concepts from OO programming languages.
    • implemented by Oracle, PostgreSQL and others
The Dinky Entertainment Company
• About the company
  • Location: Hollywood, CA
  • Main assets: cartoon characters, e.g., Herbert the Worm
  • Products: film shows, voice and video-footage licenses (e.g., for action figures, video games)
DBMS manages everything!
• New data types required
  • user-defined abstract data types (ADTs): image, sound, video, with functions and operators
  • type constructors: sets, tuples, arrays
  • inheritance: low-resolution and high-resolution images are images
Why an RDBMS won’t do
BLOB = binary large object

create table Frames (
    frame_number number primary key,
    image BLOB,
    category number
);

no structure / semantics
cannot issue any conditional queries against the image attribute

Enter ORDBMS!
• user-defined data types possible for attributes
• complex attributes are possible (non-1NF)
• reference types (pointers) - why do we need them? why are these potentially problematic?
• inheritance
User-Defined Abstract Data Types
I’ve got one word for you: encapsulation!
• Users define new types, with their operations (methods).
• Define how to read and output objects of the new type
• Define the size of the objects of the new type

create abstract data type jpeg_image
    (internallength=VARIABLE, input=jpeg_in, output=jpeg_out);

create function is_sunrise(jpeg_image) returns boolean
    as external name '/usr/local/bin/dinky.jar';

• Atomic and user-defined types
• Type constructors:
  • row (f1 t1, …, fn tn) - a tuple of n fields, where fi is the name of the field and ti is its type
  • Listof(base), Arrayof(base), Setof(base), Bagof(base)
Reference types
• Objects have an OID
• Consequences? ref / deref
• Examples:
  • ref(theater_t)
  • setof(ref(arrayof(integer)))
• Shallow vs. deep equality
  • deep equality is defined recursively for complex types
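The difference between the two notions of equality can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any particular ORDBMS implements it: the hypothetical `Obj` class plays the role of a stored object, Python object identity stands in for the OID, and lists and tuples stand in for collection and row values.

```python
class Obj:
    """A stored object; Python object identity stands in for its OID."""

    def __init__(self, value):
        self.value = value


def shallow_equal(a, b):
    """Two references are shallow-equal when they refer to the same object."""
    return a is b


def deep_equal(a, b):
    """Deep equality: dereference, then compare the structure recursively."""
    if isinstance(a, Obj) and isinstance(b, Obj):
        return deep_equal(a.value, b.value)
    if isinstance(a, (list, tuple)) and type(a) is type(b):
        return len(a) == len(b) and all(deep_equal(x, y) for x, y in zip(a, b))
    return a == b  # atomic values


r1 = Obj([1, 2, 3])
r2 = Obj([1, 2, 3])  # same contents, different object
r3 = r1              # second reference to the same object

nested_a = Obj([Obj(1), Obj((2, 3))])
nested_b = Obj([Obj(1), Obj((2, 3))])

print(shallow_equal(r1, r3), shallow_equal(r1, r2))
print(deep_equal(r1, r2), deep_equal(nested_a, nested_b))
```

Here `r1` and `r2` are deep-equal but not shallow-equal, while `r1` and `r3` are both. The nested case shows why the definition has to recurse: the inner references are distinct objects, so only following them reveals that the contents match.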
Inheritance
• To reuse and refine type definitions
• To create hierarchies of collections of similar but not identical objects

create type superhero_t
    under superbeing_t (strength, power);

Substitution principle: Given a supertype A and a subtype B, it is always possible to substitute an object of type B into a legal expression written for objects of type A.
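The substitution principle can be illustrated with ordinary class inheritance. The sketch below mirrors superbeing_t and superhero_t as Python classes; the `name` attribute, the `describe` method, and the sample values are hypothetical, introduced only so there is something to call.

```python
class SuperBeing:
    """Stand-in for superbeing_t; the `name` attribute is hypothetical."""

    def __init__(self, name):
        self.name = name

    def describe(self):
        return self.name


class Superhero(SuperBeing):
    """Stand-in for superhero_t: inherits `name`, adds strength and power."""

    def __init__(self, name, strength, power):
        super().__init__(name)
        self.strength = strength
        self.power = power


def roll_call(beings):
    """Written against SuperBeing only; it never mentions Superhero."""
    return [b.describe() for b in beings]


# Made-up sample objects: one of the supertype, one of the subtype.
cast = [SuperBeing("Generic Being"), Superhero("Herbert the Worm", 3, "burrowing")]
names = roll_call(cast)
print(names)
```

`roll_call` is a legal expression over the supertype, and a subtype object passes through it unchanged, which is the property a query over a collection of superbeing_t objects relies on.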
ORDBMS Implementation
• Physical data layout
  • objects that vary in size over their lifetime
• Access methods
  • nested objects, arrays
  • indexes on predicates
  • indexes over collection hierarchies
• Query processing
  - New aggregates
    1. specify what to do with the first object (e.g., sum=0)
    2. specify what to do on the next object (e.g., sum+=item)
    3. specify what to do on the last object (e.g., avg = sum / cnt)
  - Method caching for expensive predicates
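The three-step recipe for new aggregates can be sketched as a small protocol. The method names `initialize`, `iterate`, and `terminate`, and the driver `run_aggregate`, are assumptions chosen for this illustration; a real system would register equivalent functions through its own DDL.

```python
class Avg:
    """User-defined average, split into the three steps listed above."""

    def initialize(self):
        # Step 1: set up state before the first object.
        self.total = 0.0
        self.count = 0

    def iterate(self, item):
        # Step 2: fold in the next object.
        self.total += item
        self.count += 1

    def terminate(self):
        # Step 3: produce the result after the last object.
        return self.total / self.count if self.count else None


def run_aggregate(agg, items):
    """What the query processor does: drive the three steps over the input."""
    agg.initialize()
    for item in items:
        agg.iterate(item)
    return agg.terminate()


average = run_aggregate(Avg(), [2, 4, 6])
empty = run_aggregate(Avg(), [])
print(average, empty)
```

Because the engine only ever calls the three steps, it can run any aggregate defined this way in a single pass over the data without knowing what the aggregate computes.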
Query optimization
• Using new indexes
  • what where-clause conditions are matched by the index
  • what does it cost to fetch a tuple using the index - either supplied or measured by the DBMS directly
• Reduction factor and cost estimation for ADT methods
  • effect of evaluating a selection condition is no longer negligible! Must consider both selectivity and cost.
  • a selectivity of 1/N means that 1 in N tuples will pass the selection condition
Query optimization example
Retrieve all photos in which Herbert is enjoying the sunrise.
$\sigma_{\text{isSunrise}() \wedge \text{isHerbert}()}(R)$
N = 100,000 images in R
isHerbert: cost = 0.5 sec/image, sel = 1/10
isSunrise: cost = 0.01 sec/image, sel = 1/5
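To see why the evaluation order matters, the two possible orders can be costed directly from the numbers above. The sketch assumes the two predicates are independent, so that selectivities multiply, and it counts only predicate evaluation time; `plan_cost` is a hypothetical helper, not part of any DBMS.

```python
def plan_cost(n, predicates):
    """Expected seconds to apply the predicates in the given order.

    predicates is a list of (cost_per_tuple, selectivity) pairs. Each
    predicate is evaluated only on tuples that passed the earlier ones.
    """
    total = 0.0
    remaining = float(n)
    for cost, selectivity in predicates:
        total += remaining * cost
        remaining *= selectivity
    return total


N = 100_000
is_herbert = (0.5, 1 / 10)
is_sunrise = (0.01, 1 / 5)

sunrise_first = plan_cost(N, [is_sunrise, is_herbert])
herbert_first = plan_cost(N, [is_herbert, is_sunrise])
print(round(sunrise_first), round(herbert_first))
```

Under these assumptions, testing isSunrise first costs about 1,000 + 20,000 × 0.5 = 11,000 seconds, while testing isHerbert first costs about 50,000 + 10,000 × 0.01 = 50,100 seconds, even though isHerbert is the more selective predicate.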
Query optimization example
Retrieve all photos of Herbert in which there is no sunrise.
$\sigma_{\text{NOT isSunrise}() \wedge \text{isHerbert}()}(R)$
N = 100,000 images in R
isHerbert: cost = 0.5 sec/image, sel = 1/10
NOT isSunrise: cost = 0.01 sec/image, sel = 4/5
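As a worked check on this example, the total predicate-evaluation time for each order can be computed by hand, assuming the two conditions are independent so that the second one runs only on the tuples that passed the first:

```latex
\begin{aligned}
\text{NOT isSunrise first:}\quad
  & 100{,}000 \times 0.01 + \tfrac{4}{5} \cdot 100{,}000 \times 0.5
    = 1{,}000 + 40{,}000 = 41{,}000 \text{ sec} \\
\text{isHerbert first:}\quad
  & 100{,}000 \times 0.5 + \tfrac{1}{10} \cdot 100{,}000 \times 0.01
    = 50{,}000 + 100 = 50{,}100 \text{ sec}
\end{aligned}
```

The cheap predicate still goes first, but the gap is much narrower than in the previous example because NOT isSunrise filters out only one tuple in five.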
Query optimization
• Reduction factor and cost estimation for ADT methods
  1. compute the rank of each condition involving an ADT method
  2. order conditions by increasing rank, process in that order

$$\text{rank} = \frac{\text{reductionFactor} - 1}{\text{cost}}$$
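Applying the rank formula to the two example queries from the earlier slides gives the following sketch. It reads the reduction factor as the selectivity quoted for each condition, which is an interpretation rather than something the slides state outright; the helper names are hypothetical.

```python
def rank(reduction_factor, cost):
    """rank = (reductionFactor - 1) / cost; more negative means run earlier."""
    return (reduction_factor - 1) / cost


def order_by_rank(conditions):
    """conditions maps a name to (reduction_factor, cost_per_tuple)."""
    return sorted(conditions, key=lambda name: rank(*conditions[name]))


example_1 = {"isHerbert": (1 / 10, 0.5), "isSunrise": (1 / 5, 0.01)}
example_2 = {"isHerbert": (1 / 10, 0.5), "NOT isSunrise": (4 / 5, 0.01)}

order_1 = order_by_rank(example_1)
order_2 = order_by_rank(example_2)
print(order_1, order_2)
```

The ranks come out to roughly -1.8 for isHerbert, -80 for isSunrise, and -20 for NOT isSunrise, so in both queries the sunrise test is processed first. A condition earns an early position by being cheap, by filtering out many tuples, or both.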