Course  : M0184 / Pengolahan Data Distribusi (Distributed Data Processing)
Year    : 2005
Version :
Session – 7
DATA HANDLING – DISTRIBUTION AND TRANSFORMATION
OBJECTIVES
• DATA ALLOCATION
• THE DATA PLACEMENT AND ALLOCATION PROBLEM
• SEMANTIC APPROACH FOR DATA ALLOCATION
DATA ALLOCATION
• An essential aspect of handling distributed data arises when we decide how to distribute the data around the sites in order to take advantage of the “natural” parallelism of execution inherent in the distributed system.
• A second aspect is providing automatic methods of transforming accesses to data that are written in the global query language into accesses that a local DBMS can handle.
The Data Placement and Allocation Problem
• This problem is concerned with ensuring that data objects are well placed with respect to the other objects to which they are related.
• “Well placedness”: if data objects O1 and O2 are required consecutively in a particular query, place them near each other.
• Dispersing files efficiently around the network.
• Splitting records up, so that the “busiest” parts are allocated to the best-performing facilities.
DATA ALLOCATION (EXAMPLE)
• Single-relation case:
(1,p)  (1,q)  (2,q)  (2,r)
(3,r)  (3,s)  (4,s)  (4,p)
(1,r)  (2,s)  (3,p)  (4,q)
(1,s)  (2,p)  (3,q)  (4,r)
Query: get all tuples with a “1” in the first position.
Alternative placement:
(1,p)  (1,q)  (2,p)  (2,q)
(1,r)  (1,s)  (2,r)  (2,s)
(3,p)  (3,q)  (4,p)  (4,q)
(3,r)  (3,s)  (4,r)  (4,s)
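The effect of the two placements on the query above can be checked with a short sketch. The page boundaries of the original figure are not recoverable from the tuple listing, so the code assumes, purely for illustration, that each page holds four tuples and that pages are filled in the order the tuples are listed. The names PAGE_SIZE, pages_touched and first_is_one are hypothetical.

```python
# Assumption (not stated on the slide): 4 tuples per page, filled in listing order.
PAGE_SIZE = 4

original = [
    (1, "p"), (1, "q"), (2, "q"), (2, "r"),
    (3, "r"), (3, "s"), (4, "s"), (4, "p"),
    (1, "r"), (2, "s"), (3, "p"), (4, "q"),
    (1, "s"), (2, "p"), (3, "q"), (4, "r"),
]

alternative = [
    (1, "p"), (1, "q"), (2, "p"), (2, "q"),
    (1, "r"), (1, "s"), (2, "r"), (2, "s"),
    (3, "p"), (3, "q"), (4, "p"), (4, "q"),
    (3, "r"), (3, "s"), (4, "r"), (4, "s"),
]


def pages_touched(placement, predicate, page_size=PAGE_SIZE):
    """Return the set of page numbers that hold at least one matching tuple."""
    return {i // page_size for i, t in enumerate(placement) if predicate(t)}


def first_is_one(t):
    """Selection predicate: a 1 in the first position."""
    return t[0] == 1


print(len(pages_touched(original, first_is_one)))     # 3
print(len(pages_touched(alternative, first_is_one)))  # 2
```

Under this assumed page size the same 16 tuples answer the query from three pages in the first placement and from two pages in the alternative. A different page size or fill order would change the counts.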
DATA ALLOCATION (EXAMPLE)
• Multi-relation case:
R1     R2        R1     R2        R1     R2        R1     R2
=====  =====     =====  =====     =====  =====     =====  =====
(a,p)  (p,1)     (a,q)  (p,2)     (a,r)  (p,3)     (a,s)  (p,4)
(b,q)  (q,2)     (b,r)  (q,3)     (b,s)  (q,4)     (b,p)  (q,1)
(c,r)  (r,3)     (c,s)  (r,4)     (c,p)  (r,1)     (c,q)  (r,2)
(d,s)  (s,4)     (d,p)  (s,1)     (d,q)  (s,2)     (d,r)  (s,3)
Query: get all R2 tuples with a “1” in column 2 that match R1 tuples with an “a” in column 1.
Alternative placement:
R1     R2        R1     R2        R1     R2        R1     R2
=====  =====     =====  =====     =====  =====     =====  =====
(a,p)  (p,1)     (c,p)  (p,3)     (a,r)  (r,1)     (c,r)  (r,3)
(b,p)  (p,2)     (d,p)  (p,4)     (b,r)  (r,2)     (d,r)  (r,4)
(a,q)  (q,1)     (c,q)  (q,3)     (a,s)  (s,1)     (c,s)  (s,3)
(b,q)  (q,2)     (d,q)  (q,4)     (b,s)  (s,2)     (d,s)  (s,4)
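A small sketch can count how many pages the join query has to read under each placement. It rests on two assumptions: each adjacent R1/R2 group of four tuples is read as one page, and the second page of the alternative placement holds (c,p), (d,p), (c,q), (d,q), so that R1 contains the same 16 tuples under both placements. The function name pages_needed is hypothetical.

```python
# Each page is a pair (R1 tuples, R2 tuples); the grouping is an assumption.
original = [
    ([("a", "p"), ("b", "q"), ("c", "r"), ("d", "s")],
     [("p", 1), ("q", 2), ("r", 3), ("s", 4)]),
    ([("a", "q"), ("b", "r"), ("c", "s"), ("d", "p")],
     [("p", 2), ("q", 3), ("r", 4), ("s", 1)]),
    ([("a", "r"), ("b", "s"), ("c", "p"), ("d", "q")],
     [("p", 3), ("q", 4), ("r", 1), ("s", 2)]),
    ([("a", "s"), ("b", "p"), ("c", "q"), ("d", "r")],
     [("p", 4), ("q", 1), ("r", 2), ("s", 3)]),
]

alternative = [
    ([("a", "p"), ("b", "p"), ("a", "q"), ("b", "q")],
     [("p", 1), ("p", 2), ("q", 1), ("q", 2)]),
    ([("c", "p"), ("d", "p"), ("c", "q"), ("d", "q")],
     [("p", 3), ("p", 4), ("q", 3), ("q", 4)]),
    ([("a", "r"), ("b", "r"), ("a", "s"), ("b", "s")],
     [("r", 1), ("r", 2), ("s", 1), ("s", 2)]),
    ([("c", "r"), ("d", "r"), ("c", "s"), ("d", "s")],
     [("r", 3), ("r", 4), ("s", 3), ("s", 4)]),
]


def pages_needed(pages):
    """Pages holding a tuple that takes part in the join result.

    Query: R2 tuples with 1 in column 2 joined to R1 tuples with 'a' in
    column 1, matching R1 column 2 against R2 column 1.
    """
    r1_hits = [(n, key) for n, (r1, _) in enumerate(pages)
               for first, key in r1 if first == "a"]
    r2_hits = [(n, key) for n, (_, r2) in enumerate(pages)
               for key, second in r2 if second == 1]
    needed = set()
    for n1, k1 in r1_hits:
        for n2, k2 in r2_hits:
            if k1 == k2:  # join condition holds, so both pages are read
                needed.update((n1, n2))
    return needed


print(len(pages_needed(original)))     # 4
print(len(pages_needed(alternative)))  # 2
```

With this reading, the first placement spreads the matching tuples over all four pages, while the alternative keeps every matching R1/R2 pair on the same page and needs only two.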
Semantic Approach to the Problem
• Optimizing the placement and allocation of data without directly considering the requirements of any quantitative cost model.
• The goal: to split a global database into fragments so as to maximize the efficiency of query execution.
• Example of a qualitative approach: having an equi-JOIN on a primary key or a foreign key in an SQL query
-- Equi-join of Doctor and Patient on the age attribute
SELECT doctor_name, patient_name
FROM Doctor, Patient
WHERE Doctor.age = Patient.age
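To see what this query returns, it can be run against a throwaway in-memory database. The sketch below uses Python's standard sqlite3 module. The table layouts and sample rows are hypothetical and chosen only for illustration, and the column names use underscores because a hyphenated name is not a valid unquoted SQL identifier. An ORDER BY clause is added only to make the output deterministic.

```python
import sqlite3

# Hypothetical schema and data; only the query shape comes from the slide.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Doctor  (doctor_name  TEXT, age INTEGER);
    CREATE TABLE Patient (patient_name TEXT, age INTEGER);
    INSERT INTO Doctor  VALUES ('Dr. A', 40), ('Dr. B', 55);
    INSERT INTO Patient VALUES ('P. X', 40), ('P. Y', 30), ('P. Z', 55);
""")

# Equi-join on age: each doctor is paired with every patient of the same age.
rows = conn.execute("""
    SELECT doctor_name, patient_name
    FROM Doctor, Patient
    WHERE Doctor.age = Patient.age
    ORDER BY doctor_name, patient_name
""").fetchall()

print(rows)  # [('Dr. A', 'P. X'), ('Dr. B', 'P. Z')]
conn.close()
```

In this hypothetical schema, age is neither a primary key nor a foreign key, so the join pairs rows that merely share a value.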