Tutorial 19

Dina Said
Indexing Data
1. A data entry k* is an actual data record (with
search key value k).
2. A data entry is a (k, rid) pair, where rid is the
record id of a data record with search key
value k.
3. A data entry is a (k, rid-list) pair, where rid-list
is a list of record ids of data records with
search key value k.
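The three alternatives can be compared side by side with a small sketch. The records, names, and rids below are hypothetical (they are not taken from the tutorial), and a rid is assumed to be a (page, slot) pair.

```python
from collections import defaultdict

# Hypothetical data records: rid -> (name, age). The search key is age.
records = {
    (1, 1): ("Ann", 20),
    (1, 2): ("Bob", 22),
    (1, 3): ("Cid", 20),
}

# Alternative 1: the data entry is the data record itself.
alt1 = sorted(records.values(), key=lambda rec: rec[1])

# Alternative 2: one (k, rid) pair per data record.
alt2 = sorted((rec[1], rid) for rid, rec in records.items())

# Alternative 3: one (k, rid-list) pair per distinct search key value.
groups = defaultdict(list)
for rid, rec in sorted(records.items()):
    groups[rec[1]].append(rid)
alt3 = sorted(groups.items())
```

Under these assumptions, Alternative 2 holds two entries for age 20, while Alternative 3 holds a single entry for age 20 whose rid-list has two rids.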
Indexing Data
1. A data entry k* is an actual data record (with
search key value k). → Primary Index
2. A data entry is a (k, rid) pair, where rid is the
record id of a data record with search key
value k. → Secondary Index
3. A data entry is a (k, rid-list) pair, where rid-list
is a list of record ids of data records with
search key value k.
Duplicates
• Two data entries are said to be duplicates if
they have the same value for the search key
field associated with the index.
Indexing Data
1. A data entry k* is an actual data record (with
search key value k). → Can’t have duplicates
2. A data entry is a (k, rid) pair, where rid is the
record id of a data record with search key
value k. → May have duplicates
3. A data entry is a (k, rid-list) pair, where rid-list
is a list of record ids of data records with
search key value k.
Duplicates
• If no duplicates exist
– The search key contains some candidate key
– We call the index a unique index.
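As a rough illustration of the definition, the check below reports which search key values occur in more than one data entry. It assumes data entries are (k, rid) pairs; the function names and the sample entries are hypothetical.

```python
from collections import Counter

def duplicate_keys(data_entries):
    """Return the search key values shared by more than one data entry."""
    counts = Counter(k for k, _rid in data_entries)
    return sorted(k for k, n in counts.items() if n > 1)

def is_unique_index(data_entries):
    """An index is unique when no two data entries share a search key value."""
    return not duplicate_keys(data_entries)

# Hypothetical (k, rid) entries with rid given as a (page, slot) pair.
with_dups = [(20, (1, 1)), (20, (1, 3)), (22, (1, 2))]
without_dups = [(20, (1, 1)), (21, (1, 3)), (22, (1, 2))]
```

Here `duplicate_keys(with_dups)` reports 20 as a duplicated key, and `is_unique_index(without_dups)` holds because every key value appears once.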
Problem 10.10
Consider the instance of the Students
relation shown in Figure 10.22.
Show a B+ tree of order 2 in each of these
cases below, assuming that duplicates are
handled using overflow pages. Clearly
indicate what the data entries are (i.e., do
not use the k∗ convention).
1. A B+ tree index on age using Alternative
(1) for data entries.
Solution
Problem 10.10
Consider the instance of the Students
relation shown in Figure 10.22.
Show a B+ tree of order 2 in each of these
cases below, assuming that duplicates are
handled using overflow pages. Clearly
indicate what the data entries are (i.e., do
not use the k∗ convention).
2. A dense B+ tree index on gpa using
Alternative (2) for data entries. For this
question, assume that these tuples are
stored in a sorted file in the order shown
in Figure 10.22: The first tuple is in page 1,
slot 1; the second tuple is in page
1, slot 2; and so on. Each page can store
up to three data records. You can use
<page-id, slot> to identify a tuple.
Is that correct?
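The <page-id, slot> numbering described in part 2 of Problem 10.10 (three records per page, first tuple in page 1, slot 1) can be sketched as a small helper. The function name and the 1-based position argument are assumptions made for illustration.

```python
def rid_for(position, records_per_page=3):
    """Map a 1-based tuple position in the sorted file to (page-id, slot)."""
    page = (position - 1) // records_per_page + 1
    slot = (position - 1) % records_per_page + 1
    return (page, slot)

# Rids for the first five tuples of a file that stores three records per page.
first_rids = [rid_for(i) for i in range(1, 6)]
```

With three records per page, the fourth tuple starts a new page, so its rid is page 2, slot 1.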
3-d tree
• Construct a 3-d tree using the following
dimensions: age (int), years with the company
(int), salary (real) for the following database:
John(60, 24, 64000); Scott(25, 2, 50000);
Charlie(38, 18, 54000); David(55, 29, 68400);
Ellen(27, 7, 55000); Frank(57, 17, 115000);
Grant(66, 22, 40000).
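One possible way to build the tree is sketched below. It rests on assumptions the exercise does not state: records are inserted in the order listed, the discriminator cycles age, years, salary by depth, and a value smaller than the node's value goes left while an equal or larger value goes right. A different insertion order or a median-split construction would give a different tree.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Node:
    name: str
    point: Tuple[float, ...]          # (age, years with the company, salary)
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def insert(root, name, point, depth=0):
    """Insert a point, comparing on dimension depth mod 3 at each level."""
    if root is None:
        return Node(name, point)
    axis = depth % len(point)
    if point[axis] < root.point[axis]:
        root.left = insert(root.left, name, point, depth + 1)
    else:
        root.right = insert(root.right, name, point, depth + 1)
    return root

employees = [
    ("John", (60, 24, 64000)), ("Scott", (25, 2, 50000)),
    ("Charlie", (38, 18, 54000)), ("David", (55, 29, 68400)),
    ("Ellen", (27, 7, 55000)), ("Frank", (57, 17, 115000)),
    ("Grant", (66, 22, 40000)),
]

root = None
for name, point in employees:       # assumed order: as listed in the exercise
    root = insert(root, name, point)

def show(node, depth=0, label="root"):
    """Print the tree with indentation; L and R mark left and right children."""
    if node is not None:
        print("  " * depth + f"{label}: {node.name} {node.point}")
        show(node.left, depth + 1, "L")
        show(node.right, depth + 1, "R")

show(root)
```

Under these assumptions John is the root, with Scott on the left and Grant on the right; Charlie is Scott's right child, David is Charlie's right child, and Ellen and Frank are David's left and right children.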