Course: M0264/Database Management
Year: 2008

Database Management
Session 3

Objectives
• Data on External Storage
• File Organization
• Indexing

Bina Nusantara

Data on External Storage
• Disks: the most important external storage devices.
• Tapes: sequential access devices that force us to read data one page after another.

File Organization
• Many alternatives exist, each ideal for some situations and not so good in others:
  – Heap files: Suitable when typical access is a file scan retrieving all records.
  – Sorted files: Best if records must be retrieved in some order, or only a "range" of records is needed.
  – Hashed files: Good for equality selections.
    • File is a collection of buckets. Bucket = primary page plus zero or more overflow pages.
    • Hashing function h: h(r) = bucket in which record r belongs. h looks at only some of the fields of r, called the search fields.

File Organization
Cost Model for Our Analysis
We ignore CPU costs, for simplicity:
  – B: The number of data pages
  – R: Number of records per page
  – D: (Average) time to read or write a disk page
  – Measuring the number of page I/Os ignores gains of pre-fetching blocks of pages; thus, even I/O cost is only approximated.
  – Average-case analysis; based on several simplistic assumptions.

File Organization
• Single record insert and delete.
• Heap files:
  – Equality selection on key; exactly one match.
  – Insert always at end of file.
• Sorted files:
  – Files compacted after deletions.
  – Selections on sort field(s).
• Hashed files:
  – No overflow buckets, 80% page occupancy.

Indexing
• There are three main alternatives for what to store as a data entry in an index:
  – A data entry k* is an actual data record (with search key value k).
  – A data entry is a (k, rid) pair, where rid is the record id of a data record with search key value k.
  – A data entry is a (k, rid-list) pair, where rid-list is a list of record ids of data records with search key value k.

Indexing
• An index on a file speeds up selections on the search key fields for the index.
  – Any subset of the fields of a relation can be the search key for an index on the relation.
  – Search key is not the same as key (minimal set of fields that uniquely identify a record in a relation).
• An index contains a collection of data entries, and supports efficient retrieval of all data entries k* with a given key value k.

Indexing
Clustered vs. Unclustered Index
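
To make the (k, rid) and (k, rid-list) data entry alternatives concrete, the following is a minimal sketch, not an actual DBMS implementation. The sample records, the (page number, slot number) representation of a rid, and the names heap_file, build_rid_index, build_rid_list_index and fetch are all assumptions made for illustration. The search key here is age, which is not a key of the records, since several records can share the same age.

```python
from collections import defaultdict

# Hypothetical heap file: a list of pages, each page a list of (name, age) records.
# A rid (record id) is modeled as a (page_number, slot_number) pair (an assumption).
heap_file = [
    [("Ani", 21), ("Budi", 19)],
    [("Citra", 21), ("Dewi", 23)],
]

def build_rid_index(pages, key_of):
    """(k, rid) alternative: one data entry per record, sorted by k."""
    entries = []
    for page_no, page in enumerate(pages):
        for slot_no, record in enumerate(page):
            entries.append((key_of(record), (page_no, slot_no)))
    return sorted(entries)

def build_rid_list_index(pages, key_of):
    """(k, rid-list) alternative: one data entry per distinct search key value."""
    index = defaultdict(list)
    for page_no, page in enumerate(pages):
        for slot_no, record in enumerate(page):
            index[key_of(record)].append((page_no, slot_no))
    return dict(index)

def fetch(pages, rid):
    """Follow a rid back to the data record it identifies."""
    page_no, slot_no = rid
    return pages[page_no][slot_no]

def age(record):
    # Search key: the age field (not unique, so it is not a key).
    return record[1]

rid_index = build_rid_index(heap_file, age)
rid_list_index = build_rid_list_index(heap_file, age)

# Equality selection age = 21 through the (k, rid-list) index.
matches = [fetch(heap_file, rid) for rid in rid_list_index[21]]
```

In this sketch the (k, rid) index repeats the value 21 once per matching record, while the (k, rid-list) index stores it once with a list of two rids.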
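
The hashed file organization described earlier (buckets made of a primary page plus zero or more overflow pages, and a hashing function h on the search field) can be sketched as follows. This is an illustrative toy under stated assumptions: the class name HashedFile, the page capacity of 2, the 4 buckets, the integer search field, and the modulo hashing function are all hypothetical choices, not something the slides prescribe.

```python
PAGE_CAPACITY = 2  # records per page (assumed value)
NUM_BUCKETS = 4    # number of buckets (assumed value)

class HashedFile:
    def __init__(self, num_buckets=NUM_BUCKETS, search_field=0):
        # Each bucket is a list of pages: pages[0] is the primary page,
        # any further pages are overflow pages.
        self.buckets = [[[]] for _ in range(num_buckets)]
        self.search_field = search_field

    def h_value(self, value):
        # Hashing function applied to a search field value (assumed integer).
        return value % len(self.buckets)

    def h(self, record):
        # h(r) = bucket in which record r belongs; looks only at the search field.
        return self.h_value(record[self.search_field])

    def insert(self, record):
        pages = self.buckets[self.h(record)]
        if len(pages[-1]) >= PAGE_CAPACITY:
            pages.append([])  # last page is full: chain a new overflow page
        pages[-1].append(record)

    def lookup(self, value):
        """Equality selection: return (matching records, pages read)."""
        pages = self.buckets[self.h_value(value)]
        found = [r for page in pages for r in page
                 if r[self.search_field] == value]
        return found, len(pages)

f = HashedFile()
for record in [(1, "a"), (5, "b"), (9, "c"), (13, "d"), (2, "e")]:
    f.insert(record)

found, pages_read = f.lookup(9)
```

Here the ids 1, 5, 9 and 13 all hash to bucket 1, so that bucket needs one overflow page and an equality lookup in it reads two pages. The "no overflow buckets" assumption in the analysis corresponds to every lookup reading exactly one page.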
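
The cost model parameters B and D, together with the stated assumptions, are enough to estimate the I/O cost of a full scan and of an equality selection for each file organization. The slides do not list these formulas, so the sketch below is a derivation under those assumptions: an equality selection on a heap file reads half the pages on average, a sorted file is binary-searched on its sort field, and a hashed file at 80% occupancy occupies 1.25 B pages with no overflow. The function names scan_cost and equality_cost are hypothetical.

```python
import math

def scan_cost(organization, B, D):
    """Estimated time to read every record in the file."""
    if organization == "hashed":
        # 80% page occupancy: the same data spreads over B / 0.8 = 1.25 * B pages.
        return 1.25 * B * D
    # Heap and sorted files: read all B data pages.
    return B * D

def equality_cost(organization, B, D):
    """Estimated time for an equality selection with exactly one match."""
    if organization == "heap":
        return 0.5 * B * D          # on average, half the pages are read
    if organization == "sorted":
        return D * math.log2(B)     # binary search on the sort field
    if organization == "hashed":
        return D                    # no overflow pages: one page read
    raise ValueError(f"unknown organization: {organization}")

# Example with B = 1024 pages and D = 1 time unit per page I/O (assumed values).
B, D = 1024, 1.0
costs = {org: (scan_cost(org, B, D), equality_cost(org, B, D))
         for org in ("heap", "sorted", "hashed")}
```

With these assumed values, the hashed file pays for its fast equality selection with a more expensive scan, which matches the earlier remark that each organization is ideal for some situations and not so good in others.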