Welcome to …..
FILE ORGANIZATION
DATA AND FILE ORGANIZATION
Instructor: Dr Halla Abdel Hameed
e-mail: [email protected]
Objectives of the Course and Preliminaries
Objectives of The Course
 To Provide a Solid Introduction to the Topic of
File Structures Design.
 To Discuss a number of Advanced Data
Structure Concepts that are necessary for
achieving high efficiency in File Operations.
 To Develop important programming skills in an
Object-Oriented Language such as C++ or Java.
January 10, 2000
Pre-Requisites
 Introduction to Computer Science.
 Knowledge of C++
 Course Link…..
 http://www.acadox.com/class/27279#resources
Required TextBook
 Title:
File Structures
An Object-Oriented Approach with C++
 Authors:
 Michael J. Folk
 Bill Zoellick
 Greg Riccardi
Course Requirements
 Lab Work (programming assignments or a
programming project + oral exam) (25%)
 A Mid-Term (10%)
 A Final Exam (65%)
Special thanks to Eng. Mostafa Elmasri for his
contribution in preparing this material.
Course Outline
1. Introduction To File Management
2. Fundamental File Processing Operations
3. Secondary Storage, Physical Storage Devices: Disks, Tapes And CD-ROM.
4. Fundamental File Structure.
5. Managing Files Of Records.
6. Organizing Files For Performance (File Compression, Reclaiming Space In Files, Internal Sorting, Binary Searching, Keysorting).
7. Indexing.
8. Cosequential Processing And External Sorting.
9. Multilevel Indexing And B-Trees.
10. Indexed Sequential Files And B+ Trees.
11. Hashing And Extendible Hashing.
- All this will build on your knowledge of Data
Structures.
Lecture 1
Introduction to the Design and Specification of File Structures
FILE ORGANIZATION
Lecture Objectives
 Introduce the primary design issues that
characterize file structure design.
 Survey the history of file structure design.
 Introduce a conceptual toolkit for file structure
design.
 Develop an object-oriented toolkit that makes
file structures easy to use.
Lecture Contents
1. The heart of file structure design.
2. A short history of file structure design.
3. A conceptual toolkit: File structure literacy.
4. An object-oriented toolkit: Making file structures usable.
Section 1.1
The heart of file structure design
File Structure
Definition & Functions
Definition
 A combination of representations for data in files and of
operations for accessing the data.
Functions
 Allowing applications to read, write and modify data.
Memory versus Secondary Storage
 Secondary storage such as disks can pack thousands
of megabytes into a small physical space.
 Computer Memory (RAM) is limited.
 Compared to memory, access to secondary
storage is extremely slow.
 Getting information from RAM takes about 120 × 10^-9
seconds (= 120 nanoseconds), while getting
information from disk takes about 30 × 10^-3 seconds
(= 30 milliseconds).
 Roughly,
20 seconds on RAM ≈ 58 days on disk
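The comparison above can be checked with a short calculation. The sketch below assumes the slide's figures of 120 ns per RAM access and 30 ms per disk access; the constant and function names are made up for illustration.

```cpp
#include <cassert>

// Access times taken from the slide (illustrative figures, not measurements).
constexpr double kRamAccessSeconds  = 120e-9;  // 120 nanoseconds
constexpr double kDiskAccessSeconds = 30e-3;   // 30 milliseconds

// How many times slower one disk access is than one RAM access.
constexpr double slowdownFactor() {
    return kDiskAccessSeconds / kRamAccessSeconds;
}

// Time on disk, in days, for work that takes `ramSeconds` in RAM.
double equivalentDiskDays(double ramSeconds) {
    assert(ramSeconds >= 0.0);
    const double secondsPerDay = 24.0 * 60.0 * 60.0;
    return ramSeconds * slowdownFactor() / secondsPerDay;
}
```

With these figures the disk is about 250,000 times slower, so equivalentDiskDays(20.0) comes out at roughly 57.9 days, which the slide rounds to 58.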
Improving Secondary Storage Access Time
Good file structure design improves access time through:
 the representation of the data
 the implementation of the operations
⇒ Together these determine the efficiency of the file
structure for particular applications
General Goals
 Get the information we need with one access
to the disk.
 If that’s not possible, then get the
information with as few accesses as possible.
 Group information so that we are likely to get
everything we need with only one trip to the
disk.
Section 1.2
A short history of file structure design
Early Work
 Early work assumed that files were on tape.
 Access was sequential and the cost of access
grew in direct proportion to the size of the
file.
The emergence of Disks and Indexes
 As files grew very large, unaided sequential
access was not a good solution.
 Disks allowed for direct access.
 Indexes made it possible to keep a list of keys
and pointers in a small file that could be
searched very quickly.
 With the key and pointer, the user had direct
access to the large, primary file.
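The key-and-pointer idea can be sketched as follows. This is a minimal illustration under assumed conventions, not the textbook's implementation: records are lines of the form "key|data", the "pointer" is the byte offset of the record, and an in-memory string stream stands in for the primary file. All names (Index, buildIndex, fetchRecord, demoLookup) are hypothetical.

```cpp
#include <cassert>
#include <ios>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// The index: each key maps to the byte offset where its record starts.
using Index = std::map<std::string, std::streamoff>;

// Scan the primary file once, remembering where each record starts.
Index buildIndex(std::istream& primary) {
    Index index;
    primary.clear();
    primary.seekg(0, std::ios::beg);
    std::string line;
    for (;;) {
        const std::streamoff offset = primary.tellg();
        if (!std::getline(primary, line)) break;
        index[line.substr(0, line.find('|'))] = offset;
    }
    return index;
}

// Search the small index, then jump straight to the record (direct access).
bool fetchRecord(std::istream& primary, const Index& index,
                 const std::string& key, std::string& record) {
    const auto it = index.find(key);
    if (it == index.end()) return false;
    primary.clear();                          // reset any end-of-file state
    primary.seekg(it->second, std::ios::beg);
    return static_cast<bool>(std::getline(primary, record));
}

// Example use with an in-memory stream standing in for the primary file.
std::string demoLookup(const std::string& key) {
    std::istringstream primary("ada|Lovelace\nalan|Turing\ngrace|Hopper\n");
    const Index index = buildIndex(primary);
    std::string record;
    return fetchRecord(primary, index, key, record) ? record : "";
}
```

Once the index is built, a lookup touches the primary file only once, at the offset stored for the key, instead of scanning it from the beginning.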
The emergence of Tree Structures
 Indexes also have a sequential flavor: when they
grew too large, they too became difficult to manage.
 This was especially a problem for files that change,
where keys are added and deleted.
 The idea of using tree structures to manage the
index emerged in the early 60’s.
 Unfortunately, trees can grow very unevenly as records
are added and deleted, resulting in long searches that
require many disk accesses to find a record.
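A small experiment shows how the order of insertions affects the length of searches in a simple tree. The sketch below uses a plain, unbalanced binary search tree held in memory; it is an illustration only, and the names (Node, bstInsert, height, heightAfterInserting) are hypothetical. If every node visit were a disk read, the height would be the worst-case number of accesses.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <vector>

struct Node {
    int key;
    std::unique_ptr<Node> left, right;
    explicit Node(int k) : key(k) {}
};

// Plain (unbalanced) binary search tree insertion; duplicates are ignored.
void bstInsert(std::unique_ptr<Node>& root, int key) {
    std::unique_ptr<Node>* slot = &root;
    while (*slot) {
        if (key == (*slot)->key) return;
        slot = key < (*slot)->key ? &(*slot)->left : &(*slot)->right;
    }
    slot->reset(new Node(key));
}

// Number of nodes on the longest root-to-leaf path.
int height(const std::unique_ptr<Node>& node) {
    if (!node) return 0;
    return 1 + std::max(height(node->left), height(node->right));
}

// Build a tree from the keys in the given order and report its height.
int heightAfterInserting(const std::vector<int>& keys) {
    std::unique_ptr<Node> root;
    for (int k : keys) bstInsert(root, k);
    return height(root);
}
```

Inserting the keys 1 to 7 in sorted order gives a chain of height 7, while inserting the same keys in the order 4, 2, 6, 1, 3, 5, 7 gives a tree of height 3.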
Hash Tables
 Retrieving entries in 3 or 4 accesses is good, but
it does not reach the goal of accessing data with
a single request.
 From early on, Hashing was a good way to reach
this goal with files that do not change size
greatly over time.
 More recently, extendible, dynamic hashing can
retrieve information with one or at most two disk
accesses no matter how big a file becomes.
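The basic idea behind hashing, computing a record's address directly from its key, can be sketched as follows. This is a simplified illustration for a file with a fixed number of fixed-length slots; it ignores collisions and is not extendible hashing. The hash function and the names hashKey and slotOffset are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Illustrative hash: multiply-and-add over the characters of the key,
// reduced modulo the number of slots. Not a production-quality function.
std::size_t hashKey(const std::string& key, std::size_t slotCount) {
    assert(slotCount > 0);
    std::size_t h = 0;
    for (unsigned char c : key) h = (h * 31 + c) % slotCount;
    return h;
}

// Byte offset of the slot for `key` in a file of fixed-length slots.
// One seek to this offset is the "single access" that hashing aims for.
std::size_t slotOffset(const std::string& key, std::size_t slotCount,
                       std::size_t slotSize) {
    return hashKey(key, slotCount) * slotSize;
}
```

For example, with 101 slots of 64 bytes each, the key "AB" hashes to slot 61, so its record would be stored at byte offset 3904. Two different keys can hash to the same slot, which this sketch does not handle.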
Section 1.3
A conceptual toolkit:
File structure literacy
Conceptual Tools for File Structure Design
[Figure: sequential access, tree structures and direct access as the basic tools for file structure design]
 Decrease the number of disk accesses by
collecting data into buffers, blocks, or buckets.
 Manage their growth by splitting them.
 Find a way to increase our address or index
space.
 Find new ways to combine the basic tools.
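The effect of collecting data into blocks can be illustrated by counting how many read operations are needed to go through a file with different block sizes. In the sketch below each call to read() stands in for one disk access, and an in-memory stream stands in for the file; this is a simplification, since real systems add their own buffering. The names countBlockReads and demoReads are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <ios>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Read the whole stream `blockSize` bytes at a time and return how many
// non-empty reads (a stand-in for disk accesses) were needed.
std::size_t countBlockReads(std::istream& in, std::size_t blockSize) {
    assert(blockSize > 0);
    std::vector<char> buffer(blockSize);
    std::size_t reads = 0;
    // A short final block makes read() fail but still delivers gcount() bytes.
    while (in.read(buffer.data(), static_cast<std::streamsize>(blockSize)) ||
           in.gcount() > 0) {
        ++reads;
    }
    return reads;
}

// Example use: a file of `fileSize` bytes held in memory.
std::size_t demoReads(std::size_t fileSize, std::size_t blockSize) {
    std::istringstream file(std::string(fileSize, 'x'));
    return countBlockReads(file, blockSize);
}
```

For a 1000-byte file, reading one byte at a time takes 1000 reads, 100-byte blocks take 10 reads, and 512-byte blocks take 2 reads.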
Intended Learning Outcomes
After completing the course, the student will be
able to:
 Demonstrate knowledge of storage by
describing how data is saved on disk.
 Demonstrate knowledge of how file organization
allows applications to read, write and modify
data.
 Demonstrate knowledge of cost-based query
optimization by finding the data that match
some search criteria.
Lecture Style
Previous Lecture
 A brief review of the previous lecture.
 Answer questions addressed to the instructor’s email.
New Lecture
 Introduce and explain current lecture topics.
Next Lecture
 Follow-up practice and tutorial scheme.
 A brief proposal for the next lecture.
Next Lecture
Fundamental File Processing Operations
 Physical and logical files.
 Opening and closing files.
 Reading and writing.
 Seeking.
 Special Characters in files.
 Physical devices and logical files.
 File-related header files.
Questions?