FILES
• Files – data collections created by user processes
◦ Desirable properties: long-term existence, sharable between processes, structure (hierarchical)
• File system – provides a means of storing data organized as files, as well as a collection of functions that can be performed on files
◦ Also maintains a set of attributes associated with files
◦ Typical operations: create, delete, open, close, read, write
◦ Objectives:
– Meet the data management needs of the user
– Guarantee that the data in the file are valid
– Minimize the potential for lost or destroyed data
– Optimize performance
– Provide I/O support for a variety of storage device types
– Provide a standardized set of I/O interface routines to user processes
– Provide I/O support as well as protection for multiple users
CS 409, FALL 2013
FILE MANAGEMENT /1
MINIMAL USER REQUIREMENTS
Each user:
• Should be able to create, delete, read, write and modify files
• May have controlled access to other users’ files
• May control what type of access is allowed to the files
• Should be able to restructure the files in a form appropriate to the problem
• Should be able to move data between files
• Should be able to back up and recover files in case of damage
• Should be able to access his or her files by name rather than by numeric identifier
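The typical operations and user requirements listed above map closely onto the calls that most systems expose. The following is a minimal sketch in Python, not a definitive implementation; the scratch directory and the file names `notes.txt` and `backup.txt` are hypothetical and chosen only for illustration.

```python
import os
import shutil
import tempfile


def demo_file_operations():
    """Exercise create, write, modify, copy (back up), read and delete."""
    workdir = tempfile.mkdtemp()  # scratch directory, removed at the end
    original = os.path.join(workdir, "notes.txt")   # hypothetical name
    backup = os.path.join(workdir, "backup.txt")    # hypothetical name
    try:
        # Create + write: opening with "w" creates the file if needed.
        with open(original, "w") as f:
            f.write("first line\n")
        # Modify: append to the existing file.
        with open(original, "a") as f:
            f.write("second line\n")
        # Move data between files / back up: copy the contents.
        shutil.copyfile(original, backup)
        # Read: the file is accessed by name, not by a numeric identifier.
        with open(backup) as f:
            lines = f.read().splitlines()
        # Delete the original; the backup survives.
        os.remove(original)
        return lines, os.path.exists(original), os.path.exists(backup)
    finally:
        shutil.rmtree(workdir)


if __name__ == "__main__":
    print(demo_file_operations())
```

Each `with open(...)` block performs the open and close operations implicitly, which is why they do not appear as separate calls.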
FILE SYSTEM ORGANIZATION
• Device drivers: lowest level, communicate directly with the device
◦ Responsible for starting I/O operations on a device
◦ Processes the completion of an I/O request
◦ Considered to be part of the operating system
• Basic file system (or physical I/O level): primary interface with the environment outside the computer system (e.g., the disk)
◦ Deals with blocks of data that are exchanged with disk or tape systems
◦ Concerned with the placement of blocks on the secondary storage device
◦ Concerned with buffering blocks in main memory
FILE SYSTEM ORGANIZATION (CONT’D)
• Basic I/O supervisor: responsible for file I/O initiation and termination
◦ Maintains control structures that deal with device I/O, scheduling, and file status
◦ Selects the device on which I/O is to be performed
◦ Concerned with scheduling disk and tape accesses to optimize performance
◦ Assigns I/O buffers and allocates secondary memory
• Logical I/O: enables users and processes to access records
◦ Provides general-purpose record I/O capability
◦ Maintains basic data about files
• Access method: the level of file system closest to the user
◦ Provides a standard interface between applications and the file systems and
devices that hold the data
◦ Different access methods reflect different file structures and different ways of
accessing and processing the data
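To make the step from blocks of bytes to records concrete, here is a small sketch of what a logical I/O layer might do for fixed-length records. The record layout (a 4-byte key followed by a 12-byte name) and the function names are assumptions made for this example only.

```python
import io
import struct

# Assumed record layout: 4-byte unsigned key + 12-byte name, little-endian.
RECORD = struct.Struct("<I12s")


def write_record(f, index, key, name):
    """Store a record at position `index` of a seekable byte stream."""
    f.seek(index * RECORD.size)
    # struct pads the name with zero bytes up to 12 bytes.
    f.write(RECORD.pack(key, name.encode("ascii")))


def read_record(f, index):
    """Fetch the record at position `index` without reading the others."""
    f.seek(index * RECORD.size)
    key, raw = RECORD.unpack(f.read(RECORD.size))
    return key, raw.rstrip(b"\x00").decode("ascii")


if __name__ == "__main__":
    stream = io.BytesIO()  # stands in for a file on disk
    write_record(stream, 0, 10, "alpha")
    write_record(stream, 1, 20, "beta")
    print(read_record(stream, 1))
```

The caller thinks in record numbers; the byte offsets (and, one level lower, the blocks that hold them) stay hidden behind the two functions.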
FILE ORGANIZATION
Logical structuring of records. Common types:
• The pile: Data stored in order of arrival; a record consists of one burst of data
• The sequential file: Fixed format for records, which are stored sequentially
◦ Key field uniquely identifies the record
◦ Only organization that is easily stored on tape as well as disk
• Indexed sequential file: adds an index to support random access
◦ Greatly reduces the time required to access a single record
◦ Multiple levels of indexing can be used to provide greater efficiency in access
• Indexed file: records are accessed only through their indexes
◦ Variable-length records possible
◦ Exhaustive index contains one entry for every record in the main file
◦ Partial index contains entries to records where the field of interest exists
◦ Used mostly in applications where timeliness of information is critical
• Hashed files: direct access to fixed-length records via a hash function
DIRECTORIES
• Special files that contain information about other files
• Operations: search, create file, delete file, list directory, update directory
• Tree-structured directories: the files in a directory may be directories themselves
◦ Advantages: efficient searching, grouping capabilities
◦ Current working directory (set with cd) = files can be specified by absolute path (relative to root) or relative path (to the current working directory)
◦ Acyclic-graph directories: different names for a single file; new operations: link a new name to an existing file and unlink
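The difference between absolute and relative paths in a tree-structured directory can be sketched as follows. The function name `resolve` and the working directory used in the demonstration are hypothetical; the resolution is purely lexical and does not consult the file system, so symbolic links are not followed.

```python
import posixpath


def resolve(path, cwd):
    """Return the absolute form of `path`.

    Absolute paths are taken relative to the root; relative paths are
    interpreted against the current working directory `cwd`.
    """
    if not posixpath.isabs(path):
        path = posixpath.join(cwd, path)
    return posixpath.normpath(path)  # collapses "." and ".." components


if __name__ == "__main__":
    cwd = "/home/user/409/slides"  # hypothetical working directory
    print(resolve("10-file.tex", cwd))
    print(resolve("../Makefile", cwd))
    print(resolve("/etc/passwd", cwd))
```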
DIRECTORIES (CONT’D)
• File Directory Information
◦ Basic information: file name, file type (text, binary, directory, etc.), file organization (if supported)
◦ Address information: volume, starting address on disk, size (allocated and used)
◦ Access control information: owner, access information, permitted actions
◦ Usage information: date created, identity of creator, date last read, identity of last reader, date last modified, identity of last writer, date of last backup
◦ Current usage: information about current activity on the file, such as the process or processes that have the file open, whether it is locked by a process, and whether the file has been updated in main memory but not yet on disk
• Directory implementation
◦ Sequential file (easy to implement, time-consuming to use)
◦ Indexed file
◦ Hashed file
FILE SHARING
Two issues in a multi-user system:
• Access rights
◦ None
◦ Knowledge (can determine existence and owner)
◦ Execution
◦ Reading
◦ Appending (can add data)
◦ Updating (can also modify existing data)
◦ Changing protection (can change access rights for other users)
◦ Deletion
◦ Access rights usually established based on users or user classes
– Owner: usually the creator, full rights, may grant rights to others
– Specific user: specified individual users
– User groups: set of users who are not identified individually
– All users: all the users who have access to the file system
• Management of simultaneous access: file locking, see flock/fcntl and lockf
UNIX FILE PROTECTION
• Each file has an owner and an associated group
◦ Groups and group membership are managed by a separate sub-system and are system-wide
• Three access rights: read, write (both appending and updating), execute
◦ Knowledge = execute (can access) and read (can list) rights to the containing directory
◦ Deletion = write rights to the containing directory
◦ Changing protection = chown, chgrp, chmod (owner or root only)
• One access group (read, write, execute) for owner, for group, and for the others
                 r  w  x    octal
owner access  =  1  1  1      7
group access  =  1  1  0      6
public access =  0  0  1      1
chmod 761 /public/games    (7 for owner, 6 for group, 1 for others; /public/games is the file)
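Each octal digit in a chmod argument is simply the three rwx bits of one class read as a binary number. The sketch below rebuilds the mode 761 from its bits and renders it the way a long directory listing would; the helper names are made up for this example.

```python
import stat


def octal_digit(r, w, x):
    """Combine three permission bits into one octal digit (r=4, w=2, x=1)."""
    return (r << 2) | (w << 1) | x


def mode_from_bits(owner, group, others):
    """Build the 9-bit permission mode from three (r, w, x) triples."""
    return (octal_digit(*owner) << 6) | (octal_digit(*group) << 3) | octal_digit(*others)


if __name__ == "__main__":
    mode = mode_from_bits(owner=(1, 1, 1), group=(1, 1, 0), others=(0, 0, 1))
    print(oct(mode))                                # 0o761
    # stat.filemode renders the bits as ls -l would for a regular file.
    print(stat.filemode(stat.S_IFREG | mode))       # -rwxrw---x
```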
UNIX DIRECTORY LISTING
< godel:409/slides > pwd
/Volumes/Home/Users/bruda/409/slides
< godel:409/slides > ls -lF
total 15860
-rw-r--r--   1 bruda staff   12104 Sep  5 13:38 00-intro-org.tex
-rw-r--r--   1 bruda staff   27458 Sep  9 23:51 02-overview.tex
-rw-r--r--   1 bruda staff   32725 Sep 15 21:16 03-processes.tex
-rw-r--r--   1 bruda staff   30018 Sep 29 18:30 04-threads.tex
-rw-r--r--   1 bruda staff   42454 Oct 16 21:24 05-synchronization.tex
-rw-r--r--   1 bruda staff   21130 Oct 23 22:30 06-deadlock.tex
-rw-r--r--   1 bruda staff   30993 Oct 29 12:54 07-memory.tex
-rw-r--r--   1 bruda staff   29915 Nov  6 14:04 08-scheduling.tex
-rw-r--r--   1 bruda staff   16003 Nov 11 19:40 09-io.tex
-rw-r--r--   1 bruda staff       8 Nov 13 19:58 10-file.aux
-rw-r--r--   1 bruda staff   22472 Nov 13 19:58 10-file.dvi
-rw-r--r--   1 bruda staff   11939 Nov 13 19:58 10-file.log
-rw-r--r--   1 bruda staff   10523 Nov 13 19:58 10-file.tex
lrwxr-xr-x   1 bruda staff      14 Jan 11  2010 Makefile -> ../../Makefile
-rw-r--r--   1 bruda staff 8613458 Nov 11 15:14 ch12.pdf
drwxr-xr-x  15 bruda staff     510 Nov 11 19:47 figs/
< godel:409/slides >
RECORD BLOCKING
• On secondary storage, a file consists of a collection of blocks
◦ Inherent organization of the media or disk cache organization
• Records (user-level) are usually organized into blocks (OS-level)
◦ Originally because of disk organization (block = disk sector)
◦ But also central to storage optimization (disk cache, physical disk organization)
• Blocking schemes:
◦ Fixed-length = fixed-length records, fixed number of records per block
– May have internal fragmentation
◦ Variable-length spanned = variable-length records packed into blocks with no unused space (one record may span multiple blocks)
◦ Variable-length unspanned = variable-length records with no spanning
– Even more prone to internal fragmentation
FILE ALLOCATION
• The operating system or file management system is responsible for allocating blocks to files
• Free space management is also an important task (influenced by the approach taken
for file allocation)
• Space is allocated to a file as one or more portions (contiguous set of allocated
blocks)
• File allocation table (FAT) = data structure used to keep track of the portions assigned
to a file
• Allocation policies:
◦ Preallocation = allocates space for a (maximum) size for each file
– Maximum size difficult to establish, wasteful
◦ Dynamic allocation = allocates space to a file in portions, as needed
CONTIGUOUS FILE ALLOCATION
• Preallocation strategy
• Simple
• Random access
• Best from the point of view of an individual file
• But files cannot grow
• Used by several new file systems
CHAINED FILE ALLOCATION
• Block allocation, with each block containing a pointer to the next block
• The file allocation table needs just a single entry for each file
• No external fragmentation to worry about
• Simple
• Best for sequential files (no random access)
• Example: the FAT file system (DOS, OS/2)
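Chained allocation can be sketched with a table that maps each block to its successor, in the spirit of a FAT-style table where the links are kept apart from the data. The block numbers, the `END` marker and the function names below are assumptions made for illustration, not the on-disk format of any real file system.

```python
END = -1  # assumed end-of-chain marker


def file_blocks(fat, start):
    """Follow the chain from the file's first block and list every block."""
    blocks = []
    block = start
    while block != END:
        blocks.append(block)
        block = fat[block]
    return blocks


def nth_block(fat, start, n):
    """Find the n-th block (0-based); this costs n link traversals."""
    block = start
    for _ in range(n):
        block = fat[block]
        if block == END:
            raise IndexError("file has fewer than n + 1 blocks")
    return block


if __name__ == "__main__":
    # Hypothetical chain: 9 -> 16 -> 1 -> 10 -> 25
    fat = {9: 16, 16: 1, 1: 10, 10: 25, 25: END}
    print(file_blocks(fat, 9))
    print(nth_block(fat, 9, 3))
```

The per-file entry only needs the start block, and `nth_block` shows why random access is poor: reaching block n means walking n links first.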
INDEXED FILE ALLOCATION
• Need index table
• Random access
• Dynamic access without external fragmentation, but with the overhead of an index block
• Size of the file limited by the size of the index block
◦ To extend the maximum size we can use more index blocks or more indexing levels
DOUBLY INDEXED FILE ALLOCATION
[Figure: outer index → index table → file]
COMBINED INDEXED ALLOCATION: THE EXT2 INODE
[Figure: ext2 inode (4KB block size)]
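For a combined scheme of this kind, the number of blocks reachable from one inode follows from the block size and the pointer size. The sketch below assumes 12 direct pointers and 4-byte block numbers, which are typical of ext2 but are assumptions here rather than values taken from the slides; real file size limits can be lower because of other fields in the inode.

```python
def addressable_blocks(block_size, pointer_size=4, direct=12):
    """Blocks reachable through direct, single, double and triple indirection."""
    per_block = block_size // pointer_size  # pointers that fit in one index block
    return direct + per_block + per_block ** 2 + per_block ** 3


def addressable_bytes(block_size, pointer_size=4, direct=12):
    """Upper bound on file size implied by the pointer structure alone."""
    return addressable_blocks(block_size, pointer_size, direct) * block_size


if __name__ == "__main__":
    blocks = addressable_blocks(4096)
    print(blocks, "blocks,", addressable_bytes(4096) / 2 ** 40, "TiB")
```

With 4 KB blocks, one index block holds 1024 pointers, so the triple-indirect level dominates: it alone contributes 1024³ blocks.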
FREE SPACE MANAGEMENT
• To perform file allocation, it is necessary to know which blocks are available; a disk
allocation table is thus needed in addition to a file allocation table
• Methods:
◦ Bit tables: the disk allocation table is a bit vector with one bit for each block on
the disk
– A 0 bit corresponds to a free block, a 1 bit to a block in use
– Works well with any allocation method, table as small as possible
◦ Chained free portions: the free portions are chained together in a linked list
– Suited for all allocation methods, negligible space overhead (no disk allocation
table)
– But fragmentation, need to read a block before writing (i.e., allocating) it
◦ Indexing: treats free space as a file and uses indexed allocation on it
◦ Free block list: each block is assigned a number (24 or 32 bits), and a list of the numbers of free blocks is kept in a special area on disk
– Part of the list is brought in memory for efficiency reasons
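Of the methods above, the bit table is the easiest to sketch. The class below is a minimal illustration with a first-fit search; the class and method names are invented for this example, and a real implementation would scan whole words rather than single bits.

```python
class BitTable:
    """Disk allocation table with one bit per block: 0 = free, 1 = in use."""

    def __init__(self, n_blocks):
        self.n_blocks = n_blocks
        self.bits = bytearray((n_blocks + 7) // 8)  # all zero: every block free

    def in_use(self, block):
        return bool(self.bits[block // 8] & (1 << (block % 8)))

    def allocate(self):
        """Mark the first free block as used and return its number."""
        for block in range(self.n_blocks):
            if not self.in_use(block):
                self.bits[block // 8] |= 1 << (block % 8)
                return block
        raise RuntimeError("no free blocks")

    def free(self, block):
        """Clear the block's bit so it can be allocated again."""
        self.bits[block // 8] &= ~(1 << (block % 8)) & 0xFF


if __name__ == "__main__":
    table = BitTable(10)
    first, second = table.allocate(), table.allocate()
    table.free(first)
    print(first, second, table.allocate())
```

Ten blocks need only two bytes of table, which is the sense in which the table is as small as possible.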
UNIX FILE MANAGEMENT
• Several file types: regular, directory, special (contains no data, associated with a device), named pipe, link, symbolic link
• All types of Unix files are administered by the OS by means of inodes
◦ An inode (index node) is a control structure that contains the key information needed by the operating system for a particular file
◦ Several file names may be associated with a single inode
◦ An active inode is associated with exactly one file
◦ Each file is controlled by exactly one inode
• File allocation is indexed, with part of the index stored in the inode
◦ In traditional UNIX file systems the inode includes a number of direct pointers and three indirect pointers (single, double, triple)
VOLUMES AND MOUNTING
• The file system does not reside on a physical disk, but on a logical volume
◦ A disk may hold multiple volumes
◦ The sectors on a volume do not even need to be consecutive on the physical storage, or even on the same media!
• A volume (i.e., file system) must be mounted before it can be accessed
• A file system is mounted at a mount point = some directory in the set of already mounted file systems
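Because every mounted file system hangs off some directory, the mount point that holds a given file can be found by walking up the tree. This is a small sketch using Python's `os.path.ismount`; the function name `mount_point` is made up for the example.

```python
import os


def mount_point(path):
    """Walk up from `path` until a directory that is a mount point is found."""
    path = os.path.realpath(path)  # absolute path with symbolic links resolved
    while not os.path.ismount(path):
        path = os.path.dirname(path)
    return path  # the walk ends at the root at the latest


if __name__ == "__main__":
    print(mount_point(os.getcwd()))
```

The loop always terminates because the root directory is itself a mount point.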
UNIX VOLUMES
• Directories are structured in a hierarchical tree
• Each directory can contain files and/or other directories
• A Unix volume is laid out with the following elements:
◦ Boot block: contains code required to boot the operating system
◦ Superblock: contains attributes and information about the file system
◦ Inode table: collection of inodes, one for each file
◦ Data block: storage space available for data files and subdirectories
THE LINUX VIRTUAL FILE SYSTEM (VFS)
• Presents a single, uniform file system interface (API and ABI) to user processes
• Defines a common file model that is capable of representing any conceivable file system’s general features and behavior
• Assumes files are objects that share basic properties regardless of the target file system or the underlying processor hardware
• Allows the use of virtual file systems that behave like file systems but do not physically exist on disk
◦ Example: the /proc file system that presents an interface to all processes
PRIMARY OBJECT TYPES IN VFS
• Superblock object represents a specific mounted file system
• Inode object represents a specific file
• Dentry object represents a specific directory entry
• File object represents an open file associated with a process
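The distinction between directory entries, inodes and open files can be observed from user space. In the sketch below, two names are linked to one file, the inode number reported by the system is the same for both, and the data survives the removal of one name. The file names are hypothetical, and the example assumes a file system that supports hard links.

```python
import os
import tempfile


def link_demo():
    """Create a file, give it a second name, then remove the first name."""
    with tempfile.TemporaryDirectory() as d:
        first = os.path.join(d, "a.txt")    # hypothetical names
        second = os.path.join(d, "b.txt")
        with open(first, "w") as f:         # an open file, used briefly
            f.write("shared data\n")
        os.link(first, second)              # second directory entry, same inode
        same_inode = os.stat(first).st_ino == os.stat(second).st_ino
        links_before = os.stat(first).st_nlink
        os.unlink(first)                    # removes one name only
        with open(second) as f:
            data = f.read()
        links_after = os.stat(second).st_nlink
        return same_inode, links_before, links_after, data


if __name__ == "__main__":
    print(link_demo())
```

The link count drops from 2 to 1 when the first name is removed; the file's storage is released only when the last name is gone and no process holds the file open.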