File Access in C

Accessing Files in C
CS-2303
System Programming Concepts
(Slides include materials from The C Programming Language, 2nd edition, by Kernighan and Ritchie and
from C: How to Program, 5th and 6th editions, by Deitel and Deitel)
CS-2303, C-Term 2010
Two Kinds of File Access
• Stream
• File is treated as a sequence of bytes
• Access is sequential – i.e., in byte order
• Cannot insert or delete data in the middle of a file; existing bytes can only be overwritten in place
• Raw
• File is a sequence of blocks
• Any block can be read and/or written independently
Definition – File
• A (potentially) large amount of information or data
that lives a (potentially) very long time
• Often much larger than the memory of the computer
• Often much longer than any computation
• Sometimes longer than life of machine itself
• (Usually) organized as a linear array of bytes or
blocks
• Internal structure is imposed by application
• (Occasionally) blocks may be variable length
• (Often) requiring concurrent access by multiple
threads or processes
• Even by processes on different machines!
Implementations of Files
• Usually on disks (or devices that mimic disks)
• Magnetic – hard drive or floppy
• Optical – CD, DVD
• Flash drives – electronic memory, organized as disks
• Requirement
• Preserve data contents during power-off or disasters
• Directory / Folder
• Special kind of file that contains links pointing to other files
• Associates names with files
Organization of File Systems
• Contiguous
• Blocks stored contiguously on storage medium
• E.g., CD, DVD, some large database systems
• Access time to any block is O(1)
• Linked
• Blocks linked together – File Allocation Table (FAT)
• Access time is O(n)
• Indexed
• Blocks accessed via tree of index blocks (i-nodes)
• Access time is O(log n)
• However, base of logarithm may be very large (>100)
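To make the large logarithm base concrete, here is a minimal sketch that counts how many index levels a tree needs to reach a given number of data blocks. The function name `index_levels` and the fan-out values are assumptions for illustration only; real i-node layouts mix direct and indirect pointers and will differ.

```c
#include <assert.h>
#include <limits.h>

/* Number of index levels needed so that a tree whose index blocks each
 * hold `fanout` pointers can address `nblocks` data blocks.
 * This is ceil(log_fanout(nblocks)), computed with integers only.
 * (Hypothetical helper, not taken from any real file system.) */
unsigned index_levels(unsigned long long nblocks, unsigned long long fanout)
{
    assert(fanout > 1);              /* need at least 2 pointers per block */

    unsigned levels = 0;
    unsigned long long reach = 1;    /* blocks addressable with `levels` levels */

    while (reach < nblocks) {
        if (reach > ULLONG_MAX / fanout) {
            /* reach * fanout would overflow, so it certainly covers nblocks */
            levels++;
            break;
        }
        reach *= fanout;
        levels++;
    }
    return levels;
}
```

With an assumed fan-out of 128 pointers per index block, `index_levels(1000000ULL, 128)` needs only 3 levels, whereas a binary tree (`fanout` of 2) over the same million blocks needs 20. A linked organization would instead follow up to a million links to reach the last block.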
Stream File Access
• Familiar tools
• fopen(), fclose()
• fgetc(), fgets(), fputc(), fputs(), fscanf(), fprintf(), ...
• Not so familiar
• fread(), fwrite(), fseek(), ftell(), rewind(), fgetpos(), fsetpos()
• Declared in <stdio.h>
• All except fopen() take a FILE * argument to identify the file; fopen() returns that FILE *
• Note: if you seek to a position in a file opened for update ("r+") and start writing, the bytes at that position are overwritten in place; the file is truncated only when it is opened with "w" or "w+"
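The effect of seeking and then writing can be checked directly. The following is a minimal sketch, assuming a POSIX-like system where ftell() on a stream positioned at the end gives the length in bytes; the helper names `patch_in_place` and `read_back` are hypothetical. It writes a file, reopens it in "r+" mode, overwrites a few bytes after an fseek(), and reports the final length.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Create `path` containing `text`, then reopen it for update, seek to
 * `offset`, and overwrite the bytes there with `patch`.
 * Returns the final file length in bytes, or -1 on error. */
long patch_in_place(const char *path, const char *text,
                    long offset, const char *patch)
{
    assert(path != NULL && text != NULL && patch != NULL);

    FILE *fp = fopen(path, "w");            /* "w" creates or truncates */
    if (fp == NULL) return -1;
    if (fputs(text, fp) == EOF) { fclose(fp); return -1; }
    if (fclose(fp) == EOF) return -1;

    fp = fopen(path, "r+");                 /* update mode: no truncation */
    if (fp == NULL) return -1;
    size_t len = strlen(patch);
    if (fseek(fp, offset, SEEK_SET) != 0 ||
        fwrite(patch, 1, len, fp) != len ||
        fseek(fp, 0L, SEEK_END) != 0) {
        fclose(fp);
        return -1;
    }
    long length = ftell(fp);                /* position at end of file */
    if (fclose(fp) == EOF) return -1;
    return length;
}

/* Read up to size-1 bytes from the start of `path` into `buf` and
 * NUL-terminate. Returns the number of bytes read. */
size_t read_back(const char *path, char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    FILE *fp = fopen(path, "r");
    if (fp == NULL) { buf[0] = '\0'; return 0; }
    size_t n = fread(buf, 1, size - 1, fp);
    buf[n] = '\0';
    fclose(fp);
    return n;
}
```

Calling `patch_in_place("demo.txt", "hello world", 6, "WORLD")` should leave an 11-byte file reading `hello WORLD`: the write replaced bytes in place and the length did not change. Opening the same file with "w" instead of "r+" would have emptied it before the seek.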
Raw File Access
• See Kernighan & Ritchie, Chapter 8
• Raw file access
• Without simplifying stream functions – e.g.,
– scanf, fscanf, printf, fprintf, fgetc, etc.
• read and write raw disk blocks
• Seek to a file position
– lseek sets the file position to the specified location (fseek is the stream equivalent)
– Subsequent read, write, etc., start there
– lseek(fd, 0, SEEK_CUR) returns the current position (ftell is the stream equivalent)
Raw File Access (continued)
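#include <fcntl.h>    /* open(), creat(), O_RDONLY, O_WRONLY, ... */
#include <unistd.h>   /* read(), write(), lseek(), close() */
int fd;               /* file descriptor returned by open() or creat() */
int open(const char *name, int flags, ... /* mode_t perms */);
int creat(const char *name, mode_t perms);
ssize_t read(int fd, void *buf, size_t n);         /* returns bytes read, 0 at EOF */
ssize_t write(int fd, const void *buf, size_t n);  /* returns bytes written */
off_t lseek(int fd, off_t offset, int origin);     /* origin: SEEK_SET, SEEK_CUR, SEEK_END */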
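As an illustration of how these calls fit together, here is a minimal sketch of block-oriented access: seek to block k, then read or write exactly one block. The 512-byte `BLOCK_SIZE` and the helper names `write_block` and `read_block` are assumptions for the example, not part of any standard API.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK_SIZE 512   /* assumed block size, for illustration only */

/* Write one block at block number `blockno` (0-based), creating the
 * file if needed. Returns 0 on success, -1 on error. */
int write_block(const char *path, long blockno, const char buf[BLOCK_SIZE])
{
    assert(blockno >= 0);
    int fd = open(path, O_WRONLY | O_CREAT, 0644);
    if (fd < 0) return -1;

    int ok = lseek(fd, (off_t)blockno * BLOCK_SIZE, SEEK_SET) != (off_t)-1
          && write(fd, buf, BLOCK_SIZE) == BLOCK_SIZE;
    if (close(fd) < 0) ok = 0;
    return ok ? 0 : -1;
}

/* Read one full block at block number `blockno` into `buf`.
 * Returns 0 on success, -1 on error or if the block is not fully present. */
int read_block(const char *path, long blockno, char buf[BLOCK_SIZE])
{
    assert(blockno >= 0);
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;

    int ok = lseek(fd, (off_t)blockno * BLOCK_SIZE, SEEK_SET) != (off_t)-1
          && read(fd, buf, BLOCK_SIZE) == BLOCK_SIZE;
    close(fd);
    return ok ? 0 : -1;
}
```

Blocks can be written in any order; for example, writing block 2 before block 0 works, and the existing contents of the other blocks are left alone. This sketch treats a short read as an error for simplicity, although read() may legitimately return fewer bytes than requested on some kinds of files.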
Raw File Access (continued)
• Also functions for listing directories, adding
things to directories, linking files, etc.
• All are essentially calls to the OS
• Consult man pages for details
• man 3p open
• man 3p read
• man 3p write
• ...
• The 3p pages document the POSIX standard: most likely to be portable across platforms
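As one example of the directory functions mentioned above, the following minimal sketch lists a directory with the POSIX calls opendir(), readdir(), and closedir(). The helper name `count_entries` is hypothetical.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Print the name of every entry in directory `path`, skipping "." and
 * "..", and return how many were printed. Returns -1 if the directory
 * cannot be opened. */
int count_entries(const char *path)
{
    assert(path != NULL);

    DIR *dir = opendir(path);
    if (dir == NULL) return -1;

    int count = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        if (strcmp(entry->d_name, ".") == 0 ||
            strcmp(entry->d_name, "..") == 0)
            continue;                  /* skip self and parent links */
        printf("%s\n", entry->d_name);
        count++;
    }
    closedir(dir);
    return count;
}
```

Calling `count_entries(".")` prints the names in the current directory. The order of entries is whatever the file system returns and is not sorted.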
Streams
• Stream access is a layer of abstraction on
top of raw access
• See K&R §8.1–8.6 for example implementations
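The layering can be sketched in a few lines. The following is a simplified illustration in the spirit of the K&R examples, not their code: a buffered character reader built directly on read(). The names `MyStream`, `my_attach`, `my_getc`, `MY_EOF`, and the buffer size are all hypothetical.

```c
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

#define MY_BUFSIZ 1024   /* assumed buffer size */
#define MY_EOF    (-1)   /* returned at end of file or on error */

typedef struct {
    int     fd;               /* underlying raw file descriptor */
    char    buf[MY_BUFSIZ];   /* bytes fetched by the last read() */
    char   *next;             /* next unread byte in buf */
    ssize_t left;             /* unread bytes remaining in buf */
} MyStream;

/* Attach a stream to an already-open file descriptor. */
void my_attach(MyStream *s, int fd)
{
    assert(s != NULL);
    s->fd = fd;
    s->next = s->buf;
    s->left = 0;
}

/* Return the next byte as an unsigned char value, or MY_EOF.
 * One read() system call refills the whole buffer, so most calls
 * are served from memory without entering the OS. */
int my_getc(MyStream *s)
{
    assert(s != NULL);
    if (s->left <= 0) {
        s->left = read(s->fd, s->buf, sizeof s->buf);
        if (s->left <= 0) return MY_EOF;
        s->next = s->buf;
    }
    s->left--;
    return (unsigned char)*s->next++;
}
```

The point of the buffer is cost: reading a file one character at a time with raw read() would make one system call per byte, while this sketch makes roughly one per 1024 bytes.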
Questions?