Supplementary Lecture 5

cs4411 – Operating Systems Practicum
December 2, 2011
Zhiyuan Teo
Our last lecture…
•Administrative information
•General implementation hints
•A review of things we have done
•Where do we go from here?
•Offline Q&A session
Administrative Information
•CS4410 MP4 is optional for 4411 students.
•The zipfile for Project 6 has been updated on CMS.
Please use our new disk.c file.
•The deadline for Project 6 has been extended to 11:59pm on
Sunday, 11 December.
Administrative Information
•We have been busy catching up on research and final
reports.
- We promise to honor all regrade requests.
- We want to help you!
•One final round of office hours on Monday, 5 December
from 3pm – late.
Before we begin…
•These slides generally reveal implementation hints.
- You do not have to follow the implementation we describe here!
- Consider following the hints only if you are stuck.
•Focus on correctness first, then performance later.
•The only concurrency we will test for is simultaneous
writes.
- But ideally, we want you to design the file system to be correctly concurrent
under any kind of workload.
•inode = “eye node”, not “ee node”.
- Don’t embarrass yourself and your instructors!
Before we begin…
•mkfs (and fsck) should be minithread programs.
- You need to compile them as separate programs.
- Don’t make mkfs or fsck function calls in your minifile implementation.
Getting started
•How do the global disk variables work?
- They are set inside main(), before minithread_system_initialize() is called.
int main(int argc, char** argv) {
    /* Set the disk globals before the system is initialized. */
    use_existing_disk = 0;
    disk_name = "disk0";
    disk_flags = DISK_READWRITE;
    disk_size = 1000;
    minithread_system_initialize(entrypoint, NULL);
    return 0;
}

void minithread_system_initialize(proc_t mainproc, arg_t arg) {
    /* The disk is initialized here, using the globals set in main(). */
    disk_initialize(&disk);
    install_disk_handler(disk_handler);
}
On-disk data structures
•Keep things simple: 1 disk block per item.
- One disk block for superblock.
- One disk block for each inode.
- Don’t share disk blocks.
•Concurrency-related structures should not be on disk.
- Reference counters, locks etc.
•Pointers stored on disk are disk block numbers.
The big picture
[Figure: the on-disk block types and the fields each one holds]
•superblock: magic no., size of disk, root inode, first free inode, first free data block
•dir inode: type, size, direct ptrs, indirect ptr
•dir data: (name, block no.) entries
•file inode: type, size, direct ptrs, indirect ptr
•file data: data
•free block: next free block
The big picture
[Figure: an example disk layout]
•superblock (blk 0): magic no.: 4411, size: 1000 blks, root inode: 1, 1st free inode: 3, first free data block
•dir inode (blk 1): DIR_INODE, size: 3 ents., dir ptr 1: 100, unused ptrs: 0
•file inode (blk 2): FILE_INODE, size: 12 bytes, dir ptr 1: 101, unused ptrs: 0
•free block (blk 3): nx free inode: 4; the next free block has nx free inode: 5; …; last free inode block (blk 99): 0
•dir data (blk 100): ".." at inode 1, "." at inode 1, "abc.txt" at inode 2
•file data (blk 101): Hello world!
Superblock
superblock fields: magic number, size of disk, root inode, first free inode, first free data block
•Use disk block 0 for the superblock.
•Root inode field should contain the value 1.
- Since that inode is located at disk block 1.
Inodes
inode fields: inode type, size, direct ptr 1, direct ptr 2, …, direct ptr n, indirect ptr
•You can use the same structure for file and directory inodes.
•Size: number of directory entries if inode is a directory inode.
•Size: number of bytes of a file if inode is a file inode.
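To make the role of the direct and indirect pointers concrete, here is a minimal sketch that maps a byte offset in a file to the pointer slot that covers it. Everything named here is an assumption for illustration: the constant values, the `ptr_location` type, the `locate_block` function, and in particular the layout in which each indirect block holds direct pointers followed by one pointer to a further indirect block (one possible way to avoid capping file size).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values for illustration only; use your project's constants. */
#define DISK_BLOCK_SIZE 4096
#define TABLE_SIZE 11   /* direct pointers held in the inode itself */
/* Assumed layout: an indirect block stores 4-byte block numbers, and its
   last slot points to the next indirect block in the chain. */
#define DIRECT_PER_INDIRECT (DISK_BLOCK_SIZE / 4 - 1)

/* Hypothetical result type: which pointer table covers a file offset. */
typedef struct {
    uint32_t hops;  /* 0 = the inode's direct table, k = k-th indirect block */
    uint32_t slot;  /* index of the direct pointer within that table */
} ptr_location;

void locate_block(uint32_t offset, ptr_location *out)
{
    assert(out != NULL);
    uint32_t index = offset / DISK_BLOCK_SIZE;  /* logical block within file */

    if (index < TABLE_SIZE) {
        out->hops = 0;
        out->slot = index;
        return;
    }
    index -= TABLE_SIZE;  /* skip the blocks covered by the inode itself */
    out->hops = 1 + index / DIRECT_PER_INDIRECT;
    out->slot = index % DIRECT_PER_INDIRECT;
}
```

A reader or writer would follow `hops` indirect pointers starting from the inode and then use entry `slot` of the table it lands on. A design with double or triple indirect pointers would replace the division by a different index calculation.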
Directory data blocks
directory data block: a table of (name, inode ptr) entries
•This is just a table with 2 columns.
- Implement it as two arrays.
•Directory data blocks are stored in the region of disk reserved for data blocks.
•You can’t tell from this table if a certain entry is a file or a directory.
- But you can easily access that entry’s inode and look up its type.
•No indirect pointers in this block.
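The two-array table above lends itself to a simple linear search. The sketch below is one way to look up a name; the `dir_table` type, the `dir_lookup` and `unpack_u32` names, the table size, and the big-endian encoding of the 4-byte inode pointers are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed sizes: 15 entries of (256 + 4) bytes fit in a 4096-byte block. */
#define DIR_TABLE_SIZE 15
#define NAME_LEN 256

/* Hypothetical in-memory view of one directory data block. */
typedef struct {
    char names[DIR_TABLE_SIZE][NAME_LEN];         /* column 1: entry names */
    unsigned char inode_ptrs[DIR_TABLE_SIZE][4];  /* column 2: inode block no. */
} dir_table;

/* Decode a 4-byte big-endian integer (assumed on-disk encoding). */
static uint32_t unpack_u32(const unsigned char *buf)
{
    return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
}

/* Search the first `count` entries for `name`. Returns the inode block
   number, or 0 if the name is absent. Block 0 is the superblock, so 0 is
   never a valid inode pointer. */
uint32_t dir_lookup(const dir_table *dir, uint32_t count, const char *name)
{
    assert(dir != NULL && name != NULL);
    for (uint32_t i = 0; i < count && i < DIR_TABLE_SIZE; i++) {
        if (strncmp(dir->names[i], name, NAME_LEN) == 0)
            return unpack_u32(dir->inode_ptrs[i]);
    }
    return 0;
}
```

The caller would then read the returned inode block and check its type field to learn whether the entry is a file or a directory.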
Free blocks
free block
ptr to next free block
•Use the same data structure for free
inodes and data blocks.
•Just store an integer that points to
the next free block.
•If the next free block says 0, there
are no more free blocks after this.
•Returning blocks to the free list:
prepend and modify superblock.
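The free-list behaviour described above (take from the head, prepend on release) can be sketched with an in-memory stand-in for the disk. This is an illustration under stated assumptions: the `free_list` type, `free_list_alloc`, `free_list_release`, and `NUM_BLOCKS` are hypothetical, and a real implementation would read and write the superblock and the free blocks on disk instead of touching arrays.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_BLOCKS 8  /* tiny simulated disk, for illustration only */

/* Simulated state. On a real disk, `head` is a superblock field and
   next_free[b] is the integer stored inside free block b. */
typedef struct {
    uint32_t head;                   /* first free block, 0 if none */
    uint32_t next_free[NUM_BLOCKS];  /* 0 marks the end of the list */
} free_list;

/* Take the first free block off the list. Returns 0 if the list is empty. */
uint32_t free_list_alloc(free_list *fl)
{
    uint32_t blk = fl->head;
    if (blk == 0)
        return 0;
    fl->head = fl->next_free[blk];  /* superblock now points past blk */
    fl->next_free[blk] = 0;
    return blk;
}

/* Return a block to the list by prepending it and updating the head. */
void free_list_release(free_list *fl, uint32_t blk)
{
    assert(blk != 0 && blk < NUM_BLOCKS);
    fl->next_free[blk] = fl->head;
    fl->head = blk;
}
```

Both operations touch only the head of the list, so each costs one block update plus one superblock update regardless of how many blocks are free.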
Data structures for blocks
•You can apply a trick we’ve seen before.
struct superblock {
    union {
        struct {
            char magic_number[4];
            char disk_size[4];
            char root_inode[4];
            char first_free_inode[4];
            char first_free_data_block[4];
        } data;
        char padding[DISK_BLOCK_SIZE];  /* pads the struct to one disk block */
    };
};
Data structures for blocks
•You can apply a trick we’ve seen before.
struct inode {
    union {
        struct {
            char inode_type;
            char size[4];
            char direct_ptrs[TABLE_SIZE][4];
            char indirect_ptr[4];
        } data;
        char padding[DISK_BLOCK_SIZE];
    };
};
Data structures for blocks
•You can apply a trick we’ve seen before.
struct dir_data_block {
    union {
        struct {
            char dir_entries[TABLE_SIZE][256];  /* entry names */
            char inode_ptrs[TABLE_SIZE][4];     /* inode block no. per name */
        } data;
        char padding[DISK_BLOCK_SIZE];
    };
};
Data structures for blocks
•You can apply a trick we’ve seen before.
struct free_data_block {
    union {
        char next_free_block[4];
        char padding[DISK_BLOCK_SIZE];
    };
};
Benefits
•You can cast a pointer to the struct to char* and use it
directly in disk read and write operations.
- The struct is of size DISK_BLOCK_SIZE, so you will read/write exactly one block.
- Use pack and unpack functions to fill/read the fields.
•No need to worry about padding.
•Endianness-independent disk implementation.
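As a rough illustration of these benefits, the sketch below fills and reads one superblock field through pack and unpack helpers. The helper names `pack_u32` and `unpack_u32`, the big-endian byte order, the `DISK_BLOCK_SIZE` value, and the use of `unsigned char` for the fields are assumptions made for this example; your project may already provide equivalent helpers under other names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DISK_BLOCK_SIZE 4096  /* assumed value; use the one from your disk.h */

/* Same shape as the superblock on the slides. The anonymous union is C11. */
struct superblock {
    union {
        struct {
            unsigned char magic_number[4];
            unsigned char disk_size[4];
            unsigned char root_inode[4];
            unsigned char first_free_inode[4];
            unsigned char first_free_data_block[4];
        } data;
        char padding[DISK_BLOCK_SIZE];  /* forces sizeof == one disk block */
    };
};

/* Store v as 4 bytes, most significant first (assumed on-disk order). */
void pack_u32(unsigned char *buf, uint32_t v)
{
    buf[0] = (unsigned char)(v >> 24);
    buf[1] = (unsigned char)(v >> 16);
    buf[2] = (unsigned char)(v >> 8);
    buf[3] = (unsigned char)v;
}

/* Inverse of pack_u32: the result does not depend on host endianness. */
uint32_t unpack_u32(const unsigned char *buf)
{
    return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
}
```

Because every field is a byte array, the compiler inserts no alignment padding between fields, and a `(char*)` pointer to the struct can be handed to the disk read and write calls as a one-block buffer.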
Variations
•Remember: you don’t have to follow our suggestions.
- As long as your file system is reasonable and concurrent.
- Describe your implementation in the README file.
•You can use single/double/triple indirect pointers,
similar to Linux.
•You can use a bitmap instead of a free list.
•You may want to use different structures for blocks.
Unacceptable variations
•Capping the number of directory entries or a file’s size.
- But you can assume there will not be more than 2^32 directory entries.
- File sizes will not exceed 2^32 bytes (4 GB).
•Storing names in inodes.
•Storing directory data or indirect blocks in the
inode-reserved section of the disk.
Concurrency
•Create some in-memory protection structures.
- They have to be dynamically allocated since disk_size is a variable.
•Our suggestion: one ‘big lock’ for metadata accesses
that can potentially span multiple inodes.
•One lock per inode for file updates.
- Lock the inode while reading or writing, but release it as soon as
you can.
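One way to lay out these in-memory protection structures is a single metadata lock plus a dynamically allocated array of per-inode locks. The sketch below uses POSIX mutexes purely as a stand-in so that it compiles on its own; in the project you would substitute your own minithreads synchronization primitive. The `fs_locks` type and all function names are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical container for all in-memory locks. None of this is stored
   on disk. */
typedef struct {
    pthread_mutex_t metadata_lock;  /* the 'big lock' for metadata accesses */
    pthread_mutex_t *inode_locks;   /* one lock per inode, indexed by inode */
    int num_inodes;
} fs_locks;

/* Allocate the table at run time, since the disk size is only known then.
   Returns 0 on success, -1 on failure. */
int fs_locks_init(fs_locks *l, int num_inodes)
{
    if (num_inodes <= 0)
        return -1;
    l->inode_locks = malloc((size_t)num_inodes * sizeof *l->inode_locks);
    if (l->inode_locks == NULL)
        return -1;
    pthread_mutex_init(&l->metadata_lock, NULL);
    for (int i = 0; i < num_inodes; i++)
        pthread_mutex_init(&l->inode_locks[i], NULL);
    l->num_inodes = num_inodes;
    return 0;
}

void inode_lock(fs_locks *l, int inode)
{
    assert(inode >= 0 && inode < l->num_inodes);
    pthread_mutex_lock(&l->inode_locks[inode]);
}

void inode_unlock(fs_locks *l, int inode)
{
    assert(inode >= 0 && inode < l->num_inodes);
    pthread_mutex_unlock(&l->inode_locks[inode]);
}

void fs_locks_destroy(fs_locks *l)
{
    for (int i = 0; i < l->num_inodes; i++)
        pthread_mutex_destroy(&l->inode_locks[i]);
    pthread_mutex_destroy(&l->metadata_lock);
    free(l->inode_locks);
    l->inode_locks = NULL;
    l->num_inodes = 0;
}
```

Operations that can span several inodes, such as creating or unlinking a file, would take `metadata_lock`; a plain read or write would take only the lock of the inode involved and drop it as soon as the block update is done.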
Need more implementation hints?
•The Design of the UNIX Operating System, Maurice J.
Bach
•Contains lots of low-level, step-by-step instructions.
•Credits to Ashik, Christopher and Sean for contributing
this gold nugget.
We’ve come to the end.
•This was not an easy class.
•Hindsight is always perfect.
•You built an OS!
•Very few students in the world have an opportunity to
practice extensive design of real OS components.
My parting quote
“Good design comes from experience. Experience
comes from bad design.”
-Theodore von Karman
It is our great pleasure to have accompanied you on a part
of your academic journey.