A Comparison of Two Distributed Systems: Amoeba & Sprite
By: Fred Douglis, John K. Ousterhout, M. Frans Kaashoek,
Andrew S. Tanenbaum
Dec. 1991
Introduction
the shift from time-sharing to multiple processors has motivated
the development of distributed operating systems
the paper compares two distributed systems, Amoeba and Sprite,
which have taken different approaches to design
comparing the two systems yields conclusions and observations
that can aid the design of future distributed systems
Why these two?
authors had significant experience with the development of both
systems: Tanenbaum worked on Amoeba, and Ousterhout worked on
Sprite
limiting the comparison to two systems allows a closer examination
Sections
1. Fundamental design philosophies
2. Relating philosophies to OS issues: kernel architectures,
communication, file systems, and process management
3. How issues have been addressed in other systems
4. Development history of Amoeba and Sprite, future directions
5. Conclusions
Design Philosophies
Common Goals:
both projects saw trend towards large numbers of powerful
yet inexpensive processors connected by high speed
networks
both focussed on two key issues:
shared storage
shared processing power
Design Philosophies
Shared storage: share secondary storage among all processors
without degrading performance
Shared processing power: allow collections of processors to be
harnessed by individual users so that apps could benefit from
large number of machines
Design Philosophies
Projects diverged on two philosophical grounds:
Amoeba's designers predicted that networked systems would soon
have many more processors than users, so the system should take
advantage of parallelism
Sprite assumed a more traditional model: the goal was to develop
technologies for implementing UNIX-like facilities (such as file
systems) on networked workstations, with the distributed
nature of the system not visible outside the kernel
Design Philosophies
Second philosophical difference: the way processes are
associated with processors
Sprite's approach: each user has a mostly private workstation,
and that user's processes normally execute on that workstation;
a pool of idle machines is available to offload work
Amoeba assumed computing power would be shared equally by
all users through a processor pool; processing is more
centralized than in Sprite
App Environment
Amoeba provides an object-based distributed system
each process or file is an object, identified by a capability
which includes a port (logical address)
objects know nothing about the location of the servers they
interact with
eases the task of writing distributed apps: provides automatic
stub generation for RPC and has its own language, Orca
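A minimal sketch of how addressing by capability can hide server location. The slides only say that a capability includes a port; the other fields (`object_number`, `rights`), the `PortRegistry` class, and `make_file_service` are hypothetical names introduced for illustration, not Amoeba's actual interface.

```python
from dataclasses import dataclass

READ_RIGHT = 0b1  # hypothetical rights bit


@dataclass(frozen=True)
class Capability:
    port: int           # logical address of the service managing the object
    object_number: int  # assumption: which object at that service
    rights: int         # assumption: bitmask of permitted operations


class PortRegistry:
    """Stand-in for the network: maps a port to whichever server listens on it."""

    def __init__(self):
        self._servers = {}

    def listen(self, port, handler):
        self._servers[port] = handler

    def rpc(self, cap, operation, *args):
        # The caller names only the capability, never a machine.
        return self._servers[cap.port](cap, operation, *args)


def make_file_service(store):
    def handler(cap, operation, *args):
        if operation == "read":
            if not cap.rights & READ_RIGHT:
                raise PermissionError("capability lacks read right")
            return store[cap.object_number]
        raise ValueError(f"unsupported operation: {operation}")
    return handler


net = PortRegistry()
net.listen(0x2A, make_file_service({7: b"hello"}))
cap = Capability(port=0x2A, object_number=7, rights=READ_RIGHT)
first_read = net.rpc(cap, "read")

# The service "moves" to another server; the client's capability is unchanged.
net.listen(0x2A, make_file_service({7: b"hello"}))
second_read = net.rpc(cap, "read")
```

Because the client holds only the capability, re-registering the port on a different handler does not require any change on the client side.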
App Environment
Sprite runs a network OS oriented around a shared file
system, modeled after UNIX (pros/cons, next slide)
Sprite emphasizes location-transparent file access and
consistent access to shared files
caches file data on client workstations so that many file
operations need no network transfers
no user-level support for protocols that communicate over the
network
App Environment
compatibility with UNIX allowed for early adoption of Sprite
most UNIX apps can be easily recompiled to run on Sprite
but compatibility also restricted its appeal for research
not a concern with Amoeba: it is only somewhat compatible, and
porting programs to Amoeba took a significant amount of work
Amoeba offers more flexibility in the design of new software and
more opportunities for research
Processor Allocation
workstation model: each host maintains list of its own processes,
the tasks executed on one machine
processor pool model: processors dynamically allocated to
processes regardless of host or location
Amoeba follows more closely the pool model while Sprite is
closer to the workstation model
Processor Allocation
Amoeba system consists of a processor pool, specialized servers,
and graphics terminals
unlike the full processor pool model, Amoeba uses processors
outside the pool for system services, which avoids contention
between user processes and system processes
Processor Allocation
Chose the processor pool model for three reasons:
believed processor and memory chips would continue to
decrease in price
cost of assembling pool cheaper than individual workstations
wanted to make system appear as a single time shared
system, users should not be concerned with physical
distribution of hardware
Processor Allocation
Sprite system consists of workstations and file servers
not a pure workstation model: although each user has guaranteed
priority on their own machine, Sprite provides a facility to
execute commands using the processing power of idle hosts
(commands appear as if they were run on the user's own workstation)
Processor Allocation
Chose workstation based model for three reasons:
workstations offered the opportunity to isolate system load: one
user would not be affected by high load created by another user
hypothesized that much of the power of newer machines would
be used for the UI, so put it close to the user, i.e. in the
workstation
saw little difference between graphics terminals and workstations
Kernel Architectures
Sprite follows UNIX: a monolithic kernel, with all kernel
functionality implemented in a single privileged address space
only shared kernel-level service is the file system
Reasons: performance of microkernels was unclear at the time
placing the kernel in one large address space made it possible to
share data structures and memory
Kernel Architectures
Amoeba implements a “microkernel”, with a minimal set of
services implemented in the kernel
Other services, such as file system and process placement are
provided by separate processes that may be accessed anywhere
in the system
some services, such as time, which each Sprite workstation
provides for itself, can be provided in Amoeba by a single
network-wide server
Kernel Architectures
Reasons for the microkernel
motivated by uniformity, modularity and extensibility
Since services are obtained through RPC, both kernel level and
user level services can be accessed through the same interface
users may extend or create their own services
Kernel Architectures
Performance?
RPC calls are slower than kernel calls: roughly 500 µs for a
local RPC vs. 70 µs for a kernel call
Amoeba's lack of swapping or paging improves performance
overall performance depends more on system characteristics such
as network speed and file caching than on microkernel vs.
monolithic design
Comm. Mechanisms
Amoeba presents whole system as collection of objects on each
of which a set of operations can be performed using RPC
Sprite also uses RPC for kernel to kernel communication
for user-level communication Sprite uses “pseudo-devices”, which
allow synchronous and asynchronous reads and writes through the
file system
Comm. Mechanisms
File System
Both provide single globally shared, location-transparent file
system
File System - Sprite
designed for file intensive applications
caches data on both clients and servers for high performance
provides traditional unix open-close-read-write interface with
naming and file access performed in the kernel
file servers are responsible for ensuring that processes see the
most recent data
if a server crashes, clients recover using an idempotent reopen
protocol
files are stored in blocks that may or may not be contiguous
File System - Amoeba
splits naming and access into two different servers
directory server
translates names into capabilities
no restriction on location of objects referenced by a directory
automatically replicates directory entries as they are created
File System - Amoeba
file server, known as Bullet Server
runs on dedicated machine
ops are read-file, create-file, delete-file
process can create a file, specify contents, but file cannot be
read until committed
once committed can be read by everyone
only permissible operations on a committed file are reading
and deletion
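The create/commit/read/delete rules above can be sketched as a small in-memory store. This is an illustration under stated assumptions, not the Bullet Server's interface: the class `ImmutableFileStore`, its method names, and the integer file ids are all hypothetical.

```python
class ImmutableFileStore:
    """Toy model: files become readable on commit and can never be modified."""

    def __init__(self):
        self._next_id = 0
        self._uncommitted = {}
        self._committed = {}

    def create_file(self, data: bytes) -> int:
        # Contents are specified at creation time.
        file_id = self._next_id
        self._next_id += 1
        self._uncommitted[file_id] = bytes(data)
        return file_id

    def commit(self, file_id: int) -> None:
        self._committed[file_id] = self._uncommitted.pop(file_id)

    def read_file(self, file_id: int) -> bytes:
        if file_id not in self._committed:
            raise PermissionError("file has not been committed")
        return self._committed[file_id]

    def delete_file(self, file_id: int) -> None:
        del self._committed[file_id]


def append_by_copy(store: ImmutableFileStore, file_id: int, extra: bytes) -> int:
    """A 'write' is a whole-file copy: read the old file, create and commit a new one."""
    new_id = store.create_file(store.read_file(file_id) + extra)
    store.commit(new_id)
    return new_id


store = ImmutableFileStore()
draft = store.create_file(b"v1")
store.commit(draft)
updated = append_by_copy(store, draft, b"+v2")
```

There is deliberately no write method: changing a committed file means producing a new file, which is why immutable files need additional services layered on top.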
File System - Amoeba
Bullet server also uses distributed garbage collection for memory
that has not been referenced in a long time
Amoeba permits replication of both files and directory entries
(which is more complicated and takes a performance hit)
Bullet server is simpler than Sprite’s file system but
working with immutable files requires additional services
writing to a file requires a whole-file copy
File System - Amoeba
since files are contiguous, the Bullet server cannot deal with
files larger than the size of its physical memory
Amoeba does no client-side file caching, so a file must be
transferred over the network each time it is read
File System - Performance
Process Management
Amoeba provides virtual memory and threading but does not
perform swapping or demand paging
to execute, calls exec_file, specifies name of file and
capabilities, no need to copy process state of parent (fork)
Sprite's process model follows BSD UNIX
supports demand paging, allows main memory on a server
to cache pages for clients
to execute, process forks itself and calls exec
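The fork-then-exec style of program invocation can be sketched with Python's POSIX wrappers. This is ordinary UNIX code run through the `os` module, not Sprite source, and the helper name `fork_and_exec` is an assumption for illustration. It needs a POSIX system.

```python
import os
import sys


def fork_and_exec(argv):
    """Run argv in a child process via fork followed by exec; return its exit code."""
    pid = os.fork()  # the child starts as a copy of the parent
    if pid == 0:
        try:
            os.execvp(argv[0], argv)  # replace the copied image with the new program
        finally:
            os._exit(127)  # reached only if exec failed
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)  # child was killed by a signal


exit_code = fork_and_exec([sys.executable, "-c", "raise SystemExit(3)"])
```

The copy made by `fork` is discarded as soon as `exec` runs, which is the overhead an interface like Amoeba's `exec_file` avoids by starting the new program directly.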
Process Management
Processor Allocation
Amoeba
a run server selects the processor that the process will run
on based on load and usage (not on the user)
each application can create as many processes as processors
and then the system time-shares each processor among all
processes using round robin
downside: not always the most efficient use of resources
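A minimal sketch of the placement and time-sharing policy described above. It assumes that "load" means run-queue length and that ties go to the first processor listed; the class `ProcessorPool` and its methods are hypothetical and do not reflect the run server's real interface.

```python
from collections import deque


class ProcessorPool:
    def __init__(self, processor_names):
        self.run_queues = {name: deque() for name in processor_names}

    def place(self, process):
        """Put the process on the least-loaded processor, whoever the user is."""
        target = min(self.run_queues, key=lambda name: len(self.run_queues[name]))
        self.run_queues[target].append(process)
        return target

    def next_to_run(self, name):
        """Round robin: run the process at the head, then move it to the tail."""
        queue = self.run_queues[name]
        process = queue.popleft()
        queue.append(process)
        return process


pool = ProcessorPool(["cpu0", "cpu1"])
placements = [pool.place(p) for p in ("a", "b", "c")]
schedule = [pool.next_to_run("cpu0") for _ in range(3)]
```

With three processes on two processors, one processor is time-shared between two processes while the other runs one, which hints at why this policy is not always the most efficient use of resources.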
Processor Allocation
Sprite assumes one to one mapping between user and
workstation
processes usually run on user’s machine, but can migrate to
idle machines
downside: users can overload their own workstations
Related Work
Other Distributed Systems:
V System: workstation model similar to Sprite, but provides
most system services at user level, and uses multicast to
communicate
Chorus: microkernel and message passing, uses capabilities
and ports like Amoeba
Plan 9: Like Amoeba, but uses a small number of
multiprocessor machines
Project Evolution - Amoeba
began in 1981 as a student’s PhD research
as of 1991, used in the European space industry for transmission
of real-time digital video over LANs
used at universities for projects involving distributed and
parallel computing; free for both companies and schools
Project Evolution - Amoeba
Current research concentrated on:
Parallel applications and improving the Orca language
improving RPC
distributed shared memory
Wide-area transparent systems (having Amoeba machines in
different countries, e.g. a user in NY could have a processor
pool in Amsterdam)
Project Evolution - Sprite
began in 1984; as of 1991, over 50 users
used in OS research, computer-aided design, and computer
architecture
most use Sprite as though it were UNIX but take advantage of
migration and file caching
Project Evolution - Sprite
New research
Log-structured file systems (LFS), new approach to disk
storage where only structure on disk is append-only log
Striping files: improve bandwidth of large file accesses by
spreading over multiple disks
Buffering: more sense than caching?
Reliability: system state after server crashes
Mach - microkernel
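The file-striping item in the list above can be illustrated with the arithmetic that maps a byte offset in a file to a disk. This sketch assumes simple round-robin striping with a fixed stripe size; the slides do not say which layout Sprite's striping research used, and `locate_stripe` is a hypothetical helper.

```python
def locate_stripe(offset, stripe_size, num_disks):
    """Map a byte offset in a striped file to (disk index, byte offset on that disk)."""
    stripe_index = offset // stripe_size  # which stripe unit holds the byte
    disk = stripe_index % num_disks       # units are dealt out round robin
    offset_on_disk = (stripe_index // num_disks) * stripe_size + offset % stripe_size
    return disk, offset_on_disk


# Consecutive stripe units of a large read land on different disks.
disks_touched = [locate_stripe(i * 4096, 4096, 4)[0] for i in range(6)]
```

Because consecutive stripe units fall on different disks, a large sequential access can be served by several disks in parallel, which is where the bandwidth gain comes from.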
Conclusions
Amoeba helps disprove notion that microkernels are inferior to
monolithic kernels
Amoeba demonstrates desirability of uniform communication
model
same RPC interface whether at user or kernel level
Sprite demonstrates benefits of client caching for file intensive
applications
Conclusions
comparison between Amoeba and Sprite suggests advantages of
a hybrid system containing both workstations and processor
pools
compatibility with UNIX is a double-edged sword
increases acceptability
doing all communication through the kernel hurts performance
UNIX process model (context switching, program invocation)
is also slow
Conclusions
Improvements to Amoeba
improve UNIX compatibility libraries
Improvements to Sprite
improve context switching and scheduling