MUOS Outcome 1 Theory

UNIT NO:
D75P 34
UNIT TITLE:
Computer Architecture
Session 2003 - 2004
Outcome 2
Demonstrate an understanding of the functions of
computer system components
All materials © Aberdeen College 2002 unless stated otherwise.
May contain reference to external websites outwith the control of Aberdeen College.
All comments to: [email protected]
Computing (TN3)
Engineering, Computing and Business Studies
Awarded for excellence
ABCDE
Engineering, Computing and Business Studies:
Computer Architecture (D75P34)
Week 1. Main Components of a Computer System.
A computer can be represented in a block diagram like the one below. The four main functional
blocks are the central processing unit, memory, input and output. The input and output
devices, often called peripherals, are used to input and output instructions and data.
In the course of this unit we will look at each of these components in turn. For the first session,
we will concentrate on the CPU, more often called just the "processor" or "chip".
Classifying Processors.
Here are four different methods of describing the "type" of a processor.
Clock Speed. This tells us how many times the clock "ticks" per second - early chips could
run at a staggering 4.77 MHz, now a more acceptable speed is 2 GHz or more.
Processor Size. There are two methods of defining this. The first is to use the register size
- i.e. whatever size the internal registers are. These are usually 8, 16, 32 or 64 bits - most
commonly 16 or 32. Processors with 16-bit registers are called 16-bit processors. Modern
derivatives of the 80x86 family, Motorola 68000s and most RISC chips have 32-bit
registers.
We define a 32-bit processor as having a 32-bit word size. The greater the number of bits,
the more powerful the processor should be, because it can process a larger amount of
information in one operation. For example, a 32-bit processor can add two 32-bit numbers at
once; an 8-bit processor can only add two 8-bit numbers at once. Theoretically it can also
transfer 32 bits to/from memory at once. The actual performance of any processor,
however, depends on many different factors (size is not everything!)
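To see why word size matters, here is a minimal Python sketch (purely illustrative - the function names are ours, not any real processor's) of how an 8-bit processor would have to add two 32-bit numbers: one byte at a time, carrying between the four separate additions.

```python
def add_bytes(a, b, carry_in=0):
    """Add two 8-bit values plus a carry-in, as an 8-bit ALU would."""
    total = a + b + carry_in
    return total & 0xFF, total >> 8          # (8-bit result, carry out)

def add_32bit_on_8bit_cpu(x, y):
    """Add two 32-bit numbers a byte at a time, propagating the carry."""
    result, carry = 0, 0
    for i in range(4):                       # four separate 8-bit operations
        xa = (x >> (8 * i)) & 0xFF
        ya = (y >> (8 * i)) & 0xFF
        s, carry = add_bytes(xa, ya, carry)
        result |= s << (8 * i)
    return result & 0xFFFFFFFF

print(hex(add_32bit_on_8bit_cpu(0x12345678, 0x11111111)))  # 0x23456789
```

A 32-bit processor performs the same addition in a single operation; the 8-bit machine needs four, which is one reason the wider chip should be faster.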
September 2003
2
Note that on a 16-bit processor you can also work with "long words", which are 32 bits.
Bus Size. Another method is to classify the processor using its data bus size. In this case
a 16-bit processor means one that uses a 16-bit data bus (although the register sizes might be
entirely different). The CPU transfers 16 bits in one operation. The simple Intel model that
we will begin with is an 8-bit processor, because it uses an 8-bit data bus with 16-bit
registers. The Motorola 68000 has a 16-bit bus and 32-bit registers. Sometimes these
are called 8/16 and 16/32 processors!
The data bus width is important because it helps determine how fast data can be transferred
to and from the CPU. An Intel 8088 has to transfer two lots of 8 bits to fill a 16-bit register.
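The relationship between register size and bus size can be put into a one-line calculation - a hedged sketch in Python (the function name is our own):

```python
import math

def bus_cycles(register_bits, data_bus_bits):
    """Number of bus transfers needed to fill one register."""
    return math.ceil(register_bits / data_bus_bits)

# 8088: 16-bit registers over an 8-bit data bus -> two transfers
print(bus_cycles(16, 8))
# 68000: 32-bit registers over a 16-bit data bus -> two transfers
print(bus_cycles(32, 16))
```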
Instruction Set. We will later examine how any processor has a fixed list of commands that
it can respond to, and how this list can vary between processors. Processors roughly fall into two
groups, called CISC (Complex Instruction Set Computer) and RISC (Reduced Instruction Set
Computer). There are also "hybrid" designs (sometimes called CRISC) which fall somewhere
between the two. We will examine later the relative advantages and disadvantages of each type,
but a rough guide is that a CISC chip will support a large number of instructions (some have 300
or more) while a reduced set of course has far fewer (a typical RISC chip has about 30).
The Motorola and Intel derivatives are all CISC chips; SUN SPARCs, Acorns (with their ARM
chips), MIPS machines, some PDAs and Nokia mobile phones are all examples of RISC
designs. RISC instructions typically operate on 32-bit registers.
This photo shows a Pentium III
processor - the chip itself is the tiny
rectangle in the middle of the ceramic
square, which helps to dissipate heat. Also
shown is the fan, which is bolted on top
and runs continually while the
processor is in use. The actual chip
measures less than 2 cm across.
Photo © C Nyssen 2002
Internal Components of the CPU.
[Block diagram: the internal components of the CPU, connected to the data bus and address bus.]
The central processing unit (CPU) consists of a control unit, an arithmetic and logic unit, and
various other registers, although individual computers differ as to the exact organisation. Not all
registers have to be the same size, because they all hold different types of information. Those
registers which hold data or instructions have to be the same size as a memory location.
Registers which hold the address of a memory location - the program counter and the memory
address register - all need to be large enough to contain the highest memory address.
In a typical 8-bit microcomputer the registers which hold data are 8 bits wide, whereas those
which hold memory addresses are 16 bits wide to allow for a maximum memory size of 2^16
(65 536) locations. A 16-bit PC normally has a wider address range, typically in the
megabyte range, e.g. 24 address lines giving a maximum memory size of 2^24 (16 777 216)
locations.
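The 2^n relationship between address-register width and memory size is easy to check - a small Python sketch (the function name is illustrative):

```python
def addressable_locations(address_lines):
    """Maximum number of memory locations reachable with n address lines."""
    return 2 ** address_lines

print(addressable_locations(16))   # a typical 8-bit micro's 16-bit address register
print(addressable_locations(24))   # a 24-line 16-bit system
```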
The Control Unit.
The Control Unit is the "Brain Department" of the CPU. This component controls all the timing
and activities of the processor, and controls everything that happens within the CPU - and
therefore the whole system! The CU itself consists of a number of different elements, but the
most relevant to this unit is the Instruction Decoder. Whenever a program instruction arrives in
the CPU to be executed, the Decoder interprets the information and decides how to process it.
Control signals are required to connect registers to the bus, to control the functions of the ALU
and to provide timing signals to the rest of the computer system. Most of the control signals
originate in the control section of the Central Processing Unit. All the actions of the control unit
are connected with the decoding and execution of instructions - the FETCH and EXECUTE cycles.
The ALU (Arithmetic and Logic Unit).
The arithmetic and logic unit (ALU) is involved in the execution of arithmetic and logic
operations. The operands of an arithmetic or logical operation are to be found in memory, but to
speed up the operation many computers have several, typically 8 or 16, faster memory
locations, called registers, within the CPU. Many computers have a single special register,
called the accumulator, which is the source of one of the operands and the destination of an
arithmetic or logical operation. If this is the case, the structure of the processor can be
represented as follows:-
[Diagram: the accumulator (ACC) feeding the ALU from the data bus, with the ALU's flags
output connected to the control unit.]
The above example also shows a flag register. A flag register contains a number of individual
bits to store information about the result of the last ALU operation, for example, whether it
resulted in a zero result, negative result, or produced a carry or an overflow. This information
may be used by later instructions.
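The flag behaviour described above can be sketched in Python. This is a simplified illustration, not any specific processor's flag logic; the function name and flag letters (Z, N, C, V) are conventional shorthand for zero, negative, carry and overflow.

```python
def alu_add8(a, b):
    """8-bit add that also returns Zero, Negative, Carry and Overflow flags."""
    total = a + b
    result = total & 0xFF
    flags = {
        "Z": result == 0,                  # the result was zero
        "N": bool(result & 0x80),          # top bit set -> negative in two's complement
        "C": total > 0xFF,                 # carry out of bit 7
        # overflow: operands had the same sign but the result has a different one
        "V": bool(~(a ^ b) & (a ^ result) & 0x80),
    }
    return result, flags

result, flags = alu_add8(0x7F, 0x01)       # 127 + 1 overflows as a signed 8-bit value
print(hex(result), flags)
```

A later conditional-branch instruction would then test one of these stored flags rather than repeating the arithmetic.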
Structure Of the ALU
The inputs and outputs will typically be 8 or 16 bits wide, depending on the size of the ALU. The
number of control signals will depend upon the number of functions which the ALU is capable of
performing; n control signals are required for 2^n operations.
The Sub-units Of The ALU
The ALU can perform a range of arithmetic and logic operations. The following circuit
descriptions would not necessarily be found in more modern CPUs, which implement the
operations utilising more regular structures such as PLAs (Programmable Logic Arrays).
a) An Adder
A computer works on a pattern of bits and so the lowest level of adder is a one-bit adder. This
adder has to implement the following truth table:

A   B   Carry   Sum
0   0     0      0
0   1     0      1
1   0     0      1
1   1     1      0
The circuit that implements the truth table is called a half adder, since for addition of multiple
bits an additional circuit is needed which has an extra input, the carry from the previous bit
addition. This circuit is called a full adder.
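The half adder and full adder can be expressed directly as gate operations - a Python sketch, with the sum produced by XOR and the carry by AND, and a ripple-carry chain built from full adders (the function names are ours):

```python
def half_adder(a, b):
    """Sum is XOR of the inputs, carry is AND - the truth table above."""
    return a ^ b, a & b              # (sum, carry)

def full_adder(a, b, carry_in):
    """Two half adders plus an OR gate handle the incoming carry."""
    s1, c1 = half_adder(a, b)
    s2, c2 = half_adder(s1, carry_in)
    return s2, c1 | c2               # (sum, carry out)

def ripple_add(a_bits, b_bits):
    """Chain full adders, least significant bit first (a ripple-carry adder)."""
    carry, out = 0, []
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    return out, carry

# 5 + 3, with bits listed least significant first:
print(ripple_add([1, 0, 1, 0], [1, 1, 0, 0]))   # ([0, 0, 0, 1], 0) i.e. 8
```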
b) Logic Tests
An ALU normally contains logic to perform a number of different logical tests, such as a test to
see if the result of an operation is zero. Some of these logical tests affect the flag register used
to store information regarding the result of the last operation; other logical tests produce a result
used as data in further processing.
c) Logical Tests For Zero
All that is needed for a test for zero is an OR gate with the requisite number of inputs (with its
output inverted, since the flag should be set only when every bit is 0) as shown below. This
circuit may be used to set the zero flag on the result of an operation.
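As a minimal Python sketch (purely illustrative): OR-ing every bit together tells us whether any bit is set, and inverting that gives the zero flag.

```python
def zero_flag(bits):
    """OR all bits together and invert: 1 only when every bit is 0."""
    ored = 0
    for b in bits:
        ored |= b                 # the many-input OR gate
    return 1 - ored               # the inversion that makes it a zero test

print(zero_flag([0, 0, 0, 0]))    # 1 - result was zero, flag set
print(zero_flag([0, 1, 0, 0]))    # 0 - result was non-zero
```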
d) Bitwise AND Of Two Operands
As this operation suggests, what is required is a set of AND gates which have as inputs the
corresponding bits of the two operands. The outputs are the resultant AND of the bit pairs as
shown by the circuit below:-
The other bit operations, for example the bitwise OR, may be implemented by similar schemes
using different gates.
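The idea of one gate per bit pair can be sketched generally - a hedged Python illustration in which the helper name and the gate functions are our own:

```python
def bitwise_gate(a_bits, b_bits, gate):
    """Apply one two-input gate to each corresponding pair of bits."""
    return [gate(a, b) for a, b in zip(a_bits, b_bits)]

a = [1, 0, 1, 1]
b = [1, 1, 0, 1]
print(bitwise_gate(a, b, lambda x, y: x & y))   # AND of each bit pair
print(bitwise_gate(a, b, lambda x, y: x | y))   # OR, using different gates
```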
e) Shifting
Most computers include some form of shift or rotate instructions in the instruction set. These
instructions move bits right or left within a word. The various shift and rotate operations differ in
what is placed in the bit position left vacant by the moving of the bit pattern and by what
happens to the bit which is moved out of the word by the shifting operation.
A shift register may be implemented by a series of edge-triggered flip-flops as shown below. On
the occurrence of a clock pulse, the external input is clocked into the first flip-flop, the output
from the first flip-flop is clocked into the second and so on. Thus all bits are shifted one place to
the right. The output and input will be connected in the particular way required for the shift
operation and initial loading of all the bits of the shift register in parallel is normally allowed.
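One clock pulse of such a shift register can be modelled in a few lines of Python (a sketch only - real flip-flops change state simultaneously on the clock edge, which the list copy imitates):

```python
def clock_pulse(register, external_input):
    """One tick: every flip-flop takes the value of the one before it."""
    shifted_out = register[-1]                   # the bit moved out of the word
    new_state = [external_input] + register[:-1] # everything shifts one place right
    return new_state, shifted_out

state = [1, 0, 1, 1]                 # parallel-loaded initial contents
state, out = clock_pulse(state, 0)   # clock once, shifting in a 0
print(state, out)
```

Feeding `shifted_out` back in as the next `external_input` would turn the shift into a rotate, which is exactly the difference between the two instruction families described above.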
f) Comparator
Most computers include a number of comparison operations such as tests for equality, greater
than and less than. All these comparisons can be performed by subtraction, with the setting of
the appropriate status flags, without the storing of the subtraction result.
g) Multiplication and Division
In most small computers, multiplication and division are not implemented in hardware but have
to be implemented by the programmer in software. In larger computers special hardware is
provided, but this type of hardware is out-with the scope of this unit.
The Registers.
Registers are small temporary storage units of a fixed size. Most registers are dedicated to a
specific purpose, although general-purpose registers are available in some processors. The
number and nature of registers will vary between processors. Some registers will not be
available to programmers.
[Register diagram, summarised:

Registers only for processor use:
Memory Address Register (MAR) - connected to the address bus
Memory Data Register (MDR) - connected to the data bus
Instruction Register (IR)
Program Counter (PC)

Mainly for processor use, but can be accessed by the programmer:
Stack Counter (SC)
Status Register (SR)
General Purpose Register
Accumulator]
s can be grouped into two types - data registers, which hold data actually being worked on, and
pointer registers, which point to where the data can be found, or where it is being sent to. Most
processors will contain at least the following:Memory Address Register - points to a location in memory where data is being read from or
written to.
Program Counter or Instruction Pointer - points to the address in memory of the next
program instruction, i.e. the one immediately after the instruction currently being executed.
Memory Data Register or Memory Buffer Register - the only register where data can be
transferred into, or leave from, the CPU. Acts like a portal or gateway for the data travelling
between the CPU and RAM.
Instruction Register - used as a "workspace" by the Control Unit, to hold and decode the
program instruction currently being executed.
Accumulator - used as a "workspace" by the ALU to hold data currently being manipulated.
Registers are designed to do a specific job and are not bound by the word size of the computer.
Generally, the more complex the set of instructions, the more internal registers will be
required.
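To make the roles of the PC, MAR, MDR and IR concrete, here is a toy fetch step in Python. The memory contents and "opcode" values are invented purely for illustration.

```python
# A hypothetical two-byte program held in RAM (addresses and values made up):
memory = {0x00: 0x3E, 0x01: 0x2A}

pc = 0x00              # Program Counter: address of the next instruction
mar = pc               # MAR: the address placed on the address bus
mdr = memory[mar]      # MDR: the data arriving over the data bus
ir = mdr               # IR: the opcode handed to the instruction decoder
pc += 1                # PC now points at the instruction immediately after

print(hex(ir), hex(pc))
```

This is the first half of the FETCH-EXECUTE cycle that the Control Unit repeats endlessly; decoding and executing the value now sitting in the IR is the second half.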
Week 2 – Busses and Peripherals.
System Busses.
The CPU is connected to everything else by system channels called busses. A bus is a
physical, electrical connection between different parts of the computer and consists of either
copper circuits or electrical cable, or a combination of the two. In the next session we will look
in detail at the Data, Address and Control Buses, but this is a general idea of what they do:

the data bus is used to transfer the actual data values
the address bus signals where in RAM the data is going to/coming from
the control bus carries control signals.
In order to attach any input/output devices, or peripherals, you need something to connect the
device to the system bus. In reality, this involves plugging a small circuit board ("card") into the
motherboard to form the physical connection between the two.
This photograph shows a 486 motherboard with an expansion card
fitted. You can see the CPU and RAM chips to the middle and front of
the picture. The expansion card is a VGA (Video) card for outputting
signals to a monitor. Note that this card also carries ROM chips of its
own.
The card therefore acts as a device controller and
device interface. Data then flows from one device
to another along the busses. For example, data
typed in at a keyboard can enter the system via the
keyboard port, travelling along the bus in order to
reach the processor.
All computers have a number of separate
bus systems so that data can be moving
between different pairs of components at
the same time. Most systems will have a
separate CPU-Memory Bus linking
memory directly with the CPU and which
runs at very high speeds. It will also be
connected to an I/O bus via a bus
adapter, with the I/O bus running at
much slower speeds.
I/O devices and memory run slowly compared to the CPU clock speed. The more cycles per
second, the more actions the CPU can carry out. Processor speeds are measured in hertz - a
hertz is 1 cycle, or clock "tick", per second. 1 kHz = 1000 Hz; 1 MHz = 1000 kHz; 1 GHz = 1000
MHz. The first PCs ran at about 4.77 MHz; now they run at 2 GHz or more.
However, there is no way that the memory and devices can keep up with this speed, so the
CPU often spends clock ticks waiting; what the system is actually doing on any given tick
therefore varies.
To be capable of high-speed transfers, the physical bus length must be quite short. Many
systems therefore consist of a network of very short busses all joined together, rather than just
one big one. However one large bus is much cheaper to produce than lots of small ones, so
this results in a trade off between speed and economy.
The first electronic computers, such as COLOSSUS and ENIAC, did not
use transistors or integrated circuits (they hadn't been invented yet!). Instead
these early machines relied on thermionic valves to store binary
values. The world's first electronic computer, Colossus, was built at
Bletchley Park near Milton Keynes and entered service in 1943. It relied on
huge numbers of valves to operate. Likewise, the valves in ENIAC used so much
electricity that the surrounding city of Philadelphia reportedly experienced
power brown-outs whenever the computer was switched on.
Valves were fragile, unreliable and got extremely hot in use, which is
one reason why computers used to take up whole rooms. Later
computers such as the Manchester Mk 2 incorporated elaborate liquid
coolant systems, much like a domestic freezer but an awful lot bigger!
The valves shown in the picture on the left are of a particular type
called a "Nixie" tube. These were used to create illuminated
alphanumeric output. Nixie displays of this sort were used in
calculators and industrial instruments right up until the mid-1970s, when
they gradually began to be superseded by Liquid Crystal Display
screens.
This is how the above valves would have looked when soldered onto a
primaeval motherboard!
This photograph shows an Intel™ Socket 370 motherboard, supporting both Celeron™ and
Pentium processors.
[Photo callouts: Serial and Parallel (Printer) Ports; ISA slot (black); PCI slots (white);
Keyboard and Mouse connectors; Socket for processor, heatsink and fan; AGP slot (brown);
BIOS ROM chip - the backup battery is in the middle of the board; Slots for fitting RAM -
this board will support 2 x 512 MB RAM modules, giving 1 GB of memory; Power supply
attaches here; IDE ports for attaching fixed disk drives, CD-ROM etc.;
FD (Floppy Disk) port for attaching cable to connect floppy drive.]
The System Bus.
The main method of communication between the various parts of a computer is by the use of
one or more buses. A bus consists of a group of signal lines used to carry information. Usually
the components tap on to the bus to send and receive information as illustrated below:
[Diagram: the CPU, ROM, RAM and the parallel and serial input/output interfaces all tapping
onto the shared address, data and control busses; clock and interrupt lines form part of the
control bus.]
In order to work correctly, only one sender must be active on the bus at any one time. In a
simple computer this is achieved by having a single master, the central processing unit, which
controls the whole system. The other devices on the bus, called slaves, respond to commands
from the central processing unit. The information carried falls into three types - address, data
and control - and a bus is often subdivided accordingly.
In a computer system there will be a number of groups of buses. In this unit only the lowest
level buses will be considered; those between components of the CPU and those between the
CPU, memory and input-output interfaces on a single printed circuit board.
The Address bus is used to specify the memory location (the address) involved in a data transfer,
while the data itself is transferred between devices using the data bus. The data bus, therefore,
must be bi-directional, allowing data to be both read into and written out of the CPU.
The Control bus comprises various lines used to distribute timing and control signals throughout
the system. Important among these are:
Signals concerned with the direction of the data transfer (to or from the CPU);
Signals which indicate that the data is to be transferred to I/O rather than memory;
Requests from external devices requiring the attention of the CPU. The response to such
'interrupts' can be programmed in various ways, and a system of prioritisation may often
be desirable.
A system clock generator is responsible for providing an accurate and highly stable timing
signal. This generator often forms part of the microprocessor itself.
The number of lines contained in the address and data buses depends upon the particular
microprocessor employed. Most of today's microprocessors are capable of performing
operations on binary numbers consisting of either 8 or 16 bits. They are thus known as 8-bit
and 16-bit microprocessors respectively.
In a microcomputer based on an 8-bit microprocessor, the data bus has 8 separate lines.
Similarly, in a 16-bit system the data bus will have 16 separate lines. Address buses for 8-bit
systems invariably comprise 16 lines, whereas those for 16-bit systems may consist of as many
as 24 lines.
A further complication exists in the case of a number of microprocessors which, in order to
minimise the CPU pin count (so that a 40-pin rather than a 64-pin package may be utilised),
employ multiplexed data and address buses. Certain CPU pins are then used to convey both
address and data information, with control signals indicating which is currently being placed
on the respective bus.
Since a bus may be connected to many devices, bus drivers/buffers are often required. These
are usually packaged in groups of eight bits (i.e. one byte) and may be unidirectional (e.g. for
use with an address bus) or bi-directional (e.g. for use with a data bus). In the latter case the
devices are usually referred to as 'bus transceivers'.
The largest binary number that can appear on an 8-bit bus is 11111111 (2^8 - 1 = 255), while
that for a 16-bit bus is 1111111111111111 (2^16 - 1 = 65 535, i.e. 64k).
Each address corresponds to a unique binary code, hence the linear addressable range (before
any 'paging' scheme is applied) will depend upon the number of address lines provided within
the system. (The maximum number of individual memory locations that can exist in a system
having n address lines is 2^n.)
Signals on all lines, whether address, data or control, can exist in only one of two states: logic 0
(low) or logic 1 (high). As far as individual devices sharing the data bus are concerned, a third
'high impedance' state exists whenever a device is in its deselected or disabled state. This
allows the CPU to communicate with other devices without the risk of bus conflict. Bus
transceivers can usually also be placed in this tri-state condition, thus permitting partial access
to the bus for a second processor or other 'intelligent' device.
The address range corresponding to a particular device (e.g. ROM) is decoded from the
address bus and is used to generate an appropriate 'enable' signal. A TTL decoder (or
demultiplexer) is often used in such an application.
Although the CPU is the heart of any microprocessor system, it may not be the only ‘intelligent’
device present. A second data processor, for example, may be fitted in order to perform
numeric data processing (NDP) or a dedicated microprocessor may be incorporated, for
example, in an intelligent keyboard.
The most desirable characteristics of a bus are listed below (but not in any order of importance).
Their importance will vary according to the application that one has in mind. A bus should:

• Be processor and manufacturer independent
• Allow the use of multiple masters
• Permit asynchronous operations
• Employ a simple non-multiplexed data transfer protocol
• Use a simple low-cost backplane
• Incorporate some means of signalling bus errors
• Permit as high a bus data rate as possible (to minimise processing delays)
• Allow as wide an addressing range as possible (both in relation to memory and I/O space)
• Support as wide a range of processors as possible (including 16-bit and 32-bit types)
A bus is simply a collection of wires on which electrical signals are passed from component to
component. The size and speed of the busses will vary between processor models, but their
functions remain the same.
A typical 80x86 system component uses standard TTL logic levels. This means each wire on
a bus uses a standard voltage level to represent zero and one. We think of binary values as
being zero and one rather than electrical levels, because these levels vary on different
processors.
The Data Bus.
The data bus is used to transfer the actual data values, and the size of this bus varies widely. On
typical systems, the data bus may be 8, 16, 32, or 64 bits (lines) wide. The 8088 and 80188
microprocessors have an eight-bit data bus (eight data lines) - this means that the CPU can
transfer eight bits of data at a time. The 8086, 80186, 80286, and 80386SX processors have a
16-bit data bus, and so on. The data bus is usually linked to the size of the internal registers -
for example, a processor with 32-bit registers will commonly have a 32-bit data bus (but this is
not always the case!)
Having an 8-bit data bus does not limit the processor to eight bit data types. It simply means
that the processor can only access one byte of data per memory cycle; the obvious
disadvantage is that an 8-bit bus can only transmit half the information per unit time as a 16-bit
one. However, since each memory address corresponds to a byte, this also has distinct
advantages - the CPU can address memory in chunks as small as a single byte. It also means
that this is the smallest unit of memory you can access at once with the processor. That is, if the
processor wants to transfer a 4-bit value, it must read eight bits and then ignore the extra four
bits.
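The byte-at-a-time transfer described above can be sketched in Python (the helper name is our own): a 16-bit value crossing an 8-bit data bus is split into two bus-sized chunks, least significant byte first, just as the 8088 needs two memory cycles per word.

```python
def to_bus_bytes(value, total_bits=16, bus_bits=8):
    """Split a value into bus-sized chunks, least significant chunk first."""
    mask = (1 << bus_bits) - 1
    return [(value >> shift) & mask
            for shift in range(0, total_bits, bus_bits)]

# A 16-bit value needs two cycles on an 8-bit data bus:
print([hex(b) for b in to_bus_bytes(0x1234)])   # low byte first, then high byte
```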
80x86 Processor Data Bus Sizes

Processor                      Data Bus Size
8088                           8
80188                          8
8086                           16
80186                          16
80286                          16
80386sx                        16
80386dx                        32
80486                          32
80586 class/ Pentium (Pro)     64
The Address bus.
We already saw that a data bus transfers information between a particular memory location or
I/O device and the CPU. But how do we know where the data is supposed to come from or go
to? To differentiate memory locations and I/O devices, the system designer assigns a unique
memory address to each memory element and I/O device.
When some particular memory location or I/O device has to be accessed, the relevant address
is placed on the address bus. Circuitry associated with the memory or I/O device recognises
this address and instructs the memory or I/O device to read the data from, or place data on,
the data bus. In either case, all other memory locations ignore the request. Only the device
whose address matches the value on the address bus responds. The size of the address bus
will also vary between processors, and bears a direct relationship to how many memory
locations can be addressed.
If the address bus had only 1 line, the processor could access 2^1, i.e. 2 addresses. If the address
bus is a 12-bit bus, the processor can provide 2^12, or 4096, unique addresses. (Each address
commonly holds 1 byte, so this gives us 4 kB of addressable memory.) The 8088 and 8086
derivatives, for example, have 20-bit address busses and can access up to 1,048,576 memory
locations. Some computers have up to 36 address lines, giving a theoretical 64 GB of
addressable space.
80x86 Family Address Bus Sizes

Processor               Address Bus Size    Max Addressable Memory
8088                    20                  1,048,576 (1 MB)
8086                    20                  1,048,576
80188                   20                  1,048,576
80186                   20                  1,048,576
80286                   24                  16,777,216 (16 MB)
80386sx                 24                  16,777,216
80386dx                 32                  4,294,967,296 (4 GB)
80486                   32                  4,294,967,296
80586 / Pentium (Pro)   32                  4,294,967,296
What happens if there is not enough width in a single chip to hold the data being written - that
is, if the RAM is spread over more than one chip? The simple answer is that the data is spread
between the two chips, with part of the byte written to each. For example, if we wanted to
write 1110 0110 to location 42, the first half (1110) would be written to address 42 of the first
chip and the second half (0110) to location 42 of the second chip. The CPU prepares the
chosen chip for writing by means of a chip enable or chip select line.
In order to write data to a chip, therefore, the CPU must follow a sequence of steps:
Address goes on the address bus
Any address lines involving use of the chip select are decoded
The chip select is activated
The actual data goes on the data bus
Data gets written to the correct location via the write line
When accessing data, from an I/O port for example, the reverse happens:
Address goes on the address bus
The relevant I/O line is activated
Any address lines involving use of the chip select are decoded
The chip select is activated
The actual data goes on the data bus
Data gets sent back to the processor
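The write and read sequences above can be sketched as a toy model in Python. All the names here are illustrative: two byte-wide RAM chips are modelled as dictionaries, with the top address line decoded to drive the chip select.

```python
chips = {0: {}, 1: {}}                    # two RAM chips, each with its own locations

def bus_write(address, data):
    chip_select = (address >> 7) & 1      # decode the top address line -> chip select
    location = address & 0x7F             # remaining lines address within the chip
    chips[chip_select][location] = data   # write line pulses; the data is stored

def bus_read(address):
    chip_select = (address >> 7) & 1      # same decode on the read path
    location = address & 0x7F
    return chips[chip_select][location]   # data goes back on the data bus

bus_write(0x05, 0xAA)                     # chip select 0 is activated
bus_write(0x85, 0xBB)                     # same in-chip location, but chip 1 selected
print(bus_read(0x05), bus_read(0x85))
```

Note how the two writes go to the same in-chip location (5) yet do not collide, because the decoded chip select steers each one to a different chip.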
The Control Bus.
The control bus is a collection of signal lines that control how the processor communicates with
the rest of the system. Consider for a moment the data bus. The CPU sends data to memory
and receives data from memory on the data bus. This prompts the question, "Is it sending or
receiving?" There are two lines on the control bus, read and write, which specify the direction of
data flow. Other signals include system clocks, interrupt lines and status lines.
The read and write control lines control the direction of data on the data bus. When both contain
a logic 1, the CPU and memory-I/O are not communicating with one another. If the read line is
low (logic 0), the CPU is reading data from memory (that is, the system is transferring data from
memory to the CPU). If the write line is low, the system transfers data from the CPU to memory.
The byte enable lines are another set of important control lines. These control lines allow 16, 32,
and 64 bit processors to deal with smaller chunks of data.
Note that it is quite possible for byte, word, and double word values to overlap in memory. For
example, you could have a word variable beginning at address 193, a byte
variable at address 194, and a double word value beginning at address 192. These variables
would all overlap.
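The overlap can be demonstrated with a Python bytearray standing in for RAM (the byte values are arbitrary examples; `struct` unpacks little-endian, as the 80x86 stores its words):

```python
import struct

ram = bytearray(256)                           # a toy 256-byte memory
ram[192:196] = bytes([0x11, 0x22, 0x33, 0x44]) # arbitrary contents at 192-195

dword = struct.unpack_from("<I", ram, 192)[0]  # double word starting at 192
word = struct.unpack_from("<H", ram, 193)[0]   # word at 193 overlaps it
byte = ram[194]                                # byte at 194 overlaps both

print(hex(dword), hex(word), hex(byte))
```

Changing the single byte at 194 would silently change the values of both the overlapping word and the double word - which is exactly the hazard the text is pointing out.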
Besides the address lines which access memory, the 80x86 family provides a 16-bit I/O address
bus. This gives the 80x86 CPUs two separate address spaces: one for memory and one for I/O
operations. Lines on the control bus differentiate between memory and I/O addresses. Other
than the separate control lines and a smaller bus, I/O addressing behaves exactly like memory
addressing. Memory and I/O devices both share the same data bus and the lower 16 lines of
the address bus.
We began studying our hardware theory by looking at the two most important components of any
system - the CPU and Memory (RAM). In order to do anything useful, however, these must
somehow interface with the human user, and so the two are attached to various peripheral
devices. This term refers to any piece of hardware attached to a CPU, forming the interface
between the outside world and what is happening inside the processor. These are also
sometimes referred to as simply "devices" or "peripherals".
Devices can be classified into two general groups - Input/Output Devices and Storage Devices.
Although these are designed to fulfil different purposes, they interface between processors and
users in the same way.
I/O devices enable communication between computers and users - for example through
keyboards, monitors, mice and barcode scanners. Storage devices store data on a permanent
basis - theoretically indefinitely, although most media do tend to deteriorate over time. Some
examples are hard drive storage, floppy drive storage and CD-Rs.
Expansion Busses.
When we looked at the motherboard we saw that the expansion slots came in different sizes.
Expansion busses are designed to make it easier to connect devices to the computer system.
In the early days of microcomputers, a design called the S-100 bus was widely used on CP/M
systems. The Apple II was based on a proprietary design and had the first expansion bus that
made it easy for end users to add cards on their own.
The idea of an open architecture based on a simple expansion bus was one of the factors that
helped launch the first IBM PC's overnight success. The first type of slot to be introduced was
the ISA slot, and although this technology is now nearly 20 years old you can still fit ISA cards in
some modern motherboards. Any typical Pentium motherboard has a selection of different
expansion bus designs.
September 2003
17
ABCDE
Engineering, Computing and Business Studies:
Computer Architecture (D75P34)
Different Expansion Busses
The Industry Standard Architecture (ISA) bus was the original 8-bit bus that debuted on the
IBM PC. At that point, it ran at the same speed as the system bus (4.77 MHz); it was later
upgraded to first 6, then 8 MHz, and to 16 bits in width. Computers then started to carry faster
processors, and it was soon discovered that many expansion cards simply could not keep up
with system demand.
The industry had by this time standardised on the 8 MHz speed, although most expansion
buses now use a speed independent of the system bus. The 8-bit-wide 4.77-MHz IBM PC bus
had a peak throughput rating of about 2 megabytes per second.* Bringing the speed up to
8 MHz increased the maximum throughput to 8 MBps. An 8-bit extension made it possible for
computers to address 16 MB of memory (up from the original 1 MB address space), but
addressing the additional 8 bits is not as easy as addressing the original 8-bit design, because
memory access operations require two steps.
*How to measure it.
8 MHz = 8 × 1,000,000 clock cycles per second = 8,000,000 cycles per second.
For an 8-bit bus, multiply by 8, which gives 64,000,000 bits per second.
Divide by 8 to convert to bytes: 8,000,000 bytes per second, or 8 MBps.
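The footnote's arithmetic generalises to any bus; here is a minimal Python sketch (the function name is ours, not from any standard library):

```python
# Peak bus throughput, following the footnote: clock cycles per second
# times bus width in bits, divided by 8 to convert bits to bytes.
# One transfer per clock cycle is assumed.
def peak_throughput_mbps(clock_mhz, bus_width_bits):
    bits_per_second = clock_mhz * 1_000_000 * bus_width_bits
    return bits_per_second / 8 / 1_000_000

print(peak_throughput_mbps(8, 8))    # 8.0 - the 8 MHz, 8-bit ISA bus
print(peak_throughput_mbps(33, 32))  # 132.0 - the 33 MHz, 32-bit PCI bus
```

The same function reproduces the PCI and AGP figures quoted later in this section.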
As processors became faster and gained wider data paths, the basic ISA bus design did not
change to keep pace. Even now, most ISA cards remain 8-bit. The few types with 16-bit data
paths (hard disk controllers, graphics adapters, and some network adapters) are still constricted
by the low throughput levels of the ISA bus. Expansion cards in faster bus slots can better
handle these processes - so much so that some newer motherboards don't even carry ISA slots
anymore.
As the slow and narrow ISA bus became a bottleneck between the processor and expansion
devices, the Peripheral Component Interconnect (PCI) bus was created by Intel to solve this
problem.
The PCI bus runs at its own clock speed separate from the system bus speed. Originally
specified as a 32-bit-wide bus operating at 33 MHz, PCI had a theoretical maximum transfer
rate of 132 MBps (16½ times as fast as the ISA bus). This is the version most widely
implemented in PC systems. PCI also simplified system configuration by supporting plug-and-play, and it extended the limited resources of the original PC-compatible hardware by
supporting shared IRQ assignments.
The original 124-pin slot specification (62 pins on each side of the expansion slot) has been
revised to support even greater throughput. First, a 64-bit extension was designed using an
extended connector much like the 16-bit addition to the ISA bus, adding another 64 contact pins
(32 pins per side). This doubled the theoretical throughput (though 64-bit cards are still rare at
this point).
More productive is the PCI 2.1 specification, which calls for a 66-MHz bus speed. This
effectively doubles the theoretical throughput of the original 32-bit specification to 264 MBps
(33 times as fast as the ISA bus).
Along with the 64-bit version, there are some other aspects of the PCI specification that most
users may not know about. For example, PCI cards can run on either 5 volts or 3.3 volts. A
5-volt card has a notch cut into the edge connector toward the front of the computer case, with
a corresponding key in the slot. A 3.3-volt card has a notch toward the rear of the case, with a
corresponding key in the slot. This prevents a user from accidentally plugging the wrong card
into a slot. The PCI specification also calls for a universal card, which fits either slot and runs
on either voltage.
Another less-known trait of the PCI expansion slot is that the bus is limited to ten electrical
loads. Most cards apply more than one load to the bus, and as a result, the practical limit for
expansion cards on a single PCI bus is three cards (in some cases, four will work). If you need
more than three PCI cards installed in a single system, you can have more than one PCI bus,
using a PCI bridge configuration.
In the days of the original ISA bus, we used the relatively simple Monochrome
Display Adapter (MDA) and Color Graphics Adapter (CGA) cards to drive our monitors, and
these required relatively small amounts of data. A CGA graphics display could show four colors
(2 bits of data) at 320 x 200 resolution at 60 Hz, which required 128,000 bits of data per screen,
or just over 937 kilobytes per second.
In contrast, a 16-bit high-color image requires 1.5 MB of data, and at 75 Hz, this data is
refreshed 75 times per second. (75 Hz is probably the minimum acceptable refresh rate for
monitors.)
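These figures can be checked with a short calculation; note that the 1024 x 768 resolution behind the 1.5 MB frame is our assumption, as the text does not state one:

```python
# Uncompressed display bandwidth: pixels * bits per pixel * refresh rate.
def display_bandwidth_bytes(width, height, bits_per_pixel, refresh_hz):
    bits_per_frame = width * height * bits_per_pixel
    return bits_per_frame // 8 * refresh_hz

# CGA: 320 x 200, four colors (2 bits per pixel), 60 Hz
cga = display_bandwidth_bytes(320, 200, 2, 60)
print(cga / 1024)          # 937.5 - kilobytes per second, as in the text

# 16-bit high color at an assumed 1024 x 768 (a 1.5 MB frame), 75 Hz
hicolor = display_bandwidth_bytes(1024, 768, 16, 75)
print(hicolor / 1024**2)   # 112.5 - megabytes per second
```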
Thanks to graphics accelerators, not all of this data has to be transmitted across the expansion
bus to the graphics card, but new imaging technology has created new problems. Now 3-D
graphics have made it possible to model both fantastic and realistic worlds on-screen with
amazing detail. Texture mapping and object hiding require enormous amounts of data, and the
graphics adapter needs to have fast access to this information.
Accelerated Graphics Port (AGP) first appeared with Pentium II motherboards. It barely
conforms to our original definition of a bus, as it is really a point-to-point connection, dedicated
to the single task of connecting a graphics adapter more directly to the motherboard's
resources.
AGP has limited capabilities. PCI devices must support communication with a variety of
devices - storage adapters, network connections, and sound cards, for example - but AGP
deals only with graphics. This single, limited task makes it possible to streamline the design
for maximum speed.
The speed is used to give the graphics adapter fast access to texture and buffer data. Instead
of loading up the graphics card with expensive memory, AGP lets the card access this
information directly from the computer's system memory, without involving the CPU in the
process.
How much faster is AGP than PCI? A 33-MHz 32-bit PCI bus supports up to 132 MBps
throughput. AGP is also a 32-bit design, but it runs at speeds up to 133 MHz - four times as
fast - so it has a maximum transfer rate of 532 MBps. (This is still twice as fast as the rate of a
66-MHz 32-bit PCI bus.) Best of all, however, is that the graphics card on the AGP bus does not
have to compete with any other devices to get access to its data.
You can have only one AGP device in a system at a time. If you want to use a second display (a
feature that Windows 98 makes relatively easy to implement), you will need to rely on a PCI
graphics adapter for the second display. If you want to upgrade an AGP display, you will need to
replace the adapter. As with PCI, there are some lesser-known details contained in the AGP
specification. Just as there are two different PCI slot designs depending on the card voltage, so
there are two different voltage designs for AGP, the common 3.3-volt design and a 1.5-volt type.
As part of the compulsory questions for Outcome 3, you will be expected to draw a graph to
demonstrate the differences in performance between systems with differing data and address
bus sizes. An assessment-level question is given next for you to try. (Your lecturer will explain
the clock cycle part, as we don't actually cover this until Book 3).
Address and data bus sizes - Graph Drawing Exercise
© SQA 2001 - taken from draft Exemplar for unit.
Scenario
A semiconductor manufacturer has decided to produce a range of
microprocessors/microcontrollers for use in a variety of application areas. As speed and cost
are both important factors the designers have decided to use a common core processor and
provide different address and data bus widths for different family members. The difference in
cost between processors is largely caused by the differences in packaging.
One result of this decision is that each member of the family can perform a maximum of one
million memory fetches per second (as long as it is attached to memory of a sufficient speed).
This corresponds to one fetch per two machine cycles.
Part 1
You have been detailed to help the design team of your company's latest product, and the task
that you have been given is to produce clear graphs showing the performance of different
members of the processor family. This will be used to help decide the lowest cost component
that can be used in the product.
The graph will be used at a meeting where the choice of device will be finalized. The graph
should be in a form suitable for its intended use and labeled clearly and scaled appropriately.
Part Number   Data Bus Size   Address Bus Size   Cost (ex VAT)
Hyc4e         4               8                  2.00
Hyc4t         4               10                 2.50
Hyc4w         4               12                 3.20
Hyc4s         4               16                 4.00
Hyc8w         8               12                 3.70
Hyc8s         8               16                 4.60
Hyc12s        12              16                 5.30
Hyc12n        12              20                 6.20
Hyc12s        16              16                 6.10
Hyc16f        16              24                 7.00
Hyc32f        32              24                 8.50
Hyc32o        32              32                 10.20

Prepare a graph to the above specification for this set of data.
It is estimated that the proposed application will require a processor capable of transferring at
least seven million bits per second. Add a line to a new copy of your graph indicating this level
of performance. From your graph, determine which processors meet this requirement.
Part 2
Now that the range of candidate processors has been reduced, it has been decided to further
reduce the list of candidates by considering the required memory space of the application.
You have been detailed to produce a graph showing the amount of memory that each of these
processors can address. Again, this will be required at a meeting, and should be appropriately
presented. The system will require a minimum of 30 KB of memory, and your graph should
include this.
Based on the data bus size, the address bus size and unit cost, which processor would you
recommend?
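The two quantities the graphs need can be computed directly. The following Python sketch is not a model answer - the helper names are ours - and it assumes one byte per memory location, as the scenario implies:

```python
# Calculations behind the two graphs: data throughput (Part 1) and
# addressable memory (Part 2). One byte per location is assumed.
FETCHES_PER_SECOND = 1_000_000  # stated maximum for every family member

def throughput_bits(data_bus_bits):
    # Each fetch moves one data-bus-width of bits.
    return data_bus_bits * FETCHES_PER_SECOND

def addressable_bytes(address_bus_bits):
    # Each extra address line doubles the address range.
    return 2 ** address_bus_bits

# Sample parts from the table above:
print(throughput_bits(8))             # 8000000 bits per second (Hyc8w)
print(addressable_bytes(16) // 1024)  # 64 - KB for a 16-line address bus
```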
Week 3 - Memory.
A computer stores information in its memory. There are basically two types of system memory
- RAM (Random Access Memory) and ROM (Read Only Memory). There are further
subdivisions of these two types, which we shall examine in detail later.
Read Only Memory can be used to store algorithms (i.e. the instructions of a program) when
the memory is manufactured; once tested, these algorithms should not need changing.
Obviously, algorithms which are going to be changed are not stored in ROM, as this would be
very inefficient. In general, once a ROM has been programmed it cannot be changed.
In PCs, ROM is often used to store part of the operating software of a computer system. When
a computer is switched on there is nothing inside the RAM because such data is lost when the
power is removed.
It is therefore necessary to have a program which can be loaded
automatically and which will then load the necessary programs into the main memory. Such a
loading program is called a ‘Bootstrap’ loader. The algorithm stored in the ROM is obeyed and
reads other programs from a peripheral device. In most cases the program read in is part of the
operating system which then controls the subsequent operation of the computer system.
Different Memory Types.
Random Access Memory (RAM). In common usage, the term RAM is synonymous with
main memory, the memory available for data and programs. For example, a computer with 8M
RAM has approximately 8 million bytes of memory that programs can use. It can be both read
from and written to. It is typically cheap and fast - standard Dynamic RAM (DRAM) has a
typical access time of about 60-70 ns.
Dynamic RAM uses capacitors for storing electrical charge. These minute capacitors can only
hold information for a very short period of time (around a thousandth of a second, or a millisecond).
Dynamic memory must therefore be refreshed at frequent intervals in order to retain the
information stored in the capacitors, and this is done when the microprocessor is carrying out
other work so that processing time does not suffer. The basic idea is that information is stored
in the form of a charge on a capacitor and this allows a higher bit density and gives lower power
consumption than static memories.
Main memory consists of a large number of cells each capable of storing 1 bit of information.
These cells are grouped into locations. A location will normally be either 8 bits (1 byte) or 16
bits and each location will have a unique address. A location is therefore the smallest
addressable unit of memory and the size of the location is known as the memory word size.
The word size is the smallest number of bits that can be stored or retrieved in one memory
access.
A more modern type of RAM is Synchronous DRAM (SDRAM), a type of RAM that can run at
much higher clock speeds than conventional memory. SDRAM actually synchronises itself with
the CPU's bus and is capable of running about twice as fast as DRAM. Today's fastest Pentium
systems use CPU buses running at 100 MHz or more, so SDRAM can keep up with them,
though barely. SDRAM is not expected to support the ever-higher speeds of the latest CPUs,
which is why new memory technologies, such as RDRAM and SLDRAM, are being developed.
Older systems used SIMMs (Single Inline Memory Modules), small circuit boards which
provided a 32-bit path to the memory chips. With the development of the Pentium, which
required a 64-bit path to memory, SIMMs had to be installed in pairs. Memory is nowadays
supplied as DIMMs (Dual Inline Memory Modules), which can be installed one DIMM at a time.
Static Random Access Memory (SRAM). This is a type of memory that is faster and more
reliable than the more common DRAM. The term static derives from the fact that it does not
need to be refreshed like dynamic RAM. It is faster than dynamic RAM, but it requires more
power and is a lot more expensive. Both types of RAM are volatile, meaning that they lose their
contents when the power is turned off.
While DRAM supports access times of about 60 nanoseconds, SRAM can give access times as
low as 10 nanoseconds. In addition, its cycle time is much shorter than that of DRAM because it
does not need to pause between accesses. Due to its high cost, SRAM is often used only as a
memory cache (see below…)
SRAM is constructed from bipolar cells, unlike the capacitors used for DRAM. It is fast, but not
very compact, and has a high power consumption. Static RAM uses minute switches, called
flip-flops, to indicate an ON or OFF state. Whether a switch is on or off, it requires a current to
be passed through it. As a result, static RAM is used mainly for small memory sizes.
L1, L2 and Secondary Cache.
The speed at which a program executes instructions will be dependent on the rate at which
instructions and data can be read from and written to main memory.
Application code and data that will be frequently used can reside in cache memory. This is an
intermediate memory system that sits between the CPU and main memory, and works on the
principle of locality of reference. In other words, stuff that gets used a lot is kept handy! If
you have accessed one location, you are more likely to access its neighbours next, because
programs are stored sequentially, as are arrays of data and processes occurring in loops.
Cache memory is SRAM or "Static RAM," the fastest available. It is also expensive compared
to main RAM. The cache memory is connected to the CPU by an extremely fast dedicated bus,
often called the backside bus; consequently, data can be accessed from cache memory much faster than from main
memory. Processors usually have cache built-in or as part of the CPU module - if you look at
advertisements for processors, they are marketed with an n-size cache. Some early Celerons
had no cache at all, and subsequently performed very poorly!
Computers generally only have 128 kilobytes to 512 kilobytes of cache memory, but very high
end systems may have up to 2 megabytes. Among the Intel processors, the Pentium II and III
chips generally come with 256 (accepted minimum) or 512 kilobytes of cache memory. Cheaper
Celeron chips usually have 128 or 256 kilobytes. The top-end of the Pentium range comes with
1 or even 2 megabytes of cache, but these are extremely expensive and probably unnecessary
for normal use. If, however, you have a "dual processor capable" computer or motherboard,
you could have a dual chip system with 256K on each chip at a much cheaper price.
With a large enough cache memory, the entire executable application program might be
contained in the cache. If frequently used code isn't in the cache, the computer loses time in two
ways. First, it still spends time looking for the code in the cache, then after wasting this time, it
spends more time fetching the code from the slower main memory. Hence, bigger cache
memories can substantially increase performance in most applications. The cache acts a bit
like a buffer: frequently used data is kept, and data that has not been accessed recently "drops"
off the bottom - the rule that decides what to discard is known as the "replacement algorithm".
The most common algorithm is Least Recently Used (LRU), by which the block which has gone
the longest time without being referenced is overwritten.
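The LRU policy can be sketched in a few lines of Python. This is illustrative only - real caches implement the policy in hardware logic, not software, and the class and method names here are ours:

```python
from collections import OrderedDict

# A minimal least-recently-used (LRU) replacement policy: the block
# that has gone longest without being referenced is overwritten.
class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # insertion order doubles as recency order

    def access(self, address, data=None):
        if address in self.blocks:
            self.blocks.move_to_end(address)  # a hit refreshes recency
            return self.blocks[address]
        if len(self.blocks) >= self.capacity:
            self.blocks.popitem(last=False)   # evict the least recently used
        self.blocks[address] = data           # miss: fetch into the cache
        return data

cache = LRUCache(2)
cache.access(0x10, "a")
cache.access(0x20, "b")
cache.access(0x10)         # 0x10 is now the most recently used
cache.access(0x30, "c")    # cache full: 0x20 is evicted, not 0x10
print(list(cache.blocks))  # [16, 48]
```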
The backside bus is an extremely fast data pipeline connecting the core processor of the CPU
with its cache memory. This bus can run at full processor speed or, more often, at half (or some
other fraction) of the speed of the processor. A 600 MHz processor might have a backside bus
running at 200 or maybe even 300 MHz.
Memory caching is effective because most programs will access the same data or instructions
over and over. By keeping as much of this information as possible in SRAM, the computer
avoids accessing the slower DRAM.
Some memory caches are built into the architecture of microprocessors. The Intel 80486
microprocessor, for example, contains an 8K memory cache; most modern Pentiums ship with a
256K or 512K cache. Such internal caches are often called Level 1 (L1) caches. Some
systems also come with external cache memory, called Level 2 (L2) cache. These caches sit
between the CPU and the DRAM. Like L1 caches, L2 caches are composed of SRAM but they
are much larger. Where a system contains both L1 and L2 cache, the L2 is sometimes called a
secondary cache.
Disk caching works under the same principle as memory caching, but instead of using
high-speed SRAM, a disk cache uses conventional main memory. The most recently accessed data
from the disk is stored in a memory buffer. When a program needs to access data from the disk,
it first checks the disk cache to see if the data is there. Disk caching can dramatically improve
the performance of a system, because accessing data in RAM can be thousands of times faster
than accessing the hard drive.
When data is found in the cache, it is called a cache hit, and the effectiveness of a cache is
judged by its hit rate. Many caches use a technique known as smart caching, in which the
system can recognise certain types of frequently used data.
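Using the illustrative access times from this section (10 ns SRAM, 60 ns DRAM), the cost of a miss described earlier - a wasted cache lookup plus the slower main-memory fetch - can be put into a short sketch:

```python
# Average memory access time for a given cache hit rate, using the
# illustrative timings in this section: 10 ns SRAM, 60 ns DRAM.
def average_access_ns(hit_rate, cache_ns=10, main_ns=60):
    # A miss pays for the failed cache lookup plus the DRAM access.
    return round(hit_rate * cache_ns + (1 - hit_rate) * (cache_ns + main_ns), 1)

print(average_access_ns(0.95))  # 13.0 - a good hit rate keeps it near SRAM speed
print(average_access_ns(0.50))  # 40.0 - a poor hit rate drags it toward DRAM speed
```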
Optimising Your Cache
If you're buying a system or upgrading your motherboard and processors, you should make
sure that your system has as much cache memory as you can afford. If using a system for
heavy-duty applications such as video editing, two CPUs may be installed; this effectively
doubles the amount of cache from 512 kilobytes to a more than adequate 1-megabyte.
Adverts
The following advertisements have been taken from recent copies of popular computer
magazines. These demonstrate the differences in specification and price between different
processor models.
Two Different Types Of Cache.
Write-through. Every write operation to the cache is accompanied by a write of the same data
to main memory. If this is implemented, then the input/output processor need not consult the
cache directory when it reads memory, since the state of main memory is an accurate reflection
of the state of the cache as updated by the central processor. Although this scheme simplifies
the accesses for the input/output processor, it results in fairly high traffic between central
processor and memory, and the high traffic tends to degrade input/output performance.
Write-back. In this scheme, the central processor updates the cache during a write, but actual
updating of the memory is deferred until the line that has been changed is discarded from the
cache. At that point, the changed data are written back to main memory.
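The two policies can be contrasted in a short Python sketch; the dictionaries and function names are illustrative stand-ins, not a real cache controller:

```python
# The two write policies side by side. `memory` stands in for main
# memory (DRAM) and `cache` for the SRAM cache; all names illustrative.
memory = {}
cache = {}
dirty = set()  # lines changed in cache but not yet written to memory

def write_through(address, value):
    cache[address] = value
    memory[address] = value  # memory updated on every write

def write_back(address, value):
    cache[address] = value
    dirty.add(address)       # memory update deferred...

def evict(address):
    if address in dirty:     # ...until the changed line is discarded
        memory[address] = cache[address]
        dirty.discard(address)
    cache.pop(address, None)

write_through(0x100, 1)
write_back(0x200, 2)
print(0x200 in memory)  # False: write-back has not reached memory yet
evict(0x200)
print(memory[0x200])    # 2: written back when the line was discarded
```

The sketch shows the trade-off directly: write-through touches memory on every write, while write-back defers the traffic until eviction.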
Read Only Memory (ROM). Computers almost always contain a small amount of read-only
memory that holds program instructions for starting up the computer and performing special
diagnostics. This is often referred to as a BIOS (Basic Input/Output System) chip. Unlike
RAM, ROM cannot be written to. In fact, both types of memory (ROM and RAM) allow random
access, so strictly speaking, RAM should be called read-write memory. ROM typically has a
slow access time and is more expensive to produce than RAM.
ROM is non-volatile, i.e. its contents are retained even when the power is switched off.
Other devices which are added to the PC can have their own ROM e.g. a graphics card will
have its own ROM dedicated to the operation of the graphics card alone.
Programmable Read-Only Memory (PROM). Like a ROM, this is a memory chip on which
you can store program code. But once the PROM has been used, you cannot wipe it clean and
use it to store something else. Like ROMs, PROMs are non-volatile; they retain their contents
even when the computer is turned off.
The difference between a PROM and a ROM
is that a PROM is manufactured as blank
memory, whereas a ROM is programmed
during the manufacturing process. To write
data onto a PROM chip, you need a special
device called a PROM burner. The process of
programming a PROM is sometimes called
burning the PROM.
A PROM / EPROM burner.
PROMs are cheap to produce (although the initial setup costs are high) and are used for
various types of firmware. This could be anything from sound cards to washing machines - in
fact, anything that has some sort of electronic device embedded in it will use a PROM.
Erasable Programmable Read-Only Memory (EPROM) and Electrically Erasable
Programmable Read-Only Memory (EEPROM). EPROM is a special type of memory that
retains its contents until it is exposed to ultraviolet light. The ultraviolet light clears the contents,
making it possible to reprogram the memory. To reprogram an EPROM, you need a PROM
burner.
An EPROM differs from a PROM in that a PROM can be written to only once and cannot be
erased. EPROMs are used widely in types of firmware that may be subject to upgrade at some
point in the future. They also enable the manufacturer to change the contents of the PROM
before the device is actually shipped - for example, in a PC any bugs can
be removed and new versions installed shortly before delivery. Another
widespread use is in component manufacturing processes, where
EPROMS may be used for testing and quality control purposes.
The EEPROM works like the EPROM but is cleared using an electrical
charge rather than UV light. Like other types of PROM, both EPROMs
and EEPROMs retain their contents even when the power is turned off.
They are not as fast as RAM and are comparatively expensive.
EEPROM is similar to flash memory (sometimes called flash EEPROM).
The principal difference is that EEPROM requires data to be written or
erased one byte at a time whereas flash memory allows data to be
written or erased in blocks (thereby making flash memory faster).
Week 4 - How RAM works.
We already saw that main memory, the RAM, communicates with the processor by the data and
address buses. We also learned that the bus consists not of a single, but of multiple, electrical
circuits or lines. The width of the address bus dictates how many different memory locations can
be accessed, and the width of the data bus how much information is stored at each RAM
location.
Every time a bit is added to the width of the address bus, the address range doubles. This
effectively means that the CPU can access 2^n locations, where n is the number of address
lines - for example, the Intel 386 processor had a 32-bit address bus, enabling it to access up
to 4,294,967,296 locations. (If each location = 1 byte, this would be 4 GB of memory.) The
Pentium processor - introduced in 1993 - had a data bus width of 64 bits, enabling it to access
8 bytes of data at a time. This model also used 168-pin DIMMs (earlier computers mainly used
SIMMs), which are specifically designed to support 64-bit paths and which are still the industry
standard.
The actual chips themselves consist of rectangular arrays of memory cells, arranged in rows
(wordlines) and columns (bitlines). Each memory cell also has a unique location or address
defined by the intersection of a row and a column, and we usually refer to these addresses in
hexadecimal notation (remember that it would really be stored in binary - and humans can't
readily understand long binary strings!)
DRAM is manufactured using a similar process to a processor. A thin wafer of silicon has a
circuit etched onto it using an acid bath - the circuit includes millions of tiny transistors and
capacitors plus the control circuitry. The overall design is just a series of simple, repeated
structures, so the whole thing can be reproduced very simply and cheaply. Over the years,
several different structures have been used to create the memory cells on a chip, but the
support circuitry usually consists of sense amplifiers to amplify the signal or charge detected
on a memory cell, and some sort of address logic to select the correct rows and columns. Other
components on the RAM chip may include internal counters or registers to keep track of the
refresh sequence, or to initiate refresh cycles as needed; plus there will be some sort of control
device for actually reading from or writing to the selected cell.
In DRAM, microscopically small capacitors are used to hold the charge representing binary 1s
and 0s, but these are so tiny that they discharge very quickly, and all the data is lost. To
overcome this problem, other circuitry refreshes the memory, reading the value before it
disappears completely, and rewriting it back. (This action is what makes the memory dynamic).
The access time is expressed in nanoseconds (.000000001 sec, or about the time that light
takes to travel 30 cm!) - most models of DRAM of this period have an access time of 60 or 70 ns.
The most difficult aspect of working with DRAM devices is resolving the timing requirements. A
sequence of several events has to take place before a RAM address can be read from or written
to; all of this has to be co-ordinated by the Control Unit and system clock.
Row Address Select. The /RAS circuitry is used to latch the row address and to initiate the
memory cycle. It is required at the beginning of every operation. To enable /RAS, the voltage
level is changed from high to low, and must stay in a low state until the /RAS is no longer
required. /RAS may also be used to trigger a refresh cycle (/RAS Only Refresh, or ROR).
Column Address Select. The /CAS is used to latch the column address and to initiate the read
or write operation. /CAS may also be used to trigger a /CAS before /RAS refresh cycle. This
refresh cycle requires /CAS to be active prior to /RAS and to remain active for a specified time.
Like the /RAS, it is activated by a low voltage.
Address. The addresses are used to select a memory location on the chip. The address pins
on a memory device are used for both row and column address selection, which is known as
multiplexing. The number of addresses depends on the memory's size and organisation. The
voltage level present at each address at the time that /RAS or /CAS goes active determines the
row or column address, respectively, that is selected. Other circuitry and control structures
confirm that the address being read from or written to was the one that was in fact selected!
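Multiplexing can be sketched in Python; the 10-bit row and column widths here are illustrative, not taken from any particular device:

```python
# Address multiplexing as described above: the same pins carry first the
# row address, then the column address. For this sketch the full address
# splits into high bits (row) and low bits (column). Widths illustrative.
ROW_BITS = 10
COL_BITS = 10

def split_address(address):
    row = address >> COL_BITS              # presented while /RAS goes low
    col = address & ((1 << COL_BITS) - 1)  # presented while /CAS goes low
    return row, col

row, col = split_address(0b1100110011_0101010101)
print(row, col)  # 819 341
```

With ten pins doing double duty, a 20-bit address reaches the chip in two steps instead of needing twenty pins at once.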
Write Enable. The /WE signal is used to choose a read operation or a write operation. A low
voltage level signifies that a write operation is desired; a high voltage level is used to choose a
read operation. The operation to be performed is usually determined by the voltage level on
/WE when /CAS goes low.
Output Enable: During a read operation, this control signal is used to prevent data from
appearing at the output until needed. When /OE is low, data appears at the data outputs as
soon as it is available. /OE is ignored during a write operation. In many applications, the /OE pin
is grounded and is not used to control the DRAM timing.
Data In or Out: The DQ pins (also called Input/Output pins or I/Os) on the memory device are
used for input and output. During a write operation, a voltage (high=1, low=0) is applied to the
DQ. This voltage is translated into the appropriate signal and stored in the selected memory
cell. During a read operation, data read from the selected memory cell appears at the DQ once
access is complete and the output is enabled (/OE low). At most other times, the DQs are in a
high impedance state; they do not source or sink any current, and do not present a signal to the
system. This also prevents DQ contention when two or more devices share the data bus.
Because most PC memory accesses are sequential, the current industry standard RAM is
designed to fetch all the bits in a burst as fast as possible. This type of memory is known as
Synchronous DRAM. An on-chip burst counter allows the column part of the address to be
incremented very rapidly which helps speed up retrieval of information. A component known as
the Memory Controller provides the location and size of the block of memory required; the
SDRAM chip can then supply the bits as fast as the CPU can take them, using an on-chip clock
to synchronise operations to the CPU's system clock.
Different Speeds of RAM.
Until a couple of years ago, most RAM ran at its own speed (asynchronous). The industry
standard nowadays, however, is Synchronous DRAM (SDRAM), which is synchronised to the
system clock. This enables data to be delivered off-chip at burst rates of up to 133 MHz,
although some set-up time is required for the initial data transfer.
The problem with SDRAM was that it was never truly designed to run at speeds beyond about
100 MHz. Developments in the technology of chipsets began to outstrip developments in RAM,
and various manufacturers began to develop alternatives. One stop-gap was Intel's S-RIMM
specification, which allows PC100 SDRAM chips to be used on Direct RDRAM-style memory
modules; but this was complex and expensive. The next step was DRDRAM, or Rambus,
specifically designed for the Pentium 4.
This is a totally new RAM architecture, complete with bus mastering (the Rambus Channel
Master) and a new pathway (the Rambus Channel) between memory devices (the Rambus
Channel Slaves). Direct RDRAM is actually the third version of the Rambus technology. The
original (Base) design ran at 600MHz and this was increased to 700MHz in the second iteration,
known as Concurrent RDRAM.
A Direct Rambus channel includes a controller and one or more Direct RDRAMs connected
together via a common bus - which can also connect to devices such as micro-processors,
digital signal processors, graphics processors and other circuits. The controller is located at
one end, and the RDRAMS are distributed along the bus, which is parallel terminated at the far
end. The two-byte wide channel uses a small number of very high speed signals to carry all
address, data and control information at up to 800MHz.
September 2003
ABCDE
Engineering, Computing and Business Studies:
Computer Architecture (D75P34)
The other big player battling to provide system builders with high-performance RAM is Double Data Rate SDRAM (DDR SDRAM). This works by allowing the activation of output operations on the chip to occur on both the rising and falling edges of a clock cycle, thereby providing an effective doubling of the clock frequency without increasing the actual frequency. Like other types, DDR SDRAM is tied to the front-side bus, with both the memory and bus executing instructions at the same time rather than one of them having to wait for the other.
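The effect of double-data-rate transfer can be shown with the usual peak-bandwidth arithmetic (a rough sketch: peak rate = clock x transfers per clock x bus width; the 133 MHz and 64-bit figures below are the common PC133 values, and real-world throughput is lower):

```python
def peak_bandwidth_mb_s(clock_mhz, transfers_per_clock, bus_width_bytes):
    # Peak transfer rate in MB/s: clock rate x transfers per cycle x bus width
    return clock_mhz * transfers_per_clock * bus_width_bytes

sdram = peak_bandwidth_mb_s(133, 1, 8)   # PC133 SDRAM on a 64-bit bus
ddr   = peak_bandwidth_mb_s(133, 2, 8)   # DDR at the same 133 MHz clock
print(sdram, ddr)                        # DDR doubles the peak rate
```

Note that DDR doubles the number of transfers per clock, not the clock itself - which is exactly the "effective doubling" described above.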
Virtual Memory.
We have already looked at different types of physical memory in a PC system, and in particular
at the mainstream RAM - the memory which holds the code or application currently being run.
When a PC loads up Windows, for example, the code relating to the windows and graphics
(USER and GDI code) loads into the lower section of memory, as do any older DOS
applications or drivers. The core Windows operating system (VMM code) loads into the top
part. Each Windows application is then loaded into its own protected memory space, usually
above the system and DOS code. These allocations can be shown pictorially as a memory
map.
But what happens if there isn't enough unallocated memory to run an application? In this case, Windows has to pinch a bit of hard disk space to park any RAM code that hasn't been recently used. This becomes part of the system's Virtual Memory.
Virtual memory is a combination of RAM (physical system memory) and reserved hard disk
space. It can be used to store both program code and data when applications are running.
During the execution of a program, at any given point some parts of the code will be in physical RAM while other parts are swapped out to the hard disk. This arrangement makes it possible to have
more virtual memory in your system than you have RAM installed, and also makes it possible to
run more applications simultaneously.
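The idea can be sketched as a toy model (the names and page contents here are purely illustrative): pages live either in RAM or in a swap area on disk, and touching a swapped-out page triggers a "page-in" before it can be used.

```python
# Toy model of virtual memory: pages live either in physical RAM or in a
# swap file on disk. Reading a swapped-out page brings it back into RAM.
ram  = {0: "code A", 1: "code B"}   # pages currently in physical RAM
swap = {2: "code C"}                # pages parked on the hard disk

def read_page(page):
    if page not in ram:             # page fault: fetch the page from disk
        ram[page] = swap.pop(page)
    return ram[page]

print(read_page(2))   # faults, then loads "code C" from swap into RAM
```

A real operating system also evicts pages from RAM to make room, but even this sketch shows why total virtual memory can exceed installed RAM.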
How To Find Out How Much Virtual Memory A System Has.
Windows can tell you the amount of virtual memory available at any given time and the percentage of your total system resources that are currently available for applications. To get this information, choose Start>>Programs>>Accessories>>Resource Meter. This will show the % of free resources listed. (There are also various freeware programs available which will do the same thing, but fancier.) It's recommended that you keep free memory and resources as high as possible.
What Causes Free Memory to Decrease?
Every time you run an application program under Windows, that program uses some of your
free virtual memory to run program code, and to store and display data. Programs use
additional memory as they open new documents, execute utilities or perform other operations.
If you're running low on virtual memory one of the first indications is that your system will slow to
a crawl! This can be solved by closing down some (or all) applications that are running in the
background; in some cases, you'll need to close and restart Windows, because some
applications don't de-allocate memory after they're closed.
Increasing the Amount of Virtual Memory in a System.
In some circumstances you may have to increase the amount of virtual memory in the system.
You can do this in two ways:
Increase the amount of system RAM available, by adding to or upgrading the chips;
Create a permanent or temporary swap file, or increase the size of the current Windows
swap file.
How to Create, Delete, or Change the Size of a Swap File
Whenever possible, it's best to let Windows manage your virtual memory. Windows chooses the
default setting based on the amount of free hard-disk space. The swap file then shrinks and
grows dynamically based on actual memory usage. If you need to specify a different disk or set
limits on the minimum or maximum reserved space, however, you can create, delete or resize a
swap file manually.
Before creating a swap file, run a disk defragmentation utility. Then go to
Start>>Settings>>Control Panel>>System>>Properties>>Virtual Memory. Click the radio
button for Let me specify my own virtual memory settings, and then enter the new disk in
Hard disk or enter values (in kilobytes) in Minimum or Maximum. Note that Windows cannot
create a swap file from compressed or stacked hard disk space.
Performance Considerations for Virtual Memory
The fastest type of virtual memory is physical RAM. The more virtual memory that's provided by
memory chips in your computer, the faster Windows will run. Because of this, it's best to
increase physical memory whenever possible. Creating a swap file on a network drive is not
recommended - network swap files are extremely slow. If you must create a swap file on a
network drive, create a permanent swap file. Before creating the swap file, you must make sure
the network directory does not have a read-only attribute, and you must have both create and
write access to the directory.
Temporary vs. Permanent Swap Files
Windows allows you to set up 2 types of swap files, temporary or permanent. Temporary swap
files can be created out of fragmented hard disk space, but permanent swap files can only be
created out of contiguous hard disk space. Depending on the amount of contiguous free hard
disk space available, performance concerns, and the amount of hard disk space you need when
not running Windows, one type of swap file may be better for your configuration than the other.
Of the 2 types of swap files, temporary swap files are slower. The more fragmented your hard
disk is, the slower a temporary swap file becomes. A temporary swap file takes the form of a
DOS file (WIN386.SWP) that is created on your hard disk when Windows loads, and gets
deleted when you exit from Windows. To maintain the best performance from a temporary
swap file, run a defragmentation utility on the hard disk frequently.
Once you've set up virtual memory for a temporary swap file, a swap file of the requested size
will be created every time Windows loads. However, if the requested size swap file would use
more than 50% of the available hard disk space, the size is reduced to accommodate the 50%
limit. Windows does not warn you that it's creating a smaller swap file, so if disk space is low,
and if you add files to the hard disk, be aware that your swap file size may be affected!
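The 50% cap described above is simple arithmetic, sketched here in Python (the function name is ours; sizes are in kilobytes as in the Windows dialog):

```python
def actual_swap_size_kb(requested_kb, free_disk_kb):
    # Windows silently caps a temporary swap file at 50% of free disk space
    return min(requested_kb, free_disk_kb // 2)

# Request a 100 MB swap file with 150 MB of disk free: capped at 75 MB
print(actual_swap_size_kb(100000, 150000))
```

So as the disk fills up, the swap file you actually get can quietly shrink below the size you asked for.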
Overclocking.
Overclocking is the practice of running your CPU past the speed that it is rated at, for example
running a 1.2 GHz CPU at 1.4 GHz. How can this be achieved? Most CPU manufacturers
create their CPUs and then test them at a certain speed. If the CPU fails at a certain speed,
then it is sold as a CPU at the next lower speed. The tests are usually very stringent, so a CPU may be able to run at the higher speed quite reliably. In fact, the tests are often not used at all - once a company has been producing a certain CPU for a while, they may well mark some of them down as the slower CPUs in order to fulfil market demand!
Is overclocking dangerous? For the most part, no - provided you are not trying to run your old
486 33MHz at 1 GHz. Another practice that is not recommended is monkeying about with any
of the voltage settings. You must also keep the CPU as cool as possible, perhaps by fitting an
auxiliary fan.
Most modern CPUs are multiplier locked - i.e. you cannot change the actual CPU speed - but
you can change the bus speed. The multiplier is a figure obtained by dividing the default CPU
speed by the default bus speed, e.g. a 1.2 GHz Athlon with a 133 MHz bus => 1200/133 = a
multiplier of 9. On older CPUs it was possible to change the multiplier by altering some of the
jumper settings on the motherboard, but this is not possible with most CPUs on the market
today. The only way to alter the overall CPU speed, therefore, is to alter the speed of the bus.
Changing the bus speed is actually more beneficial than changing the CPU's speed - when you
increase the bus speed, in many cases you will be overclocking all the parts in your AGP, PCI
and ISA slots, and your RAM as well as the CPU. Usually this is by a small margin and won't
hurt these components.
In your motherboard manual, find the jumper settings for the particular bus speed you want to
use. Locate those jumpers on your motherboard and change them to fit the jumper settings in
the manual. Some motherboards have a "SoftMenu," which enables you to change the bus
speed in the computer's BIOS. Calculate the new processor speed by multiplying the bus speed
by your CPU's multiplier.
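The multiplier arithmetic described above can be checked in a couple of lines (a sketch using the 1.2 GHz Athlon example from the text; the 140 MHz bus figure is an assumed illustration, not a recommendation):

```python
def multiplier(cpu_mhz, bus_mhz):
    # Multiplier = default CPU speed / default bus speed, to the nearest whole
    return round(cpu_mhz / bus_mhz)

def overclocked_speed(bus_mhz, mult):
    # New CPU speed = new bus speed x the (locked) multiplier
    return bus_mhz * mult

m = multiplier(1200, 133)             # the 1.2 GHz Athlon example -> 9
print(m, overclocked_speed(140, m))   # raising the bus to 140 MHz -> 1260 MHz
```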
WEEK 5 - Memory Maps.
We already saw that when a computer is first booted up, it loads part of the operating system
into RAM. This means that not all of the RAM is available for subsequent applications. As well
as the OS, many peripherals also "claim" a bit of RAM space for their I/O processes,
immediately their device drivers are loaded. Any applications which are then opened have to fit themselves in around the sections of RAM that have already been bagged.
In a typical Windows configuration it is easy to see where the RAM space for any given device
is located.
This screen was obtained by clicking on My Computer >> Control Panel >> System >> Device Manager >> Modem >> Properties >> Resources. We can then tell from this screen that the modem card uses IRQ Channel 3 and memory locations 1428 - 142F and 2000 to 20FF. The addresses are always given in hexadecimal notation. The Conflicting Device list shows "No Conflicts", meaning that this modem is not in competition with any other device for the IRQ channels and RAM locations it's using.
By looking at the properties of other devices you can see which resources are claimed on startup. It is sometimes useful to set out a diagram showing which parts of RAM are claimed by devices, and which are free for running applications; these diagrams are called Memory Maps.
Examples of Memory Maps.
These are all common, commercial examples provided by different computer manufacturers
which demonstrate, if nothing else, that the term "memory map" can mean different things to
different manufacturers!
Memory maps are usually labelled in hex. This is because computers store their locations in binary - and we humans cope better with hexadecimal!
Example 1.
The following memory maps show which areas of the memory space are available for your
program's use. Essentially, if you do not use the MON51 Target Monitor, you have the
entire address space available.
Configuration Using MON51

Memory Type & Range      Description
XDATA (0000h-6AFFh)      von Neumann RAM/ROM (Reserved for program code.)
XDATA (6B00h-7FFFh)      von Neumann RAM/ROM used by the Monitor (Data Area)
XDATA (8000h-FFFFh)      Free RAM (Available for target program.)
CODE (0000h-6AFFh)       von Neumann RAM/ROM (Reserved for program code.)
CODE (6B00h-7FFFh)       von Neumann RAM/ROM used by the Monitor (Data Area)
CODE (8000h-9100h)       ROM used by the Monitor (Code Area)
Example 2. There are no figures with this one showing the locations and space used, because
this will vary between manufacturers. However this is a very common configuration and will
probably be similar to your own PC. It's actually a memory map of a Motorola 68000 derivative as used in an early Apple Macintosh.
Figure: A simple schematic memory map of a
microcomputer. The order of the different segments of
memory can vary depending on the system.
Example 3. A very fancy memory map of a games console; this would be similar to a Nintendo
64 or a PlayStation configuration.
Example 4. One to draw for yourself. This is similar to the memory map question in Outcome 2
for Computer Architecture.
A certain system has an addressable memory using 16 lines. 8kb of boot code starts at 0000h.
System RAM starts at location 16384; this block of RAM extends for 16kb. RAM reserved for
the video display begins at location 8000h and continues for 4kb. Immediately after that comes
4kb of flash memory. The top 2kb is reserved for memory mapped i/o buffer space.
Draw the relevant memory map, labelling the areas claimed by the connected devices, unused
space and the addresses of the start and finish of each area.
How to draw a memory map.
First, you must calculate how much addressable memory you are working with - i.e. the number of addresses, or locations, available.
This will always be 2^(number of address lines). For example, a system with an 8-bit address bus will have 2^8, or 256, addresses. Now draw a vertical bar chart and mentally divide it into 256 slices. Label the bottom section 0 and the top one 255. Note that the number of the top address is always the size - 1.
It is now a simple task to fill in the slices that have been claimed by the various devices. Your
assessment question will give you addresses or sizes in both hex and decimal formats, so some
base conversion will be required. You may find it useful to label the sections in hexadecimal on
one side of the bar chart and decimal on the other. For assessment purposes, at least one side
must show all the starting and finishing addresses.
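The size and top-address calculation, and the hex/decimal conversion, can be sketched in Python (the function name is ours, for illustration only):

```python
# Number of addressable locations for an n-line address bus is 2**n,
# and the top address is always (size - 1).
def address_space(lines):
    size = 2 ** lines
    top = size - 1
    return size, top, format(top, "X") + "h"   # size, top in decimal, top in hex

print(address_space(8))    # (256, 255, 'FFh')
print(address_space(16))   # (65536, 65535, 'FFFFh')
```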
Once you have finished, check your solution against the answer over the page.
Example 4 suggested solution.
First work out total addressable memory - it's 2^16, which equals 65536 locations, each of which is 1 byte in size.
Addresses start from 0, so the range will run from 0 to (65536 less 1), which equals 0 - 65535 in decimal notation, or 0000h to FFFFh in hex.
Now draw a map - for assessment purposes it does not have to be to scale, but the addresses
MUST be accurate!
Address (decimal notation)     Contents     Address (hexadecimal notation)
63488 - 65535 (62k - 64k)      I/O Buffer   F800h - FFFFh
40960 - 63487 (40k - 62k)      Free         A000h - F7FFh
36864 - 40959 (36k - 40k)      Flash        9000h - 9FFFh
32768 - 36863 (32k - 36k)      Video        8000h - 8FFFh
16384 - 32767 (16k - 32k)      RAM          4000h - 7FFFh
8192 - 16383 (8k - 16k)        Free         2000h - 3FFFh
0000 - 8191 (0 - 8k)           Boot         0000h - 1FFFh
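As a quick check on the working, the regions in the suggested solution can be verified programmatically: a correct map must tile the whole 64 KB space with no gaps or overlaps (a sketch; the region list simply transcribes the table above).

```python
# Check that the answer's regions are contiguous and cover 0000h-FFFFh.
regions = [
    (0x0000, 0x1FFF, "Boot"),
    (0x2000, 0x3FFF, "Free"),
    (0x4000, 0x7FFF, "RAM"),
    (0x8000, 0x8FFF, "Video"),
    (0x9000, 0x9FFF, "Flash"),
    (0xA000, 0xF7FF, "Free"),
    (0xF800, 0xFFFF, "I/O Buffer"),
]
expected_start = 0
for start, end, name in regions:
    assert start == expected_start, name   # each region begins where the last ended
    expected_start = end + 1
assert expected_start == 0x10000           # the map covers the full address space
print("map is contiguous and covers 0000h-FFFFh")
```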
Exercise - External hardware, Amstrad original PCW series.
Draw the relevant Memory Map based on the following figures. Label all the starting and finishing addresses and show both claimed and free space. How wide is the address bus?

Location                            Device
0 (1 byte)                          FDC status register
1 (1 byte)                          FDC data register (comes next)
Starts at 136, uses 8 bytes         Parallel ports
Location 159                        Kempston joystick
A0-A7                               AMX mouse
A0-A2                               EMR MIDI interface
Starts at 168, uses 8 bytes         Hard drive
C8-CF                               Fax Link interface (CPS8256-compatible circuitry)
Starts at 208, uses 8 bytes         Kempston mouse
DF                                  MasterScan: b0 ink under scan head
Location 224                        Cascade/Spectravideo joystick. Input: b4 right, b3 up, b2 left, b1 fire, b0 down
                                    Immediately after comes free space
FF                                  Top location reserved for PROM code
Exercise - Atari Games Console (part).
Draw the relevant Memory Map based on the following figures. Label all the starting and finishing addresses and show both claimed and free space. You may assume a 16-bit address bus.

Contents                          Location
Top of memory:
  Operating System ROM            Top 4630 bytes (60906-65535)
  Device handler routines         E944 onwards
  Serial I/O utilities            59093-59715
  Interrupt handler               E4A6 for 559 bytes
  Central I/O utilities           58533
  Operating System vectors        E480 onwards
  RAM vectors on powerup          58448-58495
  JMP vectors                     58432-58447
  Cassette                        16 bytes up to E43F
  Printer                         58400-58415
  Keyboard                        Start at E41F for 16 bytes
  Screen                          58368-58383
  Editor                          E36D
ROM Character set                 57343
Floating Point ROM package        55295
I/O chips:
  ANTIC                           start at 54272 for 12 bytes
  Programmable Interrupt          54016-54271
  Power On Key                    53760-54015
  GTIA or CTIA                    Start from D000 for 1/4 kilobyte
Week 7 - Polling, Interrupts and Device Handling.
An interrupt is a signal informing a program that an event has occurred. When a program receives
an interrupt signal, it takes a specified action (which can be to ignore the signal). Interrupt
signals can cause a program to suspend itself temporarily to service the interrupt.
Interrupt signals can come from a variety of sources. For example, every keystroke generates
an interrupt signal. Interrupts can also be generated by other devices, such as a printer, to
indicate that some event has occurred. These are called hardware interrupts. Interrupt signals
initiated by programs are called software interrupts. A software interrupt is also called a trap or
an exception.
PCs support 256 types of software interrupts including 16 hardware interrupts. Each type of
software interrupt is associated with an interrupt handler -- a routine that takes control when the
interrupt occurs. For example, when you press a key on your keyboard, this triggers a specific
interrupt handler. The complete list of interrupts and associated interrupt handlers is stored in a
table called the interrupt vector table, which resides in the first 1 K of addressable memory.
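The "first 1 K" figure follows from simple arithmetic: each real-mode vector table entry is 4 bytes (a 2-byte offset plus a 2-byte segment), so the entry for interrupt n sits at address n x 4, and 256 entries occupy exactly 1024 bytes. A sketch:

```python
# Each real-mode interrupt vector is 4 bytes (2-byte offset + 2-byte segment),
# so the handler address for interrupt n is stored at memory location n * 4.
def vector_address(n):
    return n * 4

print(hex(vector_address(0x09)))   # the vector for interrupt 09h lives at 0x24
print(vector_address(255) + 4)     # the table ends at byte 1024 - the first 1 K
```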
Why Interrupts Are Used to Process Information.
The processor is a highly-tuned machine that is designed to (basically) do one thing at a time.
However, we use our computers in a way that requires the processor to at least appear to do
many things at once. If you've ever used a multitasking operating system like Windows 95,
you've done this; you may have been editing a document while downloading information on your
modem and listening to a CD simultaneously. The processor is able to do this by sharing its
time among the various programs it is running and the different devices that need its attention. It
only appears that the processor is doing many things at once because of the blindingly high
speed that it is able to switch between tasks.
Most of the different parts of the PC need to send information to and from the processor, and
they expect to be able to get the processor's attention when they need to do this. The processor
has to balance the information transfers it gets from various parts of the machine and make sure
they are handled in an organised fashion. There are two basic mechanisms that a processor
can employ.
Polling: The processor could take turns going to each device and asking if they have anything
they need it to do. This is called polling the devices. In some situations in the computer world
this technique is used, however it is not used by the processor in a PC for a couple of basic
reasons. One reason is that it is wasteful; going around to all the devices constantly asking if
they need the attention of the CPU wastes cycles that the processor could be doing something
useful. This is particularly true because in most cases the answer will be "no". Another reason is
that different devices need the processor's attention at differing rates; the mouse needs
attention far less frequently than say, the hard disk (when it is actively transferring data).
Interrupting: The other way that the processor can handle information transfers is to let the
devices request them when they need its attention. This is the basis for the use of interrupts.
When a device has data to transfer, it generates an interrupt that says "I need your attention
now, please". The processor then stops what it is doing and deals with the device that
requested its attention. It actually can handle many such requests at a time, using a priority level
for each to decide which to handle first.
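The two mechanisms can be contrasted in a small sketch (the device names and flags are illustrative, not any real API): polling asks every device in turn, mostly for nothing, while interrupts let devices queue requests that are then served in priority order.

```python
# Polling: ask every device whether it has work - most checks are wasted.
def poll(devices):
    return [name for name, has_work in devices.items() if has_work]

# Interrupts: devices raise (priority, name) requests; lowest number wins.
def serve_interrupts(requests):
    return [dev for _, dev in sorted(requests)]

devices = {"keyboard": False, "mouse": False, "disk": True}
print(poll(devices))                                        # only the disk had work
print(serve_interrupts([(6, "floppy"), (1, "keyboard")]))   # keyboard served first
```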
It's also interesting to put into perspective just how fast the modern processor is compared to
many of the devices that transfer information to it. Let's imagine a very fast typist; say, 120
words per minute. At an average of 5 letters per word, this is 600 characters per minute on the
keyboard. You might be fascinated to realize that if you type at this rate, a 200 MHz computer
will process 20,000,000 instructions between each keystroke you make! You can see why
having the processor spend a lot of time asking the keyboard if it needs anything would be
wasteful, especially since at any time you might stop for a minute or two to review your writing,
or do something else. Even while handling a full-bandwidth transfer from a 28,800 bits/sec
modem, which of course moves data much faster than your fingers, the processor has over
60,000 instruction cycles between bytes it needs to process.
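The keystroke figure above is easy to reproduce (a back-of-envelope sketch assuming roughly one instruction per clock cycle, which modern processors exceed):

```python
# 120 words/min x 5 letters/word = 600 keystrokes/min, i.e. 10 per second.
keystrokes_per_sec = 120 * 5 / 60

# A 200 MHz processor, assuming ~1 instruction per cycle:
instructions_per_keystroke = 200000000 / keystrokes_per_sec
print(int(instructions_per_keystroke))   # 20,000,000 instructions per keystroke
```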
Interrupt Controllers
Device interrupts are fed to the processor using a special piece of hardware called an interrupt
controller. The standard for this device is the Intel 8259 interrupt controller, and has been since
early PCs. As with most of these dedicated controllers, in modern motherboards the 8259 is, in
most cases, incorporated into a larger chip as part of the chipset.
The interrupt controller has 8 Interrupt Request Lines (IRQs) that take requests from one of 8
different devices. The controller then passes the request on to the processor, telling it which
device issued the request (which interrupt number triggered the request, from 0 to 7). The
original PC and XT had one of these controllers, and hence supported interrupts 0 to 7 only.
Starting with the IBM AT, a second interrupt controller was added to the system to expand it;
this was part of the expansion of the ISA system bus from 8 to 16 bits. In order to ensure
compatibility the designers of the AT didn't want to change the single interrupt line going to the
processor. So what they did instead was to cascade the two interrupt controllers together. The
first interrupt controller still has 8 inputs and a single output going to the processor. The second
one has the same design, but it takes 8 new inputs (doubling the number of interrupts) and its
output feeds into input line 2 of the first controller. If any of the inputs on the second controller
become active, the output from that controller triggers interrupt #2 on the first controller, which
then signals the processor.
Interrupt Priority
The PC processes device interrupts according to their priority level. This is a function of which
interrupt line they use to enter the interrupt controller. For this reason, the priority levels are
directly tied to the interrupt number:
On an old PC/XT, the priority of the interrupts is 0, 1, 2, 3, 4, 5, 6 and 7.
On a modern machine, it's slightly more complicated, because remember that IRQ2 cascades to
the higher eight lines. The result of this is that the priorities become 0, 1, (8, 9, 10, 11, 12, 13,
14, 15), 3, 4, 5, 6 and 7.
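The priority order above can be generated mechanically: take the PC/XT order 0-7 and splice the second controller's lines (8-15) in at IRQ2's position, since that is where the cascade feeds in. A sketch:

```python
# Reproduce the AT interrupt priority order: IRQ2 is the cascade input,
# so IRQs 8-15 inherit IRQ2's slot in the PC/XT order 0,1,2,...,7.
def at_priority_order():
    order = []
    for irq in range(8):                    # first controller, highest first
        if irq == 2:
            order.extend(range(8, 16))      # cascade: second controller's lines
        else:
            order.append(irq)
    return order

print(at_priority_order())
# [0, 1, 8, 9, 10, 11, 12, 13, 14, 15, 3, 4, 5, 6, 7]
```

This matches the "16-bit priority" column in the table later in this section: IRQ8 has priority 3, IRQ9 priority 4, and so on, while IRQ3 drops to priority 11.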
Non-Maskable Interrupts (NMI)
All of the regular interrupts that we normally use and refer to by number are called maskable
interrupts. The processor is able to mask, or temporarily ignore, any interrupt if it needs to, in
order to finish something else that it is doing. In addition, however, the PC has a non-maskable
interrupt (NMI) that can be used for serious conditions that demand the processor's immediate
attention. The NMI cannot be ignored by the system unless it is shut off specifically.
When an NMI signal is received, the processor immediately drops whatever it was doing and
attends to it. As you can imagine, this could cause havoc if used improperly. In fact, the NMI
signal is normally used only for critical problem situations, such as serious hardware errors. The
most common use of NMI is to signal a parity error from the memory subsystem. This error must
be dealt with immediately to prevent possible data corruption.
Multiple Devices and Conflicts
In general, interrupts are single-device resources. Because of the way the system bus is
designed, it is not feasible for more than one device to use an interrupt at one time, because
this can confuse the processor and cause it to respond to the wrong device at the wrong time. If
you attempt to use two devices with the same IRQ, an IRQ conflict will result. This is one of the
types of resource conflicts.
It is possible to share an IRQ among more than one device, but only under limited conditions. In
essence, if you have two devices that you seldom use, and that you never use simultaneously,
you may be able to have them share an IRQ. However, this is not the preferred method since it
is much more prone to problems than just giving each device its own interrupt line.
One of the most common problems regarding shared IRQs is the use of the third and fourth
serial (COM) ports, COM3 and COM4. By default, COM3 uses the same interrupt as COM1
(IRQ4), and COM4 uses the same interrupt as COM2 (IRQ3). If you have a mouse on COM1
and set up your modem as COM3--a very common setup--guess what happens the first time
you try to go online? You can share COM ports on the same interrupt, but you have to be very
careful not to use both devices at once; in general this arrangement is not preferred.
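The default COM-to-IRQ pairings described above make the clash easy to check programmatically (a sketch; the table and function are ours, but the assignments are the standard defaults from the text):

```python
# Default COM port IRQ assignments: COM1/COM3 share IRQ4, COM2/COM4 share IRQ3.
DEFAULT_IRQ = {"COM1": 4, "COM2": 3, "COM3": 4, "COM4": 3}

def irq_conflict(port_a, port_b):
    # Two ports conflict if their default assignments land on the same IRQ line
    return DEFAULT_IRQ[port_a] == DEFAULT_IRQ[port_b]

print(irq_conflict("COM1", "COM3"))   # True - the mouse-and-modem clash above
print(irq_conflict("COM1", "COM2"))   # False - different IRQ lines
```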
Many modems will let you change the IRQ they use to IRQ5 or IRQ2, for example, to avoid this
problem. Other common areas where interrupt conflicts occur are IRQ5, IRQ7 and IRQ12. The
following table shows the most common IRQ configurations on a PC-based system.
IRQ
line
16-Bit
Priority
Bus
Line
Default
Use
Other
Common Uses
Description
Conflicts
0
1
No
System
timer.
None; for
system use
only
This is used exclusively
for internal operations
and is never available
to peripherals or user
devices.
1
2
No
Keyboar
d/
Keyboar
d
controlle
r
None; for
system use
only
This is used exclusively
for keyboard input.
Even on systems
without a keyboard,
IRQ1 is not available
for use by other
devices. Note that the
keyboard controller
also controls the PS/2
style mouse if the
system has one, but
the mouse uses a
separate line, IRQ12.
This is a dedicated
interrupt line; there
should never be any
conflicts. If software
indicates a conflict on
this IRQ, there is a
good possibility of a
hardware problem
somewhere on your
system board.
This is a dedicated
interrupt line; there
should never be any
conflicts. If there is, this
would indicate a
motherboard or chipset
(keyboard controller)
problem.
2
N/a
No
Cascad
es a
second
interrupt
controlle
r to the
first,
allowing
the use
of IRQs
8 to 15.
Seldom
used
nowadays
except for
older
modems
and EGA
video
cards, or
as an
alternative
IRQ for
COM3 or
COM4.
For compatibility with
older cards that used
IRQ2 on the original
PC or XT machines
(which had only one
controller and a normal
IRQ2 line), the
motherboard of modern
PCs reroutes IRQ2 to
IRQ9. Hence IRQ2 can
still be used but
appears to the system
as IRQ9.
September 2003
45
Conflicts generally
come from trying to use
a device on IRQ2 and
another on IRQ9 at the
same time. Some
modems and serial port
cards allow IRQ2 to be
used as an alternative
for the two standard
lines used for modems
and serial ports (IRQ3
and IRQ4) in order to
avoid conflicts in those
two heavily-contested
areas. This is generally
a good configuration
decision since unused
IRQs from 3 to 7 are
harder to find than
unused IRQs from 10
to 15. If you want to
use IRQ2, move any
device using IRQ9 to
another line like 10 or
11.
ABCDE
Engineering, Computing and Business Studies:
Computer Architecture (D75P34)
3
11
8/16
bit
COM2
COM4,
modems,
sound
cards,
network
cards, tape
accelerator
cards.
Also a popular option
for modems, sound
cards and other
devices. Modems often
come pre-configured to
use COM2 on IRQ3.
4
12
8/16
COM1
COM3,
modems,
sound
cards,
network
cards, tape
accelerator
cards.
This port and interrupt
are almost always used
by the serial mouse,
where there is no PS/2
mouse fitment. IRQ4 is
also the default
interrupt for the third
serial port, COM3, and
a popular option for
modems, sound cards
and other devices.
Modems sometimes
come pre-configured to
use COM3 on IRQ4.
September 2003
46
Conflicts on IRQ3 are
relatively common. The
two biggest problem
areas are modems
attempting to use
COM2/IRQ3 and
clashing with the builtin COM2 port. Some
systems may attempt
to use both COM2 and
COM4 simultaneously
on this same interrupt
line. Many devices,
(particularly network
interface cards0 come
with IRQ3 as the
default.
Conflicts on IRQ4 are
relatively common,
although not as
common as on IRQ3.
On systems with a
PS/2 mouse, problems
are less common. The
two biggest problem
areas are modems that
attempt to use
COM3/IRQ4 and clash
with COM1, and
systems that attempt to
use both COM1 and
COM3 simultaneously
on this same interrupt
line.
ABCDE
Engineering, Computing and Business Studies:
Computer Architecture (D75P34)
IRQ 5 (priority 13; 8/16-bit bus).
Default use: sound card.
Other common uses: LPT2, COM3, COM4, modems, network cards, tape accelerator cards, hard disk controller on the old PC/XT.
Description: This is probably the single "busiest" IRQ in the whole system. On the original PC/XT system this IRQ was used to control the (massive 10 MB) hard disk drive. When the AT was introduced, hard disk control was moved to IRQ14 to free up IRQ5 for 8-bit devices. As a result, IRQ5 is in most systems the only free interrupt below IRQ9 and is therefore the first choice for use by devices that would otherwise conflict with IRQ3, IRQ4, IRQ6 or IRQ7.
Conflicts: Conflicts on IRQ5 are very common because of the large variety of devices that have it as an option. Sound cards especially like to grab IRQ5 and are generally best left there, to avoid problems with poorly written older software that just assumed the sound card would always be left at IRQ5. To whatever extent possible, move devices that can use higher-valued IRQs away from IRQ5.

IRQ 6 (priority 14; 8/16-bit bus).
Default use: floppy disk controller.
Other common uses: tape accelerator card.
Description: Technically IRQ6 is available for use by other devices, and some will allow you to select IRQ6, but most will not.
Conflicts: Conflicts on IRQ6 are uncommon and are usually the result of an incorrectly configured peripheral card, since IRQ6 is almost always used for floppy disks. If you use a tape accelerator card along with an integrated floppy disk controller on your motherboard, watch out for the accelerator trying to take over IRQ6.

IRQ 7 (priority 15; 8/16-bit bus).
Default use: LPT1.
Other common uses: COM3, COM4, modems, sound cards, network cards, tape accelerator cards.
Description: Normally used for a printer port. These days of course many other devices use parallel ports, including external drives. If you are not using a printer or other parallel device then IRQ7 can be used in a similar way to IRQ5: as an alternate for any of the devices that would normally be fighting over IRQ3 or IRQ4.
Conflicts: Conflicts on IRQ7 are relatively unusual. If you are using two parallel ports, make sure the second uses IRQ5 or another available IRQ. Some add-in parallel boards try to make LPT2 also use IRQ7, which generally won't work. Otherwise, avoid using IRQ7 for expansion cards.
IRQ 8 (priority 3; not available on the bus).
Default use: real-time clock.
Other common uses: none; for system use only.
Description: This is the reserved interrupt for the real-time clock timer. This timer is used by software programs to manage events that must be calibrated to real-world time; this is done by setting "alarms", which trigger this interrupt at a specified time.
Conflicts: This is a dedicated interrupt line; there should never be any conflicts. If software indicates a conflict on this IRQ, there is a good possibility of a hardware problem somewhere on your system board.

IRQ 9 (priority 4; 16-bit bus only).
Default use: none.
Other common uses: network cards, sound cards, SCSI host adapters, PCI devices, rerouted IRQ2 devices.
Description: On most PCs it can be used freely since it has no default setting.
Conflicts: There are a couple of things to watch out for when using this IRQ. First, if you are trying to use IRQ2, you cannot use IRQ9 as well, since devices that try to use IRQ2 really end up using IRQ9 instead. Also, some systems that use PCI cards requiring a system IRQ line will grab IRQ9; this can be changed in some cases using the BIOS setup to manually assign IRQs to devices.

IRQ 10 (priority 5; 16-bit bus only).
Default use: none.
Other common uses: network cards, sound cards, SCSI host adapters, secondary IDE channel, quaternary IDE channel, PCI devices.
Description: This is usually open and one of the easiest IRQs to use since it is generally not contested by many devices. While the secondary IDE controller can sometimes be set to use IRQ10, it almost always uses IRQ15 instead.
Conflicts: Conflicts on IRQ10 are unusual. The only thing to watch out for is a PCI card that needs an interrupt line being assigned IRQ10 by the BIOS; this can be changed in some cases using the BIOS setup parameters that assign IRQs to PCI devices.
IRQ 11 (priority 6; 16-bit bus only).
Default use: none.
Other common uses: network cards, sound cards, SCSI host adapters, VGA video cards, tertiary IDE channel, quaternary IDE channel, PCI devices.
Description: This line is usually open and relatively easy to use since it is generally not contested by many devices. If you are using three IDE channels (the third typically being on a sound card), IRQ11 is typically the one that the tertiary controller will try to use. Also, some PCI video cards will try to use IRQ11.
Conflicts: Watch out for PCI cards, especially video cards, that grab IRQ11.

IRQ 12 (priority 7; 16-bit bus only).
Default use: PS/2 mouse.
Other common uses: network cards, sound cards, SCSI host adapters, VGA video cards, tertiary IDE channel, PCI devices.
Description: On machines that use a PS/2 mouse, this is the IRQ reserved for its use. Using a PS/2 mouse frees up the COM1 serial port and the interrupt it uses (IRQ4) for other devices. Normally this is a good trade since free IRQs with numbers below 8 are harder to find than ones above 8. If a PS/2 mouse is not used, IRQ12 is a good choice for use by other devices such as network cards.
Conflicts: Watch out for PCI cards that can sometimes be assigned this line by the system BIOS. If you are using a PS/2 mouse you need to make sure no other devices use IRQ12.

IRQ 13 (priority 8; not available on the bus).
Default use: floating point unit (FPU / NPU / math co-processor).
Other common uses: none; for system use only.
Description: This is the reserved interrupt for the integrated floating point unit (on 80486 or later machines) or the math coprocessor (on 80386 or earlier machines that use one). It is used exclusively for internal signaling and is never available for use by peripherals.
Conflicts: This is a dedicated interrupt line; there should never be any conflicts. If software indicates a conflict on this IRQ, there is a good possibility of a hardware problem somewhere on your system board, or possibly with your processor or math coprocessor.
IRQ 14 (priority 9; 16-bit bus only).
Default use: primary IDE channel.
Other common uses: SCSI host adapters.
Description: Reserved for use by the primary IDE controller, which provides access to the first two IDE/ATA devices (usually hard disk drives and/or CD-ROM drives). On machines that do not use IDE devices at all, this IRQ can be used for another purpose (such as a SCSI host adapter to provide SCSI drives).
Conflicts: Problems with IRQ14 are rare, since the universality of its use for IDE means most peripheral vendors avoid offering it as an option. If you are using SCSI and not IDE, and want to use IRQ14, make sure any integrated IDE controllers are disabled first.

IRQ 15 (priority 10; 16-bit bus only).
Default use: secondary IDE channel.
Other common uses: network cards, SCSI host adapters.
Description: This IRQ is nowadays reserved for use by the secondary IDE controller, which provides access to the third and fourth IDE/ATA devices (usually hard disk drives and/or CD-ROM drives). If you are not using IDE, or are using only two devices and want to put them on the primary channel to free up this IRQ, that can be done easily as long as you remember to disable the secondary IDE channel.
Conflicts: Problems with IRQ15 typically result from assigning a peripheral to use it while forgetting to disable the integrated secondary IDE controller. Most Pentium or later (PCI-based) motherboards have two integrated IDE controllers. Some people incorrectly assume that there will be no conflict if nothing is attached to the secondary channel, but this is not always the case.
Interrupt Service Routines.
Suppose the PC is currently running a software application, say a spreadsheet. When you ask the machine to print a certain spreadsheet, the PC must stop what it is currently doing and deal with the print request. This involves transferring control to another programme, called an interrupt service
routine. The purpose of an interrupt system is to utilise the CPU to the full. In order to do this it
is important that the interrupt is dealt with as quickly and accurately as possible, transparent to
the user.
The interrupt routine must therefore:
1. Remember the state of the current programme and where it left off;
2. Deal with the interrupt;
3. Then return to the interrupted programme.
Step 1 is dealt with largely by the hardware; steps 2 and 3 are dealt with by software. When the CPU is ready to accept an interrupt it sends an acknowledgement signal. The following is the sequence of events:
1. An interrupt signal is generated by a device.
2. The CPU completes the execution of the current instruction and acknowledges the signal.
3. The requesting device sends an address location to the CPU via the I/O data bus, and switches off its request signal.
4. The CPU stores the current value of the PC in a known memory address, loads the word supplied in step 3 by the requesting device into the PC, and starts to process again. The instruction now being executed is the first instruction of the service routine.
Remember we talked previously about registers in the CPU. Registers hold information currently being processed, or information that will be useful later. Whenever an interrupt occurs and the CPU has to deal with it, some mechanism has to exist to store the current values of all the registers. These can then be restored once the interrupt service routine is completed.
Multiple and Nested Interrupts.
There may be a number of possible causes of an interrupt, so some routine is required to identify the cause of the interrupt.
This can be achieved by using multiple interrupt lines. Each line has its own set of memory locations. However, more than one type of interrupt may be required on one line, so a number of different interrupting devices can be attached to the same interrupt line.
An interrupt line consists of:
A request line (which transmits the request).
An acknowledgement line (which acknowledges the request signal, so that the interrupt request can be switched off).
Identifying the requesting device can be achieved either by a software technique or by a hardware function.
Software Interrupt
When an interrupt occurs, as we now know the hardware transfers control to a service routine.
There may be a number of devices attached to this line, therefore the software will have to
identify the device which is requesting the interrupt.
The software will check first one device, then skip to the next one. It checks a bit flag, which is set to one if an interrupt has been requested and reset to zero when the interrupt has been dealt with. When the flag is zero the software skips on to check the next device.
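This flag-checking loop can be sketched in Python (the Device class, flag attribute and device names here are invented purely for illustration, not a real driver interface):

```python
# Minimal sketch of software interrupt identification: the service
# routine checks each device's interrupt flag in turn. All names here
# (Device, flag, service) are illustrative, not real hardware.

class Device:
    def __init__(self, name):
        self.name = name
        self.flag = 0          # set to 1 by the device when it requests an interrupt

    def service(self):
        self.flag = 0          # dealt with: reset the flag to zero
        return self.name

def identify_and_service(devices):
    """Check each device on the line; service the first whose flag is set."""
    for dev in devices:
        if dev.flag == 1:      # flag set: this device raised the interrupt
            return dev.service()
        # flag is zero: skip on to check the next device
    return None                # no device was requesting

printer, keyboard = Device("printer"), Device("keyboard")
keyboard.flag = 1              # the keyboard raises an interrupt
print(identify_and_service([printer, keyboard]))   # the keyboard is identified
```

Note that the checking order fixes a priority: a device earlier in the list is always examined first.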
It is also possible for interrupts to interrupt each other, and this is called a nested interrupt. A nested interrupt occurs when an interrupt currently being serviced (interrupt A) is temporarily suspended to deal with another interrupt (interrupt B). In order to resume interrupt A afterwards, the address of its next instruction - the return address - is pushed onto a stack. (A stack is simply a last-in, first-out store of addresses and data in memory.) Each time an interrupt is accepted the PC (Programme Counter) is pushed onto the stack, and the stored values are popped off in reverse order as each routine completes.
Hardware Interrupt.
When using multiple interrupt lines, priorities can be achieved simply by using a priority arbitration circuit with all peripheral lines attached. Priority can be either fixed or programmed. If there is only one device per interrupt line then the arbitration circuit alone settles priority between the devices. Usually, however, there is more than one device on a line, in which case priority can be achieved by daisy chaining the interrupt acknowledgement line between the devices.
This second method of interrupt handling is called daisy chaining. The devices are attached to the same interrupt request line; however, the acknowledgement line, instead of being attached in parallel, is attached first to one device and then to the next. This means the device closest to the CPU has the highest priority.
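The daisy chain can be sketched as follows (device names are invented; the list order stands for physical position in the chain):

```python
# Daisy-chain priority sketch: the acknowledge signal is passed from
# device to device, so the first requesting device in the chain (the one
# closest to the CPU) claims the interrupt. Names are illustrative.

def acknowledge(chain):
    """Pass the acknowledge along the chain; return the device that claims it."""
    for dev in chain:                 # chain is ordered: closest to CPU first
        if dev["requesting"]:
            dev["requesting"] = False # the device claims the acknowledgement
            return dev["name"]        # and does not pass it further down
    return None

chain = [{"name": "disk", "requesting": False},
         {"name": "printer", "requesting": True},
         {"name": "modem", "requesting": True}]
print(acknowledge(chain))  # printer: the closest requesting device wins
```

Even though the modem is also requesting, it must wait for the next acknowledgement cycle because the printer sits nearer the CPU.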
All interrupt lines have an appropriate bit pattern, and if an interrupt is generated it will only be recognised if the corresponding bit is set to 1. An associated register, called an interrupt mask register, is programmable and allows this bit pattern to be changed. The mask register is ANDed with the interrupt request bits; therefore, unless there is a 1 bit in the mask register, an interrupt will not be recognised.
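A minimal sketch of the masking operation (the bit positions chosen are illustrative):

```python
# Interrupt mask sketch: a request is only recognised where the mask
# register has a 1 bit. Bit positions here are purely illustrative.

def recognised(request_reg, mask_reg):
    """AND the request bits with the mask register."""
    return request_reg & mask_reg

mask = 0b1010          # interrupts on lines 1 and 3 enabled
assert recognised(0b0010, mask) != 0   # line 1 requests: recognised
assert recognised(0b0100, mask) == 0   # line 2 requests: masked out
```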
Input / Output (I/O) Channels.
We have seen how the CPU operates and how it deals with interrupts. It will also have to deal
with peripheral devices attached to the PC, such as printers.
In earlier PCs the CPU was bound by the I/O and could not perform any processing while
communicating with a peripheral device. This meant the CPU could not process data while
sending and receiving data and was dictated to by the speed of the peripheral device. In order
to overcome this problem I/O channels were developed so that I/O could be handled
independently from the CPU.
In today’s systems the CPU executes an instruction to initiate an I/O transfer over a channel.
The transfer is dealt with by the I/O channel leaving the CPU to deal with processing other data.
When the data transfer is complete the I/O channel sends an ‘I/O complete’ interrupt to the CPU
to inform the CPU it can now transfer more data.
Memory Buffers
When data is being transferred from, say, a PC to a printer, the information being sent will be
stored in a buffer. A buffer is simply an area of memory used to temporarily store the data being
transferred. When sending data to the printer it is held in a buffer contained in the PC or printer
or possibly both. The CPU then instructs the I/O channel to transfer the data from the buffer to
the printer. This is why a printer may continue to print a document even though the PC has
been switched off line or indeed off entirely.
Similarly, when typing data at the keyboard, the information is sent to a buffer by the I/O channel and held there until the Enter key is pressed; the information is then dealt with by the CPU
(i.e. a read operation, from the CPU's point of view).
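The keyboard-buffer idea can be sketched as follows (a Python illustration only; real keyboard handling is of course done by the BIOS and operating system):

```python
from collections import deque

# Buffer sketch: characters typed at the keyboard accumulate in a buffer
# and are only handed over to the "CPU" when Enter is pressed. Purely
# illustrative of the buffering idea, not a real keyboard driver.

buffer = deque()

def key_pressed(ch):
    if ch == "\n":                     # Enter: hand the line to the CPU
        line = "".join(buffer)
        buffer.clear()
        return line                    # the CPU now processes the whole line
    buffer.append(ch)                  # otherwise just store the character
    return None

for ch in "dir":
    key_pressed(ch)
print(key_pressed("\n"))               # -> dir
```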
Polling and Interrupts – Comparison of Approaches.
As part of your assessment material you will be obliged to evaluate a real-life scenario (not
necessarily computer-oriented) and explain whether it uses a polling or interrupt-driven
approach. The following are examples of assessment-level questions for you to try.
Examples 1 & 2(c) SQA – taken from Draft Exemplar 2001
Example 1.
A Computer Technician is designated as the technical support operative. Users phone in to
report faults. The user is required to state their location as part of the fault report procedure. As
soon as a fault is reported, the technical support operative is required to go to the assistance of
the user who has reported their fault.
Is this an Interrupt driven or Polling approach?
What, if any, is the disadvantage to this approach?
Example 2.
A Computer Technician is designated as the technical support operative. Every hour the
technical support operative is required to visit each computer user in turn to find out if there are
any problems to report.
Is this an Interrupt driven or Polling approach?
What, if any, is the disadvantage to this approach?
Example 3.
Sam and Ella run a sandwich shop. The shop is usually quiet during the mornings so they are
free to work in the back, preparing sandwiches for the busy lunchtime period. If a customer
does come in, a small bell attached to the door will ring to alert Sam and Ella that someone
requires serving. At lunchtimes, however,. the shop is very busy and both Sam and Ella have to
serve at the counter. Customers have to regularly stand in a queue to collect and pay for their
sandwiches.
Which elements of this scenario equate to an interrupt driven approach, which part is the polling
approach, and why?
Week 8 - Direct Memory Access
Programmed I/O channels are fine for slow peripheral devices. However, the data still has to pass through the MBR and MAR, which means the CPU will spend a fair proportion of its time dealing with the transfer of data.
For high speed devices, e.g. laser printers or disk drives, we need to transfer the data directly to the PC's memory, bypassing the CPU to allow it to continue processing other data. This is called Direct Memory Access (DMA).
This is achieved by incorporating many of the functions previously carried out in software into a hardware controller. This controller will need the following:
• A register for generating the memory address;
• A register for keeping track of the word count;
• A register to be used as a data buffer between the peripheral device and the main memory.
Therefore, the DMA controller can be connected directly to the peripheral device and the PC's
memory, thus avoiding the use of the CPU in the transfer of data.
To commence an I/O operation utilising the DMA, the programme will do the following:
• Load the initial memory address.
• Load the count of the number of words to be transferred.
• Load a control word stating whether to input or output.
• Execute the 'start' command.
When the DMA receives the start command it will begin transferring the data independently of
the CPU. This allows the CPU to process either another part of the same programme or
another programme.
Note: The DMA is still attached to the CPU because the CPU will have to initiate the transfer
with the start command.
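The four-step start-up sequence above can be sketched as writes to a set of controller registers (a minimal Python sketch; the register and class names are invented for illustration, not a real controller interface):

```python
# Sketch of programming a DMA controller (register names invented for
# illustration): load address, word count and control word, then start.

class DMAController:
    def __init__(self, memory):
        self.memory = memory
        self.address = 0       # register generating the memory address
        self.count = 0         # register tracking the word count
        self.control = "input" # input (device -> memory) or output

    def start(self, device_words):
        """Transfer the data independently of the CPU once 'start' is given."""
        if self.control == "input":
            for word in device_words[:self.count]:
                self.memory[self.address] = word
                self.address += 1      # advance to the next memory address
                self.count -= 1        # one fewer word still to move

memory = [0] * 8
dma = DMAController(memory)
dma.address = 2                # 1. load the initial memory address
dma.count = 3                  # 2. load the number of words to transfer
dma.control = "input"          # 3. control word: this is an input transfer
dma.start([10, 20, 30])        # 4. execute the 'start' command
print(memory)                  # -> [0, 0, 10, 20, 30, 0, 0, 0]
```

When the count register reaches zero, a real controller would raise the 'transfer complete' interrupt described earlier.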
There will also be occasions when the DMA controller is in the process of transferring data to memory and the CPU also wishes to access memory. In these circumstances the DMA controller is usually given priority, as the transfer of data from a fast peripheral device cannot be held up. This is known as cycle stealing, because the memory access cycle would usually originate from the CPU. Although memory is accessed to fetch each instruction, executing an instruction does not always involve a further memory access. Therefore, 'cycle stealing' is not as common as you may at first envisage.
Any peripheral which can utilise DMA will normally default to a particular channel when it is first
installed. The following table shows the most common configurations.
DMA channel 0 (not available on the bus).
Default use: memory (DRAM) refresh.
Other common uses: none; for system use only.
Description: Reserved for use by the internal DRAM refresh circuitry. (Remember that dynamic RAM must be refreshed frequently to make sure that it does not lose its contents.)
Conflicts: Most devices stay far away from DMA0, recognising its use by the system. Beware, however, as some devices actually offer DMA0 as an option - never under any circumstances use DMA0 for peripherals! If you have no devices set to use DMA0 but a conflict becomes apparent anyway, it could be a problem with your motherboard.

DMA channel 1 (8/16-bit bus).
Default use: low DMA channel for sound card.
Other common uses: SCSI host adapters, ECP parallel ports, tape cards, network cards, voice modems.
Description: Most sound cards today actually use two DMA channels; one must be chosen from DMAs 1, 2 or 3, while the other can be any free DMA channel (and so is selected from the less-used 5, 6 or 7). DMA1 is also a popular choice for many other peripherals, largely for historical reasons.
Conflicts: DMA1 is one of the two most contested channels in the system (the other being DMA3, which is often worse). It is important to watch for conflicts between multiple devices here, particularly if you are using a sound card. It is preferable in general to leave the sound card on DMA1 and move any other devices out of its way, for compatibility with older (poorly written) software that assumes the sound card is on DMA1. Also watch out for ECP parallel port conflicts here.

DMA channel 2 (8/16-bit bus).
Default use: floppy disk controller.
Other common uses: tape accelerators.
Description: Not usually offered as an option for use by most peripherals (except the occasional tape accelerator card, because many tape drives run off the floppy interface, and can even be set to drive floppy disks themselves).
Conflicts: DMA2 is not often a source of conflicts, as long as you remember not to put any other devices on it if you have a floppy disk controller in your system (which almost everyone does). Beware tape accelerator cards that default to DMA2 for their channel assignment.
DMA channel 3 (8/16-bit bus).
Default use: none.
Other common uses: ECP parallel ports, SCSI host adapters, tape accelerator cards, sound or network cards, voice modems.
Description: Normally the only channel free on the first controller (DMAs 0 to 3) when you are using a sound card. As a result, it is probably the "busiest" channel in the PC, with many different devices vying for its services. On very old XT systems, DMA channel 3 is used by the hard disk drive.
Conflicts: DMA3 is probably the worst channel in the system for conflicts, because so many devices try to use it. It is important to watch for conflicts between multiple devices here, particularly if you are using a sound card or ECP parallel port.

DMA channel 4 (not available to peripherals).
Default use: cascade for DMA channels 5 to 7.
Other common uses: none; for system use only.
Description: This DMA channel is reserved for cascading the two DMA controllers on systems with a 16-bit ISA bus. It is not available for use by peripherals.
Conflicts: There should not be any conflicts on this channel; any problems with it indicate a possible system hardware failure.

DMA channel 5 (16-bit bus only).
Default use: high DMA channel for sound card.
Other common uses: SCSI host adapters, network cards.
Description: Normally taken by the sound card in your PC for its "high" DMA channel. Some network cards also use this channel, though others don't use DMA at all.
Conflicts: Few conflicts arise with this channel because there are relatively few devices that can use DMA channels 5, 6 or 7.

DMA channel 6 (16-bit bus only).
Default use: none.
Other common uses: sound cards (high DMA), network cards.
Description: This DMA channel is normally open and available for use by peripherals. It is one of the least used channels in the system and is an alternative location for the "high" sound card DMA channel or other devices.
Conflicts: Few conflicts arise with this channel because there are relatively few devices that can use DMA channels 5, 6 or 7.
DMA channel 7 (16-bit bus only).
Default use: none.
Other common uses: sound cards (high DMA), network cards.
Description: Normally open and available for use by peripherals. It is one of the least used channels in the system and is an alternative location for the "high" sound card DMA channel or other devices.
Conflicts: Few conflicts arise with this channel because there are relatively few devices that can use DMA channels 5, 6 or 7.
We have seen that slow devices use programmed I/O while high-speed devices use DMA. It would be beneficial, therefore, for all high speed devices to have a DMA controller attached, in order to achieve maximum use of the CPU. This would not be practical, however. To overcome this we can make use of a channel - a small processor which acts as a shared DMA controller for a number of peripheral devices.
There are 3 basic types of channel:
Selector channel. A selector channel may have a number of devices attached to it. However,
the channel ‘selects’ a particular device and will not service any other device until finished with
the device selected. The channel will transfer a block of words to or from the main memory, will
synchronise the speed of transfer and perform parity checking. When the transfer is completed it will generate a 'transfer complete' interrupt, or an error signal if the parity check failed.
Byte Multiplexer channel. A byte multiplexer channel is used to service slower devices. It can serve a number of devices simultaneously, since its rate of data transfer is greater than the rate at which any one device can supply data.
The channel will poll the individual devices connected and transfer the next character for each device as it becomes ready for transfer. The character count and memory data addresses are held in a fixed memory location. When a device is attached these parameters are fetched from memory, and when the device is disconnected the parameters are placed back. A multiplexer channel can be attached to only one medium speed device for a burst transfer (i.e. more than one character); in this mode it acts as a medium speed selector channel.
Block Multiplexer Channels. The block multiplexer channel combines the best of both the
selector and the multiplexer channels. It can transfer data from high speed devices like the
selector, transfer blocks of data like the multiplexer and poll devices to transfer blocks of data
when requested.
The block multiplexer has a distinct advantage over selector channels because it is not entirely
dedicated to one device until the transfer of data is complete.
For example, before a transfer from a hard disk can begin, the disk must rotate until the read/write heads are over the required sector from which the data is to be accessed. A block multiplexer channel will service another device until this device is ready to transfer.
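The polling behaviour of a byte multiplexer channel can be sketched as follows (device names and data are invented for illustration):

```python
# Byte multiplexer channel sketch: the channel polls each attached slow
# device in turn and transfers one character from whichever is ready.
# Device names and data are invented for illustration.

def multiplex(devices):
    """One polling pass: take the next ready character from each device."""
    transferred = []
    for name, chars in devices.items():
        if chars:                          # device has a character ready
            transferred.append((name, chars.pop(0)))
    return transferred

devices = {"keyboard": list("hi"), "card reader": list("ok")}
print(multiplex(devices))   # one character from each ready device
print(multiplex(devices))   # the next character from each, and so on
```

A selector channel, by contrast, would drain one device completely before looking at the other.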
DMA-Enabled versus Non-DMA Enabled Devices.
As part of your assessment material you will be obliged to demonstrate, in the form of a graph,
the increase in performance of a device which can use a DMA channel over an equivalent
device which does not. You will be given the various data transfer rates of each device and
asked to plot these on a rate/time graph, together with a control line showing 100% processor
usage. The following exercises are of equivalent difficulty to an assessment level question.
Exercise 1.
You urgently need to access a web page on a foreign server, and on trying to navigate to the
page a pop-up box asks you to download and install Georgian Text support. The files for the
text are 9.2 megabytes in size and will take approximately 5.23 minutes to download and install.
Out of the total time taken, 1/3 represents the transfer through the modem card buffer; 1/3 to
process the data through the CPU; and 1/3 to transfer the file to the hard drive.
Assuming that it is possible to apply DMA to the modem and hard drive, estimate the time
required to download and install the text support files, where setting up the DMA controller takes
1/10th of the time that the CPU would have taken when measured over 1 second.
Draw a graph to show both arrangements. Assume that the given transfer rate represents the
processor and peripherals working at 75% capacity. What is the maximum transfer rate where
a DMA controller is employed?
Exercise 2.
Two identical computers are required to download new anti-virus signature files from a remote
server. Computer A transfers the data to its hard drive via the processor, because someone
has inadvertently disabled the DMA controller; Computer B's programmed DMA is still intact.
Computer A takes 1/4 second to transfer the data through the modem card buffer, 1/2 second
for the CPU to process it, and 1/4 second to transfer it to the hard drive buffer. The data can be
downloaded at a maximum speed of just under 14.65 kilobytes / second. Computer B still takes
1/2 second to collect and transfer the file, but setting up the DMA controller only takes 5% of the
time of the processing time of Computer A.
Assume that each computer has a large enough cache that the processor and DMA controller
are not competing for resources. Plot a graph showing the difference in transfer rates for both
computers, comparing the performance of the system using the DMA controller with the one that
does not.
Now assume that the transfer rate of 14.65kBps represents Computer A working at 90%
capacity. Draw in a control line to show this, then use the control to estimate the maximum
transfer rate of Computer B (also in kBps). Your answer does not have to be 100% accurate
but MUST be a close approximation.
Glossary Of Computing Terms.
You may find it helpful to complete this Glossary, writing in your own definition for each term as
you learn about it.
Accumulator
Adder
Address
AGP
ALU
ASCII
Assembly code
Binary
BIOS
Bit
Buffer
Bus
Byte
Cache
Capacitor
CISC
CMOS
Compiler
Computer
Control Unit
CPU
Decimal
Device driver
Disk drive
DMA
EPROM
Firmware
Flag
Flip-Flop
Floppy disk
Frequency
Gigabyte
Gigahertz
Handshaking
Hard disk
Hardware
Hexadecimal
High-level language
I/O port
Instruction
Instruction set
Interface
Interpreter
Interrupt
ISA
Kilobyte
Kilohertz
LCD
Linker
Low-level language
Map File
Megabyte
Megahertz
Memory
Memory Map
Microprocessor
Microprogram
Monitor
Nanoprocessor
Object file
Operating system
Overclocking
Parallel port
Parallel transmission
Parity
PCI
Peripheral
Pixel
Polling
POST
Program counter
PROM
Protocol
RAM
Read
Register
RISC
ROM
SCSI
Serial port
Serial transmission
Software
Swap file
Synchronous
Terabyte
Terahertz
Timeslice
Transistor
Virtual memory
Volatile storage
Word
Write
Writeback
Writethrough
September 2003