Microprocessors
and computers
Architecture and operation
last modified: 2016. Feb. 25.
I.1. What is a computer?
It’s not trivial!
• There are many different types of computers, with different structure, operation and purpose.
  - personal computers, supercomputers, embedded systems, microcontrollers...
• Analogue vs digital; binary vs decimal vs ternary; Boolean, fuzzy, neural, quantum, DNA...
[Pictures of various devices, each with the question: is this a computer?]
I.2. Basic principles
Algorithm:
process, method, sequence of steps used to solve
a problem
-> program (software)
Interesting fact: "algorithm" comes from the name of Muhammad ibn Musa al-Khwarizmi (c. 780-850), Persian scientist. The word "algebra" comes from the title of his work: "Al-kitāb al-mukhtaṣar fī ḥisāb al-ğabr wa’l-muqābala".
Basic principles

Universal, programmable computer
Computer (1949)
Computer
• Computer: originally, the people who performed the calculations
• Calculator: can do arithmetic operations
• Computer: can do arithmetic and logic operations, can be programmed, has a memory
• The difference is not always obvious

Analog computer
Can be electronic, mechanical, optical, etc.
Principle of the electronic analog computer:
The differential/integral (etc.) equations that describe physical processes can be realized in an analogue way with electronic circuits (usually R, L, C and op-amps).
Input and output: e.g. a voltage waveform V(t)
Storing and reloading these can be problematic.
Output indication: e.g. oscilloscope, plotter, Deprez (moving-coil) meter
The circuitry can combine electronic, mechanical and optical parts, and so can the output indication (e.g. bombsights)
Analog computer
Analog computers were often purpose-made
and as such, not re-programmable.
Ones made for research and education
could be programmed by wiring –
practically connecting together
mathematical blocks (eg. summing,
subtracting, integrating, differentiating
amplifiers).
Analog computer
Advantages:
Theoretically continuous range and domain, no quantization noise; but, as in all analogue systems, other types of noise are a big problem.
In the mid-20th century they were much faster (and smaller) for many problems (usually involving systems of differential equations).
Made obsolete by the 1960s-70s.
The (little) modern research involves VLSI analog-digital hybrid ICs.
Digital computer
The input data series is discrete in the time domain (sampled) and also in range (quantized).
Such data are easy to store and reload.
Quantization theoretically reduces precision, but makes the computer much more tolerant of noise.
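A minimal sketch of sampling and quantization as described above. The 50 Hz test signal, the 1 kHz sample rate, the 4-bit resolution and all function names are assumptions chosen for illustration only.

```python
import math

def sample_and_quantize(signal, duration_s, sample_rate_hz, bits,
                        v_min=-1.0, v_max=1.0):
    """Sample signal(t) at fixed intervals and map each sample to an integer code."""
    levels = 2 ** bits
    step = (v_max - v_min) / (levels - 1)         # size of one quantization step
    n_samples = round(duration_s * sample_rate_hz)
    codes = []
    for n in range(n_samples):
        t = n / sample_rate_hz                     # discrete time instant (sampling)
        v = min(max(signal(t), v_min), v_max)      # clip to the input range
        codes.append(round((v - v_min) / step))    # nearest level (quantization)
    return codes, step

def reconstruct(codes, step, v_min=-1.0):
    """Turn the stored integer codes back into voltage values."""
    return [v_min + c * step for c in codes]

def sine(t):
    return math.sin(2 * math.pi * 50 * t)          # assumed 50 Hz test signal

codes, step = sample_and_quantize(sine, 0.02, 1000, bits=4)
values = reconstruct(codes, step)
max_error = max(abs(v - sine(n / 1000)) for n, v in enumerate(values))
print(codes)
print(f"step = {step:.4f}, max error = {max_error:.4f}")
```

The stored codes are plain integers, which is why they are easy to store and reload; the price is an error of at most half a quantization step per sample.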
Digital computer
A new branch of mathematics had to be developed: numerical calculation. It was not trivial how to solve complex differential equations on quantized and sampled values, or what the errors would be.
Programs are easy to store and reload as well. Methods and algorithms need not be re-invented once created.
Digital computers are more versatile. Analog computers were made for a smaller set of problems (imagine reprogramming the same computer from a power plant simulation to MP3 encoding).
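As an illustration of the numerical-calculation point above, here is a sketch that solves dV/dt = -V/(RC) (the discharge of a capacitor) on sampled time steps with the explicit Euler method and compares the result with the exact solution. The choice of equation, the Euler method and the step sizes are assumptions made for this example.

```python
import math

def euler_rc_discharge(v0, rc, dt, t_end):
    """Approximate V(t_end) for dV/dt = -V/rc using fixed time steps of size dt."""
    steps = round(t_end / dt)
    v = v0
    for _ in range(steps):
        v += dt * (-v / rc)      # one Euler step: V(t+dt) ~ V(t) + dt * dV/dt
    return v

exact = 1.0 * math.exp(-1.0)     # exact solution V0 * exp(-t/RC) at t = RC
for dt in (0.1, 0.01, 0.001):
    approx = euler_rc_discharge(1.0, 1.0, dt, 1.0)
    print(f"dt = {dt}: V = {approx:.6f}, error = {abs(approx - exact):.6f}")
```

The error shrinks as the time step gets smaller, at the cost of more calculation steps.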
(Digital) Computer
• Program: a series of calculations and operations that produces an output from an input (mathematically: the realization of an algorithm)
• Stored program: the program is stored in advance in the computer’s memory, from where the computer can retrieve it at its own pace

Computer
• A stored program makes complex operations and proper timing possible
• Important parts of programs are conditional statements (branching), jumps or subroutine calls, and cycles (loops); these distinguish a real computer from a calculator that can only execute a sequential (straight-line) program

Personal computer
• For home and office use
• Has interfaces and peripherals made for human interaction (display, keyboard, mouse, sound devices, etc.)
• Available for tasks needing low, medium or medium-high calculation speed
• Typically runs interactive programs
Personal computer
The original IBM PC (IBM 5150), 1981
(There were a few personal computers before it, but IBM popularized the term PC)
Mobile computers
Somewhere between a traditional PC and an embedded system
Mobile phone / smartphone, tablet
Mobile personal computer
Supercomputer
• For scientific and engineering tasks needing a very large number of calculations
• Often controlled through other computers (terminals); no human interfaces needed
• Very high calculation speed, large number of processors in a parallel architecture
• Its software often runs for days or months
• Typically runs non-interactive programs (no human intervention)
Embedded system
• A computer built into some device, controlling its operation.
• Usually lacks human interfaces or has unusual ones
• Often equipped with IO devices to communicate with other devices
• Low to medium calculation speed
• Reliability, ruggedness and low power consumption are often requirements
Embedded system with human
IO
II. Operation of microprocessors
and computers
II.1. Principles of computers




• Turing, 1936: "On computable numbers..."
• Neumann, 1945: "First Draft of a Report on the EDVAC"
• Turing, 1946: Automatic Computing Engine
  - these papers influenced each other, forming the basis of what are known as the Neumann principles
  - stored (flexible) program, binary system, use of integers, fixed-point and floating-point numbers, and two’s complement
  - basic architecture (processor, memory, buses)
• Turing, 1950: "Computing machinery and intelligence"
Neumann principles (computer)
• Binary system, number formats
• Storage of program and data
• Basic schematic (ALU, CU, IO, memory, I/O peripherals)
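Two’s complement, one of the binary number formats mentioned above, can be sketched as follows. The 8-bit width and the function names are assumptions for this example.

```python
def to_twos_complement(value, bits=8):
    """Return the bit pattern (as a non-negative int) that stores a signed value."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not lo <= value <= hi:
        raise ValueError(f"{value} does not fit into {bits} bits")
    return value & ((1 << bits) - 1)

def from_twos_complement(pattern, bits=8):
    """Interpret a bit pattern as a signed value."""
    if pattern & (1 << (bits - 1)):        # top bit set: negative number
        return pattern - (1 << bits)
    return pattern

for v in (5, -5, 127, -128):
    p = to_twos_complement(v)
    print(f"{v:5d} -> {p:08b} -> {from_twos_complement(p)}")

# Addition needs no special handling of the sign: 5 + (-5) wraps around to 0
total = (to_twos_complement(5) + to_twos_complement(-5)) & 0xFF
print(total)
```

The practical benefit is that the same adder circuit works for signed and unsigned numbers.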

Neumann vs. Harvard architecture
• Neumann: program and data are stored in the same way, in the same memory (e.g. PC)
• Harvard: program and data are stored separately, often in different formats (e.g. microcontrollers); otherwise the same principles apply
• Mixed (e.g. cache in PC processors)
• There are several modified versions; the distinction is not trivial

II.2. Instruction set (processor)
Instruction set: the set of instructions a processor
knows in hardware
Machine code: a program containing instructions
from the instruction set, in binary or hexadecimal
format. It can generally be natively run by the
processor (without the need for other software).
Instruction set, programming
Assembly: the lowest-level programming language, processor dependent.
It is made by assigning easy-to-remember words (mnemonics) to machine code instructions. It also makes data and number formatting easier, makes labels and constants available, and provides some simple functions.
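How an assembler turns mnemonics and labels into machine code can be sketched with a tiny two-pass assembler. The instruction set below (LOAD, ADD, STORE, JNZ, HALT and their opcode values) is hypothetical, invented for this example; it is not the instruction set of any real processor.

```python
# Hypothetical 8-bit instruction set (assumption, for illustration only)
OPCODES = {"LOAD": 0x01, "ADD": 0x02, "STORE": 0x03, "JNZ": 0x04, "HALT": 0xFF}
HAS_OPERAND = {"LOAD", "ADD", "STORE", "JNZ"}

def assemble(source):
    """Two-pass assembler: collect label addresses first, then emit the bytes."""
    lines = []
    for raw in source.splitlines():
        line = raw.split(";")[0].strip()       # drop comments and whitespace
        if line:
            lines.append(line)

    labels, address = {}, 0
    for line in lines:                          # pass 1: where does each label point?
        if line.endswith(":"):
            labels[line[:-1]] = address
        else:
            mnemonic = line.split()[0].upper()
            address += 2 if mnemonic in HAS_OPERAND else 1

    code = []
    for line in lines:                          # pass 2: mnemonics -> opcode bytes
        if line.endswith(":"):
            continue
        parts = line.split()
        mnemonic = parts[0].upper()
        code.append(OPCODES[mnemonic])
        if mnemonic in HAS_OPERAND:
            operand = parts[1]
            code.append(labels[operand] if operand in labels else int(operand, 0))
    return bytes(code)

program = """
start:
    LOAD 0x10     ; number formats are handled by the assembler
    ADD 1
    STORE 0x10
    JNZ start     ; the label is replaced by its address
    HALT
"""
machine_code = assemble(program)
print(" ".join(f"{b:02X}" for b in machine_code))
```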
Instruction set, programming
Software originally written in other programming languages has to be converted to machine code using a compiler (in earlier times, by hand). For very high-level languages this can involve several intermediate steps.
Question: in what language, and on what machine, are compilers written?
II.3. What is in a processor?






Processor’s main components:
• ALU: arithmetic and logic unit
• CU: control unit, with control bus connection
• registers: small internal memory for holding temporary data
• data bus connection (parallel)
• address bus connection (parallel): for addressing memory and IO
Why microprocessor?
Microprocessor: Motorola 6800 (1974)
Not a microprocessor: PDP-11 (1970)
Operation of a processor




• Needs a clock signal (square wave)
• Clock edges dictate the steps of execution, input and output
• Reading and executing one instruction can take several clock periods
• The clock frequency alone is not a reliable indicator of a processor’s calculation capability (speed), because of the previous points, parallel execution, pipelining, etc.
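A short worked example of why clock frequency alone says little about speed: execution time is roughly instruction count x clock cycles per instruction / clock frequency. The two processors and all numbers below are hypothetical.

```python
def execution_time_s(instruction_count, cycles_per_instruction, clock_hz):
    """Estimated run time: instructions x average cycles per instruction / clock rate."""
    return instruction_count * cycles_per_instruction / clock_hz

# Hypothetical processor A: 4 MHz clock, 4 clock periods per instruction on average
time_a = execution_time_s(1_000_000, 4.0, 4_000_000)
# Hypothetical processor B: 2 MHz clock, 1 clock period per instruction on average
time_b = execution_time_s(1_000_000, 1.0, 2_000_000)

print(f"A: {time_a} s, B: {time_b} s")
```

In this sketch processor B finishes the same program in half the time although its clock is half as fast.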
Operation of a processor









• Memory address -> address bus, memory read command -> control bus
• Reading the instruction (fetch) (data bus -> instruction register)
• Increment the Program Counter (PC) after each instruction byte read
• Decoding of the instruction
• Reading of additional parts of the instruction as needed
• Reading of data from memory if requested by the instruction
• Execution of the instruction
• Output of data to the data bus (if needed), or changing the PC (if it is a jump instruction), etc.
(see Z80 operation ppt for demo)
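The steps above can be sketched as a toy processor simulated in software. The accumulator machine, its four-plus-one opcodes and the memory layout are hypothetical, chosen only to show fetch, PC increment, decode, operand read, execute and jump; this is not how the Z80 is encoded.

```python
# Hypothetical opcodes of a toy accumulator machine (assumption, not a real CPU)
LOAD, ADD, STORE, JNZ, HALT = 0x01, 0x02, 0x03, 0x04, 0xFF

def run(memory, max_steps=1000):
    """Execute the program stored in memory (a list of byte values); return the accumulator."""
    pc, acc = 0, 0
    for _ in range(max_steps):
        instruction_register = memory[pc]       # fetch the opcode byte
        pc += 1                                  # increment PC after each byte read
        if instruction_register == HALT:         # decode + execute
            return acc
        operand = memory[pc]                     # read the additional part (an address)
        pc += 1
        if instruction_register == LOAD:
            acc = memory[operand]                # read data from memory
        elif instruction_register == ADD:
            acc = (acc + memory[operand]) & 0xFF # 8-bit addition with wrap-around
        elif instruction_register == STORE:
            memory[operand] = acc                # output data to memory
        elif instruction_register == JNZ:
            if acc != 0:
                pc = operand                     # a jump simply changes the PC
        else:
            raise ValueError(f"unknown opcode {instruction_register:#04x}")
    raise RuntimeError("step limit reached")

def make_memory():
    """Program at address 0 computes 3 x 5 by repeated addition; data starts at address 20."""
    program = [
        LOAD, 20, ADD, 21, STORE, 20,    # total = total + 5
        LOAD, 22, ADD, 23, STORE, 22,    # counter = counter + 0xFF (i.e. minus 1)
        JNZ, 0,                          # repeat while counter != 0
        LOAD, 20,
        HALT,
    ]
    data = [0, 5, 3, 0xFF]               # total, addend, counter, minus one
    return program + [0] * (20 - len(program)) + data

print(run(make_memory()))
```

Program and data sit in the same memory list here, so the sketch also follows the stored-program idea.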
Clock cycles
• System clock period (T)
• Machine cycle (several T long)
  - performs one part of an instruction, e.g. fetch, decode-execute, put data on bus
• Instruction cycle (several machine cycles long)
Pipeline
• Method for faster overall execution
• Main idea: split instruction execution into several steps, which can be done as on a "conveyor belt"
• Overall execution speed is ideally up to n times higher (with n steps)
• Works best if there aren’t many jumps and branches in the program (after these the pipeline has to be refilled)
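A rough cycle-count sketch of the pipeline idea. It assumes every step takes one clock period and that each taken jump costs n - 1 extra periods to refill the pipeline; both are simplifying assumptions, and the numbers are made up.

```python
def cycles_unpipelined(n_instructions, n_steps):
    """Each instruction passes through all steps before the next one starts."""
    return n_instructions * n_steps

def cycles_pipelined(n_instructions, n_steps, n_refills=0):
    """First instruction needs n_steps cycles, every further one finishes a cycle later.
    Each pipeline refill (e.g. after a taken jump) is assumed to cost n_steps - 1 cycles."""
    return n_steps + (n_instructions - 1) + n_refills * (n_steps - 1)

def speedup(n_instructions, n_steps, n_refills=0):
    return cycles_unpipelined(n_instructions, n_steps) / cycles_pipelined(
        n_instructions, n_steps, n_refills)

print(f"no jumps:  {speedup(1000, 5):.2f}x")
print(f"200 jumps: {speedup(1000, 5, n_refills=200):.2f}x")
```

The speedup approaches n only for long, jump-free instruction sequences.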
Types of instruction sets
• RISC: reduced instruction set computer
• CISC: complex instruction set computer
• other (e.g. OISC: one instruction set computer)
• as usual, the definitions vary between authors and are murky

CISC
• complex instructions (one machine instruction does complicated things)
• support of complex data formats (data types)
• instructions have many direct and indirect ways of reaching (addressing) the contents of RAM (the main attribute of CISC according to some authors)
• many instructions? (not necessarily!)

CISC

Pro:
• can be easier to compile to from higher-level languages (?)
• the program (machine code) can be smaller
• the program can be quicker if lots of complicated operations are needed
• for versions with a small number of instructions but many ways of memory access (e.g. PIC uC), assembly programming can be easier
CISC

Con:
• asm programming and debugging can be harder if there are lots of complicated instructions (think PC)
• most complex instructions are used rarely, but take up hardware space and resources in the uP
CISC is often realized with microcode
microcode




• instruction decoding and execution is two-level inside the uP: instructions (as seen by programmers) are decoded into a series of even simpler instructions (microinstructions), effectively realizing CISC with RISC
• not visible from outside (even from assembly)
• machine code (asm) can be changed without modifying the hardware; similar processor types can have similar machine code while their microcode is different (portability); can emulate another processor
• can make uP development and bugfixing easier
RISC
• smaller number of instructions
• simpler instructions
• load/store architecture: memory is accessed only by simple instructions that read from RAM into a register or write a register to RAM; everything else (all other instructions) is done on internal registers

RISC

Pro:
• more space is left in the uP design for making instructions more efficient and faster; also more space for extra registers and special functions
• complex instructions are needed rarely, so doing them in software doesn’t slow the program down too much
RISC
• tries to make all instructions the same length
  - this helps in realizing a pipeline and in timing calculations
Con:
• might need a better optimizing compiler
• the machine code of a program can be larger
• slower if too many complex instructions are needed
CISC-RISC
difference, definition ??:
• a uP/uC can have very few, very simple instructions but complex memory addressing
• or it could have simple load/store memory addressing but a large, complex instruction set (as in higher math, matrices, etc.)

II.4. Computer structure
PC structure:
• power supply unit (PSU)
• mainboard (incl. CPU, RAM, control units)
• IO (input-output) peripheral cards (human interfaces, storage, network)
• external IO peripherals (human interfaces)
Mainboard (motherboard)

Usually one PCB (printed circuit board) that contains the tracks (lines), ICs (integrated circuits, microchips) and connectors necessary for the computer’s basic functions
Mainboard (motherboard)
Usually contains:
• CPU
• RAM (temporary memory)
• ROM (e.g. BIOS)
• control circuits (chipset)
• connectors for IO devices, add-on cards and power

CPU
• CPU: Central Processing Unit
• the processor responsible for running the main software and for controlling the computer
• there can be other processors in the computer (e.g. for peripheral control)
• the CPU can consist of more than one processor (or of multi-core processors)
SBC (Single Board Computer)


• also called industrial motherboard
• one (usually small) mainboard that contains most things necessary for the operation of a computer
  - CPU, RAM, Flash (e.g. CF card), integrated interface controllers
• needs external power, but often only a single voltage, and has lower power consumption (vs a traditional PC)
• usually has a special connector for direct control of digital circuits (similar to the digital IO of microcontrollers)
• often contains a PC/104 bus, which is a modified ISA bus for industrial applications (e.g. to connect data acquisition (DAQ) cards, relay cards, etc.)
Bus system

• Bus: a set of lines (tracks, wires) carrying the pieces of data
• serial or parallel
• parallel: e.g. 8-bit data width: 8 lines, 8 bits arrive at the same time
  - e.g. motherboard data bus, address bus, older printer port (LPT), PATA (IDE) (for hard disks)
• serial: one line (or one line per direction), e.g. 8 bits: 8 clock cycles to transmit the information
  - e.g. external peripherals (RS232, USB), SATA (hard disk), I2C, SPI (for certain ICs)
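Serial transmission of one byte can be sketched as shifting the bits out one per clock cycle and reassembling them on the other side. Sending the least significant bit first is an assumption of this example; real serial buses differ in bit order, framing and clocking.

```python
def serialize(byte):
    """Return the 8 bits of a byte in transmission order (LSB first, by assumption)."""
    return [(byte >> i) & 1 for i in range(8)]   # one list element per clock cycle

def deserialize(bits):
    """Rebuild the byte from the received bit sequence."""
    value = 0
    for i, bit in enumerate(bits):
        value |= bit << i
    return value

line = serialize(0x41)             # what travels on the single data line
print(line, "->", hex(deserialize(line)))
```

On a parallel bus the same eight bits would appear on eight lines within a single clock cycle.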
Simple bus system (mainboard)
Addressing
Separate address space (port-mapped IO)
• the same address can refer to IO or to memory (M)
• a control bus line selects IO/M
• practically the same as if that line were added to the address bus, thus twice the address space
• though often the address space usable for IO is smaller than for M, e.g. 16 bits for M and 8 bits for IO
  - this limits expansion
Addressing

Memory mapped IO
• part of the memory address space is reserved for IO units
• more flexible
• can be fixed or temporary (selectable)
• if the total memory space is populated by RAM ICs, some parts of it will not be usable
  - e.g. "PCI hole" on PC
• e.g. PIC microcontroller: IO functions and settings are mapped into RAM (Special Function Registers)
• x86: supports both modes, but devices are usually memory mapped (the port IO instructions remain available in 64-bit mode)
Example memory connection
4-bit data bus, 8-bit address bus, total 256B memory, of which 1x64B ROM and 3x64B RAM
The top 2 bits of the address (A7, A6) select the memory module (address decoding with NAND gates)
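The address decoding in this example can be sketched in software: A7 and A6 pick one of four 64-location modules, A5..A0 address a location inside it. Which module number is the ROM, and the module names, are assumptions; the source only states that there is one ROM and three RAM modules.

```python
# Assumed ordering: module 0 is the ROM, modules 1-3 are the RAMs
MODULES = ["ROM", "RAM1", "RAM2", "RAM3"]

def decode(address):
    """Split an 8-bit address into (selected module, offset inside the module)."""
    if not 0 <= address <= 0xFF:
        raise ValueError("address must fit into 8 bits")
    select = address >> 6          # A7, A6: what the NAND-gate decoder looks at
    offset = address & 0x3F        # A5..A0: wired to every module in parallel
    return MODULES[select], offset

for address in (0x00, 0x3F, 0x40, 0xFF):
    print(f"{address:#04x} -> {decode(address)}")
```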
Example: ZX Spectrum (Z80)
IBM-PC (8088) (1981)
IBM PC-AT (80286) (1984)
80386
The CPU clock frequency became higher than that of the ISA bus, so a bus interface chip was needed
Pentium
North bridge – south bridge architecture
Intel Hub Architecture (IHA)
Pentium Pro - ...
Intel PCH
Core series
BIOS






• Basic Input-Output System
• small program stored in ROM
• this is read first by the CPU when the computer starts (booting)
• it starts the operating system from the hard disk (or a network drive)
• provides some function calls for software (basic hardware access)
• since the PC-AT it includes a program to change some configuration settings of the PC, which are stored in a special RAM (with its own battery)
II.5. Memory
• RAM: random access memory: any cell (byte) can be read or written; usually volatile
• ROM: read only memory (usually factory written), non-volatile, fast
• EEPROM, Flash: kinds of ROM which can be overwritten by the user with special methods (non-volatile storage); fast to read, slow to write, limited number of write cycles

SRAM (Static RAM)
Every bit is a flip-flop. One flip-flop usually consists of 6 transistors. There is also a 4 transistor + 2 resistor variation (smaller, but with higher power consumption).
Compared to DRAM:
• Keeps its value as long as it has supply voltage; no refreshing needed.
• Faster than DRAM.
• Fewer bits per unit area (lower data density).
• Easier control.
• More expensive.
Found mostly in microcontrollers and caches.
DRAM (Dynamic RAM)
Each bit is a transistor + a capacitor.
• large data density
• needs periodic refreshing using external or built-in control electronics (the capacitor discharges)
• RAM is made up of rows and columns; an entire row is refreshed at once
• for DDR RAM with 8192 rows, one row is refreshed every 7.8 us, so refreshing all rows takes 64 ms in total
Used in large memories: mostly PC RAM
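The refresh numbers above can be checked with a line of arithmetic: the total refresh time divided by the number of rows gives the interval between two row refreshes. The function name is chosen for this example.

```python
def refresh_interval_s(total_refresh_time_s, rows):
    """Time between two consecutive row refreshes if all rows are spread evenly."""
    return total_refresh_time_s / rows

interval = refresh_interval_s(64e-3, 8192)
print(f"{interval * 1e6:.4f} us per row")   # the slide rounds this to 7.8 us
```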
ECC-RAM
Error correcting code
• 9b RAM: 9 bits are stored for every 8 data bits (e.g. 72 bits per 64-bit word); the extra bits hold the error correcting code
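To show how extra bits allow an error to be corrected, here is a sketch of the small Hamming(7,4) code: 4 data bits are stored with 3 check bits, and any single flipped bit can be located and repaired. This code is swapped in for illustration because it is short; ECC memory modules use longer codes over wider words.

```python
def hamming_encode(data):
    """data: list of 4 bits [d1, d2, d3, d4] -> 7-bit codeword (positions 1..7)."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4            # checks positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4            # checks positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4            # checks positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]

def hamming_correct(codeword):
    """Return (corrected data bits, position of the repaired bit or 0 if none)."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s4      # equals the 1-based position of the bad bit
    if syndrome:
        c[syndrome - 1] ^= 1             # flip it back
    return [c[2], c[4], c[5], c[6]], syndrome

stored = hamming_encode([1, 0, 1, 1])
stored[4] ^= 1                            # simulate a single-bit memory error
data, position = hamming_correct(stored)
print(data, "repaired position", position)
```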

Cache
• Method for increasing effective speed
• RAM outside the uP is slower
• DRAM is slower than SRAM (but has larger capacity)
• cache: small-capacity, fast SRAM inside the CPU

Cache

• from external RAM, data that will be needed in the near future is loaded into the cache
  - part of the running program
  - necessary variables, data
• contents must be kept synchronized with RAM
  - needs complex control circuitry
• can be multi-level
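A minimal sketch of how a cache decides between hit and miss, using a direct-mapped organisation. Direct mapping, the 4 lines of 4 bytes and the access patterns are assumptions for this example; real caches are larger and often set-associative.

```python
def simulate_direct_mapped(addresses, n_lines=4, line_size=4):
    """Count hits and misses of a direct-mapped cache for a sequence of addresses."""
    tags = [None] * n_lines              # which memory block each cache line holds
    hits = misses = 0
    for address in addresses:
        block = address // line_size     # memory block containing this address
        index = block % n_lines          # the only cache line this block may use
        tag = block // n_lines           # identifies the block within that line
        if tags[index] == tag:
            hits += 1
        else:
            misses += 1                  # load the whole block from RAM
            tags[index] = tag
    return hits, misses

print(simulate_direct_mapped(list(range(16)) * 2))   # sequential data, read twice
print(simulate_direct_mapped([0, 16] * 4))           # two blocks fighting for one line
```

The first pattern benefits from the cache; in the second, the two addresses map to the same line and evict each other on every access.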
Cache
AMD Athlon 64
DMA – Direct Memory Access





• used when memory and IO devices need to communicate
• data normally goes through the CPU; with DMA it goes through the DMA controller (but the CPU halts during this)
• the DMA controller does this faster than the CPU
• it can transfer large blocks
• these functions can be integrated into the bus controller
DMA






• Problem: the bus (address and data) can only be used by either the CPU or the DMA controller at a time
• byte mode: the CPU regains control of the bus between transfers (e.g. to read its instructions) (cycle stealing); used e.g. in real-time systems
• burst mode: large blocks, the CPU is on hold
• interleaved: if the memory clock is faster than the CPU clock, they use the bus alternately
• transparent: the DMA waits for the bus to be freed
• if there is a cache, the CPU can go on working during a DMA transfer, but this can result in a difference between the contents of cache and RAM (cache incoherence); the cache must be written out to RAM before the DMA operation
DMA
• ISA bus: there are 1 or 2 DMA controllers, 4 or 8 DMA channels
• PCI bus: no central controller; any PCI device can ask the PCI controller (south bridge) to gain control of the bus
...

II.6. Processor examples
4004
8008 (more detailed)
8080
Z80
Z80 test circuit
8087 FPU (Floating point unit)