Computer Systems
• Hardware and systems software that work
together to run application programs
• We’ll cover those aspects that are of
importance to a programmer – what you
need to know to write efficient, correct and
secure code
chapter 1
1
Intro to Intro
• This chapter covers steps taken to compile
and execute a hello world program
• Also introduces hardware and software
that is covered in more detail later in the
text
Example hello world program
#include <stdio.h>

int main(void)
{
    printf("hello, world\n");
    return 0;
}
C source file
• Program created via some editor stored in
<filename>.c
• Source program is a sequence of bits
organized into bytes (8 bits each) where
each byte represents one text (ASCII)
character
• Source file can also be called a text file
since it consists exclusively of ASCII
characters
Partial Example
Character:   #    i    n    c    l    u    d    e    <sp>  <
ASCII code:  35   105  110  99   108  117  100  101  32    60
Actually stored (first four bytes):
0010 0011  (#)
0110 1001  (i)
0110 1110  (n)
0110 0011  (c)
Binary files
• Sequence of bytes that do not all
represent ASCII text characters (for
example, machine instructions)
• All files (binary and text) are simply stored
as a sequence of bits
Stages of Compilation of a C
program
Figure 1.3
hello.c (source program, text)
-> Preprocessor (cpp) -> hello.i (modified source program, text)
-> Compiler (cc1) -> hello.s (assembly program, text)
-> Assembler (as) -> hello.o (relocatable object program, binary)
-> Linker (ld), which also takes printf.o (relocatable object program, binary)
-> hello (executable object program, binary)
Preprocessing Phase
• Modifies C program according to #
directives
– #include – replaces #include directive with
contents of file
– #define LRGNEG 0x80000000 – replaces
LRGNEG in source code with 0x80000000
• Output is placed in the file <filename>.i
(depending upon the compiler)
Compiler
• Translates <filename>.i file into another
text file <filename>.s which contains
assembly language program
• Compilers for different high-level
languages all generate the same assembly
language for a particular machine
Assembly
• Translates <filename>.s into machine
instructions
• Output is placed in <filename>.o (object
file)
• The object code is a relocatable object
program
Linking phase
• C compiler provides some functions in
what is known as the standard C library
• Linker merges <filename>.o with functions
needed in the standard C library to create
an executable object file
Why do programmers care?
• We can write more efficient code if we
know:
– Difference between the implementation of a
switch statement and an if-then-else
– The expense of a function call
– The difference between referencing with
pointers and array indices
– What makes some loops execute faster than
others even when they do the same thing
Why do programmers care?
• We can debug link errors if we know:
– What an unresolved reference is
– The difference between a static and a global
variable
– The difference between a static and dynamic
library
– Why some link errors don’t appear until runtime
Why do programmers care?
• We can avoid security holes caused by
buffer overflow bugs if we understand the
stack discipline used by compilers
Matrix Multiply Example
/* kji */
for (k = 0; k < n; k++)
    for (j = 0; j < n; j++) {
        r = B[k][j];
        for (i = 0; i < n; i++)
            C[i][j] += A[i][k]*r;
    }

/* ijk */
for (i = 0; i < n; i++)
    for (j = 0; j < n; j++) {
        sum = 0.0;
        for (k = 0; k < n; k++)
            sum += A[i][k]*B[k][j];
        C[i][j] += sum;
    }
Pentium II Xeon Matrix Multiply
Performance
[Plot: cycles/iteration versus array size (n) for the kji and ijk loop orders]
What’s going on?
• C arrays are stored in row major order – accessing an array
in the order in which it is stored improves cache performance
• Every assignment to an array element results in a store
to memory
• Every use of an array element results in a load from
memory
• kji inner loop
– Load of A[i][k], C[i][j] every iteration
– Cache miss on A[i][k], C[i][j] every iteration for large enough
values of n
– Store to C[i][j] every iteration
• ijk inner loop
– Load of A[i][k], B[k][j] every iteration
– Cache miss on B[k][j] every iteration for large enough values of n
What do we need to know to write
more efficient code?
• Registers on machine allocated by the
compiler; what items are placed in
registers; what items are generally not
placed in registers
• Cache memory and organization; how we
can write code that minimizes cache
misses
Executing the executable
• Unix system:
cs% ./hello
hello, world
cs%
• The Unix prompt (cs%) is displayed by a shell
program that is running and waiting for the user
to type a command
• Since hello isn’t a built-in shell command, the
shell loads and runs the hello program and
waits for it to terminate
Hardware organization
• Buses
• I/O devices
• Main memory
• Processor
Understanding the hardware (to some
degree) will help us to write better code
Figure 1.4: hardware organization. The CPU (register file, PC, ALU, bus interface) connects through the system bus to the I/O bridge. The I/O bridge connects through the memory bus to main memory and through the I/O bus to the USB controller (mouse, keyboard), the graphics adapter (display), the disk controller (disk, which stores the hello executable) and expansion slots for other devices such as network adapters.
Buses
• Collection of electrical conduits (for
example, wires) that carry bytes of
information
• Three types of conduits – control, address,
data
• Data – amount transferred at one time is
typically a word (1 or 2 bytes [embedded
system], 4 bytes [PC], or 8 bytes [server
quality machine])
I/O Devices
• Connect the computer to the external world
• Examples: keyboard, mouse, display, disk drive,
printer, scanner, microphone, speaker, etc.
• Connected to the I/O bus via an I/O controller
or an Adapter
– I/O controller – chip in the device or on the
motherboard
– Adapter – card that plugs into a slot on the
motherboard
Main Memory
• Collection of Dynamic Random Access
Memory (DRAM) chips
– Dynamic – memory has to be frequently
refreshed by reading/rewriting the bit values
– Random Access – amount of time to access
any cell is the same as any other
• Logically, memory is a linear array of bytes
each with its own address
Processor
• Executes instructions stored in main memory
• Program Counter (PC) – register that contains
the address of the next instruction to be
executed
• Fetch-Decode-Execute cycle
nextInstruction = memory[PC]
PC = PC + instructionSize
Figure out activities specified by nextInstruction
Perform those activities (may cause PC to be modified)
Back to hello world example
• Shell program is continuously reading
characters at the keyboard into a register
and then storing them into memory (see
figure 1.5)
Figure 1.5: the hardware organization of Figure 1.4 as the user types "hello" at the keyboard; the string "hello" ends up in main memory.
Back to hello world example
• When the user presses Enter, the shell knows
that the command is complete and then
interprets the command
• The hello executable is copied from disk into
main memory
• DMA I/O performed – data (executable)
travels from disk to main memory without
passing through processor (see figure 1.6)
Figure 1.6: the hardware organization of Figure 1.4 as the hello executable stored on disk is loaded; main memory now holds the hello code and the string "hello,world\n".
Back to hello example
• Processor fetches, decodes and executes
machine instructions in the hello
executable
• “hello, world” string will be copied from
memory to a register on the processor and
from there to the output device (see figure
1.7)
Figure 1.7: the hardware organization of Figure 1.4 as the program runs; the string "hello,world\n" travels from main memory to the display.
Processor-Memory Gap
• Notice how much data movement
occurred just from executing the simple
hello world program
• Unfortunately, processor performs much
faster than memory
• And processor performance is improving
at a faster rate than memory performance
Source: Computer Architecture: A Quantitative Approach by Hennessy and Patterson
Cache
• Level of memory between processor and
main memory that stores recently
accessed instructions and data, as well as
nearby instructions and data
• Implemented with Static Random Access
Memory (SRAM) which is faster and more
expensive than DRAM
Figure 1.8: the CPU chip holds the register file, ALU, L1 cache (SRAM) and bus interface. A cache bus connects it to the L2 cache (SRAM); the system bus connects it to the memory bridge, which reaches main memory (DRAM) over the memory bus.
Memory Hierarchy
• Storage at one level acts as a “cache” for
the level below it
• Storage at level i is smaller, faster and
more expensive (per byte) than the storage
at level i+1
Figure 1.9: the memory hierarchy, from smaller, faster and costlier (per byte) storage devices at the top to larger, slower and cheaper (per byte) storage devices at the bottom
L0: Registers. CPU registers hold words retrieved from cache memory.
L1: On-chip L1 cache (SRAM). L1 cache holds cache lines retrieved from the L2 cache.
L2: Off-chip L2 cache (SRAM). L2 cache holds cache lines retrieved from memory.
L3: Main memory (DRAM). Main memory holds disk blocks retrieved from local disks.
L4: Local secondary storage (local disks). Local disks hold files retrieved from disks on remote network servers.
L5: Remote secondary storage (distributed file systems, Web servers)
Operating System
• Interface between the applications running on a
computer and the hardware
• Protects resources from invalid accesses
• Provides applications with a uniform method for
using very different hardware devices
• OS provides abstractions for processes, virtual
memory and files
[Figure: layered view. Software: application programs on top of the operating system. Hardware: processor, main memory and I/O devices.]
Processes
• Operating system’s name for a running
program
• Multiple processes can run concurrently,
each with the illusion that it is the only
process running
• Context switch – state of the current
process is saved, the state of a new
process is restored and control is passed
to the new process
Figure 1.12: timeline of the shell process and the hello process. Application code runs in one process at a time; each of the two context switches between the processes is carried out by OS code.
Threads
• Process can consist of multiple execution
units called threads
• Threads share the same heap, code
(shared library and user code) and global
data, but each thread has its own stack,
stack pointer, PC and register values
Virtual Memory
• Provides the illusion that each process has
exclusive use of the main memory
• Virtual addresses translated to physical (main
memory) addresses during execution time
• Virtual address space contains areas for
– Program code and data
– Heap
– Shared libraries
– Stack
– Kernel – portion of the operating system that is
always resident in main memory
Figure 1.14: process virtual address space, listed from the highest address down
0xffffffff
    Kernel virtual memory (memory invisible to user code)
0xc0000000
    User stack (created at runtime)
    Memory mapped region for shared libraries (printf() function)
0x40000000
    Run-time heap (created at runtime by malloc)
    Read/write data (loaded from the hello executable file)
    Read-only code and data (loaded from the hello executable file)
0x08048000
    Unused
0
Unix Files
• Sequence of bytes whose use is
determined by context
• I/O devices (disks, keyboards, displays,
etc.) are modeled as files and I/O is
performed by reading/writing to the
appropriate file
Network
• From point of view of an individual system,
the network can be viewed as another I/O
device
– Data can flow from main memory to network
adapter just like it can from main memory to a
disk drive
– System can read data from other machines
and copy the data into main memory
• See figure 1.14
Figure 1.14: the hardware organization of Figure 1.4 with a network adapter in an expansion slot on the I/O bus, connecting the system to the network.