Data Representation
Overflow
Limits
No good explanations in the books
Try to understand in class and from the slides
Representation of Data
All data in the computer’s memory and files is represented as a sequence of bits
Bit : unit of storage, represents the level of an electrical charge. Can be either 0 or 1
Byte: another unit of storage that occupies 8 bits.
A bit sequence can represent many different things:
– We will see that a bit string (such as 10000000111000001011111110100010)
can mean several different things depending on the representation that is
agreed upon.
So, how should we represent integers, characters, real numbers, strings,
structures, in terms of bits?
– Representations must be efficient and convenient
– We will see some of them
Characters
In C/C++, characters are actually integers of length one byte, with special
meaning as characters
- the ASCII mapping
char c = 'a';
//stores the code corresponding to letter 'a' (97), but
//the character a is displayed when c is printed
ASCII Standard
– American Standard Code for Information Interchange
– dates back to the 1960s
– 128 different codes (0 . . . 127) and corresponding characters; one-byte
extensions bring the total to 256 codes (0 . . . 255)
– The characters with codes 0 . . 31, and 127 are control characters
• At that time, standardizing communication-related and telegraphic codes was
important. That is why most of the control characters serve this purpose and are now
obsolete. Some OSs still implement some of the control characters, though.
– Extended ASCII (128 – 255): there are different conventions and
interpretations
ASCII
[Figure: ASCII table; the control characters shown in blue are the ones important for Windows/DOS]
See http://www.asciitable.com/ and
http://en.wikipedia.org/wiki/ASCII for more info
These are the ASCII codes – of course you are not expected to memorize them,
just know that there are codes for special characters; and that numbers, lowercase
and uppercase letters are ordered and consecutive (so one can do '1'+5 to get code
of '6', or subtract 32 to go from lowercase to get corresponding uppercase)
|  0 NUL|  1 SOH|  2 STX|  3 ETX|  4 EOT|  5 ENQ|  6 ACK|  7 BEL|
|  8 BS |  9 HT | 10 LF | 11 VT | 12 FF | 13 CR | 14 SO | 15 SI |
| 16 DLE| 17 DC1| 18 DC2| 19 DC3| 20 DC4| 21 NAK| 22 SYN| 23 ETB|
| 24 CAN| 25 EM | 26 SUB| 27 ESC| 28 FS | 29 GS | 30 RS | 31 US |
| 32 SP | 33 !  | 34 "  | 35 #  | 36 $  | 37 %  | 38 &  | 39 '  |
| 40 (  | 41 )  | 42 *  | 43 +  | 44 ,  | 45 -  | 46 .  | 47 /  |
| 48 0  | 49 1  | 50 2  | 51 3  | 52 4  | 53 5  | 54 6  | 55 7  |
| 56 8  | 57 9  | 58 :  | 59 ;  | 60 <  | 61 =  | 62 >  | 63 ?  |
| 64 @  | 65 A  | 66 B  | 67 C  | 68 D  | 69 E  | 70 F  | 71 G  |
| 72 H  | 73 I  | 74 J  | 75 K  | 76 L  | 77 M  | 78 N  | 79 O  |
| 80 P  | 81 Q  | 82 R  | 83 S  | 84 T  | 85 U  | 86 V  | 87 W  |
| 88 X  | 89 Y  | 90 Z  | 91 [  | 92 \  | 93 ]  | 94 ^  | 95 _  |
| 96 `  | 97 a  | 98 b  | 99 c  |100 d  |101 e  |102 f  |103 g  |
|104 h  |105 i  |106 j  |107 k  |108 l  |109 m  |110 n  |111 o  |
|112 p  |113 q  |114 r  |115 s  |116 t  |117 u  |118 v  |119 w  |
|120 x  |121 y  |122 z  |123 {  |124 |  |125 }  |126 ~  |127 DEL|
Codes 0 . . 31 are special control characters. Most of them are now obsolete.
Some OSs implement some of them and meanings may change from OS to OS.
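The remark above that digits and letters have ordered, consecutive codes can be tried out with a small sketch. The function names digit_to_char and to_upper_ascii are made up for this illustration, and the code assumes an ASCII-compatible character set.

```cpp
#include <cassert>

// Returns the character for a decimal digit 0..9, relying on the codes of
// '0'..'9' being consecutive (48..57 in ASCII).
char digit_to_char(int digit) {
    assert(digit >= 0 && digit <= 9);
    return static_cast<char>('0' + digit);
}

// Converts a lowercase ASCII letter to uppercase by subtracting 32
// ('a' is 97, 'A' is 65); any other character is returned unchanged.
char to_upper_ascii(char c) {
    if (c >= 'a' && c <= 'z') {
        return static_cast<char>(c - 32);
    }
    return c;
}
```

For example, digit_to_char(6) gives '6' (the same code as '1' + 5), and to_upper_ascii('q') gives 'Q'.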
Integer Number Representation
Sign-Magnitude Representation
1s complement Representation
2s complement Representation
Comparison of different representations
Number Representation
Fundamental problem:
– Fixed-size representation (e.g. 4 bytes for integers) can’t encode all numbers
– Usually sufficient in most applications,
– But a potential source of bugs: overflow
– need to be careful of it
Other problems:
– How to represent negative numbers, floating points?
– Historically, many different representations.
– How to do subtraction effectively?
Base 2 – unsigned numbers
MSB – Most Significant Bit
LSB – Least Significant Bit
//8-bit binary representation of positive integers
00000000     0
00000001     1
00000010     2
00000011     3
...
11111111   255
– Representation: an n-bit number in base b has decimal value
$\sum_{i=0}^{n-1} d_i \, b^i$
• d_i is the coefficient of the ith bit.
• Bit 0 is the LSB and bit n-1 is the MSB.
– Example for base 2 (binary): 1011_2 = 1 x 2^0 + 1 x 2^1 + 0 x 2^2 + 1 x 2^3 = 11_10
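The positional formula above can be written as a short function. This is only a sketch: the name value_in_base is hypothetical, it handles bases 2 to 10 only, and it does not check whether the result fits in unsigned long.

```cpp
#include <cassert>
#include <string>

// Evaluates a digit string in the given base (2..10) as an unsigned value:
// value = sum of d_i * base^i, where the rightmost digit is d_0 (the LSB).
unsigned long value_in_base(const std::string& digits, unsigned base) {
    assert(base >= 2 && base <= 10);
    unsigned long value = 0;
    for (char ch : digits) {
        unsigned d = static_cast<unsigned>(ch - '0');
        assert(d < base);          // every digit must be valid in this base
        value = value * base + d;  // Horner form of the sum
    }
    return value;
}
```

With this sketch, value_in_base("1011", 2) evaluates to 11 and value_in_base("11111111", 2) to 255.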
Sign/Magnitude representation
(also called “signed representation”)
– use one of the bits (the first bit = Most Significant Bit) as a sign bit.
– use the rest for magnitude
– e.g.
000 = +0
001 = +1
010 = +2
011 = +3    (000 . . 011: positive numbers)
100 = -0
101 = -1
110 = -2
111 = -3    (100 . . 111: negative numbers)
– range: -(2^(n-1) - 1) to 2^(n-1) - 1, where n is the total number of bits
For n = 4, [ -(2^3 - 1) . . . 2^3 - 1 ] = [ -7 . . . 7 ]
Alternative representations
Most computers don’t use a “sign and magnitude” representation
– Drawbacks of the Sign-Magnitude representation:
1. two 0s: one positive one negative
2. addition and subtraction involving negative numbers are complicated
Alternatives?
– 1's complement representation
– 2's complement representation: today's standard
These two representations seem very similar in approach, but they differ in:
– Representation of negative numbers (positives are the same in all 3
representations) and
– Ease of arithmetic operations involving negative numbers
Signed numbers: 1’s complement
– Positive numbers: first bit is 0, and the rest is the binary equivalent of the
number.
– Negative numbers: represented by the 1’s complement of the corresponding
positive number
• 1’s complement: invert all the bits (0's become 1; 1's become 0)
e.g
+8 = 0000 1000 (0 for + sign, and 000 1000 for 8)
- 8 = 1111 0111
– So, effectively the first bit is used for sign, but negative numbers are encoded
differently than in the sign-magnitude representation.
– How about 0?
Number   1’s-complement
+7       0111
+6       0110
+5       0101
+4       0100
+3       0011
+2       0010
+1       0001
+0       0000
-0       1111
-1       1110
-2       1101
-3       1100
-4       1011
-5       1010
-6       1001
-7       1000
As in the sign-magnitude representation, there are two zeros: +0 and -0
Range: [-(2^(n-1) - 1) . . 2^(n-1) - 1]
For n = 4, [-(2^3 - 1) . . . 2^3 - 1] = [-7 . . . 7]
Signed numbers: 2’s complement
Signed 2’s complement is the common representation for signed numbers used in computers
– For positive numbers, use 0 first and the remaining bits are the binary equivalent of the
magnitude.
– Negative numbers are represented by the 2’s complement of the corresponding
positive number.
• 2s complement: invert all bits and add 1
• Alternative (easier) method: copy all the bits from right to left up to and
including the first 1, then invert the rest
Ex : +20 = 0001 0100
-20 = 1110 1100
Range: [-2^(n-1) . . . 2^(n-1) - 1]
For n = 4, [-2^3 . . . 2^3 - 1] = [-8 . . . 7]
– single 0
– addition and subtraction complexities simplified
– note the range (one more negative as compared to 1's complement): -2^(n-1) . . . 2^(n-1) - 1
– current standard for representing signed integers
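The "invert all bits and add 1" rule can be checked on the +20 / -20 example with a minimal sketch. The helper names to_bits and negate8 are invented for this illustration, and the width is fixed at 8 bits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Returns the lowest n bits (1 <= n <= 32) of x as a string of '0' and '1',
// most significant bit first.
std::string to_bits(std::uint32_t x, int n) {
    assert(n >= 1 && n <= 32);
    std::string s(static_cast<std::size_t>(n), '0');
    for (int i = 0; i < n; ++i) {
        if ((x >> i) & 1u) {
            s[static_cast<std::size_t>(n - 1 - i)] = '1';
        }
    }
    return s;
}

// 2's complement of an 8-bit pattern: invert all bits, add 1, and keep
// only the low 8 bits of the result.
std::uint8_t negate8(std::uint8_t x) {
    return static_cast<std::uint8_t>(~x + 1u);
}
```

Here to_bits(20, 8) is "00010100" and to_bits(negate8(20), 8) is "11101100", matching the example; applying negate8 twice returns the original pattern.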
Number   2’s-complement
+7       0111
+6       0110
+5       0101
+4       0100
+3       0011
+2       0010
+1       0001
 0       0000
-1       1111
-2       1110
-3       1101
-4       1100
-5       1011
-6       1010
-7       1001
-8       1000
There is only one zero
There is one more negative number as compared to positives
Possible Representations: summary
Bits   Sign Magnitude   One's Complement   Two's Complement
000    +0               +0                 +0
001    +1               +1                 +1
010    +2               +2                 +2
011    +3               +3                 +3
100    -0               -3                 -4
101    -1               -2                 -3
110    -2               -1                 -2
111    -3               -0                 -1
Notice: Positive numbers are represented the same way (same bit strings) in
all representations! So for all three representations, representation of a
positive number is directly decimal to binary / binary to decimal
conversion.
Decimal Conversion for Negatives
If you are given a bit string representing a negative number, you can find the
decimal equivalent depending on the number representation used.
– if sign/magnitude representation is used:
• If MSB is 1, that means the number is negative
• but this bit has no contribution to the magnitude. Convert the remaining bits to
decimal for the magnitude.
• For example, 10010010 is equivalent to –18 (- (1x16 + 1x2))
– if 1s complement representation is used:
• If MSB is 1, that means the number is negative
• To find the magnitude:
– invert all bits (i.e. negate): 10010010 => 01101101
– find the positive number corresponding to the negated string
» 1x64 + 1x32 + 1x8 + 1x4 + 1 = 109
– 10010010 is equivalent to -109
– note that this is the reverse operation of what we would do if we wanted to find
the bit representation of –109 (find the bit representation of 109 and then take
1s complement)
Decimal Conversion for Negatives– ctd.
– if 2s complement representation is used:
• If MSB is 1, that means the number is negative.
• To find the magnitude:
– invert all bits
1001 0010 => 0110 1101
– add 1
=> 0110 1110
– this is the negated value
– find the positive number corresponding to the negated string (01101110)
» 1x64 + 1x32 + 1x8 + 1x4 + 1x2 = 110
– 10010010 is equivalent to –110
– note that this is the reverse operation of what we would do if we wanted to find
the bit representation of –110 (find the bit representation of 110 and then take
2s complement)
Alternative decimal conversion – 2s comp.
You can also directly/quickly find the decimal equivalent of a 2s
complement number:
• use the usual binary to decimal conversion, using the negative
for the coefficient of the most significant bit
Bit        1      0     0     1     0     0     1     0
Weight    -2^7    2^6   2^5   2^4   2^3   2^2   2^1   2^0
       = -128     64    32    16    8     4     2     1
Hence: 10010010_2 = -1 x 2^7 + 1 x 2^4 + 1 x 2^1 = -128 + 16 + 2 = -110_10
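To compare the three readings of the same bit string 10010010, here is a small sketch with one function per representation. The function names are hypothetical and the width is fixed at 8 bits.

```cpp
#include <cassert>
#include <cstdint>

// Sign/magnitude: the MSB is only a sign, the other 7 bits are the magnitude.
int sign_magnitude_value(std::uint8_t bits) {
    int magnitude = bits & 0x7F;
    return (bits & 0x80) ? -magnitude : magnitude;
}

// 1's complement: for a negative number, invert all bits to get the magnitude.
int ones_complement_value(std::uint8_t bits) {
    if (bits & 0x80) {
        return -static_cast<int>(static_cast<std::uint8_t>(~bits));
    }
    return bits;
}

// 2's complement, direct method: usual binary weights, except that the
// MSB contributes -128 instead of +128.
int twos_complement_value(std::uint8_t bits) {
    return (bits & 0x7F) - ((bits & 0x80) ? 128 : 0);
}
```

For the pattern 0x92 (10010010) the three functions give -18, -109 and -110 respectively, the values derived on the slides.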
Conversion to decimal with 32 bit numbers – 2s comp.
Same idea as 8 bit 2s complement integers, but the most significant bit has weight
-2^31 = -2,147,483,648.
Bit position   31      30     ...   6     5     4     3     2     1     0
Weight        -2^31    2^30   ...   2^6   2^5   2^4   2^3   2^2   2^1   2^0
                              ...   64    32    16    8     4     2     1
A very important note
Converting n bit numbers into numbers with more than n bits:
– copy the most significant bit (the sign bit) into the other bits
– Example: 4-bit to 8-bit
0010 -> 0000 0010 (both have decimal value 2)
1010 -> 1111 1010 (both have decimal value -6 in 2's complement)
– This method is valid for both 1's and 2's complement representations
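Sign extension from 4 to 8 bits can be sketched as follows. The function name sign_extend_4_to_8 is made up for this example; the 4-bit value is assumed to sit in the low half of a byte.

```cpp
#include <cassert>
#include <cstdint>

// Extends a 4-bit pattern to 8 bits by copying its sign bit (bit 3)
// into bits 4..7.
std::uint8_t sign_extend_4_to_8(std::uint8_t nibble) {
    nibble &= 0x0F;                        // keep only the 4-bit pattern
    if (nibble & 0x08) {                   // sign bit set: fill with 1s
        return static_cast<std::uint8_t>(nibble | 0xF0);
    }
    return nibble;                         // sign bit clear: fill with 0s
}
```

With this sketch 0010 becomes 0000 0010 (0x02) and 1010 becomes 1111 1010 (0xFA), as in the example above.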
Subtraction
a-b can always be represented as a+(-b). Doing the arithmetic in this way with plain
binary addition gives wrong results in sign-magnitude and 1's complement representations,
but not in 2's complement. We will see an example now.
Consider 3 - 2 which is the same as 3 + (-2)
In sign-magnitude representation using 4 bits:
3 + (-2) should give 1, but instead we get -5 !
  0 011 = +3_10
+ 1 010 = -2_10
---------
  1 101 = -5_10 which is a wrong result
To remedy this, the operation can take special notice of the sign bits and perform a
subtraction instead. This complicates the implementation; we have a better
solution using 2's complement (next slide).
Subtraction
In 1's complement representation using 4 bits:
3 + (-2) becomes 0 !
    0 011 = +3_10
+   1 101 = -2_10
-----------
  1 0 000 = 0_10 which is a wrong result
The carry out of the MSB (the leading 1) is dropped automatically since it does not fit in 4 bits.
(1's complement hardware has to add this carry back in to obtain the correct result 1.)
But two's complement addition results in the correct sum without hassle.
    0 011 = +3_10
+   1 110 = -2_10
-----------
  1 0 001 = 1_10 which is the correct result!
Again, the carry out of the MSB is dropped automatically since it does not fit.
Why 2's Complement?
There is only one zero. Range for negative numbers is one more than the
other representations
Subtraction can be implemented as addition (a - b = a + (-b)). Thus no
borrowing logic needs to be implemented. Let us give two 8-bit
examples.
97 - 120 = ?
   01100001     97
+  10001000   -120
------------
   11101001    -23
-51 – 70 = ?
   11001101    -51
+  10111010    -70
------------
 1 10000111   -121
Due to the fixed width of the registers, the carry out of the MSB is lost automatically but the result is correct.
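The two 8-bit examples above can be reproduced with a sketch that mimics an 8-bit register. The names add8 and sub8 are hypothetical; the cast to an 8-bit type is what drops the carry out of the MSB.

```cpp
#include <cassert>
#include <cstdint>

// Adds two 8-bit patterns the way an 8-bit register would: the sum is
// computed, then any carry out of bit 7 is dropped by the cast.
std::uint8_t add8(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(a + b);
}

// Computes a - b as a + (-b), where -b is the 2's complement of b.
std::uint8_t sub8(std::uint8_t a, std::uint8_t b) {
    std::uint8_t minus_b = static_cast<std::uint8_t>(~b + 1u);
    return add8(a, minus_b);
}
```

Here add8(0x61, 0x88) gives 0xE9 (11101001, i.e. -23) and add8(0xCD, 0xBA) gives 0x87 (10000111, i.e. -121); sub8(97, 120) produces the same pattern as the first sum.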
Two's Complement – Negation
Negating a two's complement number is simple:
1. Start at the least significant bit. Copy through the first “1”; after that, invert each bit.
• Example: 0010101100 -> 1101010100
2. Alternatively, invert all bits and add one to the least significant bit
If you negate twice, you will obtain the same number:
0011 (3) -> 1101 (-3) -> 0011 (3)
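Both negation methods can be written down and compared. The sketch below uses 16-bit patterns and invented function names; it is meant only to show that the "copy through the first 1" rule and the "invert and add one" rule agree.

```cpp
#include <cassert>
#include <cstdint>

// Method 1: scan from the least significant bit, copy bits up to and
// including the first 1, then invert every remaining bit.
std::uint16_t negate_by_copy_rule(std::uint16_t x) {
    std::uint16_t result = 0;
    bool seen_one = false;
    for (int i = 0; i < 16; ++i) {
        std::uint16_t bit = (x >> i) & 1u;
        if (seen_one) {
            bit ^= 1u;               // left of the first 1: invert
        } else if (bit == 1u) {
            seen_one = true;         // the first 1 itself is copied
        }
        result |= static_cast<std::uint16_t>(bit << i);
    }
    return result;
}

// Method 2: invert all bits and add one.
std::uint16_t negate_by_invert_add(std::uint16_t x) {
    return static_cast<std::uint16_t>(~x + 1u);
}
```

For the slide example, 0010101100 is 0x00AC; both functions return 0xFF54, whose low 10 bits are 1101010100.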
Important Note on Terminology!
"2's complement" (or "two’s complement") does not mean a negative
number!
2's complement is a representation used to represent all integers, not just
negative integers!
So 2's complement is a format specification, but we also use the term "2's
complement of a number" as its negation
e.g. when we want to negate a number, either from positive to negative or
negative to positive, we may say "take its 2's complement". That means
"2's complement of a number" does not always mean a negative number.
Overview of Built-in Types and
Their Ranges
Built-in Types in C++
The types which are part of the C++ language; not implemented as a
class
– char
– int, long (a.k.a. long int), short (a.k.a. short int)
– float, double
– Mostly for numeric data representation
– There are signed (which is the default one) and unsigned versions
for integer/char storage
– Signed integer representation uses 2's complement
Now we are going to see some characteristics of these types and their
limits. Some of these discussions are not new to you (discussed in
CS201), but after learning data representation and 2's complement,
they will mean more to you now.
char
The type char is known to store an ASCII character, but actually it stores a
signed one byte integer number (2's complement representation).
– Since there is no other one-byte integer type in C++, char is widely used as
integers as well
– Of course, it is also used to store a character (as seen at the beginning of this
ppt file)
char ch;
ch = 'A'; //valid
ch = 99; //valid
The type char is by default "signed" in Visual Studio
– The range is -128 . . 127
ch = -26; //valid
ch = 135; //out of range, but not a syntax error. Compiler gives a warning
– 135 is out of range but still fits in 8-bits (1000 0111). When you have
cout << ch;
– Output is the character for which the ASCII code is 135. However, when you have
cout << (int) ch;
– the stored 8-bit string 1000 0111 is interpreted as a signed (2's complement) number
and extended to 32 bits by copying the sign bit:
11111111 11111111 11111111 10000111_2 which is -121 in 2's
complement representation. Thus you see -121 as the output
unsigned char
You can change the default behaviour to unsigned by changing the project properties
– Open the project's Property Pages dialog box.
– Click on "C/C++"
– Click on "Command Line"
– Add /J compiler option.
You can also explicitly specify a char variable as unsigned by putting the keyword
unsigned before char.
– For non-negative one-byte integers. Since there are no negatives, no need to
use 2's complement
• The range is 0 . . 255.
– For ASCII interpretation, signed and unsigned do not make a difference
• The ASCII character corresponding to the binary representation
unsigned char ch;
ch = 200; //valid; the ASCII character with code 200
ch = -26; //out of range, but not a syntax error. Compiler gives a warning
– -26 is out of range but can be represented in 2's complement as 11100110 in
binary and the unsigned interpretation of this bit string is 230. Thus:
cout << ch;
– displays the character for which the ASCII code is 230.
int, short, long, long long
The "signed" integer types of C++
Uses 2's complement representation
int
– Mostly used signed integer type of C++
– Traditionally its size follows the word size of the processor, but the C++ standard only
guarantees at least 16 bits
• On most current 32-bit and 64-bit platforms it is 4 bytes
– Visual Studio uses 4 bytes: thus, in CS204 we can assume that int always
uses 4 bytes
• But if you port your code to another platform using another compiler, do not trust that int uses 4
bytes.
– Range:
INT_MIN to INT_MAX (these are defined in limits.h or climits header file)
-2^(n-1) . . . 2^(n-1) - 1 where n is the number of bits used
32 bits (our case): -2^31 . . . 2^31 - 1 = -2,147,483,648 . . . 2,147,483,647
64 bits: -2^63 . . . 2^63 - 1 = -9,223,372,036,854,775,808 . . . +9,223,372,036,854,775,807
long
(can also be used as long int)
long num;
//can also be defined as long int num;
– Signed integer that uses 4 bytes in Visual Studio; the size is not the same on every
platform (for example, it is 8 bytes on 64-bit Linux)
– In Visual Studio the range is the same as 32-bit int
int, short, long, long long
long long (can also be used as long long int)
long long wow;
//can also be defined as long long int wow;
– Signed integer of at least 64 bits (64 bits in Visual Studio)
– Part of standard C++ since C++11; only very old compilers may lack it.
– Range: LLONG_MIN to LLONG_MAX ( -2^63 . . . 2^63 - 1 )
short (can also be used as short int)
short count;
//can also be defined as short int count;
– Signed integer that uses 2 bytes in Visual Studio and on practically all current
platforms (the standard guarantees at least 16 bits)
– Range:
SHRT_MIN to SHRT_MAX (these are defined in limits.h or climits header file)
-2^15 . . . 2^15 - 1 = -32768 . . . 32767
count = 31500; //valid
count = 35000; //out of range, but not a syntax error. Compiler gives a warning
– So, what is the output of cout << count; ?
– It displays -30536, why?
– Write 35000 in binary in 16-bits and interpret this bit string as a 2's complement
signed number
35000_10 = 1000 1000 1011 1000_2 = 8 + 16 + 32 + 128 + 2048 – 32768 = -30536
unsigned
integers
In order to store only non-negative values, char, int, short, long, long long can
be defined as unsigned by putting unsigned keyword before the type name.
unsigned int mynum;
unsigned short cinekop; // same as unsigned short int cinekop;
unsigned long lufer; //same as unsigned long int lufer;
unsigned long long kofana; // same as unsigned long long int kofana;
In unsigned representation there is no sign bit; most significant bit is part of the
magnitude. Thus we do not need 2's complement.
– In this way, we can use the full range (2, 4 or 8 bytes) for zero and positive values.
The ranges become (note that the positive range is almost doubled as compared to
signed integers):
– 16-bit: 0 to USHRT_MAX (defined in limits.h or climits header file)
0 . . . 2^16 - 1 = 0 . . . 65535
– 32-bit: 0 to UINT_MAX (defined in limits.h or climits header file)
0 . . . 2^32 - 1 = 0 . . . 4,294,967,295
– 64-bit: 0 to ULLONG_MAX (defined in limits.h or climits header file)
0 . . . 2^64 - 1 = 0 . . . 18,446,744,073,709,551,615
unsigned
integers
Unsigned numbers do not store negatives, but nothing stops us from assigning a
negative value to an unsigned variable
unsigned short num;
num = -25;
cout << num;
-25 is negative so it is represented using 2's complement. The resulting bit string is
then interpreted as an unsigned number since it is assigned to an unsigned variable
(implicit type casting).
-25_10 = 1111 1111 1110 0111_2 = 65511_10
So the output becomes 65511
Of course, it is not a normal programmer behavior to assign a negative value to an
unsigned variable, but such things may unintentionally occur.
If you use a literal or constant at the right-hand-side of assignment, then compiler
may warn you (depending on the warning level). However, if rhs is an expression,
then the problem occurs at run-time and compiler cannot see that problem.
Thus, you have to know what happens in such situations to locate the problem easily.
Limits
You can include limits.h which defines the ranges of integers (depending
on your platform/computer)
#include <limits.h>
OR
#include <climits>
Tip: Type #include <limits.h> (or any other filename) in your
program, then go to that line, and right click on the file name and
choose “Open Document”. That will bring you this header file.
You can do this in general and it will save you the effort to locate the file.
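As a small illustration of using the constants from climits, the sketch below reports the size of int and checks a value against the range of short. The function names bits_in_int and fits_in_short are made up for this example; the results depend on the platform.

```cpp
#include <cassert>
#include <climits>

// Number of bits in an int on the platform this code is compiled for
// (CHAR_BIT is the number of bits in a byte, also defined in climits).
int bits_in_int() {
    return static_cast<int>(sizeof(int)) * CHAR_BIT;
}

// True when v lies inside the range of short (SHRT_MIN . . SHRT_MAX).
bool fits_in_short(long v) {
    return v >= SHRT_MIN && v <= SHRT_MAX;
}
```

With a 2-byte short, fits_in_short(31500) is true and fits_in_short(35000) is false; in Visual Studio bits_in_int() is expected to be 32.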
Typecasting between signed and unsigned numbers
Typecasting may be done explicitly or sometimes it happens implicitly (e.g.
when you assign an unsigned variable to a signed one, or vice versa)
– So, you should know how it executes
Signed to unsigned typecasting
– Represent the signed number using 2's complement format.
– Interpret this bit string as unsigned
• If MSB is 0, then the signed and unsigned are the same
• If MSB is 1, then signed is negative. For unsigned conversion, MSB is not
considered as the sign bit, it is interpreted as part of the magnitude.
– Two slides ago, we had an example, but let us give another one.
short ints = -30000;
unsigned short intus = ints; //implicit typecasting
cout << intus;
Output is 35536, which has the same bit representation as -30000
Typecasting between signed and unsigned numbers
Unsigned to signed typecasting
– Represent the unsigned number as bit string.
– Interpret this bit string as signed
• If MSB is 0, then the unsigned and signed are the same
• If MSB is 1, then interpret the bit string as a 2's complemented negative value
– I do not mean to take 2's complement.
– But you can take 2's complement to understand the magnitude of this
negative number
– Examples
unsigned short usnum = 30000;
cout << (short) usnum;
• Output is 30000, MSB of usnum is 0.
unsigned short usnum = 63000;
cout << (short) usnum;
• Output is -2536, MSB of usnum is 1.
Moral of the story behind typecasting: They are all the same bit
strings; the only thing that changes is how to interpret it
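The "same bit string, different interpretation" idea can be made explicit by copying the raw bytes between a signed and an unsigned 16-bit variable. This is a sketch under the assumption that the fixed-width types std::int16_t and std::uint16_t are available; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Copies the 16 bits of a signed value into an unsigned variable unchanged,
// so only the interpretation of the bit string differs.
std::uint16_t bits_as_unsigned(std::int16_t s) {
    std::uint16_t u;
    std::memcpy(&u, &s, sizeof u);
    return u;
}

// The opposite direction: the same 16 bits read as a 2's complement number.
std::int16_t bits_as_signed(std::uint16_t u) {
    std::int16_t s;
    std::memcpy(&s, &u, sizeof s);
    return s;
}
```

With these helpers, bits_as_unsigned(-30000) is 35536, bits_as_signed(30000) is 30000 and bits_as_signed(63000) is -2536, the same values as in the examples above.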
Some tips about selecting integer type
You may consider using an unsigned variable if you will store a nonnegative number.
However, if the value can get close to 0, this is a bit risky. Consider the following loop:
unsigned int j;
for (j = 5; j >= 0; j--)
cout << j << endl;
This loop is infinite. When j is 0, it is decremented and you expect to have
-1. Actually it is -1 as the bit string representation (a bit string with all 1's
in it). This bit string is the largest unsigned integer number when you
typecast into unsigned int. Thus it is >=0.
Moral of the story: use unsigned only if you make sure that the value of the
variable will never go below 0. Otherwise use signed integers.
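One possible way to count down to 0 with an unsigned variable without looping forever is to test for 0 before decrementing. The sketch below is only one of several options (using a signed loop variable is another); the function names count_down and decrement are invented for this example.

```cpp
#include <cassert>
#include <climits>
#include <vector>

// Collects start, start-1, ..., 0 using an unsigned counter. The loop exits
// explicitly at 0, before a decrement could wrap around.
std::vector<unsigned> count_down(unsigned start) {
    std::vector<unsigned> values;
    for (unsigned j = start; ; --j) {
        values.push_back(j);
        if (j == 0) {
            break;
        }
    }
    return values;
}

// Shows the wrap-around itself: decrementing 0 yields the largest unsigned int.
unsigned decrement(unsigned j) {
    return j - 1u;
}
```

Here count_down(5) yields 5, 4, 3, 2, 1, 0 and stops, while decrement(0) equals UINT_MAX, which is why the condition j >= 0 in the original loop can never become false.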
Some tips about selecting integer type
Do not mix signed and unsigned numbers in an expression. Some strange things
may happen. Consider the following code:
unsigned int a = 5;
int b = -10;
if (a+b < 0)
cout << "hede" << endl;
else
cout << "hodo" << endl;
You expect the output to be "hede", but it displays "hodo". Why?
In C++, there is a rule saying that built-in operands of an operator must be of the
same type. If they are different, one of them is implicitly typecasted to the other
before evaluating the expression.
– Typical case: if you add an integer to a double, integer is converted to double before
the operation.
– In the example above, signed (b) is typecasted to unsigned. -10 is a binary number
with lots of 1's in 2's complement format. Thus as unsigned, it is a big number. When
added 5, the sum gets bigger and can never be less than 0.
RULE: If there is a signed and an unsigned number of the same size (e.g. int and unsigned int)
in an expression, the signed one is automatically converted to unsigned by the compiler before the evaluation.
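The effect of the conversion rule on the example a = 5, b = -10 can be observed with a short sketch. The function names are hypothetical, and the suggested fix assumes that long long is wider than unsigned int, which holds for the 4-byte int discussed in these slides.

```cpp
#include <cassert>
#include <climits>

// What a + b evaluates to when a is unsigned int and b is int:
// b is converted to unsigned first, so the sum is unsigned as well.
unsigned mixed_sum(unsigned a, int b) {
    return a + b;
}

// One possible fix: do the arithmetic in a wider signed type, so that the
// comparison with 0 behaves as in ordinary arithmetic.
bool sum_is_negative(unsigned a, int b) {
    return static_cast<long long>(a) + b < 0;
}
```

For a = 5 and b = -10, mixed_sum returns UINT_MAX - 4 (a large positive number), while sum_is_negative returns true.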
Some tips about selecting integer type
Which integer size to use?
Of course, this depends on the possible range of values you want to store in this
variable.
– Using the largest one all the time may cause unnecessary usage of memory. This is
not good for efficiency.
– But on the other hand, allow yourself some margin to proactively defend against some
unanticipated problems.
If you want to store a constant or literal bigger than the capacity, the compiler warns
you (at compile time)
short num = 45000;
//compiler warns
However, if you assign an expression that goes beyond the capacity, then the compiler
cannot see this and cannot warn you. This is a big problem, technically called
"overflow", and we are going to see overflow today (if time permits, otherwise
in the next lecture).
Wrap-up: Number Representations
– Unsigned
: for non-negative integers
– Two’s complement
: for signed integers (zero, negative or positive)
Unless otherwise noted (as unsigned), always assume that numbers we consider
are in 2's complement representation.
– IEEE 754 floating-point
: for real numbers (float, double)
– We did not add anything for float and double on top of what you know from
CS201. At the end of these slides you can find how IEEE 754 floating-point
representation works, but we will not talk about this and you are not
responsible.
Arithmetic Overflow
Overflow
In this subsection, we will see the related topic of overflow, which basically means that
after an operation such as addition or subtraction the result is not correct due to
the fact that the result cannot be represented (does not fit) in the allocated space.
There is overflow in the following piece of code since a +b goes beyond the range
covered by c
unsigned char a = 200;
unsigned char b = 250;
unsigned char c = a + b;
– We will also give the rule about determining the value of c after the overflow.
• This may not look essential since the result is already wrong, but going into that
much detail may help us find logic errors during debugging.
We will start with small cases where the storage is 4-bits to understand the basics and
then later we will generalize to built-in types of C++
We will give the basics of overflow on addition and subtraction with two operands
– Other arithmetic operations and expressions with multiple operations may also
cause overflow. We will generalize to this case at the end
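The unsigned char example above can be traced with a minimal sketch, assuming 8-bit bytes. The function name add_uchar is made up for this illustration.

```cpp
#include <cassert>

// Adds two unsigned char values and stores the result back in an
// unsigned char, as in "unsigned char c = a + b;".
unsigned char add_uchar(unsigned char a, unsigned char b) {
    int exact = a + b;                         // operands are promoted to int
    return static_cast<unsigned char>(exact);  // only the low 8 bits are kept
}
```

For a = 200 and b = 250 the exact sum is 450 (1 1100 0010 in binary); dropping the ninth bit leaves 1100 0010, so add_uchar(200, 250) is 194.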
How Can We Detect Arithmetic Overflow?
Having carry out of MSB?
– Arithmetic overflow is not always understood by having a carry out of the MSB
– If there is a carry out of the MSB, then we say that there is a "carry overflow", but this
may or may not mean that there is "arithmetic overflow" and may or may not mean that
the result is wrong.
E.g. 7-6 = 7 + (-6)
    0111 ( 7)
+   1010 (-6)
  1 0001 ( 1)
There is a carry out of MSB; it is discarded and the result is correct!
E.g. -7-6 = -7 + (-6)
    1001 (-7)
+   1010 (-6)
  1 0011 ( 3)
There is a carry out of MSB; it is discarded and the result is wrong!
How Can We Detect Arithmetic Overflow?
Having no carry out of MSB?
– Does not always mean that there is no arithmetic overflow. We may have arithmetic
overflow even if there is no carry out of MSB
E.g. 7+1
0111 ( 7)
+ 0001 ( 1)
1000 (-8)
There is no carry out of MSB, but the result is incorrect!
E.g. 2-3 = 2 + (-3)
0010 ( 2)
+ 1101 (-3)
1111 (-1)
There is no carry out of MSB and the result is correct!
Overflow and 8 bit addition
carry    1111
         01111000     120
       + 01111000    +120
         --------
         11110000     -16    Overflow! It fits, but it’s still overflow!
Reminder:
Max 2s complement range with 8 bits: -128 to +127
01111000 = 1x64 + 1x32 + 1x16 + 1x8 = 120_10
11110000 = -1x128 + 1x64 + 1x32 + 1x16 = -16_10
Overflow – definition & detection
Overflow means that the right answer doesn’t fit!
If you think in decimal and know the ranges, it is easy to detect.
120 +120 = 240 and the range of signed 8-bit integer is -128 . . 127
240 is not in this range, so there is overflow
More formally, there is arithmetic overflow in an addition when
the signs of the two numbers are the same -AND-
the sign of the result is different from the sign of the numbers
Detecting Overflow
There can’t be an overflow when adding a positive and a negative number
– Why? Basically because the magnitude of the result is no larger than that of the
larger operand, so the result stays in range
There can’t be an overflow when signs are the same for subtraction
– Why? Same as above since arithmetically this is adding a positive to a
negative.
Overflow occurs when the value affects the sign:
1. overflow when adding two positives yields a negative
2. or, adding two negatives gives a positive
3. or, subtract a negative from a positive and get a negative (similar to 1)
4. or, subtract a positive from a negative and get a positive (similar to 2)
Of course, this rule is for signed integers;
for unsigned, we will see later
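The sign rule for addition can be expressed with bit operations on 8-bit patterns and compared against the "think in decimal" range check. Both functions below are sketches with invented names, restricted to 8-bit signed addition.

```cpp
#include <cassert>
#include <cstdint>

// Sign rule on 8-bit patterns: overflow happens exactly when both operands
// have the same sign bit and the truncated sum has a different sign bit.
bool add_overflows_by_sign(std::uint8_t a, std::uint8_t b) {
    std::uint8_t sum = static_cast<std::uint8_t>(a + b);
    return ((a ^ sum) & (b ^ sum) & 0x80u) != 0;
}

// Range check: compute the exact sum in int and see whether it fits
// in the 8-bit signed range -128 . . 127.
bool add_overflows_by_range(int a, int b) {
    int exact = a + b;
    return exact < -128 || exact > 127;
}
```

For 120 + 120 (pattern 0x78 twice) both checks report overflow; for 97 + (-120) and -51 + (-70) neither does.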
Visualizing Overflow
Let us visualize the reason of overflow on 4-bit case for signed integers (2's complement)
– Start with the first operand and
• (circularly) go up by the second operand for subtraction
• (circularly) go down by the second operand for addition
– Overflow occurs if our arithmetic operation causes us to pass the red line
between +7 and -8 (in any direction)
– Wrapping around (0 -> -1 or -1 -> 0) does not mean overflow
Number   2’s-complement
 0       0000
+1       0001
+2       0010
+3       0011
+4       0100
+5       0101
+6       0110
+7       0111
------------------------ (red line)
-8       1000
-7       1001
-6       1010
-5       1011
-4       1100
-3       1101
-2       1110
-1       1111
Visualizing Overflow for char and short
char
Number   2’s-complement
 0       0000 0000
+1       0000 0001
+2       0000 0010
...      ...
+124     0111 1100
+125     0111 1101
+126     0111 1110
+127     0111 1111
------------------------ (red line)
-128     1000 0000
-127     1000 0001
-126     1000 0010
...      ...
-4       1111 1100
-3       1111 1101
-2       1111 1110
-1       1111 1111
short
Number   2’s-complement
 0       0000 0000 0000 0000
+1       0000 0000 0000 0001
+2       0000 0000 0000 0010
...      ...
+32764   0111 1111 1111 1100
+32765   0111 1111 1111 1101
+32766   0111 1111 1111 1110
+32767   0111 1111 1111 1111
---------------------------------- (red line)
-32768   1000 0000 0000 0000
-32767   1000 0000 0000 0001
-32766   1000 0000 0000 0010
...      ...
-4       1111 1111 1111 1100
-3       1111 1111 1111 1101
-2       1111 1111 1111 1110
-1       1111 1111 1111 1111
Detecting Overflow – Complex Expressions
The rule of detecting the change of sign in the result applies to all signed
integer types of C++.
– But only for simple addition and subtraction
What about more complex operations?
No simple formula for that; apply these steps
– Simply calculate using decimal arithmetic and see if it fits in the range.
– If it does not fit, then there is overflow
– Convert the overflowed result to binary and truncate it so that it fits in n bits (where
n is the number of bits in the corresponding type)
– Interpret the truncated bit string in 2's complement logic
Examples
char d = 3*200+21;
• 621 is not between -128 . . 127, so there is overflow.
• 621_10 = 10 0110 1101_2 Discard the most significant two bits since they do not fit
in 8 bits (storage for char). Resulting bit string is 0110 1101 which is 109 (decimal)
char d = 2*200+15;
• 415 is not between -128 . . 127, so there is overflow.
• 415_10 = 1 1001 1111_2 Discard the most significant bit since it does not fit in 8
bits (storage for char). Resulting bit string is 1001 1111 which is -97 (2's compl.)
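The steps above (compute the exact value, keep the low n bits, read them as 2's complement) can be sketched arithmetically. The function name wrap_signed is hypothetical, and the sketch supports widths up to 62 bits so that the intermediate values fit in long long.

```cpp
#include <cassert>

// Returns the value obtained by truncating `value` to n bits and
// interpreting the remaining bit string as an n-bit 2's complement number.
long long wrap_signed(long long value, int n) {
    assert(n >= 1 && n <= 62);
    const long long modulus = 1LL << n;   // 2^n
    long long low = value % modulus;      // may be negative in C++
    if (low < 0) {
        low += modulus;                   // now 0 . . 2^n - 1: the low n bits
    }
    // Bit strings with the MSB set stand for negative numbers.
    return (low >= modulus / 2) ? low - modulus : low;
}
```

With this sketch, wrap_signed(621, 8) is 109 and wrap_signed(415, 8) is -97, matching the two char examples; wrap_signed(35000, 16) gives -30536.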
Detecting Overflow - unsigned integers
Let us visualize the overflow case on 4-bit unsigned integers
– We have two red lines here
• There is overflow if you go below 0 or beyond 15
• If you add 1 to 15 you end up with 10000 in binary and when you discard the
overflow bit, the resulting value becomes 0.
• Similarly subtracting 1 from 0 yields 15
– Generalization of this case to n-bit unsigned integers is trivial
• Max value is 2^n - 1 and number of bits in binary is n
– Evaluation of complex expression is similar to signed case
• Do the operation, convert to binary, discard the overflowed bits
• But this time interpret as unsigned number
Dec. Number   Binary number
---------------------------- (red line)
 0            0000
+1            0001
+2            0010
+3            0011
+4            0100
+5            0101
+6            0110
+7            0111
+8            1000
+9            1001
+10           1010
+11           1011
+12           1100
+13           1101
+14           1110
+15           1111
---------------------------- (red line)
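The unsigned case can be sketched the same way: compute the exact value, keep the low n bits, and read them as an unsigned number. The function name wrap_unsigned is invented for this example and the width is limited to 62 bits.

```cpp
#include <cassert>

// Returns the value obtained by truncating `value` to n bits and
// interpreting the remaining bit string as an n-bit unsigned number.
long long wrap_unsigned(long long value, int n) {
    assert(n >= 1 && n <= 62);
    const long long modulus = 1LL << n;   // 2^n
    long long low = value % modulus;      // may be negative in C++
    if (low < 0) {
        low += modulus;                   // bring into 0 . . 2^n - 1
    }
    return low;
}
```

For 4 bits, wrap_unsigned(15 + 1, 4) is 0 and wrap_unsigned(0 - 1, 4) is 15, the two wrap-arounds described above.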
Detecting Overflow in Programs
So far we have not talked much about automatic ways of detecting overflows
in programs
– Only detecting the change in the sign bit for addition and subtraction of signed
integers
Unfortunately, there is no other silver bullet for detecting an overflow once it has
occurred
Better if overflows are avoided
– To do so, you may use simple expressions and check the values of the
operands to see if they are small enough not to cause overflow
– For example suppose a and b are two unsigned ints, a*b does not overflow if
a is 0 or b <= UINT_MAX / a
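The check before multiplying can be packaged as a small helper. This is a sketch with a hypothetical name; it uses the exact condition (a is 0, or b <= UINT_MAX / a with integer division), of which the stricter test b < UINT_MAX / a is a safe special case.

```cpp
#include <cassert>
#include <climits>

// True when a * b would not fit in an unsigned int. The test is done
// with a division, so the multiplication itself is never performed.
bool mul_would_overflow(unsigned a, unsigned b) {
    return a != 0 && b > UINT_MAX / a;
}
```

For example, mul_would_overflow(UINT_MAX, 2) is true, while mul_would_overflow(UINT_MAX, 1) and mul_would_overflow(0, UINT_MAX) are false.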
Floating Point Representation
SKIP – Not Covered in CS204
You may read if you are curious
Floating Point (a brief look)
We need a way to represent:
– numbers with fractions, e.g., 3.1416
– very small numbers, e.g., .000000001
– very large numbers, e.g., 3.1 x 10^20
Solution: A floating (decimal) point representation
IEEE 754 floating point representation is the standard:
value ≡ +/- mantissa x 2^exponent, stored as three fields: sign, mantissa and exponent
– single precision: 1 bit sign, 23 bit significand (mantissa), 8 bit exponent
– more bits for significand gives more accuracy
– more bits for exponent increases range
Range approximately: 10^-44 to 10^38
IEEE Floating Point Std. - Details
The Mantissa
The mantissa, also known as the significand, represents the precision bits of
the number.
To find out the value of the implicit leading bit, consider that any number can
be expressed in scientific notation in many different ways. For example,
the number five can be represented as any of these:
– 5.00 × 100
– 0.05 × 102
– 5000 × 10-3
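In order to maximize the quantity of representable numbers, floating-point
numbers are typically stored in normalized form. This basically puts the
radix point after the first non-zero digit. In normalized form, five is
represented as $5.0 \times 10^{0}$. In binary, the first non-zero digit is always 1,
so it does not need to be stored: this is the implicit leading bit.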
Floating Point – what floats?
For simplicity, let’s use a decimal representation and assume we have 1 digit for sign,
8 digits for the mantissa and 3 digits for the exponent:
sign (1 digit) | mantissa (8 digits) | exponent (3 digits)
We will “illustrate” the format for the number 0.000000000023
0.000000000023 = $.23000000 \times 10^{-10}$
So it will be stored as
sign: +
mantissa: .23000000
exponent: -10
The actual IEEE Floating point representation follows this principle, but differs from this
in details:
- normalization (the radix point comes after the first nonzero digit)
- binary instead of decimal
- exponent (not sign/magnitude but biased)
Bias – why?
Since we want to represent both positive and negative exponents, e.g. $10^{11}$
and $10^{-11}$, we can do two things:
1. Reserve a separate sign bit for the exponent
2. Use only positive exponents, together with a bias
– The bias (e.g. 127) is subtracted from whatever is stored in the exponent, to find
the real exponent
Stored exponent = 0: real exponent = 0 – 127 = -127
Stored exponent = 227: real exponent = 227 – 127 = 100
Bias of the Exponent
The Exponent
The exponent field needs to represent both positive and negative exponents.
To do this, a bias is added to the actual exponent in order to get the
stored exponent.
– For IEEE single-precision floats, this value is 127.
Thus,
– if the real exponent is zero, 127 is stored in the exponent field.
– if 200 is stored in the exponent field, it actually indicates a real exponent of
(200-127), or 73.
Real exponents of -127 (stored as all 0s) and +128 (stored as all 1s) are reserved
for special numbers (all 0s: zero and denormalized numbers; all 1s: Infinity and NaN)
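The two directions of the bias can be sketched as a pair of small functions. The names to_stored and to_real and the constant BIAS are chosen for this illustration only; the range check in to_stored assumes normalized single-precision numbers, whose real exponents run from -126 to 127.

```cpp
#include <cassert>

const int BIAS = 127;   // bias for IEEE single precision, as stated above

// Hypothetical helper: real exponent -> value stored in the 8-bit field.
int to_stored(int real_exponent) {
    // Stored values 0 and 255 are reserved, so normal numbers use -126..127.
    assert(real_exponent >= -126 && real_exponent <= 127);
    return real_exponent + BIAS;
}

// Hypothetical helper: value stored in the 8-bit field -> real exponent.
int to_real(int stored_exponent) {
    assert(stored_exponent >= 0 && stored_exponent <= 255);
    return stored_exponent - BIAS;
}
```

With these, to_stored(0) gives 127 and to_real(200) gives 73, matching the two cases listed above.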
IEEE 754 floating-point standard: summary
Leading “1” bit of significand is implicit
Exponent is “biased” to make sorting easier
– all 0s is smallest exponent, all 1s is largest
– bias of 127 for single precision (note: the bias is added when storing and
subtracted when converting to decimal)
– Decimal equivalent: $(-1)^{\text{sign}} \times (1 + \text{significand}) \times 2^{\text{exponent} - \text{bias}}$
Example:
– decimal: -.75 = - (0.5 + 0.25)
– binary: -.11
– canonical form: $-1.1 \times 2^{-1}$
(note: shifting the radix point by k is the same as multiplying/dividing by $\text{radix}^{k}$)
– stored exponent = -1 + 127 = 126 = 01111110
– Resulting IEEE single precision representation:
sign: 1
exponent: 01111110
mantissa: 10000000000000000000000
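One way to check an example like this on a real machine is to copy the bits of a float into an integer and mask out the three fields. This is a sketch that assumes 32-bit IEEE 754 floats (enforced by the static_assert); the struct FloatFields and the functions split and value_of are hypothetical names, not part of any library.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(std::numeric_limits<float>::is_iec559 && sizeof(float) == 4,
              "this sketch assumes 32-bit IEEE 754 floats");

// Hypothetical struct holding the three fields of a single-precision float.
struct FloatFields {
    std::uint32_t sign;      // 1 bit
    std::uint32_t exponent;  // 8 bits, stored (biased) value
    std::uint32_t mantissa;  // 23 bits, without the implicit leading 1
};

// Copies the float's bytes into an integer and masks out each field.
FloatFields split(float x) {
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);   // well-defined way to read the bit pattern
    FloatFields f;
    f.sign = bits >> 31;
    f.exponent = (bits >> 23) & 0xFFu;
    f.mantissa = bits & 0x7FFFFFu;
    return f;
}

// Rebuilds the value of a normalized number from its fields using
// (-1)^sign * (1 + significand) * 2^(exponent - bias).
double value_of(const FloatFields& f) {
    assert(f.exponent != 0 && f.exponent != 255);   // normalized numbers only
    double significand = f.mantissa / 8388608.0;    // mantissa / 2^23
    double magnitude = std::ldexp(1.0 + significand,
                                  static_cast<int>(f.exponent) - 127);
    return f.sign ? -magnitude : magnitude;
}
```

For -0.75f, split should report sign 1, stored exponent 126 and a mantissa with only its top bit set (0x400000), and value_of turns those fields back into -0.75.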
A more complex example
Let us encode the decimal number −118.625 using the IEEE 754 system.
– First we need to get the sign, the exponent and the fraction. Because it is a negative
number, the sign is "1".
– Now, we write the number (without the sign; i.e. unsigned, no two's complement) using
binary notation. The result is 1110110.101 (notice how we represent .625: .101 = 1/2 + 1/8)
– Next, let's move the radix point left, leaving only a 1 at its left:
1110110.101 = $1.110110101 \times 2^{6}$. This is the normalized floating point number. The
mantissa is the part at the right of the radix point, filled with 0 on the right until we get
all 23 bits. That is 11011010100000000000000.
– The exponent is 6, but we need to bias it and convert it to binary (so the most negative
exponent is stored as 0, and all exponents are non-negative binary numbers). For the
32-bit IEEE 754 format, the bias is 127 and so the stored exponent is 6 + 127 = 133. In
binary, this is written as 10000101.
Putting them all together:
sign: 1
exponent: 10000101
mantissa: 11011010100000000000000
This example is from Wikipedia.
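The encoding steps above can be sketched as a function that packs a sign, a real exponent and 23 fraction bits into one 32-bit pattern. The name pack and its parameters are hypothetical, and the sketch assumes 32-bit IEEE 754 floats and normalized numbers only.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(std::numeric_limits<float>::is_iec559 && sizeof(float) == 4,
              "this sketch assumes 32-bit IEEE 754 floats");

// Hypothetical helper: builds a normalized float from its three parts.
//   sign          0 for positive, 1 for negative
//   real_exponent the unbiased exponent (6 in the example above)
//   fraction      the 23 bits to the right of the radix point
float pack(std::uint32_t sign, int real_exponent, std::uint32_t fraction) {
    assert(sign <= 1u);
    assert(real_exponent >= -126 && real_exponent <= 127);
    assert(fraction < (1u << 23));
    std::uint32_t stored = static_cast<std::uint32_t>(real_exponent + 127);  // add the bias
    std::uint32_t bits = (sign << 31) | (stored << 23) | fraction;
    float x;
    std::memcpy(&x, &bits, sizeof x);   // reinterpret the bit pattern as a float
    return x;
}
```

The fraction 11011010100000000000000 is 0x6D4000 in hexadecimal, so pack(1, 6, 0x6D4000) builds the pattern 0xC2ED4000, which is -118.625.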
IEEE Floating Point: Ranges
Explanation for minimum positive (just a sign change for negative):
sign: +
mantissa (23 bits): .00000000....0 = 0
exponent (8 bits): 00000001 = 1
value = $+(0 + 1) \times 2^{1-127} = +1.0 \times 2^{-126}$
Note1: Exponent “00000000” is reserved for special numbers, so min is “00000001”
Note2: Approx. conversion between powers of 2 and powers of 10:
Ex. $2^{-149} \approx 10^{-44.85}$ since $2^{3.3} \approx 10$ and $149/3.3 \approx 45$
IEEE Floating Point Ranges
Explanation for maximum positive (just change sign for negative):
sign: +
mantissa (23 bits): .111.....1 = $1 - 2^{-23}$
exponent (8 bits): 11111110 = 254
value = $+(1 - 2^{-23} + 1) \times 2^{254-127} = +(2 - 2^{-23}) \times 2^{127} \approx 2^{128} \approx 3.4 \times 10^{38}$
Note1: Since it represents the part after the radix point, “.111111…1” = $1 - 2^{-23}$, just as
“.11” = $1 - 2^{-2}$
Note2: 11111111 as exponent is reserved for special numbers, so max is 11111110
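The two hand-derived limits can be compared with what the C++ standard library reports. This is a sketch assuming the platform's float is IEEE 754 single precision (checked by the static_assert); the function names are made up for the illustration, while std::ldexp and std::numeric_limits are standard.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

static_assert(std::numeric_limits<float>::is_iec559,
              "this sketch assumes IEEE 754 floats");

// Smallest positive normalized float: (0 + 1) * 2^(1 - 127).
float smallest_normal() {
    return std::ldexp(1.0f, 1 - 127);
}

// Largest finite float: (1 - 2^-23 + 1) * 2^(254 - 127).
float largest_finite() {
    float significand = 2.0f - std::ldexp(1.0f, -23);   // exactly representable in 24 bits
    return std::ldexp(significand, 254 - 127);
}

// Compares the hand-derived values with the library's limits.
void check_limits() {
    assert(smallest_normal() == std::numeric_limits<float>::min());
    assert(largest_finite() == std::numeric_limits<float>::max());
}
```

Note that std::numeric_limits<float>::min() is the smallest positive normalized value, not the most negative float.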
Summary
Computer arithmetic is constrained by limited precision
Bit patterns have no inherent meaning but standards do exist
– two’s complement
– IEEE 754 floating point
Computer instructions determine “meaning” of the bit patterns
http://babbage.cs.qc.edu/courses/cs341/IEEE-754.html
Floating Point Complexities
• In addition to overflow we can have “underflow”
• A nonzero number whose magnitude is smaller than what is representable (e.g. $< 2^{-126}$ for normalized single precision)
• Accuracy can be a big problem
• IEEE 754 keeps two extra bits, guard and round
• four rounding modes
• positive divided by zero yields “infinity”
• zero divided by zero yields “not a number”
• other complexities…
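A few of these cases can be observed directly. This is a sketch that assumes an IEEE 754 platform with default compiler settings (no fast-math style options); the C++ standard itself does not promise these results for division by zero, which is why the static_assert is there. The function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

static_assert(std::numeric_limits<float>::is_iec559,
              "this sketch assumes IEEE 754 floats");

// Plain division, kept in a function so the operands are ordinary arguments.
float divide(float a, float b) {
    return a / b;
}

// Checks the special results listed above.
void check_special_values() {
    float infinity = divide(1.0f, 0.0f);      // positive divided by zero
    float not_a_number = divide(0.0f, 0.0f);  // zero divided by zero
    assert(std::isinf(infinity) && infinity > 0.0f);
    assert(std::isnan(not_a_number));
    assert(not_a_number != not_a_number);     // NaN is unequal even to itself

    // Underflow: halving 2^-126 leaves the normalized range. The result is
    // either a denormalized number or, on flush-to-zero hardware, zero.
    float tiny = std::numeric_limits<float>::min();
    float below = tiny / 2.0f;
    assert(below < tiny);
    assert(std::fpclassify(below) == FP_SUBNORMAL || below == 0.0f);
}
```

Calling check_special_values() from main should pass silently on such a platform.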