Working with General
Computer Data
Portable Bit Map (PBM)
• ASCII PBM image
P1
# This is a 16 column x 3 row PBM ASCII
# image
16 3
001010100011010000111110
010010000101001001011100
• RAW PBM image
P4
# This is a 16 column x 3 row PBM RAW
# image
16 3
*4>HR\
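The two listings above hold the same 48 pixels. As a minimal sketch (in Python, which these slides do not use; the function name pack_pbm_bits is hypothetical), packing the ASCII bits eight at a time, most significant bit first, appears to yield the six characters of the RAW image. It assumes the image width is a multiple of 8, as it is here, so no row padding is needed.

```python
def pack_pbm_bits(ascii_rows):
    """Pack rows of '0'/'1' characters into bytes, most significant bit first."""
    bits = "".join(ascii_rows)
    # Each group of eight bits becomes one byte of the raw raster.
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


rows = ["001010100011010000111110",
        "010010000101001001011100"]
raw = pack_pbm_bits(rows)
print(raw)  # prints b'*4>HR\\'
```

Each printed character is simply the ASCII character whose code equals the packed byte value (42, 52, 62, 72, 82, 92).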
Representation of Information
in a Computer
• All information in a computer is stored as bits
• A bit (binary digit) is a base-2 digit.
–It can only be either a 1 (one) or 0 (zero)
• Information - data with structure
–Data
• Sequence of 1’s and 0’s
• How they are interpreted generates information
(or garbage)
Information Types
• Data stored in a computer can represent
–Instructions
• Executable Programs
–Information
• Numbers - integers and real values
• Text - ASCII, multinational characters
• Others - imagery, sound, etc.
Portable Grey Map (PGM)
• ASCII PGM image
P2
# This is a 3 column x 2 row PGM ASCII image
# With a possible maximum grey value of 255
3 2
255
42 52 62
72 82 92
• RAW PGM image
P5
# This is a 3 column x 2 row PGM Raw
# image
3 2
255
*4>HR\
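The ASCII and RAW grey maps above differ only in how the pixel values are written. The following is a rough sketch, assuming a maximum grey value below 256 so that each pixel fits in one byte; the function name ascii_pgm_to_raw is hypothetical and the parser handles only the simple layout shown on this slide.

```python
def ascii_pgm_to_raw(text):
    """Convert an ASCII (P2) PGM with maxval < 256 to raw (P5) bytes."""
    # Drop comments, then split the remaining text on whitespace.
    tokens = []
    for line in text.splitlines():
        tokens.extend(line.split("#", 1)[0].split())
    magic = tokens[0]
    width, height, maxval = int(tokens[1]), int(tokens[2]), int(tokens[3])
    if magic != "P2" or maxval > 255:
        raise ValueError("expected a P2 image with one byte per pixel")
    # One byte per pixel, in the same order as the ASCII values.
    pixels = bytes(int(t) for t in tokens[4:4 + width * height])
    header = f"P5\n{width} {height}\n{maxval}\n".encode("ascii")
    return header + pixels


p2 = """P2
# This is a 3 column x 2 row PGM ASCII image
3 2
255
42 52 62
72 82 92
"""
p5 = ascii_pgm_to_raw(p2)
```

The six pixel bytes at the end of p5 are the same characters shown in the RAW listing.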
Basic Computer Data
Structures
• Bits in a computer are usually grouped
into the following units
4 bits == 1 nibble
8 bits == 1 byte or 1 ( char ) - a single character
16 bits == 2 bytes or 1 ( short int, int ) - integer value
32 bits == 4 bytes or 1 ( long int, int ) - integer value
or 1( float) - real value
64 bits == 8 bytes or 1 (double) - real value
Information Contained by
Basic Data Structures
• Text is typically represented using the
ASCII character set where each character
is represented as a byte value.
• Numbers - integers or real values
– integers { -5, 0, 125, -32767 }
– real { 3.14159, -2.71828, -1.0e-20, 10.000 }
Line Delimiters on Different
Systems
• On UNIX systems, lines are delimited by a
–linefeed (also designated as nl)
–ASCII value of 012 (base 8) or 10 (base 10)
• On classic Macs, lines are delimited by a
–carriage return (also designated as cr)
–ASCII value of 015 (base 8) or 13 (base 10)
• Other systems, such as PCs running DOS or Windows, use a
carriage return-line feed pair to delimit lines
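The conventions above differ only in which byte values end a line. As an illustrative sketch (the function name detect_line_ending is hypothetical, and the check assumes a file uses one convention throughout), the convention can be guessed by looking for those bytes:

```python
def detect_line_ending(data):
    """Name the line-delimiter convention used in a bytes object."""
    if b"\r\n" in data:      # carriage return followed by line feed (015 012)
        return "CR-LF"
    if b"\r" in data:        # carriage return only (015)
        return "CR"
    if b"\n" in data:        # line feed only (012)
        return "LF"
    return "none"


unix_text = b"first\nsecond\n"
cr_text = b"first\rsecond\r"
crlf_text = b"first\r\nsecond\r\n"
```

The CR-LF test has to come first, because a CR-LF file also contains both of the single-byte delimiters.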
ASCII Character Set
Text is typically represented using the ASCII character set
where each character is represented as a byte value.
The ASCII decimal character set is:
   0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
 nul soh stx etx eot enq ack bel  bs  ht  nl  vt  np  cr  so  si
  16  17  18  19  20  21  22  23  24  25  26  27  28  29  30  31
 dle dc1 dc2 dc3 dc4 nak syn etb can  em sub esc  fs  gs  rs  us
  32  33  34  35  36  37  38  39  40  41  42  43  44  45  46  47
  sp   !   "   #   $   %   &   '   (   )   *   +   ,   -   .   /
  48  49  50  51  52  53  54  55  56  57  58  59  60  61  62  63
   0   1   2   3   4   5   6   7   8   9   :   ;   <   =   >   ?
  64  65  66  67  68  69  70  71  72  73  74  75  76  77  78  79
   @   A   B   C   D   E   F   G   H   I   J   K   L   M   N   O
  80  81  82  83  84  85  86  87  88  89  90  91  92  93  94  95
   P   Q   R   S   T   U   V   W   X   Y   Z   [   \   ]   ^   _
  96  97  98  99 100 101 102 103 104 105 106 107 108 109 110 111
   `   a   b   c   d   e   f   g   h   i   j   k   l   m   n   o
 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127
   p   q   r   s   t   u   v   w   x   y   z   {   |   }   ~ del
Common Cross Platform
File Problem
• Text file that uses carriage returns as line delimiters, transferred to UNIX
–vi complains that the “line is too long”
• Solution: (A Stupid UNIX Trick)
% cat old_file | tr "\015" "\012" > new_file
–Command translates all carriage returns to line
feeds.
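For readers without tr at hand, the same byte-for-byte translation can be sketched in Python. This is only an illustration; the name cr_to_lf is hypothetical and the sketch works on data already read into memory.

```python
# Translation table mapping carriage return (015 octal) to line feed (012 octal).
CR_TO_LF = bytes.maketrans(b"\r", b"\n")


def cr_to_lf(data):
    """Replace every carriage return byte with a line feed byte."""
    return data.translate(CR_TO_LF)


converted = cr_to_lf(b"one\rtwo\rthree\r")
```

Like the tr command, this replaces one byte with one byte, so a file that ends lines with a carriage return-line feed pair would come out with two line feeds per line.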
The quick brown
84 104 101 32 113 117 105 99 107 32 98 114 111 119 110 10
fox jumps over
102 111 120 32 106 117 109 112 115 32 111 118 101 114 10
the lazy dog.
116 104 101 32 108 97 122 121 32 100 111 103 46 10
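The byte values listed under each line of text can be reproduced with a few lines of Python. This is a small sketch rather than part of the original slides; it assumes UNIX-style line feeds, matching the trailing 10 on each line.

```python
lines = ["The quick brown\n", "fox jumps over\n", "the lazy dog.\n"]

# Encoding as ASCII gives one byte per character; list() exposes the values.
codes = [list(line.encode("ascii")) for line in lines]

for line, values in zip(lines, codes):
    print(line.rstrip("\n"))
    print(" ".join(str(v) for v in values))
```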
Other Cross Platform
File Problem
• Columns of numbers separated by spaces
–When imported into a spreadsheet, each line of data is placed in one cell.
• Solution: (Another Stupid UNIX Trick)
% cat old_file | tr -s " " "\011" > new_file
–Command translates each run of spaces into a single tab
character.
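The effect of the command can be approximated in Python with a regular expression. This is a sketch under the assumption that the columns are separated only by spaces; the function name spaces_to_tabs is hypothetical.

```python
import re


def spaces_to_tabs(text):
    """Replace each run of one or more spaces with a single tab (011 octal)."""
    return re.sub(r" +", "\t", text)


table = "1   2.5  10\n2   3.5  20\n"
tabbed = spaces_to_tabs(table)
```

A spreadsheet that splits on tabs should then place each number in its own cell.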
Numbering Systems
DECIMAL    BINARY    OCTAL     HEXADECIMAL
(Base 10)  (Base 2)  (Base 8)  (Base 16)
0          0         0         0
1          1         1         1
2          10        2         2
3          11        3         3
4          100       4         4
5          101       5         5
6          110       6         6
7          111       7         7
8          1000      10        8
9          1001      11        9
10         1010      12        A
11         1011      13        B
12         1100      14        C
13         1101      15        D
14         1110      16        E
15         1111      17        F
Number Base Conversion
using bc
• To convert decimal to binary
obase=2
42
101010
• To convert octal to hexadecimal
obase=16
ibase=8
52
2A
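The same two conversions can be cross-checked outside bc. A minimal Python sketch, using only built-in functions:

```python
# Decimal 42 written in binary (the obase=2 example).
decimal_to_binary = format(42, "b")

# Octal 52 read as a number, then written in hexadecimal (ibase=8, obase=16).
octal_to_hex = format(int("52", 8), "X")

print(decimal_to_binary, octal_to_hex)  # prints 101010 2A
```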
Converting Between Bases
• 42 base 10 converted to base 10,
42_10 = [4 × 10^1 = 40] + [2 × 10^0 = 2] = 42_10
• 42 base 10 converted to base 8,
42_10 = [5 × 8^1 = 40] + [2 × 8^0 = 2] = 52_8
Negative Number
Representation
DECIMAL    DECIMAL    BINARY
(Base 10)  (4-digit)  (4-digit)
-8         9992       1000
-7         9993       1001
-6         9994       1010
-5         9995       1011
-4         9996       1100
-3         9997       1101
-2         9998       1110
-1         9999       1111
0          0          0000
1          1          0001
2          2          0010
3          3          0011
4          4          0100
5          5          0101
6          6          0110
7          7          0111
Converting Between Bases
• 42 base 10 converted to base 2,
42_10 = [1 × 2^5 = 32] + [0 × 2^4 = 0] + [1 × 2^3 = 8] +
[0 × 2^2 = 0] + [1 × 2^1 = 2] + [0 × 2^0 = 0] = 101010_2
• 42 base 10 converted to base 16,
42_10 = [2 × 16^1 = 32] + [a × 16^0 = 10] = 2a_16
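The bracketed expansions above sum digit × base^position. One possible way to go in both directions is sketched below; the names to_base and from_digits are hypothetical, and the repeated-division method in to_base is a standard technique rather than something the slides spell out.

```python
DIGITS = "0123456789abcdef"


def to_base(value, base):
    """Return the digits of a non-negative integer in the given base (2 to 16)."""
    if value == 0:
        return "0"
    digits = []
    while value > 0:
        # The remainder is the next digit, least significant first.
        value, remainder = divmod(value, base)
        digits.append(DIGITS[remainder])
    return "".join(reversed(digits))


def from_digits(digits, base):
    """Sum digit * base**position, as in the bracketed expansions."""
    return sum(DIGITS.index(d) * base ** i
               for i, d in enumerate(reversed(digits)))


print(to_base(42, 2), to_base(42, 8), to_base(42, 16))  # prints 101010 52 2a
```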
Converting a Negative Number into
Binary using 2’s Complement
• To convert -42 into a negative binary value
–Determine the number of bits to represent (e.g.
16 bits)
–Convert the absolute value (42) into binary
0000000000101010
–Take the 1’s complement of the number
1111111111010101
–Add 1 to the number
1111111111010110
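The three steps above can be followed literally in code. This is a minimal sketch; the function name twos_complement is hypothetical and the default of 16 bits matches the example.

```python
def twos_complement(value, bits=16):
    """Return the two's complement bit string of a negative integer."""
    mask = (1 << bits) - 1          # 16 ones for a 16-bit value
    magnitude = abs(value)          # step 1: binary of the absolute value
    ones = magnitude ^ mask         # step 2: 1's complement flips every bit
    result = (ones + 1) & mask      # step 3: add 1, keep only `bits` bits
    return format(result, f"0{bits}b")


print(twos_complement(-42))  # prints 1111111111010110
```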
Floating Point Representation
• Sets of bits are separated into
–mantissa
–exponent field
Another Example of
Unformatted Output (e.pro)
pro e
openw, lun, 'e.dat', /get_lun
e = exp(1.0)
writeu,lun, e
free_lun, lun
end
Octal Interpretation of e.dat
(big endian)
% od -b e.dat
0000000 100 055 370 124
0000004
Binary Interpretation of e.dat
(big endian)
% bc -l
obase=2
ibase=8
100
1000000
55
101101
370
11111000
124
1010100
Binary Interpretation of e.dat
(big endian)
01000000 00101101 11111000 01010100
(SINGLE PRECISION) IEEE (ANSI/IEEE Std 754-1985)
+-+--------+-----------------------+
|S|  exp   |       fraction        |
+-+--------+-----------------------+
 ^                                ^
 Bit31                        Bit0
A formula that gives the value of this
float is:
Value=(-1)**S X 1.fraction X 2**(exp-127)
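The formula can be checked against the four bytes that od -b reports for e.dat. The sketch below is in Python rather than IDL and is only illustrative: it applies the formula by hand (valid for normalized numbers only) and then compares the result with the standard library's own big-endian single-precision decoder.

```python
import struct

# The four bytes of e.dat, written in octal as od -b shows them.
raw = bytes([0o100, 0o055, 0o370, 0o124])

bits = int.from_bytes(raw, "big")
sign = bits >> 31                 # bit 31
exponent = (bits >> 23) & 0xFF    # bits 30:23
fraction = bits & 0x7FFFFF        # bits 22:0

# Value = (-1)**S * 1.fraction * 2**(exp-127)
value = (-1) ** sign * (1 + fraction / 2 ** 23) * 2.0 ** (exponent - 127)

# Cross-check with the library decoder for big-endian single precision.
(check,) = struct.unpack(">f", raw)
```

Both value and check come out as the single-precision approximation of e, about 2.7182817.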
Binary Interpretation of e.dat
(big endian)
01000000 00101101 11111000 01010100
(SINGLE PRECISION) IEEE (ANSI/IEEE Std 754-1985)
Value=(-1)**S X 1.fraction X 2**(exp-127)
[Bit 31] S = 0
[Bits 30:23] exp = 1000000 0 = 128 (decimal)
[Bits 22:0] fraction = 0101101 11111000 01010100
2**(128-127) = 2
Floating Point Interpretation of
e.dat (big endian)
% od -f e.dat
0000000 2.7182817e+00
0000004
% bc -l
ibase=2
1.01011011111100001010100
1.35914087295532226562500
ibase=1010
2*1.35914087295532226562500
2.71828174591064453125000
How Does This All Relate
to Computers and Programming
• Computer architectures are generally described using the
numbering systems we have mentioned.
• We can determine the limitations of systems by the
ranges described by these numbers.
• In hardware, these numbers determine the amount of memory a
particular system can address and the range of values a particular
memory location can hold.
• In programming, these numbers will tell you what data type you can
work with.
• In images, these numbers can tell you the limits of your dimensions
and limits of the number of “colors” you can have.
1-bit Address Line and
1-bit Data Line
Computer Architecture
[Diagram: CPU and MEMORY connected by an ADDRESS BUS and a DATA BUS; memory locations 0 and 1]
2-bit Address Line and
1-bit Data Line
Computer Architecture
[Diagram: CPU and MEMORY connected by an ADDRESS BUS and a DATA BUS; memory locations 0 to 3]
1-bit Address Line and
2-bit Data Line
Computer Architecture
[Diagram: CPU and MEMORY connected by an ADDRESS BUS and a DATA BUS; memory locations 0 and 1]
2-bit Address Line and
2-bit Data Line
Computer Architecture
[Diagram: CPU and MEMORY connected by an ADDRESS BUS and a DATA BUS; memory locations 0 to 3]
In General
• If you have n bits of data, you can
have 2^n possible states.
• Unsigned values range
0 to (2^n)-1
• Signed values range
-2^(n-1) to 2^(n-1)-1
• Memory locations
0 to (2^n)-1
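These ranges are easy to tabulate for any word size. The sketch below assumes two's complement for the signed case, so the signed minimum is negative; the function name ranges is hypothetical.

```python
def ranges(n):
    """Return the state count and value ranges for n bits."""
    states = 2 ** n
    return {
        "states": states,
        "unsigned": (0, states - 1),
        "signed": (-(2 ** (n - 1)), 2 ** (n - 1) - 1),  # two's complement
    }


for n in (1, 2, 8, 16):
    print(n, ranges(n))
```

For n = 8 this gives 256 states, unsigned values 0 to 255 and signed values -128 to 127; the same 0 to 255 span applies to memory locations on an 8-bit address line.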
8-bit Address Line and 8-bit Data
Line Computer Architecture
For Images
• Bits per picture element (pixels)
–determines the number of “colors” it can represent
• Bilevel images
–1 bit per pixel or 2 “colors”
• Grey level images
– n bits per pixel or 2^n colors
• Color images
–3n bits per pixel or 2^(3n) colors
• Multispectral images
– m bands and n bits per pixel or 2^(mn) colors
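The colour counts above all follow one rule: two raised to the total number of bits per pixel. A short sketch, with the hypothetical helper color_count:

```python
def color_count(bits_per_band, bands=1):
    """Number of distinct pixel values for `bands` bands of n bits each."""
    return 2 ** (bits_per_band * bands)


bilevel = color_count(1)          # 1 bit per pixel
grey = color_count(8)             # 8 bits per pixel
color = color_count(8, bands=3)   # three 8-bit bands, 24 bits per pixel
```

With 8 bits per band, a grey level image has 256 possible values and a three-band color image has 16777216.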
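[Diagram for the 8-bit address line and 8-bit data line architecture: CPU and MEMORY connected by an ADDRESS BUS and a DATA BUS; memory locations 0 to 255]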
Caveat on Multi-Byte Data
• Data structures of more than a single byte
–(e.g. integer value greater than 255)
–you must know the “endian” of the data
• Endian
–Represents how a machine interprets the
significance of a set of bytes.
–big-endian machines
–little endian machines
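The two byte orders can be seen directly by packing the same integer both ways. This is an illustrative Python sketch; the helper swap_bytes is hypothetical and simply reverses each fixed-width item.

```python
import struct
import sys

big = struct.pack(">H", 512)      # most significant byte first
little = struct.pack("<H", 512)   # least significant byte first


def swap_bytes(data, width):
    """Reverse the byte order of every `width`-byte item in data."""
    if len(data) % width:
        raise ValueError("data length must be a multiple of the item width")
    return b"".join(data[i:i + width][::-1]
                    for i in range(0, len(data), width))


native = sys.byteorder            # 'little' or 'big' on the current machine
print(big, little, native)
```

Because swap_bytes takes the item width as a parameter, the same sketch covers 2-byte integers and 4-byte or 8-byte values.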
Caveat on Multi-Byte Data
• Big Endian Machines
–(Suns, Motorolas)
–Represent the integer value 512
internally as two bytes in the following
order.
MSB LSB
00000010 00000000
Caveat on Multi-Byte Data
• Little Endian Machines
–(DECs, Intels)
–Represent the integer value 512 internally
as two bytes in the following order
LSB MSB
00000000 00000010
Caveat on Multi-Byte Data
• Problem when transferring raw data
–images
–Unformatted numbers
• Transposing multibyte data
–dd utility in UNIX
• (does not work for data spanning more than
two bytes)
–SWAP_ENDIAN() function in IDL
• which works on data more than two bytes long.
Working with Data in IDL
Working with IDL Variables
• Variable names must start with a letter
–can include letters, digits, underscore or
dollar characters.
• Variables are not case sensitive, i.e.,
–Celsius is equivalent to celsius
• Variables in IDL have two important
attributes
–Data type
–Data structure
IDL Variable Data Types
• Undefined
• Integer Values
–Byte, Integer, Long Integer
• Floating Point Values
–Floating Point, Double Precision
–Single Precision Complex, Double
Precision Complex
• String
IDL Variable Data Structures
• Scalar (single value)
• Vector (one-dimensional array)
• Array (up to eight dimensions)
• Structure
–(combination of any mixture of data)
–Very Important in
• Simplifying parameter passing
• Understanding Object-Oriented Programming
Concepts
Initializing Array Variables
• byte_array = bytarr( 256, 256, 3 )
• integer_array = intarr( 256, 256 )
• long_array = lonarr( 256, 256 )
• float_array = fltarr( 256, 256 )
• double_array = dblarr( 256, 256 )
• complex_array = complexarr( 256, 256 )
• dcomplex_array = dcomplexarr( 256, 256 )
• string_array = strarr(80)
Initializing Array Variables
• Initializing Array with each element set
to its subscript
–bindgen, indgen, lindgen, findgen,
dindgen, cindgen, dcindgen, sindgen
–Only real parts of complex numbers are
set to the subscripts
–An array of strings containing the numbers
is created by sindgen
Initializing Scalar Variables
byte_variable = 0B ; Max = 255
integer_variable = 0 ; Max = 32767
long_integer = 0L
float_variable = 0.0
double_variable = 0.0D
complex_variable = Complex(0.0, 0.0)
dcomplex_variable = DComplex(0.0, 0.0)
string_variable = ''
Initializing Array Variables
• Make_array function is a general way
of creating and initializing arrays
array=make_array(10,12,Value=1,/Int)
• Replicate function
–allows initialization of all elements to a
given value (useful in structures)
initial_value = 1
array=replicate(initial_value, 10,12)
Structures in IDL
• Allows more abstract data types to be
created from other basic data types
• Example
–Date data structure
IDL> a={date, month:0B, day:0B, year:0B}
IDL> print,a
{ 0 0 0}
Structures in IDL
• Date Example (using a Named Structure)
IDL> a.month=12
IDL> a.day=25
IDL> a.year=96
IDL> print,a
{ 12 25 96}
IDL> print,a.month
12
IDL> print,a.day
25
IDL> print,a.year
96
Structures in IDL
• Anonymous Structures
IDL> d={month:1,day:1,year:97}
IDL> print,d
{ 1 1 97}
IDL> e=d
IDL> print,e
{ 1 1 97}
Structures in IDL
• Anonymous Structures
IDL> e={month:'January',day:d.day,year:d.year}
IDL> print,e
{ January 1 97}
IDL> e.month='December'
IDL> print,e
{ December 1 97}
Structures in IDL
• Date Structure Example
IDL> c={date}
IDL> print,c
{ 0 0 0}
IDL> help,c
C               STRUCT    = -> DATE Array(1)
Structures in IDL
• Determining Variable Structure
IDL> help,e,/structure
** Structure <761058>, 3 tags, length=12, refs=1:
   MONTH           STRING    'December'
   DAY             INT              1
   YEAR            INT             97
Structures in IDL
• Determining Structure Characteristics
IDL> print,n_tags(e)
3
IDL> print,tag_names(e)
MONTH DAY YEAR
Common Problems with Using
IDL Data Types
• Users usually forget the minimum and
maximum values that certain data types
can hold.
• Example
• The limits of a data type are sometimes
caught by IDL
IDL> a=256B
a=256
^
% Byte constant must be less than 256.
IDL> a=255B
IDL> print,a
255
Common Problems with Using
IDL Data Types
• But with computers, there is always a
way to make it do things which have
unexpected, but explainable results
IDL> a=byte(256)
IDL> print,a
0
Common Problems with Using
IDL Data Types
• To illustrate the ambiguity of data
IDL> a=byte(257)
IDL> print,a
1
IDL> a=byte(-1)
IDL> print,a
255
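The results above are consistent with keeping only the low 8 bits of the value. A minimal Python sketch of that behaviour follows; to_byte is a hypothetical helper, not an IDL function.

```python
def to_byte(value):
    """Keep only the low 8 bits, giving an unsigned value from 0 to 255."""
    return value & 0xFF


for value in (255, 256, 257, -1):
    print(value, "->", to_byte(value))
```

This maps 256 to 0, 257 to 1 and -1 to 255, which matches the three IDL results shown on these slides.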
Common Problems with Using
IDL Data Types
IDL> print,a
   0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
  16  17  18  19  20  21  22  23  24  25  26  27  28  29  30  31
  32  33  34  35  36  37  38  39  40  41  42  43  44  45  46  47
  48  49  50  51  52  53  54  55  56  57  58  59  60  61  62  63
  64  65  66  67  68  69  70  71  72  73  74  75  76  77  78  79
  80  81  82  83  84  85  86  87  88  89  90  91  92  93  94  95
  96  97  98  99 100 101 102 103 104 105 106 107 108 109 110 111
 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127
 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143
 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159
 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175
 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191
 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207
 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223
 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239
 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255
Common Problems with Using
IDL Data Types
• For a=bindgen(512,512) you would
expect a larger version of the previous
image; instead, the byte values wrap around
after 255, so you get a repeating pattern
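Why the larger array does not simply scale up can be sketched in Python. The assumption, labeled here because the slides do not state it, is that each element holds its one-dimensional subscript truncated to a byte; byte_index_row is a hypothetical helper.

```python
def byte_index_row(y, width=512):
    """Byte-truncated one-dimensional subscripts for row y of the array."""
    # The subscript of column x in row y is x + width * y; a byte keeps
    # only its low 8 bits.
    return [(x + width * y) & 0xFF for x in range(width)]


row0 = byte_index_row(0)
row1 = byte_index_row(1)
```

Under that assumption every row runs from 0 to 255 twice, and all rows are identical because 512 is a multiple of 256.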
Common Problems with Using
IDL Data Types
• The same care must be given for other
integer values (is it signed or unsigned)
• Consider the following
IDL> a=fix(32767)
IDL> print,a
32767
IDL> a=fix(32768)
IDL> print,a
-32768
• This is important
because a lot of images
are stored as byte values
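The fix() results shown on these slides are what a 16-bit signed (two's complement) integer gives when only the low 16 bits of a value are kept. A minimal sketch of that reinterpretation, with the hypothetical helper to_int16:

```python
def to_int16(value):
    """Reinterpret the low 16 bits of value as a signed 16-bit integer."""
    # Shift the range up by 32768, wrap to 16 bits, then shift back down.
    return ((value + 0x8000) & 0xFFFF) - 0x8000


for value in (32767, 32768, 65535, 65536):
    print(value, "->", to_int16(value))
```

This maps 32767 to 32767, 32768 to -32768, 65535 to -1 and 65536 to 0.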
Common for loop Mistake
• The following for loop statement will not
execute properly (infinite loop), because i is a
16-bit integer and can never exceed 32767
IDL> for i=0, 32767 do begin
• The correct form is
IDL> for i=0L, 32767 do begin
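The loop never ends because the test "i greater than 32767" can never become true for a 16-bit counter. The Python sketch below imitates that counter; it assumes, as the slide implies, that the loop variable takes the 16-bit integer type of its start value 0 and wraps on overflow. The helper names are hypothetical.

```python
def to_int16(value):
    """Reinterpret the low 16 bits of value as a signed 16-bit integer."""
    return ((value + 0x8000) & 0xFFFF) - 0x8000


def next_counter(i):
    """Increment a counter that is stored as a 16-bit signed integer."""
    return to_int16(i + 1)


after_last = next_counter(32767)      # wraps instead of reaching 32768
loop_would_end = after_last > 32767   # the exit test of the for loop
```

Here after_last is -32768 and loop_would_end is False, so the loop starts counting up again. Starting from 0L makes the counter a long integer, which can hold 32768 and lets the exit test succeed.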
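Common Problems with Using
IDL Data Types
• The same care must be given for other
integer values (is it signed or unsigned)
• Consider the following
IDL> a=fix(65536)
IDL> print,a
0
IDL> a=fix(65535)
IDL> print,a
-1
Summary
• ASCII vs. RAW Data
• Formatted vs. Unformatted
• Number Representation
• Data Types in IDL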