Chapter 2
2.1 Structure of the Human Eye
Figure 2.1 shows a simplified horizontal cross section of the human
eye. The eye is nearly a sphere, with an average diameter of
approximately 20 mm. Three membranes enclose the eye: the cornea
and sclera outer cover; the choroid; and the retina. The cornea is a
tough, transparent tissue that covers the anterior surface of the eye.
Continuous with the cornea, the sclera is an opaque membrane that
encloses the remainder of the optic globe. The choroid lies directly
below the sclera. This membrane contains a network of blood vessels
that serve as the major source of nutrition to the eye.
Even superficial injury to the choroid, often not deemed serious, can
lead to severe eye damage as a result of inflammation that restricts
blood flow. The choroid coat is heavily pigmented and hence helps to
reduce the amount of extraneous light entering the eye and the
backscatter within the optical globe. At its anterior extreme, the
choroid is divided into the ciliary body and the iris diaphragm. The
latter contracts or expands to control the amount of light that enters
the eye. The central opening of the iris (the pupil) varies in diameter from approximately 2 to 8 mm. The front of the
iris contains the visible pigment of the eye, whereas the back contains a black pigment.
The lens is made up of concentric layers of fibrous cells and is suspended by fibers that attach to the ciliary body. It
contains 60 to 70% water, about 6% fat, and more protein than any other tissue in the eye. The lens is colored by a
slightly yellow pigmentation that increases with age. In extreme cases, excessive clouding of the lens, caused by the
affliction commonly referred to as cataracts, can lead to poor color discrimination and loss of clear vision. The lens
absorbs approximately 8% of the visible light spectrum, with relatively higher absorption at shorter wavelengths.
Both infrared and ultraviolet light are absorbed appreciably by proteins within the lens structure and, in excessive
amounts, can damage the eye.
The innermost membrane of the eye is the retina, which lines the inside of the wall’s entire posterior portion. When
the eye is properly focused, light from an object outside the eye is imaged on the retina. Pattern vision is afforded by
the distribution of discrete light receptors over the surface of the retina. There are two classes of receptors: cones and
rods. The cones in each eye number between 6 and 7 million. They are located primarily in the central portion of the
retina, called the fovea, and are highly sensitive to color. Humans can resolve fine details with these cones largely
because each one is connected to its own nerve end.
Muscles controlling the eye rotate the eyeball until the image of an object of interest falls on the fovea. Cone vision
is called photopic or bright-light vision. The number of rods is much larger: Some 75 to 150 million are distributed
over the retinal surface. The larger area of distribution and the fact that several rods are connected to a single nerve
end reduce the amount of detail discernible by these receptors. Rods serve to give a general, overall picture of the
field of view. They are not involved in color vision and are sensitive to low levels of illumination. For example,
objects that appear brightly colored in daylight appear as colorless forms when seen by moonlight, because only the
rods are stimulated. This phenomenon is known as scotopic or dim-light vision.
Image Formation in the Eye
The principal difference between the lens of the eye and an ordinary optical lens is that the former is flexible. As
illustrated in Fig. 2.1, the radius of curvature of the anterior surface of the lens is greater than the radius of its
posterior surface. The shape of the lens is controlled by tension in the fibers of the ciliary body. To focus on distant
objects, the controlling muscles cause the lens to be relatively flattened. Similarly, these muscles allow the lens to
become thicker in order to focus on objects near the eye.
The distance between the center of the lens and the retina (called the focal length) varies from approximately 17 mm
to about 14 mm, as the refractive power of the lens increases from its minimum to its maximum. When the eye
focuses on an object farther away than about 3 m, the lens exhibits its lowest refractive power. When the eye focuses
on a nearby object, the lens is most strongly refractive. This information makes it easy to calculate the size of the
retinal image of any object. In Fig. 2.3, for example, the observer is looking at a tree 15 m high at a distance of 100
m. If h is the height in mm of that object in the retinal image, the geometry of Fig. 2.3 yields 15/100 = h/17, or
h = 2.55 mm. As indicated in Section 2.1, the retinal image is focused primarily on the region of the fovea.
Perception then takes place by the relative excitation of light receptors, which transform radiant energy into
electrical impulses that are ultimately decoded by the brain.
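As a quick check of the similar-triangles calculation above, the following illustrative Python snippet (not part of the original notes) computes the retinal image height for an arbitrary object size and viewing distance; the 17 mm value is the focal length assumed in the text.

```python
def retinal_image_height(object_height_m, distance_m, focal_length_mm=17.0):
    """Similar triangles: object_height / distance = image_height / focal_length."""
    return object_height_m / distance_m * focal_length_mm

# The worked example from the text: a 15 m tree viewed from 100 m.
print(retinal_image_height(15.0, 100.0))   # 2.55 (mm)
```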
2.2 Light and the Electromagnetic Spectrum
Sir Isaac Newton discovered that when a beam of sunlight is passed through a glass prism, the
emerging beam of light is not white but consists of colors ranging from violet to red.
The range of colors perceived as visible light represents a very small portion of the electromagnetic
spectrum. At one end of the spectrum are gamma rays; at the other end are radio waves.
The electromagnetic spectrum can be expressed in terms of wavelength, frequency, or energy. The wavelength λ and frequency ν are related by

λ = c / ν

where c is the speed of light. The energy of the various components of the electromagnetic spectrum is given by

E = hν

where h is Planck's constant.
Electromagnetic waves can also be visualized as a stream of massless particles, each moving at the
speed of light and carrying a certain amount of energy. Each bundle of energy is called a photon, and its
energy is proportional to frequency. Thus, radio-wave photons have the lowest energies, while gamma-ray
photons are the most energetic of all.
Light is a particular type of electromagnetic radiation that can be seen and sensed by the human
eye. The visible band spans the range from about 0.43 µm (violet) to about 0.79 µm (red).
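To make the relations λ = c/ν and E = hν concrete, the short Python snippet below (an illustrative addition, not from the original notes) computes the frequency and photon energy at the two ends of the visible band; the constants are the standard physical values.

```python
# Frequency and photon energy at the edges of the visible band.
c = 2.998e8        # speed of light, m/s
h = 6.626e-34      # Planck's constant, J*s

for name, wavelength_um in [("violet", 0.43), ("red", 0.79)]:
    wavelength_m = wavelength_um * 1e-6
    nu = c / wavelength_m          # frequency (Hz), from lambda = c / nu
    E = h * nu                     # photon energy (J), from E = h * nu
    print(f"{name}: nu = {nu:.3e} Hz, E = {E:.3e} J")
```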
The colors that humans perceive of an object are determined by the light reflected from the
object. A body that reflects light and is relatively balanced in all visible wavelengths appears
white. However, a body that favors reflectance in a limited range of the visible spectrum exhibits
some shades of color. Light that is void of color is called monochromatic light. The only attribute of
such light is its intensity. The term gray level is used to describe monochromatic intensity.
Chromatic light spans from approximately 0.43 to 0.79 µm. Three basic quantities are used to describe the
quality of a chromatic light source: radiance, luminance, and brightness. Radiance is the total
amount of energy that flows from the light source, and it is usually measured in watts (W). Luminance,
measured in lumens (lm), gives a measure of the amount of energy an observer perceives from a light
source. Brightness is a subjective descriptor of light perception that is practically impossible to measure.
2.3 Cameras
The cameras and recording media available for modern digital image processing
applications are changing at a significant pace. To dwell too long in this section on
one major type of camera, such as the CCD camera, and to ignore developments in
areas such as charge injection device (CID) cameras and CMOS cameras is to run
the risk of obsolescence. Nevertheless, the techniques that are used to characterize
the CCD camera remain “universal” and the presentation that follows is given in
the context of modern CCD technology for purposes of illustration.
2.3.1 LINEARITY
It is generally desirable that the relationship between the input physical signal (e.g.
photons) and the output signal (e.g. voltage) be linear. Formally this means
that if we have two images, a and b, and two arbitrary complex constants,
w1 and w2, and a linear camera response, then:

c = R{w1·a + w2·b} = w1·R{a} + w2·R{b}          (1)

where R{•} is the camera response and c is the camera output. In practice the
relationship between input a and output c is frequently given by:

c = gain · a^γ + offset                          (2)

where γ is the gamma of the recording medium. For a truly linear recording system
we must have γ = 1 and offset = 0. Unfortunately, the offset is almost never zero
and thus we must compensate for this if the intention is to extract intensity
measurements. Compensation techniques are discussed in Section 10.1.
Typical values of γ that may be encountered are listed in Table 8. Modern cameras
often have the ability to switch electronically between various values of γ.
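As a hedged illustration of how the offset and gamma in eq. (2) might be undone before extracting intensity measurements, the sketch below simply inverts the model c = gain·a^γ + offset. The parameter values are hypothetical and would have to be measured for a real camera; this is not presented as the specific compensation technique of Section 10.1.

```python
import numpy as np

def linearize(c, gain, gamma, offset):
    """Invert the camera model c = gain * a**gamma + offset
    to recover an estimate of the linear input signal a."""
    a = (np.asarray(c, dtype=float) - offset) / gain
    return np.clip(a, 0, None) ** (1.0 / gamma)

# Hypothetical parameters, e.g. obtained from a calibration curve.
c = np.array([12.0, 50.0, 200.0])          # recorded camera output (ADU)
print(linearize(c, gain=2.0, gamma=0.45, offset=10.0))
```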
2.3.2 SENSITIVITY
There are two ways to describe the sensitivity of a camera. First, we can determine
the minimum number of detectable photoelectrons. This can be termed the absolute
sensitivity. Second, we can describe the number of photoelectrons necessary to
change from one digital brightness level to the next, that is, to change one analog-to-digital unit (ADU). This can be termed the relative sensitivity.
2.3.2.1 Absolute sensitivity
To determine the absolute sensitivity we need a characterization of the camera in
terms of its noise. If the total noise has a σ of, say, 100 photoelectrons, then to
ensure detectability of a signal we could then say that, at the 3σ level, the minimum
detectable signal (or absolute sensitivity) would be 300 photoelectrons. If all the
noise sources listed in Section 6, with the exception of photon noise, can be reduced
to negligible levels, this means that an absolute sensitivity of less than 10
photoelectrons is achievable with modern technology.
2.3.2.2 Relative sensitivity
The definition of relative sensitivity, S, given above, when coupled to the linear case,
eq. (2) with γ = 1, leads immediately to the result:

S = 1 / gain

since a change of one ADU in the output corresponds to a change of 1/gain photoelectrons at the input.
The measurement of the sensitivity or gain can be performed in two distinct ways.
• If, following eq. (2), the input signal a can be precisely controlled by either
“shutter” time or intensity (through neutral density filters), then the gain can be
estimated by estimating the slope of the resulting straight-line curve. To translate
this into the desired units, however, a standard source must be used that emits a
known number of photons onto the camera sensor, and the quantum efficiency (η)
of the sensor must be known. The quantum efficiency refers to how many
photoelectrons are produced, on the average, per photon at a given wavelength.
In general, 0 ≤ η(λ) ≤ 1.
• If, however, the limiting effect of the camera is only the photon (Poisson) noise,
then an easy-to-implement alternative technique is available to determine the sensitivity.
Because Poisson noise has a variance equal to its mean, the sensitivity measured from an
(offset-corrected) image c is given by:

S = m_c / s_c²

where m_c is the mean and s_c² the variance of the image c.
The extraordinary sensitivity of modern CCD cameras is clear from these data. In a
scientific-grade CCD camera (C–1), only 8 photoelectrons (approximately 16
photons) separate two gray levels in the digital representation of the image. For a
considerably less expensive video camera (C–5), only about 110 photoelectrons
(approximately 220 photons) separate two gray levels.
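A minimal sketch of the mean/variance technique described above, assuming two offset-corrected recordings of the same uniformly illuminated scene are available; the frame values are simulated and purely illustrative.

```python
import numpy as np

def relative_sensitivity(frame1, frame2):
    """Estimate S (photoelectrons per ADU) under photon-limited (Poisson)
    noise: S = mean(c) / var(c).  Differencing two identical exposures
    removes fixed-pattern (shading) contributions; the difference has
    twice the temporal variance of a single frame."""
    mean_c = 0.5 * (frame1.mean() + frame2.mean())
    var_c = np.var(frame1.astype(float) - frame2.astype(float)) / 2.0
    return mean_c / var_c

# Simulated photon-limited frames: 100 photoelectrons/pixel, gain = 0.125 ADU/e-.
rng = np.random.default_rng(0)
f1 = 0.125 * rng.poisson(100, (256, 256))
f2 = 0.125 * rng.poisson(100, (256, 256))
print(relative_sensitivity(f1, f2))   # approximately 8 e-/ADU
```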
2.3.3 SNR
In modern camera systems the noise is frequently limited by:
• amplifier noise, in the case of color cameras;
• thermal noise, which is itself limited by the chip temperature (in K) and the
exposure time T; and/or
• photon noise, which is limited by the photon production rate and the
exposure time T.
2.3.3.1 Thermal noise (Dark current)
Using cooling techniques based upon Peltier cooling elements, it is straightforward
to achieve chip temperatures of 230 to 250 K. This leads to low thermal electron
production rates. As a measure of the thermal noise, we can look at the number of
seconds necessary to produce a sufficient number of thermal electrons to go from
one brightness level to the next, an ADU, in the absence of photoelectrons. This last
condition—the absence of photoelectrons—is the reason for the name dark current.
Measured data for the five cameras described above are given in Table 10.
The video camera (C–5) has on-chip dark-current suppression.
Operating at room temperature, this camera requires more than 20 seconds to
produce a one-ADU change due to thermal noise. This means that at the conventional
video frame and integration rates of 25 to 30 images per second (see Table), the
thermal noise is negligible.
2.3.3.2 Photon noise
Because photon production is a Poisson process, the SNR can be increased by increasing
the integration time of the image and thus “capturing” more photons. The pixels in
CCD cameras have, however, a finite well capacity. This finite capacity, C, means
that the maximum SNR for a CCD camera per pixel is given by:

SNR_max = 10 · log10(C) dB
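As a hedged numerical illustration, assuming the per-pixel maximum SNR is set by the well capacity under Poisson statistics as stated above, the snippet below evaluates it for a few hypothetical well capacities (the capacities themselves are not taken from the notes).

```python
import math

def max_snr_db(well_capacity):
    """Maximum per-pixel SNR (dB) when photon (Poisson) noise dominates:
    signal = C, noise variance = C, so SNR = 10*log10(C) dB."""
    return 10.0 * math.log10(well_capacity)

# Hypothetical well capacities (photoelectrons).
for C in (40_000, 220_000, 600_000):
    print(f"C = {C:>7d} e-  ->  SNR_max = {max_snr_db(C):.1f} dB")
```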
2.3.4 SHADING
Virtually all imaging systems produce shading. By this we mean that if the physical
input image a(x,y) = constant, then the digital version of the image will not be
constant. The source of the shading might be outside the camera, such as in the
scene illumination, or the result of the camera itself, where a gain and offset might
vary from pixel to pixel. The model for shading is given by:

c[m,n] = gain[m,n] · a[m,n] + offset[m,n]

where a[m,n] is the digital image that would have been recorded if there were no
shading in the image, that is, a[m,n] = constant, and c[m,n] is the image actually recorded.
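The per-pixel gain and offset model suggests a standard flat-field correction; the sketch below is one illustrative version, assuming a dark frame (offset estimate) and a flat-field frame (gain estimate) have been recorded. It is offered as an assumption-laden example, not as the specific compensation procedure of Section 10.1.

```python
import numpy as np

def correct_shading(raw, dark, flat):
    """Flat-field correction for c[m,n] = gain[m,n]*a[m,n] + offset[m,n]:
    the dark frame estimates the per-pixel offset, (flat - dark) the per-pixel gain."""
    gain = flat.astype(float) - dark
    gain /= gain.mean()                 # normalize so the average gain is 1
    return (raw.astype(float) - dark) / gain

# Illustrative synthetic example: a constant scene seen through a gain ramp.
ramp = np.linspace(0.8, 1.2, 64)[None, :] * np.ones((64, 1))
dark = np.full((64, 64), 5.0)           # per-pixel offset (dark frame)
flat = ramp * 100.0 + dark              # recording of uniform illumination
raw = ramp * 42.0 + dark                # constant scene, a[m,n] = 42
print(np.allclose(correct_shading(raw, dark, flat), 42.0))  # True
```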
2.3.5 PIXEL FORM
While the pixels shown in Figure 1 appear to be square and to “cover” the
continuous image, it is important to know the geometry for a given camera/digitizer
system. In Figure 18 we define possible parameters associated with a camera and
digitizer and the effect they have upon the pixel.
Figure 18: Pixel form parameters.
The parameters Xo and Yo are the spacings between the pixel centers and represent
the sampling distances. The parameters Xa and Ya are the dimensions of that portion of the
camera’s surface that is sensitive to light. Different video digitizers (frame grabbers) can have
different values for Xo while they have a common value for Yo.
2.3.5.1 Square pixels
Square sampling implies that Xo = Yo, or alternatively Xo / Yo = 1. It is not uncommon, however,
to find frame grabbers where Xo / Yo = 1.1 or Xo / Yo = 4/3. (This latter format matches the format
of commercial television.)
The risk associated with non-square pixels is that isotropic objects scanned with non-square
pixels might appear isotropic on a camera-compatible monitor, but analysis of the objects (such as
length-to-width ratio) will yield non-isotropic results. This is illustrated in the accompanying figure.
The ratio Xo / Yo can be determined for any specific camera/digitizer system by
using a calibration test chart with known distances in the horizontal and vertical
direction. These are straightforward to make with modern laser printers. The test
chart can then be scanned and the sampling distances Xo and Yo determined.
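A small illustrative calculation of the calibration procedure just described, assuming one has measured (in pixels) the image of a square of known size on a test chart; all numbers are hypothetical.

```python
def aspect_ratio(known_mm_h, pixels_h, known_mm_v, pixels_v):
    """Sampling distances Xo, Yo (mm/pixel) and the ratio Xo/Yo from a
    calibration chart with known horizontal and vertical distances."""
    xo = known_mm_h / pixels_h
    yo = known_mm_v / pixels_v
    return xo, yo, xo / yo

# Hypothetical measurement: a 50 mm x 50 mm square spans 480 x 528 pixels.
xo, yo, ratio = aspect_ratio(50.0, 480, 50.0, 528)
print(f"Xo = {xo:.4f} mm, Yo = {yo:.4f} mm, Xo/Yo = {ratio:.2f}")  # ~1.10
```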
2.3.5.2 Fill factor
In modern CCD cameras it is possible that a portion of the camera surface is not
sensitive to light and is instead used for the CCD electronics or to prevent
blooming. Blooming occurs when a CCD well is filled and additional photoelectrons spill over
into adjacent CCD wells. Anti-blooming regions between the active CCD sites can be used to
prevent this. This means, of course, that a fraction of the incoming photons are lost as they strike
the nonsensitive portion of the CCD chip. The fraction of the surface that is sensitive to light is
termed the fill factor and is given by:

fill factor = (Xa · Ya) / (Xo · Yo) × 100%

The larger the fill factor, the more light will be captured by the chip, up to the
maximum of 100%. This helps improve the SNR. As a tradeoff, however, larger
values of the fill factor mean more spatial smoothing.
2.3.6 SPECTRAL SENSITIVITY
Sensors, such as those found in cameras and film, are not equally sensitive to all
wavelengths of light. The spectral sensitivity of the CCD sensor is shown in the figure below.
Figure: Spectral characteristics of silicon, the sun, and the human visual system
(UV = ultraviolet, IR = infra-red).
The high sensitivity of silicon in the infra-red means that, for applications where a
CCD (or other silicon-based) camera is to be used as a source of images for digital
image processing and analysis, consideration should be given to using an IR
blocking filter. This filter blocks wavelengths above 750 nm and thus prevents
“fogging” of the image from the longer wavelengths found in sunlight.
Alternatively, a CCD-based camera can make an excellent sensor for the near
infrared wavelength range of 750 nm to 1000 nm.
2.3.7 SHUTTER SPEEDS (INTEGRATION TIME)
The length of time that an image is exposed—that photons are collected—may be
varied in some cameras or may vary on the basis of video formats (see Table 3).
For reasons that have to do with the parameters of photography, this exposure time
is usually termed shutter speed although integration time would be a more
appropriate description.
2.3.7.1 Video cameras
Values of the shutter speed as low as 500 ns are available with commercially
available CCD video cameras although the more conventional speeds for video are
33.37 ms (NTSC) and 40.0 ms (PAL, SECAM). Values as high as 30 s may also
be achieved with certain video cameras although this means sacrificing a
continuous stream of video images that contain signal in favor of a single integrated
image amongst a stream of otherwise empty images. Subsequent digitizing
hardware must be capable of handling this situation.
2.3.7.2 Scientific cameras
Again values as low as 500 ns are possible and, with cooling techniques based on
Peltier-cooling or liquid nitrogen cooling, integration times in excess of one hour
are readily achieved.
2.3.8 READOUT RATE
The rate at which data is read from the sensor chip is termed the readout rate. The
readout rate for standard video cameras depends on the parameters of the frame
grabber as well as the camera. For standard video, the readout rate
is given by:

R = (pixels per frame) × (frames per second)

While the appropriate unit for describing the readout rate should be pixels/second,
the term Hz is frequently found in the literature and in camera specifications; we
shall therefore use the latter unit. Approximate values for video cameras with square
pixels are listed in the table below.
Table: Video camera readout rates
Note that the values in the table are approximate. Exact values for square-pixel
systems require exact knowledge of the way the video digitizer (frame grabber)
samples each video line.
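As a rough, hedged illustration of the readout-rate relation above, the snippet below computes approximate rates for two common square-pixel digitizations; the pixel counts and frame rates are illustrative assumptions, and exact values depend on the frame grabber.

```python
def readout_rate_hz(pixels_per_line, lines_per_frame, frames_per_second):
    """Approximate readout rate = pixels per frame * frames per second."""
    return pixels_per_line * lines_per_frame * frames_per_second

# Illustrative square-pixel digitizations of standard video.
print(readout_rate_hz(640, 480, 30))   # NTSC-like: ~9.2 MHz
print(readout_rate_hz(768, 576, 25))   # PAL-like:  ~11.1 MHz
```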
The readout rates used in video cameras frequently mean that the electronic noise
occurs in the region of the noise spectrum described by ν > ν_max, where the noise power
increases with increasing frequency.
Readout noise can thus be significant in video cameras.
Scientific cameras frequently use a slower readout rate in order to reduce the
readout noise. Typical values of the readout rate for scientific cameras, such as those
described above, are 20 kHz, 500 kHz, and 1 MHz to 8 MHz.
Aliasing and Moiré patterns
The Shannon sampling theorem tells us that, if a function is sampled at a rate greater than or
equal to twice its highest frequency, it is possible to recover the original function completely from its
samples. Sampling below this rate causes aliasing, which in images can appear as Moiré patterns.
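A brief illustrative demonstration of the sampling theorem (an added textbook-style example, not drawn from the original notes): a 10 Hz sinusoid sampled above 20 Hz keeps its frequency, while the same sinusoid sampled below that rate aliases to a lower apparent frequency.

```python
f_signal = 10.0                       # true signal frequency, Hz

def apparent_frequency(fs, f=f_signal):
    """Frequency observed after sampling at rate fs (Hz): the true
    frequency is folded to its nearest alias in [0, fs/2]."""
    return abs(f - fs * round(f / fs))

print(apparent_frequency(fs=25.0))    # 10.0 Hz: fs > 2*f, no aliasing
print(apparent_frequency(fs=12.0))    # 2.0 Hz: undersampled, aliased
```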
Zooming and Shrinking Digital Images
Zooming requires two steps:
1- The creation of new pixel locations
2- The assignment of gray levels to those new locations
Example
If we have a 500×500-pixel image and we want to enlarge it 1.5 times, to 750×750 pixels:
1- Lay an imaginary 750×750 grid over the original image; the spacing in the grid would be
less than one pixel.
2- Look for the closest pixel in the original image and assign its gray level to the new pixel
in the grid.
This method is called nearest-neighbor interpolation.
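A minimal sketch of nearest-neighbor zooming as described in steps 1 and 2 above (lay a finer grid, then copy the gray level of the closest original pixel); the array contents are random test data, and the shapes follow the 500×500 to 750×750 example.

```python
import numpy as np

def zoom_nearest(image, new_rows, new_cols):
    """Nearest-neighbor interpolation: for every location on the new (finer)
    grid, copy the gray level of the closest pixel in the original image."""
    rows, cols = image.shape
    r_idx = np.clip(np.round(np.arange(new_rows) * rows / new_rows).astype(int), 0, rows - 1)
    c_idx = np.clip(np.round(np.arange(new_cols) * cols / new_cols).astype(int), 0, cols - 1)
    return image[np.ix_(r_idx, c_idx)]

small = np.random.default_rng(0).integers(0, 256, (500, 500), dtype=np.uint8)
big = zoom_nearest(small, 750, 750)     # enlarge 1.5x, as in the example above
print(big.shape)                        # (750, 750)
```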
Pixel replication
Pixel replication, the method used to generate Figs. 2.20(b) through (f), is applied when we want to
increase the size of an image an integer number of times. For instance, to double the size of an
image, we duplicate each column in the horizontal direction and duplicate each row in the vertical
direction.
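For the special case of integer zoom factors, pixel replication can be written very compactly; the sketch below uses NumPy's repeat as an assumed convenience, purely for illustration.

```python
import numpy as np

def replicate(image, factor):
    """Pixel replication: duplicate each row and each column 'factor' times."""
    return np.repeat(np.repeat(image, factor, axis=0), factor, axis=1)

img = np.array([[1, 2],
                [3, 4]], dtype=np.uint8)
print(replicate(img, 2))    # doubles the size: each pixel becomes a 2x2 block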
A slightly more sophisticated way of accomplishing gray-level assignment is bilinear interpolation
using the four nearest neighbors of a point. The gray level assigned to a point (x, y) has the form
v(x, y) = a·x + b·y + c·x·y + d, where the four coefficients are determined from the four equations
in four unknowns that can be written using the four nearest neighbors of the point (x, y).
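A sketch of bilinear gray-level assignment for a single point, using the four nearest neighbors. This is the standard weighted-average formulation, offered as an illustration rather than the exact derivation behind the equation above.

```python
import numpy as np

def bilinear(image, x, y):
    """Gray level at non-integer (x, y), determined by the four nearest
    neighbors; equivalent to fitting v(x, y) = a*x + b*y + c*x*y + d."""
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, image.shape[0] - 1), min(y0 + 1, image.shape[1] - 1)
    dx, dy = x - x0, y - y0
    return ((1 - dx) * (1 - dy) * image[x0, y0] + dx * (1 - dy) * image[x1, y0]
            + (1 - dx) * dy * image[x0, y1] + dx * dy * image[x1, y1])

img = np.array([[10, 20],
                [30, 40]], dtype=float)
print(bilinear(img, 0.5, 0.5))   # 25.0, the average of the four neighbors
```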
Image shrinking is done in a manner similar to that just described for zooming. We can use the
zooming grid analogy to visualize the concept of shrinking by a non-integer factor.
* Representing Digital Images
The result of sampling and quantization is a matrix of real numbers. Assume that an image
f(x, y) is sampled so that the resulting digital image has M rows and N columns.
The values of the coordinates (x, y) now become discrete quantities. Thus, the values of
the coordinates at the origin are (x, y) = (0, 0). The next coordinate values along the first
row of the image are represented as (x, y) = (0, 1). It is important to keep in mind that
the notation (0, 1) is used to signify the second sample along the first row. It does not
mean that these are the actual values of physical coordinates when the image was
sampled.
The complete M × N digital image can be written in the following compact matrix form:

f(x, y) =
  [ f(0,0)      f(0,1)      ...   f(0,N-1)
    f(1,0)      f(1,1)      ...   f(1,N-1)
      .            .                  .
    f(M-1,0)    f(M-1,1)    ...   f(M-1,N-1) ]

The right side of this equation is by definition a digital image. Each element of this
matrix array is called an image element, picture element, pixel, or pel.
*Let Z and R denote the set of integers and the set of real numbers, respectively.
There are two cases:
1) The set of all ordered pairs of elements (zi, zj), with zi and zj being integers from Z.
Hence, f(x, y) is a digital image if (x, y) are integers from Z^2 and f is a function that
assigns a gray-level value (that is, a real number from the set of real numbers, R) to
each distinct pair of coordinates (x, y).
2) If the gray levels also are integers, Z replaces R, and a digital image then becomes a
2-D function whose coordinates and amplitude values are integers.
This digitization process requires decisions about values for M, N, and L (the number of discrete
gray levels allowed for each pixel).
M and N must be positive integers, and the number of gray levels is typically an integer power of 2: L = 2^k.
*The dynamic range of an image:
The dynamic range is the ratio between the maximum and minimum values of a
physical measurement. Its definition depends on what the dynamic range refers to.
It can be the ratio between the brightest and darkest parts of the scene, the ratio between the maximum
and minimum intensities emitted from the screen, or the ratio of saturation to noise of a camera.
An image whose gray levels span a significant portion of the gray scale is said to have a high
dynamic range; when an appreciable number of pixels exhibit this property, the image will have high
contrast.
The number, b, of bits required to store a digitized image is
b = M × N × k. When M = N, this equation becomes b = N^2 × k.
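A quick illustrative check of the storage formula (the image size and bit depth chosen here are only an example):

```python
def storage_bits(M, N, k):
    """Bits needed to store an M x N image with L = 2**k gray levels."""
    return M * N * k

# Example: a 1024 x 1024 image with 256 gray levels (k = 8).
b = storage_bits(1024, 1024, 8)
print(b, "bits =", b // 8, "bytes")   # 8388608 bits = 1048576 bytes
```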