Laboratorio di Algoritmi e Strutture Dati

Multimedia Systems
Giorgio Leonardi
A.A.2014-2015
Lecture 2: A brief history of image and sound
recording and storage
Overview
• Course page (D.I.R.):
– https://disit.dir.unipmn.it/course/view.php?id=639
• Consulting:
– Office hours by appointment:
[email protected]
• Office #182 (in front of “Sala Seminari”)
• Email me any time
Outline
• From physical phenomenon to signals
• Representation of audio and light as recordable
entities: electric signals
– What is sound
– Sound recording
– What is light
– Image recording
– Images as electric signals
– Analog video encoding
• …a little history of media supports!
Multimedia representation
• Multimedia can originate
– In digital form
• E.g., a picture created using a painting program
• E.g., a document created with a word processor
• E.g., CAD schemas
– From a physical phenomenon
• E.g., sound from a loudspeaker
• In both cases, they can be formally
described by means of signals
• We will learn how to transform sound and
images into analog signals, and then into
their digital representation
Sound
• Sound perception is due to the variations of air pressure around
our eardrum
– Vibration of a source object disturbs the particles in the surrounding
medium; those particles disturb those next to them and so on, thus
creating a wave pattern (compression/decompression), which
propagates through the air
– This wave reaches our ear, making our eardrum vibrate. Through the
ear’s complex organic system, the vibration is then received, stored
and interpreted by the brain.
Sound waveforms
• We are used to «seeing» sound as waveforms
• A waveform is the transformation of the air
pressure, which varies in space, into a signal
which varies in time
Sound waveforms
• A waveform can be recorded as a signal, which
defines the variations of our eardrum’s position
in time, while we are hearing a sound:
[Figure: the wave pattern travels toward the eardrum; the extent of the eardrum's vibration is plotted along the time direction]
Transformation from the space domain, y = F(s), to the time domain, y = F(t)
Sound waveform
• The transformed sound wave can be stored on an analog
medium as an analog electrical signal, captured by means of
a condenser microphone
• From left to right:
– The longitudinal sound wave hits the microphone’s
diaphragm, which vibrates just as the eardrum does
– The diaphragm’s vibration generates an electrical signal,
proportional to the strength of this vibration
– This signal is recorded on a magnetic medium
Definition of signal
• A signal 𝑔 is a mathematical abstraction
representing a quantity whose values
change as a function of an independent
parameter 𝑘 ∈ K
– Usually, the parameter 𝑘 represents (belongs to)
either the time or the space
• Without loss of generality, in the following
we assume that 𝑘 represents the time
• Either 𝑘 or 𝑔(𝑘) can be multidimensional
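As a minimal sketch of this definition, a signal can be modelled in Python as a plain function of time. The 440 Hz tone, the names `g` and `g_stereo`, and the 0.5 gain on the second channel are illustrative assumptions, not part of the lecture.

```python
import math

def g(t):
    """1-D signal: pressure deviation (arbitrary units) at time t in seconds.

    The 440 Hz sine is an assumed example tone.
    """
    return math.sin(2 * math.pi * 440.0 * t)

def g_stereo(t):
    """Signal with a multidimensional value: a (left, right) pair."""
    return (g(t), 0.5 * g(t))

# Evaluate the signal at a few instants (the parameter k is time here).
for t in (0.0, 0.0005, 0.001):
    print(f"t = {t:.4f} s  g(t) = {g(t):+.3f}  stereo = {g_stereo(t)}")
```

Here the parameter is one-dimensional (time) and only the value of `g_stereo` is multidimensional; an image, discussed later, is the opposite case.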
Properties of analog signals
• The recorded signal has physical and
mathematical properties
• Among all of them, we will define:
– Waveform
– Volume/Amplitude
– Wavelength/Frequency
– Pitch
– Bandwidth
– Timbre
Waveform
• The shape of a signal is called waveform
Shapes of the base periodic waves
Waveform
• Signals can be classified as:
– Continuous: $\lim_{x \to x_0} g(x) = g(x_0) = c$, $\forall x_0 \in \mathrm{Dom}(g)$
– Non-continuous: all the signals without this
property
Waveform
• Signals can be classified as:
– Periodic: signals that repeat after a fixed period T: $g(t + T) = g(t)$ for every $t$
– Aperiodic: all the signals without this property
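Taking "periodic with period T" to mean g(t + T) = g(t) for every t, the classification can be checked numerically on sampled points. This is only a sketch: the helper `is_periodic`, the tolerance and the two test signals are assumptions, and passing the check on a finite set of samples does not prove periodicity.

```python
import math

def is_periodic(g, T, samples=1000, span=1.0, tol=1e-9):
    """Check g(t + T) == g(t) (within tol) at evenly spaced t in [0, span)."""
    for i in range(samples):
        t = span * i / samples
        if abs(g(t + T) - g(t)) > tol:
            return False
    return True

def sine_50hz(t):
    """Periodic example: a 50 Hz sine, period 1/50 s."""
    return math.sin(2 * math.pi * 50.0 * t)

def decay(t):
    """Aperiodic example: an exponential decay never repeats."""
    return math.exp(-t)

print("sine, T = 1/50 s:", is_periodic(sine_50hz, 1 / 50))
print("decay, T = 1/50 s:", is_periodic(decay, 1 / 50))
```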
Volume/Amplitude
• Volume of a sound signal is proportional to its
amplitude
• The more the eardrum extends from its initial
position, the higher the amplitude of the signal
(therefore, its volume).
• Amplitude is defined as:
1. Peak amplitude (Û)
2. Peak-to-peak amplitude (2Û)
3. Root mean square amplitude (Û/√2, for a sine wave)
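The three amplitude measures can be computed directly from samples of a signal. The sketch below assumes a sine wave with peak amplitude Û = 2 (an arbitrary value) sampled over exactly one period; for a sine, the root mean square amplitude comes out as Û/√2.

```python
import math

def amplitudes(samples):
    """Return (peak, peak-to-peak, RMS) amplitude of a list of samples."""
    peak = max(abs(s) for s in samples)
    peak_to_peak = max(samples) - min(samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return peak, peak_to_peak, rms

U = 2.0    # assumed peak amplitude of the example sine
N = 1000   # samples over exactly one period
one_period = [U * math.sin(2 * math.pi * i / N) for i in range(N)]

peak, p2p, rms = amplitudes(one_period)
print(f"peak = {peak:.4f}, peak-to-peak = {p2p:.4f}, RMS = {rms:.4f}")
print(f"U / sqrt(2) = {U / math.sqrt(2):.4f}")
```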
Wavelength/Frequency
• The wavelength 𝜆 is the distance the wave travels through
its medium within a period
• It is inversely proportional to the frequency 𝑓: $\lambda = v / f$
• Where 𝑣 is the speed of sound
– In dry air (i.e., at 0% humidity) at 𝜏 °C
• 𝑣≅(331.3+0.606𝜏) m/s
– E.g., at 20 °C, 𝑣≅343.4 m/s
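A small sketch of these two relations, using the approximation for dry air given above and λ = v / f. The function names and the 440 Hz example frequency are assumptions chosen for illustration.

```python
def speed_of_sound(temp_c):
    """Approximate speed of sound in dry air (m/s) at temp_c degrees Celsius."""
    return 331.3 + 0.606 * temp_c

def wavelength(freq_hz, temp_c=20.0):
    """Wavelength in metres: lambda = v / f."""
    return speed_of_sound(temp_c) / freq_hz

v20 = speed_of_sound(20.0)
print(f"v at 20 °C = {v20:.1f} m/s")
print(f"wavelength of a 440 Hz tone at 20 °C = {wavelength(440.0) * 100:.2f} cm")
```

Doubling the frequency halves the wavelength, which is what "inversely proportional" means here.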
Pitch
• Pitch is often related to the (perceived) frequency of a
sound wave
• Frequency is measured in Hertz (Hz):
– 1 Hz = signal oscillating 1 time per second
– 1 kHz = signal oscillating 1000 times per second
High Pitch
Low Pitch
Pitch
• When playing instruments, a different pitch
defines a different note which can be played
Note   Frequency (Hz)   Wavelength (cm)
A3     220.00           156.82
B3     246.94           139.71
C4     261.63           131.87
D4     293.66           117.48
E4     329.63           104.66
F4     349.23            98.79
G4     392.00            88.01
A4     440.00            78.41
B4     493.88            69.85
C5     523.25            65.93
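As a consistency check on the note table, the speed of sound it implies can be recovered from each row with v = λ · f. The sketch below copies the table values; the conclusion (about 345 m/s, which the dry-air approximation places near 22 to 23 °C) is inferred from the numbers, not stated in the lecture.

```python
# (note, frequency in Hz, wavelength in cm), copied from the note table
ROWS = [
    ("A3", 220.00, 156.82), ("B3", 246.94, 139.71), ("C4", 261.63, 131.87),
    ("D4", 293.66, 117.48), ("E4", 329.63, 104.66), ("F4", 349.23, 98.79),
    ("G4", 392.00, 88.01), ("A4", 440.00, 78.41), ("B4", 493.88, 69.85),
    ("C5", 523.25, 65.93),
]

def implied_speed(freq_hz, wavelength_cm):
    """Speed of sound in m/s implied by one table row: v = lambda * f."""
    return freq_hz * wavelength_cm / 100.0

speeds = [implied_speed(f, lam) for _, f, lam in ROWS]
mean_speed = sum(speeds) / len(speeds)

# Invert v = 331.3 + 0.606 * tau to estimate the temperature the table assumes.
implied_temp_c = (mean_speed - 331.3) / 0.606

print(f"mean implied speed = {mean_speed:.2f} m/s")
print(f"implied temperature = {implied_temp_c:.1f} °C")
```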
Pitch
• Pitch is what we hear change in the «Doppler» effect: let a sound
source produce a sound wave with a defined frequency f:
• When the sound source moves toward us, its sound wave
is «squeezed» and perceived at a higher pitch:
• Finally, when the sound source moves away from us, the sound wave
reaches our ears «stretched», and is therefore perceived at a lower pitch:
Doppler effect
• Doppler effect rule: given a sound source:
– producing a waveform with frequency $f_s$, and
– moving toward (or away from) an observer at speed $v_s$
• The waveform «hitting» the observer will have frequency $f_o$:
$f_o = f_s \, \dfrac{v}{v - v_s}$
• Where 𝑣 is the speed of sound
– at 𝜏 °C: 𝑣 ≅ (331.3+0.606𝜏) m/s
1) Ambulance does not move: $v_s = 0 \Rightarrow f_o = f_s$ (same pitch)
2) Ambulance moves toward the observer: $v_s > 0 \Rightarrow f_o > f_s$ (higher pitch)
3) Ambulance moves away from the observer: $v_s < 0 \Rightarrow f_o < f_s$ (lower pitch)
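The Doppler rule can be sketched as a short function, with the source speed positive when approaching the observer and negative when moving away. The 700 Hz siren and the 25 m/s speed are made-up example values; the formula assumes a stationary observer and a source slower than sound.

```python
def speed_of_sound(temp_c=20.0):
    """Approximate speed of sound in dry air (m/s)."""
    return 331.3 + 0.606 * temp_c

def observed_frequency(f_s, v_s, temp_c=20.0):
    """f_o = f_s * v / (v - v_s).

    v_s > 0: source approaching the observer; v_s < 0: source moving away.
    """
    v = speed_of_sound(temp_c)
    if v_s >= v:
        raise ValueError("source speed must be below the speed of sound")
    return f_s * v / (v - v_s)

SIREN_HZ = 700.0  # assumed siren frequency
for label, v_s in (("standing still", 0.0), ("approaching", 25.0), ("leaving", -25.0)):
    print(f"{label:>14}: f_o = {observed_frequency(SIREN_HZ, v_s):.1f} Hz")
```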
Other devices exploiting the
Doppler effect?
OUCH!
Bandwidth
• A signal may be composed of (i.e., be the sum of) multiple frequency
components
• Frequency components: periodic sine waves, each one with a
particular frequency
– The component with the lowest frequency is called the fundamental
– The other components add harmonics to the fundamental wave
[Figure: a composite signal shown as the sum of five sine-wave components]
Bandwidth
• Spectrum: the range of frequencies that a signal contains
– Fmin = fundamental frequency
– Fmax = highest harmonic frequency
• Bandwidth (also called width of the signal): the width of the spectrum
In this example:
Fmin = fundamental frequency = 50 Hz
Fmax = frequency of harmonic 9 = 450 Hz
Spectrum: S = [Fmin; Fmax] = [50 Hz; 450 Hz]
Bandwidth: W = Fmax – Fmin = 450 Hz – 50 Hz = 400 Hz
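The same arithmetic as a small sketch. The list of components is an assumption: the slide only fixes the fundamental (50 Hz) and harmonic 9 (450 Hz), so the odd harmonics in between are filled in for illustration.

```python
def spectrum_and_bandwidth(component_freqs_hz):
    """Return ((f_min, f_max), bandwidth) for a list of component frequencies."""
    f_min = min(component_freqs_hz)
    f_max = max(component_freqs_hz)
    return (f_min, f_max), f_max - f_min

FUNDAMENTAL_HZ = 50.0
# Assumed set of components: fundamental plus odd harmonics up to the 9th.
components = [FUNDAMENTAL_HZ * n for n in (1, 3, 5, 7, 9)]

spectrum, bandwidth = spectrum_and_bandwidth(components)
print(f"S = [{spectrum[0]:.0f} Hz; {spectrum[1]:.0f} Hz], W = {bandwidth:.0f} Hz")
```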
Bandwidth
• Finite bandwidth: the total signal can be fully reconstructed
by adding a fundamental and a finite number of harmonics:
[Figure: a signal shown as the sum of five sine-wave components]
• Periodic sine-based signals usually have limited bandwidth
Bandwidth
• Infinite bandwidth: to reconstruct the total signal perfectly, a
fundamental and an infinite number of harmonics must be added:
[Figure: (a) total signal; (b) fundamental; (c) harmonic 3; (d) harmonic 5; (e) approximate sum of the finite components]
• Digital signals, such as (a), need the sum of infinitely many components such as (b), (c),
(d) to recover their original form.
• The sum of a finite number of components generates only the approximation (e)
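A minimal sketch of this idea for a square wave, using its standard Fourier series (fundamental plus odd harmonics 3, 5, 7, ... with amplitudes 1/3, 1/5, 1/7, ...). The unit amplitude, the 1 Hz fundamental and the error measure are assumptions chosen for illustration; the point is only that more harmonics give a closer, but never exact, approximation.

```python
import math

def square_wave(t, f=1.0):
    """Ideal square wave with values +1 / -1 and frequency f."""
    return 1.0 if math.sin(2 * math.pi * f * t) >= 0 else -1.0

def partial_sum(t, n_terms, f=1.0):
    """Sum of the first n_terms odd harmonics of the square wave."""
    total = 0.0
    for k in range(n_terms):
        n = 2 * k + 1  # harmonic number: 1, 3, 5, ...
        total += math.sin(2 * math.pi * n * f * t) / n
    return 4.0 / math.pi * total

def mean_abs_error(n_terms, samples=2000):
    """Average |square - approximation| over one period."""
    ts = [(i + 0.5) / samples for i in range(samples)]
    return sum(abs(square_wave(t) - partial_sum(t, n_terms)) for t in ts) / samples

for n_terms in (1, 3, 10, 50):
    print(f"{n_terms:>3} components: mean error = {mean_abs_error(n_terms):.4f}")
```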
Bandwidth
• A second example of a signal with infinite bandwidth: a
sawtooth function
Timbre
• A pure sound is a wave made of only a
single frequency and has a sinusoidal form
• In nature, there are no pure sounds
• Pure sounds can be produced artificially
– E.g., a tuning fork (or diapason) is an acoustic
resonator which vibrates at a precise frequency
Timbre
• In general, sound sources vibrate in more
complicated ways, creating the rich variety of
sounds and noises we are familiar with
• The timbre of a sound is defined by its waveform,
that is, the «shape» of its wave
Different instruments playing the same note (same pitch, different waveforms)
What about images?
Photography
• The term «Photography» derives from two
Greek words:
– Phos, which means «light»
– Graphein, which means «to draw»
• What we see, when we look at a picture, is a
«drawing with light» performed by the
photographer
• But what is light, and how can we «capture»
the light?
Light
• Visible light (or visible, or, simply, light) is an
electromagnetic (EM) radiation that is visible to the
human eye
– An EM radiation is a transverse wave, that is a wave that
is oscillating perpendicularly to the direction of
propagation
Light waves properties
• EM radiation propagates at the speed of light 𝑐, which in a
vacuum is ~2.998×10⁸ m/s
• Frequency 𝜈 and wavelength 𝜆 are strictly related by 𝜆𝜈=𝑐
• According to wave-particle duality (from quantum mechanics),
EM radiation can be thought of both as a propagating wave and as a
stream of elementary (massless) particles (called photons), each
traveling in a wavelike pattern and moving at the speed of light
Light waves properties
• Each colour we can see has a different
wavelength/frequency
• Red has the longest wavelength and violet has the shortest
wavelength
Visible light
• Visible light represents a very small portion of the EM
spectrum
– Visible light has wavelengths between ~380 nm (violet color)
and ~740 nm (red color)
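Using 𝜆𝜈 = 𝑐, the wavelength limits of visible light can be converted into frequencies. This is a small sketch; the function name and the choice of terahertz as the output unit are assumptions.

```python
C_M_PER_S = 2.998e8  # speed of light in a vacuum, as given in the slides

def frequency_thz(wavelength_nm):
    """Frequency in THz of EM radiation with the given wavelength in nm."""
    wavelength_m = wavelength_nm * 1e-9
    return C_M_PER_S / wavelength_m / 1e12

for colour, nm in (("violet", 380.0), ("red", 740.0)):
    print(f"{colour:>6}: {nm:.0f} nm -> {frequency_thz(nm):.0f} THz")
```

The shortest visible wavelength (violet) therefore has the highest frequency, and the longest (red) the lowest.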
Analog recording of light
• The human eye:
– Lightwaves reflected by the cyclist hit the cornea and, through
the pupil, are focused by the lens
– From the lens, the light reaches the retina and is captured by
its photoreceptors
– Finally, the (upside-down) image is transmitted to the brain by
the optic nerve
Analog recording of light
• The analog camera:
– Cameras use the same principle to store analog images on
chemical photographic film
– Light enters the camera’s lens and is focused by the photographer, who
moves some of the lens elements to adjust the focus
– The photographer opens the shutter for a fraction of a second, during
which the light hits the photographic film, exposing the
light-sensitive chemicals in it
[Figure: camera diagram with labels: shutter, lens, photographic film]
Images as signals
• Even images can be represented as electrical
signals. Remember what we said before?
• A signal 𝑔 is a mathematical abstraction representing a
quantity that changes its values as a function of an
independent parameter 𝑘∈K
– Usually, the parameter 𝑘 represents either the time or the
space
• Without loss of generality, in the following we assume
that 𝑘 represents the time
• Either 𝑘 or 𝑔(𝑘) can be multidimensional
Monodimensional signal
• Considering only a horizontal «line» of the following image,
it can be represented as a monodimensional signal
– k represents the horizontal position in the image
– g(k) is the variation of the greyscale values from left to right
𝑔: R → R, y = g(k)
[Figure: plot of grayscale value (vertical axis) against pixel position (horizontal axis)]
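A scanline read as a one-dimensional signal can be sketched as follows. The tiny 3×4 image and its grayscale values are invented for the example, and a discrete pixel grid stands in for the continuous model 𝑔: R → R used in the slides.

```python
# Hypothetical 3-row, 4-column grayscale image (0 = black, 255 = white).
IMAGE = [
    [0, 64, 128, 255],
    [10, 80, 160, 240],
    [255, 128, 64, 0],
]

def scanline_signal(image, row):
    """Return the 1-D signal g(k) for one horizontal line of the image."""
    def g(k):
        return image[row][k]
    return g

g = scanline_signal(IMAGE, row=1)
values = [g(k) for k in range(len(IMAGE[1]))]
print("g(k) along row 1:", values)
```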
Multidimensional signal
• A complete black & white image is a signal from 2D
points (positions in the space) to light intensities
– It is a 2D surface
𝑔: R² → R
y = g(k, j)
Multidimensional signal
• A color image is a signal from 2D points (positions in the
space) to red, green and blue light intensities
– It becomes three 2D surfaces: one for each colour
𝑔: R² → R³
<r, g, b> = g(k, j)
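The colour case can be sketched the same way: one function from a 2D position to a triple of intensities, which can also be split into three separate 2D surfaces. The 2×2 image below is invented for the example, and the helper names are assumptions.

```python
# Hypothetical 2x2 colour image; each pixel is an (r, g, b) triple in 0..255.
RGB_IMAGE = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]

def g(k, j):
    """Colour signal: (r, g, b) at horizontal position k, vertical position j."""
    return RGB_IMAGE[j][k]

def split_channels(image):
    """Return three 2-D surfaces (red, green, blue), one per colour."""
    return tuple(
        [[pixel[c] for pixel in row] for row in image]
        for c in range(3)
    )

red, green, blue = split_channels(RGB_IMAGE)
print("g(1, 0) =", g(1, 0))
print("red surface:  ", red)
print("green surface:", green)
print("blue surface: ", blue)
```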
Analog video encoding
What is video
• A video is a sequence of images, played at a
constant framerate:
– PAL: 25 frames/sec
– NTSC: 29.97 frames/sec
• Images are recorded on a negative film strip (such as
Super 8 mm film, in use nowadays), where each frame
is exposed with (about) the same technique we
have seen for analog cameras
• Video encoding (for video transmission over media)
is a 2-pass process
Video encoding – Pass 1
• Each frame is treated separately:
– Frame F is divided into lines (called scanlines)
– Each scanline generates a 1-dimensional electrical
signal (as seen for still images)
Video encoding – Pass 2
• Video is then recorded (on a magnetic tape,
remember VHS?) or transmitted as a time-dependent
signal, encoding the sequence of all the
scanned images
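The two passes can be sketched as follows: each frame is flattened scanline by scanline, then the frames are chained into one time-dependent signal. This is a deliberately simplified illustration with made-up 2×2 frames; real analog video also carries synchronisation pulses, blanking intervals and interlacing, all omitted here.

```python
def encode_frame(frame):
    """Pass 1: concatenate the scanlines of one frame into a 1-D signal."""
    signal = []
    for scanline in frame:
        signal.extend(scanline)
    return signal

def encode_video(frames):
    """Pass 2: chain the per-frame signals into one time-dependent signal."""
    signal = []
    for frame in frames:
        signal.extend(encode_frame(frame))
    return signal

def sample_time(index, samples_per_frame, fps=25.0):
    """Time in seconds at which sample `index` is sent (PAL rate by default)."""
    return index / (samples_per_frame * fps)

# Two hypothetical 2x2 grayscale frames.
FRAMES = [
    [[0, 1], [2, 3]],
    [[4, 5], [6, 7]],
]

video_signal = encode_video(FRAMES)
print("signal:", video_signal)
print("second frame starts at t =", sample_time(4, samples_per_frame=4), "s")
```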
Why go digital?
Problems in analog processing
• Magnetic devices used to store the analog
signals are:
– Affected by mechanical noise
– Perishable, with loss of quality in time
– Space-hungry
• Treatment of analog signals requires:
– Dedicated hardware
– Very complex real-time calculations (real-time
integration, as we will see)
– Image processing is a completely physical/chemical
process
– Analog recovery of video quality is limited
Analog audio: vinyl records
• Sound is literally carved into a phonograph record, because the
groove undulations are analogous to the sound waves they
represent.
• A classical 33 RPM record contains about 20-30 minutes of analog
audio per side, compared to the CD, which holds from 80 minutes
to 800 minutes of digital audio, depending on format (CD-DA or
MP3).
Don’t laugh, it is still my favourite medium!!!
And the favourite choice of audiophiles: all the harmonics are intact
Analog audio: tapes
• Analog tapes record a magnetic pattern,
proportional to the electrical signal
A blast from the 80’s!
• They revolutionized the world of media
recording, because everyone could record
their own music collection, or their favourite
radio programs on the fly, with a cheap
device and cheap physical storage media.
• Up here you will find a
common, low-cost audio
cassette for home and
«portable» use.
• The one to the left is a
professional recording and
production device. Many
masterpieces from the late
’70s and ’80s were recorded
on a master tape, instead of
a master vinyl as used
before
Analog tapes for computer storage
• Analog tapes were also used to store backups, computer
programs and games
• The sequence of 0s and 1s was encoded as an electrical (digital)
sound signal and stored as if it were real audio
The world-famous IOMEGA zip
tape devices for incremental data backup
One of the much more famous
Commodore 64 cassettes!
Analog photography: 35mm
• Images are stored on 35mm colour or B/W
negative film, or as single 35mm positive slides
• Images can be viewed only when
printed, or by means of
special (and costly) viewers
• Anyway, analog photography is still a good alternative to
digital (better colours and gamut), but:
– You need to shoot carefully (no chance of taking 1,200
pictures and then choosing: it would cost as much as a new car!)
– You will know the result of a photographic session only after
the negative/positive film has been developed
– No EXIF data available! If you want to try different settings, you
must write down/remember all the exposure data by yourself!
Analog video: The VHS
• The VHS was the «DVD system of the 90’s»,
and allowed the large-scale diffusion of
motion pictures, and the recording of the
favourite programs at home.
• The VHS system consists of a magnetic tape,
whose storage technique combines that of the
audio cassette with the coding of analog
video signals
• The video signal is transformed into an
electrical signal in the way we have seen
before, and this signal is recorded on the VHS
tape as the corresponding magnetic pattern
• VHS for PAL and VHS for NTSC were not
compatible, due to the difference in frame rate
and in color representation
…and nowadays?
• Digitized audio, video and pictures can be stored in
cheap, high-capacity, reliable(?) and portable
devices
• Is it all gold? What are the pros and cons?