Sound - Computer and Information Science

CISC 1600
Introduction to Multimedia Computing
Unit 3. Multimedia Components
Lecture 3.3 Sound
Notes from M. Meyer (CISC3630) and Multimedia: Making It Work, 8th Ed, by
Tay Vaughan, McGraw Hill, 2011, Ch 4
Overview
 Introduction to sound
 Digital audio
 MIDI audio
 MIDI versus digital audio
 Recording and editing digital audio
 Audio file formats
 Adding sound to multimedia projects
Introduction to Sound
 Vibrations in the air create waves of pressure that are
perceived as sound.
 Sound waves vary in sound pressure level
(amplitude) and in frequency or pitch.
 “Acoustics” is the branch of physics that studies
sound.
 Sound pressure levels (loudness or volume) are
measured in decibels (dB).
Analogue Sound Wave
• Sound wave produced by a
tuning fork.
• The wavelength (frequency)
determines pitch (high-low).
• The amplitude determines
volume.
Volume is measured in decibels (dB).
Frequency is measured in hertz (Hz).
 Hertz (symbol Hz) is the unit
of frequency, defined as the
number of cycles per
second of a periodic event.
 0.5 Hz = one cycle every 2
seconds.
 The most commonly used
multiples are kilo (thousand),
mega (million), giga (billion),
and tera (trillion).
SI multiples for hertz (Hz)
Submultiples                       Multiples
Value      Symbol  Name            Value     Symbol  Name
10^-1 Hz   dHz     decihertz       10^1 Hz   daHz    decahertz
10^-2 Hz   cHz     centihertz      10^2 Hz   hHz     hectohertz
10^-3 Hz   mHz     millihertz      10^3 Hz   kHz     kilohertz
10^-6 Hz   µHz     microhertz      10^6 Hz   MHz     megahertz
10^-9 Hz   nHz     nanohertz       10^9 Hz   GHz     gigahertz
10^-12 Hz  pHz     picohertz       10^12 Hz  THz     terahertz
10^-15 Hz  fHz     femtohertz      10^15 Hz  PHz     petahertz
10^-18 Hz  aHz     attohertz       10^18 Hz  EHz     exahertz
10^-21 Hz  zHz     zeptohertz      10^21 Hz  ZHz     zettahertz
10^-24 Hz  yHz     yoctohertz      10^24 Hz  YHz     yottahertz
Common prefixed units are in bold face.
"Experiments have shown
that a healthy young
person hears all sound
frequencies from
approximately 20 to
20,000 hertz."
Harmonics
Even a single plucked
string on a guitar
produces multiple
sound waves at
different frequencies
(harmonics).
Likewise a sound wave
at a specific frequency
(resonant harmonic) can
cause a string to vibrate
on its own.
Complex Sound Waves
 The harmonics of an
instrument are different
tones, but those
generated sound waves
combine.
 This results in a new,
more complex sound
wave.
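As a rough illustration of how harmonics combine, the sketch below adds a few sine waves at whole-number multiples of a fundamental frequency. The function name `complex_wave`, the 110 Hz fundamental, and the amplitude values are assumptions chosen for the example, not measurements of a real instrument.

```python
import math

def complex_wave(fundamental_hz, harmonic_amplitudes, t):
    """Sum of sine harmonics at time t (in seconds).

    harmonic_amplitudes[k] is the amplitude of harmonic k + 1,
    that is, the sine wave at (k + 1) * fundamental_hz.
    """
    return sum(
        amp * math.sin(2 * math.pi * (k + 1) * fundamental_hz * t)
        for k, amp in enumerate(harmonic_amplitudes)
    )

# Hypothetical tone: a 110 Hz fundamental plus two weaker harmonics,
# evaluated 8000 times over one second.
amplitudes = [1.0, 0.5, 0.25]
wave = [complex_wave(110.0, amplitudes, n / 8000) for n in range(8000)]
```

Changing the relative amplitudes changes the shape of the combined wave while the fundamental (the perceived pitch) stays the same.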
Most Individual Sound Waves Are Complex
Digital Audio
 For a long time, intelligent people sincerely believed
that sound was too complex a phenomenon ever to be
captured and stored digitally.
 To a certain degree they were correct in that there
will always be some loss, some distortion, when
trying to capture audio (of a certain complexity).
 That being said, high-quality digitally stored sound is
indistinguishable from analogue to the human ear.
 In addition, digital audio is cheaper, faster, takes up less
storage space, and in theory can last forever.
Digital Audio (cont)
 Digital audio is audio data stored as 0’s and 1’s.
 Digital audio data is a representation of sound, stored
in the form of numerous individual samples.
 Samples represent the amplitude (or loudness) of
sound at a discrete point in time.
 The quality of a digital recording depends on the
sampling rate (or frequency), that is, the number of
samples taken per second, and on the bit depth.
Digital Representation of a Sound Wave
 Sampling involves measuring the amplitude of the
analog wave at discrete intervals
 The frequency of sampling is known as sampling rate
 Each sample is typically stored in a value ranging
from 4 to 24 bits in size
 The size of the sample value in bits is known as the
‘bit depth’
 Most music CDs have a sample rate and bit depth of
44.1 kHz (samples/sec) and 16 bits (sample size)
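The sampling and bit-depth ideas above can be sketched in a few lines. This is a minimal illustration, assuming a pure sine tone as the "analog" input; the function name `sample_sine` and the 440 Hz test tone are made up for the example.

```python
import math

def sample_sine(freq_hz, sample_rate, bit_depth, duration_s):
    """Sample a full-scale sine wave and quantize each sample to a signed integer."""
    max_level = 2 ** (bit_depth - 1) - 1      # e.g. 32767 for 16 bits
    n_samples = round(sample_rate * duration_s)
    return [
        round(max_level * math.sin(2 * math.pi * freq_hz * n / sample_rate))
        for n in range(n_samples)
    ]

# CD-style settings: 44.1 kHz sample rate, 16-bit samples, 10 ms of audio.
cd_samples = sample_sine(440.0, 44100, 16, 0.01)
```

A lower sample rate yields fewer samples per second, and a lower bit depth yields coarser amplitude steps; both reduce how closely the stored values follow the original wave.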
Digital Audio (continued)
 The three sampling frequencies most often used in
multimedia are CD-quality 44.1 kHz, 22.05 kHz, and
11.025 kHz.
 The sample size is the number of bits used to
describe the amplitude of the sound wave at each
sample; together with the sample rate, it determines
the quality and size of a recording.
Sound Playback Techniques
 Two basic playback methods:
 1. Play sample entirely from memory buffer
 2. Stream data in real-time from storage medium
 Streaming is more memory efficient for very large
audio files, such as music tracks and dialogue.
 Streaming systems use either a circular buffer with
read-write pointers, or a double-buffering algorithm
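A circular buffer with read and write pointers might look roughly like the sketch below. It is a simplified, single-threaded illustration; the class name `CircularBuffer` and its methods are hypothetical, and a real streaming system would also handle threading and underruns.

```python
class CircularBuffer:
    """Fixed-size ring buffer with separate read and write pointers."""

    def __init__(self, capacity):
        self.data = [0] * capacity
        self.capacity = capacity
        self.read_pos = 0
        self.write_pos = 0
        self.count = 0            # samples currently buffered

    def write(self, samples):
        """Store as many samples as fit; return how many were accepted."""
        accepted = 0
        for s in samples:
            if self.count == self.capacity:
                break             # buffer full: the caller retries later
            self.data[self.write_pos] = s
            self.write_pos = (self.write_pos + 1) % self.capacity
            self.count += 1
            accepted += 1
        return accepted

    def read(self, n):
        """Remove and return up to n samples in the order they were written."""
        out = []
        while n > 0 and self.count > 0:
            out.append(self.data[self.read_pos])
            self.read_pos = (self.read_pos + 1) % self.capacity
            self.count -= 1
            n -= 1
        return out
```

The loader writes data read from storage while the playback side reads it; both pointers wrap around to the start, so only a small, fixed amount of memory is needed regardless of file length.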
Sample Playback and Manipulation
Three basic operations you should know:
 Panning is the attenuation of the left and right channels of a
mixed sound.
• Results in spatial positioning within the aural stereo field.
 Pitch allows the adjustment of a sample’s playback
frequency (high-low) in real time.
 Volume control typically attenuates the volume of a
sound.
• Amplification is generally not supported at this stage, as hardware
manipulations (gains) will be applied.
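Volume and panning can both be treated as per-channel attenuation. The sketch below assumes a mono source, gains between 0.0 and 1.0, and a simple linear pan law; the function name and parameter ranges are illustrative choices, and pitch adjustment is not covered here.

```python
def apply_volume_and_pan(mono_samples, volume, pan):
    """Return (left, right) sample lists.

    volume: 0.0 (silent) to 1.0 (unchanged); attenuation only.
    pan:   -1.0 (hard left) through 0.0 (center) to 1.0 (hard right).
    """
    volume = max(0.0, min(1.0, volume))
    pan = max(-1.0, min(1.0, pan))
    # Linear pan law (an assumption): attenuate the channel
    # opposite to the pan direction, leave the other untouched.
    left_gain = volume * min(1.0, 1.0 - pan)
    right_gain = volume * min(1.0, 1.0 + pan)
    left = [s * left_gain for s in mono_samples]
    right = [s * right_gain for s in mono_samples]
    return left, right
```

Because both gains are clamped to at most 1.0, the function can only make a sound quieter, which matches the point that amplification is left to the hardware.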
Digital Audio (continued)
 Crucial aspects of preparing digital
audio files are:
• Balancing the need for sound
quality against available RAM
and hard disk resources
• Setting appropriate recording
levels to get a high-quality and
clean recording
Digital Audio (continued)
 Once a recording has been completed, it almost
always needs to be edited.
 Basic sound editing operations include trimming,
splicing and assembly, volume adjustments, and
working on multiple tracks.
Digital Audio (continued)
 Additional available sound editing operations
include:
• format conversion
• resampling or downsampling
• fade-ins and fade-outs
• equalization
• time stretching
• digital signal processing
• reversing sounds
Normalizing / Normalization
 Peak normalization is an automated process that changes the
level of each sample in a digital audio signal by the same
amount, such that the loudest sample reaches a specified level.
Traditionally, the process is used to ensure that the signal peaks
at 0 dBFS, the loudest level allowed in a digital system.
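Peak normalization as described above can be sketched as follows. The example assumes floating-point samples where 1.0 represents full scale (0 dBFS); the function name `peak_normalize` is made up for illustration.

```python
def peak_normalize(samples, target_peak=1.0):
    """Scale every sample by the same gain so the loudest one hits target_peak.

    Samples are floats where 1.0 stands for full scale (0 dBFS).
    """
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)      # silence: nothing to scale
    gain = target_peak / peak
    return [s * gain for s in samples]
```

Since every sample is multiplied by the same gain, the relative loudness of different parts of the recording is unchanged; only the overall level moves.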
Digital Audio (continued)
 Audio resolution determines the accuracy with which
sound can be digitized.
 Size in bytes of a monophonic digital recording:
sampling rate x (bit resolution / 8) x time in seconds x 1.
 Size in bytes of a stereo recording:
sampling rate x (bit resolution / 8) x time in seconds x 2.
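The size formulas translate directly into code. In this sketch the function name `recording_size_bytes` is hypothetical, and the result is for uncompressed audio data only (no file header).

```python
def recording_size_bytes(sampling_rate, bit_resolution, seconds, channels):
    """sampling rate x (bit resolution / 8) x time in seconds x channels."""
    return int(sampling_rate * (bit_resolution / 8) * seconds * channels)

# One minute of CD-quality stereo: 44.1 kHz, 16 bits, 2 channels.
cd_minute = recording_size_bytes(44100, 16, 60, 2)

# Ten seconds of 22.05 kHz, 8-bit mono.
voice_clip = recording_size_bytes(22050, 8, 10, 1)
```

Here `cd_minute` works out to 10,584,000 bytes and `voice_clip` to 220,500 bytes, which shows how quickly high-quality stereo consumes storage.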
MIDI Audio
 Since they are small, MIDI (Musical Instrument
Digital Interface) files embedded in web pages load
and play promptly.
 The length of a MIDI file can be changed without
affecting the pitch of the music or degrading audio
quality.
 Working with MIDI requires knowledge of music
theory.
MIDI Audio (continued)
 MIDI is a shorthand representation of music stored in
numeric form.
 It is not digitized sound.
 Sequencer software and a sound synthesizer are
required in order to create MIDI scores.
 MIDI is device dependent.
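To make "music stored in numeric form" concrete, the sketch below builds the three bytes of a MIDI note-on message and converts a note number to its equal-tempered frequency. The helper names are hypothetical; the 0x90 status byte, the 0-127 value ranges, and the A4 = note 69 = 440 Hz convention follow standard MIDI practice.

```python
def note_on(channel, note, velocity):
    """Build the three bytes of a MIDI note-on message.

    channel: 0-15, note: 0-127 (60 = middle C), velocity: 0-127.
    """
    if not (0 <= channel <= 15 and 0 <= note <= 127 and 0 <= velocity <= 127):
        raise ValueError("value out of MIDI range")
    return bytes([0x90 | channel, note, velocity])

def note_frequency_hz(note):
    """Equal-tempered pitch of a MIDI note number, with A4 (note 69) = 440 Hz."""
    return 440.0 * 2 ** ((note - 69) / 12)

# Middle C on the first channel at a moderately strong velocity.
message = note_on(0, 60, 100)
```

The message says only which note to play and how hard; the actual sound depends on the synthesizer that receives it, which is why MIDI playback is device dependent.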
MIDI Versus Digital Audio
 MIDI is analogous to structured or vector
graphics, while digitized audio is analogous to
bitmapped images.
 MIDI is device dependent, while digitized
audio is device independent.
 MIDI files are much smaller than
digitized audio.
 MIDI files sound better than digital audio
files when played on a high-quality
MIDI device.
MIDI Versus Digital Audio (continued)
 With MIDI, it is difficult to play back spoken dialog,
while digitized audio can do so with ease.
 MIDI does not have consistent playback quality,
while digital audio provides consistent playback
quality.
 Working with MIDI requires knowledge of music
theory, while digital audio does not have this
requirement.
ADSR Envelope (Attack, Decay, Sustain, Release)
Parting thought
 MIDI, with intelligent design, robust libraries and good
music theory and design, can be spectacular.
 Don’t believe me? As a programmer, you should go and
check this out: http://www.youtube.com/watch?v=Xu-A0jqMPd8
and this: http://www.youtube.com/watch?v=8Z5Z5zo1Rc4
Recording and Editing Digital Audio
 System sounds are assigned to various
system events such as startup and
warnings, among others.
 Macintosh provides several system sound
options, such as Glass, Indigo, and Laugh.
 In Windows, available system sounds
include start.wav, chimes.wav, and
chord.wav.
 Multimedia sound is either digitally recorded
audio or MIDI music.
Audio File Formats
 A sound file’s format is a recognized methodology for
organizing data bits of digitized sound into a data file.
 On the Macintosh, digitized sounds may be stored as
data files (such as AIFF or AIFC), resources, or applications.
 In Windows, digitized sounds are usually stored as
WAV files.
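As one way to see how a file format organizes digitized sound, the sketch below writes a short tone to a WAV file with Python's standard `wave` module and reads the header fields back. The function names, the file path argument, and the half-scale 440 Hz tone are assumptions made for the example.

```python
import math
import struct
import wave

def write_tone_wav(path, freq_hz=440.0, seconds=1.0, sample_rate=44100):
    """Write a mono, 16-bit PCM sine tone to a WAV file."""
    n_frames = round(sample_rate * seconds)
    frames = b"".join(
        # "<h" packs one little-endian signed 16-bit sample at half of full scale.
        struct.pack("<h", round(32767 * 0.5 * math.sin(2 * math.pi * freq_hz * n / sample_rate)))
        for n in range(n_frames)
    )
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)          # mono
        wav_file.setsampwidth(2)          # 2 bytes = 16 bits per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(frames)

def describe_wav(path):
    """Read back the header fields: (channels, bytes per sample, rate, frames)."""
    with wave.open(path, "rb") as wav_file:
        return (wav_file.getnchannels(), wav_file.getsampwidth(),
                wav_file.getframerate(), wav_file.getnframes())
```

The header stores the channel count, sample size, and sample rate, which is what lets a player interpret the raw sample bytes that follow.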
Audio File Formats (continued)
 MP3 compression is a space saver.
 MP4 is used when audio and video are streamed
together.
 AAC (Advanced Audio Coding) is used by Apple’s
iTunes store.
Adding Sound to Multimedia Project
1. File formats compatible with the multimedia authoring
software being used, along with delivery media, must
be determined.
2. Sound playback capabilities offered by end users’
systems must be studied.
3. The type of sound, whether background music,
special sound effects, or spoken dialog, must be
decided.
4. Digital audio or MIDI data should be selected on the
basis of the location and time of use.
Adding Sound to Multimedia Project (cont)
5. Create or purchase source material.
6. Edit the sounds to fit your project.
7. Test the sounds to be sure they are timed
properly with your project.
Adding Sound to Multimedia Project (cont)
 Professional sound quality (44.1 kHz at 16 bits) is
expensive and may require special setup to handle.
 Compression techniques reduce space, but reliability
suffers.
 Space can be conserved by downsampling or
reducing the number of sample slices taken per
second.
 File size of digital recording (in bytes) =
sampling rate x duration of recording (in secs) x (bit
resolution/8) x number of tracks.
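Downsampling can be illustrated by keeping only every Nth sample. This is a deliberately naive sketch: the function name `downsample` is made up, and practical resamplers first apply a low-pass filter to avoid aliasing, which is omitted here.

```python
def downsample(samples, factor):
    """Keep every factor-th sample (naive decimation, no anti-alias filter)."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    return samples[::factor]

# One second of (silent) audio at 44.1 kHz reduced to 22.05 kHz and 11.025 kHz.
one_second = [0] * 44100
half_rate = downsample(one_second, 2)
quarter_rate = downsample(one_second, 4)
```

With the bit resolution and number of tracks unchanged, halving the sample rate halves the file size given by the formula above.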
Adding Sound to Multimedia Project (cont)
 Recording on inexpensive media rather than
directly to disk prevents the hard disk from
being overloaded with unnecessary data.
 The project’s equipment and standards must
be in accordance with the requirements.
 It is vital to maintain a high-quality
database that stores the original sound
material.
Adding Sound to Multimedia Project (cont)
 Keeping track of your sounds
 Audio CDs
• The Red Book standard (IEC 60908) is a standard for
digitally encoding high-quality stereo.
• For this standard, the digital audio sample size is 16
bits and the sampling rate is 44.1 kHz.
• The amount of digital sound information required for
high-quality sound takes up a great deal of disk storage
space.
 Sound for your mobile
 Sound for the Internet
Adding Sound to Multimedia Project (cont)
 Web browsers must be told what to do when they
download file types.
Adding Sound to MM Project (cont)
 Sound and image synchronization must be tested at
regular intervals.
 The speed at which most animations and computer-based
videos play depends on the user’s CPU.
 The sound’s RAM requirements as well as the user’s
playback setup must be evaluated.
 Copyrighted material should not be recorded or used
without securing appropriate rights from the owner or
publisher.
Vaughan’s Law of MM Minimums
 “There is an acceptable minimum level of adequacy that
will satisfy the audience, even when that level may not be
the best that technology, money, or time and effort can
afford.”
 For web, game, and other online MM projects, lower-quality
encodings are often more than sufficient:
• 22 kHz, 8-bit (stereo): very popular and adequate for
voice-quality projects.
• 11 kHz, 8-bit (mono): as low as you can go and still get
usable results (many high frequencies are distorted).
MP3 Example Compression
1. The waveform is separated into small sections (frames).
2. Each frame is analyzed to see what frequencies are
present (aka spectral analysis).
3. The psychoacoustic model is applied. Any information that
matches the psychoacoustic model is retained and the rest is
discarded.
4. Depending on the bitrate, the codec uses the allotted
amount of bits to store this data.
5. Once this has taken place, the result is passed through
lossless Huffman coding (similar to zip compression), which
reduces the size by roughly another 10%.
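Steps 1 and 2 above (framing and spectral analysis) can be sketched with a plain discrete Fourier transform. This is only an illustration of the idea: real MP3 encoders use a filter bank and a modified transform rather than this naive DFT, and the function names and the 64-sample frame size used below are assumptions. The psychoacoustic model and Huffman coding steps are not shown.

```python
import cmath
import math

def split_into_frames(samples, frame_size):
    """Step 1: cut the waveform into consecutive fixed-size frames."""
    return [samples[i:i + frame_size]
            for i in range(0, len(samples) - frame_size + 1, frame_size)]

def magnitude_spectrum(frame):
    """Step 2: naive DFT; returns the magnitude of bins 0 .. N/2."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]

def dominant_frequency(frame, sample_rate):
    """Frequency (Hz) of the strongest bin in one frame."""
    spectrum = magnitude_spectrum(frame)
    peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
    return peak_bin * sample_rate / len(frame)

# Hypothetical input: a 1 kHz tone sampled at 8 kHz, analyzed in 64-sample frames.
tone = [math.sin(2 * math.pi * 1000 * n / 8000) for n in range(256)]
frames = split_into_frames(tone, 64)
```

Once the encoder knows which frequencies are present in each frame, the psychoacoustic model can decide which of them a listener is unlikely to notice.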
MP3, Ogg Vorbis, Licensing & Patent Issues
 The MP3 format is patented
 Any commercial use is subject to licensing terms as
determined by Fraunhofer and Thomson Multimedia, the
holders of the patents
 Ogg Vorbis is similar to MP3 in many ways
 Open source and patent-free (royalty-free)
 Be aware of patent and license restrictions when
using 3rd party software
Summary
 Vibrations in the air create waves of pressure that are
perceived as sound.
 Multimedia system sound is digitally recorded audio or
MIDI (Musical Instrument Digital Interface) music.
 Digital audio data is the actual representation of a
sound, stored in the form of samples.
Summary (continued)
 MIDI is a shorthand representation of music stored in
numeric form.
 Digital audio provides consistent playback quality.
 MIDI files are much smaller than digitized audio.
 MIDI files sound better than digital audio files when
played on a high-quality MIDI device.