Statistics and Data Analysis: Wk 11
Recap Wk 10
● Linear regression
● Fitting Gaussian profiles
● Measurement grid rounding errors
● Spectral line calibration (arc lamps)
● Transform analysis
● Fourier analysis (square/sawtooth wave)
● Fast Fourier Transform
Statistics and Data Analysis: Plan
Lectures
● Friday 6th June @ 11am
● Tuesday 10th June @ 4pm
Mock
● Exam: Tuesday 17th June @ 4pm
● Results: Friday 20th June @ 11am
Exam
● Tuesday 24th June @ 4pm
The Fourier Transform
The Fourier Transform is immensely useful in many modern, everyday scenarios:
● earthquake vibrations can be boiled down into their dominant
frequencies, allowing us to better design buildings to withstand
those frequencies
● musical instruments can be better designed to emphasise the
frequencies we care about
● computer data can be ‘lossy compressed’ into only the most
dominant frequencies (e.g., image compression)
● frequency filters can be used to allow a single transmission channel to
carry multiple signals
Discontinuities
Functions with discontinuities require an infinite Fourier series to be
represented exactly.
→ see Dirac delta function
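As a rough illustration (a minimal sketch, not from the lecture): truncating the Fourier series of a square wave at any finite number of terms leaves a persistent overshoot near the discontinuity (the Gibbs phenomenon).

```python
import numpy as np

def square_wave_partial_sum(t, n_terms):
    """Partial Fourier series of a unit square wave:
    f(t) = (4/pi) * sum over odd k of sin(k*t)/k."""
    result = np.zeros_like(t)
    for k in range(1, 2 * n_terms, 2):   # odd harmonics only
        result += (4 / np.pi) * np.sin(k * t) / k
    return result

t = np.linspace(0.0, np.pi, 100_001)
for n in (5, 50, 500):
    approx = square_wave_partial_sum(t, n)
    # The overshoot near the jump does not vanish as more terms are added
    print(n, round(approx.max(), 3))
```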
Discrete Fourier Transform
The Discrete Fourier Transform (DFT) is the equivalent of the
continuous Fourier Transform for signals known only at N instants.
Each instant is separated by the sample time T.
e.g., a 1000 Hz sinusoid, 32 samples, at a sampling rate of 8000 Hz.
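A minimal sketch (in Python/NumPy, with illustrative variable names) of generating the sampled signal in this example:

```python
import numpy as np

f_signal = 1000.0      # sinusoid frequency (Hz)
f_sample = 8000.0      # sampling rate (Hz)
N = 32                 # number of samples
T = 1.0 / f_sample     # sample spacing (s)

n = np.arange(N)
x = np.sin(2 * np.pi * f_signal * n * T)   # the sampled signal x_n

print(x[:8])
```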
Discrete Fourier Transform
We may write the DFT similarly to the Fourier Transform:

X_k = \sum_{n=0}^{N-1} x_n \, e^{-2\pi i k n / N}

where:
N = number of time samples
n = current sample under consideration
x_n = value of the signal at time n
k = current frequency under consideration
X_k = amount of frequency k in the signal (amplitude and phase)
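A minimal direct implementation of this sum (a Python/NumPy sketch, not the course's reference code), with a check against NumPy's built-in FFT:

```python
import numpy as np

def dft(x):
    """Direct evaluation of X_k = sum_n x_n * exp(-2j*pi*k*n/N)."""
    x = np.asarray(x, dtype=complex)
    N = len(x)
    n = np.arange(N)
    k = n.reshape(-1, 1)                    # one row per output frequency k
    return np.exp(-2j * np.pi * k * n / N) @ x

# sanity check against NumPy's FFT
x = np.random.default_rng(1).standard_normal(32)
print(np.allclose(dft(x), np.fft.fft(x)))   # True
```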
DFT Speed
DFTs with many points (> 1 million) are common in many
applications (e.g., modern signal processing).
Directly applying the DFT to a data vector of length N requires N
multiplications and N additions for each of the N output frequencies → the
cost goes as N² floating point operations!
E.g., to compute a million point DFT, a computer capable of doing
one multiplication and addition every microsecond requires a
million seconds (~11.5 days)!
The Fast Fourier Transform Speed
In contrast, the cost of the Fast Fourier Transform (FFT) scales
as N·log N.
Why? Because the standard DFT performs a lot of redundant calculations:
the cyclical nature of the roots of unity (the exponential term) means
that the same values are calculated over and over again as the
computation proceeds.
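A rough way to see the difference in scaling (a sketch; absolute timings depend on the machine, and dft here is the same naive implementation as above):

```python
import time
import numpy as np

def dft(x):
    """Naive O(N^2) DFT (same sketch as above)."""
    n = np.arange(len(x))
    return np.exp(-2j * np.pi * n.reshape(-1, 1) * n / len(x)) @ x

rng = np.random.default_rng(0)
signal = rng.standard_normal(4096)

t0 = time.perf_counter()
dft(signal)                 # O(N^2)
t1 = time.perf_counter()
np.fft.fft(signal)          # O(N log N)
t2 = time.perf_counter()

print(f"naive DFT: {t1 - t0:.3f} s, FFT: {(t2 - t1) * 1e3:.3f} ms")
```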
Folding
A widely used application of Fourier transforms is image reconstruction.
An image is smeared by an instrumental function. This can be due to the
finite resolution of optics in astronomy or of solid-state detectors in
particle physics, but it can also be caused by atmospheric scintillation
('seeing'), etc.
(Figure: HST PSF)
Thus the un-smeared radiation arriving at any point (i.e., a Dirac delta
function) is spread over a larger area by these effects. The smearing
function is called the point spread function (PSF).
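A minimal sketch of this smearing, assuming a Gaussian PSF and a single point source; by the convolution theorem the smearing is a pointwise multiplication in Fourier space (the grid size and PSF width are illustrative choices, not lecture values):

```python
import numpy as np

def smear_with_psf(image, psf):
    """Smear an image with a PSF via the convolution theorem:
    pointwise multiplication in Fourier space (circular boundary)."""
    return np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(psf)))

size = 64
image = np.zeros((size, size))
image[size // 2, size // 2] = 1.0            # a point source (Dirac-delta-like)

y, x = np.indices((size, size)) - size // 2
psf = np.exp(-(x**2 + y**2) / (2 * 3.0**2))  # Gaussian PSF, sigma = 3 pixels
psf /= psf.sum()

observed = smear_with_psf(image, np.fft.ifftshift(psf))
print(observed.max())                        # the point is now spread over many pixels
```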
Point Spread Function
Power Spectrum
The power spectrum is the square of the measured amplitudes of a signal
(equivalently, the squared magnitude of its FFT) plotted against frequency.
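A minimal sketch of computing a power spectrum in Python/NumPy, reusing the 1000 Hz / 8000 Hz example values from earlier (the variable names are illustrative):

```python
import numpy as np

f_signal, f_sample, N = 1000.0, 8000.0, 32
t = np.arange(N) / f_sample
x = np.sin(2 * np.pi * f_signal * t)

X = np.fft.rfft(x)                         # one-sided FFT of the real signal
power = np.abs(X) ** 2                     # squared amplitudes
freqs = np.fft.rfftfreq(N, d=1.0 / f_sample)

print(freqs[np.argmax(power)])             # peaks at 1000.0 Hz
```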
Error due to sampling
Parallel Data Processing
A historical parallel computer...
Parallel Data Processing
In 1837, Charles Babbage first
described his proposed mechanical
general-purpose computer known as
the Analytical Engine.
Sadly, the machine was never built in Babbage's lifetime, owing to funding
and design issues.
“When a long series of identical computations is to be
performed, such as those required for the formation of
numerical tables, the machine can be brought into play so as
to give several results at the same time, which will greatly
abridge the whole amount of the processes.”
General L. F. Menabrea, ‘Sketch of the Analytical Engine invented by Charles Babbage’, 1842
Parallel Data Processing
John von Neumann – the original designer of what is now the standard
computer architecture, which combines an arithmetic logic unit (ALU) and a
central processing unit (CPU) working through a pipeline of operation
codes, together with an external memory (RAM) – wrote extensively about
parallel computing as early as the 1950s (!).
This type of setup is known as the Von Neumann Architecture, or a
Von Neumann Processor.
Why parallel?
We can’t speed up a Von Neumann
Processor indefinitely:
the speed of light limits how quickly we can
get information from A to B!
Additionally, quantum mechanics
limits solid-state fabrication – in
2007, an AMD Opteron built on 65 nm
technology already had just 8 atoms per gate in
its transistors.
Amdahl's Law
Amdahl’s law, first proposed by Gene Amdahl in 1967, is used to
find the expected speed-up to a system when only part of the
system is optimised to use parallel processing.
The law states that the speed-up is given by:

S(N) = \frac{1}{(1 - P) + \frac{P}{N}}

where:
N = number of parallel parts (cores)
P = parallel portion of the system
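A minimal sketch of this formula in Python (the 90% parallel fraction below is an illustrative choice):

```python
def amdahl_speedup(p, n):
    """Expected speed-up when a fraction p of the work is parallelised
    across n cores (Amdahl's law)."""
    return 1.0 / ((1.0 - p) + p / n)

# Even with unlimited cores, a 90%-parallel task can never exceed 10x
for n in (2, 8, 64, 1_000_000):
    print(n, round(amdahl_speedup(0.90, n), 2))
```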
Amdahl's Law
The ‘Lawnmower Law’
http://youtu.be/ehyO7mxeU74