Paper Title (use style: paper title)

Marc Jurchak
ECE 3512: Signals – Continuous and Discrete
Department of Electrical and Computer Engineering, Temple University, Philadelphia, PA 1912
I. PROBLEM STATEMENT
This assignment has two parts. In the first, we fit a regression line to the Google stock prices to obtain a
linear estimate of the signal's average. We are given the stock prices as well as a filtered version of the
prices, computed with a frame size of 1 and a window size of 7.
In the second part, we plot a histogram of a speech signal using a bin width of 10. We produce separate plots
for the positive and negative amplitudes, as well as the cumulative distribution function (CDF). Each histogram
is plotted as a function of the bin center value.
II. APPROACH AND RESULTS
We implement the regression line $y = mx + b$, where $m$ is the slope and $b$ is the intercept, using the
following equations:
\[ m = \frac{\bar{x}\,\bar{y} - \overline{xy}}{(\bar{x})^2 - \overline{x^2}} \]
\[ b = \bar{y} - m\bar{x} \]
Here $\bar{y}$ is the average of all the $y$ values of our signal, i.e.
\[ \bar{y} = \frac{1}{N}\sum_{i=1}^{N} y_i \]
The averages $\bar{x}$, $\overline{xy}$, and $\overline{x^2}$ are defined in the same way. The regression line is
plotted over the stock prices and the averaged signal in Figure 1.
Figure 1: Google stock price, windowed and framed average, and regression line.
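The slope and intercept expressions can be checked on a small example. The following is a minimal sketch, not the assignment code: the data in x and y are hypothetical test values (not the Google prices), and the variable names are chosen here for illustration. It computes m and b from the four averages and compares them with MATLAB's built-in polyfit, which performs a first-order least-squares fit.

```matlab
% Hypothetical test data (NOT the Google prices): roughly y = 2x + 1 plus noise
x = 1:8;
y = [3.1 4.9 7.2 8.8 11.1 13.0 14.9 17.2];

% Averages that appear in the closed-form expressions
x_bar  = mean(x);        % average of x
y_bar  = mean(y);        % average of y
xy_bar = mean(x .* y);   % average of the products x*y
x2_bar = mean(x .^ 2);   % average of x^2

% Slope and intercept of the regression line y = m*x + b
m = (x_bar * y_bar - xy_bar) / (x_bar^2 - x2_bar);
b = y_bar - m * x_bar;

% Cross-check against the built-in first-order least-squares fit
p = polyfit(x, y, 1);    % p(1) is the slope, p(2) is the intercept
fprintf('closed form: m = %.4f, b = %.4f\n', m, b);
fprintf('polyfit:     m = %.4f, b = %.4f\n', p(1), p(2));
```

Both lines of output should agree to within rounding, which suggests the averaged form is equivalent to an ordinary least-squares line fit.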
For part 2, we use the “histcounts” function to compute the histogram counts. We begin by defining the bin edges
from -5 to 32767 for the positive amplitudes, and from -32767 to 5 for the negative amplitudes, in increments of 10.
We also define the center values of these bins in another vector. Next we normalize the histograms by dividing the
counts by N, the number of samples. We then concatenate the two histograms to get the overall distribution of the
signal. Finally we use the “cdfplot” function to plot the CDF of the signal. Plots are shown below.
Figure 2: Histogram of positive amplitude probability
Figure 3: Histogram of negative amplitude probability
Figure 4: Histogram of all amplitudes
Figure 5: CDF of entire signal
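The histogram steps described above can be illustrated on a short signal. The following is a minimal sketch under stated assumptions: the vector x is a hypothetical integer-valued test signal (not the speech recording), the edge range of -25 to 25 is chosen only to fit that test signal, and cumsum is used here in place of cdfplot to obtain a binned CDF.

```matlab
% Hypothetical integer-valued test signal (NOT the speech recording)
x = [-12 -7 -3 -1 0 0 1 2 4 4 6 9 13 18];

width   = 10;                          % bin width, as in the assignment
edges   = -25:width:25;                % one edge vector covering both signs
centers = edges(1:end-1) + width/2;    % center of each bin

counts = histcounts(x, edges);         % number of samples in each bin
pmf    = counts / numel(x);            % normalize so the values sum to 1
cdf    = cumsum(pmf);                  % running sum of the PMF

% Plot the normalized histogram against the bin centers
plot(centers, pmf)
xlabel('amplitude')
ylabel('probability')
```

Using a single edge vector that spans both signs places every sample in exactly one bin, so the normalized values sum to 1 and the last element of cdf is 1.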
III. MATLAB CODE
Part 1
clc;
google_stock = xlsread('google_v00.xlsx');
%test = [1 7 4 9 8 12 10 15];
%signal = [2 1 0 3];
google_stock = google_stock(:,4)';
[signal,variance] = compute_avg_and_var_v01(google_stock, 1, 7);
signal = signal';
Y = signal; % averaged signal (frame size 1, window size 7)
X = 1:length(signal); % sample index, one per day
N = length(signal); % number of data points
%linear_avg = ones(1,length(signal))*mean(signal);
% accumulators for the sums of x, y, x*y and x^2
x_avg_sum = 0;
y_avg_sum = 0;
xy_avg_sum = 0;
x_sq_avg_sum = 0;
% sum all x, y, x*y and x^2 data points
for i = 1:N
    x_avg_sum = x_avg_sum + X(i);
    y_avg_sum = y_avg_sum + Y(i);
    xy_avg_sum = xy_avg_sum + X(i)*Y(i);
    x_sq_avg_sum = x_sq_avg_sum + X(i)^2;
end
% divide each sum by N to obtain the average values
x_avg_sum = x_avg_sum / N;
y_avg_sum = y_avg_sum / N;
xy_avg_sum = xy_avg_sum / N;
x_sq_avg_sum = x_sq_avg_sum / N;
% slope (m) and intercept (b) of the regression line
m = ((x_avg_sum * y_avg_sum) - xy_avg_sum)/((x_avg_sum^2) - x_sq_avg_sum);
b = y_avg_sum - m * x_avg_sum;
regression_line = m.*X + b;
plot(X,regression_line, '-.r',X,signal, 'b', X,google_stock, 'k')
legend('regression line','average','signal','Location','northwest')
axis([0 2618 0 650]);
xlabel('days')
ylabel('stock')
title('Google Stock and Averages')
Part 2
clc;
fp = fopen('rec_01_speech.raw','r');
speech = fread(fp,inf,'int16');
fclose(fp);
% test signal
a = [1 1 1 1 2 2 2 3 3 4 4 4 4 4 5 5 5 5 5 6 6 6 6 6 7 8 9 10 11 12 13 14 15 ...
    16 17 18 19 20 21 ...
    19 18 18 17 17 17 16 16 16 16 15 15 15 14 14 14 13 13 13 13 13 12 12 11 ...
    11 11 10 10 10 9 9 8 7 7 7 6 5 4 3 2];
% bin edges for positive amplitudes (bin width 10)
posedge = -5:10:32767;
% bin edges for negative amplitudes (bin width 10)
negedge = -32767:10:5;
% center of each bin: left edge plus half the bin width
posmiddle = posedge(1:end-1) + 5;
negmiddle = negedge(1:end-1) + 5;
% histogram counts for positive and negative amplitudes
posN = histcounts(speech,posedge);
negN = histcounts(speech,negedge);
% normalize the counts by the number of samples to get probabilities
posN_norm = posN./length(speech);
negN_norm = negN./length(speech);
% combine histograms and bin centers so the entire PMF can be seen
neg_and_pos_norm = horzcat(negN_norm,posN_norm);
neg_and_pos_middle = horzcat(negmiddle,posmiddle);
figure(1)
plot(posmiddle,posN_norm)
ylabel('probability')
xlabel('amplitude')
title('positive amplitudes')
figure(2)
plot(negmiddle,negN_norm)
ylabel('probability')
xlabel('amplitude')
title('negative amplitudes')
figure(3)
cdfplot(speech)
figure(4)
plot(neg_and_pos_middle, neg_and_pos_norm)
ylabel('probability')
xlabel('amplitude')
title('all amplitudes')
% total probability: equals 1 only if every sample falls in exactly one bin
% (posedge and negedge overlap near 0, so samples there are counted twice)
total = sum(neg_and_pos_norm)
IV. CONCLUSIONS
For part 1, we found it most helpful to take a closer view of the averages. Figure 6 shows a region where both the
regression line and the averaged signal can be seen clearly.
Figure 6: Close up of Google stock price averages
This plot shows that the regression line captures the long-term behavior of the signal. While our averaged
signal has a frame size of 1 and averages out the short-term behavior, the regression line can be thought of as the
signal averaged over a single frame spanning the entire record. Thus, while the frame-size-1 average filters the
signal to a smoothed approximation, the regression line filters it to the maximum degree. When investing in stocks
for the long term, we are essentially predicting the slope of the regression line and hoping that prices will increase
steadily. The frame-size-1 signal is useful for short-term investing, so that we are not thrown off by the usual
spikes and sags about the mean.
For part 2, we see that plotting the frequency of each amplitude is a useful way to predict which amplitudes
are more likely to occur. When we normalize the plots so that they sum to 1, the result is essentially the PMF of
the signal. Even if a specific amplitude never occurs, we can reasonably predict its probability from its
neighboring amplitudes. Figures 2, 3 and 4 show that the positive and negative amplitudes mirror each other
about the y axis. This makes sense, since sound signals return to 0 amplitude quite often but reach fringe
amplitudes only in short spurts. The CDF also makes sense, since it is the running sum of the PMF (the discrete
analogue of an integral). It increases slowly at first, has its maximum slope at 0, and then levels off as it
approaches 1, because the PMF is small at its edges and largest at 0.
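The link between the PMF and the CDF used in this argument can be written out explicitly. As a sketch, assume the bin centers are ordered $a_1 < a_2 < \dots < a_K$ and write $p(a_k)$ for the normalized histogram value of bin $k$ and $F(a_k)$ for the binned CDF; these symbols are introduced here for illustration and do not appear in the code. The last relation assumes every sample falls in exactly one bin.

```latex
F(a_k) = \sum_{j=1}^{k} p(a_j), \qquad
F(a_k) - F(a_{k-1}) = p(a_k), \qquad
F(a_K) = \sum_{j=1}^{K} p(a_j) = 1
```

The middle relation says that the step of the CDF at each bin equals the PMF value there, so the CDF is steepest where the PMF peaks (near 0) and nearly flat at the extreme amplitudes.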
