Dissolved Oxygen Analysis

Dissolved Oxygen Analysis- Part One
Summary Statistics
Of the 360 samples of Chicago water collected, the mean amount of dissolved
oxygen was 8.6325 mg/L, with a standard deviation of 2.471965 mg/L. The
median was 8.75 mg/L, and the mode was 7.6 mg/L. The dissolved oxygen levels ranged
from 0 to 15 mg/L.
[Figure: histogram titled "Dissolved Oxygen in Greater Chicago District (1997)"; x-axis: Dissolved Oxygen (in mg/L); y-axis: Frequency]
The distribution of dissolved oxygen was approximately normal. There were no extreme
outliers skewing the curve in either direction, so the mean and the standard deviation are
good estimates of the center and spread of the observed data.
Confidence Interval
We are 95% confident that the mean amount of dissolved oxygen in Chicago
water is between 8.376285 mg/L and 8.888715 mg/L. Both bounds are well above
the minimum amount of dissolved oxygen (6 mg/L) needed to sustain aquatic life
in waterways. A mean above 6 mg/L indicates that the sampled Chicago waters
are, on average, safe for aquatic life.
Mean                  8.6325
Standard Error        0.130284
Median                8.75
Mode                  7.6
Standard Deviation    2.471965
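The standard error and the confidence interval can be checked from the reported summary values. The following is a minimal sketch, assuming a normal critical value (about 1.96) rather than a t critical value; the variable names are illustrative and not part of the original analysis. The reported bounds are slightly wider than this sketch gives, which is what a t critical value with 359 degrees of freedom would produce.

```python
from math import sqrt
from statistics import NormalDist

# Values reported in the summary table.
n = 360            # number of samples
mean = 8.6325      # sample mean (mg/L)
sd = 2.471965      # sample standard deviation (mg/L)

# Standard error of the mean: sd / sqrt(n).
se = sd / sqrt(n)

# Assumption: normal critical value in place of the t critical value
# with n - 1 degrees of freedom.
z = NormalDist().inv_cdf(0.975)

lower = mean - z * se
upper = mean + z * se

print(f"SE = {se:.6f}")
print(f"95% CI = ({lower:.4f}, {upper:.4f}) mg/L")
```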
Significance Test
Our null hypothesis for this experiment was that the mean dissolved oxygen level
is 6 mg/L or higher. Our alternative hypothesis was that the mean dissolved oxygen level
is less than 6 mg/L. The p-value was 0.975, far above any conventional significance
level, so we do not reject the null hypothesis: the data give no evidence that the mean
dissolved oxygen level in Chicago's waters is below 6 mg/L. In conducting the significance
test, we have to assume that the data were collected using consistent standards; for
example, that the same instruments were used for every sample and that the water samples
were collected randomly. This assumption matters because improper collection methods
could bias the data.
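To make the test concrete, the test statistic can be computed from the reported mean and standard error. This is only a sketch, assuming a one-sample test of the mean against 6 mg/L and a normal approximation for the left-tail p-value. The report does not say how its p-value was computed, so this sketch is not expected to reproduce 0.975 exactly; it only shows the direction of the result.

```python
from statistics import NormalDist

mean = 8.6325      # reported sample mean (mg/L)
se = 0.130284      # reported standard error
threshold = 6.0    # minimum dissolved oxygen for aquatic life (mg/L)

# Test statistic: how many standard errors the sample mean lies above 6 mg/L.
t_stat = (mean - threshold) / se

# Left-tail p-value for the alternative "mean < 6 mg/L".
# Assumption: normal approximation to the t distribution (n = 360).
p_left = NormalDist().cdf(t_stat)

print(f"t = {t_stat:.2f}, left-tail p = {p_left:.3f}")
```

A large positive statistic puts almost all of the probability in the left tail, so a one-sided test against "less than 6 mg/L" cannot reject the null hypothesis.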
Dissolved Oxygen Analysis- Part Two
Summary Statistics
As shown in the scatter plot below, the overall relationship between dissolved oxygen
and temperature is negative. The plot follows a roughly linear form with few outliers.
We conclude that temperature and dissolved oxygen levels are negatively associated:
as temperature increases, the level of dissolved oxygen decreases.
[Figure: scatter plot titled "Dissolved oxygen vs. Temperature"; x-axis: Temperature; y-axis: Dissolved oxygen Levels]
Regression Analysis
The R-Square value of 0.3955 represents the proportion of variation in dissolved
oxygen levels that is explained by its linear relationship with temperature. Therefore,
nearly 40 percent of the variation in dissolved oxygen levels in the Chicago waters
sampled is explained by temperature.
Regression Statistics
Multiple R            0.628882916
R Square              0.395493723
Adjusted R Square     0.393492046
Standard Error        1.951549437
Observations          304
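The entries in the regression table are related by standard formulas, which can be checked with a few lines of arithmetic. This sketch assumes a simple regression with one predictor (temperature); the variable names are illustrative.

```python
from math import sqrt

r_squared = 0.395493723   # reported R Square
n = 304                   # reported number of observations
k = 1                     # assumed: one predictor (temperature)

# Multiple R is the square root of R Square.
multiple_r = sqrt(r_squared)

# Adjusted R Square penalizes R Square for the number of predictors.
adjusted = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

print(f"Multiple R = {multiple_r:.6f}")
print(f"Adjusted R Square = {adjusted:.6f}")
```

Both results agree with the table to the precision shown there.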
The R-Square value falls within its possible range of 0 to 1, but it is not very large.
A value this far below 1 shows that the data do not fit the regression line closely.
Removing the existing outliers could raise this value. Collecting a larger data set would
make the estimate more precise, but would not necessarily increase it.
[Figure: "Dissolved Oxygen vs. Temperature Residual Plot"; x-axis: Temperature; y-axis: Dissolved Oxygen Level (mg/L)]
[Figure: "Temperature Line Fit Plot"; x-axis: Temperature; y-axis: Dissolved Oxygen Level (mg/L); series: Y, Predicted Y]
[Figure: "Temperature Normal Probability Plot"; x-axis: Temperature Sample Percentile; y-axis: Dissolved Oxygen Level (mg/L)]
The normal probability plot shows that the dissolved oxygen values fall close to a line,
but not exactly on it. Because the line is fairly straight, we can assume that the values
are approximately normally distributed. If the outliers were removed from the data, a
clearer line would be formed.
Regression minus Outliers
The R-Square value without the outliers present in the data set is 0.4616. With
the outliers taken out of the data, about 46 percent of the variation in dissolved oxygen
levels is explained by its linear relationship with temperature.
Regression Statistics
Multiple R            0.679402673
R Square              0.461587992
Adjusted R Square     0.459775157
Standard Error        1.726641684
Observations          299
The R-Square value is still low, but it is larger without the outliers present.
The larger the R-Square value, the more closely the data fit the regression line.
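The effect of removing an outlier on R-Square can be illustrated with a small least-squares fit. The readings below are hypothetical, not the report's data, and the rule of dropping the single point with the largest absolute residual is an assumption; the report does not state how its five outliers were identified.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical temperature and dissolved oxygen (mg/L) readings.
temps = [5, 10, 15, 20, 25, 30]
do_levels = [12.5, 11.0, 4.0, 8.9, 8.0, 6.8]

slope, intercept, r2_all = fit_line(temps, do_levels)

# Assumed rule: drop the single point with the largest absolute residual.
residuals = [y - (intercept + slope * x) for x, y in zip(temps, do_levels)]
worst = max(range(len(residuals)), key=lambda i: abs(residuals[i]))
trimmed_t = [x for i, x in enumerate(temps) if i != worst]
trimmed_y = [y for i, y in enumerate(do_levels) if i != worst]

_, _, r2_trimmed = fit_line(trimmed_t, trimmed_y)

print(f"R-Square with all points: {r2_all:.3f}")
print(f"R-Square after dropping one point: {r2_trimmed:.3f}")
```

In this toy example the slope is negative and R-Square rises once the flagged point is removed, which mirrors the direction of the change reported above.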
[Figure: "DO vs. Temp minus outliers Residual Plot"; x-axis: Temperature; y-axis: DO Residuals]
[Figure: "Temperature minus outliers Line Fit Plot"; x-axis: Temperature; y-axis: Dissolved oxygen (mg/L); series: Y, Predicted Y]
[Figure: "Temperature minus outliers Normal Probability Plot"; x-axis: Temperature Sample Percentile; y-axis: Dissolved Oxygen Level (mg/L)]
The normal probability plot without the outliers shows the dissolved oxygen values
falling closer to a line than in the original data. Because the values lie almost
in a straight line, we can assume that they are approximately normally distributed. If more
values on the outermost edges of the scatter plot were taken away, the R-Square value would
likely continue to get larger and the values would fit the regression line more closely.