Math Statistics MATH 203
Student: Nicholas Nishikawa
Student Number: 260316198
Professor: David Wolfson
Assignment #1
1) pg. 46,
2.29 a) By adding the percentages of aftershocks with intensity between 1.5 and 2.5, I estimate
that approximately 68% of the 2,929 aftershocks measured between 1.5 and 2.5 on the Richter
scale.
b) By adding the percentages after 3.0 on the Richter scale, I estimate that about 9% of the
aftershocks are greater than 3.0.
2) pg. 48,
2.37
a) The graph is roughly symmetrical around the value 5.5, with a slight positive skew.
b) There are two observations that are unusually large (two at 14.5). There is one low value of
3.3. All other values are within one standard deviation of the mean.
3) pg.59,
2.59
Data (ammonia concentrations):
1.53
1.50
1.37
1.51
1.55
1.42
1.41
1.48
Data (ammonia concentrations) in Order from Lowest to Highest:
1.37
1.41
1.42
1.48
1.50
1.51
1.53
1.55
a) Determining the sample mean:
\[ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \]
\[ \bar{x} = \frac{1}{8} \sum_{i=1}^{8} x_i = \frac{1.37 + 1.41 + 1.42 + 1.48 + 1.50 + 1.51 + 1.53 + 1.55}{8} \]
\[ \bar{x} = \frac{1}{8}(11.77) \]
\[ \bar{x} = 1.47125 \]
b) With 8 observations, the sample median for this data set is the average of the two middle ordered values:
\[ m = \frac{1.48 + 1.50}{2} \]
\[ m = 1.49 \]
c) The sample mean for this data was 1.47125. It was pulled lower by the low observation of
1.37, which brings the average below the sample median of 1.49. Because there are only 8
observations, each one has a strong pull on the sample mean.
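The mean and median from parts a and b can be checked with a short script. This is a minimal sketch that uses Python's standard `statistics` module on the eight concentrations summed in part a; the variable names are illustrative only.

```python
from statistics import mean, median

# The eight ammonia concentrations used in the mean calculation of part a.
ammonia = [1.37, 1.41, 1.42, 1.48, 1.50, 1.51, 1.53, 1.55]

sample_mean = mean(ammonia)
# With an even number of values, median() averages the two middle ones.
sample_median = median(ammonia)

print(f"mean   = {sample_mean:.5f}")
print(f"median = {sample_median:.2f}")
# True when the low observations pull the mean below the median (part c).
print("mean below median:", sample_mean < sample_median)
```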
4) pg. 60
2.66 *Sample mean and median were calculated using the equations from question 2.59. Data
used can be found on page 61.
a) Sample mean: 12.82. The average number of ant species at the 11 sites is 12.82.
Sample median: 5. Half of the observations lie at or above this value and half lie at or below it.
Sample mode: 4 and 5. These are the most frequently repeated data values.
b) The median is the most appropriate measure of central tendency for this data set. The mean
is pulled upward by outliers (12.82 against a median of 5), which makes it a poor representation
of the entire set.
c) Dry Steppe Region:
Sample mean: 40.4
Sample median: 40
Sample mode: 40
d) Gobi Desert Region:
Sample mean: 28
Sample median: 26
Sample mode: 30
e) Yes, the center of the distribution for plant cover percentage is different between the two
areas. The mean and median plant cover percentages are higher in the Dry Steppe
Region (40.4 and 40) than in the Gobi Desert Region (28 and 26), and the mode is higher as well (40 against 30).
5) pg. 67
2.83
Data:
1.53
1.50
1.37
1.51
1.55
1.42
1.41
1.48
Data Ordered:
1.37
1.41
1.42
1.48
1.50
1.51
1.53
1.55
a) Sample range is the largest observation minus the smallest:
\[ R = 1.55 - 1.37 \]
\[ R = 0.18 \]
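b) Variance of the data is defined by the following equation:
\[ s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 \]
\[ \bar{x} = \frac{1.37 + 1.41 + 1.42 + 1.48 + 1.50 + 1.51 + 1.53 + 1.55}{8} = 1.47125 \]
\[ \therefore s^2 = \frac{1}{7} \sum_{i=1}^{8} (x_i - 1.47125)^2 = \frac{1}{7} \left[ (1.37 - 1.47125)^2 + (1.41 - 1.47125)^2 + \dots + (1.55 - 1.47125)^2 \right] \]
\[ s^2 = \frac{0.0286875}{7} \approx 0.0040982 \]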
c) Sample standard deviation is defined by the following equation:
\[ s = \sqrt{s^2} \]
\[ s = 0.06402 \]
d) The standard deviation is 1.45 ppm in the morning and 0.06402 ppm in the afternoon (from
part c), so there is more variation during the morning.
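The range, variance and standard deviation worked out by hand above can be cross-checked numerically. The sketch below assumes Python's standard `statistics` module, whose `variance` and `stdev` functions use the same n - 1 divisor as the formula in part b; the variable names are illustrative.

```python
from statistics import stdev, variance

# The eight ammonia concentrations listed under "Data" for this question.
ammonia = [1.53, 1.50, 1.37, 1.51, 1.55, 1.42, 1.41, 1.48]

sample_range = max(ammonia) - min(ammonia)  # largest minus smallest
s2 = variance(ammonia)                      # sample variance, divides by n - 1
s = stdev(ammonia)                          # square root of the sample variance

morning_sd = 1.45  # ppm, the morning value quoted in part d

print(f"range = {sample_range:.2f}")
print(f"s^2   = {s2:.7f}")
print(f"s     = {s:.5f}")
print("morning more variable:", morning_sd > s)
```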
6) pg. 73
2.97 Hand Washing
a) For hand rubbing, about 95% of the data will fall within two standard deviations of the mean.
Therefore the higher end of the range that contains 95% of data is:
\[ 35 + 2(59) = 153 \]
The lower end, 35 - 2(59) = -83, is negative, so it is set to 0 because there cannot be fewer than 0 bacteria.
Thus, the range that contains about 95% of the data is 0 to 153.
b) For hand washing, about 95% of the data also falls within two standard deviations of the mean.
Therefore the higher end of this range is:
\[ 69 + 2(106) = 281 \]
The lower end, 69 - 2(106) = -143, is again negative, so it is set to 0.
Thus, the range that contains about 95% of the data is 0 to 281.
c) Based on the data above, hand rubbing appears to be more effective than hand washing: its
mean is lower, and the upper end of the range that contains about 95% of the data is much lower.
This means that, on average, there are fewer bacteria on the hands of people who use rubbing
alcohol than on the hands of people who wash.
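The two intervals in parts a and b follow the same recipe: mean plus or minus two standard deviations, with the lower end cut off at 0 because a bacteria count cannot be negative. A minimal sketch of that recipe is below; the function name `two_sd_interval` and its `lower_bound` parameter are hypothetical helpers introduced only for illustration.

```python
def two_sd_interval(mean, sd, lower_bound=0.0):
    """Return (low, high) for mean +/- 2*sd, with low clipped at lower_bound."""
    low = max(lower_bound, mean - 2 * sd)  # a count cannot fall below the bound
    high = mean + 2 * sd
    return low, high

# Means and standard deviations used in parts a and b.
rubbing = two_sd_interval(35, 59)
washing = two_sd_interval(69, 106)

print("hand rubbing:", rubbing)
print("hand washing:", washing)
```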
7) pg.87
2.125 Box Plot
a) The median of this horizontal box plot is approximately 4.
b) The upper quartile is approximately 6 and the lower quartile is approximately 3.
c) Interquartile range is approximately defined by the following equation:
\[ \mathrm{IQR} = 6 - 3 = 3 \]
Therefore, the interquartile range is approximately 3.
d) The horizontal box plot is skewed to the right (positive skew): the data are more spread out
above the median than below it.
e) 50% of the data is to the right of the median. 75% of the data is to the left of the upper
quartile.
f) There are three outliers on this box plot. They are identified by asterisks. These are at the
values of approximately 12, 13 and 16.
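As a rough check on part f, the approximate quartiles read in part b can be used to compute outlier fences. This sketch assumes the common 1.5 x IQR rule for flagging outliers, which may not be the exact rule used to draw the box plot; the quartiles and outlier positions are the approximate values read off the plot above.

```python
# Approximate quartiles read from the box plot (part b).
q1, q3 = 3, 6
iqr = q3 - q1

# Assumed rule: points beyond 1.5 * IQR from the quartiles are flagged.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Approximate positions of the asterisks (part f).
flagged = [12, 13, 16]
beyond = [x for x in flagged if x < lower_fence or x > upper_fence]

print(f"IQR = {iqr}, fences = ({lower_fence}, {upper_fence})")
print("points beyond the fences:", beyond)
```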
8) pg.87
2.127 Doctorfish of Kangal
a) The approximate 25th percentile or the lower quartile in the pre-treatment PASI score is 10.
The approximate median for pre-treatment is 15. The approximate 75th percentile or the upper
quartile in the pre-treatment PASI score is 28.
b) The approximate 25th percentile or lower quartile post-treatment PASI score is 3. The
approximate median post-treatment is 5. The approximate 75th percentile or upper quartile post-treatment is a PASI score of 7.
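c) The technique of ichthyotherapy appears to be very effective in treating psoriasis. The box
plot on page 87 shows that PASI scores drop substantially after treatment: the approximate
post-treatment upper quartile (7) is below the approximate pre-treatment lower quartile (10).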