Math 140 Project #1 Instructions – Data Analysis Comparison Report
Collecting, Analyzing and Comparing two Quantitative Data Sets
1. Decide on a question about quantitative data that compares two populations or groups. For example,
suppose we want to find out whether part-time or full-time COC students work more hours.
Notice that I would need two pieces of information from each student: whether the student is full-time or
part-time, and how many hours per week they work. You will then analyze the quantitative variable
(hours of work) for each of the two groups. Be specific about the population and make sure the question
addresses a quantitative variable.
2. Devise a method for taking a sample. The sample does not have to be large or randomly selected, but
a large, random sample is better. Part of your report will be to describe your sampling method and how
well it applies to the population. For example, you may use a voluntary response survey, but in the report
you will then need to say that the sample data will not apply very well to the population. Also talk about
the different kinds of bias. If you write questions for people to answer, make sure to avoid question bias.
3. Collect your data. You should have two quantitative data sets each having at least 20 values.
4. Write a paragraph describing the method used to collect the data and whether the data represent the
population you are after. Also describe the various types of bias that could be present.
5. Use Statcrunch, Minitab or Statcato to analyze your data sets. Include sample statistics, dot plots,
box plots and histograms for both data sets. There should be a total of 7 graphs. Also
copy and paste the summary statistics from the computer program. There should be min, max, mean,
standard deviation, mode, range, median, Q1, Q3, and IQR. Analyze all the sample statistics. Describe
the shape, outliers, center and spread for each data set. Now compare the averages and typical values for
the two groups. Which group had the higher average? Which group had more spread? Do the typical
ranges of the groups overlap? Do you think the difference between the averages was significant? Why?
6. Print out a hard copy of your report, graphs and sample statistics and turn it in.
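
For anyone who wants to double-check the software output from step 5, here is a minimal Python sketch of the same summary statistics. It is not a required part of the project. The function name `summarize` and both data lists are made up for illustration (they are not real survey data), and the quartiles use the standard library's "inclusive" method, which is an assumption: Statcrunch, Minitab and Statcato may compute Q1 and Q3 slightly differently.

```python
import statistics


def summarize(values):
    """Return the summary statistics listed in step 5 for one data set."""
    # Quartile cut points; quartile conventions differ between programs.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "n": len(values),
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),      # sample standard deviation
        "median": statistics.median(values),
        "modes": statistics.multimode(values),  # list, since ties are possible
        "Q1": q1,
        "Q3": q3,
        "IQR": q3 - q1,
    }


# Hypothetical hours worked per week (made-up values for illustration only).
part_time = [0, 10, 12, 15, 15, 16, 18, 20, 20, 20,
             22, 24, 25, 25, 28, 30, 30, 32, 35, 40]
full_time = [0, 0, 5, 8, 8, 10, 10, 10, 12, 12,
             15, 15, 16, 18, 20, 20, 20, 24, 25, 30]

for label, data in [("Part time", part_time), ("Full time", full_time)]:
    print(label)
    for name, value in summarize(data).items():
        print(f"  {name}: {value}")
```

Small differences in Q1, Q3 and IQR between this sketch and your statistics program are expected and do not mean either one is wrong.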
Grading Rubric (150 points total)
1. Introduction Paragraph – Why was this data important or interesting to you? (10 points)
2. Data Paragraph – Discuss how you collected the data, how well it applies to the population,
and whether there were any sources of bias present. (10 points)
3. Dot Plot for each data set – Describe the two dot plots. (10 points)
4. Histograms for each data set – What is the true shape of each of the two data sets? (Must have
an appropriate number of bins) (10 points)
5. Side by Side Boxplot or two separate box plots – Analyze the two boxplots. Are there any
outliers in the data set? Are they mistakes? Should we leave them in the data set or take the
outliers out? (10 points)
6. Sample Statistics and what each statistic tells us about the data sets.
a. Minimum and Maximum Value for both data sets (5 points)
b. Total frequency. How many numbers are in each of the two data sets?
(These do not have to be the same.) (5 points)
c. Mean for each of the two data sets (5 points)
d. Median for each of the two data sets (5 points)
e. Mode for each of the two data sets (5 points)
f. Range for each of the two data sets (5 points)
g. Standard Deviation for each of the two data sets (5 points)
h. Q1 for each of the two data sets (5 points)
i. Q3 for each of the two data sets (5 points)
j. IQR for each of the two data sets (5 points)
7. For each of the two data sets, answer the following question: What is the best measure of Center
and how did it relate to the shape? (10 points)
8. What is the Average value for each data set? Compare the average values for both groups.
Which group had a higher average? Is there a significant difference between the averages?
(10 points)
9. What is the best measure of Spread for each data set and how did it relate to the shape? Write a
sentence explaining the meaning of the best Spreads. Which data set had more spread? (You can
compare IQR to IQR or standard deviation to standard deviation.) (10 points)
10. For each data set answer the following question: Give two numbers that typical values in the
data set fall between (the “Typical Range”): either Q1 and Q3, or the Mean ± Standard
Deviation. Do the typical ranges for the two data sets overlap? (10 points)
11. Conclusion Paragraph – Compare the two groups. What did the data show about the two
populations? Why was your topic important or interesting? (10 points)