Applied Data Mining: Homework 2
Due Date, 2014
Probability Basics
1. Coin flipping (15 points)
Your (purported) “friend” Bernie has bet you that his coin will come up heads 3 times in a row.
a. What distribution does a coin flip follow? What is the parameter of this distribution? If the coin is fair, what is this parameter?
b. What is the probability that three flips in a row come up heads? (Show the calculation.)
c. Suppose heads does come up three times in a row. Do you think Bernie used a fair coin, or did he cheat and use a weighted one? What evidence would support your suspicion?
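If you want to experiment before answering, a coin flip is easy to simulate in R. Here is a minimal sketch, where the value of p below is an assumption you may wish to vary:

p <- 0.5                                      # assumed probability of heads
flip <- rbinom(1, size = 1, prob = p)         # a single Bernoulli draw: 1 = heads, 0 = tails
three.flips <- rbinom(3, size = 1, prob = p)  # three independent flips
all(three.flips == 1)                         # TRUE when all three come up heads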
2. Coin flipping in R (15 points)
Here is code to construct an empirical distribution of the observed frequencies of ‘heads’ in three flips of
a fair coin (one that will come up heads or tails equiprobably). Here we’ve repeated the experiment
10000 times. Copy and paste this into your R console.
results <- list()
for (i in 1:10000) {
  # three flips of a fair coin: 1 = heads, 0 = tails
  coin.tosses <- cumsum(sample(c(0, 1), 3, replace = TRUE))
  # proportion of heads across the three flips (final cumulative count / number of flips)
  results[[i]] <- coin.tosses[length(coin.tosses)] / length(coin.tosses)
}
results <- as.numeric(results)
hist(results, xlim = c(0, 1))
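As a quick follow-up to the simulation, you can ask how often any particular outcome occurred. For example, this one-liner (using the results vector built above) gives the empirical frequency of the all-heads outcome:

mean(results == 1)    # fraction of the 10000 experiments where all three flips were heads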
a. Explain this plot in your own words.
b. We observed 1.0, because ‘heads’ came up in 3/3 = 1 of the flips. Where is this on the plot? In light of this empirical distribution, how likely was this outcome?
3. Sampling in R (from the book) (20 points). Follow along with the ‘gumball’ example in Stanton,
Chapter 7. Please hand in the entire log of the R session (do this in R by selecting ‘File’ → ‘Save as...’; be
sure to include plots separately!).
4. Perceptron (20 points) In lecture3.r there is an implementation of the Perceptron algorithm. You will
work with it a bit for this problem. Be sure to upload your code (.r file) to Canvas. For part (b), also hand in
written answers.
• (a) As written, this will never terminate if the dataset is not separable. For example, try running
it with X2, y2. Edit the routine so that it runs for at most some maximum number of iterations
(make max.iters a parameter to the routine), and make sure that it then runs on X2 and y2, i.e., that
> perceptron(X2,y2)
halts.
• (b) As written, the learning.rate is a constant. Implement a version where this ‘decays’ with time.
Specifically, start the learning rate at some constant c (perhaps 1), then set it so that at iteration t it
equals c/t, where t is the number of iterations taken so far. Experiment with setting c to different
values on the toy dataset (bonus points for playing with other data!): what is the effect? A sketch of
both modifications follows this list.
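For reference, here is a minimal sketch of what the two modifications might look like. It is a self-contained illustration, not the implementation in lecture3.r, and it assumes labels y[i] in {-1, +1}:

# Illustrative perceptron: max.iters caps the passes (part a) and the
# learning rate decays as c/t (part b).
perceptron <- function(X, y, max.iters = 1000, c = 1) {
  w <- rep(0, ncol(X))                    # weight vector
  b <- 0                                  # bias term
  for (t in 1:max.iters) {
    learning.rate <- c / t                # starts at c, shrinks each pass
    mistakes <- 0
    for (i in 1:nrow(X)) {
      if (y[i] * (sum(w * X[i, ]) + b) <= 0) {  # point i is misclassified
        w <- w + learning.rate * y[i] * X[i, ]
        b <- b + learning.rate * y[i]
        mistakes <- mistakes + 1
      }
    }
    if (mistakes == 0) break              # separable case: converged early
  }
  list(w = w, b = b, passes = t)
}

With this version, perceptron(X2, y2) halts after at most max.iters passes over the data even when the dataset is not separable.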
5. Go play with some data (30 points) Go find a (hopefully interesting) dataset online – ANY dataset
(feel free to use tutorials; plenty of datasets also come with R, and tons are already formatted – no munging
requirement here! See, e.g., http://www.r-bloggers.com/datasets-to-practice-your-data-mining/)
– that has a binary (yes or no) outcome. You will try to predict this outcome using the other available
attributes. Specifically, load your dataset into R, then run any two classifiers you’d like (presumably using
existing implementations!). Compare their results by reporting average sensitivity, specificity, and overall
accuracy (a sketch for computing these appears at the end of this problem) – using either cross-validation
or bootstrapping, as discussed in class; either is fine. Report and interpret these results.
To hand in: (1) a description of the dataset that you used (its attributes and what you were trying to
predict); (2) the code you used to load the dataset into R; (3) a description of the classifiers you used
(and which implementations); (4) the empirical results and a discussion of them.
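For the reporting step, all three metrics fall out of a confusion matrix. A minimal sketch, where pred and truth are hypothetical placeholders for your classifier’s predictions and the true labels:

# toy placeholders; substitute your classifier's output and the true outcome
truth <- c("yes", "no", "yes", "no", "yes")
pred  <- c("yes", "no", "no",  "no", "yes")

tp <- sum(pred == "yes" & truth == "yes")   # true positives
tn <- sum(pred == "no"  & truth == "no")    # true negatives
fp <- sum(pred == "yes" & truth == "no")    # false positives
fn <- sum(pred == "no"  & truth == "yes")   # false negatives

sensitivity <- tp / (tp + fn)               # fraction of true "yes" cases caught
specificity <- tn / (tn + fp)               # fraction of true "no" cases caught
accuracy    <- (tp + tn) / length(truth)    # overall fraction correct

Averaging these quantities over cross-validation folds (or bootstrap replicates) gives the figures to report.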