IT 241 Information Discovery and Architecture
Exam 3
December 5, 2013
Name _____________________________

This exam permits one page of handwritten notes.

1. Interaction concepts. Consider 1) an interactive geographic map as a pixel-level display whose elements are very hierarchical (states have counties, towns have streets, etc.), 2) a very large Excel spreadsheet as a data-level structure, and 3) a full family tree (including siblings, ancestors, and descendants) as a relational/hierarchical structure to answer the following questions. [3 pts each=15]
   a. Describe panning in the map and in the spreadsheet.
   b. How do you distinguish between panning and scrolling?
   c. For which of the three objects would the zooming operation be most appropriate? Why? Describe what happens on a zoom in or out on that object.
   d. Distinguish between selection and filtering in the spreadsheet.
   e. Apply overview+detail to the family tree. If you focused on one person in the tree, what might you show as detail for the person? What would be displayed nearby? What could be displayed further away?
   f. Describe what opportunistic intention is on the map.

2. Describe the issues with the design of the visualization above. Identify at least 4 problems. [12 pts]
   a.
   b.
   c.
   d.

3. Draw a tree map in the box of the hierarchy to the right. Assume sibling nodes have equal weight. [5 pts]
   [Hierarchy figure with nodes A, B, C, D]

4. Draw a directed graph (lines have arrowheads) from the following adjacency matrix. [5 pts]

5. Describe a word cloud. How is it constructed? What could the visualization attributes be based on? [6 pts]

6. A graphic should have three levels of viewing: what is seen 1) at a distance, 2) in the details, and 3) implicitly. Explain the meaning of these levels. [6 pts]

7. Why is a stem-and-leaf graph, such as the one below, considered a good graphic? How does data-ink ratio apply here? [4 pts]

8. Discuss why area and volume are not good choices for visualizing magnitude. [5 pts]

9. Decision trees. [17 pts]
   a. Given the decision tree rule for the above dataset
         IF Sex=Male THEN CreditCardInsurance=No
      determine its accuracy = ___________% and its coverage = ___________%
   b. Draw a decision tree to correspond with these three production rules. (Not all leaves are defined.)
         IF Sex=Male THEN CreditCardInsurance=No
         IF Sex=Female && IncomeRange=30-40K THEN CreditCardInsurance=Yes
         IF Sex=Female && IncomeRange != 30-40K THEN CreditCardInsurance=No
   c. In predicting MagazinePromotion, why is the entropy = 0 bits for Salary="50-60K"?
   d. In predicting WatchPromotion, the entropy for Salary="30-40K" is expressed as
         info([ ___ , ___ ]) = entropy( ____/___ , ___/___ )
      [There are only 3 different numbers in these 6 blanks.]

10. Association rules. [13 pts]
   a. Using the credit card data from the previous page, identify 4 single-item sets that would be generated with a confidence threshold of 33%. (Exclude the age attribute.)
         Single-item set          Number of items
         A.
         B.
         C.
         D.
   b. What pairings of your 4 item sets A-D, if any, also meet the 33% confidence threshold?
   c. If you had the item set pairing (which you may not necessarily have) of Sex=Male and LifeInsPromo=Yes, what two rules could be expressed?
      i.  IF ________________________ THEN _________________________
      ii. IF ________________________ THEN _________________________

11. Given the confusion matrix: [5 pts]
         Actual\Predicted   Cat   Dog   Rabbit
         Cat                  9     1        2
         Dog                  5    10        3
         Rabbit               2     2       16
   a. Total number of dogs = ____________
   b. Number of cats classified as dogs = _______
   c. Number of dogs incorrectly classified = ________
   d. Percent classified correctly (all three categories) = ________

12. Clustering data mining true/false. [7 pts]
   _____ The user must pre-specify the number of clusters.
   _____ Outliers are chosen to initialize the cluster centroids.
   _____ Euclidean distances are preferred in K-means clustering when associating data points to cluster centroids.
   _____ The K-means clustering algorithm determines the final clusters in 2 iterations.
   _____ Different random number seeds are good to experiment with to start the K-means algorithm.
   _____ The Perceptron method is a form of neural network.
   _____ The Perceptron method adjusts its biases away from an instance when it discovers a misclassified instance.

Course feedback regarding the shared lectures with University of Applied Sciences in Germany.

I know this isn’t very anonymous, but I hope you will still give us constructive suggestions about what worked, what didn’t work, and what we might consider doing differently.

What was good about sharing lectures?

What didn’t work for you?

What suggestions do you have for us?

What concerns would you have if we arranged for some short group projects that involved international collaboration in the course?

Thank you!