Data Mining Cave-Ins
Roland Minton, Roanoke College
Golf by the Numbers, JHU Press

• System of volunteers and lasers
• 1.2 million shots (20,000 rounds) per year, 2004-present
• PGA Tour events only

{R,PGATOUR,2008,380,01035,745,027, Tommy,Armour III,01,The Barclays,Ridgewood Country Club, 15, 3,03,155,01,S,TeeBox,,Green, Unknown,05171,05400,N,N,, 00000000228,0907,Good,With,Level, 9,922.2740, 8,232.4720, 74.6350,00022,00000}

Where Did the Ball Stop?
• It started 5171 inches away.
• It traveled 5400 inches.
• It ended 228 inches away.
• The ball went too far (long).
• The ball was not on line.

Da Secret Code
• It actually is the result of rounding.
• When I “unrounded” the data,

Da Secret Code
• The inner branch is approximately

Da Secret Code
• If B changes to B – 1 and x = 0,
Assume that B, d about the same, and convert to feet.

Coming Up Short
Means of y-values

Different Strokes
Std. deviations of y-values

Going Offline
Means of |x|-values
Std. dev. of |x|-values

Correlations
• Sort the data by tournament.
• Correlate percentage of putts made from a given distance and score.
• The first time I did this I got

Correlations
• From 0-3 feet, -.150
• From 3-4 feet, -.104
• From 4-5 feet, -.061
• From 5-6 feet, -.034
• From 10-15 feet, -.053
• From 15-20 feet, .110
• From 20-25 feet, .229

Correlations
• From 20-25 feet, the scatter plot is

Correlations
• When only those who made the cut were included in the calculation,
• From 15-20 feet, -.051
• From 20-25 feet, -.008
• From 25-30 feet, -.096
• What does this mean?

Lake Wobegon Open
• Ranking system for bunker (sand) play:
• For given player, for each bunker shot
• Compare “score” to “average score”.
• Add up all differences and divide by total number of bunker shots.

Lake Wobegon Open
• Compare “score” to “average score”.
• Distance before B, distance after A
• Replace A with tour avg. # putts from A
• Find tour avg. distance after from B
• Replace with tour avg. # putts
• Difference is value of shot.

Lake Wobegon Open
• Ranking of tour players looks right.
• In 2008, #1 Mike Weir 0.147
• The sum of all the ratings is 1.084.
• (These guys are good!)
• So A – A > 0 !

Lake Wobegon Open
• Two problems with averaging.
• 1) The function avgputts(d) is not linear but is concave down.
• So 5 feet better beats 5 feet worse.

Lake Wobegon Open
• Two problems with averaging.
• 2) “All” is not the same as “all.”
• The tour average is ALL shots, but I only computed ratings for the most active 230 players.
• For this type of shot, the regular players are better than the irregulars.

Any Questions?
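A minimal sketch of the shot-value bookkeeping from the Lake Wobegon Open slides, and of why a concave-down avgputts(d) pushes the average rating above zero. The formula used for avgputts here is a hypothetical stand-in, chosen only because it is increasing and concave down; it is not fitted to any tour data. The 15-foot tour-average leave is likewise an assumed value for illustration.

```python
import math


def avgputts(d_feet: float) -> float:
    """Hypothetical stand-in for the tour-average number of putts from d feet.

    Assumption: any increasing, concave-down curve works for this illustration.
    """
    return 1.0 + 0.35 * math.log1p(d_feet)


def shot_value(avg_after_ft: float, actual_after_ft: float) -> float:
    """Value of a shot, following the slide's recipe.

    Average putts from the tour-average distance after, minus average putts
    from the distance actually left (A). Positive means better than average.
    """
    return avgputts(avg_after_ft) - avgputts(actual_after_ft)


avg_after = 15.0  # assumed tour-average distance after the shot, in feet

better = shot_value(avg_after, avg_after - 5.0)  # finished 5 feet closer
worse = shot_value(avg_after, avg_after + 5.0)   # finished 5 feet farther
mean_value = (better + worse) / 2.0

print(f"5 ft better than average: {better:+.3f}")
print(f"5 ft worse than average:  {worse:+.3f}")
print(f"mean of the two values:   {mean_value:+.3f}")
```

Two shots that miss the average leave by the same 5 feet in opposite directions do not cancel: the gain from the closer shot is larger than the loss from the farther one, so the mean value is positive. Under these assumptions, that is one way the ratings can sum to more than zero even when the distances themselves average out.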