Efficient Computation of Regret-ratio Minimizing Set: A Compact Maxima Representative
Abolfazl Asudeh, University of Texas at Arlington
Azade Nazi, University of Texas at Arlington
Nan Zhang, George Washington University
Gautam Das, University of Texas at Arlington
SIGMOD'17. © 2017 ACM. ISBN 978-1-4503-4197-4/17/05

Outline
• Motivation and Problem statement
• 2D-RRMS (Two-Dimensional Regret-Ratio Minimizing Set)
• HD-RRMS (Higher-Dimensional Regret-Ratio Minimizing Set)
• Experiments

Maxima Queries
• … to give the best trade-off between price, duration, number of stops, …
• $f = \sum_i w_i A_i$

Example
[Figure: example 2D dataset plotted on axes X and Y (0 to 1), with the function $f = x + y$]
[Figure: the same dataset with the convex hull (sky convex) highlighted]
• The convex hull is a subset of the skyline, the set of non-dominated points.

Convex hull size problem
• Curvature effect [Figure]
• Effect of the number of attributes (m) [Figure: curves for m = 2, 3, 4, 5, 6]

Regret-Ratio Minimizing Set
• Problem: find a subset of size at most r that minimizes the maximum regret-ratio over all functions.
• Regret: $f(t) - f(t')$
• Regret-ratio: $\frac{f(t) - f(t')}{f(t)}$
• Here $t$ is the best tuple in the dataset for $f$ and $t'$ is the best tuple in the subset.

Overview of the literature, our contributions
• The regret-ratio notion and the problem were first proposed in [Nanongkai et al., VLDB 2010].
• In two-dimensional data:
  ◦ [Chester et al., VLDB 2014]: sweeping line, $O(r \cdot n^2)$
  ◦ We: a dynamic programming algorithm, $O(r \cdot s \cdot \log s \cdot \log c) < O(r \cdot n \cdot (\log n)^2)$, where s is the skyline size and c is the convex hull size.
• In higher-dimensional data:
  ◦ Complexity: NP-complete
    ◦ For arbitrary dimensions: [Chester et al., VLDB 2014]
    ◦ Recently for fixed dimensions: [W. Cao et al., ICDT 2017], [P. K. Agarwal et al., arXiv:1702.01446, 2017]
  ◦ Existing work: (a) a greedy heuristic with an unproven theoretical guarantee, (b) a simple attribute space discretization with a fixed upper bound on the regret-ratio of the output [Nanongkai et al., VLDB 2010].
  ◦ We: a linearithmic-time approximation algorithm that guarantees a regret ratio within any arbitrarily small, user-controllable distance from the optimal regret ratio.
  ◦ Assumption: fixed number of dimensions

2D-RRMS (Two-Dimensional Regret-Ratio Minimizing Set)

High-level idea
[Figure: skyline points $t_0, t_1, \dots, t_7$ ordered from top-left to bottom-right]
• Order the skyline points from top-left to bottom-right, add two dummy points $t_0$ and $t_{s+1}$, and construct a complete weighted graph on these points.
• The weight of an edge is the maximum regret ratio of removing all the points in its top-right half-space ⇒ use binary search.
• Apply dynamic programming: DP($t_i$, $r'$) is the optimal solution from $t_i$ to $t_{s+1}$ with at most $r'$ intermediate steps.
• Running time: $O(r \cdot s \cdot \log s \cdot \log c)$

HD-RRMS (Higher-Dimensional Regret-Ratio Minimizing Set)

Steps
• Start with a conceptual model (RRMS)
• Discuss its problems
• Propose the idea of function space discretization
• Transform RRMS to a Min Max problem (DMM)
• Define the intermediate problem “Min Rows Satisfying a Threshold” (MRST)
• Transform MRST to a fixed-size instance of the Set-cover problem

Conceptual Model
• Transform the problem to a min-max problem.
[Figure: matrix with one row per tuple $t_1, t_2, \dots, t_n$ and one column per function $f$ in F (all possible functions); the cell at row $t_2$, column $f$ is the regret-ratio on $f$ if only $t_2$ remains; objective: Max(Min)]
• Problem 1:
  ◦ F is continuous ⇒ infinite number of columns
  ◦ Matrix discretization
• Problem 2:
  ◦ Even if we could construct the matrix, how do we solve it?
  ◦ Transform to fixed-size set-cover instances

Matrix Discretization
[Figure: discretized function space in two dimensions]
• Arbitrarily small, user-controllable distance from the optimal solution

DMM: Discretized Min Max Problem
[Figure: the matrix M with rows $t_1, t_2, \dots, t_n$, where the columns F (all possible functions) are replaced by a discretized function space; objective: Max(Min)]
• Observation: the optimal regret-ratio is one of the cell values!
• Order the values in M.
• Do a binary search over the values and, for each value:
  ◦ Define an intermediate problem: Min. rows satisfying the threshold (MRST)
  ◦ Convert M to a (fixed-size) binary matrix: 1 if the regret-ratio of t for f is at most the threshold, 0 otherwise
  ◦ Convert MRST to a (fixed-size) set-cover instance
• Practical HD-RRMS: use the greedy approximation algorithm for solving the set-cover instances.
  1. Accept a result if its size is at most a γ-dependent bound: the index size increases, with no change in the quality of the output.
  2. Accept the result if its size is at most r: the index size does not change, output quality may increase.
• For fixed values of m and γ, each set-cover instance can be solved in constant time.
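The DMM steps above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the uniform weight grid used as the discretized function space, and the sample points are all assumptions made for illustration, and attribute values are assumed to be positive. The binary search also treats the greedy set-cover size as if it were monotone in the threshold, which is a heuristic simplification.

```python
import itertools


def discretize_functions(m, gamma):
    """Assumed discretization: every weight takes one of gamma + 1 grid levels."""
    levels = [i / gamma for i in range(gamma + 1)]
    # Skip the all-zero weight vector, which ranks nothing.
    return [w for w in itertools.product(levels, repeat=m) if any(w)]


def score(w, t):
    """Linear ranking function f(t) = sum_i w_i * t_i."""
    return sum(wi * ti for wi, ti in zip(w, t))


def regret_matrix(points, functions):
    """M[i][j]: regret-ratio on function j if only point i remains."""
    best = [max(score(w, t) for t in points) for w in functions]
    return [[(best[j] - score(w, t)) / best[j] for j, w in enumerate(functions)]
            for t in points]


def greedy_cover(M, threshold):
    """Greedy set cover: row i covers column j when M[i][j] <= threshold."""
    n_cols = len(M[0])
    covers = [{j for j in range(n_cols) if row[j] <= threshold} for row in M]
    uncovered = set(range(n_cols))
    chosen = []
    while uncovered:
        i = max(range(len(M)), key=lambda k: len(covers[k] & uncovered))
        gain = covers[i] & uncovered
        if not gain:
            return None  # some function cannot be satisfied at this threshold
        chosen.append(i)
        uncovered -= gain
    return chosen


def hd_rrms_sketch(points, r, gamma):
    """Binary search over the sorted cell values of M, one MRST check per value."""
    functions = discretize_functions(len(points[0]), gamma)
    M = regret_matrix(points, functions)
    values = sorted({v for row in M for v in row})
    lo, hi, answer = 0, len(values) - 1, None
    while lo <= hi:
        mid = (lo + hi) // 2
        chosen = greedy_cover(M, values[mid])
        if chosen is not None and len(chosen) <= r:
            answer = (values[mid], chosen)  # feasible: try a smaller threshold
            hi = mid - 1
        else:
            lo = mid + 1
    return answer


# Hypothetical data: four points over two positive attributes.
points = [(1.0, 0.1), (0.1, 1.0), (0.7, 0.7), (0.2, 0.2)]
print(hd_rrms_sketch(points, r=1, gamma=4))
```

With r = 1 on these four points, the sketch keeps the balanced point (0.7, 0.7), whose worst-case regret-ratio over the grid is about 0.3. This variant corresponds to the second acceptance rule on the slide (accept a greedy result only if its size is at most r).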
⇒ With constant-time set-cover instances, the running time of HD-RRMS is $O(n \log n)$.

Experiments

Setup
• Synthetic data:
  ◦ Three datasets (correlated, independent, and anti-correlated), 10M tuples over 10 ordinal attributes.
• Real-world datasets:
  ◦ Airline dataset: 5.8M records over two ordinal attributes.
  ◦ US Department of Transportation (DOT) dataset: 457K records over 7 ordinal attributes.
  ◦ NBA dataset: 21K tuples over 17 ordinal attributes.

2D-RRMS
[Figure: results on the NBA dataset and the Airline dataset]

HD-RRMS
[Figure: results on the DOT dataset and the NBA dataset]

Thank You!