Promotion Analysis in Multi-Dimensional Space
VLDB 2009
Tianyi Wu1, Dong Xin2, Qiaozhu Mei2, Jiawei Han1
1University of Illinois, Urbana-Champaign, Urbana, IL, USA
2Microsoft Research, Redmond, WA, USA
Presenter : Chun Kit Chui (Kit)
Supervisor : Dr. Ben Kao
Content
Introduction
Promotion analysis problem
Problem definition
Promotiveness measure
The basic query execution framework
Subspace pruning
Object pruning
Promotion cube
Experimental evaluations
Conclusion
Introduction
Promotion has been playing
a key role in marketing…
Book sales database
Retailer  Category    Readership           Year  #Sales
A         Sci & Tech  College students     2009  20
A         Sci & Tech  University students  2009  5
A         Comedy      University students  2009  9
B         Sci & Tech  College students     2009  20
B         Sci & Tech  University students  2009  7
B         Fiction     University students  2009  5
B         Comedy      Kindergarten         2009  20
B         Comedy      College students     2009  10
C         Sci & Tech  College students     2009  12
…         …           …                    …     …
A         Sci & Tech  College students     2010  22
A         Sci & Tech  University students  2010  4
A         Comedy      College students     2010  1
B         Sci & Tech  College students     2010  13
B         Sci & Tech  University students  2010  30
B         Fiction     University students  2010  5
B         Comedy      Kindergarten         2010  20
B         Comedy      College students     2010  10
C         Sci & Tech  College students     2010  16
C         Comedy      Kindergarten         2010  52
Introduction
Promotion has been playing
a key role in marketing…
What is the rank of our book sales
among other retailers?
Manager of
retailer A
Introduction
Promotion has been playing
a key role in marketing…
What is the rank of our book sales
among other retailers?
Manager of
retailer A
Global aggregate result
E.g. To compute the aggregate
value of this cell, we project all
tuples with Retailer = “A” and
sum up their sales.
Retailer  #Sales
A         61
B         180
C         80
We rank 3rd among all book retailers!
Introduction
Promotion has been playing
a key role in marketing…
What is the rank of our book sales
among other retailers?
Manager of
retailer A
Discover the most interesting
subspaces where our brand
(Retailer A) is highly ranked among
other competitors.
Introduction
Book sales database
Promotion has been playing
a key role in marketing…
What is the rank of our book sales
among other retailers?
Manager of
retailer A
Discover the most interesting
subspaces where our brand
(Retailer A) is highly ranked among
other competitors.
Global aggregate result
Retailer  #Sales
A         61
B         180
C         80
We rank 3rd among all book retailers!
Aggregate result in the subspace { Category = Sci & Tech, Readership = College students }
Retailer  Category    Readership        #Sales
A         Sci & Tech  College students  42
B         Sci & Tech  College students  33
C         Sci & Tech  College students  28
We are the top-1 bookseller in the
{ Readership = College Students,
Category = Sci & Tech } segment !!!
Introduction
Promotion has been playing
a key role in marketing…
What is the rank of our book sales
among other retailers?
Manager of
retailer A
Global rank
Full space
Low cost
Single SQL.
Compare with ALL objects in ALL aspects.
May not be interesting.
We rank 3rd among all book retailers!
Promotion query
Local rank
Subspaces
High cost
A naïve approach is to compute the rank for ALL possible subspaces and return the interesting ones.
Compare with objects in a certain area.
A globally low-ranked object may become prominent in some subspaces.
Discover the most interesting
subspaces where our brand
(Retailer A) is highly ranked among
other competitors.
We are the top-1 bookseller in the
{ Readership = College Students,
Category = Sci & Tech } segment !!!
Introduction
Promotion has been playing
a key role in marketing…
Terminology, illustrated on the book sales database
Object dimension: Retailer
Target object: Retailer A
Subspace dimensions: Category, Readership, Year
Score dimension: #Sales
Introduction
Person Promotion
Object dimension: Player
Target object: Michael Jordan
Subspace dimensions: Position, Team, Year, Game, …
Score dimension: Score
Player          Position       Team           Year  Game            …  Score
Michael Jordan  Guard          Chicago Bulls  1998  vs N.Y. Knicks  …  33
Michael Jordan  Guard          Chicago Bulls  1998  vs Utah Jazz    …  15
Scottie Pippen  Small Forward  Chicago Bulls  1998  vs Utah Jazz    …  18
…               …              …              …     …               …  …
An NBA manager would like to promote
Michael Jordan as a superstar.
3rd all time leading scorer.
Further analysis… Local rank in some subspaces
Top scorer in the guard position.
Top scorer on the Chicago Bulls team.
11 individual years’ scoring champion.
Introduction
The promotion query problem
Given: an object (e.g. a product, a person)
Goal: Discover the most interesting
subspaces where the object is highly ranked.
Problem Definition
Promotiveness measure
Problem Definition
Object dimension: Object
Subspace dimensions: Location, Year
Score dimension: Score
Object  Location  Year  Score
T1      NY        2008  0.5
T1      WA        2008  0.8
T2      WA        2007  1.0
T2      WA        2008  1.0
T3      NY        2007  0.3
T3      WA        2007  0.6
T3      WA        2008  0.7
Query
Target Object : T1
Aggregation measure : SUM
Goal : Discover the most interesting subspaces where T1 is
highly ranked.
Problem Definition
All possible subspaces.
{*}
{NY} {WA} {2007} {2008}
{NY,2007} {NY,2008} {WA,2007} {WA,2008}
Problem Definition
Note that the target object T1 only appears in
year = 2008, therefore the subspace {2007}
can be pruned.
Problem Definition
Target subspaces of T1.
{*}
{NY} {WA} {2008}
{NY,2008} {WA,2008}
Problem Definition
Subspace {*}
We project all tuples of
T1 into this cell and
sum up their scores.
Object  SUM(Score)
T1      0.5 + 0.8 = 1.3
T2      1 + 1 = 2
T3      0.3 + 0.6 + 0.7 = 1.6
{*}: SUM (T1) = 1.3, Rank (T1) = 3rd / 3
Problem Definition
Subspace {2008}
We project all tuples of
T1 with Year = “2008”
into this cell and sum
up their scores.
Object  Year  SUM(Score)
T1      2008  0.5 + 0.8 = 1.3
T2      2008  1
T3      2008  0.7
{*}: SUM (T1) = 1.3, Rank (T1) = 3rd / 3
{2008}: SUM (T1) = 1.3, Rank (T1) = 1st / 3
Problem Definition
Subspace {NY,2008}
We project all tuples of T1
with Location = “NY” and
Year = “2008” into this cell
and sum up their scores.
Object  Location  Year  SUM(Score)
T1      NY        2008  0.5
T2      NY        2008  NO Tuples !
T3      NY        2008  NO Tuples !
{*}: SUM (T1) = 1.3, Rank (T1) = 3rd / 3
{2008}: SUM (T1) = 1.3, Rank (T1) = 1st / 3
{NY,2008}: SUM (T1) = 0.5, Rank (T1) = 1st / 1
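The aggregate-then-rank step shown on these slides can be sketched in a few lines of Python. This is only an illustration on the toy table, not the authors' implementation: the function name `rank_in_subspace` is hypothetical, and ties are assumed to be broken by counting only objects with a strictly larger SUM.

```python
from collections import defaultdict

# Toy table from the slides: (object, location, year, score)
ROWS = [
    ("T1", "NY", 2008, 0.5),
    ("T1", "WA", 2008, 0.8),
    ("T2", "WA", 2007, 1.0),
    ("T2", "WA", 2008, 1.0),
    ("T3", "NY", 2007, 0.3),
    ("T3", "WA", 2007, 0.6),
    ("T3", "WA", 2008, 0.7),
]


def rank_in_subspace(rows, target, location=None, year=None):
    """Return (SUM of target, rank of target, number of objects) in a subspace.

    A value of None for `location` or `year` stands for '*' (no restriction).
    Returns None when the target has no tuples in the subspace.
    """
    totals = defaultdict(float)
    for obj, loc, yr, score in rows:
        if location is not None and loc != location:
            continue
        if year is not None and yr != year:
            continue
        totals[obj] += score
    if target not in totals:
        return None
    t = totals[target]
    # Assumption: rank = 1 + number of objects with a strictly larger aggregate.
    rank = 1 + sum(1 for v in totals.values() if v > t)
    return round(t, 10), rank, len(totals)


print(rank_in_subspace(ROWS, "T1"))                            # subspace {*}
print(rank_in_subspace(ROWS, "T1", year=2008))                 # subspace {2008}
print(rank_in_subspace(ROWS, "T1", location="NY", year=2008))  # subspace {NY,2008}
```

A subspace such as {2007}, where T1 has no tuples, yields None, which corresponds to the pruned subspaces on the earlier slide.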
Problem Definition
{*}: SUM (T1) = 1.3, Rank (T1) = 3rd / 3
{2008}: SUM (T1) = 1.3, Rank (T1) = 1st / 3
{NY,2008}: SUM (T1) = 0.5, Rank (T1) = 1st / 1
T1 ranks 1st in both {2008} and {NY,2008},
which one is more interesting?
Problem Definition
Promotiveness of a subspace S : a class of
measures to quantify how well a subspace S can promote
the target object T.
Rank of the target object, Rank(S,T)
Higher rank -> more promotive.
Significance of the subspace, Sig(S)
More significant subspace (e.g. more objects) -> more promotive.
P(S, T) = f( Rank(S, T) ) * g( Sig(S) )
Example
P(S,T) = Rank^{-1}(S,T)
P(S,T) = Rank^{-1}(S,T) * ObjectCount(S)
P(S,T) = Rank^{-1}(S,T) * I(ObjectCount(S) > MinSig)
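To see how the three example measures separate {2008} from {NY,2008}, here is a small Python sketch. The function names are hypothetical; the (rank, object count) pairs are the ones worked out for T1 on the earlier slides.

```python
def p_rank_only(rank, object_count):
    """P = 1 / Rank: ignores the significance of the subspace."""
    return 1.0 / rank


def p_rank_times_count(rank, object_count):
    """P = (1 / Rank) * ObjectCount: larger subspaces score higher."""
    return object_count / rank


def make_p_min_sig(min_sig):
    """P = (1 / Rank) * I(ObjectCount > MinSig): small subspaces score 0."""
    def p(rank, object_count):
        return (1.0 / rank) if object_count > min_sig else 0.0
    return p


# (rank of T1, number of objects) in two subspaces of the toy example
subspaces = {"{2008}": (1, 3), "{NY,2008}": (1, 1)}

for name, (rank, count) in subspaces.items():
    print(name,
          p_rank_only(rank, count),
          p_rank_times_count(rank, count),
          make_p_min_sig(1)(rank, count))
```

With the rank-only measure the two subspaces tie, while either significance-aware measure prefers {2008}, where T1 is compared against three objects rather than only itself.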
Problem Definition
The promotion query problem
Input
a target object T
a top-R parameter
Output
top-R subspaces with the largest P(S, T) scores
Assume simple ranking model
P(S,T) = Rank^{-1}(S,T)
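A naive version of the promotion query under the simple ranking model can be sketched as follows. It enumerates every (Location, Year) subspace of the toy table, skips the ones where the target has no tuples, and keeps the top-R by P = 1/Rank. The function name `promotion_query` and the tie-breaking rule are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import product

# Toy table from the slides: (object, location, year, score)
ROWS = [
    ("T1", "NY", 2008, 0.5),
    ("T1", "WA", 2008, 0.8),
    ("T2", "WA", 2007, 1.0),
    ("T2", "WA", 2008, 1.0),
    ("T3", "NY", 2007, 0.3),
    ("T3", "WA", 2007, 0.6),
    ("T3", "WA", 2008, 0.7),
]


def promotion_query(rows, target, top_r):
    """Naive top-R promotion query with P(S, T) = 1 / Rank(S, T)."""
    locations = sorted({r[1] for r in rows})
    years = sorted({r[2] for r in rows})
    results = []
    # None plays the role of '*' in a subspace.
    for loc, yr in product([None] + locations, [None] + years):
        totals = defaultdict(float)
        for obj, l, y, score in rows:
            if (loc is None or l == loc) and (yr is None or y == yr):
                totals[obj] += score
        if target not in totals:
            continue  # not a target subspace
        # Assumption: ties are broken by counting strictly larger aggregates.
        rank = 1 + sum(1 for v in totals.values() if v > totals[target])
        results.append((1.0 / rank, (loc, yr)))
    results.sort(key=lambda item: -item[0])  # stable sort, best P first
    return results[:top_r]


top3 = promotion_query(ROWS, "T1", 3)
print(top3)
```

The cost of this sketch grows with the number of subspaces, which is the motivation for the pruning techniques discussed later in the talk.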
Query processing methods
Query execution framework
Basic framework
To use a recursive process
to partition and aggregate
the data to compute the target
object’s rank in each
subspace.
Partition
Start
Aggregation
Object
Location
Year
Score
T1
NY
2008
0.5
T1
WA
2008
0.8
T2
WA
2007
1.0
T2
WA
2008
1.0
T3
NY
2007
0.3
T3
WA
2007
0.6
T3
WA
2008
0.7
Query execution framework
Basic framework
Partition
To use a recursive process
to partition and aggregate
the data to compute the target
object’s rank in each
subspace.
{*} SUM (T1) = 1.3
Rank (T1) = 3rd / 3
Start from the coarsest
subspace {*}.
Start
Aggregation
Object
Location
Year
Score
T1
NY
2008
0.5
T1
WA
2008
0.8
T2
WA
2007
1.0
T2
WA
2008
1.0
T3
NY
2007
0.3
T3
WA
2007
0.6
T3
WA
2008
0.7
Query execution framework
Basic framework
Partition
To use a recursive process
to partition and aggregate
the data to compute the target
object’s rank in each
subspace.
{*} SUM (T1) = 1.3
Rank (T1) = 3rd / 3
{NY}
{WA}
Start
Aggregation
Object
Location
Year
Score
T1
NY
2008
0.5
T1
WA
2008
0.8
T2
WA
2007
1.0
T2
WA
2008
1.0
T3
NY
2007
0.3
T3
WA
2007
0.6
T3
WA
2008
0.7
Partition the data based on the first dimension (i.e.
Location).
Generate candidate subspaces by substituting
values in that dimension.
Query execution framework
Basic framework
Partition
To use a recursive process
to partition and aggregate
the data to compute the target
object’s rank in each
subspace.
{*} SUM (T1) = 1.3
Rank (T1) = 3rd / 3
{NY}
{WA}
Start
Aggregation
Object
Location
Year
Score
T1
NY
2008
0.5
T3
NY
2007
0.3
T2
WA
2007
1.0
T2
WA
2008
1.0
T1
WA
2008
0.8
T3
WA
2007
0.6
T3
WA
2008
0.7
Partition the data based on the first dimension (i.e.
Location).
Generate candidate subspaces by substituting
values in that dimension.
Query execution framework
Basic framework
To use a recursive process
to partition and aggregate
the data to compute the target
object’s rank in each
subspace.
{*} SUM (T1) = 1.3
Rank (T1) = 3rd / 3
{NY}
SUM (T1) = 0.5
Rank (T1) = 1st / 2
Partition
Start
Aggregation
Object
Location
Year
Score
T1
NY
2008
0.5
T3
NY
2007
0.3
T2
WA
2007
1.0
T2
WA
2008
1.0
T1
WA
2008
0.8
T3
WA
2007
0.6
T3
WA
2008
0.7
{WA}
Recursively operate on the child subspace and perform aggregation.
T1 ranks 1st among two objects (i.e. T1 and T3).
Query execution framework
Basic framework
Partition
To use a recursive process
to partition and aggregate
the data to compute the target
object’s rank in each
subspace.
{*} SUM (T1) = 1.3
Rank (T1) = 3rd / 3
{NY}
SUM (T1) = 0.5
Rank (T1) = 1st / 2
{NY,2007}
Start
Aggregation
Object
Location
Year
Score
T1
NY
2008
0.5
T3
NY
2007
0.3
T2
WA
2007
1.0
T2
WA
2008
1.0
T1
WA
2008
0.8
T3
WA
2007
0.6
T3
WA
2008
0.7
{WA}
{NY,2008}
Partition the data based on the next dimension (i.e.
Year).
Generate candidate subspaces by substituting
values in that dimension.
Query execution framework
Basic framework
To use a recursive process
to partition and aggregate
the data to compute the target
object’s rank in each
subspace.
{*} SUM (T1) = 1.3
Rank (T1) = 3rd / 3
{NY}
SUM (T1) = 0.5
Rank (T1) = 1st / 2
{NY,2007}
Pruned!!!
Partition
Start
Aggregation
Object
Location
Year
Score
T1
NY
2008
0.5
T3
NY
2007
0.3
T2
WA
2007
1.0
T2
WA
2008
1.0
T1
WA
2008
0.8
T3
WA
2007
0.6
T3
WA
2008
0.7
{WA}
{NY,2008}
Recursively operate on the child subspace.
The target object T1 does not appear in this subspace, prune it!!!
Query execution framework
Basic framework
To use a recursive process
to partition and aggregate
the data to compute the target
object’s rank in each
subspace.
{*} SUM (T1) = 1.3
Rank (T1) = 3rd / 3
{NY}
SUM (T1) = 0.5
Rank (T1) = 1st / 2
{NY,2007}
Pruned!!!
{WA}
{NY,2008}
SUM (T1) = 0.5
Rank (T1) = 1st / 1
Partition
Start
Aggregation
Object
Location
Year
Score
T1
NY
2008
0.5
T3
NY
2007
0.3
T2
WA
2007
1.0
T2
WA
2008
1.0
T1
WA
2008
0.8
T3
WA
2007
0.6
T3
WA
2008
0.7
Query execution framework
Basic framework
To use a recursive process
to partition and aggregate
the data to compute the target
object’s rank in each
subspace.
{*} SUM (T1) = 1.3
Rank (T1) = 3rd / 3
{NY}
SUM (T1) = 0.5
Rank (T1) = 1st / 2
{NY,2007}
Pruned!!!
{WA}
{NY,2008}
SUM (T1) = 0.5
Rank (T1) = 1st / 1
SUM (T1) = 0.8
Rank (T1) = 3rd / 3
Partition
Start
Aggregation
Object
Location
Year
Score
T1
NY
2008
0.5
T3
NY
2007
0.3
T2
WA
2007
1.0
T2
WA
2008
1.0
T1
WA
2008
0.8
T3
WA
2007
0.6
T3
WA
2008
0.7
Query execution framework
Basic framework
Partition
To use a recursive process
to partition and aggregate
the data to compute the target
object’s rank in each
subspace.
{*} SUM (T1) = 1.3
Rank (T1) = 3rd / 3
{NY}
SUM (T1) = 0.5
Rank (T1) = 1st / 2
{NY,2007}
Pruned!!!
{WA}
{NY,2008}
SUM (T1) = 0.5
Rank (T1) = 1st / 1
Aggregation
Object
Location
Year
Score
T1
NY
2008
0.5
T3
NY
2007
0.3
T2
WA
2007
1.0
T2
WA
2008
1.0
T1
WA
2008
0.8
T3
WA
2007
0.6
T3
WA
2008
0.7
SUM (T1) = 0.8
Rank (T1) = 3rd / 3
{WA,2007}
Start
{WA,2008}
Query execution framework
Basic framework
Partition
To use a recursive process
to partition and aggregate
the data to compute the target
object’s rank in each
subspace.
{*} SUM (T1) = 1.3
Rank (T1) = 3rd / 3
{NY}
SUM (T1) = 0.5
Rank (T1) = 1st / 2
{NY,2007}
Pruned!!!
{WA}
{NY,2008}
SUM (T1) = 0.5
Rank (T1) = 1st / 1
Aggregation
Object
Location
Year
Score
T1
NY
2008
0.5
T3
NY
2007
0.3
T2
WA
2007
1.0
T3
WA
2007
0.6
T1
WA
2008
0.8
T2
WA
2008
1.0
T3
WA
2008
0.7
SUM (T1) = 0.8
Rank (T1) = 3rd / 3
{WA,2007}
Start
{WA,2008}
Query execution framework
Basic framework
Partition
To use a recursive process
to partition and aggregate
the data to compute the target
object’s rank in each
subspace.
{*} SUM (T1) = 1.3
Rank (T1) = 3rd / 3
{NY}
SUM (T1) = 0.5
Rank (T1) = 1st / 2
{NY,2007}
Pruned!!!
{WA}
{NY,2008}
SUM (T1) = 0.5
Rank (T1) = 1st / 1
Pruned!!!
Aggregation
Object
Location
Year
Score
T1
NY
2008
0.5
T3
NY
2007
0.3
T2
WA
2007
1.0
T3
WA
2007
0.6
T1
WA
2008
0.8
T2
WA
2008
1.0
T3
WA
2008
0.7
SUM (T1) = 0.8
Rank (T1) = 3rd / 3
{WA,2007}
Start
{WA,2008}
Query execution framework
Basic framework
Partition
To use a recursive process
to partition and aggregate
the data to compute the target
object’s rank in each
subspace.
{*} SUM (T1) = 1.3
Rank (T1) = 3rd / 3
{NY}
SUM (T1) = 0.5
Rank (T1) = 1st / 2
{NY,2007}
Pruned!!!
{WA}
{NY,2008}
SUM (T1) = 0.5
Rank (T1) = 1st / 1
Pruned!!!
Aggregation
Object
Location
Year
Score
T1
NY
2008
0.5
T3
NY
2007
0.3
T2
WA
2007
1.0
T3
WA
2007
0.6
T1
WA
2008
0.8
T2
WA
2008
1.0
T3
WA
2008
0.7
SUM (T1) = 0.8
Rank (T1) = 3rd / 3
{WA,2007}
Start
{WA,2008}
SUM (T1) = 0.8
Rank (T1) = 2nd / 3
Query execution framework
Basic framework
Partition
To use a recursive process
to partition and aggregate
the data to compute the target
object’s rank in each
subspace.
{*} SUM (T1) = 1.3
Rank (T1) = 3rd / 3
{NY}
SUM (T1) = 0.5
Rank (T1) = 1st / 2
{NY,2007}
Pruned!!!
{WA}
{NY,2008}
SUM (T1) = 0.5
Rank (T1) = 1st / 1
SUM (T1) = 0.8
Rank (T1) = 3rd / 3
{WA,2007}
Pruned!!!
Start
Aggregation
Object
Location
Year
Score
T1
NY
2008
0.5
T3
NY
2007
0.3
T2
WA
2007
1.0
T3
WA
2007
0.6
T1
WA
2008
0.8
T2
WA
2008
1.0
T3
WA
2008
0.7
{2007}
{2008}
{WA,2008}
SUM (T1) = 0.8
Rank (T1) = 2nd / 3
Query execution framework
Basic framework
Partition
To use a recursive process
to partition and aggregate
the data to compute the target
object’s rank in each
subspace.
{*} SUM (T1) = 1.3
Rank (T1) = 3rd / 3
{NY}
SUM (T1) = 0.5
Rank (T1) = 1st / 2
{NY,2007}
Pruned!!!
{WA}
{NY,2008}
SUM (T1) = 0.5
Rank (T1) = 1st / 1
SUM (T1) = 0.8
Rank (T1) = 3rd / 3
{WA,2007}
Pruned!!!
Start
Aggregation
Object
Location
Year
Score
T3
WA
2007
0.6
T3
NY
2007
0.3
T2
WA
2007
1.0
T1
NY
2008
0.5
T1
WA
2008
0.8
T2
WA
2008
1.0
T3
WA
2008
0.7
{2007}
{2008}
{WA,2008}
SUM (T1) = 0.8
Rank (T1) = 2nd / 3
Query execution framework
Basic framework
Partition
To use a recursive process
to partition and aggregate
the data to compute the target
object’s rank in each
subspace.
{*} SUM (T1) = 1.3
Rank (T1) = 3rd / 3
{NY}
SUM (T1) = 0.5
Rank (T1) = 1st / 2
{WA}
SUM (T1) = 0.8
Rank (T1) = 3rd / 3
Pruned!!!
{NY,2008}
SUM (T1) = 0.5
Rank (T1) = 1st / 1
{WA,2007}
Pruned!!!
Aggregation
Object
Location
Year
Score
T3
WA
2007
0.6
T3
NY
2007
0.3
T2
WA
2007
1.0
T1
NY
2008
0.5
T1
WA
2008
0.8
T2
WA
2008
1.0
T3
WA
2008
0.7
{2007}
{2008}
Pruned!!!
{NY,2007}
Start
{WA,2008}
SUM (T1) = 0.8
Rank (T1) = 2nd / 3
Query execution framework
Basic framework
Partition
To use a recursive process
to partition and aggregate
the data to compute the target
object’s rank in each
subspace.
{*} SUM (T1) = 1.3
Rank (T1) = 3rd / 3
{NY}
SUM (T1) = 0.5
Rank (T1) = 1st / 2
{WA}
SUM (T1) = 0.8
Rank (T1) = 3rd / 3
Start
Aggregation
Object
Location
Year
Score
T3
WA
2007
0.6
T3
NY
2007
0.3
T2
WA
2007
1.0
T1
NY
2008
0.5
T1
WA
2008
0.8
T2
WA
2008
1.0
T3
WA
2008
0.7
{2007}
{2008}
SUM (T1) = 1.3
Rank (T1) = 1st / 3
Pruned!!!
{NY,2007}
Pruned!!!
{NY,2008}
SUM (T1) = 0.5
Rank (T1) = 1st / 1
{WA,2007}
Pruned!!!
{WA,2008}
SUM (T1) = 0.8
Rank (T1) = 2nd / 3
Finish!!!
Query execution framework
Basic framework
Partition
To use a recursive process
to partition and aggregate
the data to compute the target
object’s rank in each
subspace.
{*} SUM (T1) = 1.3
Rank (T1) = 3rd / 3
{NY}
SUM (T1) = 0.5
Rank (T1) = 1st / 2
{WA}
SUM (T1) = 0.8
Rank (T1) = 3rd / 3
Start
Aggregation
Object
Location
Year
Score
T3
WA
2007
0.6
T3
NY
2007
0.3
T2
WA
2007
1.0
T1
NY
2008
0.5
T1
WA
2008
0.8
T2
WA
2008
1.0
T3
WA
2008
0.7
{2007}
{2008}
SUM (T1) = 1.3
Rank (T1) = 1st / 3
Pruned!!!
Return Top-3 subspaces
{NY,2007}
Pruned!!!
{NY,2008}
SUM (T1) = 0.5
Rank (T1) = 1st / 1
{WA,2007}
Pruned!!!
{WA,2008}
SUM (T1) = 0.8
Rank (T1) = 2nd / 3
P(S,T) = Rank^{-1}(S,T)
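The recursive partition-and-aggregate walk traced on the preceding slides can be approximated by the following Python sketch. It is a simplified illustration under stated assumptions, not the paper's algorithm: `explore` and `rank_of` are hypothetical names, partitions are built as in-memory lists rather than by sorting the table in place, and ties are broken by counting strictly larger aggregates.

```python
from collections import defaultdict

# Toy table from the slides: (object, location, year, score)
ROWS = [
    ("T1", "NY", 2008, 0.5),
    ("T1", "WA", 2008, 0.8),
    ("T2", "WA", 2007, 1.0),
    ("T2", "WA", 2008, 1.0),
    ("T3", "NY", 2007, 0.3),
    ("T3", "WA", 2007, 0.6),
    ("T3", "WA", 2008, 0.7),
]
DIMS = ("Location", "Year")  # stored in columns 1 and 2 of each row


def rank_of(rows, target):
    """Aggregate SUM(score) per object; return (rank of target, object count)."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[0]] += row[-1]
    t = totals[target]
    return 1 + sum(1 for v in totals.values() if v > t), len(totals)


def explore(rows, target, start_dim=0, cell=("*", "*"), out=None):
    """Depth-first walk over the subspaces that contain the target object."""
    if out is None:
        out = {}
    if not any(r[0] == target for r in rows):
        return out  # pruned: the target does not appear in this subspace
    out[cell] = rank_of(rows, target)
    # Partition on each remaining dimension, then recurse into every child.
    for d in range(start_dim, len(DIMS)):
        partitions = defaultdict(list)
        for r in rows:
            partitions[r[d + 1]].append(r)
        for value, part in partitions.items():
            child = cell[:d] + (value,) + cell[d + 1:]
            explore(part, target, d + 1, child, out)
    return out


result = explore(ROWS, "T1")
# Top-3 subspaces under P = 1 / Rank (smaller rank first, stable on ties)
top3 = sorted(result, key=lambda c: result[c][0])[:3]
print(result)
print(top3)
```

Each subspace is visited once because a child only partitions on dimensions after the one that produced it, which mirrors the order {*}, {NY}, {NY,2007}, {NY,2008}, {WA}, … on the slides.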
Query execution framework
Basic framework
Partition
To use a recursive process
to partition and aggregate
the data to compute the target
object’s rank in each
subspace.
{*} SUM (T1) = 1.3
Rank (T1) = 3rd / 3
{NY}
SUM (T1) = 0.5
Rank (T1) = 1st / 2
{WA}
SUM (T1) = 0.8
Rank (T1) = 3rd / 3
Start
Aggregation
Object
Location
Year
Score
T3
WA
2007
0.6
T3
NY
2007
0.3
T2
WA
2007
1.0
T1
NY
2008
0.5
T1
WA
2008
0.8
T2
WA
2008
1.0
T3
WA
2008
0.7
{2007}
{2008}
SUM (T1) = 1.3
Rank (T1) = 1st / 3
Pruned!!!
{NY,2007}
Pruned!!!
{NY,2008}
SUM (T1) = 0.5
Rank (T1) = 1st / 1
{WA,2007}
Pruned!!!
{WA,2008}
SUM (T1) = 0.8
Rank (T1) = 2nd / 3
Return Top-3 subspaces
P(S,T) = Rank^{-1}(S,T) * I(ObjCount(S) > 1)
Query execution framework
The basic execution framework…
Computes ALL subspaces, and thus the
overall cost could be quite prohibitive for large
datasets.
Develop optimization techniques based on
thresholding techniques
Subspace pruning
Object pruning
Efficient computation methods
Subspace pruning
Object pruning
Subspace pruning
Key motivation
If the upper bound of the promotiveness score of an
unseen subspace is lower than the current top-R
promotiveness score, we can prune the subspace.
How to obtain the upper bound of promotiveness
scores of the unseen subspaces?
P(S,T) = Rank^{-1}(S,T)
Obtain a lower bound of the rank.
Subspace pruning
Subspace lattice
{*}
{A} {B} {C}
{AB} {AC} {BC}
{ABC}
Assumption :
The aggregation measure is a
monotone function (e.g. SUM)
The Sig measure is also monotone.
Key observations
Observation 1: Every object in a child subspace must also be a
member of its parent subspace.
Observation 2: An object’s aggregate score in a child
subspace must be smaller than or equal to its score in the parent
subspace.
Subspace pruning
S1={a}
S2={ab}
Target object : t7
Return top-1 promotive subspace
P(S,t7) = Rank^{-1}(S,t7)
Subspace
S1 = {a}
S2 = {ab}
S3 = {abc}
S4 = {ac}
Objects in the cuboid and their aggregate scores
S4={ac}
S3={abc}
Aggregate
of target
object
Rank
P
Subspace pruning
S1={a}
S2={ab}
Target object : t7
Return top-1 promotive subspace
P(S,t7) = Rank^{-1}(S,t7)
S4={ac}
S3={abc}
Initialization step: We first scan the dataset once to calculate
the aggregate of the target object t7 in each subspace.
Subspace
Objects in the cuboid and their aggregate scores
Aggregate
of target
object
S1 = {a}
0.7
S2 = {ab}
0.6
S3 = {abc}
0.3
S4 = {ac}
0.4
Rank
P
Subspace pruning
S1={a}
S2={ab}
Target object : t7
Return top-1 promotive subspace
P(S,t7) = Rank^{-1}(S,t7)
S4={ac}
S3={abc}
Start: Compute the
aggregate of objects in
subspace {a}.
Subspace
S1 = {a}
Objects in the cuboid and their aggregate scores
t6(1.2) t3(1.0) t7(0.7) t1(0.7) t4(0.3) t5(0.3) t2(0.2)
Aggregate
of target
object
0.7
S2 = {ab}
0.6
S3 = {abc}
0.3
S4 = {ac}
0.4
Rank
P
Subspace pruning
S1={a}
S2={ab}
Target object : t7
Return top-1 promotive subspace
P(S,t7) = Rank^{-1}(S,t7)
S4={ac}
S3={abc}
Current top-1 result : S1(1/3)
Subspace
S1 = {a}
Objects in the cuboid and their aggregate scores
t6(1.2) t3(1.0) t7(0.7) t1(0.7) t4(0.3) t5(0.3) t2(0.2)
Aggregate
of target
object
Rank
P
0.7
3
1/3
S2 = {ab}
0.6
S3 = {abc}
0.3
S4 = {ac}
0.4
Subspace pruning
S1={a}
S2={ab}
Target object : t7
Return top-1 promotive subspace
P(S,t7) = Rank^{-1}(S,t7)
S4={ac}
S3={abc}
Current top-1 result : S1(1/3)
Subspace
Objects in the cuboid and their aggregate scores
Aggregate
of target
object
Rank
P
3
1/3
S1 = {a}
t6(1.2) t3(1.0) t7(0.7) t1(0.7) t4(0.3) t5(0.3) t2(0.2)
0.7
S2 = {ab}
t6(0.7) t7(0.6) t3(0.6) t1(0.4) t4(0.3) t2(0.2) t5(0.2)
0.6
S3 = {abc}
0.3
S4 = {ac}
0.4
Subspace pruning
S1={a}
S2={ab}
Target object : t7
Return top-1 promotive subspace
P(S,t7) = Rank^{-1}(S,t7)
S4={ac}
S3={abc}
Current top-1 result : S1(1/3) -> S2(1/2)
Subspace
Objects in the cuboid and their aggregate scores
Aggregate
of target
object
Rank
P
S1 = {a}
t6(1.2) t3(1.0) t7(0.7) t1(0.7) t4(0.3) t5(0.3) t2(0.2)
0.7
3
1/3
S2 = {ab}
t6(0.7) t7(0.6) t3(0.6) t1(0.4) t4(0.3) t2(0.2) t5(0.2)
0.6
2
1/2
S3 = {abc}
0.3
S4 = {ac}
0.4
Subspace pruning
S1={a}
S2={ab}
Target object : t7
Return top-1 promotive subspace
P(S,t7) = Rank^{-1}(S,t7)
S4={ac}
S3={abc}
Current top-1 result : S2(1/2)
Subspace
Objects in the cuboid and their aggregate scores
Aggregate
of target
object
Rank
P
S1 = {a}
t6(1.2) t3(1.0) t7(0.7) t1(0.7) t4(0.3) t5(0.3) t2(0.2)
0.7
3
1/3
S2 = {ab}
t6(0.7) t7(0.6) t3(0.6) t1(0.4) t4(0.3) t2(0.2) t5(0.2)
0.6
2
1/2
S3 = {abc}
t3(0.6) t6(0.5) t7(0.3) t1(0.1) t5(0.1)
0.3
S4 = {ac}
0.4
Subspace pruning
S1={a}
S2={ab}
Target object : t7
Return top-1 promotive subspace
P(S,t7) = Rank^{-1}(S,t7)
S4={ac}
S3={abc}
Current top-1 result : S2(1/2)
Subspace
Objects in the cuboid and their aggregate scores
Aggregate
of target
object
Rank
P
S1 = {a}
t6(1.2) t3(1.0) t7(0.7) t1(0.7) t4(0.3) t5(0.3) t2(0.2)
0.7
3
1/3
S2 = {ab}
t6(0.7) t7(0.6) t3(0.6) t1(0.4) t4(0.3) t2(0.2) t5(0.2)
0.6
2
1/2
S3 = {abc}
t3(0.6) t6(0.5) t7(0.3) t1(0.1) t5(0.1)
0.3
3
1/3
S4 = {ac}
0.4
Subspace pruning
S1={a}
S2={ab}
Target object : t7
Return top-1 promotive subspace
P(S,t7) = Rank^{-1}(S,t7)
S4={ac}
S3={abc}
Current top-1 result : S2(1/2)
Can we deduce a lower bound of Rank of S4?
Subspace
Objects in the cuboid and their aggregate scores
Aggregate
of target
object
Rank
P
S1 = {a}
t6(1.2) t3(1.0) t7(0.7) t1(0.7) t4(0.3) t5(0.3) t2(0.2)
0.7
3
1/3
S2 = {ab}
t6(0.7) t7(0.6) t3(0.6) t1(0.4) t4(0.3) t2(0.2) t5(0.2)
0.6
2
1/2
S3 = {abc}
t3(0.6) t6(0.5) t7(0.3) t1(0.1) t5(0.1)
0.3
3
1/3
S4 = {ac}
0.4
Subspace pruning
S1={a}
S2={ab}
Target object : t7
Return top-1 promotive subspace
P(S,t7) = Rank^{-1}(S,t7)
S4={ac}
S3={abc}
Current top-1 result : S2(1/2)
Can we deduce a lower bound of Rank of S4?
Subspace
Objects in the cuboid and their aggregate scores
Aggregate
of target
object
Rank
P
S1 = {a}
t6(1.2) t3(1.0) t7(0.7) t1(0.7) t4(0.3) t5(0.3) t2(0.2)
0.7
3
1/3
S2 = {ab}
t6(0.7) t7(0.6) t3(0.6) t1(0.4) t4(0.3) t2(0.2) t5(0.2)
0.6
2
1/2
S3 = {abc}
t3(0.6) t6(0.5) t7(0.3) t1(0.1) t5(0.1)
0.3
3
1/3
S4 = {ac}
Given the tuples in S3, can we deduce some of the members of S4?
0.4
Subspace pruning
S1={a}
S2={ab}
Target object : t7
Return top-1 promotive subspace
P(S,t7) = Rank-1(S,t7)
S4={ac}
S3={abc}
Tuples in the child subspace must also appear in the parent subspaces.
Current top-1 result : S2(1/2) Can we deduce a lower bound of Rank of S4?
Subspace
Objects in the cuboid and their aggregate scores
Aggregate
of target
object
Rank
P
S1 = {a}
t6(1.2) t3(1.0) t7(0.7) t1(0.7) t4(0.3) t5(0.3) t2(0.2)
0.7
3
1/3
S2 = {ab}
t6(0.7) t7(0.6) t3(0.6) t1(0.4) t4(0.3) t2(0.2) t5(0.2)
0.6
2
1/2
S3 = {abc}
t3(0.6) t6(0.5) t7(0.3) t1(0.1) t5(0.1)
0.3
3
1/3
S4 = {ac}
t3(?)
t6(?)
t7(?)
t1(?)
t5(?)
Can we say something about
the scores of these members?
0.4
Subspace pruning
S1={a}
S2={ab}
Target object : t7
Return top-1 promotive subspace
P(S,t7) = Rank-1(S,t7)
S4={ac}
S3={abc}
Current top-1 result : S2(1/2) Can we deduce a lower bound of Rank of S4?
Subspace
Objects in the cuboid and their aggregate scores
Aggregate
of target
object
Rank
P
S1 = {a}
t6(1.2) t3(1.0) t7(0.7) t1(0.7) t4(0.3) t5(0.3) t2(0.2)
0.7
3
1/3
S2 = {ab}
t6(0.7) t7(0.6) t3(0.6) t1(0.4) t4(0.3) t2(0.2) t5(0.2)
0.6
2
1/2
S3 = {abc}
t3(0.6) t6(0.5) t7(0.3) t1(0.1) t5(0.1)
0.3
3
1/3
S4 = {ac}
t3(?)
t6(?)
t7(0.4) t1(?)
t5(?)
Can we say something about
the scores of these members?
0.4
Subspace pruning
S1={a}
S2={ab}
Target object : t7
Return top-1 promotive subspace
P(S,t7) = Rank-1(S,t7)
S4={ac}
S3={abc}
The aggregate score of a tuple in the child subspace must
be smaller than or equal to its score in the parent subspace.
Current top-1 result : S2(1/2) Can we deduce a lower bound of Rank of S4?
Subspace
Objects in the cuboid and their aggregate scores
Aggregate
of target
object
Rank
P
S1 = {a}
t6(1.2) t3(1.0) t7(0.7) t1(0.7) t4(0.3) t5(0.3) t2(0.2)
0.7
3
1/3
S2 = {ab}
t6(0.7) t7(0.6) t3(0.6) t1(0.4) t4(0.3) t2(0.2) t5(0.2)
0.6
2
1/2
S3 = {abc}
t3(0.6) t6(0.5) t7(0.3) t1(0.1) t5(0.1)
0.3
3
1/3
S4 = {ac}
t3(0.6) t6(0.5) t7(0.4) t1(0.1) t5(0.1)
Can we say something about
the scores of these members?
0.4
Subspace pruning
S1={a}
S2={ab}
Target object : t7
Return top-1 promotive subspace
P(S,t7) = Rank-1(S,t7)
S4={ac}
S3={abc}
Current top-1 result : S2(1/2) Can we deduce a lower bound of Rank of S4?
Subspace
Objects in the cuboid and their aggregate scores
Aggregate
of target
object
Rank
P
S1 = {a}
t6(1.2) t3(1.0) t7(0.7) t1(0.7) t4(0.3) t5(0.3) t2(0.2)
0.7
3
1/3
S2 = {ab}
t6(0.7) t7(0.6) t3(0.6) t1(0.4) t4(0.3) t2(0.2) t5(0.2)
0.6
2
1/2
S3 = {abc}
t3(0.6) t6(0.5) t7(0.3) t1(0.1) t5(0.1)
0.3
3
1/3
S4 = {ac}
t3(0.6) t6(0.5) t7(0.4) t1(0.1) t5(0.1)
0.4
?
Subspace pruning
S1={a}
S2={ab}
Target object : t7
Return top-1 promotive subspace
P(S,t7) = Rank-1(S,t7)
S4={ac}
S3={abc}
Current top-1 result : S2(1/2) Can we deduce a lower bound of Rank of S4?
Subspace
Objects in the cuboid and their aggregate scores
Aggregate
of target
object
Rank
P
S1 = {a}
t6(1.2) t3(1.0) t7(0.7) t1(0.7) t4(0.3) t5(0.3) t2(0.2)
0.7
3
1/3
S2 = {ab}
t6(0.7) t7(0.6) t3(0.6) t1(0.4) t4(0.3) t2(0.2) t5(0.2)
0.6
2
1/2
S3 = {abc}
t3(0.6) t6(0.5) t7(0.3) t1(0.1) t5(0.1)
0.3
3
1/3
S4 = {ac}
t3(0.6) t6(0.5) t7(0.4) t1(0.1) t5(0.1)
0.4
>= 3
Subspace pruning
S1={a}
S2={ab}
Target object : t7
Return top-1 promotive subspace
P(S,t7) = Rank-1(S,t7)
Current top-1 result : S2(1/2)
Subspace
S4={ac}
S3={abc}
The promotive score of S4 is at most 1/3 (since its rank is at least 3),
which is less than the current top-1 promotive score (1/2),
so S4 can be pruned!
Objects in the cuboid and their aggregate scores
Aggregate
of target
object
Rank
P
S1 = {a}
t6(1.2) t3(1.0) t7(0.7) t1(0.7) t4(0.3) t5(0.3) t2(0.2)
0.7
3
1/3
S2 = {ab}
t6(0.7) t7(0.6) t3(0.6) t1(0.4) t4(0.3) t2(0.2) t5(0.2)
0.6
2
1/2
S3 = {abc}
t3(0.6) t6(0.5) t7(0.3) t1(0.1) t5(0.1)
0.3
3
1/3
S4 = {ac}
t3(0.6) t6(0.5) t7(0.4) t1(0.1) t5(0.1)
0.4
>= 3
?
S4 Pruned!!!
Subspace pruning
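The subspace pruning walk-through can be sketched in Python. This is a minimal illustration, not the authors' implementation: the function names `rank_lower_bound` and `can_prune` are hypothetical, and the sketch assumes that Rank is 1 plus the number of objects with a strictly larger aggregate score (consistent with t7 having rank 3 in S1 despite the tie with t1 at 0.7).

```python
from fractions import Fraction

def rank_lower_bound(child_scores, target, target_score_in_parent):
    """Lower-bound the target's rank in a parent subspace.

    child_scores maps object -> aggregate score in an already computed child
    subspace. Each child score is a lower bound on the same object's score
    in the parent, so any object already above the target stays above it.
    """
    ahead = sum(
        1
        for obj, score in child_scores.items()
        if obj != target and score > target_score_in_parent
    )
    return 1 + ahead

def can_prune(child_scores, target, target_score_in_parent, best_p):
    """Prune the parent if its best possible P = 1/Rank cannot reach best_p."""
    lower = rank_lower_bound(child_scores, target, target_score_in_parent)
    return Fraction(1, lower) < best_p

# Values from the slides: child S3 = {abc}, parent S4 = {ac}, target t7.
S3 = {"t3": 0.6, "t6": 0.5, "t7": 0.3, "t1": 0.1, "t5": 0.1}
print(rank_lower_bound(S3, "t7", 0.4))          # Rank of t7 in S4 is >= 3
print(can_prune(S3, "t7", 0.4, Fraction(1, 2)))  # S4 cannot beat S2 (1/2)
```

Only the target's own aggregate in S4 (0.4) has to be computed; the scores of the other objects in S4 are never materialized.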
Object pruning
Key motivation
Prune objects by obtaining an
upper bound on the aggregate scores of
unseen objects.
An unseen object whose upper bound is smaller
than the smallest aggregate score of the target
object can be pruned.
Object pruning
S1={a}
S2={ab}
Target object : t7
Return top-1 promotive subspace
P(S,t7) = Rank-1(S,t7)
S4={ac}
S3={abc}
Minimum of the aggregate scores of t7:
= min{0.7, 0.6, 0.3, 0.4} = 0.3
Subspace
Objects in the cuboid and their aggregate scores
Aggregate
of target
object
S1 = {a}
0.7
S2 = {ab}
0.6
S3 = {abc}
0.3
S4 = {ac}
0.4
Rank
P
Object pruning
S1={a}
S2={ab}
Target object : t7
Return top-1 promotive subspace
P(S,t7) = Rank-1(S,t7)
Minimum of the aggregate scores of t7:
= min{0.7, 0.6, 0.3, 0.4} = 0.3
Subspace
S1 = {a}
S4={ac}
S3={abc}
Question: Can we prune some
objects in the subtree of S1?
Objects in the cuboid and their aggregate scores
t6(1.2) t3(1.0) t1(0.7) t7(0.7) t4(0.3) t5(0.3) t2(0.2)
Aggregate
of target
object
0.7
S2 = {ab}
0.6
S3 = {abc}
0.3
S4 = {ac}
0.4
Rank
P
Object pruning
S1={a}
S2={ab}
Target object : t7
Return top-1 promotive subspace
P(S,t7) = Rank-1(S,t7)
Minimum of the aggregate scores of t7:
= min{0.7, 0.6, 0.3, 0.4} = 0.3
Subspace
S1 = {a}
S4={ac}
S3={abc}
Question: Can we prune some
objects in the subtree of S1?
Objects in the cuboid and their aggregate scores
Aggregate
of target
object
t6(1.2) t3(1.0) t1(0.7) t7(0.7) t4(0.3) t5(0.3) t2(0.2)
S2 = {ab}
S3 = {abc}
S4 = {ac}
0.7
0.6
0.2 is the upper bound of the aggregate scores
of t2 in the subtree of S1.
i.e. the aggregate score will only be <= 0.2 !!!
0.3
0.4
Rank
P
Object pruning
S1={a}
S2={ab}
Target object : t7
Return top-1 promotive subspace
P(S,t7) = Rank-1(S,t7)
Minimum of the aggregate scores of t7:
= min{0.7, 0.6, 0.3, 0.4} = 0.3
Subspace
S1 = {a}
S4={ac}
S3={abc}
Since the minimum of the aggregate scores
of t7 is 0.3, the aggregate scores of t2 will
not affect the rank of t7 in the subtree of S1.
Objects in the cuboid and their aggregate scores
Aggregate
of target
object
t6(1.2) t3(1.0) t1(0.7) t7(0.7) t4(0.3) t5(0.3) t2(0.2)
S2 = {ab}
S3 = {abc}
S4 = {ac}
0.7
0.6
0.2 is the upper bound of the aggregate scores
of t2 in the subtree of S1.
i.e. the aggregate score will only be <= 0.2 !!!
0.3
0.4
Rank
P
Object pruning
S1={a}
S2={ab}
Target object : t7
Return top-1 promotive subspace
P(S,t7) = Rank-1(S,t7)
Minimum of the aggregate scores of t7:
= min{0.7, 0.6, 0.3, 0.4} = 0.3
Subspace
S1 = {a}
S4={ac}
S3={abc}
Since the minimum of the aggregate scores
of t7 is 0.3, the aggregate scores of t2 will
not affect the rank of t7 in the subtree of S1.
Objects in the cuboid and their aggregate scores
t6(1.2) t3(1.0) t1(0.7) t7(0.7) t4(0.3) t5(0.3) t2(0.2)
Aggregate
of target
object
Pruned!!!
S2 = {ab}
S3 = {abc}
S4 = {ac}
0.7
0.6
0.2 is the upper bound of the aggregate scores
of t2 in the subtree of S1.
i.e. the aggregate score will only be <= 0.2 !!!
0.3
0.4
Rank
P
Object pruning
S1={a}
S2={ab}
Target object : t7
Return top-1 promotive subspace
P(S,t7) = Rank-1(S,t7)
S4={ac}
S3={abc}
Minimum of the aggregate scores of t7:
= min{0.7, 0.6, 0.3, 0.4} = 0.3
Subspace
S1 = {a}
Objects in the cuboid and their aggregate scores
t6(1.2) t3(1.0) t1(0.7) t7(0.7) t4(0.3) t5(0.3) t2(0.2)
S2 = {ab}
S3 = {abc}
S4 = {ac}
Aggregate
of target
object
0.7
0.6
Similarly, t4, t5 can also be Pruned!!!
0.3
0.4
Rank
P
Object pruning
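The object pruning example can be sketched as follows. This is a hedged illustration only: `prune_objects` is a hypothetical name, and the sketch assumes a monotone aggregate, so an object's score in S1 is an upper bound on its score anywhere in the subtree of S1. It uses `<=` rather than `<` because the slides also prune t4 and t5, whose bound equals 0.3; that is safe only under the assumption that rank counts objects with a strictly larger score.

```python
def prune_objects(parent_scores, target, target_scores):
    """Split the objects of a subspace into kept and pruned for its subtree.

    parent_scores: object -> aggregate score in the subspace (upper bounds
    for the subtree). target_scores: the target's aggregate score in each
    subspace of interest.
    """
    floor = min(target_scores)  # smallest aggregate score of the target
    kept, pruned = {}, []
    for obj, score in parent_scores.items():
        if obj != target and score <= floor:
            pruned.append(obj)  # can never rank ahead of the target
        else:
            kept[obj] = score
    return kept, pruned

# Values from the slides: cuboid S1 = {a}, target t7.
S1 = {"t6": 1.2, "t3": 1.0, "t1": 0.7, "t7": 0.7,
      "t4": 0.3, "t5": 0.3, "t2": 0.2}
kept, pruned = prune_objects(S1, "t7", [0.7, 0.6, 0.3, 0.4])
print(sorted(pruned))  # t2, t4 and t5 are dropped from the subtree of S1
```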
Promotion cube
Promotion cube
Promotion cell
Given a subspace S, a promotion cell S.Pcell is
defined as the sequence of the top-k largest
object aggregate scores in S.
Promotion cube
The promotion cube consists of a set of triples
in the format (S, S.Pcell, Sig), where Sig is the
significance of the subspace S.
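A minimal sketch of how such triples might be materialized is given below. The names `build_pcell` and `build_promotion_cube` are hypothetical, and because the slides do not show how Sig is computed, it is simply passed in as a parameter here.

```python
import heapq

def build_pcell(scores, k):
    """Top-k largest aggregate scores of a subspace, in descending order.

    Only the scores are stored, not the objects that achieved them.
    """
    return heapq.nlargest(k, scores.values())

def build_promotion_cube(cuboids, k, sig=1):
    """Return a list of (S, S.Pcell, Sig) triples.

    cuboids maps a subspace name to its object -> aggregate score mapping.
    sig is a placeholder significance value (an assumption of this sketch).
    """
    return [(name, build_pcell(scores, k), sig)
            for name, scores in cuboids.items()]

cuboids = {
    "S1={a}": {"t6": 1.2, "t3": 1.0, "t7": 0.7, "t1": 0.7,
               "t4": 0.3, "t5": 0.3, "t2": 0.2},
    "S3={abc}": {"t3": 0.6, "t6": 0.5, "t7": 0.3, "t1": 0.1, "t5": 0.1},
}
cube = build_promotion_cube(cuboids, k=3)
print(cube)
```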
Promotion cube
Target object : t7
Return top-1 promotive subspace
P(S,t7) = Rank-1(S,t7)
S1={a}
S2={ab}
S4={ac}
S3={abc}
Subspace | Top-3 largest aggregate scores in the subspace | Sig
S1 = {a} | (1.2) (1.0) (0.7) | 1
S2 = {ab} | (0.7) (0.6) (0.6) | 1
S3 = {abc} | (0.6) (0.5) (0.3) | 1
S4 = {ac} | (0.8) (0.7) (0.6) | 1
In the promotion cube, we precompute
the top-3 largest aggregate scores
(not the objects) in each subspace.
Promotion cube
Target object : t7
Return top-1 promotive subspace
P(S,t7) = Rank-1(S,t7)
Subspace
Top-3 largest aggregate
scores in the subspace
S2={ab}
S4={ac}
Sig
S3={abc}
S1 = {a}
(1.2) (1.0) (0.7)
1
S2 = {ab}
(0.7) (0.6) (0.6)
1
S3 = {abc}
(0.6) (0.5) (0.3)
1
S4 = {ac}
(0.8) (0.7) (0.6)
1
Subspace
S1={a}
In the promotion cube, we precompute
the top-3 largest aggregate scores
(not the objects) in each subspace.
Objects in the cuboid and their aggregate scores
Aggregate
of target
object
S1 = {a}
0.7
S2 = {ab}
0.6
S3 = {abc}
0.3
S4 = {ac}
0.4
Rank
P
Promotion cube
S1={a}
Target object : t7
Return top-1 promotive subspace
P(S,t7) = Rank-1(S,t7)
Subspace
Top-3 largest aggregate
scores in the subspace
S3={abc}
(1.2) (1.0) (0.7)
1
S2 = {ab}
(0.7) (0.6) (0.6)
1
S3 = {abc}
(0.6) (0.5) (0.3)
1
S4 = {ac}
(0.8) (0.7) (0.6)
1
Subspace
S4={ac}
Sig
S1 = {a}
^
S2={ab}
Can you tell the exact rank of t7 in S1? Yes!
The aggregate score of t7 is 0.7, and exactly 2
other objects have a larger aggregate score, so its rank is 3.
Objects in the cuboid and their aggregate scores
Aggregate
of target
object
S1 = {a}
0.7
S2 = {ab}
0.6
S3 = {abc}
0.3
S4 = {ac}
0.4
Rank
P
Promotion cube
S1={a}
Target object : t7
Return top-1 promotive subspace
P(S,t7) = Rank-1(S,t7)
Subspace
Top-3 largest aggregate
scores in the subspace
S2={ab}
S4={ac}
Sig
S3={abc}
S1 = {a}
(1.2) (1.0) (0.7)
1
S2 = {ab}
(0.7) (0.6) (0.6)
1
S3 = {abc}
(0.6) (0.5) (0.3)
1
Can you tell the exact rank of t7 in S1? Yes!
The aggregate score of t7 is 0.7, and exactly 2
other objects have a larger aggregate score, so its rank is 3.
S4 = {ac}
(0.8) (0.7) (0.6)
1
Current top-1 result : S1(1/3)
Subspace
S1 = {a}
^
Objects in the cuboid and their aggregate scores
No need to compute
Aggregate
of target
object
Rank
P
0.7
3
1/3
S2 = {ab}
0.6
S3 = {abc}
0.3
S4 = {ac}
0.4
Promotion cube
Target object : t7
Return top-1 promotive subspace
P(S,t7) = Rank-1(S,t7)
Subspace
Top-3 largest aggregate
scores in the subspace
S4={ac}
S3={abc}
(1.2) (1.0) (0.7)
1
S2 = {ab}
(0.7) (0.6) (0.6)
1
S3 = {abc}
(0.6) (0.5) (0.3)
1
S4 = {ac}
(0.8) (0.7) (0.6)
1
Subspace
S2={ab}
Sig
S1 = {a}
^
S1={a}
Current top-1 result : S2(1/2)
Objects in the cuboid and their aggregate scores
Aggregate
of target
object
Rank
P
S1 = {a}
No need to compute
0.7
3
1/3
S2 = {ab}
No need to compute
0.6
2
1/2
S3 = {abc}
0.3
S4 = {ac}
0.4
Promotion cube
Target object : t7
Return top-1 promotive subspace
P(S,t7) = Rank-1(S,t7)
Subspace
Top-3 largest aggregate
scores in the subspace
S4={ac}
S3={abc}
(1.2) (1.0) (0.7)
1
S2 = {ab}
(0.7) (0.6) (0.6)
1
S3 = {abc}
(0.6) (0.5) (0.3)
1
S4 = {ac}
(0.8) (0.7) (0.6)
1
Subspace
S2={ab}
Sig
S1 = {a}
^
S1={a}
Current top-1 result : S2(1/2)
Objects in the cuboid and their aggregate scores
Aggregate
of target
object
Rank
P
S1 = {a}
No need to compute
0.7
3
1/3
S2 = {ab}
No need to compute
0.6
2
1/2
S3 = {abc}
No need to compute
0.3
3
1/3
S4 = {ac}
0.4
Promotion cube
S1={a}
Target object : t7
Return top-1 promotive subspace
P(S,t7) = Rank-1(S,t7)
Subspace
Top-3 largest aggregate
scores in the subspace
S2={ab}
S4={ac}
Sig
S3={abc}
S1 = {a}
(1.2) (1.0) (0.7)
1
S2 = {ab}
(0.7) (0.6) (0.6)
1
S3 = {abc}
(0.6) (0.5) (0.3)
1
Can you tell the exact rank of t7 in S4? No!
The aggregate score of t7 is 0.4, and at
least 3 objects have a larger aggregate score.
S4 = {ac}
(0.8) (0.7) (0.6)
1
Current top-1 result : S2(1/2)
Subspace
^
Objects in the cuboid and their aggregate scores
Aggregate
of target
object
Rank
P
S1 = {a}
No need to compute
0.7
3
1/3
S2 = {ab}
No need to compute
0.6
2
1/2
S3 = {abc}
No need to compute
0.3
3
1/3
S4 = {ac}
No need to compute
0.4
Promotion cube
S1={a}
Target object : t7
Return top-1 promotive subspace
P(S,t7) = Rank-1(S,t7)
Subspace
Top-3 largest aggregate
scores in the subspace
S2={ab}
S4={ac}
Sig
S3={abc}
S1 = {a}
(1.2) (1.0) (0.7)
1
S2 = {ab}
(0.7) (0.6) (0.6)
1
S3 = {abc}
(0.6) (0.5) (0.3)
1
Can you tell the exact rank of t7 in S4? No!
The aggregate score of t7 is 0.4, and at
least 3 objects have a larger aggregate score.
S4 = {ac}
(0.8) (0.7) (0.6)
1
Current top-1 result : S2(1/2)
Subspace
Objects in the cuboid and their aggregate scores
Aggregate
of target
object
Rank
P
S1 = {a}
No need to compute
0.7
3
1/3
S2 = {ab}
No need to compute
0.6
2
1/2
S3 = {abc}
No need to compute
0.3
3
1/3
S4 = {ac}
No need to compute
0.4
>=3
S4 Pruned!!!
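The lookup used in this walk-through can be sketched as below. The function name `rank_from_pcell` is hypothetical, and the sketch again assumes that rank is 1 plus the number of strictly larger scores. If fewer than k stored scores exceed the target's score, the rank is exact; otherwise only a lower bound is known.

```python
from fractions import Fraction

def rank_from_pcell(pcell, target_score):
    """Return (rank, exact) for the target using only the stored top-k scores."""
    ahead = sum(1 for s in pcell if s > target_score)
    exact = ahead < len(pcell)  # some stored score is <= the target's score
    return ahead + 1, exact

pcells = {
    "S1": [1.2, 1.0, 0.7],
    "S2": [0.7, 0.6, 0.6],
    "S3": [0.6, 0.5, 0.3],
    "S4": [0.8, 0.7, 0.6],
}
t7_scores = {"S1": 0.7, "S2": 0.6, "S3": 0.3, "S4": 0.4}

best = Fraction(0)
for name in ["S1", "S2", "S3", "S4"]:
    rank, exact = rank_from_pcell(pcells[name], t7_scores[name])
    upper_p = Fraction(1, rank)  # exact P, or an upper bound on P
    if exact:
        best = max(best, upper_p)
    elif upper_p < best:
        print(name, "pruned: rank >=", rank)
    # otherwise the subspace would have to be aggregated (not needed here)
print("top-1 P:", best)
```

For S4 this sketch derives rank >= 4 from the three larger stored scores, which is a slightly tighter bound than the ">=3" shown on the slide and leads to the same pruning decision.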
Experimental
evaluations
Settings
Implementation
Pentium 3GHz processor
2 GB of memory
160 GB hard disk
WinXP / Microsoft Visual C# 2008 (in-memory)
Datasets
DBLP
TPC-H
Algorithms
PromoRank
The basic query execution framework.
PromoRank++
The basic query execution framework
with subspace pruning and object pruning.
PromoCube
DBLP Dataset
Subspace dimensions
Conference (2,506)
Year (50)
Database (boolean)
Data mining (boolean)
Information retrieval (boolean)
Machine learning (boolean)
Object dimension: Author (450K)
Score dimension: Paper count
Base tuples: 1.76M
DBLP Dataset
The running time increases as R increases.
This is because the pruning threshold is determined
by the current top-R's aggregate score. The
threshold becomes looser as R becomes larger.
PromoCube performs extremely well when R is
small.
This is because, in such cases, PromoCube can
return the result directly with an O(1) lookup.
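One way to see why a larger R loosens the threshold is the bounded-heap sketch below. It is an assumption of this sketch, not something stated on the slides, that the threshold is the R-th best promotiveness score P found so far; the class name `TopR` is hypothetical.

```python
import heapq
from fractions import Fraction

class TopR:
    """Keep the R best (P, subspace) results seen so far."""

    def __init__(self, r):
        self.r = r
        self._heap = []  # min-heap; the root is the current R-th best result

    def threshold(self):
        """Subspaces whose upper bound on P falls below this can be pruned."""
        if len(self._heap) < self.r:
            return Fraction(0)  # nothing can be pruned until R results exist
        return self._heap[0][0]

    def offer(self, subspace, p):
        if len(self._heap) < self.r:
            heapq.heappush(self._heap, (p, subspace))
        elif p > self._heap[0][0]:
            heapq.heapreplace(self._heap, (p, subspace))

results = [("S1", Fraction(1, 3)), ("S2", Fraction(1, 2)), ("S3", Fraction(1, 3))]
top1, top2 = TopR(1), TopR(2)
for name, p in results:
    top1.offer(name, p)
    top2.offer(name, p)
print(top1.threshold(), top2.threshold())
```

With the slide values, the R = 1 threshold ends at 1/2 while the R = 2 threshold is only 1/3, so fewer subspaces can be discarded for the larger R.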
DBLP Dataset
[Figure: number of subspace aggregations]
TPC-H Dataset
Subspace dimensions
l_shipdate (2526)
l_quantity (50)
l_discount (11)
l_tax (9)
l_linenumber (7)
l_returnflag (3)
Object dimension: l_suppkey (10,000)
Score dimension: l_extendedprice (ranges from 901.00 to 104949.50)
Base tuples: 6,001,215
TPC-H
PromoCube becomes increasingly faster than the other algorithms as the number of tuples grows.
This is because PromoCube saves much more of the actual
aggregation and partitioning cost.
PromoCube prunes subspaces before any aggregation
happens, whereas PromoRank++ prunes subspaces during
the aggregation process.
Runtime increases when dimensionality increases.
This is because there are more target subspaces
when there are more dimensions.
The gap between PromoRank and PromoRank++ is
not large when the number of dimensions is small.
This is because the total number of target subspaces
is itself quite small, leaving less chance to perform pruning
that exploits the parent-child relationship.
TPC-H
All algorithms run faster when there are
more objects.
This is because more objects means fewer target
subspaces for each object.
With other parameters unchanged, if there are
more objects, each object appears in fewer
tuples, resulting in fewer target subspaces
for each object.
Both PromoRank++ and PromoCube favor large
cardinalities, because…
With other parameters unchanged, larger cardinality
implies more subspaces.
With the same number of tuples, the chance of two
tuples having the same dimension values
becomes lower.
Therefore, it is more likely that the aggregate
scores would be equal across parent-child
subspaces, thereby providing a tighter lower
bound for Rank.
TPC-H
{*}
{*}
{NY}
{NY,2007}
{2007}
{2008}
{NY,2008}
{NY}
{NY,2007}
{WA}
{NY,2008}
Both PromoRank++ and PromoCube favor large
cardinalities, because…
With other parameters unchanged, larger cardinality
implies more subspaces.
With the same number of tuples, the chance of two
tuples having the same dimension value
becomes lower.
Therefore, it is more likely that the aggregate
scores would be equal across parent-child
subspaces, thereby providing a tighter lower
bound for Rank.
{2007}
{WA,2007}
{2008}
{WA,2008}
TPC-H
{*}
{NY} 1
Location
Year
Score
T1
NY
2007
0.6
T2
NY
2008
0.4
{2007}
0.6
{NY,2007}
Object
{2008}
0.4
{NY,2008}
{*}
{NY} 0.6
Location
Year
Score
T1
NY
2007
0.6
T2
WA
2008
0.4
{WA} 0.4
{2007}
0.6
{NY,2007}
{NY,2008}
With other parameters unchanged, larger cardinality
implies more subspaces.
With the same number of tuples, the chance of two
objects having the same dimension value
becomes lower (sparse).
Therefore, it is more likely that the aggregate
scores would be equal across parent-child
subspaces, thereby providing a tighter lower
bound for Rank.
{2008}
0.4
Both PromoRank++ and PromoCube favor large
cardinalities, because…
Object
{WA,2007}
{WA,2008}
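The two small tables (one object with tuples T1 and T2) can be replayed with a short sketch. It assumes SUM as the aggregate, which matches {NY} = 1 = 0.6 + 0.4 in the dense case; the function name `subspace_scores` and the dictionary keys are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def subspace_scores(tuples, dims):
    """SUM-aggregate one object's scores in every non-empty subspace."""
    totals = defaultdict(float)
    for row in tuples:
        for n in range(1, len(dims) + 1):
            for chosen in combinations(dims, n):
                key = tuple(row[d] for d in chosen)
                totals[key] += row["score"]
    return dict(totals)

DIMS = ["location", "year"]
# Low cardinality: both tuples share the location NY.
dense = subspace_scores([
    {"location": "NY", "year": 2007, "score": 0.6},
    {"location": "NY", "year": 2008, "score": 0.4},
], DIMS)
# Higher cardinality: the tuples no longer share any dimension value.
sparse = subspace_scores([
    {"location": "NY", "year": 2007, "score": 0.6},
    {"location": "WA", "year": 2008, "score": 0.4},
], DIMS)

print(dense[("NY",)], dense[("NY", 2007)])    # parent is larger than child
print(sparse[("NY",)], sparse[("NY", 2007)])  # parent equals child
```

In the sparse case the parent {NY} and the child {NY,2007} carry the same score, so a bound derived from the child is exact for the parent.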
Conclusion
Introduced the promotion analysis
problem.
Presented a basic query execution
framework.
Proposed two pruning techniques and the
Promotion Cube for efficient query
processing.
The End
Appendix
Object pruning
S1={a}
S2={ab}
Target object : t7
Return top-1 promotive subspace
P(S,t7) = Rank-1(S,t7)
S4={ac}
S3={abc}
Subspace | Objects in the cuboid and their aggregate scores | Aggregate of target object | Rank | P
S1 = {a} | t6(1.2) t3(1.0) t1(0.7) t7(0.7) t4(0.3) t5(0.3) t2(0.2) | 0.7 | 3 | 1/3
S2 = {ab} | t6(0.7) t3(0.6) t7(0.6) t1(0.4) t4(0.3) t2(0.2) t5(0.2) | 0.6 | 2 | 1/2
S3 = {abc} | t3(0.6) t6(0.5) t7(0.3) t1(0.1) t5(0.1) | 0.3 | 3 | 1/3
S4 = {ac} | t3(0.8) t6(0.7) t1(0.6) t7(0.4) t5(0.2) t2(0.1) t4(0.1) | 0.4 | 4 | 1/4
Introduction
Promotion has been playing a key role in
marketing…
Manager
Query execution framework
{*}
Basic framework
Use a recursive process
that partitions and aggregates
the data to compute the target
object's rank in each
subspace.
{AB}
Depth-first manner
{A}
{AC}
{ABC}
{B}
{AD}
{ABD}
{BC}
{ACD}
{ABCD}
{C}
{D}
{BD}
{CD}
{BCD}
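A minimal sketch of such a recursive, depth-first partition-and-aggregate process is shown below. It is an illustration under assumptions rather than the PromoRank algorithm itself: the function name `promo_rank`, the row layout (`obj`, `score` plus one key per dimension), the SUM aggregate, and the toy data are all hypothetical.

```python
from collections import defaultdict

def promo_rank(rows, dims, target, start=0, cell=()):
    """Yield (subspace, rank of target) for every subspace containing target.

    Subspaces are visited depth-first: {*}, then {A}, {AB}, ..., then {B}, ...
    """
    totals = defaultdict(float)
    for r in rows:
        totals[r["obj"]] += r["score"]  # aggregate per object
    if target not in totals:
        return  # the target does not appear in this subspace
    rank = 1 + sum(1 for s in totals.values() if s > totals[target])
    yield cell, rank
    for i in range(start, len(dims)):
        parts = defaultdict(list)
        for r in rows:
            parts[r[dims[i]]].append(r)  # partition on dimension i
        for value, part in parts.items():
            yield from promo_rank(part, dims, target, i + 1,
                                  cell + ((dims[i], value),))

rows = [
    {"cat": "Sci", "year": 2009, "obj": "A", "score": 20},
    {"cat": "Sci", "year": 2009, "obj": "B", "score": 27},
    {"cat": "Com", "year": 2009, "obj": "A", "score": 9},
    {"cat": "Com", "year": 2009, "obj": "B", "score": 5},
]
results = dict(promo_rank(rows, ["cat", "year"], "A"))
for subspace, rank in results.items():
    print(subspace, rank)
```

Passing `i + 1` as the next starting dimension ensures each combination of dimensions is expanded exactly once.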
TPC-H
Effectiveness of promotion
query