UP-Growth: An Efficient Algorithm for
High Utility Itemset Mining
Vincent S. Tseng, Cheng-Wei Wu, Bai-En Shie, and
Philip S. Yu
SIGKDD 2010
2010/8/25
Outline
Motivation
Problem Definition
Method
UP-Tree Structure
UP-Growth Method
Experimental Results
Conclusions
Motivation
Frequent itemset mining does not take the unit profits and
purchased quantities of the items into consideration.
The basic meaning of utility is the interestingness,
importance, or profitability of items to the users.
(Cont.)
The utility of items in a transaction database consists of two
aspects:
External utility: the importance of distinct items.
Internal utility: the importance of the items in the
transaction.
The utility of an itemset is defined as the external utility
multiplied by the internal utility.
High utility itemset: an itemset whose utility is no less than a user-specified threshold.
(Cont.)
Mining high utility itemsets from databases is not an easy
task because the downward closure property used in frequent
itemset mining does not hold for utility.
The challenge is to prune the search space effectively while
still capturing every high utility itemset.
Problem Definition
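Utility of an item $i_p$ in a transaction $T_d$: $u(i_p, T_d) = p(i_p) \times q(i_p, T_d)$
Example: u({A}, T1) = 5*1 = 5
Utility of an itemset $X$ in a transaction $T_d$: $u(X, T_d) = \sum_{i_p \in X \wedge X \subseteq T_d} u(i_p, T_d)$
Example: u({AC}, T1) = u({A}, T1) + u({C}, T1) = 5+1 = 6
Utility of an itemset $X$ in the database $D$: $u(X) = \sum_{X \subseteq T_d \wedge T_d \in D} u(X, T_d)$
Example: u({AD}) = u({AD}, T1) + u({AD}, T3) = 7+17 = 24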
An itemset is called a high utility itemset if
its utility is no less than min_util
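The utility definitions on this slide can be checked with a short Python sketch. The profit table and transaction table are not reproduced in this text, so the `PROFIT` and `DB` values below are assumptions chosen to agree with the worked numbers on the slides; the function names are illustrative only.

```python
# Assumed running example (the tables are not shown in the text):
# unit profit per item, and item -> purchased quantity per transaction.
PROFIT = {"A": 5, "B": 2, "C": 1, "D": 2, "E": 3, "F": 1, "G": 1}
DB = {
    "T1": {"A": 1, "C": 1, "D": 1},
    "T2": {"A": 2, "C": 6, "E": 2, "G": 5},
    "T3": {"A": 1, "B": 2, "C": 1, "D": 6, "E": 1, "F": 5},
    "T4": {"B": 4, "C": 3, "D": 3, "E": 1},
    "T5": {"B": 2, "C": 2, "E": 1, "G": 2},
}


def item_utility(item, tid):
    """u(i_p, T_d) = p(i_p) * q(i_p, T_d)."""
    return PROFIT[item] * DB[tid][item]


def itemset_utility_in(itemset, tid):
    """u(X, T_d): sum of the utilities of the items of X in T_d."""
    return sum(item_utility(i, tid) for i in itemset)


def itemset_utility(itemset):
    """u(X): sum of u(X, T_d) over every transaction that contains X."""
    return sum(
        itemset_utility_in(itemset, tid)
        for tid, t in DB.items()
        if set(itemset).issubset(t)
    )


def is_high_utility(itemset, min_util):
    """An itemset is high utility if u(X) >= min_util."""
    return itemset_utility(itemset) >= min_util


print(item_utility("A", "T1"))         # 5
print(itemset_utility_in("AC", "T1"))  # 6
print(itemset_utility("AD"))           # 24
```

Itemsets are passed as strings of single-letter item names purely for brevity; any iterable of item names would work.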
Transaction utility of $T_d$: $TU(T_d) = u(T_d, T_d)$
Example: TU(T1) = u({ACD}, T1) = 8
Transaction-weighted utilization of $X$: $TWU(X) = \sum_{X \subseteq T_d \wedge T_d \in D} TU(T_d)$
Example: TWU({AD}) = TU(T1) + TU(T3) = 8+30 = 38
If TWU(X) is no less than the minimum utility threshold, X is called a high transaction-weighted utilization itemset (abbreviated as HTWUI).
The transaction-weighted downward closure (TWDC): for any itemset X, if X is not an HTWUI, any superset of X is a low utility itemset.
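As a rough illustration of TU, TWU, and the TWDC pruning test, the following Python sketch recomputes the slide's example. The `PROFIT` and `DB` tables are assumptions (they are not reproduced in this text) chosen to match the slide's numbers, and the helper names are hypothetical.

```python
# Assumed running example: unit profits and item -> quantity per transaction.
PROFIT = {"A": 5, "B": 2, "C": 1, "D": 2, "E": 3, "F": 1, "G": 1}
DB = {
    "T1": {"A": 1, "C": 1, "D": 1},
    "T2": {"A": 2, "C": 6, "E": 2, "G": 5},
    "T3": {"A": 1, "B": 2, "C": 1, "D": 6, "E": 1, "F": 5},
    "T4": {"B": 4, "C": 3, "D": 3, "E": 1},
    "T5": {"B": 2, "C": 2, "E": 1, "G": 2},
}


def transaction_utility(tid):
    """TU(T_d): total utility of all items in the transaction."""
    return sum(PROFIT[i] * q for i, q in DB[tid].items())


def twu(itemset):
    """TWU(X): sum of TU(T_d) over the transactions containing X."""
    return sum(
        transaction_utility(tid)
        for tid, t in DB.items()
        if set(itemset).issubset(t)
    )


def is_htwui(itemset, min_util):
    """X is an HTWUI if TWU(X) >= min_util."""
    return twu(itemset) >= min_util


print(transaction_utility("T1"))  # 8
print(twu("AD"))                  # 38
print(is_htwui("AD", 40))         # False
```

With a threshold of 40, {AD} is not an HTWUI, so by TWDC none of its supersets needs to be examined.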
Proposed Method
Construction of UP-Tree
Generation of potential high utility itemsets (PHUIs) from the
UP-Tree by UP-Growth
Construction of UP-Tree
The construction of UP-Tree can be performed with two scans of
the original database.
First scan
TU of each transaction is computed.
TWU of each single item is also accumulated.
Discarding global unpromising items.
An item whose TWU is less than min_util is a global unpromising item.
Unpromising items are removed from each transaction, and their utilities are
subtracted from the TU of the transaction.
The remaining promising items in each transaction are sorted in
descending order of TWU.
Second scan
Transactions are inserted into UP-Tree.
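The preprocessing steps above can be sketched in Python as follows. This is a minimal sketch, not the authors' implementation: the `PROFIT` and `DB` tables are assumptions standing in for the example database that is not reproduced in this text, and `first_scan` and `reorganize` are hypothetical helper names.

```python
# Assumed running example: unit profits and item -> quantity per transaction.
PROFIT = {"A": 5, "B": 2, "C": 1, "D": 2, "E": 3, "F": 1, "G": 1}
DB = {
    "T1": {"A": 1, "C": 1, "D": 1},
    "T2": {"A": 2, "C": 6, "E": 2, "G": 5},
    "T3": {"A": 1, "B": 2, "C": 1, "D": 6, "E": 1, "F": 5},
    "T4": {"B": 4, "C": 3, "D": 3, "E": 1},
    "T5": {"B": 2, "C": 2, "E": 1, "G": 2},
}


def first_scan(db, profit):
    """Compute TU of each transaction and accumulate TWU of each item."""
    tu = {tid: sum(profit[i] * q for i, q in t.items()) for tid, t in db.items()}
    twu = {}
    for tid, t in db.items():
        for item in t:
            twu[item] = twu.get(item, 0) + tu[tid]
    return tu, twu


def reorganize(db, profit, twu, min_util):
    """Drop unpromising items, reduce TU accordingly, sort by TWU descending."""
    promising = {i for i, w in twu.items() if w >= min_util}
    result = {}
    for tid, t in db.items():
        kept = sorted((i for i in t if i in promising), key=lambda i: (-twu[i], i))
        reduced_tu = sum(profit[i] * t[i] for i in kept)
        result[tid] = (kept, reduced_tu)
    return result


tu, twu = first_scan(DB, PROFIT)
reorganized = reorganize(DB, PROFIT, twu, min_util=40)
print(sorted(i for i, w in twu.items() if w < 40))  # ['F', 'G']
print(reorganized["T2"])  # (['C', 'E', 'A'], 22)
print(reorganized["T3"])  # (['C', 'E', 'A', 'B', 'D'], 25)
```

Ties in TWU are broken alphabetically here, which is an arbitrary choice made only to keep the output deterministic.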
(Cont.)
First scan (min_util = 40)
[Figure: the unpromising items are marked, and the remaining items of each transaction are arranged in descending order of TWU.]
(Cont.)
Second scan
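[Figure sequence: the reorganized transactions are inserted one at a time into the global UP-Tree; each node is labeled with its count and node utility. Labels visible in the figures include (1, 8), (2, 30), and (1, 22).]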
Strategy 1. Discarding global unpromising items (DGU).
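The second scan with only DGU applied can be sketched as a prefix-tree insertion in which every node on a transaction's path accumulates that transaction's reduced TU. The sketch below is illustrative only: the `Node` class and `insert_transaction` are hypothetical names, the header table of the UP-Tree is omitted, and the two reorganized transactions are assumed values derived from an example database that is not reproduced in this text.

```python
class Node:
    """A UP-Tree node: item name, support count, and node utility (nu)."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.nu = 0
        self.parent = parent
        self.children = {}


def insert_transaction(root, items, reduced_tu):
    """Insert a reorganized transaction; each node on the path gains reduced_tu."""
    node = root
    for item in items:
        child = node.children.get(item)
        if child is None:
            child = Node(item, node)
            node.children[item] = child
        child.count += 1
        child.nu += reduced_tu
        node = child


# Assumed reorganized transactions T1' and T2' (items in TWU order, reduced TU).
root = Node(None)
insert_transaction(root, ["C", "A", "D"], 8)
insert_transaction(root, ["C", "E", "A"], 22)

c = root.children["C"]
print(c.count, c.nu)                              # 2 30
print(c.children["E"].count, c.children["E"].nu)  # 1 22
print(c.children["A"].count, c.children["A"].nu)  # 1 8
```

The shared prefix {C} is stored once, so its count and node utility grow with every transaction that passes through it.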
Generating PHUIs from the global UP-Tree
An item ip is called a local promising
item in {ai}-CPB if pu(ip, {ai}-CPB) is
no smaller than min_util; otherwise it is a local unpromising item.
[Figure: {D}'s conditional pattern base ({D}-CPB)]
{A} is a local unpromising item in {D}-CPB, so within {D}-CPB
any superset of {A} is not a high utility
itemset.
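The local promising test can be illustrated with a small Python sketch. As a shortcut, the conditional pattern base is derived here directly from the reorganized transactions rather than by traversing the UP-Tree; this is a simplification of my own, and it assumes only DGU has been applied. The `REORGANIZED` list is an assumed example (the database is not reproduced in this text) and the function names are hypothetical.

```python
# Assumed reorganized transactions: (items in descending TWU order, reduced TU).
REORGANIZED = [
    (["C", "A", "D"], 8),
    (["C", "E", "A"], 22),
    (["C", "E", "A", "B", "D"], 25),
    (["C", "E", "B", "D"], 20),
    (["C", "E", "B"], 9),
]


def conditional_pattern_base(reorganized, item):
    """Collect (prefix path, path utility) for every transaction containing item."""
    cpb = []
    for items, reduced_tu in reorganized:
        if item in items:
            prefix = items[: items.index(item)]
            if prefix:
                cpb.append((prefix, reduced_tu))
    return cpb


def path_utilities(cpb):
    """pu(i_p, CPB): sum of the path utilities of the paths containing i_p."""
    pu = {}
    for prefix, utility in cpb:
        for i in prefix:
            pu[i] = pu.get(i, 0) + utility
    return pu


cpb_d = conditional_pattern_base(REORGANIZED, "D")
pu_d = path_utilities(cpb_d)
print(pu_d)                                           # {'C': 53, 'A': 33, 'E': 45, 'B': 45}
print(sorted(i for i, u in pu_d.items() if u >= 40))  # ['B', 'C', 'E']
```

Under these assumptions {A} has a path utility of 33 in {D}-CPB, below the threshold of 40, which agrees with the slide's statement that {A} is a local unpromising item.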
(Cont.)
Generating PHUIs from {D}-Tree:
{{D}:58, {DE}:45, {DEB}:45, {DEC}:45, {DEBC}:45,
{DB}:45, {DBC}:45, {DC}:53}
The set of PHUIs is {{D}:58, {DE}:45, {DEB}:45, {DEC}:45, {DEBC}:45,
{DB}:45, {DBC}:45, {DC}:53, {B}:61, {BE}:54, {BEC}:54, {BC}:54,
{A}:65, {AC}:55, {ACE}:47, {AE}:47, {E}:88, {EC}:76, {C}:96}.
Decreasing global node utilities (DGN)
in the construction of a global UP-Tree
Strategy 2. Decreasing global node utilities (DGN)
For each node, the utilities of its descendants are discarded from the utility of
the node during the construction of a global UP-Tree.
[Figure: {B}'s conditional pattern base ({B}-CPB)]
(Cont.)
[Figure sequence: construction of the global UP-Tree with DGN applied; the first inserted node is labeled with count 1 and node utility 1.]
(Cont.)
Inserting the reorganized transaction T2’ with DGN:
{C}.nu = 1 + p({C})×q({C}, T2’) = 1 + 1×6 = 7
{E}.nu = p({C})×q({C}, T2’) + p({E})×q({E}, T2’) = 1×6 + 3×2 = 12
{A}.nu = p({C})×q({C}, T2’) + p({E})×q({E}, T2’) + p({A})×q({A}, T2’) = 1×6 + 3×2 + 5×2 = 22
Resulting nodes (count, node utility): {C}: (2, 7); {E}: (1, 12); {A}: (1, 22)
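The DGN node-utility computation can be sketched as a running prefix sum: when a reorganized transaction is inserted, each node receives only the utilities of the items up to and including itself, which is the same as discarding the utilities of its descendants. The code below is a minimal sketch under that reading; `Node` and `insert_with_dgn` are hypothetical names, and the per-item utilities of T1' and T2' are assumed values consistent with the numbers on the slides.

```python
class Node:
    """A UP-Tree node: item name, support count, and node utility (nu)."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.nu = 0
        self.parent = parent
        self.children = {}


def insert_with_dgn(root, transaction):
    """Insert (item, utility) pairs; a node gains the utility of its prefix only."""
    node = root
    prefix_utility = 0
    for item, utility in transaction:
        prefix_utility += utility  # excludes the utilities of all descendants
        child = node.children.get(item)
        if child is None:
            child = Node(item, node)
            node.children[item] = child
        child.count += 1
        child.nu += prefix_utility
        node = child


# Assumed reorganized transactions with per-item utilities p(i) * q(i, T).
T1 = [("C", 1), ("A", 5), ("D", 2)]
T2 = [("C", 6), ("E", 6), ("A", 10)]

root = Node(None)
insert_with_dgn(root, T1)
insert_with_dgn(root, T2)

c = root.children["C"]
e = c.children["E"]
print(c.count, c.nu)                              # 2 7
print(e.count, e.nu)                              # 1 12
print(e.children["A"].count, e.children["A"].nu)  # 1 22
```

Compared with accumulating the whole reduced TU on every node, the node utilities near the root are much smaller, which tightens the overestimated utilities used for pruning.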
(Cont.)
A set of PHUIs is {{D}:58, {DE}:45,
{DEB}:45, {DEBC}:45, {DEC}:45, {DB}:45,
{DBC}:45, {DC}:53, {B}:61, {A}:65, {E}:88,
{C}:96}.
UP-Growth
UP-Growth generates PHUIs efficiently from the global UP-Tree
with two strategies:
DLU (Discarding local unpromising items)
DLN (Decreasing local node utilities)
DLU
Because of limited memory, the exact utility values of the items
in a conditional pattern base are not maintained; a minimum item
utility table (MIUT) is maintained instead.
Strategy 3. Discarding local unpromising items (DLU)
The minimum item utilities of local unpromising items are discarded from the path
utilities of the paths during the construction of a local UP-Tree.
(Cont.)
Discarding the local unpromising item {A} from the paths of {D}-CPB:
Path {AC}: 8 - miu({A}) × {AC}.count = 8 - 5×1 = 3
Path {BAEC}: 25 - miu({A}) × {BAEC}.count = 25 - 5×1 = 20
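A minimal Python sketch of the MIUT and the DLU adjustment follows. The `PROFIT` and `DB` tables and the contents of `CPB_D` are assumptions (the example database and the {D}-CPB figure are not reproduced in this text), and `min_item_utility_table` and `apply_dlu` are hypothetical helper names. The re-sorting of each path by local path utility is omitted.

```python
# Assumed running example: unit profits and item -> quantity per transaction.
PROFIT = {"A": 5, "B": 2, "C": 1, "D": 2, "E": 3, "F": 1, "G": 1}
DB = {
    "T1": {"A": 1, "C": 1, "D": 1},
    "T2": {"A": 2, "C": 6, "E": 2, "G": 5},
    "T3": {"A": 1, "B": 2, "C": 1, "D": 6, "E": 1, "F": 5},
    "T4": {"B": 4, "C": 3, "D": 3, "E": 1},
    "T5": {"B": 2, "C": 2, "E": 1, "G": 2},
}
# Assumed {D}-CPB: (path items, path utility, count).
CPB_D = [(["C", "A"], 8, 1), (["C", "E", "A", "B"], 25, 1), (["C", "E", "B"], 20, 1)]


def min_item_utility_table(db, profit):
    """miu(i): the smallest utility item i has in any single transaction."""
    miut = {}
    for t in db.values():
        for item, qty in t.items():
            utility = profit[item] * qty
            miut[item] = min(miut.get(item, utility), utility)
    return miut


def apply_dlu(cpb, miut, min_util):
    """Remove local unpromising items and subtract miu * count from each path."""
    pu = {}
    for items, utility, _ in cpb:
        for i in items:
            pu[i] = pu.get(i, 0) + utility
    unpromising = {i for i, u in pu.items() if u < min_util}
    reduced = []
    for items, utility, count in cpb:
        for i in items:
            if i in unpromising:
                utility -= miut[i] * count
        reduced.append(([i for i in items if i not in unpromising], utility, count))
    return reduced


miut = min_item_utility_table(DB, PROFIT)
reduced_cpb = apply_dlu(CPB_D, miut, min_util=40)
print(miut["A"], miut["B"], miut["E"])  # 5 4 3
print(reduced_cpb[0])                   # (['C'], 3, 1)
print(reduced_cpb[1])                   # (['C', 'E', 'B'], 20, 1)
```

Because only the minimum utility of each item is stored, the subtraction is a lower bound on what could be removed, so the reduced path utility remains a safe overestimate.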
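DLN
Strategy 4. Decreasing local node utilities (DLN):
The minimum item utilities of a node's descendants are subtracted from the
node's utility during the construction of a local UP-Tree.
[Figure: local UP-Tree after the first path is inserted; the node has count 1 and node utility 3.]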
DLN (Cont.)
Inserting a path with path utility 20 whose descendant nodes are {B} and {E}:
First node: 3 + {20 - miu({B})×1 - miu({E})×1} = 3 + 13 = 16 (count 2)
Second node: 20 - miu({E})×1 = 20 - 3 = 17 (count 1)
Third node: 20 (count 1)
DLN (Cont.)
Inserting a second path with path utility 20 along the same nodes:
First node: 16 + {20 - miu({B})×1 - miu({E})×1} = 16 + 13 = 29 (count 3)
Second node: 17 + {20 - miu({E})×1} = 17 + 17 = 34 (count 2)
Third node: 40 (count 2)
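The DLN arithmetic on these slides can be replayed with a short Python sketch. It is an illustration under stated assumptions, not the authors' code: the `MIU` values and the three paths are taken or inferred from the slide's calculations, the node order C, B, E is inferred from which minimum item utilities are subtracted at each node, and `Node` and `insert_path_dln` are hypothetical names.

```python
class Node:
    """A local UP-Tree node: item name, count, and node utility (nu)."""

    def __init__(self, item):
        self.item = item
        self.count = 0
        self.nu = 0
        self.children = {}


# Assumed minimum item utilities, consistent with the slide's arithmetic.
MIU = {"B": 4, "C": 1, "E": 3}


def insert_path_dln(root, items, path_utility, count, miu):
    """Insert a path; each node gains path_utility minus miu * count of its descendants."""
    node = root
    for k, item in enumerate(items):
        child = node.children.get(item)
        if child is None:
            child = Node(item)
            node.children[item] = child
        descendants = items[k + 1:]
        child.count += count
        child.nu += path_utility - sum(miu[d] * count for d in descendants)
        node = child


# Assumed paths after DLU: (items in local order, path utility, count).
root = Node(None)
insert_path_dln(root, ["C"], 3, 1, MIU)
insert_path_dln(root, ["C", "B", "E"], 20, 1, MIU)
insert_path_dln(root, ["C", "B", "E"], 20, 1, MIU)

c = root.children["C"]
b = c.children["B"]
e = b.children["E"]
print(c.count, c.nu)  # 3 29
print(b.count, b.nu)  # 2 34
print(e.count, e.nu)  # 2 40
```

The last node of a path has no descendants, so it receives the full path utility, while nodes closer to the root receive progressively less.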
Experimental Results
Scalability
Conclusions
This paper proposed an efficient algorithm, UP-Growth, for
mining high utility itemsets.
A UP-Tree structure is proposed for maintaining the
information of high utility itemsets.
With the four strategies (DGU, DGN, DLU, and DLN), the mining performance is enhanced
significantly because both the search space and the number of
candidates are effectively reduced.