The Chosen Few:
On Identifying Valuable Patterns
Bjorn Bringmann & Albrecht Zimmermann
ICDM 2007
10/09/08
Outline
Introduction
General Algorithm
Instantiations of Algorithm
Experimental Evaluation
Conclusion
Introduction
Pattern mining operations typically return very large, highly redundant pattern sets. To remove the redundant information contained in such pattern sets, we propose a general heuristic approach for selecting a small subset of patterns.
We identify several selection techniques for use in this general algorithm and evaluate them on several data sets.
The aim is to reduce the set of patterns returned by a data mining operation to a subset that is small enough to be inspected by a human user.
General Algorithm
Given a set S of patterns pi, the database T, and a redundancy measure σ, select a subset S* ⊆ S s.t. S* satisfies the following requirements:
1. S* is small, s.t. a human expert could inspect it.
2. Members of S* have low redundancy w.r.t. T according to σ.
3. Members of S* describe characteristics of T.
Notation:
Each pattern p is associated with a function p : T → {true, false}.
S = {p1, ..., pn} is called a pattern set.
Define an equivalence relation ~S on the set T of transactions as
~S = { (t1, t2) ∈ T × T | ∀p ∈ S : p(t1) = p(t2) }.
General Algorithm
Using the equivalence relation ~S, the partition (or quotient set) of T over S is defined as
T/~S = { [x] | x ∈ T }, where [x] = { a ∈ T | a ~S x }.
The equivalence classes [x] are called blocks.
Define a measure σ : (T, S*, p) → [0, 1] ⊆ R that scores a candidate pattern p against the already selected set S* on the database T.
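As a concrete illustration of the quotient set, here is a minimal Python sketch (ours, not taken from the paper): a pattern is modelled as a boolean function over transactions, and transactions are grouped into blocks by their signature under S.

```python
from collections import defaultdict

def quotient_set(transactions, patterns):
    """Group transactions into the blocks of T/~S induced by the pattern set S."""
    blocks = defaultdict(list)
    for t in transactions:
        key = tuple(p(t) for p in patterns)  # signature of t under all patterns in S
        blocks[key].append(t)
    return list(blocks.values())

# Tiny usage example with itemset patterns over transactions given as sets of items.
transactions = [{"a", "b"}, {"a"}, {"b", "c"}, {"c"}]
patterns = [lambda t: "a" in t, lambda t: "b" in t]
print(len(quotient_set(transactions, patterns)))  # -> 4 blocks
```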
General Algorithm
The minimal number of patterns needed to induce a given partition is log2 |T/~S*|.
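The general algorithm can then be read as a greedy loop: walk over the candidate patterns in some order and keep a pattern whenever the measure σ reaches a threshold t. The sketch below assumes σ is passed in as a function sigma(transactions, selected, candidate); the names and signature are our assumptions, not code from the paper.

```python
def select_patterns(transactions, candidates, sigma, t):
    """Greedy selection: keep a candidate whenever the measure reaches threshold t."""
    selected = []
    for p in candidates:  # candidates are assumed to be pre-sorted (e.g. by support)
        if sigma(transactions, selected, p) >= t:
            selected.append(p)
    return selected
```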
Instantiations of Algorithm
Partition size quotient (Q)
Agglomerative clustering (C)
Inference of patterns (I)
Partition size quotient
The partition size quotient compares the number of blocks before and after tentatively adding a pattern p:
σ_Q(T, S*, p) = 1 − |T/~S*| / |T/~S*∪{p}|
Advantage: easy to evaluate.
Disadvantage: focusing solely on the number of blocks, without considering which blocks are split, is not enough. This may be acceptable in early steps of the selection process, when not many patterns are used and only few blocks have formed, but less so later on.
Example: adding p2 splits 2 blocks into 4, so σ_Q = 1 − 2/4 = 0.5; if this is ≥ the threshold t, p2 is added.
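A direct sketch of this measure, reusing the hypothetical quotient_set helper from the earlier sketch:

```python
def sigma_q(transactions, selected, p):
    """Partition size quotient: 1 - (#blocks before) / (#blocks after adding p)."""
    before = len(quotient_set(transactions, selected))
    after = len(quotient_set(transactions, selected + [p]))
    return 1.0 - before / after  # e.g. 1 - 2/4 = 0.5 in the slide example
```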
Agglomerative clustering
To alleviate the weakness of Q, one can use agglomerative clustering, which merges some of the new sub-blocks until the old number of blocks is reached; the resulting partition is then compared to the old one.
The Rand index is defined as follows: assume two partitions P and P'. For each pair of instances ti, tj, two decision variables exist:
cij, set to 1 if the two instances end up in the same block in both P and P', 0 otherwise;
dij, set to 1 if the two instances are assigned to different blocks in both partitions, 0 otherwise.
Agglomerative clustering
[Pairwise decision matrix over the transactions t1–t5: each pair is marked c = 1 (same block in both partitions) or d = 1 (different blocks in both partitions).]
Example: with 5 transactions there are 5·4/2 = 10 pairs; here Σc = 1 and Σd = 6, so
σ_C = 1 − 2·(1 + 6)/(5·4) = 1 − 0.7 = 0.3; if this is ≥ the threshold t, p2 is added.
Advantage: it takes the size and composition of the blocks into account, not only their number.
Disadvantage: it takes longer to evaluate; the clustering process needs quadratic time, and the Rand index requires n(n−1)/2 pairwise decisions.
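The Rand-index part of this measure can be sketched as below; partitions are represented as dicts mapping each transaction id to a block id (our representation, not the paper's), and the agglomerative merging of sub-blocks back to the old number of blocks is left out for brevity.

```python
from itertools import combinations

def rand_index(partition_a, partition_b):
    """Rand index between two partitions, each given as {transaction_id: block_id}."""
    agree = 0
    pairs = 0
    for ti, tj in combinations(sorted(partition_a), 2):
        pairs += 1
        same_a = partition_a[ti] == partition_a[tj]
        same_b = partition_b[ti] == partition_b[tj]
        if same_a == same_b:  # counts both c_ij = 1 and d_ij = 1 decisions
            agree += 1
    return agree / pairs

def sigma_c(old_partition, clustered_partition):
    """1 - Rand index between the old partition and the clustered-back new one."""
    return 1.0 - rand_index(old_partition, clustered_partition)

# Slide example (5 transactions, sum c = 1, sum d = 6):
# sigma_c = 1 - 2 * (1 + 6) / (5 * 4) = 0.3
```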
Inference of patterns
Evaluate the possibility of predicting the presence/absence of a pattern based on the presence of previously chosen patterns.
Given a pattern set S* = {p1, ..., pk}, a new pattern pk+1, and the database T, identify each transaction ti with its binary feature vector
fS*(ti) = ⟨p1(ti), ..., pk(ti)⟩,
label it with c(ti) = pk+1(ti), and use a learner to induce a hypothesis
h : X → {0, 1}, where X = { fS*(t) | t ∈ T }.
Inference of patterns
Example (k = 1, five transactions):
fS*(t1) = 1, pk+1(t1) = 1
fS*(t2) = 0, pk+1(t2) = 0
fS*(t3) = 1, pk+1(t3) = 0
fS*(t4) = 1, pk+1(t4) = 1
fS*(t5) = 0, pk+1(t5) = 1
The induced hypothesis predicts pk+1 correctly on 3 of the 5 transactions, so σ_I = 1 − 3/5 = 0.4; if this is ≥ the threshold t, p2 is added.
This measure judges a candidate against the previously chosen patterns directly; it does not use any information about blocks.
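A sketch of this measure under a strong simplification: instead of the learner used in the paper, a trivial lookup learner (majority label per feature vector) stands in, which is only meant to illustrate the idea of σ_I as one minus the predictive accuracy.

```python
from collections import Counter, defaultdict

def sigma_i(transactions, selected, p_new):
    """1 - accuracy of predicting p_new(t) from the feature vector f_S*(t).

    A trivial lookup learner (majority label per feature vector) stands in for
    the learner used in the paper; this is an illustrative assumption only.
    """
    examples = [(tuple(q(t) for q in selected), p_new(t)) for t in transactions]

    # "Training": majority label for each distinct feature vector.
    label_counts = defaultdict(Counter)
    for fv, label in examples:
        label_counts[fv][label] += 1
    hypothesis = {fv: counts.most_common(1)[0][0] for fv, counts in label_counts.items()}

    # Accuracy of the hypothesis on T, turned into a redundancy score.
    correct = sum(1 for fv, label in examples if hypothesis[fv] == label)
    return 1.0 - correct / len(examples)
```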
Experimental Evaluation
We used pattern sets from five UCI itemset mining tasks, harvested using an APRIORI implementation with different minimum support thresholds. We obtained closed pattern sets of the sizes shown in Table 1.
Experimental Evaluation
• Even with the minimum support threshold set quite high, and with the result restricted to closed patterns (which already removes quite some redundancy), the size of the pattern sets is still too large for a user.
• Recreating the partition with far fewer patterns gives a large reduction, but only for the "right" processing order.
• With each measure we can use four orderings in total: support ascending (s↑), support descending (s↓), length ascending (l↑), length descending (l↓); a small sketch of these orderings follows below.
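The sketch below assumes each candidate pattern object carries precomputed support and length attributes, which is our assumption, not part of the paper.

```python
# Four candidate orderings over patterns with `support` and `length` attributes.
def ordered(patterns, key, ascending=True):
    return sorted(patterns, key=lambda p: getattr(p, key), reverse=not ascending)

orderings = {
    "support_asc":  lambda ps: ordered(ps, "support", ascending=True),
    "support_desc": lambda ps: ordered(ps, "support", ascending=False),
    "length_asc":   lambda ps: ordered(ps, "length",  ascending=True),
    "length_desc":  lambda ps: ordered(ps, "length",  ascending=False),
}
```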
Experimental Evaluation
[Figure: the left curve shows |S*|; the y-axis shows the number of blocks |T/~S*|, with |T/~S| shown for comparison.]
Thresholds {0, 0.01, 0.03, 0.05, 0.1, 0.2, 0.3, 0.4} were used, obtaining one reduced pattern set per threshold for each of the itemsets.
Observations:
1. "High support" ordering vs. "low support" ordering.
2. The reduction in patterns is, for most settings, not linear in the decrease in blocks.
Conclusion
The results show that the technique succeeds in severely reducing the number of patterns, while at the same time apparently retaining much of the original information.
The experiments also show that reducing the pattern set indeed improves the quality of classification results. Both results show that the approach is well suited for the goals it aims at.
Other measures could be plugged into the general algorithm in place of the ones presented here.