Action Rules 2 - Personal Web Pages

Action Rules Discovery
without pre-existing
classification rules
By
Zbigniew W. Ras1,2 and Agnieszka Dardzinska1,3
1)
2)
3)
University of North Carolina, Charlotte, NC, USA
Polish Academy of Science, Warsaw, Poland
Bialystok Technical University, Bialystok, Poland
1. Introduction : Action Rule
An action rule is a rule extracted from a datatable
that describes a possible transition of objects from
one state to another with respect to a
distinguished attribute called a decision attribute
Information System
A
B
D
a1 b2 d1
a2 b2
a2 b2 d2
1. Introduction : Action Rule
An action rule is a rule extracted from a database
that describes a possible transition of objects from
one state to another with respect to a
distinguished attribute called a decision attribute
Assumption:
Information System
A
B
D
a1 b2 d1
a2 b2
a2 b2 d2
- attributes are partitioned into
stable and flexible
1. Introduction : Action Rule
An action rule is a rule extracted from a database
that describes a possible transition of objects from
one state to another with respect to a
distinguished attribute called a decision attribute
Assumption:
Information System
A
B
D
- attributes are partitioned into
stable and flexible
a1 b2 d1
a2 b2
a2 b2 d2
Values of attributes
can be changed
1. Introduction : Action Rule
Action rule is defined as a term [(ω) ∧ (α → β)] →(ϕ→ψ)
conjunction of fixed condition
features shared by both groups
proposed changes in values
of flexible features
Information System
A
B
D
a1 b2 d1
a2 b2
a2 b2 d2
desired effect of the action
2. Background and Objectives
Information System S
Information System
X
a
b
c
d
x1
a1
b1
c1
d1
x2
a2
b1
c2
d1
x3
a2
b2
c2
d1
x4
a2
b1
c1
d1
x5
a2
b3
c2
d1
x6
a1
b1
c2
d2
x7
a1
b2
c2
d1
x8
a1
b2
c1
d3
S = (X,A,V)
X = {x1, x2, x3, x4, x5, x6, x7, x8}
A = {a, b, c, d}
X is a set of objects
A is a set of attributes (partial
functions: X  V)
Va is a set of values of attribute a,
where a  A, and V = {Va : a  A},
3. ARED Algorithm (Stable and Flexible Att)
Decision System
Decision Attribute : {d}
Stable Attributes : Ast= {a}
Flexible Attributes : Afl = {b,c}
Constraints:
Minimum Support 1
Minimum Confidence 2
X
a
b
c
d
x1
a1
b1
c1
d1
x2
a2
b1
c2
d1
x3
a2
c2
d1
x4
a2
b1
c1
d1
x5
a2
b3
c2
d1
x6
a1
b1
x7
a1
b2
c2
d1
x8
a1
b2
c1
d3
d2
2. Background and Objectives
Action rule generation in earlier work used existing
classification rules (generated, for instance, by rule
discovery algorithms such as LERS or ERID)
Proposed Algorithm
1. Extract action rules directly from an information system
without using pre-existing classification rules.
2. Extract action rules that have minimal attribute
involvement.
Action Rules Discovery System DEAR
Objects
x1, x2, x3, x4
a
b
c
d
0
x1, x3
0
x2, x4
L
2
1
x7, x8
2
x7, x8
L
1
H
1
2
Decision Attribute: d
a=2
Objects
b
x7, x8
c
a
b
c
2
1
x2, x4
1
x5, x6
2
1
Objects
x7, x8
c=?
3
c=?
b
c
1
2
Objects
c
x2, x4
c=2
Objects
b
Objects
b
x7, x8
1 T1
x7, x8
1
x5, x6
3 T3
c=1
Objects
Objects
x1, x2, x3, x4
T4
T5
(T3, T1) : (a = 2)  (b, 21)  ( d, L  H)
T2
b
T6
b
x2, x4
x1, x3
c
c=?
3
b
b
x1, x2, x3,
x4
1
Objects
2
Objects
2
c=0
b
a=0
0
x2, x4
a=?
b
1
a=?
x1, x3
x7, x8
2
x5, x6
H
c
0
x2, x4
Objects
x7, x8
b
0
x2, x4
Set of rules R with supporting objects
Objects
a
x1, x3
Flexible Attribute: b
L
3
x1, x2, x3, x4
Stable Attributes: {a, c}
L
x2, x4
x5, x6
Objects
L
(a = 2)  (b, 31)  ( d, L  H)
3. Action Rules
Atomic Action Term:
(a, a1 → a2 )
a1, a2 ∈ Va
attribute
If a1 = a2 then a – stable on a1
3. Action Rules
Atomic Action Term:
(a, a1 → a2 )
a1, a2 ∈ Va
attribute
If a1 = a2 then a – stable on a1
Set of Action Terms:
1. If t – atomic action term, then t – an action term
2. If t1, t2 – action terms, then t1*t2 – an action term
3. If t – action term containing sub-terms (a, a1 → a2 ), (b, b1 → b2),
then a ≠ b
Domain of Action Term t:
Dom(t) – set of all attribute names listed in t
3. Action Rules
Atomic Action Term:
(a, a1 → a2 )
a1, a2 ∈ Va
attribute
If a1 = a2 then a – stable on a1
Action Rule:
r = [t1 → t2]
action term
Dom(t2)={d} and Dom(t1) in A
atomic action term
3. Example
Decision System S
X
a
b
c
d
x1
a1
b1
c1
d1
x2
a2
b1
c2
d1
x3
a2
b2
c2
d1
x4
a2
b1
c1
d1
x5
a2
b3
c2
d1
x6
a1
b1
c2
d2
x7
a1
b2
c2
d1
x8
a1
b2
c1
d3
(a, a2 → a2)
(b, b1 → b3)
(c, c2 → c2)
(d, d1 → d2)
atomic action terms
r=[[(a, a2→a2)*(b, b1→b3)] → (d, d1→d2)
Dom (r) ={a, b, d}
action rule
4. Interpretation NS of action terms
Information System S=(X, A, V)
1. If (a, a1 → a2) – atomic action term then:
NS((a, a1 → a2))=[{x∈X:a(x)=a1}, {x∈X:a(x)=a2}]
2. If t1=(a, a1 → a2) * t and NS(t)=[Y1, Y2], then:
NS(t1)=[ Y1 ∩ {x∈X:a(x)=a1}, Y2 ∩ {x∈X:a(x)=a2} ]
Assume that [Y1, Y2]∩[Z1, Z2] = [Y1∩Z1, Y2∩Z2], and
NS(t1)=[Y1, Y2], NS(t2)=[Z1, Z2], then: NS(t1*t2)=NS(t1)∩NS(t2)
4. Interpretation NS of action terms
Information System S=(X, A, V)
NS(t1)=[Y1, Y2]
NS(t2)=[Z1, Z2]
Action Rule:
r = [t1 → t2]
Support:
sup( r )  card (Y1  Z1 )
Confidence:
 card (Y1  Z1 )   card (Y2  Z 2 ) 
conf (r )  


card
(
Y
)
card
(
Y
)
1
2

 

Information System S
X
a
b
c
d
x1
a1
b1
c1
d1
x2
a2
b1
c2
d1
x3
a2
b2
c2
d1
x4
a2
b1
c1
d1
x5
a2
b3
c2
d1
x6
a1
b1
c2
d2
x7
a1
b2
c2
d1
x8
a1
b2
c1
d3
r = [[(a, a2→a2)*(b, b1→b2)] → (d, d1→d2)
NS((a, a2→a2))=[{x2, x3, x4, x5}, {x2, x3, x4, x5}]
NS((b, b1→b2))=[{x1, x2, x4, x6}, {x3, x7, x8}]
NS((d, d1→d2))=[{x1, x2, x3, x4, x5, x7}, {x6}]
Information System S
X
a
b
c
d
x1
a1
b1
c1
d1
x2
a2
b1
c2
d1
x3
a2
b2
c2
d1
x4
a2
b1
c1
d1
x5
a2
b3
c2
d1
x6
a1
b1
c2
d2
x7
a1
b2
c2
d1
x8
a1
b2
c1
d3
r = [[(a, a2→a2)*(b, b1→b2)] → (d, d1→d2)
NS((a, a2→a2))=[{x2, x3, x4, x5}, {x2, x3, x4, x5}]
NS((b, b1→b2))=[{x1, x2, x4, x6}, {x3, x7, x8}]
NS((d, d1→d2))=[{x1, x2, x3, x4, x5, x7}, {x6}]
NS((a, a2→a2)*(b, b1→b2))=[{x2, x4}, {x3}]
sup (r) = 2, conf (r) = 0
5. Algorithm ARED
-Similar to ERID and LERS
-Only positive marks yield action rules
-Action terms of length k are built from unmarked action terms of
length k – 1 and unmarked atomic action terms of length 1
S=(X, A ∪ {d}, V) – decision system
λ1, λ2 – minimum support and confidence
CS(a)={NS(ta): ta – an atomic action term built from elements in Va}
a∈A
5. ARED Algorithm
Object reclassification from class d1 to d2
X
a
b
c
d
x1
a1
b1
c1
d1
x2
a2
b1
c2
d1
x3
a2
b2
c2
d1
x4
a2
b1
c1
d1
x5
a2
b3
c2
d1
x6
a1
b1
c2
d2
x7
a1
b2
c2
d1
x8
a1
b2
c1
d3
For decision attribute in S:
NS(d,d1  d2)=[{x1,x2, x3, x4, x5, x7}, {x6}]
Notation:
t1=(b,b1b1), t2=(b,b2b2), t3=(b,b3b3),
t4=(a,a1a2), t5=(a,a1a1), t6=(a,a2a2),
t7=(a,a2a1),
t8=(c,c1c2), t9=(c,c2c1), t10=(c,c1c1),
t11=(c,c2c2),
t12 = (d,d1  d2),
stable attribute
flexible attributes
λ1=2, λ2=1/4
5. ARED Algorithm
Object reclassification from class d1 to d2
λ1=2, λ2=1/4
For decision attribute in S:
NS(d,d1  d2)=[{x1,x2, x3, x4, x5, x7}, {x6}]
For classification attribute in S:
Not marked λ1=3
NS(t1)=[{x1,x2, x4, x6}, {x1,x2, x4, x6}]
Mark “-” λ1=1
NS(t2)=[{x3,x7, x8}, {x3,x7, x8}]
Mark “-” λ1=1
NS(t3)=[{x5}, {x5}]
Mark “-” λ2=0
NS(t4)=[{x1,x6, x7, x8}, {x2,x3, x4, x5}]
Not marked λ1=4
NS(t5)=[{x1,x6, x7, x8}, {x1,x6, x7, x8}]
Mark “-” λ2=0
NS(t6)=[{x2,x3, x4, x5}, {x2,x3, x4, x5}]
Mark “+” λ1=4,λ2=1/4
NS(t7)=[{x2,x3, x4, x5}, {x1,x6, x7, x8}]
5. ARED Algorithm
Object reclassification from class d1 to d2
λ1=2, λ2=1/4
For decision attribute in S:
NS(t12)=[{x1,x2, x3, x4, x5, x7}, {x6}]
For classification attribute in S:
Not marked λ1=3
NS(t1)=[{x1,x2, x4, x6}, {x1,x2, x4, x6}]
Marked “-” λ2=0
NS(t2)=[{x3,x7, x8}, {x3,x7, x8}]
Marked “-” λ1=1
NS(t3)=[{x5}, {x5}]
Marked “-” λ2=0
NS(t4)=[{x1,x6, x7, x8}, {x2,x3, x4, x5}]
Not marked λ2=2
NS(t5)=[{x1,x6, x7, x8}, {x1,x6, x7, x8}]
Marked “-” λ2=0
NS(t6)=[{x2,x3, x4, x5}, {x2,x3, x4, x5}]
Mark “+” λ1=4, λ2=1/4 NS(t7)=[{x2,x3, x4, x5}, {x1,x6, x7, x8}]
r = [t7  t1]
5. ARED Algorithm
Object reclassification from class d1 to d2
λ1=2, λ2=1/4
For decision attribute in S:
NS(t12)=[{x1,x2, x3, x4, x5, x7}, {x6}]
For classification attribute in S:
Not marked
NS(t8)=[{x1,x4, x8}, {x2, x3, x5, x6, x7}]
Rule r = [t8→t12], conf = 2/3 *1/5 <λ2, sup=2 ≥ λ1
Marked “-”
NS(t9)=[{x2, x3, x5, x6, x7}, {x1, x4, x8}]
Marked “-”
NS(t10)=[{x1, x4, x8}, {x1, x4, x8}]
Not marked
NS(t11)=[{x2, x3, x5, x6, x7},
{x2, x3, x5, x6, x7}]
5. ARED Algorithm
Object reclassification from class d1 to d2
Now action terms of length 2
from unmarked action terms of length 1
λ1=2, λ2=1/4
For decision attribute in S:
NS(t12)=[{x1,x2, x3, x4, x5, x7}, {x6}]
For classification attribute in S:
Not marked
NS(t1*t5)=[{x1, x6}, {x1, x6}]
Marked “+”
NS(t1*t8)=[{x1, x4}, {x2, x6}]
Rule r = [t1*t8→t12], conf = 1/2 ≥ λ2, sup=2 ≥ λ1
Not marked
NS(t1*t11)=[{x2, x6}, {x2, x6}]
Marked “-”
NS(t5*t8)=[{x1, x8}, {x6, x7}]
Rule r = [t5*t8→t12], conf = 1/2 ≥ λ2, sup= 1 < λ1
Not marked
NS(t5*t11)=[{x6, x7}, {x6, x7}]
Marked “-”
NS(t8*t11)=[Ø, {x2, x3, x5, x6, x7}]
5. ARED Algorithm
Object reclassification from class d1 to d2
Now action terms of length 3
from unmarked action terms of length 2
λ1=2, λ2=1/4
For decision attribute in S:
NS(t12)=[{x1,x2, x3, x4, x5, x7}, {x6}]
For classification attribute in S:
(t1*t5*t8) –extension of (t5*t8)
Marked “-”
Algorithm STOPS!
5. ARED Algorithm
Object reclassification from class d1 to d2
λ1=2, λ2=1/4
For decision attribute in S:
NS(t12)=[{x1,x2, x3, x4, x5, x7}, {x6}]
For classification attribute in S:
X
a
b
c
d
x1
a1
b1
c1
d1
[[(b,b1→b1)*(c,c1→c2)] → (d, d1→d2)]
x2
a2
b1
c2
d1
x3
a2
b2
c2
d1
[[(a,a2→a1] → (d, d1→d2)]
x4
a2
b1
c1
d1
x5
a2
b3
c2
d1
x6
a1
b1
c2
d2
x7
a1
b2
c2
d1
x8
a1
b2
c1
d3
Action rules:
6. Application
Database HEPAR (Medical Center of Postgraduate Education,
Warsaw, Poland) prepared by Dr. med. Hanna Wasyluk

758 patients described by 106 attributes (including 31 laboratory tests
with values discretized to: “below normal”, “normal”, “above normal”).
14 attributes are stable. Two tests are invasive tests: HBsAg [in tissue],
HBcAg [in tissue]. There are 7 decision values:
I. Acute hepatitis
IIa. Subacute hepatitis (types BC)
IIb. Subacute hepatitis (alcohol-abuse)
IIIa. Chronic hepatitis (curable)
IIIb. Chronic hepatitis (non-curable)
IV. Cirrhosis hepatitis
V. Liver cancer
We can ask for re-classifications: IIb  I, IIIa  I
6. Application
Database HEPAR (heavily incomplete)
Handling data incompleteness

Attributes with more than 90% of null values have been
removed


Subjective attributes have been removed
[example: History of alcohol abuse]

Removal of absolute medical tests

RSES null value imputation module was used to make the
resulting dataset complete.
6. Application
Chosen reduct has no invasive tests, 11% incomplete data
{m, n, q, u, y, aa, ah, ai, am, an, aw, bb, bg, by, cj, cm}
m
n
q
u
y
aa
ah
ai
am
an
aw
bb
bg
bm
by
cj
cm
dd
Bleeding
Subjaundice symptoms
Eructation
Obstruction
Weight loss
Smoking
History of viral hepatitis (stable)
Surgeries in the past (stable)
History of hospitalization (stable)
Jaundice in pregnancy
Erythematous dermatitis
Cysts
Sharp liver edge (stable)
Blood cell plaque
Alkaline phosphatase
Prothrombin index
Total cholesterol
Decision attribute
6. Application Action Rules discovery in Database HEPAR








No history of viral hepatitis but with history of surgery and hospitalization, sharp
liver edge normal, no subjaundice symptoms, total cholesterol normal, erythematous
dermatitis normal, weight normal, no cysts , patient does not smoke
[(ah=1) (ai=2) (am=2)(bg=1)](cm=1) (aw=1)(u,  1)(bb=1) (aa=1)
 (q,  1) (m,  1) (n=1)  (bm,  down)  (y=1)  (by, norm  up)
 (dd, IIIa  I )
[(ah=1) (ai=2) (am=2) (bg=1)](cm=1)(aw=1)(u,  1) (bb=1) (aa=1)
 (q,  1)(m,  1)(n=1)(bm,  down)(y=1) (by, norm  down)
 (dd, IIIa  I )
Two quite similar action rules have been constructed:
By getting rid of obstruction, eructation, bleeding, by decreasing the blood cell
plaque and by changing the level of alkaline phosphatase we should be able
reclassify the patient from IIIa group to I.
Medical doctor should decide for a given patient if the alkaline phosphatase level
needs to be decreased or increased.
Attribute values of total cholesterol, weight, and smoking have to stay unchanged.
7. Conclusion and Future Work
• Presented algorithm discovers action rules directly from a decision
table.
• The proposed algorithm generates the shortest action rules
without using pre-existing classification rules
• Future work includes
- Action rule generation from incomplete information system
- Action rule generation when attributes are hierarchical
- Considering different flexibilities of attributes (e.g. the social condition is most likely less flexible
than the health condition )
Questions?
Thank You