Monitoring, Security, and Rescue Techniques in Multiagent Systems

Advances in Soft Computing
Editor-in-chief
Prof. Janusz Kacprzyk
Systems Research Institute
Polish Academy of Sciences
ul. Newelska 6
01-447 Warsaw, Poland
E-mail: [email protected]
Barbara Dunin-Kęplicz
Andrzej Jankowski
Andrzej Skowron
Marcin Szczuka
Monitoring, Security,
and Rescue Techniques
in Multiagent Systems
With 138 Figures
Springer
Barbara Dunin-Kęplicz
Institute of Computer Science
Polish Academy of Sciences
Ordona 21
01-237 Warsaw, Poland
and
Institute of Informatics, Warsaw University
Banacha 2
02-097 Warsaw, Poland
and
Institute for Decision Process Support
Chemikow 5
09-411 Plock, Poland
Andrzej Jankowski
Institute for Decision Process Support
Chemikow 5
09-411 Plock, Poland
Andrzej Skowron
Institute of Mathematics
Warsaw University
Banacha 2
02-097 Warsaw, Poland
and
Institute for Decision Process Support
Chemikow 5
09-411 Plock, Poland
Marcin Szczuka
Institute of Mathematics
Warsaw University
Banacha 2
02-097 Warsaw, Poland
Library of Congress Control Number: 2004116865
ISSN 1615-3871
ISBN 3-540-23245-1 Springer Berlin Heidelberg NewYork
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation,
broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication
of this publication or parts thereof is permitted only under the provisions of the German Copyright
Law of September 9, 1965, in its current version, and permission for use must always be obtained from
Springer. Violations are liable to prosecution under the German Copyright Law.
Springer is a part of Springer Science+Business Media
springeronline.com
© Springer-Verlag Berlin Heidelberg 2005
Printed in Germany
The use of general descriptive names, registered names, etc. in this publication does not imply, even
in the absence of a specific statement, that such names are exempt from the relevant protective laws
and regulations and free for general use.
Cover design: Erich Kirchner, Springer Heidelberg
Typesetting: Digital data supplied by editors
Production: medionet AG, Berlin
Printed on acid-free paper
Preface
In today's society the issue of security, understood in the widest context, has become a crucial one. The people of the information age, having instant access to the sources of knowledge and information, expect technology to improve their safety in all respects. That leads straightforwardly to the demand for methods, technologies, frameworks and computerized tools which serve this purpose. Nowadays such methods and tools are increasingly expected to embed not only ubiquitous information sources, but also the knowledge that stems from them. The use of knowledge-based technology in security applications, and in society at large, clearly emerges as the next big challenge. Within the general set of security-related tasks we may single out some sub-fields such as monitoring, control, crisis and rescue management.
Multiagent systems are meant to be a toolset for modelling of, automated reasoning with, and study of the behavior of compound environments that involve many perceiving, reasoning and acting parties. In a natural way they are well suited for supporting research on the foundations of automatic reasoning processes, starting from data acquisition (including data entry, sensor measurements and multimedia information processing), through automatic knowledge perception, real-life situation assessment and planning, to action execution in the context of monitoring, security, and rescue techniques. These activities are closely related to many very active research areas such as autonomous systems, spatio-temporal reasoning, knowledge representation, soft computing with rough, fuzzy and rough mereological approaches, perception, learning, evolution, adaptation, data mining and knowledge discovery, collective intelligence and behavior. All these research directions have plenty of possible applications in systems that are concerned with assuring security, acting in emergency and crisis situations, monitoring vital infrastructures, managing cooperative jobs in situations of danger, and planning actions for rescue campaigns.
This volume contains extended and improved versions of selected contributions
presented at the International Workshop "Monitoring, Security, and Rescue Techniques in Multiagent Systems" (MSRAS 2004) held in Plock, Poland, June 7-9,
2004.
The MSRAS 2004 workshop was aimed at gathering the world's leading researchers active in areas related to monitoring, security, and rescue techniques in multiagent systems. Such techniques are among the core issues that involve very large numbers of heterogeneous agents in a hostile environment. The intention of the workshop was to promote research and development in these significant domains.
The workshop itself was a significant success thanks to the presence and contributions of the leading researchers in the field. By establishing a forum for exchanging the results and experience of top specialists working in areas closely related to such tasks, it created new possibilities for scientific cooperation. The workshop was also the first step on the road to establishing a permanent research and technology center in Plock. We hope that this center, a part of the Industrial and Technology Park, will become an important institution contributing to fostering research in knowledge-based technologies and security.
Organization of the volume
As the 48 contributions in this volume span a very wide area of research, it is quite hard to categorize them precisely. Therefore, there are only two major parts in this volume. The first one, entitled "Foundations and Methods", gathers the papers of a more theoretical and fundamental character as well as those dealing with general, basic descriptions of various methodologies and paradigms. The second part, entitled "Application Domains and Case Studies", is meant to encapsulate the articles that deal with more specific problems, concrete solutions and application examples. Naturally, the division is very subjective and should not be treated as definitive.
Within each part the papers are organized in accordance with their role at the workshop. This means that in each of the parts the articles from keynote presenters come first, followed by invited and regular contributions, and finally by papers that were part of special and poster sessions.
Acknowledgements
We wish to express our gratitude to Professors Zdzislaw Pawlak and Lotfi A. Zadeh
who accepted our invitation to act as honorary chairs of the workshop. We are also
very grateful to all outstanding scientists who participated in the workshop, including:
Andrzej Uszok
Jean-Pierre Muller
James F. Peters
Katia Sycara
David Wolpert
Philip S. Yu
Hans-Dieter Burkhard
Tom R. Burns
Nikolai Chilov
Andrzej Czyzewski
Barbara Dunin-Kęplicz
Rineke Verbrugge
Amal El Fallah-Seghrouchni
Vladimir Gorodetski
Zbigniew Michalewicz
Hideyuki Nakanishi
Sankar K. Pal
Lech Polkowski
Alberto Pettorossi
Zbigniew Ras
Alexander Ryjov
Marek Sergot
V.S. Subrahmanian
Guoyin Wang
Hui Wang
Many thanks to all authors who prepared their articles for this volume.
We would like to thank PKN Orlen, the City of Plock and the supervisor of the MSRAS organization, Ms. Sulika Kamińska. Without the money, expertise, and organizational muscle they provided, neither the MSRAS workshop nor the publication of this volume would have been possible.
We are also thankful to Springer-Verlag and Dr. Thomas Ditzinger for the opportunity of publishing this volume in the "Advances in Soft Computing" series.
Warsaw,
July 2004
Barbara Dunin-Kęplicz
Andrzej Jankowski
Andrzej Skowron
Marcin Szczuka
Contents
Part I Foundations and Methods
1 Flow Graphs, their Fusion and Data Analysis
Zdzislaw Pawlak
2 Approximation Spaces for Hierarchical Intelligent Behavioral System Models
James F. Peters
3 Distributed Adaptive Control: Beyond Single-Instant, Discrete Control Variables
David H. Wolpert, Stefan Bieniawski
4 Multi-agent Planning for Autonomous Agents' Coordination
Amal El Fallah-Seghrouchni
5 Creating Common Beliefs in Rescue Situations
Barbara Dunin-Kęplicz, Rineke Verbrugge
6 Coevolutionary Processes for Strategic Decisions
Rodney W. Johnson, Michael E. Melich, Zbigniew Michalewicz, Martin Schmidt
7 Automatic Proofs of Protocols via Program Transformation
Fabio Fioravanti, Alberto Pettorossi, Maurizio Proietti
8 Mereological Foundations to Approximate Reasoning
Lech Polkowski
9 Data Security and Null Value Imputation in DIS
Zbigniew W. Ras, Agnieszka Dardzinska
10 Basic Principles and Foundations of Information Monitoring Systems
Alexander Ryjov
11 Modelling Unreliable and Untrustworthy Agent Behaviour
Marek Sergot
12 Nearest Neighbours without k
Hui Wang, Ivo Duntsch, Gunther Gediga, Gongde Guo
13 Classifiers Based on Approximate Reasoning Schemes
Jan Bazan, Andrzej Skowron
14 Towards Rough Applicability of Rules
Anna Gomolinska
15 On the Computer-Assisted Reasoning about Rough Sets
Adam Grabowski
16 Similarity-Based Data Reduction and Classification
Gongde Guo, Hui Wang, David Bell, Zhining Liao
17 Decision Trees and Reducts for Distributed Decision Tables
Mikhail Ju. Moshkov
18 Learning Concept Approximation from Uncertain Decision Tables
Nguyen Sinh Hoa, Nguyen Hung Son
19 In Search for Action Rules of the Lowest Cost
Zbigniew W. Ras, Angelina A. Tzacheva
20 Circularity in Rule Knowledge Bases Detection using Decision Unit Approach
Roman Siminski, Alicja Wakulicz-Deja
21 Feedforward Concept Networks
Dominik Ślęzak, Marcin Szczuka, Jakub Wroblewski
22 Extensions of Partial Structures and Their Application to Modelling of Multiagent Systems
Bozena Staruch
23 Tolerance Information Granules
Jaroslaw Stepaniuk
24 Attribute Reduction Based on Equivalence Relation Defined on Attribute Set and Its Power Set
Ling Wei, Wenxiu Zhang
25 Query Cost Model Constructed and Analyzed in a Dynamic Environment
Zhining Liao, Hui Wang, David Glass, Gongde Guo
26 The Efficiency of the Rules' Classification Based on the Cluster Analysis Method and Salton's Method
Agnieszka Nowak, Alicja Wakulicz-Deja
27 Extracting Minimal Templates in a Decision Table
Barbara Marszał-Paszek, Piotr Paszek
Part II Application Domains and Case Studies
28 Programming Bounded Rationality
Hans-Dieter Burkhard
29 Generalized Game Theory's Contribution to Multi-agent Modelling
Tom R. Burns, Jose Castro Caldas, Ewa Roszkowska
30 Multi-Agent Decision Support System for Disaster Response and Evacuation
Alexander Smirnov, Michael Pashkin, Nikolai Chilov, Tatiana Levashova, Andrew Krizhanovsky
31 Intelligent System for Environmental Noise Monitoring
Andrzej Czyzewski, Bozena Kostek, Henryk Skarzynski
32 Multi-agent and Data Mining Technologies for Situation Assessment in Security-related Applications
Vladimir Gorodetsky, Oleg Karsaev, Vladimir Samoilov
33 Virtual City Simulator for Education, Training, and Guidance
Hideyuki Nakanishi
34 Neurocomputing for Certain Bioinformatics Tasks
Shubhra Sankar Ray, Sanghamitra Bandyopadhyay, Pabitra Mitra, Sankar K. Pal
35 Rough Set Based Solutions for Network Security
Guoyin Wang, Long Chen, Yu Wu
36 Task Assignment with Dynamic Token Generation
Alessandro Farinelli, Luca Iocchi, Daniele Nardi, Fabio Patrizi
37 DyKnow: A Framework for Processing Dynamic Knowledge and Object Structures in Autonomous Systems
Fredrik Heintz, Patrick Doherty
38 Classifier Monitoring using Statistical Tests
Rafal Latkowski, Cezary Głowiński
39 What Do We Learn When We Learn by Doing? Toward a Model of Dorsal Vision
Ewa Rauch
40 Rough Mereology as a Language for a Minimalist Mobile Robot's Environment Description
Lech Polkowski, Adam Szmigielski
41 Data Acquisition in Robotics
Krzysztof Luks
42 Spatial Sound Localization for Humanoid
Lech Blazejewski
43 Oculomotor Humanoid Active Vision System
Piotr Kazmierczak
44 Crisis Management via Agent-based Simulation
Grzegorz Dobrowolski, Edward Nawarecki
45 Monitoring in Multi-Agent Systems: Two Perspectives
Marek Kisiel-Dorohinicki
46 Multi-Agent Environment for Management of Crisis in an Enterprises-Markets Complex
Jaroslaw Kozlak
47 Behavior Based Detection of Unfavorable Events Using the Multiagent System
Krzysztof Cetnarowicz, Edward Nawarecki, Gabriel Rojek
48 Intelligent Medical Systems on Internet Technologies Platform
Beata Zielosko, Andrzej Dyszkiewicz
Author Index
Flow Graphs, their Fusion and Data Analysis
Zdzislaw Pawlak
Institute of Computer Sciences
Warsaw University of Technology
ul. Nowowiejska 15/19, 00-665 Warsaw, Poland
and
Warsaw School of Information Technology
ul. Newelska 6, 01-447 Warsaw, Poland
zpw@ii.pw.edu.pl
Summary. This paper concerns a new approach to data analysis based on the study of information flow distribution in flow graphs. The introduced flow graphs differ from those proposed by Ford and Fulkerson, for they do not describe material flow in the flow graph but information "flow" about the data structure.
Data analysis (mining) can be reduced to information flow analysis and the relationship between data can be boiled down to information flow distribution in a flow network. Moreover, it is revealed that information flow satisfies Bayes' rule, which is in fact an information flow conservation equation. Hence information flow has a probabilistic character; however, Bayes' rule in our case can be interpreted in an entirely deterministic way, without referring to prior and posterior probabilities, inherently associated with Bayesian philosophy.
Furthermore, in this paper we study the hierarchical structure of flow networks by allowing a subgraph determined by nodes x and y to be substituted by a single branch connecting x and y, called the fusion of x and y. This "fusion" operation allows us to look at data with different accuracy and move from details to a general picture of the data structure.
Key words: flow graphs, data fusion, data mining, Bayes' rule
1.1 Introduction
In [4] we presented a new approach to data analysis based on the study of information flow distribution in flow graphs. The introduced flow graphs differ from those proposed by Ford and Fulkerson [1], for they do not describe material flow in the flow graph but information "flow" about the data structure.
With every branch of the flow graph three coefficients are associated, called strength, certainty and coverage factors. These coefficients have been widely used in data mining and rough set theory, but in fact they were first introduced by Lukasiewicz [2] in connection with his study of logic and probability. These coefficients have a probabilistic flavor, but here they are interpreted in a deterministic way, describing information flow distribution in the flow graph.
We claim that data analysis (mining) can be reduced to information flow analysis and the relationship between data can be boiled down to information flow distribution in a flow network. Moreover, it is revealed that information flow satisfies Bayes' rule, which is in fact an information flow conservation equation. Hence information flow has a probabilistic character; however, Bayes' rule in our case can be interpreted in an entirely deterministic way, without referring to prior and posterior probabilities, inherently associated with Bayesian philosophy.
Furthermore, in this paper we study the hierarchical structure of flow networks by allowing a subgraph determined by nodes x and y to be substituted by a single branch connecting x and y, called the fusion of x and y. This "fusion" operation allows us to look at data with different accuracy and move from details to a general picture of the data structure.
This approach allows us to study different relationships in data and can be used
as a new mathematical tool for data mining.
Summing up, we advocate using flow analysis for:
• searching for patterns in data,
• searching for dependencies in data,
• data classification,
• data fusion.
A simple tutorial example will be used to illustrate the introduced ideas.
1.2 Example 1 - Smoking and Cancer
First let us explain basic concepts of the proposed methodology on a simple example
taken from [3].
In Table 1.1 data concerning 60 people who do or do not smoke and do or do not have cancer are shown.
Table 1.1. Smoking and Cancer

             No cancer   Cancer   Total
Not smoke        40         7       47
Smoke            10         3       13
Total            50        10       60
With every data table like that presented in Table 1.1 we associate a flow graph as shown in Fig. 1.1.
Nodes $x_0$ and $x_1$ are inputs of the graph, whereas $y_0$ and $y_1$ are outputs of the graph. The numbers $\varphi(x_0)$ and $\varphi(x_1)$ assigned to the input nodes of the flow graph represent inflow to the flow graph, whereas the numbers $\varphi(y_0)$ and $\varphi(y_1)$ associated with the outputs represent outflow of the graph. Every branch $(x, y)$ of the flow graph is labeled by a number which represents a throughflow $\varphi(x, y)$ through the branch from node $x$ to node $y$.
This representation of data is intended to capture the relationships in the data and
is not meant to describe any material flow in the network.
Fig. 1.1. Flow graph for Table 1.1
We will show in the next sections that representation of data as flow in a flow graph can be used to discover many important relationships in data, e.g. dependences. However, to this end we have to "normalize" the flow graph by using, instead of absolute values of flow $\varphi(x)$, their relative values $\sigma(x)$, i.e. the percentage of flow with respect to the total flow of the graph. The absolute throughflow $\varphi(x, y)$ will also be replaced by the relative throughflow $\sigma(x, y)$. This normalized representation has very interesting mathematical properties, which can be used to discover patterns in data.
Besides, we will use two additional coefficients called the certainty and coverage factors, denoted $\mathrm{cer}(x, y)$ and $\mathrm{cov}(x, y)$ respectively, which characterize how the flow is spread between nodes $x$ and $y$.
Normalized flow graph for the flow graph given in Fig. 1.1 is shown in Fig. 1.2.
Fig. 1.2. Normalized flow graph for Table 1.1
From the flow graph we arrive at the following conclusions:
• 85% of non-smoking persons do not have cancer ($\mathrm{cer}(x_0, y_0) = 40/47 \approx 0.85$),
• 15% of non-smoking persons do have cancer ($\mathrm{cer}(x_0, y_1) = 7/47 \approx 0.15$),
• 77% of smoking persons do not have cancer ($\mathrm{cer}(x_1, y_0) = 10/13 \approx 0.77$),
• 23% of smoking persons do have cancer ($\mathrm{cer}(x_1, y_1) = 3/13 \approx 0.23$).
From the flow graph we get the following reasons for having or not having cancer:
• 80% of persons not having cancer do not smoke ($\mathrm{cov}(x_0, y_0) = 4/5 = 0.80$),
• 20% of persons not having cancer do smoke ($\mathrm{cov}(x_1, y_0) = 1/5 = 0.20$),
• 70% of persons having cancer do not smoke ($\mathrm{cov}(x_0, y_1) = 7/10 = 0.70$),
• 30% of persons having cancer do smoke ($\mathrm{cov}(x_1, y_1) = 3/10 = 0.30$).
Let us observe that in statistical terminology $\sigma(x_0), \sigma(x_1)$ are priors, $\sigma(x_0, y_0), \ldots, \sigma(x_1, y_1)$ are joint distributions, $\mathrm{cov}(x_0, y_0), \ldots, \mathrm{cov}(x_1, y_1)$ are posteriors and $\sigma(y_0), \sigma(y_1)$ are marginal probabilities.
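The figures quoted in this example can be recomputed directly from Table 1.1. The following is a minimal sketch, not part of the original paper; the dictionary names (`phi`, `sigma`, `cer`, `cov`) and the string labels for the nodes are illustrative choices. Exact fractions are used so that the values match the text without rounding.

```python
from fractions import Fraction

# Throughflows phi(x, y) taken from Table 1.1.
# x0 = does not smoke, x1 = smokes; y0 = no cancer, y1 = cancer.
phi = {
    ("x0", "y0"): 40, ("x0", "y1"): 7,
    ("x1", "y0"): 10, ("x1", "y1"): 3,
}

total = sum(phi.values())  # total flow of the graph (60 people)

# Strength of each branch: its share of the total flow.
sigma = {branch: Fraction(flow, total) for branch, flow in phi.items()}

# Normalized flow through each input node x and each output node y.
sigma_x = {}
sigma_y = {}
for (x, y), s in sigma.items():
    sigma_x[x] = sigma_x.get(x, 0) + s
    sigma_y[y] = sigma_y.get(y, 0) + s

# Certainty and coverage factors of each branch.
cer = {(x, y): s / sigma_x[x] for (x, y), s in sigma.items()}
cov = {(x, y): s / sigma_y[y] for (x, y), s in sigma.items()}

for branch in sorted(phi):
    print(branch, "cer =", cer[branch], "cov =", cov[branch])
```

For instance, `cer[("x0", "y0")]` is 40/47 and `cov[("x1", "y1")]` is 3/10, the values listed above.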
1.3 Flow Graphs Basic Concepts
1.3.1 Flow Graphs
In this section the fundamental concept of the proposed approach, the flow graph, is defined and discussed.
A flow graph is a directed, acyclic, finite graph $G = (N, B, \varphi)$, where $N$ is a set of nodes, $B \subseteq N \times N$ is a set of directed branches, $\varphi : B \to R^+$ is a flow function and $R^+$ is the set of non-negative reals.
Input of a node $x \in N$ is the set $I(x) = \{y \in N : (y, x) \in B\}$; output of a node $x \in N$ is defined as $O(x) = \{y \in N : (x, y) \in B\}$.
We will also need the concept of input and output of a graph $G$, defined, respectively, as follows: $I(G) = \{x \in N : I(x) = \emptyset\}$, $O(G) = \{x \in N : O(x) = \emptyset\}$.
Inputs and outputs of $G$ are external nodes of $G$; other nodes are internal nodes of $G$.
If $(x, y) \in B$ then $\varphi(x, y)$ is a throughflow from $x$ to $y$.
With every node $x$ of a flow graph $G$ we associate its inflow
$$\varphi_+(x) = \sum_{y \in I(x)} \varphi(y, x), \tag{1.1}$$
and outflow
$$\varphi_-(x) = \sum_{y \in O(x)} \varphi(x, y). \tag{1.2}$$
Similarly, we define an inflow and an outflow for the whole flow graph, which are defined as
$$\varphi_+(G) = \sum_{x \in I(G)} \varphi_-(x), \tag{1.3}$$
$$\varphi_-(G) = \sum_{x \in O(G)} \varphi_+(x). \tag{1.4}$$
We assume that for any internal node $x$, $\varphi_+(x) = \varphi_-(x) = \varphi(x)$, where $\varphi(x)$ is a throughflow of node $x$.
Obviously, $\varphi_+(G) = \varphi_-(G) = \varphi(G)$, where $\varphi(G)$ is a throughflow of graph $G$.
The above formulas can be considered as flow conservation equations [4].
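To make these definitions concrete, here is a small sketch that computes inputs, outputs, inflow and outflow and checks flow conservation. The graph used is hypothetical (it does not appear in the paper): nodes `a` and `b` are inputs, `m` and `n` are internal, `c` and `d` are outputs. The function names are likewise illustrative.

```python
# Hypothetical flow graph, chosen only so that flow is conserved at the
# internal nodes m and n. Keys are branches (x, y), values are phi(x, y).
phi = {
    ("a", "m"): 30, ("b", "m"): 10, ("b", "n"): 20,
    ("m", "c"): 25, ("m", "d"): 15, ("n", "d"): 20,
}

def inputs_of(x):
    """I(x): the set of nodes with a branch leading into x."""
    return {u for (u, v) in phi if v == x}

def outputs_of(x):
    """O(x): the set of nodes reached by a branch leaving x."""
    return {v for (u, v) in phi if u == x}

def inflow(x):
    """phi_+(x): sum of throughflows on branches entering x."""
    return sum(f for (u, v), f in phi.items() if v == x)

def outflow(x):
    """phi_-(x): sum of throughflows on branches leaving x."""
    return sum(f for (u, v), f in phi.items() if u == x)

nodes = {n for branch in phi for n in branch}
graph_inputs = {x for x in nodes if not inputs_of(x)}    # I(G)
graph_outputs = {x for x in nodes if not outputs_of(x)}  # O(G)
internal = nodes - graph_inputs - graph_outputs

# Assumption of the model: inflow equals outflow at every internal node.
conserved = all(inflow(x) == outflow(x) for x in internal)

# Inflow and outflow of the whole graph.
graph_inflow = sum(outflow(x) for x in graph_inputs)
graph_outflow = sum(inflow(x) for x in graph_outputs)

print(sorted(graph_inputs), sorted(graph_outputs), conserved)
print(graph_inflow, graph_outflow)
```

In this sketch the graph inflow and outflow coincide, as the conservation equations require whenever every internal node conserves flow.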
We will now define a normalized flow graph.
A normalized flow graph is a directed, acyclic, finite graph $G = (N, B, \sigma)$, where $N$ is a set of nodes, $B \subseteq N \times N$ is a set of directed branches and $\sigma : B \to \langle 0, 1 \rangle$ is a normalized flow of $(x, y)$ and
$$\sigma(x, y) = \frac{\varphi(x, y)}{\varphi(G)} \tag{1.5}$$
is a strength of $(x, y)$. Obviously, $0 \le \sigma(x, y) \le 1$. The strength of the branch simply expresses the percentage of the total flow through the branch.
In what follows we will use normalized flow graphs only, therefore by flow
graphs we will understand normalized flow graphs, unless stated otherwise.
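With every node $x$ of a flow graph $G$ we associate its inflow and outflow defined as
$$\sigma_+(x) = \frac{\varphi_+(x)}{\varphi(G)} = \sum_{y \in I(x)} \sigma(y, x), \tag{1.6}$$
$$\sigma_-(x) = \frac{\varphi_-(x)}{\varphi(G)} = \sum_{y \in O(x)} \sigma(x, y). \tag{1.7}$$
Obviously for any internal node $x$, we have $\sigma_+(x) = \sigma_-(x) = \sigma(x)$, where $\sigma(x)$ is a normalized throughflow of $x$.
Moreover, let
$$\sigma_+(G) = \frac{\varphi_+(G)}{\varphi(G)} = \sum_{x \in I(G)} \sigma_-(x), \tag{1.8}$$
$$\sigma_-(G) = \frac{\varphi_-(G)}{\varphi(G)} = \sum_{x \in O(G)} \sigma_+(x). \tag{1.9}$$
Obviously, $\sigma_+(G) = \sigma_-(G) = \sigma(G) = 1$.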
If we invert the direction of all branches in $G$, then the resulting graph $G' = (N, B', \sigma')$ will be called an inverted graph of $G$. Of course the inverted graph $G'$ is also a flow graph and all inputs and outputs of $G$ become outputs and inputs of $G'$, respectively.
1.3.2 Certainty and Coverage Factors
With every branch (x, y) of a flow graph G we associate the certainty and the coverage factors.
The certainty and the coverage of (x, y) are defined as
$$\mathrm{cer}(x, y) = \frac{\sigma(x, y)}{\sigma(x)}, \tag{1.10}$$
and
$$\mathrm{cov}(x, y) = \frac{\sigma(x, y)}{\sigma(y)}, \tag{1.11}$$
respectively.
Evidently, $\mathrm{cer}(x, y) = \mathrm{cov}(y, x)$, where $(x, y) \in B$ and $(y, x) \in B'$.
Below some properties, which are immediate consequences of the definitions given above, are presented:
$$\sum_{y \in O(x)} \mathrm{cer}(x, y) = 1, \tag{1.12}$$
$$\sum_{x \in I(y)} \mathrm{cov}(x, y) = 1, \tag{1.13}$$
$$\sigma(x) = \sum_{y \in O(x)} \mathrm{cer}(x, y)\,\sigma(x) = \sum_{y \in O(x)} \sigma(x, y), \tag{1.14}$$
$$\sigma(y) = \sum_{x \in I(y)} \mathrm{cov}(x, y)\,\sigma(y) = \sum_{x \in I(y)} \sigma(x, y), \tag{1.15}$$
$$\mathrm{cer}(x, y) = \frac{\mathrm{cov}(x, y)\,\sigma(y)}{\sigma(x)}, \tag{1.16}$$
$$\mathrm{cov}(x, y) = \frac{\mathrm{cer}(x, y)\,\sigma(x)}{\sigma(y)}. \tag{1.17}$$
Obviously the above properties have a probabilistic flavor, e.g., equations (1.14) and (1.15) have the form of the total probability theorem, whereas formulas (1.16) and (1.17) are Bayes' rules. However, in our approach these properties are interpreted in a deterministic way and they describe flow distribution among branches in the network.
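These properties can be checked numerically. The sketch below does so on a hypothetical graph with two inputs (`a`, `b`), two internal nodes (`m`, `n`) and two outputs (`c`, `d`); the graph and all identifiers are assumptions made for illustration, not material from the paper. Exact fractions keep the equalities free of rounding error.

```python
from fractions import Fraction

# Hypothetical flow graph; flow is conserved at internal nodes m and n.
phi = {
    ("a", "m"): 30, ("b", "m"): 10, ("b", "n"): 20,
    ("m", "c"): 25, ("m", "d"): 15, ("n", "d"): 20,
}
tails = {u for (u, _) in phi}  # nodes with at least one outgoing branch
heads = {v for (_, v) in phi}  # nodes with at least one incoming branch

# phi(G): total flow leaving the input nodes (nodes with no incoming branch).
total = sum(f for (u, _), f in phi.items() if u not in heads)
sigma = {branch: Fraction(f, total) for branch, f in phi.items()}

def out_sigma(x):
    """Normalized outflow of x; equals sigma(x) for input and internal nodes."""
    return sum(s for (u, _), s in sigma.items() if u == x)

def in_sigma(y):
    """Normalized inflow of y; equals sigma(y) for internal and output nodes."""
    return sum(s for (_, v), s in sigma.items() if v == y)

cer = {(x, y): s / out_sigma(x) for (x, y), s in sigma.items()}
cov = {(x, y): s / in_sigma(y) for (x, y), s in sigma.items()}

# Certainties of branches leaving a node sum to 1.
cer_sums_ok = all(
    sum(c for (u, _), c in cer.items() if u == x) == 1 for x in tails
)
# Coverages of branches entering a node sum to 1.
cov_sums_ok = all(
    sum(c for (_, v), c in cov.items() if v == y) == 1 for y in heads
)
# Bayes' rules relating certainty and coverage on every branch.
bayes_ok = all(
    cer[(x, y)] == cov[(x, y)] * in_sigma(y) / out_sigma(x)
    and cov[(x, y)] == cer[(x, y)] * out_sigma(x) / in_sigma(y)
    for (x, y) in phi
)

print(cer_sums_ok, cov_sums_ok, bayes_ok)
```

All three checks hold because certainty and coverage are both ratios of the same branch strength to a node flow; no probabilistic assumption is needed.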
1.3.3 Paths, Connections and Fusion
A (directed) path from $x$ to $y$, $x \neq y$, in $G$ is a sequence of nodes $x_1, \ldots, x_n$ such that $x_1 = x$, $x_n = y$ and $(x_i, x_{i+1}) \in B$ for every $i$, $1 \le i \le n - 1$. A path from $x$ to $y$ is denoted by $[x \ldots y]$.
The certainty of the path $[x_1 \ldots x_n]$ is defined as
$$\mathrm{cer}[x_1 \ldots x_n] = \prod_{i=1}^{n-1} \mathrm{cer}(x_i, x_{i+1}), \tag{1.18}$$
the coverage of the path $[x_1 \ldots x_n]$ is
$$\mathrm{cov}[x_1 \ldots x_n] = \prod_{i=1}^{n-1} \mathrm{cov}(x_i, x_{i+1}), \tag{1.19}$$
and the strength of the path $[x \ldots y]$ is
$$\sigma[x \ldots y] = \sigma(x)\,\mathrm{cer}[x \ldots y] = \sigma(y)\,\mathrm{cov}[x \ldots y]. \tag{1.20}$$
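The set of all paths from $x$ to $y$ ($x \neq y$) in $G$, denoted $\langle x, y \rangle$, will be called a connection from $x$ to $y$ in $G$. In other words, connection $\langle x, y \rangle$ is a sub-graph of $G$ determined by nodes $x$ and $y$.
The certainty of the connection $\langle x, y \rangle$ is
$$\mathrm{cer}\langle x, y \rangle = \sum_{[x \ldots y] \in \langle x, y \rangle} \mathrm{cer}[x \ldots y], \tag{1.21}$$
the coverage of the connection $\langle x, y \rangle$ is
$$\mathrm{cov}\langle x, y \rangle = \sum_{[x \ldots y] \in \langle x, y \rangle} \mathrm{cov}[x \ldots y], \tag{1.22}$$
and the strength of the connection $\langle x, y \rangle$ is
$$\sigma\langle x, y \rangle = \sum_{[x \ldots y] \in \langle x, y \rangle} \sigma[x \ldots y] = \sigma(x)\,\mathrm{cer}\langle x, y \rangle = \sigma(y)\,\mathrm{cov}\langle x, y \rangle. \tag{1.23}$$
If we substitute simultaneously every sub-graph $\langle x, y \rangle$ of a given flow graph $G$, where $x$ is an input node and $y$ an output node of $G$, by a single branch $(x, y)$ such that $\sigma(x, y) = \sigma\langle x, y \rangle$, then in the resulting graph $G'$, called the fusion of $G$, we have $\mathrm{cer}(x, y) = \mathrm{cer}\langle x, y \rangle$, $\mathrm{cov}(x, y) = \mathrm{cov}\langle x, y \rangle$ and $\sigma(G) = \sigma(G')$.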
Thus fusion of a flow graph can be understood as a simplification of the graph
and can be used to get a general picture of relationships in the flow graph.
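The fusion operation can be sketched in a few lines: enumerate all paths between an input and an output, multiply certainties along each path, sum over paths, and scale by the strength of the input node. The graph below is hypothetical (inputs `a`, `b`; internal nodes `m`, `n`; outputs `c`, `d`) and the function names are illustrative, not taken from the paper. The recursive path enumeration relies on the graph being acyclic.

```python
from fractions import Fraction
from math import prod

# Hypothetical flow graph; flow is conserved at internal nodes m and n.
phi = {
    ("a", "m"): 30, ("b", "m"): 10, ("b", "n"): 20,
    ("m", "c"): 25, ("m", "d"): 15, ("n", "d"): 20,
}
tails = {u for (u, _) in phi}
heads = {v for (_, v) in phi}
graph_inputs = sorted(tails - heads)   # nodes with no incoming branch
graph_outputs = sorted(heads - tails)  # nodes with no outgoing branch
total = sum(f for (u, _), f in phi.items() if u in graph_inputs)
sigma = {branch: Fraction(f, total) for branch, f in phi.items()}

def out_sigma(x):
    """Normalized outflow of node x."""
    return sum(s for (u, _), s in sigma.items() if u == x)

cer = {(x, y): s / out_sigma(x) for (x, y), s in sigma.items()}

def paths(x, y):
    """All directed paths from x to y, each as a tuple of nodes."""
    if x == y:
        return [(y,)]
    successors = [v for (u, v) in phi if u == x]
    return [(x,) + rest for nxt in successors for rest in paths(nxt, y)]

def cer_path(path):
    """Certainty of a path: product of the certainties of its branches."""
    return prod(cer[(u, v)] for u, v in zip(path, path[1:]))

def cer_connection(x, y):
    """Certainty of the connection <x, y>: sum over all paths from x to y."""
    return sum(cer_path(p) for p in paths(x, y))

# Fusion: one branch per (input, output) pair, with strength
# sigma(x) * cer<x, y>; pairs with no connecting path are skipped.
fused = {}
for x in graph_inputs:
    for y in graph_outputs:
        strength = out_sigma(x) * cer_connection(x, y)
        if strength:
            fused[(x, y)] = strength

for branch, strength in sorted(fused.items()):
    print(branch, strength)
```

In this sketch the fused strengths sum to 1, so the fused graph is again a normalized flow graph, with the internal nodes `m` and `n` abstracted away.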
1.3.4 Dependences in Flow Graphs
Let $x$ and $y$ be nodes in a flow graph $G = (N, B, \sigma)$, such that $(x, y) \in B$.
Nodes $x$ and $y$ are independent in $G$ if
$$\sigma(x, y) = \sigma(x)\,\sigma(y). \tag{1.24}$$
From (1.24) we get
$$\frac{\sigma(x, y)}{\sigma(x)} = \mathrm{cer}(x, y) = \sigma(y), \tag{1.25}$$
and
$$\frac{\sigma(x, y)}{\sigma(y)} = \mathrm{cov}(x, y) = \sigma(x). \tag{1.26}$$
If
$$\mathrm{cer}(x, y) > \sigma(y), \tag{1.27}$$
or
$$\mathrm{cov}(x, y) > \sigma(x), \tag{1.28}$$
then $x$ and $y$ are positively dependent in $G$.
Similarly, if
$$\mathrm{cer}(x, y) < \sigma(y), \tag{1.29}$$
or
$$\mathrm{cov}(x, y) < \sigma(x), \tag{1.30}$$
then $x$ and $y$ are negatively dependent in $G$.
Relations of independency and dependency are symmetric, and are analogous to those used in statistics.
For every branch $(x, y) \in B$ we define a dependency (correlation) factor $\eta(x, y)$ as
$$\eta(x, y) = \frac{\mathrm{cer}(x, y) - \sigma(y)}{\mathrm{cer}(x, y) + \sigma(y)} = \frac{\mathrm{cov}(x, y) - \sigma(x)}{\mathrm{cov}(x, y) + \sigma(x)}. \tag{1.31}$$
Obviously $-1 \le \eta(x, y) \le 1$; $\eta(x, y) = 0$ if and only if $\mathrm{cer}(x, y) = \sigma(y)$ and $\mathrm{cov}(x, y) = \sigma(x)$; $\eta(x, y) = -1$ if and only if $\mathrm{cer}(x, y) = \mathrm{cov}(x, y) = 0$; $\eta(x, y) = 1$ if and only if $\sigma(y) = \sigma(x) = 0$.
It is easy to check that if $\eta(x, y) = 0$, then $x$ and $y$ are independent, if $-1 \le \eta(x, y) < 0$ then $x$ and $y$ are negatively dependent and if $0 < \eta(x, y) \le 1$ then $x$ and $y$ are positively dependent. Thus the dependency factor expresses a degree of dependency, and can be seen as a counterpart of the correlation coefficient used in statistics.
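As an illustration, the dependency factor can be evaluated for the smoking data of Table 1.1. This is a sketch rather than part of the original text; the function name and argument names are assumptions. The function computes the factor from both the certainty form and the coverage form and checks that they agree.

```python
from fractions import Fraction

def dependency_factor(sigma_xy, sigma_x, sigma_y):
    """Dependency factor eta(x, y) from branch strength and node flows."""
    cer = sigma_xy / sigma_x
    cov = sigma_xy / sigma_y
    eta_from_cer = (cer - sigma_y) / (cer + sigma_y)
    eta_from_cov = (cov - sigma_x) / (cov + sigma_x)
    # Both forms reduce to (s_xy - s_x*s_y) / (s_xy + s_x*s_y).
    assert eta_from_cer == eta_from_cov
    return eta_from_cer

# Table 1.1: 3 of 60 people smoke and have cancer; 13 smoke; 10 have cancer.
eta_smoke_cancer = dependency_factor(
    Fraction(3, 60), Fraction(13, 60), Fraction(10, 60)
)
# 7 of 60 people do not smoke and have cancer; 47 do not smoke.
eta_nosmoke_cancer = dependency_factor(
    Fraction(7, 60), Fraction(47, 60), Fraction(10, 60)
)

print(eta_smoke_cancer, eta_nosmoke_cancer)
```

The first value is 5/31 (about 0.16), a weak positive dependency between smoking and cancer in this table; the second is -5/89, a weak negative dependency between not smoking and cancer.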
Fig. 1.3. Initial data
1.4 Example 2 - Medical Test
Now we are ready to illustrate the basic concepts presented in this paper by a simple
tutorial example.
Various patient groups are tested for the effectiveness of a certain drug. Initial data are shown in Fig. 1.3. The corresponding flow graph is presented in Fig. 1.4.
Fig. 1.4. Relationship between Disease, Age and Test
Fig. 1.5 shows the corresponding fusion of Disease and Test.
Fig. 1.5. Fusion of the flow graph presented in Fig. 1.4
This flow graph leads to the following conclusions: