Distributed processing techniques for Knowledge Management purposes in cooperative environments
P. Belsis1, G. Pantziou1, C. Skourlas1, V. Zafeiris2
1 Department of Informatics, Technological Education Institute, Athens, Greece
Tel: +302105910974, Fax: +302105910975,
E-mail: [email protected], [email protected], [email protected]
2 Department of Informatics, Athens University of Economics and Business, Athens, Greece
Tel: +302108203157, Fax: +302108203157, E-mail: [email protected]
Abstract: Distributed knowledge management is an emerging discipline that enables
utilization of knowledge assets from different domains. Identifying and managing
knowledge assets within a distributed environment is by no means a trivial task;
obstacles such as heterogeneity management, effective query processing and retrieval,
and security all demand intelligent solutions. In this paper we present an agent-based
architecture that implements the concept of distributed knowledge management.
Ontologies are used to facilitate semantically enhanced querying and to standardize
agent messages. Software agents in our approach also enable automated security
management through a policy-based approach.
1. Introduction
Knowledge is a critical factor that empowers an organization to sustain competitive
advantage [1]. Knowledge Management (KM) is an emerging discipline that enables
management of processes, people and assets within an organization [14][16]. Even
though many systems have been proposed and many case studies have appeared in recent
literature, the capabilities of systems oriented towards a single organization remain
limited. Distributed KM is a new discipline that attempts to achieve better results by
integrating different repositories [2].
Such an attempt faces several obstacles. Knowledge repositories differ in structure
because of both semantic and syntactic heterogeneity; moreover, query processing
demands flexible and intelligent techniques that enable transparent asset
identification among the participating domains. Security is also a major concern,
since different security models must be integrated and authorization and access
control must be enforced in a flexible manner [2][3].
Our work exploits the characteristics of Distributed Organizational Memories [4] in a
more integrative approach that enables transparent integration of different knowledge
repositories [6]. We also automate query processing across all participating domains
in a manner transparent to the user, and we allow authorization to be performed by
security agents. Our approach implements KM-related activities in a distributed
manner, attempting to leverage knowledge from different organizations in cooperative
environments and thus giving all domains within the cooperating environment access to
shared objects.
The rest of the paper is organized as follows: Section 2 presents related work in
context; section 3 presents the main requirements that were encountered in our
approach and the main principles considered when developing our architecture;
section 4 presents an example use case scenario; section 5 provides a discussion of
the benefits of our results while section 6 concludes the paper and provides
directions for future work.
2. Related work
Recently, much attention has been paid to how KM architectures may shift from a
centralised approach, which limits their capabilities, towards a decentralised scheme
that captures the knowledge potential residing in the different collaborating domains.
In this section we briefly examine some approaches from the distributed KM literature
and compare them with our approach.
ADAM [5] is a system that uses agents to facilitate the management of heterogeneous
knowledge sources. An agent is assigned to each domain and handles queries posed by
users. A second agent manages authorization-related tasks. ADAM allows a user to
change identities; authorization relies on a trust approach based on collecting
reputation, which makes it suitable for environments where organizational policy is
not explicitly recorded. However, it is mainly suited to e-commerce environments or,
more generally, to environments where shared assets may not be critical.
XAROP [7] is a peer-to-peer system that utilises ontologies to facilitate knowledge
sharing. Mainly documents are shared within the XAROP framework; however, documents
and security permissions are managed in a manner that is difficult to scale.
Sec-shield [3] is a system that encapsulates the main principles of our approach. In
this paper we extend the Sec-shield architecture and provide an implementation
architecture that is based upon the JADE [8] platform.
3. System architecture
3.1 Necessary requirements
There are several requirements that a distributed KM system [17] has to fulfil:

Scalability. The number of participating domains should be able to grow without
restricting the functionality of the system.

Transparency. Asset identification as well as security management should require
minimal effort on the user’s behalf. In fact, the less the user is involved, the
higher the user satisfaction, provided that intelligent interfaces carry out the
user-assigned tasks within the distributed environment.

Security management. The system should retain all the security properties of each
separate domain, and it should additionally allow all accesses within the federated
environment that the system’s administrators define as permitted.

Heterogeneity management. A main obstacle is that different organizations use
different structural as well as data encoding representations; overcoming this
problem, generally known as heterogeneity, demands efficient methods for effective
retrieval of relevant knowledge sources.

To meet these requirements, we have utilised ontologies for semantically enhanced
query processing and heterogeneity management, and agent-based solutions for
transparency.
3.2 System architecture
The developed system comprises several modules:

Distributed organizational memory: This module codifies documented (explicit [15])
knowledge in semi-structured form; several similar sub-modules are deployed in
different network locations, each one storing knowledge sources of diverse content.
Through the agent-based architecture we can query different domains and identify
knowledge sources relevant to the interests of the user.

The agent module. This can be further divided into two sub-modules: one that
identifies assets relevant to the user query, by means of an agent that forwards the
payload of the agent message to the participating domains and queries their
organizational memories, and the security sub-module, which assigns a security agent
to perform the authorization-related tasks on the user’s behalf.

Ontology management module. In order to overcome heterogeneity issues, as well as to
facilitate agent communication, we maintain a separate RDF ontology for each domain.
Each ontology describes the thematic categories of the data that the domain contains
(e.g. medical data, e-government administration data).

Policy based authorization module. This module enforces authorization decisions
within the federated environment. From the agent message, the policy module extracts
the payload, which refers to the asset and describes the action requested and the
user who originated the request. The Policy Decision Point (PDP) reasons over the
request with respect to the edited policy and directs its decision to the Policy
Enforcement Point (PEP) to be enforced. The authorization process is activated prior
to the distribution of text or multimedia files: together with the access request,
the user provides the authorization-specific credentials that correspond to his/her
role and is accordingly granted authorization by the PEP (or, in the opposite case,
receives a negative response indicating that the request is not compatible with the
policy). For policy editing we have used the XACML [9] standard, which uses an XML
format.
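
The decision flow of the policy module can be sketched roughly as follows. This is a
minimal illustration and not the XACML engine of the prototype: the rule format, the
Request fields, the class names PolicyDecisionPoint and PolicyEnforcementPoint, the
sample roles and assets, and the deny-by-default behaviour are all assumptions made
for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """Payload of the agent message: who asks, for which action, on which asset."""
    user: str
    role: str
    action: str
    asset: str


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: a role may (or may not) perform an action on some assets."""
    role: str
    action: str
    asset_prefix: str
    effect: str  # "Permit" or "Deny"


class PolicyDecisionPoint:
    """Holds the policy and reasons over a request (illustrative only)."""

    def __init__(self, rules):
        self.rules = list(rules)

    def decide(self, request):
        # First matching rule wins; unmatched requests are denied (assumed default).
        for rule in self.rules:
            if (rule.role == request.role
                    and rule.action == request.action
                    and request.asset.startswith(rule.asset_prefix)):
                return rule.effect
        return "Deny"


class PolicyEnforcementPoint:
    """Obtains the PDP decision and enforces it before any file is released."""

    def __init__(self, pdp):
        self.pdp = pdp

    def access(self, request):
        if self.pdp.decide(request) == "Permit":
            return f"granted: {request.user} may {request.action} {request.asset}"
        return f"denied: request by {request.user} is not compatible with the policy"


# Hypothetical policy and enforcement point used for illustration.
policy = [
    Rule(role="physician", action="read", asset_prefix="dermatology/", effect="Permit"),
    Rule(role="clerk", action="read", asset_prefix="dermatology/", effect="Deny"),
]
pep = PolicyEnforcementPoint(PolicyDecisionPoint(policy))
print(pep.access(Request("alice", "physician", "read", "dermatology/case-12.txt")))
```

Keeping the decision (PDP) separate from the enforcement (PEP) means that the policy
can be replaced without touching the code that releases the files.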
The overall system architecture as described above is depicted in figure 1.
Agents were introduced in order to enable transparent identification of assets and to
provide automated authorization for users. For each domain, one agent is responsible
for identifying knowledge assets; a second agent carries the user credentials and,
according to the security policy and the role the user is assigned within the
federated environment, provides him/her with access to the resource (or, in the
opposite case, denies access to the resource). The agents in our system were
implemented with the aid of the JADE platform [8]. The agents exchange messages in
the agent-specific FIPA-ACL language [10].
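
To make the message exchange more concrete, the following sketch models the kind of
message the two agents might send. It loosely follows the parameter names of the
FIPA-ACL string representation (performative, sender, receiver, content, language,
ontology), but it is a simplified illustration and not a conformant encoder, and it
does not use the JADE API. The agent names, the ontology name and the content
encoding are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AclMessage:
    """Simplified stand-in for an agent message with a few ACL-style parameters."""
    performative: str
    sender: str
    receiver: str
    content: str
    language: str = ""
    ontology: str = ""

    def render(self):
        # Escape backslashes and quotes so the content stays a single string literal.
        escaped = self.content.replace("\\", "\\\\").replace('"', '\\"')
        parts = [
            f"({self.performative}",
            f" :sender (agent-identifier :name {self.sender})",
            f" :receiver (set (agent-identifier :name {self.receiver}))",
            f' :content "{escaped}"',
        ]
        if self.language:
            parts.append(f" :language {self.language}")
        if self.ontology:
            parts.append(f" :ontology {self.ontology}")
        return "".join(parts) + ")"


# Search request: the payload carries the user query (hypothetical names).
search_msg = AclMessage(
    performative="query-ref",
    sender="s-agent@domain-a",
    receiver="s-agent@domain-b",
    content="dermatology eczema",
    ontology="medical-domain",
)

# Authorization request: the payload carries credentials and the requested action.
auth_msg = AclMessage(
    performative="request",
    sender="a-agent@domain-a",
    receiver="a-agent@domain-b",
    content="user=alice;role=physician;action=read;asset=dermatology/case-12.txt",
)

print(search_msg.render())
print(auth_msg.render())
```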
[Figure 1 diagram. Components shown: PDP, PEP, policy storage, ontology module and,
for each of two domains, an S-agent, an A-agent and a document management module.]
Figure 1: Overall system architecture.
4. System usage scenario
In this section we briefly analyse the operation of our framework and explain the
role of each module. Imagine the following scenario: a user is seeking information of
a dermatological nature. The user submits the query through the application’s
interface, which is implemented as a servlet-based web page. A pair of agents is then
assigned to the domain and to the specific request: the A-agent, or authorization
agent, which carries out the authorization-related tasks, and the S-agent, or search
agent, which forwards the user query to the different organizational memories. To
avoid the bandwidth and computational cost of forwarding the message to every domain
independently, we first consult the ontology management module. For each domain we
store an ontology in the ontology repository, which, when queried, indicates how
relevant a query is to the contents of that domain. If a domain does not contain any
assets thematically relevant to the query, the query is not forwarded to that
domain’s organizational memory; instead, it is checked against the next ontology,
which corresponds to another domain’s contents. If the query is relevant to the
contents of a domain (for example, if it refers to dermatological data, as in our
case), it is forwarded to the domain’s organizational memory module (which is
implemented as a relational database). Then, for the assets that have been identified
as relevant to the user query, the authorization module is invoked. It operates as
follows: the security agent encapsulates in the message payload the credentials of
the user who originated the query, as well as the action requested on the asset. From
this message payload the authorization enforcement module reasons whether to
authorise the user to access the requested assets.
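
The ontology pre-check in this scenario can be sketched as follows. Each domain
ontology is reduced here to a mapping from thematic categories to related terms, and
relevance is decided by simple keyword overlap; both simplifications, as well as the
domain names and terms, are assumptions for illustration. The prototype itself stores
RDF ontologies, whose matching logic is not reproduced here.

```python
# Hypothetical per-domain ontologies, each reduced to category -> related terms.
DOMAIN_ONTOLOGIES = {
    "hospital-a": {
        "dermatology": {"dermatology", "eczema", "psoriasis", "skin"},
        "cardiology": {"cardiology", "arrhythmia", "heart"},
    },
    "city-hall": {
        "e-government": {"permit", "tax", "registry"},
    },
}


def relevant_categories(ontology, keywords):
    """Return the thematic categories whose terms overlap the query keywords."""
    wanted = {keyword.lower() for keyword in keywords}
    return sorted(category for category, terms in ontology.items() if terms & wanted)


def route_query(keywords, ontologies=DOMAIN_ONTOLOGIES):
    """Select only the domains whose ontology indicates relevant content."""
    targets = {}
    for domain, ontology in ontologies.items():
        categories = relevant_categories(ontology, keywords)
        if categories:
            # Only these domains would have the query forwarded to their memory.
            targets[domain] = categories
    return targets


targets = route_query(["Eczema", "treatment"])
print(targets)
```

In this sketch a dermatological query reaches only the domain whose ontology lists a
matching category, so the other organizational memory is never contacted.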
We now turn to the operation of the security module. The security module, which
enables automated policy management, consists of the following sub-modules [11]:
The Policy Enforcement Point (PEP), which enforces the decision.
The Policy Decision Point (PDP), which loads the policy and reasons over the request.
We have adopted a policy mapping approach in order to allow roles from different
domains to be assigned rights in remote domains [11]. The request is therefore
directed by the S-agent to the PEP, which queries the local or remote PDPs; these
evaluate the policies to verify whether the user should be assigned access rights and
direct the reply to the PEP, which enforces the decision. The whole process happens
in the background, and the messages exchanged by the PEP and PDP are all exchanged in
a transparent manner; the same holds for the messages exchanged between the agents.
In fig. 2 we have captured the exchanged messages using the agent sniffer, a tool
that is part of the agent development platform and that captures and visualises the
messages exchanged between the agents.
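
The policy mapping idea can be illustrated with a small sketch in which a role held
in the user's home domain is translated into a role that the remote domain
recognises, and the remote domain's own policy is then evaluated. The mapping table,
the role names, the policies and the deny-when-unmapped default are assumptions for
the example and are not taken from [11].

```python
# Hypothetical mapping: (home domain, home role) -> role recognised per remote domain.
ROLE_MAPPINGS = {
    ("hospital-a", "physician"): {"hospital-b": "visiting-physician"},
    ("hospital-a", "nurse"): {"hospital-b": "guest"},
}

# Hypothetical local policies: domain -> role -> permitted actions.
LOCAL_POLICIES = {
    "hospital-a": {"physician": {"read", "write"}, "nurse": {"read"}},
    "hospital-b": {"visiting-physician": {"read"}, "guest": set()},
}


def effective_role(home_domain, role, target_domain):
    """Translate a role into the target domain; local requests keep their role."""
    if home_domain == target_domain:
        return role
    return ROLE_MAPPINGS.get((home_domain, role), {}).get(target_domain)


def is_permitted(home_domain, role, action, target_domain):
    """Evaluate the request against the target domain's own policy."""
    mapped = effective_role(home_domain, role, target_domain)
    if mapped is None:
        return False  # no mapping defined: deny (assumed default)
    return action in LOCAL_POLICIES.get(target_domain, {}).get(mapped, set())


print(is_permitted("hospital-a", "physician", "read", "hospital-b"))
```

Because each domain keeps evaluating its own policy, the mapping only adds a
translation step and leaves the local security properties untouched.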
In figure 2 we can monitor the search and authorization agents, as well as other
platform-dependent agents. For example, the df agent is the directory facilitator,
which acts as a directory services agent and provides the addresses of the
cooperating agents in different domains. The exchanged messages are visualised on the
right of the picture. Each agent encapsulates in the payload of the exchanged message
the request or the authorization credentials (for the S-agent and A-agent
respectively). The user is not aware of the processes running in the background; only
the result of the query or of the authorization process finally appears in the
interface where the user initiated the request.
Figure 2: Capturing messages exchanged by the participating agents, using the Sniffer tool.
5. Discussion
The limits of traditional single-organization KM systems are evident: even from a
theoretical perspective, Nonaka’s writings [12] recognize that the greater the
interaction between different organizations with integrated repositories, the better
the results from utilising the organizational knowledge potential. However, as we
discussed previously, different organizations store different kinds of information,
classify it in different categories, and choose different knowledge representation
forms. It is therefore essential to provide intelligent means of identifying
knowledge assets related to the topics of interest. To this end, we have utilised
ontologies. Ontologies provide a taxonomy of knowledge categories, which helps us
identify whether an organization holds information relevant to the categories we are
interested in. Thus, we avoid explicitly querying the knowledge repositories of
organizations that hold nothing relevant. In addition, by providing explicit
ontologies that help the agents interact, we allow standardized communication between
the software agents (standardization of agent vocabulary).
In addition to describing the abstract architecture of a distributed KM approach
[13][5], we describe a prototype implementation built on the Java-based JADE [8]
platform. We have also utilised techniques that enable transparent authorization
within the distributed environment.
We have initially tested the validity of our approach by applying it in a case study
scenario using data from the medical domain. We created two different organizational
memories and deployed them in different sub-networks. Through the interface we
directed several queries from each domain to the other; the initial behaviour of our
system proved to be very promising. The data used in each case represented different
types of medical information. Each organizational memory was implemented as a
relational database that manages both text and image files (for the latter, a
metadata description was provided so that queries could be posed). The system thus
offers the possibility to retrieve knowledge from a distributed text retrieval
repository, as well as to identify image files relevant to a query (for example, a
medical image that assists a diagnosis). The authorization enforcement framework for
this distributed environment utilises the principles of the XACML [9] operational
framework.
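
As an illustration of how an organizational memory of this kind might hold text and
image assets side by side, the sketch below uses an in-memory SQLite database. The
table layout, column names and sample rows are hypothetical and are not the schema of
the prototype; image files are represented only by their metadata description, which
is what makes them searchable by keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE assets (
        id INTEGER PRIMARY KEY,
        kind TEXT NOT NULL CHECK (kind IN ('text', 'image')),
        title TEXT NOT NULL,
        description TEXT NOT NULL  -- body text, or metadata for an image
    )
    """
)
conn.executemany(
    "INSERT INTO assets (kind, title, description) VALUES (?, ?, ?)",
    [
        ("text", "Eczema treatment notes", "Topical therapy options for atopic eczema."),
        ("image", "lesion-017.png", "Dermatoscopic image of an eczema lesion, forearm."),
        ("text", "Arrhythmia overview", "Classification of cardiac arrhythmias."),
    ],
)


def find_assets(connection, keywords):
    """Return (kind, title) for assets whose title or description matches a keyword."""
    if not keywords:
        return []
    # One pair of LIKE tests per keyword; values are bound as parameters.
    clause = " OR ".join(["title LIKE ? OR description LIKE ?"] * len(keywords))
    params = [f"%{keyword}%" for keyword in keywords for _ in range(2)]
    rows = connection.execute(
        f"SELECT kind, title FROM assets WHERE {clause} ORDER BY id", params
    )
    return rows.fetchall()


print(find_assets(conn, ["eczema"]))
```

A single keyword query in this sketch returns both the text document and the image
whose metadata mentions the term.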
6. Conclusions – Further work
The prototype system that we have described, which is still growing in functionality,
supports several tasks, such as distributed identification of knowledge assets in
textual as well as multimedia form; it also allows transparent access control
enforcement within the distributed organizational framework, through a simple-to-use
interface (for querying as well as for requesting authorization). For the query
process, the user sends a number of keywords and receives a reply with a list of
assets relevant to the query; afterwards, in order to gain authorization, the user
provides a username and password and identifies the role she/he is granted within the
domain and the action requested on the resource. Next, the security agent forwards
the provided credentials and returns the decision provided by the authorization
module.
Our platform has the benefit that it is not platform specific, and it uses
standardised technologies for retrieval of knowledge assets and for authorization
purposes. We are planning to assess the performance of the platform’s operation by
evaluating network-related parameters (such as bandwidth consumption) with respect to
the platform’s real-time performance.
Acknowledgements
This work was co-funded 75% by the E.U. and 25% by the Greek Government under the
framework of the Education and Initial Vocational Training Program – Archimedes.
References
[1] Bhatt G. “Management strategies for individual knowledge and organizational knowledge”, Journal of
Knowledge Management, vol. 6, number 1, 2002, pp. 31-39.
[2] Belsis P, Gritzalis S., Skourlas C., "Security Enhanced Distributed Knowledge Management
Architecture", Proceedings of the 5th International Conference on Knowledge Management, K.
Tochtermann, H. Maurer (Eds.), pp. 327-335, July 2005, Graz, Austria, JUCS Pubs.
[3] Belsis P., Gritzalis S., Malatras A., Skourlas C., Chalaris I., "Sec-Shield: Security Preserved
Distributed Knowledge Management between Autonomous Domains" in Proceedings of the DEXA'05
TrustBus'05 2nd International Conference on Trust, Privacy, and Security in the Digital Business, J.
Lopez, G. Pernul, (Eds.), August 2005, Copenhagen, Denmark, Lecture Notes in Computer Science
LNCS 3592, Springer, pp. 10-20.
[4] Abecker A., Bernardi A., Hinkelmann K., Kuhn O. and Sintek M. “Towards a Technology for
organizational memories”, IEEE Intelligent Systems, May/June 1998b, pp.40-48.
[5] Seleznyov A., Mohamed A., Hailes S., “ADAM: An agent-based Middleware Architecture for
Distributed Access Control” in Proceedings of the 22nd International Multi-Conference on Applied
Informatics: Artificial Intelligence and Applications, 2004
[6] Belsis P., Gritzalis S., “Distributed autonomous Knowledge Acquisition and Dissemination ontology
based framework”, Workshop on Enterprise Modeling and Ontology: Ingredients for Interoperability,
H. Kuhn (ed.), Dec. 2004, Vienna, Austria, Univ. of Vienna.
[7] Tempich C., Ehrig M., Fluit C., Haase P., Marti E.L., Plechawski M., Staab S. “XAROP: A Midterm
Report on Introducing a Decentralized Semantics based Application”, Proceedings of Practical
Aspects of Knowledge Management (PAKM) 2004, Vienna, Austria, D. Karagiannis, U. Reimer (eds),
LNAI 3336, Kluwer Academic publishers, pp. 259-270.
[8] http://jade.tilab.com/
[9] “Extensible access control markup language specification 2.0”, OASIS Standard, 2004 (available at
http://www.oasis-open.org).
[10] FIPA standard status specifications www.fipa.org/repository/standardspecs.html
[11] Belsis P., Gritzalis S., Katsikas S., "A Scalable Security Architecture enabling Coalition Formation
between Autonomous Domains", in Proceedings of the 5th IEEE International Symposium on Signal
Processing and Information Technology (ISSPIT'05), December 2005, Athens, Greece, IEEE
Computer Society Press
[12] Nonaka I., Takeuchi H. “The knowledge Creating Company”, Oxford University Press, 1995.
[13] Bonifacio, M., Bouquet, P. and P. Traverso. Enabling distributed knowledge management.
Managerial and technological implications. Informatik – Informatique, 1/2002
[14] Davenport T., S. Volpel. “The rise of knowledge towards attention management”, Journal of
knowledge management, vol. 5, No 3, 2001, pp 212-221.
[15] Polanyi “The Tacit Dimension”, Routledge & Kegan Paul, London (1966).
[16] Skyrme D., Amidon D. M., “Creating the knowledge based business”, Business Intelligence, London,
1997.
[17] Belsis P., Malatras A., Gritzalis S., Skourlas C., Chalaris I. "Flexible Secure heterogeneous File
Management in Distributed Environments ", IADAT Journal of Advanced Technology, vol. 1 Number
2, pp. 66-68, December 2005, published by IADAT International Association for the Development of
Advances in Technology