Towards Scalable Non-Monotonic Stream Reasoning
Thu-Le Pham
Insight Centre for Data Analytics, NUIG
[email protected]
Abstract. In this paper, we address the issue of scalability
for non-monotonic stream reasoning in the StreamRule
framework by proposing a distributed approach for
parallel computation in the reasoning subprocess.
1. Introduction
The exponential growth in the availability of
streaming data has seriously challenged the ability of
state-of-the-art reasoners to perform continuous
non-monotonic reasoning in a timely fashion. The high
expressiveness of non-monotonic reasoning results in
computationally intensive tasks. Hence, an increase in
the streaming rate and input size impedes the scalability
and performance of the reasoners.
StreamRule [1] is a declarative Web stream
reasoning system that combines (i) a stream processor,
used to filter semantic data elements, and (ii) an
Answer Set Programming (ASP) reasoner, used for
computationally intensive tasks (Figure 1). This
approach has the potential to improve the scalability of
complex reasoning over semantic streams, since the
stream processor reduces the size of the input to the
non-monotonic reasoner. However, the ASP reasoner needs
to return results faster than new inputs arrive from the
stream processor in order to keep the whole system
stable. Optimisation techniques are therefore needed for
this reasoning subprocess to provide faster responses.
Problem Statement. Consider the reasoning
subprocess in StreamRule with the declarative encoding
of the input program P (a set of rules) in ASP syntax. In
this paper, we study the data partitioning process, which
leverages data dependencies in order to (i) speed up
the reasoning process by enabling parallelism in
StreamRule, and (ii) maximise the accuracy of the
answers.
Figure 1. StreamRule
[Diagram labels: Web of Data; Stream; stream query processor; Query; Filtered Stream; Data Format Processor; Facts; Non-monotonic Rule Engine; Logic Program; Reasoner R; Answer Sets; Data Format Processor; Solutions]
2. Proposed Solution
The StreamRule framework extended with the
partitioning process at the reasoning layer is shown in
Figure 2. The extension consists of the partitioning
handler and the combining handler. At run-time, the
partitioning handler splits an input window (a set of
input data items that the reasoner R processes per
computation) coming from the stream query processor
into several sub-windows, taking into account the
input dependency. The combining handler combines the
non-deterministic outputs (different results for the same
input) of the parallel reasoners as follows:
\[ \mathit{Ans}_P(W) = \left\{ \bigcup_{i=1}^{n} \mathit{ans}_i : \mathit{ans}_i \in \mathit{Ans}_P(W_i) \right\} \]
where Ans_P(W) denotes the answers provided by a reasoner
over P and W, and W_i is the i-th partition of W.
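As an illustration of the combining formula above, the following Python sketch builds every combined answer by picking one answer set from each sub-window and taking their union. It is a minimal sketch, not the StreamRule implementation: the function name combine_answers and the representation of answer sets as frozensets of atom strings are assumptions made for this example.

```python
from itertools import product


def combine_answers(partition_answers):
    """Combine the answer sets of the parallel reasoners.

    partition_answers[i] holds the answer sets computed for sub-window W_i
    (hypothetical representation: each answer set is a frozenset of atoms).
    Each combined answer is the union of one answer set per sub-window.
    """
    combined = set()
    for choice in product(*partition_answers):
        combined.add(frozenset().union(*choice))
    return combined


# Two sub-windows: the first reasoner returns two alternative answer sets,
# the second returns one.
answers_w1 = [frozenset({"a"}), frozenset({"b"})]
answers_w2 = [frozenset({"c"})]
combined = combine_answers([answers_w1, answers_w2])
```

In this sketch, a sub-window with no answer set at all leaves the combined result empty, because no choice of one answer per sub-window exists.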
To realise the partitioning process, the input
dependency is analysed first, in the design phase. At
this phase, a logic program and a set of input predicates
are given in advance in order to build an input
dependency graph. If the input dependency graph is not
connected, it naturally induces a subdivision of the input
predicates into several connected components (called a
partitioning plan). Otherwise, the duplication process
builds a partitioning plan by decomposing the graph
into several components together with their duplicated
predicates.
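The disconnected case can be sketched in Python as follows. This is only an illustration under stated assumptions: the dependency edges between input predicates are taken as given (how they are derived from the logic program is not shown), window items are modelled as (predicate, arguments) tuples, and the names partitioning_plan, split_window, traffic, weather and tweet are hypothetical. The duplication process for connected graphs is not reproduced here.

```python
from collections import defaultdict


def partitioning_plan(input_predicates, dependency_edges):
    """Design time: connected components of the input dependency graph."""
    adjacency = defaultdict(set)
    for p, q in dependency_edges:
        adjacency[p].add(q)
        adjacency[q].add(p)
    plan, seen = [], set()
    for start in input_predicates:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        plan.append(component)
    return plan


def split_window(window, plan):
    """Run time: route each item to every component holding its predicate."""
    sub_windows = [[] for _ in plan]
    for item in window:
        predicate = item[0]
        for index, component in enumerate(plan):
            if predicate in component:
                sub_windows[index].append(item)
    return sub_windows


# Hypothetical input predicates: traffic depends on weather, tweet is isolated.
plan = partitioning_plan(["traffic", "weather", "tweet"], [("traffic", "weather")])
window = [("traffic", ("r1", "slow")), ("tweet", ("u1", "hi")), ("weather", ("r1", "rain"))]
sub_windows = split_window(window, plan)
```

Because split_window copies an item into every component that contains its predicate, the same routine would also accept a plan whose components share duplicated predicates.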
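Figure 2. The extended StreamRule
[Diagram labels. Design time: Logical Program; Input predicates; Find input dependency graph; Input dependency graph; Decomposing Process; Partitioning Plan. Run time: Web of Data; Stream; stream query processor; Query; Filtered Stream; Partitioning Handler; Reasoner R ... Reasoner R; Combining Handler; Solutions; Reasoner PR]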
3. Evaluation & Conclusion
We experimentally study the performance of the
reasoner PR in the extended StreamRule framework
with two logic programs, P and P’. P has a
disconnected input dependency graph, while P’ has a
connected one. The input data is generated randomly in
RDF triple format. We execute the reasoner PR over P
and P’ with the input window size increasing from 5000
to 40000 items. The results show that using the reasoner
PR reduces latency by up to 50% (for P) and 30%
(for P’). The partitioning handler takes more time for
P’ than for P, since it needs to process the duplicated
items while partitioning. Moreover, the accuracy of the
answers with dependency-based partitioning (for both P
and P’) is significantly higher than with random
partitioning.
Acknowledgements. This research has been partially
supported by SFI under grant No. SFI/12/RC/2289 and by the
EU FP7 CityPulse Project under grant No. 603095.
4. References
[1] Mileo, A., Abdelrahman, A., Policarpio, S., Hauswirth,
M.: StreamRule: a non-monotonic stream reasoning system
for the semantic web. In: Web Reasoning and Rule Systems,
pp. 247–252. Springer (2013).