Challenges and Directions for Making Cognitive Systems Creative

Challenges and Directions for Making Cognitive
Systems Creative
Kai-Uwe Kühnberger
Institute of Cognitive Science
University of Osnabrück
Albrechtstr. 28, Germany
[email protected]
Abstract. In this paper, we will roughly sketch the idea of cognitive mechanisms like analogy-making and conceptual blending as means for modeling and explaining creative abilities of humans. Then, we will discuss
IBM’s Watson/Bluemix services, which are intended to go beyond classical
computing paradigms in order to approach the level of “cognitive computing" in its human-inspired facets. We will discuss challenges that arise if
systems like Watson/Bluemix are used for implementing creative abilities.
Finally, some speculations about possible directions for addressing these
challenges in cognitive systems by referring to cognitive mechanisms and
their corresponding computational realizations will be mentioned.
1
Analogy-Making and Conceptual Blending as a Source for
Creativity
Modeling creative abilities with computing devices is considered to be a hard
problem. Despite difficult questions like what creativity is, whether creativity
should be assigned to a cognitive process, to the product of such a process, or
to both, or how creativity can be assessed and measured, there is a strong interest in the last years to develop computational approaches for creativity. Some
of these theories focus on the modeling of cognitive mechanisms like analogymaking and conceptual blending. By applying analogy-making and conceptual
blending to problem solving tasks, it turns out that both mechanisms are useful to model certain creative abilities of humans. Examples of such applications
can be found in fields like problem solving [2], mathematics [5], music [1], or
physics [8].
Analogy-making can be understood as the detection of structural commonalities of two domains [7]. Domains can be a variety of things, e.g. conceptual
spaces, micro-theories, or representations of commonsense knowledge. Some
researchers claim that the ability to establish analogical relations is a core of human cognition [4]. On the other hand, conceptual blending in the sense adopted
here takes two input spaces and attempts to compute a generic space and a blend
space, i.e. the latter being a new and independent conceptual space containing
a mixture of conceptual information from both input domains [2].
Generalization (G)
Source (L)
r
,/
analogical transfer
Target (R)
Fig. 1. HDTP’s overall approach to creating analogies (cf. [8]).
Fig. 1 depicts the general idea of the analogy engine Heuristic-Driven Theory Projection (HDTP) [8]. Given a source S and a target T , the HDTP engine
computes an analogical relation between S and T together with a generalization G that covers the structural commonalities between S and T . The conceptual idea of conceptual blending in Joseph Goguen’s work is slightly similar
to the analogy-making process (compare Fig. 2): in a blending process two input spaces S and T are generalized as well as merged in a blend space B. We
consider conceptual blending therefore as a twofold process, which first detects
structural similarities based on an analogical relation between S and T and then
merges the two input spaces based on appropriate heuristics.
Su
G
) B u
)T
Fig. 2. Goguen’s version of concept blending (cf. [3]).
2
Remarks on IBM’s Watson/Bluemix Services
An industry system that is claimed to go beyond classical computing paradigms
in order to approach “cognitive computing" or the “cognitive era of computing"
in its human-inspired facets is IBM’s Watson/Bluemix platform. Besides classical industry applications that emphasize the integration of Big Data with highly
structured knowledge in tasks such as predictive maintenance, production processes, and recommendation systems for health applications, IBM applied the
Watson/Bluemix services also in the domain of creativity research. An example
is “IBM Chef Watson" (cf. https://www.ibmchefwatson.com), where innovative
culinary recipes are generated by the system. A part of the scientific background
of the approach can be found in [6].
The strength of IBM’s system is based on the combination and integration
of different types of knowledge resources and the possibility to combine a large
number of different algorithms including deep learning theories. For example,
the Watson services that are currently offered range from classical NLP applications (like the conversation of speech to text and vice versa), to such services
like AlchemyAPI (service to extract semantic meta-data from images), concept
insights (used for content exploration and recommendation of texts), or personality insights (classification of people’s personality characteristics based on
textual descriptions) just to mention a few of them. For such services different
algorithms are used and the strength of the overall architecture can be seen in
the possibility to combine these services.
An exemplification of such a combination of different knowledge resources
and algorithms is the Flu Prediction study project that took place recently in
Osnabrück (http://www.flu-prediction.com/about). The task is to improve the
prediction of influenza spreading across the US. The approach combines data
science methods to allow the identification of complex causal relations, the efficient use of Watson/Bluemix services, the usage of social media data (Twitter)
and conventional data from the Center for Disease Control (CDC), and Watson
as a question-answering system to allow the user to receive background information about the flu.
Close the gap by fusing data
Institute of
Cognitive Science
Realtime fuzzy
social media
Twitter activity
(geo tag + tweet)
+
Slow but relibale
CDC data
Institute of
Cognitive Science
Why cognitive computing?
Influenza
matters
Data science
methods
Prediction is
important
Social media
analysis
Delayed and
too few data
Watson as
medical expert
CDC – delayed
influenca data
Better
prediction
Fully informed
user
Cognitive Computing
Use the best from both worlds to improve prediction
Fig. 3. The left image depicts a graphical representation of the combination of reliable
but delayed CDC data about influence and current social media data (based on realtime Twitter activity). The image on the right side shows the overall idea to combine
services and algorithms to get better predictions about the future spreading of the flu.
Both images are taken from an online lecture given by Gordon Pipa and available on
Youtube: https://www.youtube.com/watch?v=I2FcCpXwxVw
Although the strength of the described combination of services for cognitive
computing applications is for certain domains quite remarkable, a natural question is whether this suffices to develop strong models of computational creativity as well. The application “IBM Chef Watson" mentioned above seems to allow
such an extension to creativity tasks, nevertheless, further examples are rare and
the general problem remains whether systems based on Big Data can be used for
highly abstract domains like the invention of new concepts in mathematics or
physics. Furthermore, to assess the degree to which approaches similar to IBM’s
Watson/Bluemix system can be called cognitively inspired is less clear given that
classical cognitive mechanisms like analogy-making, conceptual blending, similarity, or heuristic assessments play only, if at all, a very limited role. Finally, the
cross-domain aspect of cognitive mechanisms, namely the ability to transfer certain patterns from one domain to a totally unrelated domain, in order to solve
a certain problem as well as the multi-modal knowledge integration ability of
humans is often missing in classical cognitive systems.
3
Possible Directions for Addressing These Challenges
In order to sketch ideas how to extend cognitive computing systems to improve
their creative capacities, we propose the following extensions.
– Models of computational creativity in highly abstract domains like mathematics require more than visual or textual information. The natural choice
for a representation of mathematical knowledge are axiomatic systems formulated in a first-order or better monadic second-order language. A natural
choice to expand cognitive systems such as IBM’s Watson/Bluemix system
would be to add services that can deal with such axiomatic theories. In [5],
it is shown how reasoning on such representations can be used for modeling
creative processes.
– Humans use multi-modal representations for all sorts of cognitive tasks. For
example, verbal communication of humans does neither only work on the
syntactic level of natural language, nor only on the semantic level. Verbal
communication is rather a complex interplay between linguistic representations (syntactic, semantic, pragmatic etc.) and non-verbal means of communication like gestures, expression of emotions, motor actions etc. Similarly, in solving creativity tasks, humans often change representations between abstract and concrete representations, switch between and integrate
multi-modal information, and project knowledge between these representation types. A creative cognitive system should also be able to use such
multi-modal representation. In the Watson/Bluemix system, a first step into
this direction is the multi-modal integration of image and text.
– In order to make cognitive computing cognitive in a strong sense, cognitive mechanisms such as analogy-making and conceptual blending should
be added to such systems. It was argued in Section 1 that a large variety of
creativity aspects in various domains can be computationally modeled by implementations of such cognitive mechanisms. Expanding cognitive systems
into such a direction has the potential to increase the capabilities of such
systems significantly.
References
1. M. Eppe, R. Confalonieri, E. Maclean, M. Kaliakatsos-Papakostas, E. Cambouropoulos, M. Schorlemmer, M.
Codescu & K.-U. Kühnberger (2015): Computational Invention of Cadences and Chord Progressions by Conceptual Chord-Blending, IJCAI 2015.
2. G. Fauconnier & M. Turner (2002): The Way We Think. Conceptual Blending and the Mind’s Hidden Complexities, Basic Books, New York 2002.
3. J. Goguen (2006): Mathematical models of cognitive space and time. In: D. Andler, Y. Ogawa, M. & S. Watanabe
(eds.): Reasoning and Cognition: Proc. of the Interdisciplinary Conference on Reasoning and Cognition, Keio
University Press.
4. D. Hofstadter (2001): Epilogue: Analogy as the core of cognition. In: D. Gentner, K. Holyoak, B. Kokinov (eds.):
The Analogical Mind: Perspectives from Cognitive Science, MIT Press, pp. 499–538.
5. M. Martinez, A. Abdel-Fattah, U. Krumnack, D. Gomez-Ramirez, A. Smaill, T. Besold, A. Pease, M. Schmidt,
M. Guhe & K.-U. Kühnberger (2016): Theory blending: extended algorithmic aspects and examples. Annals of
Mathematics and Artificial Intelligence, DOI: 10.1007/s10472-016-9505-y.
6. F. Pinel, L. Varshney & D. Bhattacharjya (2015): A Culinary Computational Creativity System. In: T. Besold, M.
Schorlemmer & A. Smaill (eds.): Computational Creativity Research: Towards Creative Machines, Amsterdam,
Paris, Beijing, Atlantis Press, pp. 327–346.
7. M. Schmidt, H. Gust, K.-U. Kühnberger & U. Krumnack (2011): Refinements of Restricted Higher-Order AntiUnification for Heuristic-Driven Theory Projection. In: KI 2011: Advances in Artificial Intelligence, LNCS vol.
7006, pp. 289–300.
8. A. Schwering, U. Krumnack, K.-U. Kühnberger & H. Gust (2009): Syntactic Principles of Heuristic-Driven Theory Projection. Cognitive Systems Research 10(3):251–269.