Extraction of Synonyms in User-Generated Content

Bachelor Thesis

Alex Oberhauser
Supervisor: Dr. Anna Fensel
STI Innsbruck, Austria
University of Innsbruck, Austria
January 4, 2011

Abstract

The Semantic Web offers a great opportunity to gain more information from existing data, in quantity as well as in quality, than is possible with the current web. One major improvement is the identification of new relations between objects. This thesis addresses the problem of computing synonyms to form a search query that is relevant for the current context. For this purpose the work focuses on gathering, evaluating and classifying the current context on the one side and on gathering synonym sets on the other side. After the two information clouds are computed, their intersection returns all synonyms that are suitable in this context. Additionally, the context-aware synonyms are extended with hyponyms and hypernyms. To compute the context I developed and analyzed a web crawler that gathers context information from a set of public RDF files that describe the user. The final work was then integrated, in a slightly adapted form, into the m:Ciudad framework.

Keywords: Context-Aware Synonyms, Semantic Web, Linked Open Data, RDF, m:Ciudad, UDL

Contents

1 Introduction
2 Motivation & Problem Statement
3 Approach
   3.1 Lexicographical Synonyms (Core)
   3.2 Conceptional Synonyms (Extension)
   3.3 Context
   3.4 Intersection Algorithm
4 Implementation
   4.1 Used Technology
   4.2 Architecture
      4.2.1 Package: synonyms
      4.2.2 Package: context
   4.3 Server/Client Architecture
      4.3.1 Server
      4.3.2 Client
   4.4 Integrated Version - The Library
5 Evaluation
6 Further Work
A Appendix - RDF example files
B Appendix - Android Client

1 Introduction

The next generation web will not only be more interlinked [20] than today's web, but also available on many mobile devices. The availability of data on the go expands the context and makes it possible to guess the semantics of a search query more precisely. For example, the search could now depend on the location or on personal interests. It is also conceivable that the appointments in the calendar are used to check whether particular synonyms are useful for the current context.

This thesis was written in the research area of the Semantic Web [25]. The term Semantic Web was coined by Tim Berners-Lee [4]. It extends the current Social Web (Web 2.0) with semantics. The main goal of the extension is a web that is machine readable and understandable across different systems. One widely used approach, which is also used in this thesis, is RDF [27] as data exchange format with RDF Schema [26] as language definition. RDF consists of triples of the form <Subject, Predicate, Object>. In combination with unique ontology URIs, this concept can describe most real-world relations. For exchange the XML syntax is commonly used, which makes the technology compatible with current web technology. Other representations are N3 [24], N-Triples [23], TriG [11], TriX [18], Turtle [30] and RDFa [28]. The success of this technology is based on the linking of multiple RDF files. This concept is named Linked Data [20]. The linked structure makes it possible to crawl over a network of RDF files knowing only one starting point.
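To make the triple model and the crawl over linked files more concrete, the following minimal Python sketch stores a few triples as plain tuples and follows links from a single starting document. It is an illustration under assumptions only: the document URLs, the `ex:` identifiers and the in-memory `DOCUMENTS` dictionary are hypothetical stand-ins for real RDF files, the choice of `rdfs:seeAlso` as link property is an assumption, and the sketch neither parses RDF nor accesses the network.

```python
from collections import deque

# Hypothetical in-memory stand-in for RDF files on the web:
# document URL -> list of (subject, predicate, object) triples.
DOCUMENTS = {
    "http://example.org/alice.rdf": [
        ("ex:alice", "foaf:name", "Alice"),
        ("ex:alice", "rdfs:seeAlso", "http://example.org/bob.rdf"),
    ],
    "http://example.org/bob.rdf": [
        ("ex:bob", "foaf:name", "Bob"),
        ("ex:bob", "rdfs:seeAlso", "http://example.org/carol.rdf"),
    ],
    "http://example.org/carol.rdf": [
        ("ex:carol", "foaf:name", "Carol"),
        ("ex:carol", "rdfs:seeAlso", "http://example.org/alice.rdf"),
    ],
}

# Assumption: links between documents are expressed with this predicate.
LINK_PREDICATE = "rdfs:seeAlso"


def crawl(start_url, max_documents=10):
    """Breadth-first crawl that starts at one document and follows links.

    Returns the visited document URLs (in visiting order) and all
    triples collected from them.
    """
    visited = []
    triples = []
    queue = deque([start_url])
    while queue and len(visited) < max_documents:
        url = queue.popleft()
        if url in visited or url not in DOCUMENTS:
            continue  # skip already visited or unknown documents
        visited.append(url)
        for triple in DOCUMENTS[url]:
            triples.append(triple)
            _, predicate, obj = triple
            if predicate == LINK_PREDICATE:
                queue.append(obj)  # the object is the next document
    return visited, triples


visited, triples = crawl("http://example.org/alice.rdf")
names = [obj for _, predicate, obj in triples if predicate == "foaf:name"]
print(visited)
print(names)
```

The `visited` check prevents endless loops when documents link back to each other, and `max_documents` bounds the crawl, which matters on a real network where the number of reachable files is unknown.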
Additionally, there is a query language named SPARQL [29]. SPARQL is used to query the stored data set with the help of the triple format. The query in Listing 1 returns all combinations of name and e-mail address of all FOAF [5] entities in the data set. Such a query has two different types of terms. Variables are marked by a question mark as prefix. Constants carry the namespace as prefix, separated from the property by a colon, which makes the term unique across all properties.

Listing 1: SPARQL example

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?mbox
WHERE {
  ?x foaf:name ?name .
  ?x foaf:mbox ?mbox
}

One major scenario for context-aware synonyms is mobile search queries that are related to the current context of the user. The available context information could be used to enrich the search query or to filter unneeded search results. Either way the information could be improved by the developed algorithm. Mobile devices are an important application area because they provide context information but have limited resources to display large amounts of information.

The work starts in section 2 with the specification of the problem and explains why such context-aware synonyms are needed in future applications. After that, section 3 gives a conceptual solution for computing sets of synonyms that suit a concrete context. Section 4 explains the theoretical solution on the basis of a concrete implementation. The last two sections evaluate the current work and related works and give an outlook on further work.

2 Motivation & Problem Statement

With the storage of knowledge in a way that everybody can easily access and with the presence of all knowledge on a worldwide network, the problem of finding suitable information has shifted from searching for information (quantity) to filtering the gathered information (quality).
The problem is no longer to find information about particular topics, but to filter the mass of gathered data and convert it, if needed, to a machine-readable format. Depending on the search term this can be time-intensive work and is not always successful. This is even worse on mobile devices, which have limited resources and whose users do not always have the time to search intensively for information.

Context-awareness will be even more important in the Future Internet [16], where each thing is connected to the internet or at least to a subnetwork. In such a scenario the context could be described more precisely than is possible with today's technology. More information means not only that the context could be described more precisely, but also that it is more important to filter the gathered data.

The problem that this thesis tries to solve is the ambiguity of natural language, which is resolved with the help of the current environment or context. Although synonyms are by definition words with the same meaning, it is possible that one input word evaluates to several semantically different synonym blocks. For example, AI could evaluate to two semantically different synonym blocks: one for the term Artificial Intelligence and the other for Amnesty International. To find the right blocks for the current situation, the context information is evaluated and on the basis of the result the right synonym blocks are chosen.

To formalize the problem we introduce the following sets:

\( S_{input} = \bigcup_{i \in I} S_i \) ... Synonyms for the input term input. \( S_i \) is one synonym block (synset) that contains synonyms with the same semantics.
\( C_{agent} = C_{DYN} \cup C_{STAT} \) ... The context information of the agent that triggers the search. \( C_{DYN} \) is the dynamic and \( C_{STAT} \) is the static context.
\( R \) ... The context-aware synonym result set.

The two sets \( S_{input} \) and \( C_{agent} \) are gathered from different sources. On the basis of these two sets the result set \( R \) is computed. The following formula shows how the result is defined.
\[ \exists i \, ( \exists t \, ( t \in S_i \wedge t \in C_{agent} ) \implies \forall r \, ( r \in S_i \wedge r \in R ) ) \]

The formula above states that there exists a term that is part of one (or more) synonym block(s) and of the context set. That implies that all terms in the found synonym block(s) are also part of the result set. If this statement holds we have found context-aware synonyms.

This thesis addresses neither the problem of verifying data nor the filtering of a concrete search result. The focus of this work is to compute suitable synonyms for the current context. A possible use scenario is the filtering of a search result with the help of the computed synonyms of an input search term.

Let's consider a first scenario, the search for AI. AI is an abbreviation that has more than one meaning and is well suited to show how important it is to filter the mass of information gained from a simple search. In our scenario AI means Artificial Intelligence, but in other scenarios the same abbreviation could mean Amnesty International. As a precondition, the context cloud of the agent that triggers the search should include the term Artificial Intelligence or some related term that is part of the same synonym block. Later, in a second scenario, we show that this term could also be part of the context of a related entity. This indirect context evaluation has the effect that the result has a lower accuracy.

Scenario 1 - Initial Situation

A search for AI without context filtering returns approximately 449 million search results (searched on 11 May 2010 on http://www.google.com). The first two results shown in the search excerpt in Figure 1 are about artificial intelligence and the third is about Amnesty International. Although we had a clear topic in mind, we were not able to find only the right results with our short and ambiguous search term.

Figure 1: Excerpt of the search result for AI, searched on http://www.google.com (not filtered)

Scenario 1 - Improved Situation

The Semantic Web represents each concept in the form of a URI (Uniform Resource Identifier) or its successor, an IRI (Internationalized Resource Identifier). Such a concept identifier is the concatenation of an ontology URI/IRI and the concept name. The resulting concept, assuming the same ontology is used, is no longer ambiguous. The developed algorithm goes a step further and combines disambiguation with context information. The results are not only about a unique topic, but also relevant for the current situation. (It is possible that results are found about more than one topic if the context information and the synonyms match for more than one subject. If this is the case the results are prioritized.)

As we have seen in the previous scenario, there is a need for filtering search results. Let's consider the same scenario, but now with a wrapper around the search engine that removes all results that do not include at least one of our context-aware synonyms. (The wrapper around the search engine is not part of the thesis, but illustrates how the context-aware synonym library could be used in practice.) The exact computation is discussed in the following chapters. For now it is only important that the context of the user includes tags that describe artificial intelligence, so that search results on topics that do not occur in the context are no longer returned.

Figure 2: Excerpt of the search result for AI, searched on http://www.google.com (filtered)

Scenario 2 - Related Context Information

The second scenario shows how the algorithm should react if there is no useful information in the context of the searching agent, but there is in a related context data set. Let's consider that an agent searches for the term dive and there is no related information in his context cloud, but in the context of a known agent we find the term swimming. With the help of a context similarity algorithm we learn that the two agents are 75 % similar. The search result should now include the terms swim, swimming, diving and dive with the priority 0.75 (or 75 %).

The thesis should show that the use of context information in combination with synonym computation is a powerful mechanism to improve the quality of gathered data sets. The developed algorithm could be used in a variety of use cases, such as filtering search queries on the internet or suggesting free-time activities on mobile devices. Another real-world example is the suggestion of movies. In such a case the developed algorithm could support the choice by computing the synonyms for the keywords of each movie and then filtering the computed set on the basis of the agent's context. The first step, the computation of synonyms, is needed to obtain a higher probability of a match. The second step takes the agent's preferences into account.

3 Approach

The following section describes the general approach that is used for the formal definition of the solution and for the implementation. The phases explained here are discussed in more detail in the next section.

Figure 3: The workflow of the computation of context-aware synonyms.

As Figure 3 shows, the computation starts with the interaction with an agent. It is important that an agent does not have to be a human being, but could also be a piece of software. There are two types of interaction with the entity. First there is a "passive" or indirect way of communication. In this phase the context that is relevant for the next phase of interaction is generated from different sources that are related to the agent. The gathered data could be stored in an RDF file, but a better solution is to store it in a more sophisticated system, such as an RDF repository.
The static context in particular changes rarely and is used for each computation. The easy access through a SPARQL endpoint and the scalability of an RDF repository are big advantages.

The next phase is the "active" part, where an input string is given. This string is used to query different dictionaries that return the synonyms. A dictionary that suits the computation of context-aware synonyms perfectly is WordNet from Princeton University [21]. Its output is grouped into blocks, called synsets. All synonyms in a synset have the same meaning, and the block has a natural language description of the meaning with example sentences. To simplify the computation in later steps and to make it possible to extend the approach with other dictionaries, the data is abstracted to an RDF file. In addition to simplifying the computation, this hides the underlying implementation of the data retrieval part. The software can therefore be extended later with additional dictionaries without changing the algorithm that is responsible for the generation of context-aware synonyms.

After the synonyms have been gathered, it is possible to compute the translation of each synonym in the newly gained set. In this step the context is used for the first time, to obtain the languages that are spoken by the agent. The expanded set is written to the same RDF file in which the synonyms were saved in the previous step. In the scope of this work this generated file is called the synonym cloud, analogous to the context cloud.

The preparation for the algorithm consists of two major parts. One is the computation of synonyms and the other the computation of context information. After that the intersection of the two sets is performed with the help of a SPARQL query. This approach is possible through the abstraction of the different data sets to the RDF format.
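The intersection step can be illustrated independently of RDF and SPARQL. The following Python sketch is a simplified model under assumptions, not the thesis implementation: synonym blocks are plain sets of strings, the context is a list of tags, and the function names and the normalization rule (lower-casing and dropping everything that is not a letter or digit) are hypothetical. It returns every term of each synonym block that shares at least one term with the context.

```python
def normalize(term):
    """Lower-case a term and drop spaces, underscores and punctuation.

    Assumption: this simple rule is enough to match slightly
    different spellings of the same term.
    """
    return "".join(ch for ch in term.lower() if ch.isalnum())


def context_aware_synonyms(synsets, context):
    """Return all terms of every synset that shares a term with the context."""
    context_keys = {normalize(tag) for tag in context}
    result = set()
    for synset in synsets:
        if any(normalize(term) in context_keys for term in synset):
            result.update(synset)  # the whole block becomes part of the result
    return result


# Two semantically different synonym blocks for the input "AI".
synsets_for_ai = [
    {"AI", "artificial_intelligence", "computer_science", "computing"},
    {"AI", "Amnesty International"},
]

# Hypothetical context cloud of the searching agent.
context = ["Artificial Intelligence", "Semantic Web", "Diving"]

print(sorted(context_aware_synonyms(synsets_for_ai, context)))
```

With this context only the first block is selected, because "Artificial Intelligence" and "artificial_intelligence" normalize to the same key. A context that contains "Amnesty International" would select the second block instead, and an empty context yields an empty result.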
As an optional and less accurate extension, the result set could be extended by super- and sub-concepts (also called hypernyms and hyponyms). This method should only be used if more context-aware synonyms are needed and high accuracy is not a mandatory condition.

3.1 Lexicographical Synonyms (Core)

The first major task of the process is to compute all synonyms of a given input word. In this step it does not matter whether the input comes from a human being, a group or a piece of software. Later, when we have to compute the context in which the synonyms were requested, we will see that this makes a difference. As the computation of synonyms is a well-researched area, there are many databases that could be used. For the implementation part WordNet [21] and LexVo [9] are used. Another possibility is to use a library that simplifies the dictionary access, such as Apache Lucene [10]. If there are also non-English search terms, the library has to be extended with dictionaries for the respective languages. On the other hand, an extension with domain-specific dictionaries is possible too. Such extensions give the most accurate output for different use scenarios.

WordNet [21] is an English lexical database that suits the current work perfectly, because the search results are grouped into so-called synsets. Synsets are blocks of synonyms with associated example sentences and explanations of what the synonyms in the block mean. This structure is well suited for a first pre-computation that yields logical blocks of synonyms with the same meaning. To be able to compute on the data, for example to unify the data with the context, it is necessary to transform the data into an RDF data structure. Once the new, well-formed data structure is available, it is possible to extend each synonym in the block with multilingual synonyms. At this point the LexVo [9] database is used.
Although there are many good multilingual dictionaries, LexVo [9] is used because of the easy access (a unique URI for each word), the RDF format and the fact that the dictionary returns all multilingual translations for an English word. With the help of the information about which languages the user speaks, it is possible to obtain a first context-aware set of results.

Through the conversion of the gathered synonyms to a uniform format it is possible to extend or substitute the computation with other dictionaries. Another advantage is the normalization of the terms, which makes a match possible even if the spelling differs slightly. An extension with technical dictionaries for special domains is conceivable, for example.

The ontology used to store the data is SKOS [7]. SKOS is an abbreviation for Simple Knowledge Organization System and is used to store knowledge about objects in a semantic way. For the conversion from the source data format to the SKOS ontology and for querying the abstracted data sets the framework uses Jena [2]. The excerpt of an RDF file in Listing 2 shows what a synonym block looks like.

Listing 2: Synset block in raw RDF format.

<skos:Collection rdf:about="http://example.org/synonyms/AI_noun.cognition-def2">
  <rdf:object>nouns denoting cognitive processes and contents</rdf:object>
  <skos:scopeNote>noun.cognition</skos:scopeNote>
  <skos:definition>the branch of computer science that deal with writing computer programs that can solve problems creatively</skos:definition>
  <skos:example>workers in AI hope to imitate or duplicate intelligence in computers and robots</skos:example>
  <skos:hasMember>
    <rdf:Bag rdf:about="http://example.org/synonyms/computer_science">
      <skos:note>computerscience</skos:note>
      <skos:altLabel xml:lang="en">computer_science</skos:altLabel>
    </rdf:Bag>
  </skos:hasMember>
  <skos:hasMember>
    <rdf:Bag rdf:about="http://example.org/synonyms/computing">
      <skos:note>computing</skos:note>
      <skos:altLabel xml:lang="en">computing</skos:altLabel>
    </rdf:Bag>
  </skos:hasMember>
  <skos:hasMember rdf:resource="http://example.org/synonyms/AI"/>
  <skos:hasMember>
    <rdf:Bag rdf:about="http://example.org/synonyms/artificial_intelligence">
      <skos:note>artificialintelligence</skos:note>
      <skos:altLabel xml:lang="en">artificial_intelligence</skos:altLabel>
    </rdf:Bag>
  </skos:hasMember>
</skos:Collection>

The synonym blocks are not sorted and are hard to read for a human being. With the help of XSLT it is possible to output the result in a human-readable form. The XSLT shown in Listing 3 outputs the data as shown in Figure 4.
Listing 3: Synonym cloud XSLT for visualization in HTML

<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:skos="http://www.w3.org/2009/08/skos-reference/skos.rdf#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <xsl:template match="/">
    <html>
      <body>
        <div style="text-align:center">
          <h1><font color="red">Synonym Blocks gained from CAS framework</font></h1>
        </div>
        <xsl:for-each select="//skos:Collection">
          <b>Definition: </b> <xsl:value-of select="skos:definition"/><br/>
          <b>Category: </b> <xsl:value-of select="skos:scopeNote"/> (<xsl:value-of select="rdf:object"/>)<br/>
          <b>Example: </b> <xsl:value-of select="skos:example"/><br/>
          <ul>
            <xsl:for-each select="skos:hasMember/rdf:Bag">
              <li>
                <xsl:value-of select="skos:altLabel"/><br/>
                <i>normalized: </i> <xsl:value-of select="skos:note"/>
              </li>
            </xsl:for-each>
          </ul>
          <p/>
        </xsl:for-each>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>

Figure 4: Excerpt of a synset block visualized in HTML

3.2 Conceptional Synonyms (Extension)

A useful extension to the lexicographical synonyms (see section 3.1) are conceptional synonyms, which extend the input with super-concepts (hypernyms) and sub-concepts (hyponyms). In contrast to the lexicographical synonyms, the conceptional synonyms are less accurate. For this reason they are implemented as an extension to the context-aware synonyms. The tree in Figure 5 shows the extension of the input artificial intelligence with sub- and super-concepts of depth one.

Figure 5: Excerpt of Hypernyms and Hyponyms for the search term artificial intelligence

They are less accurate because they extend the result set with additional information. Depending on the depth of the search (the tree above has depth one, because it contains only direct sub- or super-concepts), this extension could be too general or too specific. During the tests of the implementation a maximum depth of two proved useful.
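The effect of the search depth can be sketched with a small, depth-limited expansion over a taxonomy. The Python example below is an illustration under assumptions: the toy taxonomy `HYPERNYMS` is hypothetical and not taken from the data set used in the thesis, and the function names are invented for this sketch. Each found concept is mapped to the depth at which it was reached, so that a caller could treat deeper and therefore less accurate concepts differently.

```python
# Hypothetical toy taxonomy: concept -> direct super-concepts (hypernyms).
HYPERNYMS = {
    "artificial intelligence": ["computer science"],
    "machine learning": ["artificial intelligence"],
    "robotics": ["artificial intelligence"],
    "computer science": ["science"],
    "neural network": ["machine learning"],
}


def invert(parents):
    """Derive the sub-concept (hyponym) relation from the hypernym relation."""
    children = {}
    for child, supers in parents.items():
        for sup in supers:
            children.setdefault(sup, []).append(child)
    return children


HYPONYMS = invert(HYPERNYMS)


def expand(term, relation, max_depth):
    """Collect the concepts reachable within max_depth steps, with their depth."""
    found = {}
    frontier = [term]
    for depth in range(1, max_depth + 1):
        next_frontier = []
        for concept in frontier:
            for neighbour in relation.get(concept, []):
                if neighbour != term and neighbour not in found:
                    found[neighbour] = depth
                    next_frontier.append(neighbour)
        frontier = next_frontier
    return found


def conceptional_synonyms(term, max_depth=1):
    """Return the hypernyms and hyponyms of a term up to the given depth."""
    return {
        "hypernyms": expand(term, HYPERNYMS, max_depth),
        "hyponyms": expand(term, HYPONYMS, max_depth),
    }


print(conceptional_synonyms("artificial intelligence", max_depth=1))
print(conceptional_synonyms("artificial intelligence", max_depth=2))
```

In this toy data a depth of one adds only the direct neighbours, while a depth of two already reaches the very general concept "science", which shows why a larger depth tends to dilute the result set.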
If no additional payload of information is needed, a depth of one is enough. The suitable depth depends heavily on the use case of the extension, so it can be changed dynamically depending on the use case in which the library is used.

Thanks to the wide range of the data set from dbpedia.org [22], the results from the previous step can be extended with many additional terms from the given taxonomy. This extension lowers the quality of the search result. In the worst case the computed context-aware synonyms are extended with terms that are not related to the current context and only weakly related to the synonyms. For this reason this part was developed only for the integration into the m:Ciudad [14] project and is not part of the main context-aware synonym algorithm.

3.3 Context

The second major information cloud, besides the synonyms, is the context. It is useful to split the context into two major categories. One is the Static Context and the other is the Dynamic Context. The separation is made because the computation on the two sets differs and because the Static Context should not be touched if something in the Dynamic Context changes. Another advantage of the distinction is that the two context types can have different priorities. Logically the same split of the context was made in [15], although with different names: there the dynamic context is called environmental context and the static context is called personal context.

Although the context is classified into two parts, the two context types are stored in the same format throughout the implementation. The difference is indicated only by the related priority (http://crschmidt.net/ns/list#priority). The priority value can be chosen freely and depends on the scenario in which the algorithm is used. For example, in a scenario where long-term synonyms are searched for, the static context could have a higher priority.

The principle behind the abstraction is the same as described in section 3.1. Here we do not use the SKOS [7] ontology, but the SCOT [3] ontology. SCOT is an abbreviation for Social Semantic Cloud of Tags. Listing 4 shows an excerpt of a context tag in raw RDF format. The list:priority value defines the priority, which indicates whether the term is part of the static or of the dynamic context. The term in the example is part of the static context. This relation is implicit: in the current example we have defined that the static context has the priority 1.0. There is no other indication of the type of context the term belongs to. That makes the data structure very flexible for extensions.

Listing 4: Excerpt of the context term computer science

<scot:cooccure_tag>
  <scot:Tag rdf:about="http://example.org/context/computerscience">
    <scot:name>computerscience</scot:name>
    <scot:own_afrequency>2</scot:own_afrequency>
    <scot:own_rfrequency>1.3605442</scot:own_rfrequency>
    <list:priority>1.0</list:priority>
    <scot:synonym>computerscience</scot:synonym>
    <scot:synonym>Computer Science</scot:synonym>
    <scot:cooccure_with>http://koni.networld.to/foaf.rdf#me</scot:cooccure_with>
    <scot:cooccure_with>http://devnull.networld.to/foaf.rdf#me</scot:cooccure_with>
  </scot:Tag>
</scot:cooccure_tag>

To obtain a more human-readable output we use the XML stylesheet in Listing 5. With the help of this XSLT we receive an ordered output in HTML format, as shown in Figure 6.

Listing 5: Context cloud XSLT for visualization in HTML

<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:list="http://crschmidt.net/ns/list#"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:scot="http://scot-project.org/scot/ns#">
  <xsl:template match="/">
    <html>
      <body>
        <h1><font color="red">Context Cloud - Sorted by Importance</font></h1>
        <h2><xsl:value-of select="//scot:Tagcloud/dc:title"/></h2>
        <xsl:value-of select="//scot:Tagcloud/dc:description"/><p/>
        <table border="1">
          <tr bgcolor="#9acd32">
            <th>#</th>
            <th>Tag Name</th>
            <th>Priority</th>
            <th>Absolute Frequency</th>
            <th>Relative Frequency</th>
            <th>URI</th>
          </tr>
          <xsl:for-each select="//scot:Tag">
            <xsl:sort select="scot:own_afrequency * list:priority" data-type="number" order="descending"/>
            <tr>
              <td><xsl:value-of select="scot:own_afrequency * list:priority"/></td>
              <td><xsl:value-of select="scot:name"/></td>
              <td><xsl:value-of select="list:priority"/></td>
              <td><xsl:value-of select="scot:own_afrequency"/></td>
              <td><xsl:value-of select="scot:own_rfrequency"/></td>
              <td><xsl:value-of select="@rdf:about"/></td>
            </tr>
          </xsl:for-each>
        </table>
        <p/>
        ---<br/>
        <i>#</i>: The sorting value is calculated as follows: <i>Absolute Frequency * Priority</i>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>

The following two sections explain the two context types in more depth.

Static Context

As the name implies, the Static Context changes only from time to time, usually when the agent changes its interests. This type of context is an extension of the general definition of the concept of a context. In general, this type of context describes the agent that searches for synonyms.
If the agent is a person, this could be the person's interests or the tags of the person's publications. If the gathered information is not sufficient to compute suitable context-aware synonyms, it can be extended with context information from a social network. Similar to the approach described in [31], the similarity algorithm presented in this section takes the connection between two (or more) agents into account. Useful connection information from a social network are the context clouds of known agents, called "friends" in the following. We define a friend as a known agent with a similarity greater than S, where S is called the friendship threshold and is expressed as a percentage value.

Figure 6: Excerpt of the sorted context visualized in HTML

More generally we obtain the following formula:

\[ friend_{2,1} = sim(agent_1, agent_2) > S \]

\( friend_{2,1} \) ... If true, \( agent_2 \) is defined as a friend of \( agent_1 \).
\( sim(agent_1, agent_2) \) ... Computes a percentage value of the similarity of the two agents.
\( S \) ... Threshold value that indicates who is a friend. It should be set during the integration to obtain the optimal performance for the scenario.

The term friend is more abstract than the real-world term, because it is possible that the real agents have never met each other. It can be interpreted as "possible friends". It is nevertheless useful to define the friend relation in this way, because we can say: the higher the value of \( sim(agent_1, agent_2) \), the more likely it is that the part of one agent's context that does not match also describes the other agent. An argument for this definition is that people with the same background or social status are likely to have similar interests. That means that if the majority of the interests are the same, we can assume that the two agents are similar with the likelihood computed by the presented algorithm. In the scope of this work and for the implementation the sim function is defined as follows.
\[ sim(agent_1, agent_2) = \frac{\sum^{\#(matching)} \mathit{absoluteFrequency}}{\sum^{\#(total)} \mathit{absoluteFrequency}} \tag{1} \]

\[ sim_{agent_1, agent_2} = \frac{sim(agent_1, agent_2) + sim(agent_2, agent_1)}{2} \tag{2} \]

Figure 7: Compute the similarity of two Contexts

In formula 1 the numerator sums the absolute frequencies of the tags of \( agent_1 \) that also occur in the context of \( agent_2 \), and the denominator sums the absolute frequencies of all tags of \( agent_1 \).

For convenience the implementation uses FOAF [5] files to describe an agent. The FOAF file can point to other RDF files that can be used to evaluate further data related to the static context, for example published articles that are tagged. The use of linked data as abstraction is useful to consolidate information from different sources, with the advantage that the underlying algorithm stays the same when sources change.

To handle the intersection of synonyms and context we have to abstract the information further. After the evaluation of the FOAF file and all related files that include static context, we have a further RDF file written in the SCOT [3] ontology. This information cloud is the first one that also includes priority values. It is important to mention that the priority consists of two types of values. The first is the quantity and the second is the quality. The quantity only counts how often a tag occurs in the whole context. The value is stored as an absolute amount. The relative value is not stored explicitly, so the blocks stay consistent even if the total number of tags changes. If needed, the relative value is computed from the total amount and the absolute values. The other value is the quality. For the own context the quality is 1 (100 % accuracy). For the extension with the context of friends the quality value is \( sim(agent_{me}, agent_{friend}) \) if the tag does not occur in the own context; otherwise it is also 1.

In the following example the abbreviation M is used for myself and F for friend. Because the similarity function is bidirectional, the agents are interchangeable. Figure 8 shows the two entities, the shared context tags and the absolute frequency in the context cloud for the example scenario.
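Formulas 1 and 2 and the quality rule for friend tags can be sketched in a few lines of Python. The sketch is an illustration under assumptions, not the thesis implementation: the tag frequencies are those listed for M and F in Tables 1 and 2, whereas the function names, the dictionary representation and the threshold value 0.4 are hypothetical.

```python
def sim(own, other):
    """Formula 1: share of the own frequency mass that falls on tags
    which also occur in the other context cloud."""
    total = sum(own.values())
    if total == 0:
        return 0.0
    matching = sum(freq for tag, freq in own.items() if tag in other)
    return matching / total


def symmetric_sim(a, b):
    """Formula 2: average of the two directed similarities."""
    return (sim(a, b) + sim(b, a)) / 2


def extend_context(own, friend, threshold):
    """Return tag -> (absolute frequency, quality) for the extended cloud.

    Own tags keep the quality 1.0. Tags that occur only in the friend's
    cloud are added with the similarity as quality, but only if the
    similarity exceeds the (hypothetical) friendship threshold.
    """
    similarity = symmetric_sim(own, friend)
    extended = {tag: (freq, 1.0) for tag, freq in own.items()}
    if similarity > threshold:
        for tag, freq in friend.items():
            if tag not in own:
                extended[tag] = (freq, similarity)
    return extended


# Absolute frequencies of the context clouds of M and F.
M = {
    "Artificial Intelligence": 3, "Computer Science": 1, "Semantic Web": 5,
    "Business": 1, "Computer Security": 1, "Computer Network": 1,
    "Diving": 1, "Backtracker": 2, "MokSec": 2, "Research": 3,
    "Networld": 1, "Networld Team": 1,
}
F = {
    "Artificial Intelligence": 1, "Computer Science": 1, "Semantic Web": 1,
    "Robotics": 1, "Web 2.0": 1,
}

print(round(sim(M, F), 2), round(sim(F, M), 2))
print(round(symmetric_sim(M, F), 3))
extended = extend_context(M, F, threshold=0.4)
print(len(extended), sum(freq for freq, _ in extended.values()))
```

Computed without intermediate rounding, the average similarity is about 0.505. The value 0.51 in the worked example results from rounding the two directed similarities to 0.41 and 0.60 before averaging.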
Figure 8: Excerpt of context cloud relation.

Context Tag               Absolute Frequency   Common
Artificial Intelligence   3                    Yes
Computer Science          1                    Yes
Semantic Web              5                    Yes
Business                  1                    No
Computer Security         1                    No
Computer Network          1                    No
Diving                    1                    No
Backtracker               2                    No
MokSec                    2                    No
Research                  3                    No
Networld                  1                    No
Networld Team             1                    No
Total                     22

Table 1: Tags that occur in the context cloud of M

Context Tag               Absolute Frequency   Common
Artificial Intelligence   1                    Yes
Computer Science          1                    Yes
Semantic Web              1                    Yes
Robotics                  1                    No
Web 2.0                   1                    No
Total                     5

Table 2: Tags that occur in the context cloud of F

We first compute the similarity of the context cloud of M in relation to F and vice versa with the help of formula 1.

\[ sim(agent_M, agent_F) = \frac{3+1+5}{3+1+1+1+5+1+1+2+2+3+1+1} = \frac{9}{22} \approx 0.41 \]

\[ sim(agent_F, agent_M) = \frac{1+1+1}{1+1+1+1+1} = \frac{3}{5} = 0.60 \]

The average of these two values is computed with formula 2.

\[ sim = \frac{0.41 + 0.60}{2} = \frac{1.01}{2} \approx 0.51 \]

We compute the similarity for M and for F and then take the average. The average is needed because the friend relationship is bidirectional, so it is not possible to have different friend values for the two agents. More generally speaking, M cannot be more of a friend of F than F is of M. Such an asymmetric measure of friendship may be possible in real-world scenarios, but it is not permitted in our use case, where we compute the similarity of interests. In addition, the average also reduces the effect of the two context clouds having different numbers of tags. After the computation shown above we obtain the new, extended context cloud described in Table 3.

Context Tag               Absolute Frequency   Similarity   Common
Artificial Intelligence   3                    1            Yes
Computer Science          1                    1            Yes
Semantic Web              5                    1            Yes
Business                  1                    1            No
Computer Security         1                    1            No
Computer Network          1                    1            No
Diving                    1                    1            No
Backtracker               2                    1            No
MokSec                    2                    1            No
Research                  3                    1            No
Networld                  1                    1            No
Networld Team             1                    1            No
Robotics                  1                    0.51         No
Web 2.0                   1                    0.51         No
Total                     24

Table 3: Tags that occur in the context cloud of M after extension

Dynamic Context

The definition of the Dynamic Context can be taken from the entry for the word context in any English dictionary. The following definition was taken from WordNet [21].

S: (n) context, circumstance, setting (the set of facts or circumstances that surround a situation or event) "the historical context"

For the purpose of this thesis the following dynamic context parts are conceivable:

Location: A location could be expressed as a street name, a city name, ... or more accurately as GPS coordinates in the form latitude/longitude. Indirect location data is also possible, such as GSM cells or unique wireless hotspots (the MAC address of the router could be used to identify a hotspot).

Appointments: Each appointment is sorted into a category or has a description. This information could be used to extract useful tags for the context.

For the computation of synonyms it is not always possible to use all available types of context data. A good example of this is the location given as GPS coordinates. The developed algorithm cannot use raw GPS coordinates, but only concepts that describe the currently valid context. One conceivable approach to solve this problem is to evaluate the coordinates to a location concept. For example, a GPS coordinate could evaluate to the concept university. With this location specification the synonyms can be computed. During the integration into the m:Ciudad [14] framework the locations are returned in this manner.

3.4 Intersection Algorithm

Through the good preparation of the two data sets, the actual intersection works in a stable and reliable way.
It does not matter which sources are added to the sets: because everything is abstracted to a single data format, the intersection is always the same SPARQL query. Figure 9 visualizes the concept of data abstraction that is used throughout this work. It shows the abstraction from different data sources (Syn 1 to Syn X and Context 1 to Context Y), and also the possibility to express the produced output in different data formats (Result View 1 to Result View Z). This makes the algorithm portable and easy to integrate into third-party applications. The colored entities are the parts used by the algorithm to compute context-aware synonyms. RDF Synonym Cloud and RDF Context Cloud are the input sets and Context-Aware Synonyms is the output set after successful computation.

Figure 9: Data Abstraction Visualization

In addition, the intersection is simplified because dynamic and static context are handled in the same way. The difference between the two context types changes nothing in the computation and does not complicate the output. More precisely, the context types are consolidated into one large context, and the difference shows only indirectly in the priority of the entries. To keep the approach flexible, the relative weight of dynamic and static context can be changed by the developer. In this work the dynamic context has a 1.5 higher priority than the static context. These values were chosen to demonstrate a real-time scenario in which the dynamic context information is more important and the context of the searching agent is only used to optimize the search results. Such a scenario could be, for example, the search for facilities on the go, where the current location and time classifications play a greater role in a first pre-filtering step. In a second step the personal preferences are included.

Context ∩ Synonyms

The SPARQL query in Listing 6 is used to find all synonym blocks that include a term with the same name in the context.
For the intersection the two ontologies SKOS [7] and SCOT [3] are used.

Listing 6: Intersection of Synonyms and Static Context with SPARQL

    PREFIX scot: <http://scot-project.org/scot/ns#>
    PREFIX skos: <http://www.w3.org/2009/08/skos-reference/skos.rdf#>

    SELECT DISTINCT *
    FROM </path/to/searchterm.rdf>
    FROM </path/to/contextcloud.rdf>
    WHERE {
        ?synset skos:hasMember ?bagLabel .
        ?bagLabel skos:note ?tagname .
        ?tagLabel scot:name ?tagname .
        ?tagLabel scot:synonym ?orgname .
        ?synset skos:hasMember ?members .
        ?members skos:altLabel ?synonym .
        ?synset skos:definition ?definition .
    }

The SPARQL query in Listing 6 is executed over all data sources specified in the FROM clauses. Using the same variable ?tagname for the two properties skos:note and scot:name performs the intersection of the two data sources and returns the identifiers of all matching subtrees. From these subtrees all synonyms and additional information are extracted.

4 Implementation

This thesis also includes an implementation that proves that the developed algorithm works as expected. In addition, this implementation was integrated, in a slightly modified form, into the m:Ciudad [14] framework. The following section describes a concrete implementation of the computation of context-aware synonyms that was discussed theoretically in the previous sections. The implementation can be divided into the following parts:

Framework  The core part that handles the computation is written as a framework and can be deployed as a library in other applications. This part was also used for the integration into the m:Ciudad framework [14].

Local Test Implementations  For testing the framework and for use during development there are two implementations that print the received data to the standard output. One implementation uses the wrapper class ComputeCAS; the other uses the components directly and can be used in a more flexible way, for example if other dictionaries or context sources have to be integrated.
Server/Client Implementation  Another implementation, mainly written for presentation purposes and to demonstrate a real-world scenario, is a server/client architecture with the computation on the server side and the visualization on the client side.

Before the different implementations are described, the technologies used are explained.

4.1 Used Technology

The implementation was written in the programming language Java, version 1.6 [1]. The core implementation depends on the following libraries.

Jena [2]  Writing, reading and querying of the RDF files, especially for the abstraction of the context and the synonyms.

JWI [17]  The WordNet API library that queries the dictionary, which has to be stored locally.

For demonstration purposes a mobile application that runs on the Android platform is used. The minimum requirement to run the client is Android 1.5 (platform API level 3) [12], although it should also run on newer devices. The code on the mobile device is written in a subset of Java, but the generated class files run on the Dalvik Virtual Machine [13]. This part is optional and is intended only as a demonstration of how the framework could be used in a real-world scenario.

4.2 Architecture

Figure 10: Architecture view from a developer perspective.

Figures 10 and 11 explain the implementation of the algorithm from the perspective of a developer who uses the framework. The ComputeCAS class is a wrapper that simplifies its use. Because the components are loosely coupled, the framework is flexible and can be adapted easily. Loose coupling means that the component that gathers the synonyms (the SynToRDF class) and the component that computes the context (the Context class) are completely independent of each other.

Figure 11: Sequence diagram from the perspective of the ComputeCAS class.

The IntersectionAlgorithm uses only the abstracted RDF files produced by the other two classes. The dependency exists not at the code level, but at the data representation layer.
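What the intersection step computes on the abstracted data can be illustrated without RDF. The following Java sketch is an in-memory analogue and not the thesis's IntersectionAlgorithm class: the names IntersectionSketch, SynBlock and contextAwareSynonyms are hypothetical, and the normalization rule (lower case, only letters and digits kept) is an assumption inferred from the skos:note and scot:name values in the example files of the appendix. A synonym block is selected when at least one of its members has the same normalized name as a context tag, and all labels of the selected blocks are returned.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical in-memory analogue of the intersection; not the thesis implementation.
class IntersectionSketch {

    /** One synonym block: a definition plus the labels of its members. */
    static final class SynBlock {
        final String definition;
        final List<String> labels;

        SynBlock(String definition, String... labels) {
            this.definition = definition;
            this.labels = Arrays.asList(labels);
        }
    }

    /** Assumed normalization: lower case, everything except ASCII letters and digits removed. */
    static String normalize(String label) {
        return label.toLowerCase(Locale.ROOT).replaceAll("[^a-z0-9]", "");
    }

    /** Returns the labels of every block that shares a normalized name with the context. */
    static Set<String> contextAwareSynonyms(List<SynBlock> blocks, Set<String> contextTags) {
        Set<String> normalizedContext = new HashSet<String>();
        for (String tag : contextTags) {
            normalizedContext.add(normalize(tag));
        }
        Set<String> result = new LinkedHashSet<String>();
        for (SynBlock block : blocks) {
            boolean matches = false;
            for (String label : block.labels) {
                if (normalizedContext.contains(normalize(label))) {
                    matches = true;
                    break;
                }
            }
            if (matches) {
                result.addAll(block.labels);
            }
        }
        return result;
    }

    /** Two of the synonym blocks for the search term "AI" from the appendix example. */
    static List<SynBlock> exampleBlocks() {
        List<SynBlock> blocks = new ArrayList<SynBlock>();
        blocks.add(new SynBlock("a sloth that has three long claws on each forefoot and each hindfoot",
                "three-toed sloth", "Bradypus tridactylus", "tree sloth", "sloth", "ai"));
        blocks.add(new SynBlock("the branch of computer science that deal with writing computer programs "
                + "that can solve problems creatively",
                "computer science", "computing", "AI", "artificial intelligence"));
        return blocks;
    }

    /** A few context tags in their original spelling. */
    static Set<String> exampleContext() {
        return new LinkedHashSet<String>(
                Arrays.asList("Artificial Intelligence", "Computer Science", "Semantic Web"));
    }

    public static void main(String[] args) {
        System.out.println(contextAwareSynonyms(exampleBlocks(), exampleContext()));
    }
}
```

With this sample data only the computer science block is selected, because the context contains no tag that matches a member of the sloth block. The sketch ignores priorities and word definitions, which the RDF-based intersection also returns.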
Because the components are coupled only through the data representation, the use of RDF ontologies is a central point. In addition, one component can easily be replaced by another implementation as long as the abstracted data does not change. Figure 10 shows the static dependencies between the classes. The sequence diagram in Figure 11, on the other hand, shows the interaction between the classes at runtime.

The returned context-aware synonyms are encapsulated in the IResultSet interface. The concrete implementation ResultSet contains CASTag objects that implement the IResultTag interface. The context-aware synonyms have no semantic representation of their own. This makes it possible to store the result in different data formats without much effort.

4.2.1 Package: synonyms

To simplify the extension of the framework with other dictionaries, the synonyms package abstracts the data obtained from different dictionaries and writes it to a special RDF file that mainly uses the SKOS [7] ontology. Each dictionary used implements the following interface.

Listing 7: Synonym Java Interface

    package interfaces;

    import java.util.Vector;

    public interface ISynonyms {
        public Vector<String> getWordDefinition(String word);
        public Vector<SynEntry> getSynonymList(String word);
    }

Figure 12: Use Case diagram that describes which tasks have to be fulfilled to compute context-aware synonyms

Implementing this interface ensures that the intersection algorithm works on the newly added set. The other two important classes in the package are SynEntry and SynToRDF. SynEntry implements the interface ISynEntry (see Listing 8) and combines synonyms with the same meaning into a logical block. Such a block includes the following values:

word  The search term entered by an agent.

word type  The word type, such as adjective, adverb, noun or verb.

word type description  A short description of the word type.
language list  For each synonym there exists a related language.

synonym list  Synonyms that have the same meaning.

Listing 8: SynEntry Java Interface

    package interfaces;

    import java.util.HashMap;

    public interface ISynEntry {
        public abstract void setSynonym(String synonym, String language);
        public abstract void setHypernym(String synonym, String language);
        public abstract void setHyponym(String synonym, String language);
        public abstract void setWordTypeDefinition(String wordTypeDefinition);

        public abstract String getWord();
        public abstract String getDefinition();
        public abstract String getWordType();
        public abstract String getWordTypeDefinition();
        public abstract HashMap<String, String> getSynMap();
        public abstract HashMap<String, String> getHypernymMap();
        public abstract HashMap<String, String> getHyponymMap();
    }

The class SynToRDF handles the writing of the abstracted synonyms. For example, it writes the translated words into the correct block. If a dictionary is added or removed, this class has to be changed as well. SynToRDF then writes all synonyms obtained for a certain search term to an RDF file.

4.2.2 Package: context

Figure 13: Describes the dependencies in the context package.

On the context side the class ContextTag is used to ensure the correct handling of the context abstraction. To simplify access, a manager class named ContextTagCloud is used. This class handles the set of all context tags and can be used to extend the context cloud with information from different sources.

Listing 9: Context Tag Java Interface

    package interfaces;

    import java.util.Vector;

    public interface IContextTag {
        public abstract void setTagName(String tagName);
        public abstract void setOrgSpelling(String tagOrgName);
        public abstract void setAbsoluteFrequency(int absoluteFrequency);
        public abstract void setPriority(float priority);
        public abstract void setCooccurURI(String uri);
        public abstract void incrementFrequency();

        public abstract String getTagName();
        public abstract int getAbsoluteFrequency();
        public abstract float getPriority();
        public abstract Vector<String> getCooccurURI();
        public abstract Vector<String> getOrgSpelling();
    }

4.3 Server/Client Architecture

The following section briefly describes the proof-of-concept implementation that shows how the framework could be used in a real-world scenario.

4.3.1 Server

The server component uses a very simple proprietary protocol (see footnote 5) to exchange information with the client. In this scenario the server is responsible for the synonym computation and for the abstraction of the context information. The client sends the context data in the form of a URL to a FOAF file. For the test scenario this context is sufficient, also because the publications and the related tags are reachable from the FOAF file.

4.3.2 Client

The client is a simple GUI frontend that makes it easier to interact with the framework. As Figure 14a shows, the main part consists only of an input field and a search button. After the server variables and the URL to the context data have been set (see Figures 14b and 14c), it is possible to search for synonyms. For this purpose the client first sends the context URLs to the server and then the search term. After the synonyms have been computed on the server side, the result is sent back to the client.
The client notifies the user with a message at the top left of the screen and prints the results in a list (see Figure 14d).

Implementing the client on a mobile device is useful not only for computing context-aware synonyms on the go, but also for obtaining a dynamic context for the current user. For most people such a device is a personal item that stores a lot of information about them. This makes it possible to compute a detailed dynamic context.

Footnote 5: CSV (comma separated values) is used for the exchange of request/response messages.

4.4 Integrated Version - The Library

A slightly adapted version was integrated into the m:Ciudad framework [14]. m:Ciudad is a service infrastructure that allows users to generate composite services on the fly. One difference from the test implementation is that the context is not related to one agent, but to a group of agents, a community. Another difference is that the data is not read from RDF files, but directly from the internal data store. For a detailed comparison between the two implementations see Table 4. The integration into the m:Ciudad framework shows how flexibly the implementation can be used.

                 Standalone Test Application                             m:Ciudad Integration
Context Source   FOAF file with related information (e.g. publications)  UDL-SP (m:Ciudad Service Profile)
Context for...   one agent                                               a community
Synonym Source   WordNet, LexVo                                          WordNet, DBpedia.org

Table 4: Difference between the standalone test application and the m:Ciudad integration

5 Evaluation

The focus of this thesis was to develop an algorithm that computes synonyms that are relevant for a given context. Through the abstraction of the two sets it was possible to use well-defined, publicly available data sets, such as WordNet for the computation of the synonym blocks in English and LexVo [9] for the extension with multilingual terms.
The abstraction is even more important for the context information, because different data sources have to be evaluated to reach a nearly complete representation of the real-world environment. This approach makes the framework easy to extend with additional synonym and context sources. The choice of this architecture also mitigates the limitation that the algorithm is only as good as the underlying data sets. If the data sets are incomplete or do not define the right context, the algorithm returns wrong results. The decision about which data sets to use depends on the scenario and can only be made during the integration.

In addition to the flexible core algorithm, a second approach was developed that extends the synonyms with Hypernyms and Hyponyms. This algorithm uses the dbpedia.org [22] data set. This implementation is not related to any context and should only be used as an extension of the core part if there are not enough results. Another extension could be the prediction of words in a sentence as described in [19]. This extension was not part of this work because there the context relates only to the current sentence and not to single agents or groups.

The developed algorithm bridges the gap between context information systems, such as Context-Aware Services for Mobile Users - Technology and User Experiences [15] and Context-Aware Query Processing on the Semantic Web [6], and synonym databases (e.g. WordNet [21]). In contrast to the papers about context computation, this work assumes that the context is available and starts from this more abstract layer. This makes it possible to shift the focus away from data gathering towards the intersection of context and synonyms. After the design of the computation of context-aware synonyms, the implementation was integrated into the m:Ciudad context-aware search engine for mobile services that are generated on the fly [8].
The purpose of the algorithm there is to use the internally stored context for user groups and the given search term to optimize the search results. Because of the broader context information (see footnote 6), the additional extension with Hypernyms and Hyponyms was used.

The major drawback of the presented solution is that it works only if the underlying data sets are correct and include enough content to reason about. This problem can be minimized by choosing well-known synonym databases on the one side and by using information from user-generated data sets on the other side. For example, context information could be filtered from social networks or from mobile devices. To reach an optimal result set the algorithm should receive scenario-related data as input. For example, domain-specific synonym dictionaries improve the quality of the synonym sets. The same is true for the context information: the sources should be chosen so that they describe the agent as completely as possible. The algorithm is particularly successful if the input term is an abbreviation that expands to several semantically different synonym blocks and the searching agent has domain-specific information in his context that is related to the search term.

6 Further Work

The algorithm presented in this thesis shows one possible way to compute context-aware synonyms. The related application provides a highly flexible implementation that can be used as a starting point for a concrete real-world integration. To use the algorithm productively, the data sources should be improved and adapted to the scenario. This improvement concerns not only domain-specific synonym dictionaries, but also the context information that describes the searching agent or group. Context information could be extracted, for example, from social networks.
The focus of this work was only the computation of context-aware synonyms and not the creation of an ontology that represents the search results. In the presented scenario, and probably in most scenarios, the resulting data has to be handled by a third-party application, so it makes sense to have an ontology that represents context-aware synonyms. A further improvement of the algorithm would be to parse the definitions and the example sentences in the synonym cloud and to take the extracted information into account during the intersection process. This extension requires understanding the semantics of the natural language sentences. A good approach for an optimized output is the improvement and/or the extension of the underlying data sets.

Footnote 6: m:Ciudad services use context information related to groups and not to single agents.

A Appendix - RDF example files

Listing 10: Synonyms for AI abstracted as RDF file

    <?xml version="1.0"?>
    <rdf:RDF
        xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
        xmlns:skos="http://www.w3.org/2009/08/skos-reference/skos.rdf#">
      <skos:Collection rdf:about="http://example.org/synonyms/AI_noun.act-def_4">
        <rdf:object>nouns denoting acts or actions</rdf:object>
        <skos:scopeNote>noun.act</skos:scopeNote>
        <skos:definition>the introduction of semen into the oviduct or uterus by some means other than sexual intercourse</skos:definition>
        <skos:hasMember>
          <rdf:Bag rdf:about="http://example.org/synonyms/AI">
            <skos:note>ai</skos:note>
            <skos:altLabel xml:lang="en">AI</skos:altLabel>
          </rdf:Bag>
        </skos:hasMember>
        <skos:hasMember>
          <rdf:Bag rdf:about="http://example.org/synonyms/insemination">
            <skos:note>insemination</skos:note>
            <skos:altLabel xml:lang="en">insemination</skos:altLabel>
          </rdf:Bag>
        </skos:hasMember>
        <skos:hasMember>
          <rdf:Bag rdf:about="http://example.org/synonyms/artificial_insemination">
            <skos:note>artificialinsemination</skos:note>
            <skos:altLabel xml:lang="en">artificial insemination</skos:altLabel>
          </rdf:Bag>
        </skos:hasMember>
      </skos:Collection>
      <skos:Collection rdf:about="http://example.org/synonyms/AI_noun.animal-def_3">
        <rdf:object>nouns denoting animals</rdf:object>
        <skos:scopeNote>noun.animal</skos:scopeNote>
        <skos:definition>a sloth that has three long claws on each forefoot and each hindfoot</skos:definition>
        <skos:hasMember>
          <rdf:Bag rdf:about="http://example.org/synonyms/three-toed_sloth">
            <skos:note>threetoedsloth</skos:note>
            <skos:altLabel xml:lang="en">three-toed sloth</skos:altLabel>
          </rdf:Bag>
        </skos:hasMember>
        <skos:hasMember>
          <rdf:Bag rdf:about="http://example.org/synonyms/Bradypus_tridactylus">
            <skos:note>bradypustridactylus</skos:note>
            <skos:altLabel xml:lang="en">Bradypus tridactylus</skos:altLabel>
          </rdf:Bag>
        </skos:hasMember>
        <skos:hasMember>
          <rdf:Bag rdf:about="http://example.org/synonyms/tree_sloth">
            <skos:note>treesloth</skos:note>
            <skos:altLabel xml:lang="en">tree sloth</skos:altLabel>
          </rdf:Bag>
        </skos:hasMember>
        <skos:hasMember>
          <rdf:Bag rdf:about="http://example.org/synonyms/sloth">
            <skos:note>sloth</skos:note>
            <skos:altLabel xml:lang="en">sloth</skos:altLabel>
          </rdf:Bag>
        </skos:hasMember>
        <skos:hasMember>
          <rdf:Bag rdf:about="http://example.org/synonyms/ai">
            <skos:note>ai</skos:note>
            <skos:altLabel xml:lang="en">ai</skos:altLabel>
          </rdf:Bag>
        </skos:hasMember>
      </skos:Collection>
      <skos:Collection rdf:about="http://example.org/synonyms/AI_noun.group-def_1">
        <rdf:object>nouns denoting groupings of people or objects</rdf:object>
        <skos:hasMember>
          <rdf:Bag rdf:about="http://example.org/synonyms/authority">
            <skos:note>authority</skos:note>
            <skos:altLabel xml:lang="en">authority</skos:altLabel>
          </rdf:Bag>
        </skos:hasMember>
        <skos:scopeNote>noun.group</skos:scopeNote>
        <skos:hasMember>
          <rdf:Bag rdf:about="http://example.org/synonyms/Army_Intelligence">
            <skos:note>armyintelligence</skos:note>
            <skos:altLabel xml:lang="en">Army Intelligence</skos:altLabel>
          </rdf:Bag>
        </skos:hasMember>
        <skos:hasMember>
          <rdf:Bag rdf:about="http://example.org/synonyms/office">
            <skos:note>office</skos:note>
            <skos:altLabel xml:lang="en">office</skos:altLabel>
          </rdf:Bag>
        </skos:hasMember>
        <skos:definition>an agency of the United States Army responsible for providing timely and relevant and accurate and synchronized intelligence to tactical and operational and strategic level commanders</skos:definition>
        <skos:hasMember rdf:resource="http://example.org/synonyms/AI"/>
        <skos:hasMember>
          <rdf:Bag rdf:about="http://example.org/synonyms/federal_agency">
            <skos:note>federalagency</skos:note>
            <skos:altLabel xml:lang="en">federal agency</skos:altLabel>
          </rdf:Bag>
        </skos:hasMember>
        <skos:hasMember>
          <rdf:Bag rdf:about="http://example.org/synonyms/agency">
            <skos:note>agency</skos:note>
            <skos:altLabel xml:lang="en">agency</skos:altLabel>
          </rdf:Bag>
        </skos:hasMember>
        <skos:hasMember>
          <rdf:Bag rdf:about="http://example.org/synonyms/government_agency">
            <skos:note>governmentagency</skos:note>
            <skos:altLabel xml:lang="en">government agency</skos:altLabel>
          </rdf:Bag>
        </skos:hasMember>
        <skos:hasMember>
          <rdf:Bag rdf:about="http://example.org/synonyms/bureau">
            <skos:note>bureau</skos:note>
            <skos:altLabel xml:lang="en">bureau</skos:altLabel>
          </rdf:Bag>
        </skos:hasMember>
      </skos:Collection>
      <skos:Collection rdf:about="http://example.org/synonyms/AI_noun.cognition-def_2">
        <rdf:object>nouns denoting cognitive processes and contents</rdf:object>
        <skos:scopeNote>noun.cognition</skos:scopeNote>
        <skos:definition>the branch of computer science that deal with writing computer programs that can solve problems creatively</skos:definition>
        <skos:example>workers in AI hope to imitate or duplicate intelligence in computers and robots</skos:example>
        <skos:hasMember>
          <rdf:Bag rdf:about="http://example.org/synonyms/computer_science">
            <skos:note>computerscience</skos:note>
            <skos:altLabel xml:lang="en">computer science</skos:altLabel>
          </rdf:Bag>
        </skos:hasMember>
        <skos:hasMember>
          <rdf:Bag rdf:about="http://example.org/synonyms/computing">
            <skos:note>computing</skos:note>
            <skos:altLabel xml:lang="en">computing</skos:altLabel>
          </rdf:Bag>
        </skos:hasMember>
        <skos:hasMember rdf:resource="http://example.org/synonyms/AI"/>
        <skos:hasMember>
          <rdf:Bag rdf:about="http://example.org/synonyms/artificial_intelligence">
            <skos:note>artificialintelligence</skos:note>
            <skos:altLabel xml:lang="en">artificial intelligence</skos:altLabel>
          </rdf:Bag>
        </skos:hasMember>
      </skos:Collection>
    </rdf:RDF>

Listing 11: Context abstracted as RDF file

    <?xml version="1.0"?>
    <rdf:RDF
        xmlns:list="http://crschmidt.net/ns/list#"
        xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
        xmlns:scot="http://scot-project.org/scot/ns#">
      <scot:Coocurrence rdf:about="http://networld.to/?sioc_type=post&amp;sioc_id=97">
        <scot:cooccure_tag>
          <scot:Tag rdf:about="http://example.org/context/semanticweb">
            <list:priority>1.0</list:priority>
            <scot:cooccure_with>http://networld.to/?sioc_type=post&amp;sioc_id=97</scot:cooccure_with>
            <scot:cooccure_with>http://devnull.networld.to/foaf.rdf#me</scot:cooccure_with>
            <scot:cooccure_with>http://networld.to/?sioc_type=post&amp;sioc_id=207</scot:cooccure_with>
            <scot:own_afrequency>7</scot:own_afrequency>
            <scot:cooccure_with>http://networld.to/?sioc_type=post&amp;sioc_id=229</scot:cooccure_with>
            <scot:synonym>semanticweb</scot:synonym>
            <scot:cooccure_with>http://koni.networld.to/foaf.rdf#me</scot:cooccure_with>
            <scot:synonym>Semantic Web</scot:synonym>
            <scot:cooccure_with>http://networld.to/?sioc_type=post&amp;sioc_id=339</scot:cooccure_with>
            <scot:cooccure_with>http://networld.to/?sioc_type=post&amp;sioc_id=172</scot:cooccure_with>
            <scot:name>semanticweb</scot:name>
            <scot:own_rfrequency>22.580645</scot:own_rfrequency>
          </scot:Tag>
        </scot:cooccure_tag>
        <scot:cooccure_tag>
          <scot:Tag rdf:about="http://example.org/context/networldteam">
            <scot:name>networldteam</scot:name>
            <scot:own_afrequency>1</scot:own_afrequency>
            <scot:own_rfrequency>3.2258062</scot:own_rfrequency>
            <list:priority>1.0</list:priority>
            <scot:synonym>Networld Team</scot:synonym>
            <scot:cooccure_with>http://networld.to/?sioc_type=post&amp;sioc_id=97</scot:cooccure_with>
          </scot:Tag>
        </scot:cooccure_tag>
        <scot:cooccure_tag>
          <scot:Tag rdf:about="http://example.org/context/backtracker">
            <scot:name>backtracker</scot:name>
            <scot:own_afrequency>2</scot:own_afrequency>
            <scot:own_rfrequency>6.4516125</scot:own_rfrequency>
            <list:priority>1.0</list:priority>
            <scot:synonym>Backtracker</scot:synonym>
            <scot:cooccure_with>http://networld.to/?sioc_type=post&amp;sioc_id=97</scot:cooccure_with>
            <scot:cooccure_with>http://networld.to/?sioc_type=post&amp;sioc_id=172</scot:cooccure_with>
          </scot:Tag>
        </scot:cooccure_tag>
        <scot:cooccure_tag>
          <scot:Tag rdf:about="http://example.org/context/artificialintelligence">
            <scot:cooccure_with>http://networld.to/?sioc_type=post&amp;sioc_id=97</scot:cooccure_with>
            <list:priority>1.0</list:priority>
            <scot:cooccure_with>http://networld.to/?sioc_type=post&amp;sioc_id=229</scot:cooccure_with>
            <scot:synonym>Artificial Intelligence</scot:synonym>
            <scot:cooccure_with>http://koni.networld.to/foaf.rdf#me</scot:cooccure_with>
            <scot:own_rfrequency>12.903225</scot:own_rfrequency>
            <scot:own_afrequency>4</scot:own_afrequency>
            <scot:cooccure_with>http://devnull.networld.to/foaf.rdf#me</scot:cooccure_with>
            <scot:name>artificialintelligence</scot:name>
            <scot:synonym>artificialintelligence</scot:synonym>
          </scot:Tag>
        </scot:cooccure_tag>
        <scot:cooccure_tag>
          <scot:Tag rdf:about="http://example.org/context/moksec">
            <scot:name>moksec</scot:name>
            <scot:own_afrequency>2</scot:own_afrequency>
            <scot:own_rfrequency>6.4516125</scot:own_rfrequency>
            <list:priority>1.0</list:priority>
            <scot:synonym>MokSec</scot:synonym>
            <scot:cooccure_with>http://networld.to/?sioc_type=post&amp;sioc_id=97</scot:cooccure_with>
            <scot:cooccure_with>http://networld.to/?sioc_type=post&amp;sioc_id=172</scot:cooccure_with>
          </scot:Tag>
        </scot:cooccure_tag>
        <scot:cooccure_tag>
          <scot:Tag rdf:about="http://example.org/context/research">
            <scot:name>research</scot:name>
            <scot:own_afrequency>3</scot:own_afrequency>
            <scot:own_rfrequency>9.677419</scot:own_rfrequency>
            <list:priority>1.0</list:priority>
            <scot:synonym>Research</scot:synonym>
            <scot:cooccure_with>http://networld.to/?sioc_type=post&amp;sioc_id=97</scot:cooccure_with>
            <scot:cooccure_with>http://networld.to/?sioc_type=post&amp;sioc_id=207</scot:cooccure_with>
            <scot:cooccure_with>http://networld.to/?sioc_type=post&amp;sioc_id=229</scot:cooccure_with>
          </scot:Tag>
        </scot:cooccure_tag>
        <scot:cooccure_tag>
          <scot:Tag rdf:about="http://example.org/context/networld">
            <scot:name>networld</scot:name>
            <scot:own_afrequency>1</scot:own_afrequency>
            <scot:own_rfrequency>3.2258062</scot:own_rfrequency>
            <list:priority>1.0</list:priority>
            <scot:synonym>Networld</scot:synonym>
            <scot:cooccure_with>http://networld.to/?sioc_type=post&amp;sioc_id=97</scot:cooccure_with>
          </scot:Tag>
        </scot:cooccure_tag>
      </scot:Coocurrence>
      <scot:Coocurrence rdf:about="http://networld.to/?sioc_type=post&amp;sioc_id=333">
        <scot:cooccure_tag>
          <scot:Tag rdf:about="http://example.
o r g / c o n t e x t / u n c a t e g o r i z e d ”> <s c o t : name>u n c a t e g o r i z e d </ s c o t : name> <s c o t : o w n a f r e q u e n c y >1</ s c o t : o w n a f r e q u e n c y > <s c o t : o w n r f r e q u e n c y >3.2258062 </ s c o t : own rfrequency> < l i s t : p r i o r i t y >1.0</ l i s t : p r i o r i t y > <s c o t : synonym>U n c a t e g o r i z e d </ s c o t : synonym> <s c o t : c o o c c u r e w i t h >h t t p : // n e t w o r l d . t o /? s i o c t y p e=p o s t& ; s i o c i d =333</ s c o t : cooccure with> </ s c o t : Tag> </ s c o t : c o o c c u r e t a g > </ s c o t : Coocurrence > <s c o t : C o o c u r r e n c e r d f : about=” h t t p : / / d e v n u l l . n e t w o r l d . t o / f o a f . r d f#me”> <s c o t : c o o c c u r e t a g r d f : r e s o u r c e=” h t t p : / / example . o r g / c o n t e x t / semanticweb ”/> <s c o t : c o o c c u r e t a g > <s c o t : Tag r d f : about=” h t t p : / / example . o r g / c o n t e x t / d i v i n g ”> <s c o t : name>d i v i n g </ s c o t : name> <s c o t : o w n a f r e q u e n c y >1</ s c o t : o w n a f r e q u e n c y > <s c o t : o w n r f r e q u e n c y >3.2258062 </ s c o t : own rfrequency> < l i s t : p r i o r i t y >1.0</ l i s t : p r i o r i t y > <s c o t : synonym>Diving </ s c o t : synonym> <s c o t : c o o c c u r e w i t h >h t t p : // d e v n u l l . n e t w o r l d . t o / f o a f . r d f#me</ s c o t : c o o c c u r e w i t h > 35 </ s c o t : Tag> </ s c o t : c o o c c u r e t a g > <s c o t : c o o c c u r e t a g > <s c o t : Tag r d f : about=” h t t p : / / example . 
o r g / c o n t e x t / computernetwork ”> <s c o t : name>computernetwork </ s c o t : name> <s c o t : o w n a f r e q u e n c y >1</ s c o t : o w n a f r e q u e n c y > <s c o t : o w n r f r e q u e n c y >3.2258062 </ s c o t : own rfrequency> < l i s t : p r i o r i t y >1.0</ l i s t : p r i o r i t y > <s c o t : synonym>Computer Network</ s c o t : synonym> <s c o t : c o o c c u r e w i t h >h t t p : // d e v n u l l . n e t w o r l d . t o / f o a f . r d f#me</ s c o t : c o o c c u r e w i t h > </ s c o t : Tag> </ s c o t : c o o c c u r e t a g > <s c o t : c o o c c u r e t a g r d f : r e s o u r c e=” h t t p : / / example . o r g / c o n t e x t / a r t i f i c i a l i n t e l l i g e n c e ”/> <s c o t : c o o c c u r e t a g > <s c o t : Tag r d f : about=” h t t p : / / example . o r g / c o n t e x t / c o m p u t e r s c i e n c e ”> <s c o t : name>c o m p u t e r s c i e n c e </ s c o t : name> <s c o t : o w n a f r e q u e n c y >2</ s c o t : o w n a f r e q u e n c y > <s c o t : o w n r f r e q u e n c y >6.4516125 </ s c o t : own rfrequency> < l i s t : p r i o r i t y >1.0</ l i s t : p r i o r i t y > <s c o t : synonym>c o m p u t e r s c i e n c e </ s c o t : synonym> <s c o t : synonym>Computer S c i e n c e </ s c o t : synonym> <s c o t : c o o c c u r e w i t h >h t t p : // k o n i . n e t w o r l d . t o / f o a f . r d f#me</ s c o t : c o o c c u r e w i t h > <s c o t : c o o c c u r e w i t h >h t t p : // d e v n u l l . n e t w o r l d . t o / f o a f . r d f#me</ s c o t : c o o c c u r e w i t h > </ s c o t : Tag> </ s c o t : c o o c c u r e t a g > <s c o t : c o o c c u r e t a g > <s c o t : Tag r d f : about=” h t t p : / / example . 
o r g / c o n t e x t / c o m p u t e r s e c u r i t y ”> <s c o t : name>c o m p u t e r s e c u r i t y </ s c o t : name> <s c o t : o w n a f r e q u e n c y >1</ s c o t : o w n a f r e q u e n c y > <s c o t : o w n r f r e q u e n c y >3.2258062 </ s c o t : own rfrequency> < l i s t : p r i o r i t y >1.0</ l i s t : p r i o r i t y > <s c o t : synonym>Computer S e c u r i t y </ s c o t : synonym> <s c o t : c o o c c u r e w i t h >h t t p : // d e v n u l l . n e t w o r l d . t o / f o a f . r d f#me</ s c o t : c o o c c u r e w i t h > </ s c o t : Tag> </ s c o t : c o o c c u r e t a g > <s c o t : c o o c c u r e t a g > <s c o t : Tag r d f : about=” h t t p : / / example . o r g / c o n t e x t / b u s i n e s s ”> <s c o t : name>b u s i n e s s </ s c o t : name> <s c o t : o w n a f r e q u e n c y >1</ s c o t : o w n a f r e q u e n c y > <s c o t : o w n r f r e q u e n c y >3.2258062 </ s c o t : own rfrequency> < l i s t : p r i o r i t y >1.0</ l i s t : p r i o r i t y > <s c o t : synonym>B u s i n e s s </ s c o t : synonym> <s c o t : c o o c c u r e w i t h >h t t p : // d e v n u l l . n e t w o r l d . t o / f o a f . r d f#me</ s c o t : c o o c c u r e w i t h > </ s c o t : Tag> </ s c o t : c o o c c u r e t a g > </ s c o t : Coocurrence > <s c o t : C o o c u r r e n c e r d f : about=” h t t p : / / n e t w o r l d . t o /? s i o c t y p e= p o s t& ; s i o c i d =172”> 36 <s c o t : c o o c c u r e t a g r d f : r e s o u r c e=” h t t p : / / example . o r g / c o n t e x t / semanticweb ”/> <s c o t : c o o c c u r e t a g r d f : r e s o u r c e=” h t t p : / / example . o r g / c o n t e x t / b a c k t r a c k e r ”/> <s c o t : c o o c c u r e t a g r d f : r e s o u r c e=” h t t p : / / example . o r g / c o n t e x t / moksec ”/> </ s c o t : Coocurrence > <s c o t : TagCloud> <s c o t : h a s t a g r d f : r e s o u r c e=” h t t p : / / example . 
o r g / c o n t e x t / computernetwork ”/> <s c o t : h a s t a g r d f : r e s o u r c e=” h t t p : / / example . o r g / c o n t e x t / a r t i f i c i a l i n t e l l i g e n c e ”/> <s c o t : h a s t a g r d f : r e s o u r c e=” h t t p : / / example . o r g / c o n t e x t / b a c k t r a c k e r ”/> <s c o t : h a s t a g > <s c o t : Tag r d f : about=” h t t p : / / example . o r g / c o n t e x t / web20 ”> <s c o t : name>web20</ s c o t : name> <s c o t : o w n a f r e q u e n c y >1</ s c o t : o w n a f r e q u e n c y > <s c o t : o w n r f r e q u e n c y >3.2258062 </ s c o t : own rfrequency> < l i s t : p r i o r i t y >0.45</ l i s t : p r i o r i t y > <s c o t : synonym>web20</ s c o t : synonym> <s c o t : c o o c c u r e w i t h >h t t p : // k o n i . n e t w o r l d . t o / f o a f . r d f#me</ s c o t : c o o c c u r e w i t h > </ s c o t : Tag> </ s c o t : h a s t a g > <s c o t : h a s t a g > <s c o t : Tag r d f : about=” h t t p : / / example . o r g / c o n t e x t / m o b i l e d e v i c e s ”> <s c o t : name>m o b i l e d e v i c e s </ s c o t : name> <s c o t : o w n a f r e q u e n c y >1</ s c o t : o w n a f r e q u e n c y > <s c o t : o w n r f r e q u e n c y >3.2258062 </ s c o t : own rfrequency> < l i s t : p r i o r i t y >1.0</ l i s t : p r i o r i t y > <s c o t : synonym>Mobile D e v i c e s </ s c o t : synonym> <s c o t : c o o c c u r e w i t h >h t t p : // n e t w o r l d . t o /? s i o c t y p e=p o s t& ; s i o c i d =339</ s c o t : cooccure with> </ s c o t : Tag> </ s c o t : h a s t a g > <s c o t : h a s t a g r d f : r e s o u r c e=” h t t p : / / example . o r g / c o n t e x t / n e t w o r l d ”/> <s c o t : t o t a l t a g s >31</ s c o t : t o t a l t a g s > <s c o t : h a s t a g r d f : r e s o u r c e=” h t t p : / / example . o r g / c o n t e x t / semanticweb ”/> <s c o t : h a s t a g r d f : r e s o u r c e=” h t t p : / / example . 
o r g / c o n t e x t / c o m p u t e r s c i e n c e ”/> <s c o t : h a s t a g > <s c o t : Tag r d f : about=” h t t p : / / example . o r g / c o n t e x t / f o o t b a l l ”> <s c o t : name>f o o t b a l l </ s c o t : name> <s c o t : o w n a f r e q u e n c y >1</ s c o t : o w n a f r e q u e n c y > <s c o t : o w n r f r e q u e n c y >3.2258062 </ s c o t : own rfrequency> < l i s t : p r i o r i t y >0.45</ l i s t : p r i o r i t y > <s c o t : synonym>f o o t b a l l </ s c o t : synonym> <s c o t : c o o c c u r e w i t h >h t t p : // k o n i . n e t w o r l d . t o / f o a f . r d f#me</ s c o t : c o o c c u r e w i t h > </ s c o t : Tag> </ s c o t : h a s t a g > <s c o t : h a s t a g r d f : r e s o u r c e=” h t t p : / / example . o r g / c o n t e x t / 37 u n c a t e g o r i z e d ”/> <s c o t : h a s t a g r d f : r e s o u r c e=” h t t p : / / example . o r g / c o n t e x t / c o m p u t e r s e c u r i t y ”/> <s c o t : h a s t a g r d f : r e s o u r c e=” h t t p : / / example . o r g / c o n t e x t / networldteam ”/> <s c o t : h a s t a g > <s c o t : Tag r d f : about=” h t t p : / / example . o r g / c o n t e x t / r o b o t i c s ”> <s c o t : name>r o b o t i c s </ s c o t : name> <s c o t : o w n a f r e q u e n c y >1</ s c o t : o w n a f r e q u e n c y > <s c o t : o w n r f r e q u e n c y >3.2258062 </ s c o t : own rfrequency> < l i s t : p r i o r i t y >0.45</ l i s t : p r i o r i t y > <s c o t : synonym>r o b o t i c s </ s c o t : synonym> <s c o t : c o o c c u r e w i t h >h t t p : // k o n i . n e t w o r l d . t o / f o a f . r d f#me</ s c o t : c o o c c u r e w i t h > </ s c o t : Tag> </ s c o t : h a s t a g > <s c o t : h a s t a g r d f : r e s o u r c e=” h t t p : / / example . o r g / c o n t e x t / moksec ”/> <s c o t : h a s t a g r d f : r e s o u r c e=” h t t p : / / example . 
o r g / c o n t e x t / r e s e a r c h ”/> <s c o t : h a s t a g r d f : r e s o u r c e=” h t t p : / / example . o r g / c o n t e x t / b u s i n e s s ”/> <s c o t : h a s t a g r d f : r e s o u r c e=” h t t p : / / example . o r g / c o n t e x t / d i v i n g ”/> </ s c o t : TagCloud> <s c o t : C o o c u r r e n c e r d f : about=” h t t p : / / n e t w o r l d . t o /? s i o c t y p e= p o s t& ; s i o c i d =229”> <s c o t : c o o c c u r e t a g r d f : r e s o u r c e=” h t t p : / / example . o r g / c o n t e x t / semanticweb ”/> <s c o t : c o o c c u r e t a g r d f : r e s o u r c e=” h t t p : / / example . o r g / c o n t e x t / a r t i f i c i a l i n t e l l i g e n c e ”/> <s c o t : c o o c c u r e t a g r d f : r e s o u r c e=” h t t p : / / example . o r g / c o n t e x t / r e s e a r c h ”/> </ s c o t : Coocurrence > <s c o t : C o o c u r r e n c e r d f : about=” h t t p : / / k o n i . n e t w o r l d . t o / f o a f . r d f#me”> <s c o t : c o o c c u r e t a g r d f : r e s o u r c e=” h t t p : / / example . o r g / c o n t e x t / semanticweb ”/> <s c o t : c o o c c u r e t a g r d f : r e s o u r c e=” h t t p : / / example . o r g / c o n t e x t / a r t i f i c i a l i n t e l l i g e n c e ”/> <s c o t : c o o c c u r e t a g r d f : r e s o u r c e=” h t t p : / / example . o r g / c o n t e x t / c o m p u t e r s c i e n c e ”/> <s c o t : c o o c c u r e t a g r d f : r e s o u r c e=” h t t p : / / example . o r g / c o n t e x t / r o b o t i c s ”/> <s c o t : c o o c c u r e t a g r d f : r e s o u r c e=” h t t p : / / example . o r g / c o n t e x t / f o o t b a l l ”/> <s c o t : c o o c c u r e t a g r d f : r e s o u r c e=” h t t p : / / example . o r g / c o n t e x t / web20 ”/> </ s c o t : Coocurrence > <s c o t : C o o c u r r e n c e r d f : about=” h t t p : / / n e t w o r l d . t o /? 
s i o c t y p e= p o s t& ; s i o c i d =207”> <s c o t : c o o c c u r e t a g r d f : r e s o u r c e=” h t t p : / / example . o r g / c o n t e x t / semanticweb ”/> <s c o t : c o o c c u r e t a g r d f : r e s o u r c e=” h t t p : / / example . o r g / c o n t e x t / r e s e a r c h ”/> </ s c o t : Coocurrence > <s c o t : C o o c u r r e n c e r d f : about=” h t t p : / / n e t w o r l d . t o /? s i o c t y p e= p o s t& ; s i o c i d =339”> <s c o t : c o o c c u r e t a g r d f : r e s o u r c e=” h t t p : / / example . o r g / c o n t e x t / semanticweb ”/> 38 <s c o t : c o o c c u r e t a g r d f : r e s o u r c e=” h t t p : / / example . o r g / c o n t e x t / m o b i l e d e v i c e s ”/> </ s c o t : Coocurrence > </ r d f :RDF> B Appendix - Android Client (a) Main Window with Options (b) Settings (c) Settings - FOAF URL (d) Result - Search Term ”AI” Figure 14: Contex-Aware Synonym (CAS) Client 39 Listings 1 2 3 4 5 6 7 8 9 10 11 SPARQL example . . . . . . . . . . . . . . . . . . . . . . . Synset block in raw RDF format. . . . . . . . . . . . . . . . Synonym cloud XSLT for visualization in HTML . . . . . . Excerpt of the context term computer science . . . . . . . . Excerpt of the context term computer science . . . . . . . . Intersection of Synonyms and Static Context with SPARQL Synonym Java Interface . . . . . . . . . . . . . . . . . . . . SynEntry Java Interface . . . . . . . . . . . . . . . . . . . . Context Tag Java Interface . . . . . . . . . . . . . . . . . . Synonyms for AI abstracted as RDF file . . . . . . . . . . . Context abstracted as RDF file . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 10 11 13 13 20 23 24 26 30 33 List of Figures 1 2 3 4 5 6 7 8 9 10 11 12 13 14 Excerpt of the search result AI searched on http://www.google. com (not filtered) . . . . . . . . . . . . . . . . . . . . . . . . . . . Excerpt of the search result AI searched on http://www.google. 
com (filtered) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . The workflow of the computation of context-aware synonyms. . . Excerpt of synset block visuzalized in HTML . . . . . . . . . . . Excerpt of Hypernyms and Hyponyms for the search term artificial intelligence . . . . . . . . . . . . . . . . . . . . . . . . . . . . Excerpt of the sorted context visuzalized in HTML . . . . . . . . Compute the similarity of two Contexts . . . . . . . . . . . . . . Excerpt of context cloud relation. . . . . . . . . . . . . . . . . . . Data Abstraction Visualization . . . . . . . . . . . . . . . . . . . Architecture view from a developer perspective. . . . . . . . . . . Sequence diagram from the perspective of the ComputeCAS class. Use Case diagram that describes what task have to be fulfilled to compute context-aware synonyms . . . . . . . . . . . . . . . . . . Describes the dependencies in the context package. . . . . . . . . Contex-Aware Synonym (CAS) Client . . . . . . . . . . . . . . . 40 6 7 8 11 12 15 16 17 20 22 23 24 25 39 In Acknowledgement This work is part of the m:Ciudad [14] project carried out at the University Innsbruck and is supported by STI Innsbruck and the Telecommunications Research Center Vienna (FTW). References [1] JavaSE 1.6 API. http://java.sun.com/javase/6/docs/api/. [Online; accessed 19-April-2010]. [2] Jena - A Semantic Web Framework for Java. http://jena. sourceforge.net/. [Online; accessed 19-April-2010]. [3] Social Semantic Cloud of Tags. http://scot-project.org/. [Online; accessed 08-April-2010]. [4] T. Berners-Lee, J. Hendler, and O. Lassila. The semantic web. Scientific American, May 2001. [5] Dan Brickley and Libby Miller. The Friend of a Friend (FOAF) project. http://www.foaf-project.org/. [Online; accessed 19-February2010]. [6] Andrew Burton-Jones, Sandeep Purao, and C. Veda Storey. Context-Aware Query Processing on the Semantic Web. [7] W3 Consortium. Simple Knowledge Organization System (SKOS). 
http: //www.w3.org/2004/02/skos/. [Online; accessed 19-February-2010]. [8] J. Danado, M. Davies, P. Ricca, and A. Fensel. An Authoring Tool for User Generated Mobile Services. In Proceedings of the 3rd Future Internet Symposium (FIS’10), 20-22 September 2010 Berlin, Germany, 2010. [9] Gerard de Melo. Lexvo. http://lexvo.org/. [Online; accessed 24February-2010]. [10] Apache Software Foundation. Lucence. http://lucene.apache.org/. [Online; accessed 14-July-2010]. [11] Freie Universität Berlin. TRiG Notation. http://www4.wiwiss. fu-berlin.de/bizer/TriG/. [Online; accessed 23-October-2010]. [12] Google. Android 1.5 - API Level 3. http://developer.android. com/sdk/android-1.5.html. [Online; accessed 19-April-2010]. [13] Google. Dalvik VM Internals. http://sites.google.com/site/io/ dalvik-vm-internals. [Online; accessed 19-April-2010]. [14] ICT (Information and Communication Technlogies). m:Ciudad FP7. http://www.mciudad-fp7.org/. [Online; accessed 19-February2010]. 41 [15] Juha Kolari, Timo Laakko, Hiltunen Tapio, Veikko Ikonen, Minna Kulju, Raisa Suihkonen, Santtu Toivonen, and Tytti Virtanen. Context-aware services for mobile users - technology and user experiences. 2004. [16] Friedemann Mattern and Christian Floekemeier. Vom Internet der Computer zum Internet der Dinge. Informatik Spektrum, 33(2):107–121, April 2010. [17] MIT. MIT Java Wordnet Interface. http://projects.csail.mit. edu/jwi/. [Online; accessed 19-April-2010]. [18] Nokia. TRiX Notation. http://sw.nokia.com/trix/trix.html. [Online; accessed 23-October-2010]. [19] Michael Putcher. Performance Evaluation of WordNet-based Semantic Relatedness Measures for Word Prediction in Conversational Speech. December 2010. [20] Tom Heath. Linked Data - Connect Distributed Data across the Web. http://linkeddata.org/. [Online; accessed 19-February-2010]. [21] Princton University. Wordnet. http://wordnet.princeton.edu/. [Online; accessed 24-February-2010]. 
[22] University Leipzig, Free University Berlin, and OpenLink Software. DBpedia.org. http://dbpedia.org. [Online; accessed 24-October-2010].
[23] W3C. N-Triples Notation. http://www.w3.org/TR/rdf-testcases/#ntriples. [Online; accessed 23-October-2010].
[24] W3C. N3 Notation. http://www.w3.org/DesignIssues/Notation3.html. [Online; accessed 23-October-2010].
[25] W3C. Official Semantic Web Site. http://semanticweb.org. [Online; accessed 04-August-2010].
[26] W3C. RDF Schema. http://www.w3.org/TR/rdf-schema/. [Online; accessed 04-August-2010].
[27] W3C. RDF Syntax Grammar. http://www.w3.org/TR/rdf-syntax-grammar/. [Online; accessed 04-August-2010].
[28] W3C. RDFa Notation. http://www.w3.org/TR/xhtml-rdfa-primer/. [Online; accessed 23-October-2010].
[29] W3C. SPARQL. http://www.w3.org/TR/rdf-sparql-query/. [Online; accessed 23-October-2010].
[30] W3C. Turtle Notation. http://www.w3.org/TeamSubmission/turtle/. [Online; accessed 23-October-2010].
[31] Anna V. Zhdanova, Livia Predoiu, Tassilo Pellegrini, and Dieter Fensel. A Social Networking Model of a Web Community. In Proceedings of the 10th International Symposium on Social Communication, 2007.