Semantic Web and Semantic Technologies: Some Relevant Applications

Photo illustration by Aurich Lawson (Ars Technica)

Document Date: 13 August 2012
Version: 0.2
Contact: [email protected]

Récipiendaire / Recipient: Lucie L. Séguin, Director General, Strategic Research and Policy Branch
Auteurs / Authors: Mireille Miniggio, Director, Strategic Research Division; Zeïneb Gharbi, Strategic Research Analyst, Strategic Research Division

Ce Rapport technique a été rédigé par la Division de la Recherche stratégique et est destiné à un lectorat interne de BAC.
This Technical Report was prepared by the Strategic Research Division and is intended for an internal readership at LAC.

Table of Contents

Introduction
Semantic Web and Linked Data Concepts
    What is Linked Data?
    Linked Data and Linked Open Data
Graphing the Web: Google, Microsoft and Yahoo!
    Yahoo! MicroSearch
    Google Knowledge Graph
    Microsoft's Satori
Linked Data in Library and Archives Domains
    Europeana
    National Libraries Experimenting with Linked Open Data
    OCLC WorldCat Linked Data and Other Library Tools
Library and Archives Canada's Linked Data Activities
Conclusion
References
Appendix I – Other Relevant Applications of Semantic Web Technologies

Introduction

Semantic Web technology is slowly changing the way people use the Internet. Semantic technology is approaching general adoption, and Google's recent launch of the Knowledge Graph platform is one manifestation of the technology's mainstream appeal. There is growing interest in the field of library and information science in applying Semantic Web and Linked Data technologies to improve users' search experience. Library and Archives Canada (LAC) is among the memory institutions around the globe that are aware of the impact such technologies may have on the development of their policies and tools, and on how LAC will build its services to make its collections more discoverable.
The following report provides an overview of the Semantic Web, its potential uses, and its current applications in the search engine industry as well as in memory institutions around the world. The report first introduces key Semantic Web concepts. Second, it covers search engine developments in semantic search techniques, as well as some relevant applications of semantic technologies in selected memory institutions (e.g. national libraries). Finally, the report presents the Semantic Web activities currently undertaken by LAC and the recent Pan-Canadian Documentary Heritage Network project. The report is complemented by an appendix listing additional relevant applications of Semantic Web technologies.

Semantic Web and Linked Data Concepts

The Semantic Web is, in essence, a Web 3.0 technology: a way of linking data between systems or entities that allows rich, self-describing interrelations of data available across the globe on the Web. It marks a shift in thinking from publishing data in human-readable HTML documents to publishing machine-readable documents. The Web contains a great quantity of information, but typically the raw data itself is not available. The Semantic Web seeks to change this in a number of ways:

- Opening up the web of data to artificial intelligence processes (getting the Web to do a bit of thinking for us);
- Encouraging companies, organizations and individuals to publish their data freely, in an open standard format;
- Encouraging businesses to use data already available on the Web (data give/take).

According to a key article written in 2001 by Tim Berners-Lee et al., the Semantic Web is "a new form of Web content that is meaningful to computers" that "will unleash a revolution of new possibilities". It is designed so that computer programs can manipulate information meaningfully. It is an extension of the current web (Web 1.0) in which information is given well-defined meaning, better enabling computers and people to work in cooperation. Like the Internet, the Semantic Web will be as decentralized as possible.

For the Semantic Web to function, computers must have access to structured collections of information and to sets of inference rules that they can use to conduct automated reasoning. Knowledge representation is for the Semantic Web what hypertext was for the Web. Two technologies are especially important for the Semantic Web: Extensible Markup Language (XML) and the Resource Description Framework (RDF). XML allows users to add arbitrary structure to their documents by creating their own tags, but says nothing about what the structures mean. Meaning is expressed by RDF, in which a document makes assertions that particular things (people, webpages, etc.) have properties with certain values. It is a natural way to describe the vast majority of the data processed by machines.

For a program to compare or combine information across two databases, it has to know when different terms are being used to mean the same thing. The program must have a way to discover such common meanings. One solution is another basic component of the Semantic Web: collections of information called ontologies. In this context, an ontology is "a document or file that formally defines the relations among terms". The most typical kind of ontology for the Web has a taxonomy and a set of inference rules. The taxonomy defines classes of objects and the relations among them.
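To make these ideas concrete, the sketch below uses the Python rdflib library (an assumed toolkit; the report does not prescribe one) to assert a few RDF triples and one small taxonomy statement about a hypothetical resource. Every URI and value is invented for illustration.

    # A minimal sketch of RDF statements (triples) and a tiny taxonomy,
    # written with the Python rdflib library. All URIs and values here are
    # hypothetical and chosen only for illustration.
    from rdflib import Graph, Literal, Namespace
    from rdflib.namespace import RDF, RDFS

    EX = Namespace("http://example.org/")  # placeholder namespace

    g = Graph()
    g.bind("ex", EX)

    # A two-class taxonomy: every Scientist is also a Person.
    g.add((EX.Scientist, RDFS.subClassOf, EX.Person))

    # Each add() call asserts one triple: (subject, predicate, object).
    curie = EX["person/marie-curie"]
    g.add((curie, RDF.type, EX.Scientist))
    g.add((curie, RDFS.label, Literal("Marie Curie")))
    g.add((curie, EX.fieldOfWork, Literal("Physics")))

    # Serializing shows the same statements as machine-readable Turtle.
    print(g.serialize(format="turtle"))

A reasoner supplied with the subclass rule could infer that the resource is also a Person, which is the kind of automated inference the Semantic Web is designed to support.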
"Properly designed, the Semantic Web can assist the evolution of human knowledge as a whole" (Tim Berners-Lee et al., 2011). In naming every concept by a URI (Uniform Resource Identifier), 3|Page the Semantic Web lets anyone express new concepts that they invent with minimal effort. Its unifying logical language will enable these concepts to be progressively linked into a universal Web. This structure will open up the knowledge and workings of humankind to meaningful analysis by software agents, providing a new class of tools allowing an enhanced user experience. In other words, this will take all that information published in HTML documents in different places, and allowing the description of models of data that allow it all to be treated and researched as if it were one database. The benefits to automated research of drawing from all the data humanity has to offer on the internet in comparison to today's tools and software, are tremendous. In addition to the classic “Web of documents” World Wide Web Consortium (W3C) is helping to build a technology stack to support a “Web of data”, the sort of data that can be founded in databases. The ultimate goal of the Web of data is to enable computers to do more useful work and to develop systems that can support trusted interactions over the network. The term “Semantic Web” refers to W3C’s vision of the Web of Linked Data. Semantic Web technologies enable people to create data stores on the Web, build vocabularies, and write rules for handling data. Linked data are empowered by technologies such as RDF, SPARQL (which is an RDF query language), OWL (Web Ontology Language), and SKOS (Simple Knowledge Organization System). The data storage model for the Semantic Web is the graph database (a network), rather than a hierarchical or relational database. This storage model constitutes a paradigm-shift in storing information on the web. Traditionally, data is structured either in a hierarchy (for example XML) or in a relational database (for example MySQL). In the Semantic Web, RDF statements (triples) define data graphs. Figure 1 shows the difference relational, hierarchical and graph databases. In a data graph, there is no concept of roots. A graph consists of resources related to other resources, with no single resource having any particular intrinsic importance over another. Figure 1 – A graph database (Source: Linked Data Tools: Introducing Graph Data) While the Web of documents is constituted linking different documents together with hyperlinks, the emergence of the Web of data is the result of connecting individual bits of data together with RDF triples, which express the relationship between these bits of data. Linked Data is no more 4|Page complex than this connecting related data across the Web using URIs, HTTP and RDF (Heath (2009). An RDF triple (see Figure 2) contains three components: the subject, the predicate (also known as the property of the triple), and the object. Figure 2 – A Triple (Source: Linked Data in Libraries) What is Linked Data? Like the web of hypertext, the web of data is constructed with documents on the web. However, unlike the web of hypertext, where links are relationships anchors in hypertext documents written in HTML, for the web of data, the links between arbitrary things are described by RDF (See Figure 3). Four rules are essential for Linked Data (Berners-Lee, 2006): 1. The 1st rule is to use URIs as names of things. 2. The 2nd rule is to use HTTP URIs so people can use those names. 
(The only deviation has been a constant tendency for people to invent new URI schemes, such as LSIDs (Life Sciences Identifiers), handles, XRIs (Extensible Resource Identifiers) and DOIs (Digital Object Identifiers).)
3. When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL).
4. Include links to other URIs, so that users can discover more things.

Figure 3 – Linked Data (Source: Linked Data in Libraries)

The Web enables people to link related documents. Similarly, it enables them to link related data. The term Linked Data refers to a set of best practices for publishing and connecting structured data on the Web. Key technologies that support Linked Data are URIs (a generic means to identify entities or concepts in the world), HTTP (a simple yet universal mechanism for retrieving resources, or descriptions of resources), and RDF (a generic graph-based data model with which to structure and link data that describes things in the world) (Heath, 2009).

Linked Data and Linked Open Data

Not all Linked Data need be Linked Open Data. The label "Linked Open Data" is widely used, but often to refer to Linked Data in general rather than to Linked Data that is explicitly published under an open licence. Not all Linked Data will be open, and not all Open Data will be linked. Tim Berners-Lee (2006) defined Linked Open Data (LOD) as Linked Data that is released under an open licence which does not impede its free reuse. Creative Commons CC-BY is one example of such a licence, as are the Open Data Commons ODC-By licence and the UK's Open Government Licence.

Graphing the Web: Google, Microsoft and Yahoo!

Yahoo! MicroSearch

According to Yahoo! (2008), MicroSearch is a richer search experience that combines traditional search results with metadata extracted from webpages. MicroSearch aims to enrich Yahoo! Search in three ways: a) by showing smart snippets that summarize the metadata inside a page and allow a user to take action without actually visiting the page; b) by showing map and timeline views that aggregate metadata from various pages; and c) by showing pages related to the current result.

Google Knowledge Graph

On May 16, 2012, Google launched in the United States its next-generation search technology called the "Knowledge Graph" (KG), available for the English version of Google.com. The KG connects search queries to Google's knowledge base of over 500 million people, things, and places to show the user relevant information. This knowledge base also draws on Wikipedia and the CIA World Factbook. The new tool allows Google to go beyond keyword relevance to provide answers. With the KG, Google can narrow search results by distinguishing between words with different meanings. Alongside the search results, users will find related facts, which Google pulls based on what others have searched for before (see Figure 4). For a practical explanation, see Google's video introducing the KG (on YouTube).

Figure 4 – Google Search results with Knowledge Graph for "Marie Curie"

According to Google, the KG impacts Google Search in three main ways:

1. Find the right thing: This feature helps the search engine return exactly the results users want. Google will be able to recognize entities and thus produce more relevant information in searches.
2. Get the best summary: Users will get a richer summary on the results page itself.
If users search for a prominent person's date of birth, they will no longer have to visit Wikipedia or other sites; Google will show a short summary about the person directly on its search page.
3. Go deeper and broader: Google search will become a deeper and broader experience with the Knowledge Graph update. Search results will surface unexpected information related to the keyword. If users search for a book, they will get details about other books in the same category or that won the same award.

Google's realization of semantic search is another demonstration of the limits of keyword search, which consists of matching keywords to queries, and of the shift to a more intelligent model: a graph that "understands" real-world entities and their relationships to one another – "things, not strings", as stated on Google's official blog. Google's goal is to make semantic search the new norm. On August 8, 2012, Google announced an expansion and redesign of the KG with a new navigation experience. As shown in Figure 5, the KG now allows users to browse visual results at the top of their text results for thousands of search queries (museums, towns, people, animals, monuments, etc.).

Figure 5 – Google Search visual browsable results with the KG for "things to do in Paris"

Microsoft's Satori

Like Google's KG, Microsoft's Satori is a graph-based repository that comes out of Microsoft Research's Trinity graph database and computing platform (see Figure 6). It uses RDF and the SPARQL query language, and it was designed to handle billions of RDF triples. For a sense of scale, the 2010 US Census in RDF form has about one billion triples. Satori extracts data from the unstructured information on webpages to create a structured database of the "nouns" of the Internet: people, places, things, and the relationships between them all. The entities in both the KG and Satori are essentially semantic data objects, each with a unique identifier, a collection of properties based on the attributes of the real-world topic they represent, and links representing the topic's relationships to other entities. They also include actions that someone searching for that topic might want to take.

Figure 6 – An overview of the system architecture of Microsoft Research's Trinity, the basis of the Satori engine (Source: Microsoft Research)

Linked Data in Library and Archives Domains

Europeana

Data.europeana.eu is an ongoing effort to make Europeana metadata available as Linked Open Data on the Web. It allows others to access metadata collected from Europeana data providers via standard Web technologies. The data are represented in the Europeana Data Model (EDM), and the described resources are addressable and dereferenceable by their URIs. Links between Europeana resources and other resources in the Linked Data Web will enable the discovery of semantically related resources. The approach developed allows data providers to opt in to having their data become Linked Data; their metadata is converted to the EDM, benefiting from Europeana's efforts to link it to semantically related resources on the Web. With this approach, a first Linked Data version of Europeana was produced and the resulting datasets were published on the Web. Experience was gained with respect to the EDM, HTTP URI design, and RDF store performance. More details are available on the W3C Use Case Europeana webpage.
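As an illustration of what "addressable and dereferenceable by their URIs" means in practice, the sketch below fetches an RDF description of a resource over HTTP and queries it with SPARQL, again using the Python rdflib library. The URI is a placeholder rather than a real Europeana identifier, and real services may differ in the RDF serializations and query endpoints they offer.

    # A sketch of dereferencing a Linked Data URI and querying the result
    # with SPARQL, using rdflib. The URI below is a placeholder, not a real
    # Europeana identifier.
    from rdflib import Graph

    uri = "http://data.example.org/item/12345"  # hypothetical resource URI

    g = Graph()
    # parse() fetches the URI over HTTP and loads whatever RDF
    # representation the server returns (e.g. RDF/XML or Turtle).
    g.parse(uri)

    # List every property and value asserted about the resource.
    query = """
        SELECT ?property ?value
        WHERE { <http://data.example.org/item/12345> ?property ?value . }
    """
    for prop, value in g.query(query):
        print(prop, value)

In a production setting one would typically send an explicit Accept header (content negotiation) or query a SPARQL endpoint directly, but the principle is the same: the URI both names the resource and gives access to data about it.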
National Libraries Experimenting with Linked Open Data

The highly structured data contained in library catalogues are usually available only to clients using search-and-retrieve protocols such as Z39.50. Exposing library data as Linked Data would let libraries be part of a larger movement that aims to accelerate innovation by opening up data silos and making the data available to agents outside the library sector. The following examples demonstrate the use of Linked Open Data technologies by national libraries to expose their catalogue content to search engines more efficiently and to share that content with other libraries.

British Library (BL): In July 2011, the BL announced a significant contribution to the development, application and sharing of bibliographic data, using Linked Data techniques and technologies to publish the British National Bibliography (BNB) of 2.8 million titles. Figure 7 below presents the BL data model. Of note, Linked Open BNB triples are available via the BL's Free Data Services.

Figure 7 – British Library Data Model: Overview (2011)

Bibliothèque nationale de France (BnF): data.bnf.fr gathers data from the different databases of the Bibliothèque nationale de France to create Web pages about works and authors, together with an RDF view of the extracted data. There are about 2,000,000 RDF triples, with links to id.loc.gov for languages and nationalities, to dewey.info for subjects, and to the DCMI Type vocabulary for types. The BnF uses the SKOS, FOAF, DC and RDA vocabularies, in an FRBR model.

Library of Congress (LC): The LC has released Linked Data versions of its schemes and vocabularies through the LC Linked Data Service: Authorities and Vocabularies.

German National Library (DNB): In 2010 the DNB started publishing authority data as Linked Data. Its existing Linked Data service is now being extended with title data, and in this context the licence for the Linked Data has shifted to Creative Commons Zero. The RDF/XML representation of a title record is available in the DNB portal. This is an experimental service that will be extended and refined continually.

National Library of Sweden (NLS): The NLS created a Linked Data interface to LIBRIS, the Swedish Union Catalogue, to experiment with RDF by mapping MARC 21 and linking its datasets to other datasets. The NLS believes that a "data first" approach is better than a "perfect metadata first" approach.

Spanish National Library (BNE): The Ontology Engineering Group announced the launch of datos.bne.es (February 2012), an open initiative aimed at enriching the Web of Data with library data from the Spanish National Library. RDF generation from MARC 21 records was done using MARiMbA, a tool that allows non-technical users to work on mappings from MARC 21 metadata to RDF using different RDFS/OWL vocabularies. This initiative is part of the project "Linked Data at the BNE", supported by the BNE in cooperation with the Ontology Engineering Group (OEG) at the Universidad Politécnica de Madrid (UPM).

OCLC WorldCat Linked Data and Other Library Tools

In June 2012, OCLC (Online Computer Library Center), a non-profit computer library service and research organization, added Linked Data to WorldCat.org pages by appending Schema.org markup, creating the largest set of linked bibliographic data on the web.
With the addition of Schema.org mark-up to all book, journal and other bibliographic resources in WorldCat.org, the entire publicly available version of WorldCat is now available for use by intelligent web crawlers, like Google and Bing, that can make use of this metadata in search indexes and other applications. Of note, Schema.org is one of the main incubators of semantic standards based on RDF data that is being embedded in a growing number of Web pages. The Schema.org initiative, launched in 2011 by Google, Bing and Yahoo! and later joined by Yandex, provides a core vocabulary for markup that helps search engines and other web crawlers make more direct use of the underlying data. Companies that wish to take advantage of semantic technology can use the schema.org site to discover the vocabularies they need to include in their Web pages to benefit from natural language search.

Adding Linked Data to WorldCat records makes those records more useful – especially to search engines, developers and services on the wider Web, beyond the library community. This will make it easier for search engines to connect non-library organizations to library data. Eric Miller, President of Zepheira, a professional services company assisting OCLC with its Linked Data strategy, argued that "Libraries generate, maintain and improve an enormous amount of high-quality data that is valuable well beyond traditional library boundaries. By operating as a kind of switchboard to and from other data-driven resources, WorldCat data can better connect students, scholars and businesspeople to library resources."

OCLC has previously released Dewey.info, an experimental space for linked Dewey Decimal Classification (DDC) data. All assignable classes from DDC 23 have been released as Dewey Linked Data. In terms of licensing, all data remain reusable under the same terms (Creative Commons BY-NC-ND), and the licence is carried in each record. OCLC has also made available as Linked Data the VIAF (Virtual International Authority File), a joint project that explores virtually combining the name authority files of participating institutions – a merger of nearly 20 national-level name authority files – into a single name authority service.

Library and Archives Canada's Linked Data Activities

In 2011, the W3C published the final report of its Library Linked Data Incubator Group, encouraging libraries, archives and museums (LAMs) to share the rich stores of metadata they have developed over past decades and to encourage global information interoperability in new ways. The report includes four recommendations for LAM heritage institutions wishing to open their data to the world.
The key recommendations are:

- That library leaders identify sets of data as possible candidates for early exposure as Linked Data and foster a discussion about Open Data and rights;
- That library standards bodies increase library participation in Semantic Web standardization, develop library data standards that are compatible with Linked Data, and disseminate best-practice design patterns tailored to library Linked Data;
- That data and systems designers design enhanced user services based on Linked Data capabilities, create URIs for the items in library datasets, develop policies for managing RDF vocabularies and their URIs, and express library data by re-using or mapping to existing Linked Data vocabularies;
- That librarians and archivists preserve Linked Data element sets and value vocabularies, and apply library experience in curation and long-term preservation to Linked Data datasets.

Encouraged by recent advances in web development and by the Government of Canada (GC) Open Data Portal, LAC has produced four major Linked Open Datasets over the past year:

1. The GC Core Subject Thesaurus (French and English)
2. Maps, plans and charts of Canada
3. Soldiers of the First World War (Canadian Expeditionary Force)
4. The Canadian Subject Headings

LAC will soon be publishing the Canada Gazette as Linked Open Data. In addition, LAC is contributing to the Virtual International Authority File (VIAF) – available on the web in RDF/XML format – by adding the Canadiana Name Authorities, representing 850,000 Canadian authors and organizations. Working collaboratively with five institutions within the Pan-Canadian Documentary Heritage Network (PCDHN), LAC rendered the metadata for First World War digital resources as Linked Open Data and developed an experimental visualization application in support of the project. This project, which has garnered worldwide attention, is presented briefly in the next section.

The PCDHN Proof-of-Concept

Partners in the PCDHN have undertaken an interesting project: a proof-of-concept showcase of Linked Open Data visualizations for Out of the Trenches (see Figure 8). This is a look at the First World War from the Canadian perspective: war songs, postcards, newspapers, photos and films, and these resources' intersection with the Canadian soldiers who fought in the war. The underlying premise was to expose the metadata for these resources using RDF/XML and existing, published ontologies and vocabularies, maximizing discovery by a broad user community.

Figure 8 – Out of the Trenches Project: Linked Open Data of the First World War (2012)

Digital resources from organizations such as McGill University, the Universities of Alberta, Calgary and Saskatchewan, and the Bibliothèque et Archives nationales du Québec (BAnQ) have been linked through existing metadata provided in formats ranging from spreadsheets to MODS (Metadata Object Description Schema) XML to RDF. Rather than reduce the metadata to a common subset, the approach was to maximize its use by moving to the web-of-data concept, so that the resources can be combined in different and unexpected ways. The approach to this proof-of-concept was to make use of existing metadata about the resources and repurpose it without loss of context and meaning.
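For illustration only, the sketch below (Python, rdflib) shows how metadata for a single, invented digitized item might be repurposed as RDF/XML using an existing published vocabulary (Dublin Core terms). The actual PCDHN mappings, ontologies and URIs are more elaborate and are documented in the project's final report.

    # A hedged sketch of repurposing item metadata as RDF/XML with an
    # existing vocabulary (Dublin Core terms), using rdflib. The resource
    # URI and values are invented and are not actual PCDHN data.
    from rdflib import Graph, Literal, Namespace
    from rdflib.namespace import DCTERMS

    EX = Namespace("http://example.org/pcdhn/")  # placeholder namespace

    g = Graph()
    g.bind("dcterms", DCTERMS)

    postcard = EX["postcard/0001"]  # hypothetical resource URI

    g.add((postcard, DCTERMS.title, Literal("Postcard from the Front")))
    g.add((postcard, DCTERMS.type, Literal("Postcard")))
    g.add((postcard, DCTERMS.date, Literal("1917")))
    g.add((postcard, DCTERMS.subject, Literal("First World War")))

    # RDF/XML was the serialization used in the proof-of-concept.
    print(g.serialize(format="xml"))

Because the description is a graph rather than a fixed record, the same resource can later be linked to soldiers, places or events described elsewhere without altering the original metadata.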
Next steps include deciding the order in which to take the work beyond the proof of concept: developing visualizations for the remaining dimensions; delivering cross-browser functionality; adding more intelligence to the parsing of query results; making it possible to retrieve and filter by multiple dimensions; creating geographic and timeline visualizations of events and their related resources; and adding resources and metadata from other interested parties, among other efforts. With the goal of ensuring widespread usage, the PCDHN partners have agreed that the project metadata may be used under the terms of the Open Data Commons Public Domain Dedication and License (PDDL).

Conclusion

With the proliferation of key players in the field, and experiments being conducted by many memory institutions, the field is beginning to mature. The benefits of Semantic Web technologies are substantial for an institution like LAC. First, these technologies allow LAC to "push out" its own data to users, rather than having them "pull out" what they need from LAC's collections, and to expose LAC's data to search engines and intelligent agents on the Web as well as to other interested institutions and partners. Second, they allow LAC to communicate what is unique in its collections while benefiting from other institutions' treasures. Linked Open Data is undoubtedly the way forward in the world of libraries and archives, as in many other areas benefiting from a web environment. There are different approaches to using semantic search technologies; LAC needs to define and establish its own criteria for choosing the approach best suited to its holdings, in order to meet the expectations of Canadians in today's digital environment.

References

Bergman, Mike (2012). The Rationale for Semantic Technologies. http://www.mkbergman.com/1015/the-rationale-for-semantic-technologies/

Berners-Lee, Tim (2006). Linked Data Principles. http://www.w3.org/DesignIssues/LinkedData.html

Berners-Lee, Tim; Hendler, James; and Lassila, Ora (2001). "The Semantic Web: A new form of Web content that is meaningful to computers will unleash a revolution of new possibilities." Scientific American 284 (5), May 2001. http://www-sop.inria.fr/acacia/cours/essi2006/Scientific%20American_%20Feature%20Article_%20The%20Semantic%20Web_%20May%202001.pdf

Cheng, Gong; Ge, Weiyi; Wu, Honghan; and Qu, Yuzhong (2008). Searching Semantic Web Objects Based on Class Hierarchies. LDOW 2008. http://events.linkeddata.org/ldow2008/papers/12-cheng-ge-searching-semantic-web-objects.pdf

Europeana. Europeana Linked Open Data (LOD). http://pro.europeana.eu/linked-open-data

Europeana. Towards a new Europeana Data Exchange Agreement. http://pro.europeana.eu/support-for-open-data

Feigenbaum, Lee et al. (2009). "The Semantic Web in Action: Corporate applications are well under way, and consumer uses are emerging." Scientific American, January 19, 2009. http://www.scientificamerican.com/article.cfm?id=semantic-web-in-action

Feigenbaum, Lee; Herman, Ivan; Hongsermeier, Tonya; Neumann, Eric; and Stephens, Susie (2007). "The Semantic Web in Action." Scientific American, vol. 297, December 2007, pp. 90-97. http://thefigtrees.net/lee/sw/sciam/semantic-web-in-action#single-page

Gallagher, Sean (2012). "How Google and Microsoft taught search to 'understand' the Web." June 6, 2012. http://arstechnica.com/information-technology/2012/06/inside-the-architecture-of-googles-knowledge-graph-and-microsofts-satori/

Haller, Armin and Ratcliffe, David (2012). Semantic Web 101 (tutorial). Canberra Semantic Web Meetup, Canberra, Australia.
April 23, 2012. (PowerPoint presentation). http://www.slideshare.net/RatcliffeDavid/semantic-web-101

Haslhofer, Bernhard and Isaac, Antoine (2011). The Europeana Linked Open Data Pilot. http://dcevents.dublincore.org/index.php/IntConf/dc-2011/paper/view/55/14

Herman, Ivan (2012). A short introduction to Semantic Web technologies. June 8, 2012. Oracle, Redwood City, CA, USA. http://www.w3.org/2012/Talks/0608-Oracle-IH/Talk.pdf

Koster, Lukas (2009). Linked Data for Libraries. June 19, 2009. http://commonplace.net/2009/06/linked-data-for-libraries/

Sheykh Esmaili, Kyumars and Abolhassani, Hassan. A Categorization Scheme for Semantic Web Search Engines (CiteSeer website). https://citeseer.ist.psu.edu/myciteseer/login

Malmsten, Martin (2012). "Cataloguing in the open – the disintegration and distribution of the record." Italian Journal of Library and Information Science. http://leo.cilea.it/index.php/jlis/article/view/5512

Misener, Dan (2012). "Giving meaning to the Semantic Web." The Globe and Mail, May 31, 2012. http://www.theglobeandmail.com/report-on-business/small-business/sb-digital/web-strategy/giving-meaning-to-the-semantic-web/article4217489/

Misener, Dan (2012). "Semantic Web offers challenges and opportunity." The Globe and Mail, June 7, 2012. http://www.theglobeandmail.com/report-on-business/small-business/sb-digital/web-strategy/semantic-web-offers-challenges-and-opportunity/article4234723/

Myers, Anthony (2012). "How Semantic Technology Will Be Used During the Olympics #semtech." June 7, 2012. http://www.cmswire.com/cms/customer-experience/how-semantic-technology-will-be-used-during-the-olympics-semtech-015960.php

OCLC WorldCat Linked Data Release – Significant In Many Ways. http://dataliberate.com/2012/06/oclc-worldcat-linked-data-release-significant-in-many-ways/

Pan-Canadian Documentary Heritage Network (PCDHN) Linked Open Data (LOD) Visualization "Proof-of-Concept". "Out of the Trenches: A Linked Open Data Project". http://www.canadiana.ca/sites/pub.canadiana.ca/files/PCDHN%20Proof-of-concept_FinalReport-ENG_0.pdf

W3C. Semantic Web. http://www.w3.org/standards/semanticweb/

W3C. Semantic Web Search Engines. http://www.w3.org/wiki/TaskForces/CommunityProjects/LinkingOpenData/SemanticWebSearchEngines

Web 2.0 Summit 09: Tim Berners-Lee and Tim O'Reilly, A Conversation… (video). http://www.youtube.com/watch?v=KY5skobffk0

Weber, Harrison (2012). "Bing challenges Google's Knowledge Graph." http://thenextweb.com/microsoft/2012/06/07/bing-challenges-googles-knowledge-graph-with-new-britannica-encyclopedia-partnership/

Appendix I – Other Relevant Applications of Semantic Web Technologies

Semantic Web technologies are being used in many different domains, including the following:

DBpedia is an effort to publish structured data extracted from Wikipedia. The data is published in RDF and made available on the Web for use under the GNU Free Documentation License, allowing Semantic Web agents to provide inferencing and advanced querying over the Wikipedia-derived dataset and facilitating interlinking, re-use and extension in other data sources.

The BBC's 2012 Summer Olympics website takes advantage of RDF data and a dynamic semantic publishing architecture.
According to Jem Rayfield, a senior technical architect in the BBC Future Technology division, the BBC expects 10 million page views a day across the 10,000 Web pages of the Olympics site. This will no doubt be the largest semantic technology implementation yet for a media site. The BBC used a somewhat similar system for its 2010 World Cup website, but is now adding systems from fluid Operations on the content-workflow side and from Ontotext, a leader in semantic technology, on the database side. The 2012 Olympics website uses Linked Data to allow a handful of journalists to populate thousands of Web pages with dynamic content.

The Friend of a Friend (FOAF) project is creating a Web of machine-readable pages describing people, the links between them, and the things they create and do; it is a contribution to the linked information system known as the Web. FOAF defines an open, decentralized technology for connecting social websites and the people they describe. FOAF permits intelligent agents to make sense of the thousands of connections people have with each other, their jobs and the items important to their lives – connections that may or may not be enumerated in searches using traditional web search engines. A minimal FOAF sketch appears at the end of this appendix.

Harper's Magazine has harnessed semantic ontologies on its website to present annotated timelines of current events that are automatically linked to articles about concepts related to those events.

Garlik has been innovating at the forefront of industrial-strength Semantic Web technologies, using them to organise, store and query the vast amount of personal information found in the public domain. Garlik merges personal information databases to create a unified view of online identity.

Nextbio is a database consolidating high-throughput life sciences experimental data, tagged and connected via biomedical ontologies and accessible via a search engine interface. Researchers can contribute their findings for incorporation into the database. The database currently supports gene and protein expression data and sequence-centric data, and is steadily expanding to support other biological data types.
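As noted above, here is a minimal FOAF sketch in Python with rdflib, describing two invented people and the link between them; all names and URIs are hypothetical.

    # A minimal FOAF sketch using rdflib: two hypothetical people and a
    # foaf:knows link between them. All names and URIs are invented.
    from rdflib import Graph, Literal, Namespace
    from rdflib.namespace import FOAF, RDF

    EX = Namespace("http://example.org/people/")  # placeholder namespace

    g = Graph()
    g.bind("foaf", FOAF)

    alice = EX["alice"]
    bob = EX["bob"]

    g.add((alice, RDF.type, FOAF.Person))
    g.add((alice, FOAF.name, Literal("Alice Example")))
    g.add((bob, RDF.type, FOAF.Person))
    g.add((bob, FOAF.name, Literal("Bob Example")))
    g.add((alice, FOAF.knows, bob))  # the social link that agents can follow

    print(g.serialize(format="turtle"))

An agent crawling such descriptions can merge graphs published on different sites, because the same URIs identify the same people wherever they appear.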