Demystifying the Google Knowledge Graph (Part I)
The reveal of Google’s Knowledge Graph has had a ripple effect on the industry. People have taken notice, and for good reason: Google has confirmed a radical change in the way it serves up search results, which, since 1998, has been based on matching keywords to queries (as well as other variables such as the popularity, source, and quality of content).
Actually, the announcement from Google is simply an official coming out. Last February, Mashable was already talking about the Knowledge Graph, while the Wall Street Journal dedicated an article to Google’s efforts. Even I had thrown my hat into the ring from time to time. What is new, however, is the launch of the Google Knowledge Graph in the US. Google users are already seeing suggestions appear in their search results, as pictured below.
Many parties view Google’s announcement as a paradigm shift in the industry and are guilty of confusing the Knowledge Graph with the Semantic Web. In reality, the Semantic Web is a much more complex notion that presumes the interaction of many entities. Google’s contribution is one of democratizing access to structured information through the use of semantically-enriched search results by virtue of their Knowledge Graph. This article, the first of three, aims to demystify this evolution by giving it context, and also to help you better understand the concept of the Semantic Web, the next collaborative movement of the W3C (World Wide Web Consortium). This series is organized into three chapters:
- The Google Knowledge Graph: An Evolution, not a Revolution
- The Contribution of the Knowledge Graph to the Advent of the Semantic Web
- Takeaways and Recommendations: Rethinking Web Strategy
The Google Knowledge Graph: An Evolution, not a Revolution
To fully understand the changes occurring at Google, we must examine their origin. For this, we go back to 2009, when an Austrian company revealed that Google could generate structured information from unstructured data. To refresh your memory, here is a screenshot from 2009 showing Google search results that were considered interesting at the time, yet are, admittedly, fairly trite by today’s standards.
At the time, Google was fully focused on improving the functionality of Google Squared, which was purportedly capable of comparing products on the basis of numerical data to generate a spreadsheet comparing characteristics such as height, width, weight, and price. Below are a few examples.
Certain analysts view the changes at Google, encompassed in the Knowledge Graph, as the logical continuation of their Wonder Wheel project, which was launched in 2009 only to be officially abandoned in 2011 due to the difficulty of its maintenance. We could also look at Google Direct Answers, which underwent a testbed phase in 2006.
To be fair, Google was not the first to implement this kind of technology. Back in 2008, Microsoft purchased Powerset to develop the semantic analysis capabilities of their search engine, Live Search, all in an effort to overthrow Google. Even earlier, Yahoo! launched SearchMonkey, an open-source platform encouraging programmers to help develop ways of interpreting structured data, such as microformats, in order to generate semantically enriched results. The image below pays tribute to this Yahoo! initiative.
Let’s get back to Google. In 2009, Google announced that it would support the Semantic Web standard RDFa, as well as microformats, a move that would later rival the efforts of Microsoft and Yahoo! These promising integrations were the topic of a presentation I gave that same year at the University of Montreal as part of my Master’s in e-Commerce. As such, please allow me to quote myself: “It becomes interesting when we create a FOAF profile using RDF, then integrate it within our website or blog, and finally link it to our Twitter account.” The Semantic Web builds on the W3C’s Resource Description Framework (RDF).
FOAF (Friend Of A Friend) is a machine-readable ontology, based on RDF, designed to describe persons, their activities, their affiliations and their relations in a simple manner. FOAF is used by the Canadian open source social networking and micro-blogging service Identi.ca, as well as FriendFeed, the first notable acquisition made by Facebook in August 2009. Today, its use is much more widespread (see figure below).
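To make the idea of a FOAF profile more concrete, here is a minimal sketch in Python that describes a person as a handful of RDF triples and prints them in the N-Triples syntax. The FOAF terms used (Person, name, homepage, account, knows) come from the FOAF vocabulary; the person, the example.org addresses and the Twitter account are hypothetical, and the helper functions are my own illustration rather than part of any library.

```python
# A FOAF profile is just a set of RDF triples: (subject, predicate, object).
# All example.org URIs and the Twitter account below are hypothetical.

FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def foaf_profile(person_uri, name, homepage, account_uri, knows=()):
    """Return (subject, predicate, object, is_literal) tuples for one person."""
    triples = [
        (person_uri, RDF_TYPE, FOAF + "Person", False),
        (person_uri, FOAF + "name", name, True),
        (person_uri, FOAF + "homepage", homepage, False),
        (person_uri, FOAF + "account", account_uri, False),
    ]
    # Each "knows" link points at another person's URI.
    for friend_uri in knows:
        triples.append((person_uri, FOAF + "knows", friend_uri, False))
    return triples


def to_ntriples(triples):
    """Serialize the triples as N-Triples, one statement per line."""
    lines = []
    for subject, predicate, obj, is_literal in triples:
        if is_literal:
            escaped = obj.replace("\\", "\\\\").replace('"', '\\"')
            rendered = f'"{escaped}"'
        else:
            rendered = f"<{obj}>"
        lines.append(f"<{subject}> <{predicate}> {rendered} .")
    return "\n".join(lines)


profile = foaf_profile(
    person_uri="http://example.org/me#i",
    name="Jane Doe",
    homepage="http://example.org/blog",
    account_uri="https://twitter.com/example",
    knows=["http://example.org/friend#i"],
)
ntriples = to_ntriples(profile)
print(ntriples)
```

Published on a blog or website, statements like these are what allow a machine to connect a person, a homepage and a social account without having to guess from the surrounding text.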
The Google Knowledge Graph is the fruit of years of compiling search results and user actions. It also leverages already structured information sources such as Wikipedia (Google’s object of admiration), Freebase (acquired by Google in 2010 along with its developer, Metaweb), IMDb (the Internet Movie Database) and many others. Add to this all the ideas, notions and concepts that Google was able to collect via different pages marked up with microformats, RDF, or more recently, Schema.org. I should note that the three major search engine providers have united since June 2011 to support Schema.org, which definitely contributes to it being my favourite. Furthermore, Schema.org is behind the technology deployed by the Indian company Global Logic, if we are to believe this job listing posted on Facebook that calls for a specialist in Google Knowledge Graph. It is interesting to note that among Global Logic’s listed clients, one finds companies like Google, Microsoft, Yahoo, LinkedIn and Stryker.
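As an illustration of what a page marked up with Schema.org can offer a crawler, the sketch below reads a small HTML fragment annotated with microdata and pulls out the type of the item and its properties. It is only a toy built on Python's standard html.parser module, assuming a flat item with no nesting; the organization, its address and its phone number are invented, and nothing here is meant to describe how Google actually processes pages.

```python
from html.parser import HTMLParser

# Hypothetical page fragment annotated with Schema.org microdata.
SAMPLE_HTML = """
<div itemscope itemtype="http://schema.org/Organization">
  <span itemprop="name">Example Widgets</span>
  <a itemprop="url" href="http://example.org/">Website</a>
  <span itemprop="telephone">555-0100</span>
</div>
"""


class MicrodataCollector(HTMLParser):
    """Collect the item type and itemprop values of a single, flat item."""

    def __init__(self):
        super().__init__()
        self.item_type = None
        self.properties = {}
        self._pending_prop = None  # property waiting for its text content

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # The first itemscope element tells us what kind of thing is described.
        if "itemscope" in attributes and self.item_type is None:
            self.item_type = attributes.get("itemtype")
        prop = attributes.get("itemprop")
        if prop is None:
            return
        if "href" in attributes:
            # For links, the property value is the target URL.
            self.properties[prop] = attributes["href"]
        else:
            # Otherwise the value is the text that follows the tag.
            self._pending_prop = prop

    def handle_data(self, data):
        text = data.strip()
        if self._pending_prop and text:
            self.properties[self._pending_prop] = text
            self._pending_prop = None


collector = MicrodataCollector()
collector.feed(SAMPLE_HTML)
collector.close()
print(collector.item_type)
print(collector.properties)
```

The point is simply that the markup states outright that this is an organization with a name, a URL and a telephone number, so no guessing from keywords is required.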
Of course, the continual presence of social interactions leads to a deeper understanding of these relationships. This is all the more true given that Google has a vested interest in increasing the adoption rate of the Google Plus platform. Take a Facebook profile whose owner is friends with the author of a blog that, in turn, is connected with a website that contains Schema information (organization and place, for example). This website could be connected to a Twitter account that publishes a series of updates. Google then has the possibility of creating a series of web entities that, when linked, not only reveal the identity of the Facebook profile, but also reveal the profile’s relationship to the author of the blog, or even the organization behind the Twitter account. Each entity is linked by Schemas that rely on standard Semantic Web languages (RDF), just as predicted by Tim Berners-Lee when he spoke of Linked Data during his TED talk in February 2009. We are now at a stage to discuss the Semantic Web, also known as Web 3.0.
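The chain of relationships described above can be sketched as a small graph of linked entities. In the Python example below, every identifier and relation name is hypothetical and chosen only to mirror the scenario (a Facebook profile, a blog author, a blog, a website, an organization and a Twitter account). A breadth-first search then recovers how one entity is connected to another, which is the kind of inference linked data makes possible.

```python
from collections import deque

# Hypothetical entities and relations mirroring the scenario in the text.
EDGES = [
    ("facebook:profile", "friendOf", "person:blog-author"),
    ("person:blog-author", "authorOf", "site:blog"),
    ("site:blog", "linksTo", "site:company"),
    ("site:company", "describes", "org:example"),
    ("site:company", "hasAccount", "twitter:example"),
]


def build_graph(edges):
    """Index the triples by entity, storing each link in both directions."""
    graph = {}
    for subject, predicate, obj in edges:
        graph.setdefault(subject, []).append((predicate, obj))
        graph.setdefault(obj, []).append((f"inverse({predicate})", subject))
    return graph


def find_path(graph, start, goal):
    """Breadth-first search: return the shortest chain of links, or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for predicate, neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, path + [(node, predicate, neighbor)]))
    return None


graph = build_graph(EDGES)
path_to_org = find_path(graph, "facebook:profile", "org:example")
for subject, predicate, obj in path_to_org:
    print(f"{subject} --{predicate}--> {obj}")
```

None of the individual statements says that the Facebook profile is related to the organization; the relationship only appears once the entities are linked together.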
In Part 2 of this series, I will delve deeper into the Semantic Web. We will see more precisely how the Google Knowledge Graph fits into the general framework of Web 3.0 and how semantic search, which is of considerable value to Google, has been made possible by leveraging and exploiting open databases.
*This article reflects my personal opinion and does not necessarily represent the position of Mediative.