Web 3.0: "Semantic Web" – navigating in an ocean of data

Even before the interactive "Web 2.0" was established with its social networks such as Facebook and Twitter, ways had been conceived of making semantic links, in other words meaningful connections, visible in a "Web 3.0". But despite some promising attempts, fifteen years later the "Semantic Web" is still waiting for a breakthrough. The Alumniportal talks about this to Daniel Pfirrmann.

Daniel Pfirrmann is Managing Director of DiOmega GmbH, a full service agency for IT and Web services, multimedia applications, mobile applications and web-based training, which he founded in 2006 together with Dominique Bös. His personal specialist area is "big data".

Question: Mr Pfirrmann, according to current estimates, the amount of data transmitted by global networks in 2016 will have grown to over one zettabyte. An almost unimaginable size – "big data" in the true sense of the word. In "Web 3.0", also known as the "Semantic Web", the intention is that we should still be able to find what we're looking for, even in this huge ocean of data. How is this conceivable?

Daniel Pfirrmann: The Internet as we know it today was developed for people. The Semantic Web, or Web 3.0, makes it possible to prepare information in such a way that machines can process it. So the Semantic Web is an intelligent linking of data. It involves arranging information in logical and semantic relationships and automatically interpreting and classifying it. It is essentially based on metadata, which contains all the relevant details concerning the relationships between pieces of information.

Web 3.0: Automatically classifying relevant information

Question: How does data have to be edited in order to achieve this? Are there any practical examples that can be used today to show what we can expect?

Daniel Pfirrmann: Data linkage looks like this: Frankfurt<city> lies on the Main<river>. In this context, "city" and "river" are what is known as metadata. The technical foundations for such descriptions are the Resource Description Framework (RDF) specification and the Web Ontology Language (OWL).
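The Frankfurt example can be written down as RDF-style statements of the form subject, predicate, object. The following is only a minimal sketch in plain Python, not a full RDF implementation: the `http://example.org/` namespace, the `liesOn` property and the helper names are assumptions made for illustration, while the `rdf:type` URI is the standard one.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: subject, predicate, object (all given as URIs)."""
    subject: str
    predicate: str
    obj: str


# Hypothetical example namespace; a real dataset would use its own vocabulary.
EX = "http://example.org/"
# Standard RDF property that states which class a resource belongs to.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

CITY_TRIPLES = [
    Triple(EX + "Frankfurt", RDF_TYPE, EX + "City"),       # Frankfurt<city>
    Triple(EX + "Main", RDF_TYPE, EX + "River"),           # Main<river>
    Triple(EX + "Frankfurt", EX + "liesOn", EX + "Main"),  # "lies on"
]


def types_of(resource, triples):
    """Return the classes (the metadata) attached to a resource."""
    return {t.obj for t in triples
            if t.subject == resource and t.predicate == RDF_TYPE}


def to_ntriples(triples):
    """Serialise the statements in N-Triples style, one per line."""
    return "\n".join(f"<{t.subject}> <{t.predicate}> <{t.obj}> ."
                     for t in triples)


if __name__ == "__main__":
    print(to_ntriples(CITY_TRIPLES))
    print(types_of(EX + "Frankfurt", CITY_TRIPLES))
```

A machine reading these statements can answer "what kind of thing is Frankfurt?" without having to interpret the sentence itself, which is the point of the metadata.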

One well-known example of Web 3.0 is DBpedia. This project extracts data from Wikipedia in a semantic format, making it possible to access it in a significantly more structured and targeted manner. For example, this method can easily be used to retrieve a list of all cities with more than two million inhabitants. All the information needed for this specific query is automatically structured and displayed.
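A query like the one for large cities would typically be expressed in SPARQL, the query language for RDF data. The sketch below only builds the query text and a request URL for DBpedia's public endpoint and does not send anything. The helper names are hypothetical, the choice of `dbo:City` and `dbo:populationTotal` is an assumption about how the data is modelled, and the `format` parameter is specific to the server software behind the endpoint; actual results depend on the current state of the dataset.

```python
from urllib.parse import urlencode

# Public DBpedia SPARQL endpoint.
DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql"


def build_city_query(min_population: int) -> str:
    """Build a SPARQL query for cities above a population threshold."""
    return f"""
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?city ?name ?population WHERE {{
  ?city a dbo:City ;
        rdfs:label ?name ;
        dbo:populationTotal ?population .
  FILTER (?population > {int(min_population)} && lang(?name) = "en")
}}
ORDER BY DESC(?population)
""".strip()


def build_request_url(query: str) -> str:
    """Encode the query as a GET request URL (nothing is sent here)."""
    params = {"query": query, "format": "application/sparql-results+json"}
    return DBPEDIA_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    query = build_city_query(2_000_000)
    print(query)
    print(build_request_url(query))
```

The printed URL could be opened with any HTTP client; the query itself can also be pasted into the endpoint's web form.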

Another example of an early Semantic Web application is FOAF (Friend of a Friend), a project on the machine-readable modelling of social networks. It enables social relationships to be structured, analysed and visualised.
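To give an idea of what machine-readable social relationships allow, here is a small sketch that stores "knows" statements and derives friends of friends from them. The property URI is the one FOAF defines for `foaf:knows`; the people, the `ex:` identifiers and the function names are invented for illustration, and the statements are treated as one-directional.

```python
# URI of the "knows" property in the FOAF vocabulary.
FOAF_KNOWS = "http://xmlns.com/foaf/0.1/knows"

# Hypothetical people, written as (subject, predicate, object) statements.
SOCIAL_TRIPLES = [
    ("ex:alice", FOAF_KNOWS, "ex:bob"),
    ("ex:alice", FOAF_KNOWS, "ex:dave"),
    ("ex:bob", FOAF_KNOWS, "ex:carol"),
    ("ex:bob", FOAF_KNOWS, "ex:dave"),
]


def friends(person, triples):
    """People that `person` is stated to know."""
    return {o for s, p, o in triples if s == person and p == FOAF_KNOWS}


def friends_of_friends(person, triples):
    """People known by the person's friends, but not by the person directly."""
    direct = friends(person, triples)
    indirect = set()
    for friend in direct:
        indirect |= friends(friend, triples)
    return indirect - direct - {person}


if __name__ == "__main__":
    print(friends("ex:alice", SOCIAL_TRIPLES))
    print(friends_of_friends("ex:alice", SOCIAL_TRIPLES))
```

Because the relationships are explicit data rather than free text, this kind of analysis takes only a few lines, which is also why the privacy concerns around such data are easy to understand.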

Semantic Web: Metadata generates additional work

Question: It was recognised even before the turn of the century that it was becoming increasingly difficult to navigate in a rapidly growing ocean of data. Even back then, the Internet of the future was seen as being the "Semantic Web". Why are people still talking about this Semantic Web as a "future project"? What's so difficult about implementing it?

Daniel Pfirrmann: Despite some very promising initial approaches, there are unfortunately various unresolved problems in establishing the Semantic Web. First of all, adding machine-readable information – metadata – to documents obviously involves additional work. In addition to creating their texts, authors themselves have to provide correct, meaningful keywords.

Alternatively, content providers have to do that at a later date. Of course this involves considerable effort, which you would only undertake if you were expecting some sort of benefit from it.

What's more, we shouldn't underestimate the risk of metadata being misused. For example, some website providers try to use HTML meta tags to improve their search engine rankings, regardless of whether the meta information is accurate or not.

Another problem, for example in the FOAF project, is the ever-lower threshold for invading personal privacy. In the Semantic Web, it's easier than ever to find out other people's personal likes and dislikes or hobbies by automatically searching social networks for information on users. Advertisers can use this information to tailor advertisements much more closely to lifestyle, friends, income or preferences.

Depending on your perspective, you may feel this is a good thing or a bad thing. Other industries are also interested in this type of metadata, but we won't be looking at that in detail here.

Thank you for the interview.