The German federal government has proclaimed Faceted Wikipedia Search as one of the 365 most innovative ideas in Germany in the context of the Deutschland – Land der Ideen competition. The competition showcases innovative ideas in areas such as science and technology, business, education, art and ecology. The patron of the competition is the German President Horst Köhler.
Faceted Wikipedia/DBpedia Search allows users to pose complex queries against Wikipedia, such as “Which rivers flow into the Rhine and are longer than 50 kilometers?” or “Which skyscrapers in China have more than 50 floors and were constructed before the year 2000?”. The answers to these queries are not generated by keyword matching, as in search engines such as Google or Yahoo, but from structured information that has been extracted from many different Wikipedia articles. Faceted Wikipedia/DBpedia Search thus lets users query Wikipedia like a structured database and exploit Wikipedia’s collective intelligence.
Faceted Wikipedia/DBpedia Search can be tested online at http://dbpedia.neofonie.de/browse/
Please click on the example queries below to see Faceted Wikipedia Search in action:
- Rivers that flow into the Rhine and are longer than 50 kilometers
- Albums from the Beach Boys that were released between 1980 and 1990
- French scientists who were born in the 19th century
- Skyscrapers in China that were constructed before 2000 and have more than 50 floors
- Actors of the American TV series Lost who were born in England
- Endangered Primates
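To make the idea of querying structured data concrete, the first example query can be pictured as a filter over typed records rather than a keyword match over text. The following is only a minimal sketch: the `River` class, the `rivers_flowing_into` function and the placeholder records are assumptions made for illustration, not the data model or API of Faceted Wikipedia Search, and the lengths are not real measurements.

```python
from dataclasses import dataclass


@dataclass
class River:
    """Hypothetical record type; the real system works on extracted RDF data."""
    name: str
    mouth: str        # the body of water this river flows into
    length_km: float


# Placeholder records for illustration only, not real measurements.
RIVERS = [
    River("River A", "Rhine", 120.0),
    River("River B", "Rhine", 35.0),
    River("River C", "Danube", 300.0),
    River("River D", "Rhine", 50.0),
]


def rivers_flowing_into(rivers, mouth, min_length_km):
    """Return names of rivers with the given mouth that are strictly longer than min_length_km."""
    return [
        r.name
        for r in rivers
        if r.mouth == mouth and r.length_km > min_length_km
    ]


# "Rivers that flow into the Rhine and are longer than 50 kilometers"
print(rivers_flowing_into(RIVERS, "Rhine", 50))
```

Because both conditions are evaluated against structured fields, the query can combine a relationship ("flows into") with a numeric range, which plain keyword matching cannot express reliably.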
Faceted Wikipedia/DBpedia Search has been jointly developed by neofonie GmbH, Berlin and the Web-based Systems Group at Freie Universität Berlin. Technically, Faceted Wikipedia/DBpedia Search is based on the DBpedia data extraction framework and neofonie search technology.
The DBpedia data extraction framework extracts structured data from Wikipedia, such as the content of infoboxes, which summarize relevant facts as a table on the top right-hand side of Wikipedia articles. The extracted data is represented using the Resource Description Framework (RDF), a data model for web-based systems. Currently, the framework extracts around 190 million facts from the English edition of Wikipedia and 289 million facts from Wikipedia editions in 90 further languages. The DBpedia data extraction framework is developed by the Web-based Systems Group at Freie Universität Berlin and the Agile Knowledge Engineering and Semantic Web group at Universität Leipzig.
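The extraction step described above can be sketched in a few lines. This is a deliberately simplified illustration, not the DBpedia extraction framework itself: the `extract_triples` function, the `ex:` prefix and the sample infobox are all hypothetical, and real infobox markup (nested templates, links, units) needs far more careful handling than a single regular expression.

```python
import re

# Matches simple "| key = value" lines inside an infobox template.
# Assumption: one field per line, no nested templates or wiki links.
INFOBOX_FIELD = re.compile(
    r"^\|[ \t]*(\w+)[ \t]*=[ \t]*(.+?)[ \t]*$", re.MULTILINE
)


def extract_triples(subject, wikitext):
    """Turn infobox fields into (subject, predicate, object) triples.

    The "ex:" prefix is a made-up namespace used only for this sketch.
    """
    return [
        (subject, f"ex:{key}", value)
        for key, value in INFOBOX_FIELD.findall(wikitext)
    ]


# Hypothetical infobox markup, invented for this example.
SAMPLE = """{{Infobox river
| name = Example River
| length_km = 120
| mouth = Rhine
}}"""

for triple in extract_triples("ex:Example_River", SAMPLE):
    print(triple)
```

Each infobox field becomes one subject-predicate-object statement, which is the shape of data that RDF represents.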
The neofonie search engine, neofonie search, is employed to execute complex queries over the extracted data. neofonie search aggregates RDF data from DBpedia with full-text data from Wikipedia. The aggregated data is then divided into hierarchical facets, composed of 200 types with 2.9 million values. In addition to providing the search technology and processing power, neofonie is also responsible for the hosting of the Faceted Wikipedia/DBpedia Search on the Amazon Elastic Compute Cloud (Amazon EC2).
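The notion of facets can be illustrated with a small sketch: each facet is a property whose values are counted over the current result set, and selecting a value narrows the results. The code below is an assumption-laden toy, not neofonie search: the `facet_counts` and `refine` functions and the sample records are hypothetical, and it ignores the hierarchy and full-text aspects mentioned above.

```python
from collections import Counter, defaultdict

# Placeholder records for illustration only.
RECORDS = [
    {"type": "Skyscraper", "country": "China", "floors": 60},
    {"type": "Skyscraper", "country": "China", "floors": 40},
    {"type": "River", "country": "Germany"},
]


def facet_counts(records, facet_names):
    """Count how often each value occurs for each requested facet."""
    counts = defaultdict(Counter)
    for record in records:
        for name in facet_names:
            if name in record:
                counts[name][record[name]] += 1
    return counts


def refine(records, **selected):
    """Keep only records whose facet values equal every selected value."""
    return [
        r for r in records
        if all(r.get(key) == value for key, value in selected.items())
    ]


# Show the available facet values, then narrow to skyscrapers in China.
print(dict(facet_counts(RECORDS, ["type", "country"])))
print(refine(RECORDS, type="Skyscraper", country="China"))
```

Recomputing the counts after each refinement is what lets a faceted interface show users only the values that still lead to results.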
As DBpedia covers a wide range of domains and has a high degree of conceptual overlap with various other open-license datasets, an increasing number of data publishers have started to set data-level links from their data sources to DBpedia, making DBpedia one of the crystallization points of the emerging Web of Linked Data. In the future, the links between databases will allow applications like Faceted Wikipedia Search to answer queries based not only on Wikipedia knowledge but also on knowledge from a worldwide web of databases.
Faceted Wikipedia Search will be presented as part of the Land der Ideen series on April 12th, 2010 at neofonie, Berlin.
Additional information about the Land der Ideen competition, DBpedia, neofonie and the Web of Data can be found at:
- Deutschland – Land der Ideen
- Deutschland – Land der Ideen - Winners 2010
- Deutschland – Land der Ideen - Background information on the 2010 competition
- Press release about Faceted Wikipedia Search (in German)
- DBpedia website
- Neofonie website
- DBpedia – A Crystallization Point for the Web of Data (article)
- Linked Data - The Story so far (article)