– Project Description: Although the DBPedia Extraction Framework was adapted to support RML mappings thanks to a project of last year GSoC, the user interface to create mappings is still done by a MediaWiki installation, not supporting RML mappings and needing expertise on Semantic Web. The goal of the project is to create a front-end application that provides a user-friendly interface so the DBPedia community can easily view, create and administrate DBPedia mapping rules using RML. Moreover, it should also facilitate data transformations and overall DBPedia dataset generation. Mentors: Anastasia Dimou, Dimitris Kontokostas, Wouter Maroy
Ram Ganesan Athreya – Project Description:The requirement of the project is to build a conversational Chatbot for DBpedia which would be deployed in at least two social networks.There are three main challenges in this task. First is understanding the query presented by the user, second is fetching relevant information based on the query through DBpedia and finally tailoring the responses based on the standards of each platform and developing subsequent user interactions with the Chatbot.Based on my understanding, the process of understanding the query would be undertaken by one of the mentioned QA Systems (HAWK, QANARY, openQA). Based on the response from these systems we need to query the DBpedia dataset using SPARQL and present the data back to the user in a meaningful way. Ideally, both the presentation and interaction flow needs to be tailored for the individual social network.I would like to stress that although the primary medium of interaction is text, platforms such as Facebook insist that a proper mix between chat and interactive elements such as images, buttons etc would lead to better user engagement. So I would like to incorporate these elements as part of my proposal.
Mentor: Ricardo Usbeck
Nausheen Fatma – Project discription: Knowledge base embeddings has been an active area of research. In recent years a lot of research work such as TransE, TransR, RESCAL, SSP, etc. has been done to get knowledge base embeddings. However none of these approaches have used DBpedia to validate their approach. In this project, I want to achieve the following tasks: i) Run the existing techniques for KB embeddings for standard datasets. ii) Create an equivalent standard dataset from DBpedia for evaluations. iii) Evaluate across domains. iv) Compare and Analyse the performance and consistency of various approaches for DBpedia dataset along with other standard datasets. v)Report any challenges that may come across implementing the approaches for DBpedia. Along the way, I would also try my best to come up with any new research approach for the problem.
Mentors: Sandro Athaide Coelho, Tommaso Soru
Akshay Jagatap – Project Description: The project aims at defining embeddings to represent classes, instances and properties. Such a model tries to quantify semantic similarity as a measure of distance in the vector space of the embeddings. I believe this can be done by implementing Random Vector Accumulators with additional features in order to better encode the semantic information held by the Wikipedia corpus and DBpedia graphs.
Mentors: Pablo Mendes, Sandro Athaide Coelho, Tommaso Soru
Luca Virgili – Project Description: In Wikipedia a lot of data are hidden in tables. What we want to do is to read correctly all tables in a page. First of all, we need a tool that can allow us to capture the tables represented in a Wikipedia page. After that, we have to understand what we read previously. Both these operations seem easy to make, but there are many problems that could arise. The main issue that we have to solve is due to how people build table. Everyone has a particular style for representing information, so in some table we can read something that doesn’t appear in another structure. In this paper I propose to improve the last year’s project and to create a general way for reading data from Wikipedia tables. I want to review the parser for Wikipedia pages for trying to understand more types of tables possible. Furthermore, I’d like to build an algorithm that can compare the column’s elements (that have been read previously by the parser) to an ontology so it could realize how the user wrote the information. In this way we can define only few mapping rules, and we can make a more generalized software.
Mentors: Emanuele Storti, Domenico Potena
Shashank Motepalli – Project Description: DBpedia tries to extract structured information from Wikipedia and make information available on the Web. In this way, the DBpedia project develops a gigantic source of knowledge. However, the current system for building DBpedia Ontology relies on Infobox extraction. Infoboxes, being human curated, limit the coverage of DBpedia. This occurs either due to lack of Infoboxes in some pages or over-specific or very general taxonomies. These factors have motivated the need for DBTax.DBTax follows an unsupervised approach to learning taxonomy from the Wikipedia category system. It applies several inter-disciplinary NLP techniques to assign types to DBpedia entities. The primary goal of the project is to streamline and improve the approach which was proposed. As a result, making it easy to run on a new DBpedia release. In addition to this, also to work on learning taxonomy of DBTax to other Wikipedia languages.
Mentors: Marco Fossati, Dimitris Kontokostas
Krishanu Konar – Project Description: Wikipedia, being the world’s largest encyclopedia, has humongous amount of information present in form of text. While key facts and figures are encapsulated in the resource’s infobox, and some detailed statistics are present in the form of tables, but there’s also a lot of data present in form of lists which are quite unstructured and hence its difficult to form into a semantic relationship. The project focuses on the extraction of relevant but hidden data which lies inside lists in Wikipedia pages. The main objective of the project would be to create a tool that can extract information from wikipedia lists, form appropriate RDF triplets that can be inserted in the DBpedia dataset.
Mentor: Marco Fossati