
More than 130 knowledge graph enthusiasts joined the KGiA event.

Opening the KG in Action event

The SEMANTiCS Onsite Conference 2020 had to be postponed till September 2021. To bridge the gap until 2021, we took this opportunity to organize the Knowledge Graphs in Action (KGiA) online track as a SEMANTiCS satellite event on October 6, 2020. This new online conference is a combination of two existing events: the DBpedia Community Meeting and the annual Spatial Linked Data conference organised by EuroSDR and the Platform Linked Data Netherlands. We combined the best of both and as a bonus we added a track about Geo-information Integration organized by EuroSDR. As special joint sessions we presented four keynote speakers. 

First and foremost, we would like to thank SEMANTiCS, EuroSDR and Platform Linked Data Netherlands for organizing the KGiA online event, and many thanks to all chairs who supported the conference.

Below, we give a brief retrospective of the keynote presentations and talks.

Opening & Keynote #1

The Knowledge Graphs in Action conference was opened with a keynote presentation ‘Data Infrastructure for Energy System Models’ by Carsten Hoyer-Klick (German Aerospace Center). He presented LOD GEOSS, a project for the development of a distributed data infrastructure for the analysis of energy systems. The project is about the development of networked database concepts based on the ideas of linked open data and the semantic web for input and output data of energy system models in energy systems analysis. Afterwards the conference chairs offered three parallel sessions in the morning. 

Morning Sessions 

Session 1: Spatial Linked Data Country Update

In this session, seven speakers presented the uptake and latest progress of Spatial Linked Data adoption in European countries, either within national mapping agencies or beyond.

Session 2: VGI country presentations

There is an increasing use of crowdsourced geo-information (CGI) in spatial data applications by National Mapping and Cadastral Agencies (NMCAs). Applications range from using CGI to support the updating of spatial data to adding extra content, such as land use, building entrances, road barriers, sensors placed in public space and many more. This session hosted five presentations from NMCAs showing the status of their CGI integration in mapping applications and processes.

Session 3: DBpedia Member presentations

Members of the DBpedia Association presented their latest tools, applications and technical developments in this session. Filipe Mesquita (Diffbot) opened the member session with his talk ‘Beyond Human Curation: How Diffbot Is Building A Knowledge Graph of the Web’. ImageSnippets, timbr.ai and GNOSS also gave interesting and delightful talks about their technical developments. Vassil Momtchev from Ontotext closed the session by giving insights into GraphDB 9.4.

For further details of the presentations follow the links to the slides on the event page.

Afternoon Sessions 

Keynote #2

The afternoon sessions started with an interesting keynote by Peter Mooney (Maynooth University). He talked about the opportunities for a more integrated approach to Geo-information integration. 

Dutch National Graph as a Digital Twin

After the second keynote, Sebastian Hellmann, the CEO of the DBpedia Association, presented the development and methodology of the National Knowledge Graph for the Netherlands. In cooperation with Dutch partners, DBpedia invested two months to develop this new knowledge graph. His insightful presentation was followed by Benedicte Bucher (University Gustave Eiffel) talking about ‘Knowledge Graph on spatial digital assets in European’. She also presented the EuroSDR LDG initiative in great detail.

Afternoon Parallel Sessions

Session 4: Transforming Linked Data into a networked data economy – DBpedia Chapter Session

In the DBpedia Chapter Session, members of different European DBpedia chapters gave an overview of the data landscape in their countries. They presented the business opportunities they had identified and important challenges, such as the automated clearance of licenses in their countries. Enno Meijers (National Library of the Netherlands) summarized the data landscape in the Netherlands. There were also presentations about the data landscape in Brazil, Spain, Austria and Poland.

Session 5: EuroSDR VGI data wrangling

This session aimed to uncover new combinations and integrations of CGI data with data from NMCAs that demonstrate the added value for map creation and map usage. Data wrangling (the process of creating small reproducible data processing workflows) was used for this work by combining existing geospatial software (desktop, web and mobile). In this session the results of the data wrangling process were presented.

Session 6: Spatial Session

In this session, two speakers presented how they built knowledge graphs, and in the second part three presenters gave insights into tooling and presented the state of the art on working with Linked Data.

For further details of the presentations follow the links to the slides on the event page.

Keynote #3 and #4

Keynote #3 ‘Spatial Knowledge in Action – Deep semantics, geospatial thinking, and new cartographies’ was given by Marinos Kavouras (National Technical University of Athens). Marinos stated that the power of maps and modern cartographic language proves to have a new role for society at large, as an indispensable communication and cognitive tool. The KG in Action conference ended with the keynote presentation ‘Know, Know Where, KnowWhereGraph’ by Krzysztof Janowicz (University of California). During his live talk from California, Krzysztof provided an overview of ideas and hopes for creating geo-specific knowledge graphs and geo-enrichment services on top of this graph to address some of the aforementioned challenges.

In case you missed the event, all slides and presentations are also available on the DBpedia website. We will upload all recordings to the DBpedia YouTube channel. Further insights, feedback and photos about the event are available on Twitter (#KGiA hashtag).

We are now looking forward to 2021. We plan to have meetings at the Knowledge Graph Conference and the SEMANTiCS conference in Amsterdam. Stay safe and check Twitter, LinkedIn and our Website or subscribe to our Newsletter for the latest news and information.

Yours,

DBpedia Association

GSoC 2020 recap

With 45 project proposals, this GSoC edition marked a new record for DBpedia.

GSoC and DBpedia Sticker

Oh, what a year! For the 9th year in a row, we were part of this incredible journey of young ambitious developers who joined us as an open source organization to work on a GSoC coding project all summer. 

Each year has brought us new project ideas, many amazing students and mostly great project results that shaped the future of DBpedia. 

Even though Covid-19 changed a lot in the world, it couldn’t shake GSoC much. The program, designed to mentor youngsters from afar, is almost too perfect for the current world situation. One of the advantages of Google Summer of Code, especially in times like these, is the chance to work on projects remotely while still getting a first deep dive into open source projects like DBpedia.

Meet the students and their projects

This year, we had notably more applications than in previous years. With 45 project proposals, this GSoC edition marked a new record for DBpedia. Throughout the summer program, our seven finalists worked intensely on their challenging DBpedia projects with great outcomes to show to the public. Projects ranged from extending our DBpedia extraction framework to a DBpedia Database project as well as to an online tool to generate RDF from DBpedia abstracts. If you want to have deeper insights into our GSoC students’ work you can find their blogs and repos in the following list. Check them out!

Thanks to all our mentors around the world for joining us in this endeavour, for mentoring with kindness and technical expertise. A huge shout out to those who have been by our side for so many years in a row. Many thanks to Tommaso Soru, Beyza Yaman, Diego Moussalem, Edgard Marx, Mariano Rico, Thiago Castro Ferreira, Luca Virgili as well as Sebastian Hellmann, Stuart Chan, Amandeep Srivastava, Julio Hernandez and Jan Forberg. 

Mentor Summit

In previous years you might have noticed that we always organized a little lottery to decide which mentor or organization admin could join the annual GSoC mentor summit. As this year’s event will be held online, space is not limited to some 300 mentors but is open to all organization admins and mentors alike. The GSoC Virtual Mentor Summit takes place October 15-16, 2020, and this year we hope all our mentors will find the time to join and exchange ideas with fellow mentors from dozens of open source projects.

After GSoC is before the next GSoC

We cannot wait for the 2021 edition. If you are an ambitious student who is interested in open source development and working with DBpedia, you are more than welcome to either contribute your own project idea or apply for the project ideas we will offer starting in early 2021.

If you would like to mentor a project, do not hesitate to get in touch with us via dbpedia@infai.org.

Stay tuned, frequently check Twitter, LinkedIn or the DBpedia Forum to stay in touch and don’t miss your chance of becoming a crucial force in this endeavour as well as a vital member of the DBpedia community.

See you soon,

Yours,

DBpedia Association

Call for Participants: DBpedia Autumn Hackathon

Dear DBpedians, Linked Data savvies and Ontologists,

We would like to invite you to join the DBpedia Autumn Hackathon 2020 as a new format to contribute to DBpedia, gain fame, win small prizes and experience the latest technology provided by DBpedia Association members. 
The hackathon is part of the Knowledge Graphs in Action conference on October 6, 2020. 

Timeline 

  • Registration of participants – main communication channel will be the #hackathon channel in DBpedia Slack (sign up, then add yourself to the channel). If you wish to receive a reminder email on Sep 21st, 2020 you can leave your email address in this form.
  • Until September 14th – preparation phase, participating organizations prepare details and track formation. Additional tracks can be proposed, please contact dbpedia-events@infai.org.
  • Announcement of details for each track, including prizes, participating data, demos as well as tools and tasks (please check for updates on the Hackathon website) – September 21st, 2020
  • Hacking period, coordinated via DBpedia Slack – September 21st to October 1st, 2020
  • Submission of hacking result (3 min video and 2-3 paragraph summary with links, if not stated otherwise in the track) – October 1st, 2020 at 23:59 Hawaii Time
  • Final Event: each track chair will present a short recap of the track and announce prizes or summarize the results of the hacking – October 5th, 2020 at 16:00 CEST
  • Knowledge Graphs in Action Event (see program) – October 6th, 2020, 9:50 – 15:30 CEST
  • Results and videos are documented on the DBpedia Website and the DBpedia Youtube channel.
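
For participants in other time zones, the submission deadline above can be converted with a few lines of Python. This is a minimal sketch that assumes "Hawaii Time" means Hawaii Standard Time (UTC-10, no daylight saving) and that CEST is UTC+2; the variable names are purely illustrative.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "Hawaii Time" is Hawaii Standard Time, a fixed UTC-10 offset.
HST = timezone(timedelta(hours=-10), "HST")
# Central European Summer Time is UTC+2.
CEST = timezone(timedelta(hours=2), "CEST")

# Submission deadline as stated in the timeline.
deadline_hst = datetime(2020, 10, 1, 23, 59, tzinfo=HST)

# Convert the same instant to CEST.
deadline_cest = deadline_hst.astimezone(CEST)
print(deadline_cest.isoformat())  # 2020-10-02T11:59:00+02:00
```

Under these assumptions, the deadline falls shortly before noon on October 2nd for participants in Central Europe.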

Member Tracks 

The member tracks are hosted by DBpedia Association members, who are technology leaders in the area of Knowledge Engineering. Additional tracks can be proposed until Sep 14th, please contact dbpedia-events@infai.org.

  • timbr SQL Knowledge Graph: Learn how to model, map and query ontologies in timbr and then model an ontology of GDELT, map it to the GDELT database, and answer a number of questions that are currently all but impossible to answer from the BigQuery GDELT database. Cash prizes planned.
  • GNOSS Knowledge Graph Builder: Give meaning to your organisation’s documents and data with a Knowledge Graph.
  • ImageSnippets: Labeling images with semantic descriptions. Use DBpedia Spotlight and an entity matching lookup to select DBpedia terms to describe images. Then explore the resulting dataset through searches over inference graphs and explore the ImageSnippets dataset through our SPARQL endpoint. Prizes planned.
  • Diffbot: Build Your Own Knowledge Graph! Use the Natural Language API to extract triples from natural language text and expand these triples with data from the Diffbot Knowledge Graph (10+ billion entities, 1+ trillion facts). Check out the demo. All participants will receive access to the Diffbot KG and tools for (non-commercial) research for one year ($10,000 value).

Dutch National Knowledge Graph Track

Following the DBpedia FlexiFusion approach, we are currently flexi-fusing a huge, DBpedia-style knowledge graph that will connect many Linked Data sources and data silos relevant to the Netherlands. We hope that this will eventually crystallize a well-connected sub-community linked open data (LOD) cloud in the same manner as DBpedia crystallized the original LOD cloud, with some improvements (you could call it LOD Mark II). Data and hackathon details will be announced on September 21st.

Organising committee:

Improve DBpedia Track

A community track, where everybody can participate and contribute to improving existing DBpedia components, in particular the extraction framework, the mappings, the ontology, data quality test cases, new extractors, links and other extensions. The best individual contributions will be acknowledged on the DBpedia website by anointing the WebID/FOAF profile.

(chaired by Milan Dojchinovski and Marvin Hofer from the DBpedia Association & InfAI and the DBpedia Hacking Committee; please message @m1ci to volunteer for the hacking committee)

DBpedia Open Innovation Track 

(not part of the hackathon, pre-announcement)

For the DBpedia Spring Event 2021, we are planning an Open Innovation Track, where DBpedians can showcase their applications. This endeavour will not be part of the hackathon as we are looking for significant showcases with development effort of months & years built on the core infrastructure of DBpedia such as the SPARQL endpoint, the data, lookup, spotlight, DBpedia Live, etc. Details will be announced during the Hackathon Final Event on October 5.  

(chaired by Heiko Paulheim et al.)

Stay tuned and check Twitter, Facebook and our Website or subscribe to our Newsletter for the latest news and information.

The DBpedia Organizing Team


‘Knowledge Graphs in Action’ online event on Oct 6, 2020

Due to current circumstances, the SEMANTiCS Onsite Conference 2020 had, unfortunately, to be postponed till September 2021. To bridge the gap until 2021, DBpedia, PLDN and EuroSDR will organize a SEMANTiCS satellite event online, on October 6, 2020. We set up an exciting themed program around ‘Knowledge Graphs in Action: DBpedia, Linked Geodata and Geo-information Integration’.

This new event is a combination of two already existing ones: the DBpedia Community Meeting, which is regularly held as part of the SEMANTiCS, and the annual Spatial Linked Data conference organised by EuroSDR and the Platform Linked Data Nederland. We fused both together and as a bonus, we added a track about Geo-information Integration hosted by EuroSDR. For the joint opening session, we recruited four amazing keynote speakers to kick the event off.    

Highlights of the Knowledge Graphs in Action event

– Hackathon (starts 2 weeks earlier)

– Keynote by Carsten Hoyer-Klick, German Aerospace Center

– Keynote by Marinos Kavouras, National Technical University of Athens

– Keynote by Peter Mooney, Maynooth University

– Spatial Linked Data Country Session

– DBpedia Chapter Session

– Self Service GIS Session

– DBpedia Showcase Session

Quick Facts

– Web URL: https://wiki.dbpedia.org/meetings/KnowledgeGraphsInAction

– When: October 6, 2020

– Where: The conference will take place fully online.

Schedule

– Please check the schedule for the upcoming Knowledge Graphs in Action event here: https://wiki.dbpedia.org/meetings/KnowledgeGraphsInAction  

Registration 

– Attending the conference is free. Registration is required though. Please get in touch with us if you have any problems during the registration stage. Register here to be part of the meeting: https://wiki.dbpedia.org/meetings/KnowledgeGraphsInAction 

Organisation

– Benedicte Bucher, University Gustave Eiffel, IGN, EuroSDR

– Erwin Folmer, Kadaster, University of Twente, Platform Linked Data Netherlands

– Rob Lemmens, University of Twente

– Sebastian Hellmann, AKSW/KILT, DBpedia Association

– Julia Holze, DBpedia Association

Don’t think twice and register now! Join the Knowledge Graphs in Action event on October 6, 2020 to catch up with the latest research results and developments in the Semantic Web Community. Register here and meet us and other SEMANTiCS enthusiasts.

For the latest news and updates check Twitter, LinkedIn, the DBpedia blog and our Website or subscribe to our newsletter.

We are looking forward to meeting you online!

Julia

on behalf of the DBpedia Association

DBpedia Workshop at LDAC

More than 90 DBpedia enthusiasts joined the DBpedia Workshop co-located with LDAC2020.

On June 19, 2020 we organized a DBpedia workshop co-located with the LDAC workshop series to exchange knowledge regarding new technologies and innovations in the fields of Linked Data and Semantic Web. This workshop series provides a focused overview on technical and applied research on the usage of Semantic Web, Linked Data and Web of Data technologies for the architecture and construction domains (design, engineering, construction, operation, etc.). The workshop aims at gathering researchers, industry stakeholders, and standardization bodies of the broader Linked Building Data (LBD) community.

First and foremost, we would like to thank the LDAC committee for hosting our virtual meeting and many thanks to Beyza Yaman, Milan Dojchinovski, Johannes Frey and Kris McGlinn for organizing and chairing the DBpedia workshop. 

Below, we give a brief retrospective of the presentations.

Opening & Keynote 

The first virtual DBpedia meeting was opened with a keynote presentation ‘{RDF} Data quality assessment – connecting the pieces’ by Dimitris Kontokostas (Diffbot, US). He gave an overview of the latest developments and achievements around Data Quality. His presentation focused on defining data quality and identifying data quality issues.

Sebastian Hellmann gave a brief overview of DBpedia’s history. Furthermore, he presented the updated DBpedia Organisational architecture, including the vision of the new DBpedia chapters and benefits of the DBpedia membership.

Shortly after, Milan Dojchinovski (InfAI/CTU in Prague) gave a presentation on ‘Querying and Integrating (Architecture and Construction) Data with DBpedia’. ‘The New DBpedia Release Cycle’ was introduced by Marvin Hofer (InfAI). Closing the Showcase Session, Johannes Frey (InfAI) presented the Databus Archivo and demonstrated the downloading process with the DBpedia Databus.

For further details of the presentations follow the links to the slides.

  • Keynote: {RDF} Data quality assessment – connecting the pieces, by Dimitris Kontokostas, Diffbot, US (slides)
  • Overview of DBpedia Organisational Architecture, by Sebastian Hellmann, Julia Holze, Bettina Klimek, Milan Dojchinovski, InfAI / DBpedia Association (slides)
  • Querying and Integrating (Architecture and Construction) Data with DBpedia, by Milan Dojchinovski, InfAI/CTU in Prague (slides)
  • The New DBpedia Release Cycle, by Marvin Hofer and Milan Dojchinovski, InfAI (slides)
  • Databus Archivo and Downloading with the Databus, by Johannes Frey, Fabian Goetz and Milan Dojchinovski, InfAI (slides)

Geospatial Data & DBpedia Session

After the opening session we had the Geospatial Data & DBpedia Session. Milan Dojchinovski (InfAI/CTU in Prague) chaired this session with three very stimulating talks. Below are all presentations given during this session:

  • Linked Geospatial Data & Data Quality by Wouter Beek, Triply Ltd. (slides)
  • Contextualizing OSi’s Geospatial Data with DBpedia by Christophe Debruyne, Vrije Universiteit Brussel and ADAPT at Trinity College Dublin
  • Linked Spatial Data: Beyond The Linked Open Data Cloud by Chaidir A. Adlan, The Deutsche Gesellschaft für Internationale Zusammenarbeit GmbH (slides)

Data Quality & DBpedia Session

The first online DBpedia workshop also included a special data quality session. Johannes Frey (InfAI) chaired this session with three very stimulating talks. Below are all presentations given during this session:

  • SeMantic AnsweR Type prediction with DBpedia – ISWC 2020 Challenge by Nandana Mihindukulasooriya, MIT-IBM Watson AI Lab (slides)
  • RDF Doctor: A Holistic Approach for Syntax Error Detection and Correction of RDF Data by Ahmad Hemid, Fraunhofer IAIS (slides)
  • The Best of Both Worlds: Unlocking the Power of (big) Knowledge Graphs with SANSA by Gezim Sejdiu, Deutsche Post DHL Group and University of Bonn (slides)
  • Closing words by the workshop organizers

In case you missed the event, all slides and presentations are also available on the DBpedia workshop website. Further insights, feedback and photos about the event are available on Twitter (#DBpediaDay hashtag).

We are now looking forward to our first DBpedia Stack tutorial, which will be held online on July 1st, 2020. Over the last year, the DBpedia core team has consolidated a great amount of technology around DBpedia. The tutorial primarily targets developers (in particular of DBpedia Chapters) who wish to learn how to replicate local infrastructure, such as loading and hosting their own SPARQL endpoint. A core focus will also be the new DBpedia Stack, which contains several dockerized applications that automatically load data from the Databus. The DBpedia Stack tutorial is free to attend and will be held online. Please register to be part of the meeting.

Stay tuned and check Twitter, Facebook and our Website or subscribe to our Newsletter for the latest news and information.

Julia and Milan 

on behalf of the DBpedia Association

GSoC2020 – Call for Contribution

James: Sherry with the soup, yes… Oh, by the way, the same procedure as last year, Miss Sophie?

Miss Sophie: Same procedure as every year, James.

…and we are proud of it. We are very grateful to have been accepted again as an open-source organization in this year’s Google Summer of Code (GSoC2020) edition. The upcoming GSoC2020 marks the 16th consecutive year of the program and is the 9th year in a row for DBpedia.

We did it again – we are a mentoring organization!

What is GSoC2020? 

Google Summer of Code is a global program focused on bringing student developers into open source software development. Funds will be given to students (BSc, MSc, PhD) to work for three months on a specific task. For GSoC newbies, this short video and the information provided on their website will explain all there is to know about GSoC2020.

This year’s Narrative

Last year we tried to increase female participation in the program and we will continue to do so this year. We explicitly want to encourage female students to apply for our projects. To that end, we have already engaged excellent female mentors to raise the share of women in our mentor team as well.

In the following weeks, we invite all students, female and male alike, who are interested in Semantic Web and Open Source development to apply for our projects. You can also contribute your own ideas to work on during the summer. 

And this is how it works: 4 steps to GSoC2020 stardom

  1. Open source organizations such as DBpedia announce their project ideas. You can find our projects here.
  2. Students contact the mentor organizations they want to work with and write up a project proposal. Please get in touch with us via the DBpedia Forum or dbpedia@infai.org as soon as possible.
  3. The official application period at GSoC starts on March 16th. Please note that you have to submit your final application not through our Forum, but through the GSoC website.
  4. After a selection phase, students are matched with a specific project and a set of mentors to work on the project during the summer.

To all the smart brains out there, if you are a student who wants to work with us during summer 2020, check our list of project ideas, warm-up tasks or come up with your own idea and get in touch with us.

Application Procedure

Further information on the application procedure is available in our DBpedia Guidelines. There you will find information on how to contact us and how to appropriately apply for GSoC2020. Please also note the official GSoC 2020 timeline for your proposal submission and make sure to submit on time. Unfortunately, extensions cannot be granted. The final submission deadline is March 31st, 2020, 8 pm CEST.

Finally, check our website for information on DBpedia, follow us on Twitter or subscribe to our newsletter.

And in case you still have questions, please do not hesitate to contact us via praetor@infai.org.

We are thrilled to meet you and your ideas.

Your DBpedia-GSoC-Team


Better late than never – GSoC 2019 recap & outlook GSoC 2020

  • Pinky: Gee, Brain, what are we gonna do this year?
  • Brain: The same thing we do every year, Pinky. Taking over GSoC.

And this is exactly what we did. We were accepted as one of 206 open source organizations to participate in Google Summer of Code (GSoC) again. More than 25 students followed our call for project ideas. In the end, we chose six amazing students and their project proposals to work with during summer 2019.
In the following post, we will show you some insights into the project ideas and how they turned out. Additionally, we will shed some light onto our amazing team of mentors who devoted a lot of time and expertise in mentoring our students. 

Meet the students and their projects

A Neural QA Model for DBpedia by Anand Panchbhai

With a booming amount of information being continuously added to the internet, organising the facts and serving this information to users becomes a very difficult task. Currently, DBpedia hosts billions of data points and corresponding relations in the RDF format. Accessing data on DBpedia via a SPARQL query is difficult for amateur users, who do not know how to write a query. This project tried to make this humongous linked data available to a larger user base in their natural languages (for now restricted to English). The primary objective of the project was to translate natural language questions into valid SPARQL queries. Click here if you want to check his final code.
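
To make the task concrete, the following minimal sketch shows what "question in, SPARQL out" looks like. It is not the project's approach: the project used a neural model, whereas this illustration relies on a single hand-written template. The names `TEMPLATES` and `question_to_sparql` are hypothetical, and the sketch assumes the property word in the question happens to match a DBpedia ontology property name.

```python
import re

# Standard DBpedia namespace prefixes.
PREFIXES = (
    "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
    "PREFIX dbr: <http://dbpedia.org/resource/>\n"
)

# Hypothetical templates: a question pattern paired with a SPARQL skeleton.
TEMPLATES = [
    (
        re.compile(r"^who is the (\w+) of (.+?)\??$", re.IGNORECASE),
        "SELECT ?x WHERE {{ dbr:{entity} dbo:{prop} ?x }}",
    ),
]


def question_to_sparql(question):
    """Return a SPARQL query for a supported question, else None."""
    for pattern, skeleton in TEMPLATES:
        match = pattern.match(question.strip())
        if match:
            prop = match.group(1).lower()
            # Naive entity linking: spaces become underscores. Names with
            # other special characters would need proper IRI escaping.
            entity = match.group(2).strip().replace(" ", "_")
            return PREFIXES + skeleton.format(entity=entity, prop=prop)
    return None


print(question_to_sparql("Who is the spouse of Barack Obama?"))
```

A learned model replaces the fixed template list with patterns induced from training data, which is what allows it to cover questions nobody wrote a template for.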

Multilingual Neural RDF Verbalizer for DBpedia by Dwaraknath Gnaneshwar

Recently, the generation of natural language from RDF data has gained substantial attention and has also been proven to support the creation of Natural Language Generation benchmarks. However, most models are aimed at generating coherent sentences in English, while other languages have enjoyed comparatively less attention from researchers. RDF data is usually in the form of triples, <subject, predicate, object>. The subject denotes the resource, while the predicate denotes traits or aspects of the resource and expresses the relationship between subject and object. In this project, we aimed to create a multilingual Neural Verbalizer, i.e., a system generating high-quality natural-language text from sets of RDF triples in multiple languages using one stand-alone, end-to-end trainable model. You can follow up on the progress and outcome of the project here.
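
As an illustration of the input and output of verbalization, here is a minimal sketch of a triple and a rule-based English verbalizer. It is only a stand-in for the neural model described above; `Triple`, `label` and `verbalize` are hypothetical names, and the single sentence pattern is an assumption made for the example.

```python
import re
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement, using local names instead of full IRIs."""
    subject: str
    predicate: str
    obj: str


def label(term):
    """Turn an identifier such as 'birthPlace' or 'Barack_Obama' into words."""
    spaced = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", term)  # split camelCase
    return spaced.replace("_", " ")


def verbalize(triple):
    """Render a triple with one fixed English sentence pattern."""
    return (
        f"The {label(triple.predicate).lower()} of "
        f"{label(triple.subject)} is {label(triple.obj)}."
    )


print(verbalize(Triple("Barack_Obama", "birthPlace", "Honolulu")))
# The birth place of Barack Obama is Honolulu.
```

A fixed pattern like this quickly produces stilted text and has to be rewritten for every language, which is the motivation for a single trainable multilingual model.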

Predicate Detection using Word Embeddings for Question Answering over Linked Data by Yajing Bian

Knowledge-based question-answering (KBQA) systems have demonstrated an ability to generate answers to natural language questions from information stored in a large-scale knowledge base. Generally, such a system completes the analysis in three steps: identifying named entities, detecting predicates and generating SPARQL queries. Of these three steps, predicate detection identifies the KB relation(s) a question refers to. To build a predicate detection structure, we first identified all possible named entities, then collected all predicates corresponding to those entities. The next step is to calculate the similarity between the question and the candidate predicates using a multi-granularity neural network model (MGNN). To find the globally optimal entity-predicate assignment, we use a joint model which is based on the results of the entity linking and predicate detection processes rather than taking the local predictions (i.e. the most probable entity or predicate) as the final result. More details on the project are available here.
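
The ranking step of predicate detection can be sketched as follows. This is a simplified illustration, not the project's method: token-overlap (Jaccard) similarity stands in for the MGNN, and the candidate predicates are hard-coded rather than collected from a knowledge base. The function names are hypothetical.

```python
import re


def tokens(text):
    """Lowercase word tokens; camelCase names are split into their parts."""
    spaced = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", text)
    return set(re.findall(r"[a-z]+", spaced.lower()))


def jaccard(a, b):
    """Size of the intersection divided by the size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_predicates(question, candidates):
    """Score each candidate predicate against the question, best first."""
    q = tokens(question)
    scored = [(jaccard(q, tokens(p)), p) for p in candidates]
    return sorted(scored, reverse=True)


# Assumed candidate predicates for the entity found in the question.
candidates = ["birthPlace", "deathPlace", "spouse"]
ranking = rank_predicates("What is the birth place of Barack Obama?", candidates)
print(ranking[0])  # (0.25, 'birthPlace')
```

Word overlap fails as soon as the question paraphrases the predicate (for example "Where was Barack Obama born?"), which is the gap that embedding-based similarity is meant to close.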

A tool to generate RDF triples from DBpedia abstract by Jayakrishna Sahit

The main aim of this project was to research and develop a tool to generate highly trustworthy RDF triples from DBpedia abstracts. To develop such a tool, we implemented algorithms that take the output generated by the syntactic analyzer along with DBpedia Spotlight’s named entity identifiers. Further information and the project’s results can be found here.

A transformer of Attention Mechanism for Long-context QA by Stuart Chan

In this GSoC project, I chose to employ a transformer language model with an attention mechanism to automatically discover query templates for the neural question-answering knowledge-based model. The ultimate goal was to train the attention-based NSpM model on DBpedia and evaluate it against the QALD benchmark. Check here for more details on the project.

Workflow for linking External datasets by Jaydeep Chakraborty

The requirement of the project was to create a workflow for entity linking between DBpedia and external data sets. We aimed at an approach for ontology alignment through the use of an unsupervised mixed neural network. We explored reading and parsing the ontology and extracted all necessary information about concepts and instances. Additionally, we generated semantic vectors for each entity with different meta information like entity hierarchy, object property, data property, and restrictions, and designed a user-interface-based system which showed all necessary information about the workflow. Further info, download details and project results are available here.

Meet our Mentors

First of all, a big shout out and thank you to all mentors and co-mentors who helped our students to succeed in their endeavours.

  • Aman Mehta, former GSoC student and current junior mentor, recently interned as a software engineer at Facebook, London.
  • Beyza Yaman, a senior mentor and organizational admin, Post-Doctoral Researcher based in ADAPT, Dublin City University, former Springer Nature-DBpedia intern and former research associate at the InfAI/University of Leipzig. She is responsible for the Turkish DBpedia and her fields of interest are information retrieval, data extraction and integration over Linked Data.
  • Tommaso Soru, senior mentor and organizational admin. He is a Machine Learning & AI enthusiast, Data Scientist at Data Lens Ltd in London and a PhD candidate at the University of Leipzig.

“DBpedia is my window to the world of semantic data, not only for its intuitive interface but also because its knowledge is organised in a simple and uncomplicated way”

Tommaso Soru, GSoC 2019
  • Amandeep Srivastava, Junior Mentor and analyst at Goldman Sachs. He’s a huge fan of Christopher Nolan and likes to read fiction books in his free time.
  • Diego Moussalem, Senior mentor, Senior Researcher at Paderborn University, an active and vital member of the Portuguese DBpedia Chapter
  • Luca Virgili, currently a Computer Science PhD student at the Polytechnic University of Marche. He was a GSoC student for a year and a GSoC mentor for 2 years in DBpedia.
  • Bharat Suri, former GSoC student, Junior Mentor, Masters degree in Computer Science at The Ohio State University

“I have thoroughly enjoyed both my years of GSoC with DBpedia and I plan to stay and help out in whichever way I can”

Bharat Suri, GSoC 2019
  • Mariano Rico, senior mentor, Senior Doctor Researcher at Ontology Engineering Group, Universidad Politécnica de Madrid.
  • Nausheen Fatma, senior mentor, Data Scientist, Natural Language Processing, Machine Learning at Info Edge (naukri.com).
  • Ram G Athreya, long-term GSoC mentor, Research Engineer at Viv Labs, Bay Area, San Francisco.
  • Ricardo Usbeck, team leader ‘Conversational AI and Knowledge Graphs’ at Fraunhofer IAIS.
  • Rricha Jalota, former GSoC student, current senior mentor, developer in the Data Science Group at University of Paderborn, Germany

“The reason why I love collaborating with DBpedia (apart from the fact that, it’s a powerhouse of knowledge-driven applications) is not only it gave me my first big break to the amazing field of NLP but also to the world of open-source!”

Rricha Jalota, GSoC 2019

In addition, we would also like to thank the rest of our mentor team, namely Thiago Castro Ferreira, Aashay Singhal and Krishanu Konar, former GSoC student and current senior mentor, for their great work.

Mentor Summit Recap 

This GSoC marked the 15th consecutive year of the program and was the 8th season in a row for DBpedia. As in every year, two of our mentors, Rricha Jalota and Aashay Singhal, joined the annual GSoC mentor summit. Selected mentors get the chance to meet each other and engage in a vital exchange of knowledge and expertise around various GSoC-related and unrelated topics. Apart from more entertaining activities such as games, a scavenger hunt and a guided trip through Munich, mentors also discussed pressing questions such as “why is it important to fail your students” or “how can we have our GSoC students stay and contribute for long”.

After GSoC is before the next GSoC

If you are interested in mentoring a DBpedia GSoC project or want to contribute to a project of your own, we are happy to have you on board. There are a few things to get you started.

Likewise, if you are an ambitious student who is interested in open source development and working with DBpedia, you are more than welcome to either contribute your own project idea or apply for the project ideas we will offer starting in early 2020.

Stay tuned, frequently check Twitter or the DBpedia Forum to stay in touch and don’t miss your chance of becoming a crucial force in this endeavour as well as a vital member of the DBpedia community.

See you soon,

yours

DBpedia Association

More than 50 DBpedia enthusiasts joined the Community Meeting in Karlsruhe.

SEMANTiCS is THE leading European conference in the field of semantic technologies and the platform for professionals who make semantic computing work, and understand its benefits and know its limitations.

Since we at DBpedia have a long-standing partnership with SEMANTiCS, we also joined this year’s event in Karlsruhe. September 12, the last day of the conference, was dedicated to the DBpedia community.

First and foremost, we would like to thank the Institute for Applied Informatics for supporting our community and many thanks to FIZ Karlsruhe for hosting our community meeting.

Below, we give you a brief retrospective of the presentations.

Opening Session

Katja Hose – “Querying the web of data”

… on the search for the killer app.

The concept of Linked Open Data and the promise of the Web of Data have been around for over a decade now. Yet, the great potential of free access to a broad range of data that these technologies offer has not yet been fully exploited. This talk will, therefore, review the current state of the art, highlight the main challenges from a query processing perspective, and sketch potential ways to solve them. Slides are available here.

Dan Weitzner – “timbr-DBpedia – Exploration and Query of DBpedia in SQL”

The timbr SQL Semantic Knowledge Platform enables the creation of virtual knowledge graphs in SQL. The DBpedia version of timbr supports querying DBpedia in SQL and seamless integration of DBpedia data into data warehouses and data lakes. We already published a detailed blog post about timbr where you can find all relevant information about this new DBpedia service.

Showcase Session

Maribel Acosta – “A closer look at the changing dynamics of DBpedia mappings”

Her presentation looked at the mappings wiki and how different language chapters use and edit it. Slides are available here.

Mariano Rico – “Polishing a diamond: techniques and results to enhance the quality of DBpedia data”

DBpedia is more than a source for creating papers. It is also being used by companies as a remarkable data source. This talk is focused on how we can detect errors and how to improve the data, from the perspective of both academic researchers and private companies. We show the case for the Spanish DBpedia (the second DBpedia in size after the English chapter) through a set of techniques, paying attention to results and further work. Slides are available here.

Guillermo Vega-Gorgojo – “Clover Quiz: exploiting DBpedia to create a mobile trivia game”

Clover Quiz is a turn-based multiplayer trivia game for Android devices with more than 200K multiple-choice questions (in English and Spanish) about different domains, generated from DBpedia. Questions are created off-line through a data extraction pipeline and a versatile template-based mechanism. A back-end server manages the question set and the associated images, while a mobile app has been developed and released on Google Play. The game is available free of charge and has been downloaded by more than 10K users, who have answered more than 1M questions. Clover Quiz thus demonstrates the advantages of semantic technologies for collecting data and automating the generation of multiple-choice questions in a scalable way. Slides are available here.
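For readers who wonder what a template-based question mechanism can look like, the following is a small sketch. It is not the Clover Quiz implementation: the facts, the template and the function name are hypothetical, and the distractors are simply taken from the answers of other entities for the same property.

```python
import random


def make_question(facts, template, subject, rng, n_choices=4):
    """Build one multiple-choice question from subject -> answer facts."""
    correct = facts[subject]
    # Wrong options are the answers that belong to other subjects.
    pool = sorted({a for s, a in facts.items() if s != subject and a != correct})
    distractors = rng.sample(pool, min(n_choices - 1, len(pool)))
    choices = distractors + [correct]
    rng.shuffle(choices)
    return {
        "question": template.format(subject=subject),
        "choices": choices,
        "answer": correct,
    }


# Hypothetical facts, as they might be extracted for one property (capital).
capitals = {
    "Spain": "Madrid",
    "Germany": "Berlin",
    "Italy": "Rome",
    "Portugal": "Lisbon",
    "France": "Paris",
}

# A seeded generator keeps the example reproducible.
q = make_question(capitals, "What is the capital of {subject}?", "Spain", random.Random(42))
print(q["question"])
for choice in q["choices"]:
    print(" -", choice)
```

Because one template covers every entity that has the property, a handful of templates can already yield a large number of questions.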

Fabian Hoppe and Tabea Tietz – “The Return of German DBpedia”

Fabian and Tabea will present the latest news on the German DBpedia chapter as it returns to the language chapter family after an extended offline period. They will talk about the data set, discuss a few challenges along the way and give insights into future perspectives of the German chapter. Slides are available here.

Wlodzimierz Lewoniewski and Krzysztof Węcel – “References extraction from Wikipedia infoboxes”

In Wikipedia’s infoboxes, some facts have references, which can be useful for checking the reliability of the provided data. We present challenges and methods connected with the metadata extraction of Wikipedia’s sources. We used the DBpedia Extraction Framework along with our own extensions in Python to provide statistics about citations in 10 language versions. The provided methods can be used to verify and synchronize facts depending on the quality assessment of sources. Slides are available here.

Wlodzimierz gave insight into the process of extracting references from Wikipedia infoboxes, which we will use in our GFS project.
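To illustrate the kind of extraction discussed in this talk, here is a minimal Python sketch that collects the references attached to infobox parameters in raw wikitext. It does not use the DBpedia Extraction Framework or the speakers' extensions; the regular expressions, the function name and the sample infobox are assumptions, and the sketch only handles parameters written on a single line.

```python
import re

# Matches <ref>...</ref> and <ref name="x">...</ref>, but not self-closing tags.
REF_RE = re.compile(r"<ref[^>/]*>(.*?)</ref>", re.DOTALL)
# Matches an infobox parameter line of the form "| name = value".
PARAM_RE = re.compile(r"^\s*\|\s*([^=|]+?)\s*=\s*(.*)$")


def extract_infobox_refs(wikitext):
    """Return a dict mapping infobox parameter names to their references."""
    refs = {}
    for line in wikitext.splitlines():
        match = PARAM_RE.match(line)
        if not match:
            continue
        name, value = match.group(1), match.group(2)
        found = [ref.strip() for ref in REF_RE.findall(value)]
        if found:
            refs[name] = found
    return refs


# Hypothetical infobox snippet.
sample = """{{Infobox settlement
| name = Exampleville
| population = 12345<ref>{{cite web |url=https://example.org/census |title=Census}}</ref>
| area_km2 = 42
}}"""

result = extract_infobox_refs(sample)
for parameter, references in result.items():
    print(parameter, len(references))
```

Counting the references per parameter in this way is one simple route to citation statistics; a production pipeline would also need to resolve named references and multi-line values.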

Afternoon Session

Sebastian Hellmann, Johannes Frey, Marvin Hofer – “The DBpedia Databus – How to build a DBpedia for each of your Use Cases”

The DBpedia Databus is a platform that is intended for data consumers. It will enable users to build an automated DBpedia-style Knowledge Graph for any data they need. The big benefit is that users not only have access to data, but are also encouraged to apply improvements and, therefore, will enhance the data source and benefit other consumers. We want to use this session to officially introduce the Databus, which is currently in beta, and demonstrate its power as a central platform that captures decentrally created client-side value by consumers.

We will give insight into how the new monthly DBpedia releases are built and validated, so that you can copy and adapt the process for your use cases. Slides are available here.

Interactive session, moderator: Sebastian Hellmann – “DBpedia Connect & DBpedia Commerce – Discussing the new Strategy of DBpedia”

In order to keep growing and improving, DBpedia has been undergoing a growth hack for the last couple of months. As part of this process, we developed two new subdivisions of DBpedia: DBpedia Connect and DBpedia Commerce. The former is a low-code platform to interconnect your public or private databus data with the unified, global DBpedia graph and export the interconnected and enriched knowledge graph into your infrastructure. DBpedia Commerce is an access and payment platform to transform Linked Data into a networked data economy. It will allow DBpedia to offer any data, mod, application or service on the market. During this session, we will provide more insight into these as well as an overview of how DBpedia users can best utilize them. Slides are available here.

In case you missed the event, all slides and presentations are also available on our Website. Further insights, feedback and photos about the event are available on Twitter via #DBpediaDay.

We are now looking forward to more DBpedia meetings next year. So, stay tuned and check Twitter, Facebook and the Website or subscribe to our Newsletter for the latest news and information.

If you want to organize a DBpedia Community meeting yourself, just get in touch with us via dbpedia@infai.org regarding program and organization.

Yours

DBpedia Association

SEMANTiCS 2019 Interview: Katja Hose

Today’s post features an interview with our DBpedia Day keynote speaker Katja Hose, a Professor of Computer Science at Aalborg University, Denmark. In this interview, Katja talks about increasing the reliability of Knowledge Graph Access as well as her expectations for SEMANTiCS 2019.

Prior to joining Aalborg University, Katja was a postdoc at the Max Planck Institute for Informatics in Saarbrücken. She received her doctoral degree in Computer Science from Ilmenau University of Technology in Germany.

Can you tell us something about your research focus?

The most important focus of my research has been querying the Web of Data, in particular, efficient query processing over distributed knowledge graphs and Linked Data. This includes indexing, source selection, and efficient query execution. Unfortunately, it happens all too often that the services needed to access remote knowledge graphs are temporarily not available, for instance, because a software component crashed. Hence, we are currently developing a decentralized architecture for knowledge sharing that will make access to knowledge graphs a reliable service, which I believe is the key to a wider acceptance and usage of this technology.

How do you personally contribute to the advancement of semantic technologies?

I contribute by doing research, advancing the state of the art, and applying semantic technologies to practical use cases. The most important achievements so far have been our works on indexing and federated query processing, and we have only recently published our first work on a decentralized architecture for sharing and querying semantic data. I have also been using semantic technologies in other contexts, such as data warehousing, fact-checking, sustainability assessment, and rule mining over knowledge bases.

Overall, I believe the greatest ideas and advancements come when trying to apply semantic technologies to real-world use cases and problems, and that is what I will keep on doing.

Which trends and challenges do you see for linked data and the semantic web?

The goal and the idea behind Linked Data and the Semantic Web is the second-best invention after the Internet. But unlike the Internet, Linked Data and the Semantic Web are only slowly being adopted by a broader community and by industry.

I think part of the reason is that, from a company’s point of view, there are not many incentives for, or added benefits of, broadly sharing achievements. Some companies are simply reluctant to openly share their results and experiences in the hope of retaining an advantage over their competitors. I believe that if these success stories were shared more openly, and this is the trend we are witnessing right now, more companies would see the potential for their own problems and find new exciting use cases.

Another particular challenge, which we will have to overcome, is that it is currently still far too difficult to obtain and maintain an overview of what data is available and formulate a query as a non-expert in SPARQL and the particular domain… and of course, there is the challenge that accessing these datasets is not always reliable.

As artificial intelligence becomes more and more important, what is your vision of AI?

AI and machine learning are indeed becoming more and more important. I do believe that these technologies will bring us a huge step ahead. The process has already begun. But we also need to be aware that we are currently in the middle of a big hype where everybody wants to use AI and machine learning – although many people actually do not truly understand what it is and if it is actually the best solution to their problems. It reminds me a bit of the old saying “if the only tool you have is a hammer, then every problem looks like a nail”. Only time will tell us which problems truly require machine learning, and I am very curious to find out which solutions will prevail.

However, the current state of the art is still very far away from the AI systems that we all know from Science Fiction. Existing systems operate like black boxes on well-defined problems and lack true intelligence and understanding of the meaning of the data. I believe that the key to making these systems trustworthy and truly intelligent will be their ability to explain their decisions and their interpretation of the data in a transparent way.

What are your expectations about SEMANTiCS 2019 in Karlsruhe?

First and foremost, I am looking forward to meeting a broad range of people interested in semantic technologies. In particular, I would like to get in touch with industry-based research and to be exposed 

The End

We would like to thank Katja Hose for her insights and are happy to have her as one of our keynote speakers.

Visit SEMANTiCS 2019 in Karlsruhe, Sep 9-12 and get your tickets for our community meeting here. We are looking forward to meeting you during DBpedia Day.

Yours DBpedia Association

SEMANTiCS Interview: Dan Weitzner

As the upcoming 14th DBpedia Community Meeting, co-located with SEMANTiCS 2019 in Karlsruhe, Sep 9-12, is drawing nearer, we would like to take this opportunity to introduce you to our DBpedia keynote speakers.

Today’s post features an interview with Dan Weitzner from WPSemantix who talks about timbr-DBpedia, which we blogged about recently, as well as future trends and challenges of linked data and the semantic web.

Dan Weitzner is co-founder and Vice President of Research and Development of WPSemantix. He obtained his Bachelor of Science in Computer Science from Florida Atlantic University. In collaboration with DBpedia, he and his colleagues at WPSemantix launched timbr, the first SQL Semantic Knowledge Graph that integrates Wikipedia and Wikidata Knowledge into SQL engines.

Dan Weitzner

1. Can you tell us something about your research focus?

WPSemantix bridges the worlds of standard databases and the Semantic Web by creating ontologies accessible in standard SQL. 

Our platform, timbr, is a virtual knowledge graph that maps existing data sources to abstract concepts, accessible directly in all the popular Business Intelligence (BI) tools and also natively integrated into Apache Spark, R, Python, Java and Scala.

timbr enables reasoning and inference for complex analytics without the need for costly Extract-Transform-Load (ETL) processes to graph databases.

2. How do you personally contribute to the advancement of semantic technologies?

We believe we have lowered the fundamental barriers to adoption of semantic technologies for large organizations who want to benefit from knowledge graph capabilities without firstly requiring fundamental changes in their database infrastructure and secondly, without requiring expensive organizational changes or significant personnel retraining.  

Additionally, we implemented the W3C Semantic Web principles to enable inference and inheritance between concepts in SQL, and to allow seamless integration of existing ontologies from OWL. As a result, users across organizations can do complex analytics using the same tools that they currently use to access and query their databases, and can run sophisticated queries over big data without requiring highly technical expertise.

timbr-DBpedia is one example of what can be achieved with our technology. This joint effort with the DBpedia Association allows semantic SQL queries over the DBpedia knowledge graph, and the semantic integration of DBpedia knowledge into data warehouses and data lakes. Finally, timbr-DBpedia allows organizations to benefit from enriching their data with DBpedia knowledge, combining it with machine learning and/or accessing it directly from their favourite BI tools.

3. Which trends and challenges do you see for linked data and the semantic web?

Currently, the use of semantic technologies for data exploration and data integration is a significant trend followed by data-driven communities. It allows companies to leverage the relationship-rich data to find meaningful insights into their data. 

One of the big difficulties for the average developer and business intelligence analyst is the challenge to learn semantic technologies. Another one is to create ontologies that are flexible and easily maintained. We aim to solve both challenges with timbr.

4. Which application areas for semantic technologies do you perceive as most promising?

I think semantic technologies will bloom in applications that require data integration and contextualization for machine learning models.

Ontology-based integration seems very promising by enabling accurate interpretation of data from multiple sources through the explicit definition of terms and relationships – particularly in big data systems, where ontologies could bring consistency, expressivity and abstraction capabilities to the massive volumes of data.

5. As artificial intelligence becomes more and more important, what is your vision of AI?

I envision knowledge-based business intelligence and contextualized machine learning models. This will be the bedrock of cognitive computing as any analysis will be semantically enriched with human knowledge and statistical models.

This will bring analysts and data scientists to the next level of AI.

6. What are your expectations about SEMANTiCS 2019 in Karlsruhe?

I want to share our vision with the semantic community and I would also like to learn about the challenges, vision and expectations of companies and organizations dealing with semantic technologies. I will present “timbr-DBpedia – Exploration and Query of DBpedia in SQL”.

The End

Visit SEMANTiCS 2019 in Karlsruhe, Sep 9-12 and find out more about timbr-DBpedia and all the other new developments at DBpedia. Get your tickets for our community meeting here. We are looking forward to meeting you during DBpedia Day.

Yours DBpedia Association