STAY TUNED AND SIGN UP FOR THE DBPEDIA NEWSLETTER

Do you want to stay informed about upcoming DBpedia events, releases and technical developments? The DBpedia newsletter keeps you up to date and gives you a way to send us your feedback.

Four times per year we will inform the DBpedia community about meetings, new collaborations and other topics related to DBpedia. So make sure to subscribe to our NEWSLETTER and do not miss any news.

Your DBpedia Association

DBpedia @ GSoC 2017 – Call for students

DBpedia will participate in the Google Summer of Code program (GSoC) for the fifth time, and we are now looking for students who want to share their ideas with us. We regularly grow our community through GSoC and can offer you more and more opportunities. We are excited about our new ideas and hope you will be too!

What is GSoC?

Google Summer of Code is a global program focused on bringing more student developers into open source software development. Funds will be given to students (BSc, MSc, PhD) to work for three months on a specific task. First, open source organizations announce their student projects; then students contact the mentor organizations they want to work with and write up a project proposal for the summer. After a selection phase, students are matched with a specific project and a set of mentors to work on the project during the summer.

If you are a GSoC student who wants to apply to our organization, please check our guideline here: http://wiki.dbpedia.org/gsoc2017

Here you can see the Google Summer of Code 2017 timeline:

March 20th, 2017 Student applications open (Students can register and submit their applications to mentor organizations.)
April 3rd, 2017 Student application deadline
May 4th, 2017 Accepted students are announced and paired with a mentor.
May 30th, 2017 Coding officially begins!
August 21st, 2017 Final week: Students submit their final work product and their final mentor evaluation
September 6th, 2017 Final results of Google Summer of Code 2017 announced

Check our website for further updates, follow us on #twitter or subscribe to our newsletter.

We are looking forward to your input.

Your DBpedia Association

DBpedia strategy survey

Dear DBpedians,

Sören Auer and the DBpedia Board members prepared a survey to assess the direction of the DBpedia Association. We would like to know what you think should be our priorities and how you would like the funds of the association to be used.

Your opinion counts so please contribute actively in developing a better DBpedia. If you use DBpedia and want us to keep going forward, we kindly invite you to vote here: https://goo.gl/forms/rDqLcwL823Ok09Uw2

We will publish the results in anonymized, aggregated form on the DBpedia website.

We are looking forward to your input. Check our website for further updates, follow us on #twitter or subscribe to our newsletter.

Your DBpedia Association

DBpedia @ GSoC 2017 – Call for ideas & mentors

Dear DBpedians,

As in previous years, we would like your input on DBpedia-related project ideas for GSoC 2017.

For those who are unfamiliar with GSoC (Google Summer of Code), Google pays students (BSc, MSc, PhD) to work for 3 months on an open source project. Open source organizations announce their student projects and students apply for projects they like. After a selection phase, students are matched with a specific project and a set of mentors to work on the project during the summer.

Here you can see the Google Summer of Code 2017 timeline: https://developers.google.com/open-source/gsoc/timeline

or please check: http://wiki.dbpedia.org/gsoc2016

If you have a cool idea for DBpedia or want to co-mentor an existing cool idea, go here. (All mentors get a free Google T-shirt and the chance to visit Google HQ in November.)

DBpedia applied for the fifth time to participate in the Google Summer of Code program. Here you will find a list of all projects and students from GSoC 2016: http://blog.dbpedia.org/2016/04/26/dbpedia-google-summer-of-code-2016/

Check our website for further updates, follow us on #twitter or subscribe to our newsletter.

Looking forward to your input.

Your DBpedia Association

DBpedia in Dutch: formalizing the chapter by signing the Memorandum of Understanding

The DBpedia community and members from over 20 countries work hard to localize and internationalize DBpedia and support the extraction of non-English Wikipedia editions as well as build a data community around a certain language, region or special interest. The chapters are part of the DBpedia executives and have taken on responsibility to contribute to the infrastructure of DBpedia.

We proudly announce that DBpedia in Dutch is the first chapter to sign the Memorandum of Understanding (MoU). They had several reasons for signing the MoU: first, they support the goals of the DBpedia Association; second, they strengthen their own chapter and community of contributors; and third, they improve the cooperation with the Dutch research infrastructure and the Dutch Digital Heritage. The cooperation was initiated by Koninklijke Bibliotheek (National Library of the Netherlands) and Huygens ING (research institute of History and Culture).

Dr. E.J.B. Lily Knibbeler (director of KB) and Prof. Dr. Lex Heerma van Voss (director of Huygens ING) signing the MoU on 12th September 2016 in The Hague.

Other partners such as imec/Ghent University and the Institute of Sound and Vision have signed as well and became executive partners of the DBpedia Association. The Vrije Universiteit will join soon. It is a cooperation between these Dutch organizations and the NL-DBpedia community.

The Dutch Chapter has provided a Sample DBpedia Chapter Memorandum of Understanding (MoU) to use as a template for further chapters. If you use DBpedia and want us to keep going forward, we kindly invite you to donate and help DBpedia to grow. If you would like to become a member of the DBpedia Association, please go directly to the application form or contact us.

Check our website for further updates, stay tuned and follow us on Twitter.

Your DBpedia Association

DBpedia meetup in Poznan

After our successful meeting in Poznan in 2015, we thought it was time to meet the Polish DBpedia community again. The DBpedia meetup will be held on the 22nd of November 2016 at the Poznań University of Economics and Business. The meetup aims to present semantic web technologies and their use in applications by entrepreneurs.

*Quick facts*

The schedule for the DBpedia meetup in Poznan is included in the eventbrite page.

Get your ticket here and be part of this event.

Check our website for further updates, stay tuned and follow us on Twitter.

Your DBpedia Association

Retrospective: 2nd US DBpedia Community meeting in California

After the largest DBpedia meeting to date we decided it was time to cross the Atlantic for the second time for another meetup. Two weeks ago the 8th DBpedia Community Meeting was held in Sunnyvale, California on October 27th 2016.

Main Event

Pablo Mendes from Lattice Data Inc. opened the main event with a short introduction setting the tone for the evening. After that Dimitris Kontokostas gave technical and organizational DBpedia updates. The main event attracted attendees with lightning talks from major companies actively using DBpedia or interested in knowledge graphs in general.

Four major institutions described their efforts to organize reusable information in a centralized knowledge representation. Google’s Tatiana Libman presented (on behalf of Denny Vrandečić) the impressive scale of the Google Knowledge graph, with 1B+ entities and over 100 billion facts.

Tatiana Libman from Google

Yahoo’s Nicolas Torzec presented the Yahoo knowledge graph, with focus on their research on extracting data from Web tables to expand their knowledge which includes DBpedia as an important part. Qi He from LinkedIn focused mostly on how to model a knowledge graph of people and skills, which becomes particularly interesting with the possibility of integration with Microsoft’s Satori Graph. Such an integration would allow general domain knowledge and very specific knowledge about professionals complementing one another. Stas Malyshev from Wikidata presented statistics on their growth, points of contact with DBpedia as well as an impressive SPARQL query interface that can be used to query the structured data that they are generating.

Three other speakers focused on the impact of DBpedia in machine learning and natural language processing. Daniel Gruhl from IBM Watson gave the talk “Truth for the impatient” where he showed that a knowledge model built from DBpedia can help reduce costs and time to value when extracting entity mentions with higher accuracy. Pablo Mendes from Lattice Data Inc. presented their approach that leverages DBpedia and other structured information sources for weak supervision to obtain very strong NLP extractors. Sujan Perera from IBM Watson discussed the problem of identifying implicit mentions of entities in tweets and how the knowledge represented in DBpedia can be used to help uncover those references.

Another three speakers focused on applications of DBpedia and knowledge graphs. Margaret Warren from Metadata Authoring Systems, LLC presented ImageSnippets and how background knowledge from DBpedia allows better multimedia search through inference. For instance, by searching for “birds” you may find pictures that haven’t been explicitly tagged as birds but for which the fact can be inferred from DBpedia. Jans Aasman from Franz Inc presented their company’s approach to Data Exploration with Visual SPARQL Queries. They described opportunities for graph analytics in the medical domain, and discussed how DBpedia has been useful in their applications. Finally, Wang-Chiew Tan presented their research at RIT relating to building chatbots, among other projects that relate to using background knowledge stored in computers to enrich real life experiences.

Nicolas Torzec from Yahoo

Overall the talks were of very high quality and fostered plenty of discussion afterwards. We concluded the event with a round of introductions where every attendee got to say their name and affiliation to help them connect with one another throughout the final hour of the event.

All slides and presentations are also available on our Website and you will find more feedback and photos about the event on Twitter via #DBpediaCA.

We would like to thank Yahoo for hosting the event, the Google Summer of Code 2016 mentor summit for being the reason we were in the area and could co-locate the DBpedia meeting, the Institute for Applied Informatics for supporting the DBpedia Association, ALIGNED – Software and Data Engineering for funding the development of DBpedia as a project use-case and, last but not least, OpenLink Software for continuously hosting the main DBpedia Endpoint.

Many thanks to Pablo Mendes for writing this blogpost :)

So now, we are looking forward to the next DBpedia community meeting which will be held in Europe again. We will keep you informed via the DBpedia Website and Blog.

Your DBpedia Association

California is calling for the next DBpedia Community Meeting.

Less than 24 hours left to reserve your seat for our 2nd US DBpedia Community meeting. The meeting will be held in Sunnyvale on October 27th 2016, hosted by Yahoo. Over 85 participants have registered so far, and we will offer 20 more tickets. So come and get your ticket to be part of this event.

The event will feature talks from Yahoo, IBM Watson, LinkedIn, Lattice, Wikimedia, Franz Inc, Knoesis, RIT and ImageSnippets. The topics will include knowledge graphs & machine learning, open data, open source and startups. Please read below on different ways you can participate. We are looking forward to meeting again in person with the US-based DBpedia community.

Quick facts

Schedule

Please check our schedule for the next DBpedia Community Meeting here: http://wiki.dbpedia.org/meetings/California2016

Acknowledgments

If you would like to become a sponsor for the 8th DBpedia Meeting, please contact the DBpedia Association.

Yahoo!: For hosting the meeting and the catering
Google Summer of Code 2016: Amazing program and the reason some of our core DBpedia devs are visiting California
ALIGNED – Software and Data Engineering: For funding the development of DBpedia as a project use-case and covering part of the travel cost
Institute for Applied Informatics: For supporting the DBpedia Association
OpenLink Software: For continuous hosting of the main DBpedia Endpoint

Organisation

Registration

Attending the DBpedia Community meeting is free of charge, but seats are limited. Make sure to register to reserve a seat.

Location

The meeting will take place at the Yahoo headquarters in Sunnyvale.

Address: Yahoo! (Building E, 701 First Avenue, Sunnyvale, CA)

Many thanks to Yahoo & Nicolas Torzec for providing a bigger room and hosting the event!

Check our website for further updates and like us on Facebook.

Your DBpedia Association

YEAH! We did it again ;) – New 2016-04 DBpedia release

Hereby we announce the release of DBpedia 2016-04. The new release is based on updated Wikipedia dumps dating from March/April 2016 featuring a significantly expanded base of information as well as richer and (hopefully) cleaner data based on the DBpedia ontology.

You can download the new DBpedia datasets in a variety of RDF-document formats from: http://wiki.dbpedia.org/downloads-2016-04 or directly here: http://downloads.dbpedia.org/2016-04/

Support DBpedia

During the latest DBpedia meeting in Leipzig we discussed ways to support DBpedia and what benefits this support would bring. For the next two months, we are aiming to raise money to support the hosting of the main services and the next DBpedia release (especially to shorten release intervals). On top of that we need to buy a new server to host DBpedia Spotlight, which has so far been generously hosted by third parties. If you use DBpedia and want us to keep going forward, we kindly invite you to donate here or become a member of the DBpedia Association.

Statistics

The English version of the DBpedia knowledge base currently describes 6.0M entities of which 4.6M have abstracts, 1.53M have geo coordinates and 1.6M depictions. In total, 5.2M resources are classified in a consistent ontology, consisting of 1.5M persons, 810K places (including 505K populated places), 490K works (including 135K music albums, 106K films and 20K video games), 275K organizations (including 67K companies and 53K educational institutions), 301K species and 5K diseases. The total number of resources in English DBpedia is 16.9M that, besides the 6.0M resources, includes 1.7M skos concepts (categories), 7.3M redirect pages, 260K disambiguation pages and 1.7M intermediate nodes.

Altogether the DBpedia 2016-04 release consists of 9.5 billion (2015-10: 8.8 billion) pieces of information (RDF triples) out of which 1.3 billion (2015-10: 1.1 billion) were extracted from the English edition of Wikipedia, 5.0 billion (2015-10: 4.4 billion) were extracted from other language editions and 3.2 billion (2015-10: 3.2 billion) from DBpedia Commons and Wikidata. In general, we observed a growth in mapping-based statements of about 2%.
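
As a quick sanity check, the per-source figures quoted above can be added up and compared with the stated totals. The short Python sketch below does this; the variable names are illustrative only, and counts are held as integers in units of 0.1 billion triples so that the arithmetic stays exact.

```python
# Triple counts quoted in the statistics above, in units of 0.1 billion
# triples (integers avoid floating-point rounding noise).
release_2016_04 = {
    "English Wikipedia": 13,             # 1.3 billion
    "other language editions": 50,       # 5.0 billion
    "DBpedia Commons and Wikidata": 32,  # 3.2 billion
}
stated_total_2016_04 = 95  # 9.5 billion
stated_total_2015_10 = 88  # 8.8 billion

# Sum of the per-source counts for the new release.
computed_total = sum(release_2016_04.values())

# Relative growth of the stated totals between the two releases, in percent.
growth_percent = 100 * (stated_total_2016_04 - stated_total_2015_10) / stated_total_2015_10

print(f"computed total: {computed_total / 10:.1f} billion")
print(f"stated total:   {stated_total_2016_04 / 10:.1f} billion")
print(f"growth over 2015-10: {growth_percent:.1f}%")
```

The three per-source figures add up to the stated 9.5 billion, and the overall growth over the previous release comes out at roughly 8%.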

Thorough statistics can be found on the DBpedia website and general information on the DBpedia datasets here.

Community

The DBpedia community added new classes and properties to the DBpedia ontology via the mappings wiki. The DBpedia 2016-04 ontology encompasses:

  • 754 classes (DBpedia 2015-10: 739)
  • 1,103 object properties (DBpedia 2015-10: 1,099)
  • 1,608 datatype properties (DBpedia 2015-10: 1,596)
  • 132 specialized datatype properties (DBpedia 2015-10: 132)
  • 410 owl:equivalentClass and 221 owl:equivalentProperty mappings to external vocabularies (DBpedia 2015-04: 407 – 221)

The editor community of the mappings wiki also defined many new mappings from Wikipedia templates to DBpedia classes. For the DBpedia 2016-04 extraction, we used a total of 5800 template mappings (DBpedia 2015-10: 5553 mappings). For the second time the top language, gauged by the number of mappings, is Dutch (646 mappings), followed by the English community (604 mappings).

(Breaking) Changes

  • In addition to the datasets normalized to English DBpedia (en-uris), we now provide normalized datasets based on the DBpedia Wikidata (DBw) datasets (wkd-uris). These sorted datasets will be the foundation for the upcoming fusion process with Wikidata. From the following releases on, the DBw-based URIs will be the only ones provided.
  • We now filter out triples from the Raw Infobox Extractor that are already mapped. E.g. no more “<x> dbo:birthPlace <z>” and “<x> dbp:birthPlace|dbp:placeOfBirth|… <z>” in the same resource. These triples are now moved to the “infobox-properties-mapped” datasets and not loaded on the main endpoint. See issue 22 for more details.
  • Major improvements in our citation extraction. See here for more details.
  • We incorporated the statistical distribution approach of Heiko Paulheim to create type statements automatically and provide them as an additional dataset (instance_types_sdtyped_dbo).
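
The filtering of already-mapped raw infobox triples described in the list above can be pictured roughly as follows. This is a minimal sketch and not the extraction framework's actual code: the `RAW_TO_MAPPED` table, the function name and the sample triples are made up for illustration, and the announcement does not say whether the real filter compares whole triples or only properties (the sketch compares whole triples).

```python
# Hypothetical correspondence between raw infobox properties (dbp:) and the
# ontology property (dbo:) they are mapped to. Illustrative only; the real
# correspondence comes from the mappings wiki.
RAW_TO_MAPPED = {
    "dbp:birthPlace": "dbo:birthPlace",
    "dbp:placeOfBirth": "dbo:birthPlace",
}


def split_raw_triples(raw_triples, mapped_triples):
    """Separate raw triples that duplicate a mapped triple from the rest.

    Returns (kept, already_mapped). The second list plays the role of the
    "infobox-properties-mapped" dataset mentioned above.
    """
    mapped = set(mapped_triples)
    kept, already_mapped = [], []
    for subject, prop, obj in raw_triples:
        mapped_prop = RAW_TO_MAPPED.get(prop)
        if mapped_prop is not None and (subject, mapped_prop, obj) in mapped:
            already_mapped.append((subject, prop, obj))
        else:
            kept.append((subject, prop, obj))
    return kept, already_mapped


mapped = [("dbr:X", "dbo:birthPlace", "dbr:Z")]
raw = [
    ("dbr:X", "dbp:birthPlace", "dbr:Z"),    # duplicates the mapped triple
    ("dbr:X", "dbp:placeOfBirth", "dbr:Z"),  # same fact under another raw name
    ("dbr:X", "dbp:nickname", "Zed"),        # no mapped counterpart, so kept
]
kept, already_mapped = split_raw_triples(raw, mapped)
print(kept)
print(already_mapped)
```

In this toy input only the nickname triple stays in the raw dataset; the two birth-place variants are moved aside because the same fact is already present as a mapped triple.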

In case you missed it, what we changed in the previous release (2015-10):

  • English DBpedia switched to IRIs. This can be a breaking change to some applications that need to change their stored DBpedia resource URIs / links. We provide the “uri-same-as-iri” dataset for English to ease the transition.
  • The instance-types dataset is now split into two files: instance-types (containing only direct types) and instance-types-transitive containing the transitive types of a resource based on the DBpedia ontology
  • The mappingbased-properties file is now split into three (3) files:
    • “geo-coordinates-mappingbased”, which contains the coordinates originating from the mappings wiki. The “geo-coordinates” dataset continues to provide the coordinates originating from the GeoExtractor
    • “mappingbased-literals”, which contains mapping-based facts with literal values
    • “mappingbased-objects”, which contains mapping-based facts with object values
    • the “mappingbased-objects-disjoint-[domain|range]” are facts that are filtered out from the “mappingbased-objects” datasets as errors but are still provided
  • We added a new extractor for citation data that provides two files:
    • citation links: linking resources to citations
    • citation data: trying to get additional data from citations. This is a quite interesting dataset but we need help to clean it up
  • All datasets are available in .ttl and .tql serialization (the nt and nq datasets were dropped for reasons of redundancy and server capacity).
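
To illustrate the split of the instance-types dataset into direct and transitive types described in the list above, here is a small Python sketch. It is an assumption-laden illustration rather than the extraction framework's code: the class hierarchy fragment, the resource name and the function name are invented for the example, each class is assumed to have a single parent, and the transitive list is assumed to exclude the direct type itself.

```python
# Illustrative fragment of a class hierarchy, as child -> parent.
# Assumed for this example; consult the DBpedia ontology for the real one.
SUBCLASS_OF = {
    "dbo:MusicalArtist": "dbo:Artist",
    "dbo:Artist": "dbo:Person",
    "dbo:Person": "dbo:Agent",
    "dbo:Agent": "owl:Thing",
}


def transitive_types(direct_type, hierarchy=SUBCLASS_OF):
    """Return the ancestors of direct_type, nearest first.

    The direct type itself is not included in the result.
    """
    ancestors = []
    seen = {direct_type}  # guards against accidental cycles
    current = hierarchy.get(direct_type)
    while current is not None and current not in seen:
        ancestors.append(current)
        seen.add(current)
        current = hierarchy.get(current)
    return ancestors


# "instance-types": one direct type per resource (hypothetical resource name).
direct = {"dbr:Some_Musician": "dbo:MusicalArtist"}

# "instance-types-transitive": everything implied by the class hierarchy.
transitive = {res: transitive_types(t) for res, t in direct.items()}
print(transitive)
```

An application that only needs the most specific class can load the direct file alone, while one that queries by a broad class such as dbo:Person would also need the transitive file or would have to compute the closure itself.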

Upcoming Changes

  • Dataset normalization: We are going to normalize datasets based on Wikidata URIs and no longer on the English language edition, as a prerequisite to finally start the fusion process with Wikidata.
  • RML Integration: Wouter Maroy has already provided the necessary groundwork on Github for switching the mappings wiki to an RML-based approach. We are not there yet but this is at the top of our list of changes.
  • Starting with the next release we are adding datasets with NIF annotations of the abstracts (as we already provided those for the 2015-04 release). We will eventually extend the NIF annotation dataset to cover the whole Wikipedia article of a resource.

New Datasets

  • SDTypes: We extended the coverage of the automatically created type statements (instance_types_sdtyped_dbo) to English, German and Dutch (see above).
  • Extensions: In the extension folder (2016-04/ext) we provide two new datasets, both are to be considered in an experimental state:
    • DBpedia World Facts: This dataset is authored by the DBpedia Association itself. It lists all countries, all currencies in use and (most) languages spoken in the world as well as how these concepts relate to each other (spoken in, primary language etc.) and useful properties like ISO codes (ontology diagram). This dataset extends the very useful LEXVO dataset with facts from DBpedia and the CIA Factbook. Please report any errors or suggestions regarding this dataset to Markus.
    • Lector Facts: This experimental dataset was provided by Matteo Cannaviccio and demonstrates his approach to generating facts by using common sequences of words (i.e. phrases) that are frequently used to describe instances of binary relations in a text. We are looking into using this approach as a regular extraction step. It would be helpful to get some feedback from you.
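
The idea behind phrase-based fact generation, as described for the Lector Facts dataset above, can be sketched in a few lines of Python. This is only a toy illustration under strong assumptions and not Matteo Cannaviccio's implementation: sentences are assumed to be already reduced to (subject, phrase, object) triples, and the function names, the threshold and all sample data are invented.

```python
from collections import Counter


def learn_phrases(sentences, known_facts, min_count=2):
    """Map phrases to relations they frequently express.

    sentences: iterable of (subject, phrase, object) tuples.
    known_facts: dict mapping (subject, object) to a relation name.
    A phrase is kept if it co-occurs with a relation at least min_count times;
    if it qualifies for several relations, the most frequent one wins.
    """
    counts = Counter()
    for subj, phrase, obj in sentences:
        relation = known_facts.get((subj, obj))
        if relation is not None:
            counts[(phrase, relation)] += 1
    learned = {}
    for (phrase, relation), n in counts.most_common():
        if n >= min_count:
            learned.setdefault(phrase, relation)
    return learned


def propose_facts(sentences, known_facts, phrase_to_relation):
    """Propose new facts for entity pairs that have no known relation yet."""
    proposals = []
    for subj, phrase, obj in sentences:
        relation = phrase_to_relation.get(phrase)
        if relation is not None and (subj, obj) not in known_facts:
            proposals.append((subj, relation, obj))
    return proposals


sentences = [
    ("A", "was born in", "P1"),
    ("B", "was born in", "P2"),
    ("C", "was born in", "P3"),
    ("C", "visited", "P1"),
]
known = {("A", "P1"): "birthPlace", ("B", "P2"): "birthPlace"}

learned = learn_phrases(sentences, known)
proposed = propose_facts(sentences, known, learned)
print(learned)
print(proposed)
```

Here the phrase "was born in" is seen twice with a known birthPlace fact, so it is applied to the third pair, for which no fact was known. A real system would need far more careful phrase extraction and thresholds to keep the precision acceptable.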

Credits

Lots of thanks to

  • Markus Freudenberg (University of Leipzig / DBpedia Association) for taking over the whole release process and creating the revamped download & statistics pages.
  • Dimitris Kontokostas (University of Leipzig / DBpedia Association) for conveying his considerable knowledge of the extraction and release process.
  • All editors that contributed to the DBpedia ontology mappings via the Mappings Wiki.
  • The whole DBpedia Internationalization Committee for pushing the DBpedia internationalization forward.
  • Heiko Paulheim (University of Mannheim) for providing the necessary code for his algorithm to generate additional type statements for formerly untyped resources and to identify and remove wrong statements. This is now part of the DIEF.
  • Václav Zeman, Thomas Klieger and the whole LHD team (University of Prague) for their contribution of additional DBpedia types
  • Marco Fossati (FBK) for contributing the DBTax types
  • Alan Meehan (TCD) for performing a big external link cleanup
  • Aldo Gangemi (LIPN University, France & ISTC-CNR, Italy) for providing the links from DOLCE to DBpedia ontology.
  • Kingsley Idehen, Patrick van Kleef, and Mitko Iliev (all OpenLink Software) for loading the new data set into the Virtuoso instance that provides 5-Star Linked Open Data publication and SPARQL Query Services.
  • OpenLink Software (http://www.openlinksw.com/) collectively for providing the SPARQL Query Services and Linked Open Data publishing infrastructure for DBpedia in addition to their continuous infrastructure support.
  • Ruben Verborgh from Ghent University – iMinds for publishing the dataset as Triple Pattern Fragments, and iMinds for sponsoring DBpedia’s Triple Pattern Fragments server.
  • Ali Ismayilov (University of Bonn) for extending the DBpedia Wikidata dataset.
  • Vladimir Alexiev (Ontotext) for leading a successful mapping and ontology clean up effort.
  • All the GSoC students and mentors who directly or indirectly influenced the DBpedia release
  • Special thanks to members of the DBpedia Association, the AKSW and the department for Business Information Systems of the University of Leipzig.

The work on the DBpedia 2016-04 release was financially supported by the European Commission through the project ALIGNED – quality-centric, software and data engineering (http://aligned-project.eu/). More information about DBpedia can be found at http://dbpedia.org as well as in the new overview article about the project available at http://wiki.dbpedia.org/Publications.

Have fun with the new DBpedia 2016-04 release!

For more information about DBpedia, please visit our website or follow us on facebook!
Your DBpedia Association

Call For Participation: 8th DBpedia Community Meeting in California

Very shortly after the largest DBpedia meeting to date we are crossing the Atlantic for the second time. We are happy to announce that the 8th DBpedia Community Meeting will be held in Sunnyvale on October 27th 2016, hosted by Yahoo.

The event will feature talks from Yahoo, IBM Watson, LinkedIn and Lattice amongst others. The topics will include knowledge graphs & machine learning, open data, open source and startups.

Please read below on different ways you can participate. We are looking forward to meeting again in person with the US-based DBpedia community.

Quick facts

Acknowledgments

If you would like to become a sponsor for the 8th DBpedia Meeting, please contact the DBpedia Association.

Yahoo!: For hosting the meeting and the catering
Google Summer of Code 2016: Amazing program and the reason some of our core DBpedia devs are visiting California
ALIGNED – Software and Data Engineering: For funding the development of DBpedia as a project use-case and covering part of the travel cost
Institute for Applied Informatics: For supporting the DBpedia Association
OpenLink Software: For continuous hosting of the main DBpedia Endpoint

Organisation

Registration

Attending the DBpedia Community meeting is free of charge, but seats are limited. Make sure to register to reserve a seat.

Call for Contribution

Please submit your proposal through our form. Contribution proposals may include (but are not limited to) presentations, demos, lightning talks, panels and session suggestions. We intend to accept as many proposals as possible in the available meeting time.

Location

The meeting will take place at the Yahoo headquarters in Sunnyvale. Address: Yahoo! (Building E, 701 First Avenue, Sunnyvale, CA)

Your DBpedia Association