All posts by Sandra Praetor

Smart Minds Wanted

New Internship Opportunity @

In conjunction with Springer Nature, DBpedia offers a three-month internship at Springer Nature in London, UK, and at DBpedia in Leipzig, Germany.

Internship Details

Position DBpedia Intern
Main Employer DBpedia Association
Deadline June 30th, 2017
Duration 3 months/full-time; the internship will start in the second half of 2017
Location 50% in London (UK) and 50% in Leipzig (GER)
Type of students desired Undergraduate, Graduate (Junior role)
Compensation You will receive a stipend of €1,300 per month plus reimbursement of your travel and visa costs (up to €1,000 in total)

The student intern will be responsible for assisting with mappings for DBpedia at Springer Nature. Your tasks include, but are not restricted to, improving the quality of the extraction mechanism that maps DBpedia scholarly references/Wikipedia citations to Springer Nature URIs, and text mining of DBpedia entities from Springer Nature publication content.

Did we spark your interest? Check our website for further information or apply directly via our online application form.

We are looking forward to meeting all the whiz kids out there.

Your

DBpedia Association

GSoC 2017 – may the code be with you

GSoC students have finally been selected.

We are very excited to announce this year’s final students for our projects in the Google Summer of Code (GSoC) program.

Google Summer of Code is a global program focused on bringing more student developers into open source software development. Stipends are awarded to students to work on a specific DBpedia related project together with a set of dedicated mentors during summer 2017 for the duration of three months.

For the past five years, DBpedia has been a vital part of the GSoC program. Since our very first participation, many DBpedia projects have been successfully completed.

In this year’s GSoC edition, DBpedia received more than 20 submissions for selected DBpedia projects. Our mentors read many promising proposals, evaluated them, and the crème de la crème of students snatched a spot for this summer. In the end, seven students from around the world were selected and will work together with their assigned mentors on their projects. DBpedia developers and mentors are really excited about these seven promising student projects.

List of students and projects:

Do you want to read more about their specific projects? Just click below… or check the GSoC pages for details.

Ismael Rodriguez – Project Description: Although the DBpedia Extraction Framework was adapted to support RML mappings thanks to a project from last year’s GSoC, the user interface for creating mappings is still provided by a MediaWiki installation, which does not support RML mappings and requires Semantic Web expertise. The goal of the project is to create a front-end application with a user-friendly interface so the DBpedia community can easily view, create and administrate DBpedia mapping rules using RML. Moreover, it should also facilitate data transformations and overall DBpedia dataset generation. Mentors: Anastasia Dimou, Dimitris Kontokostas, Wouter Maroy

Ram Ganesan Athreya – Project Description: The requirement of the project is to build a conversational chatbot for DBpedia which would be deployed in at least two social networks. There are three main challenges in this task: first, understanding the query presented by the user; second, fetching relevant information based on the query through DBpedia; and finally, tailoring the responses to the standards of each platform and developing subsequent user interactions with the chatbot. Based on my understanding, the process of understanding the query would be undertaken by one of the mentioned QA systems (HAWK, QANARY, openQA). Based on the response from these systems, we need to query the DBpedia dataset using SPARQL and present the data back to the user in a meaningful way. Ideally, both the presentation and interaction flow need to be tailored for the individual social network. I would like to stress that although the primary medium of interaction is text, platforms such as Facebook insist that a proper mix of chat and interactive elements such as images, buttons etc. would lead to better user engagement. So I would like to incorporate these elements as part of my proposal.

Mentor: Ricardo Usbeck

 

Nausheen Fatma – Project Description: Knowledge base embeddings have been an active area of research. In recent years a lot of research work, such as TransE, TransR, RESCAL, SSP, etc., has been done to obtain knowledge base embeddings. However, none of these approaches have used DBpedia to validate their approach. In this project, I want to achieve the following tasks: i) Run the existing techniques for KB embeddings on standard datasets. ii) Create an equivalent standard dataset from DBpedia for evaluations. iii) Evaluate across domains. iv) Compare and analyse the performance and consistency of the various approaches on the DBpedia dataset along with the other standard datasets. v) Report any challenges that may come up while implementing the approaches for DBpedia. Along the way, I would also try my best to come up with a new research approach for the problem.

Mentors: Sandro Athaide Coelho, Tommaso Soru

 

Akshay Jagatap – Project Description: The project aims at defining embeddings to represent classes, instances and properties. Such a model tries to quantify semantic similarity as a measure of distance in the vector space of the embeddings. I believe this can be done by implementing Random Vector Accumulators with additional features in order to better encode the semantic information held by the Wikipedia corpus and DBpedia graphs.

Mentors: Pablo Mendes, Sandro Athaide Coelho, Tommaso Soru

 

Luca Virgili – Project Description: In Wikipedia a lot of data is hidden in tables. What we want to do is read all the tables in a page correctly. First of all, we need a tool that allows us to capture the tables represented in a Wikipedia page. After that, we have to understand what we have read. Both these operations seem easy, but many problems can arise. The main issue we have to solve stems from how people build tables: everyone has a particular style for representing information, so in some tables we can read something that doesn’t appear in another structure. In this project I propose to improve last year’s project and to create a general way of reading data from Wikipedia tables. I want to revise the parser for Wikipedia pages to support as many types of tables as possible. Furthermore, I’d like to build an algorithm that compares the columns’ elements (read previously by the parser) to an ontology, so it can work out how the user wrote the information. In this way we need to define only a few mapping rules, and we can build more generalized software.

Mentors: Emanuele Storti, Domenico Potena

 

Shashank Motepalli – Project Description: DBpedia tries to extract structured information from Wikipedia and make it available on the Web. In this way, the DBpedia project develops a gigantic source of knowledge. However, the current system for building the DBpedia Ontology relies on infobox extraction. Infoboxes, being human-curated, limit the coverage of DBpedia, either due to the lack of infoboxes on some pages or to over-specific or very general taxonomies. These factors have motivated the need for DBTax. DBTax follows an unsupervised approach to learning a taxonomy from the Wikipedia category system. It applies several interdisciplinary NLP techniques to assign types to DBpedia entities. The primary goal of the project is to streamline and improve the proposed approach, making it easy to run on a new DBpedia release, and to extend DBTax taxonomy learning to other Wikipedia languages.

Mentors: Marco Fossati, Dimitris Kontokostas

 

Krishanu Konar – Project Description: Wikipedia, being the world’s largest encyclopedia, has a humongous amount of information present in the form of text. While key facts and figures are encapsulated in a resource’s infobox, and some detailed statistics are present in the form of tables, there is also a lot of data present in the form of lists, which are quite unstructured and hence difficult to turn into semantic relationships. The project focuses on the extraction of relevant but hidden data that lies inside lists in Wikipedia pages. The main objective of the project is to create a tool that can extract information from Wikipedia lists and form appropriate RDF triples that can be inserted into the DBpedia dataset.

Mentor: Marco Fossati 


Congrats to all selected students! We will keep our fingers crossed now and patiently wait until early September, when final project results are published.

An encouraging note to the less successful students.

The competition for GSoC slots is always at a very high level, and DBpedia only has a limited number of slots available for students. In case you weren’t among the selected, do not give up on DBpedia just yet. There are plenty of opportunities to prove your abilities and be part of the DBpedia experience. You, above all, know DBpedia by heart. Hence, contributing to our support system is not only a great way to be part of the DBpedia community but also an opportunity to be vital to DBpedia’s development. Moreover, it is a chance for current DBpedia mentors to get to know you better. It will give your future mentors a chance to support you and help you develop your ideas from the very beginning.

Go on, you smart brains, dare to become a top DBpedia expert and provide good support for other DBpedia users. Sign up to our support page or check out the following ways to contribute:

Get involved:
  • Join our DBpedia-discussion mailing list, where we discuss current DBpedia developments. NOTE: mails announcing tools or calls for papers unrelated to DBpedia are not allowed. This is a community discussion list.
  • If you would like to join DBpedia developer and technical discussions, sign up on Slack
  • Developer Discussion
  • Become a DBpedia Student and sign up for free at the DBpedia Association. We offer special programs that provide training and other opportunities to learn about DBpedia and extend your Semantic Web and programming skills

We are looking forward to working with you!

You don’t have enough of DBpedia yet? Stay tuned and join us on facebook, twitter or subscribe to our newsletter for the latest news!

 

Have a great weekend!

Your

DBpedia Association

We proudly present our new 2015-10 DBpedia release, which is available now via http://dbpedia.org/sparql. Go and check it out!

This DBpedia release is based on updated Wikipedia dumps dating from October 2015 featuring a significantly expanded base of information as well as richer and cleaner data based on the DBpedia ontology.
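The endpoint above speaks standard SPARQL over HTTP. As a minimal sketch using only Python’s standard library (the query itself is illustrative and assumes the endpoint’s predefined `dbo:` prefix), this is one way to build a request against the release endpoint:

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "http://dbpedia.org/sparql"

def build_request(query: str) -> urllib.request.Request:
    """Build an HTTP GET request for a SPARQL query, asking for JSON results."""
    params = urllib.parse.urlencode(
        {"query": query, "format": "application/sparql-results+json"}
    )
    return urllib.request.Request(f"{ENDPOINT}?{params}")

# Example: fetch a few English city abstracts from the 2015-10 release.
query = """
SELECT ?city ?abstract WHERE {
  ?city a dbo:City ; dbo:abstract ?abstract .
  FILTER (lang(?abstract) = 'en')
} LIMIT 5
"""

req = build_request(query)
# Uncomment to actually run the query against the live endpoint:
# with urllib.request.urlopen(req) as resp:
#     for row in json.loads(resp.read())["results"]["bindings"]:
#         print(row["city"]["value"])
```

The network call is left commented out so the sketch stays self-contained; any SPARQL client library would work equally well.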

So, what did we do?

The DBpedia community added new classes and properties to the DBpedia ontology via the mappings wiki. The DBpedia 2015-10 ontology encompasses

  • 739 classes (DBpedia 2015-04: 735)
  • 1,099 properties with reference values (a/k/a object properties) (DBpedia 2015-04: 1,098)
  • 1,596 properties with typed literal values (a/k/a datatype properties) (DBpedia 2015-04: 1,583)
  • 132 specialized datatype properties (DBpedia 2015-04: 132)
  • 407 owl:equivalentClass and 222 owl:equivalentProperty mappings to external vocabularies (DBpedia 2015-04: 408 and 200, respectively)

The editor community of the mappings wiki also defined many new mappings from Wikipedia templates to DBpedia classes. For the DBpedia 2015-10 extraction, we used a total of 5,553 template mappings (DBpedia 2015-04: 4,317 mappings). For the first time, the top language, gauged by number of mappings, is Dutch (606 mappings), surpassing the English community (600 mappings).

And what are the (breaking) changes?

  • English DBpedia switched from URIs to IRIs.
  • The instance-types dataset is now split into two files:
    • “instance-types” contains only direct types.
    • “instance-types-transitive” contains transitive types.
  • The “mappingbased-properties” file is now split into three (3) files:
    • “geo-coordinates-mappingbased” contains mapping-based geo coordinates.
    • “mappingbased-literals” contains mapping-based statements with literal values.
    • “mappingbased-objects” contains mapping-based statements with object values.
  • We added a new extractor for citation data.
  • All datasets are available in .ttl and .tql serialization.
  • We are providing DBpedia as a Docker image.
  • From now on, we provide extensive dataset metadata by adding DataIDs for all extracted languages to the respective language directories.
  • In addition, we revamped the dataset table on the download-page. It’s created dynamically based on the DataID of all languages. Likewise, the tables on the statistics- page are now based on files providing information about all mapping languages.
  • From now on, we also include the original Wikipedia dump files (‘pages_articles.xml.bz2’) alongside the extracted datasets.
  • A complete changelog can always be found in the git log.
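To illustrate the instance-types split, here is a minimal sketch of how the two files can be merged back into one type lookup. The triples shown are illustrative stand-ins, not actual dump content, and the parsing covers only the simple N-Triples shape of these files:

```python
# Inline samples in the style of the split dumps (illustrative, not real dump content).
direct_types = """\
<http://dbpedia.org/resource/Leipzig> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/City> .
"""
transitive_types = """\
<http://dbpedia.org/resource/Leipzig> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/Place> .
"""

RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"

def parse_types(ntriples: str) -> dict:
    """Collect subject -> set of type URIs from rdf:type triples in N-Triples text."""
    types = {}
    for line in ntriples.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        s, p, o = line.rstrip(" .").split(" ", 2)
        if p == RDF_TYPE:
            types.setdefault(s, set()).add(o)
    return types

# Merge both files to recover the full (direct + transitive) type set per resource.
all_types = parse_types(direct_types)
for s, ts in parse_types(transitive_types).items():
    all_types.setdefault(s, set()).update(ts)
```

Keeping the files separate means consumers who only need direct types no longer have to filter out the (much larger) transitive closure.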

And what about the numbers?

Altogether the new DBpedia 2015-10 release consists of 8.8 billion (2015-04: 6.9 billion) pieces of information (RDF triples), out of which 1.1 billion (2015-04: 737 million) were extracted from the English edition of Wikipedia, 4.4 billion (2015-04: 3.8 billion) were extracted from other language editions, and 3.2 billion (2015-04: 2.4 billion) came from DBpedia Commons and Wikidata. In general, we observed significant growth in raw infobox and mapping-based statements of close to 10%. Thorough statistics are available via the Statistics page.
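For the curious, the relative growth behind these totals is easy to recompute. A quick sketch using the per-source numbers quoted above (in billions of triples):

```python
def growth(new: float, old: float) -> float:
    """Relative growth in percent."""
    return (new - old) / old * 100

# Totals quoted in the post, in billions of triples (2015-10 vs 2015-04).
releases = {
    "overall": (8.8, 6.9),
    "English edition": (1.1, 0.737),
    "other languages": (4.4, 3.8),
    "Commons + Wikidata": (3.2, 2.4),
}

for name, (new, old) in releases.items():
    print(f"{name}: {growth(new, old):.1f}%")
```

Note that these per-source figures measure total triple growth; the ~10% quoted in the post refers specifically to raw infobox and mapping-based statements.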

And what’s up next?

We will be working to move away from the mappings wiki but we will have at least one more mapping sprint. Moreover, we have some cool ideas for GSOC this year. Additional mentors are more than welcome. 🙂

And who is to blame for the new release?

We want to thank all editors that contributed to the DBpedia ontology mappings via the Mappings Wiki, all the GSoC students and mentors working directly or indirectly on the DBpedia release and the whole DBpedia Internationalization Committee for pushing the DBpedia internationalization forward.

Special thanks go to Markus Freudenberg and Dimitris Kontokostas (University of Leipzig), Volha Bryl (University of Mannheim / Springer), Heiko Paulheim (University of Mannheim), Václav Zeman and the whole LHD team (University of Prague), Marco Fossati (FBK), Alan Meehan (TCD), Aldo Gangemi (LIPN University, France & ISTC-CNR, Italy), Kingsley Idehen, Patrick van Kleef, and Mitko Iliev (all OpenLink Software), OpenLink Software (http://www.openlinksw.com/), Ruben Verborgh from Ghent University – iMinds, Ali Ismayilov (University of Bonn), Vladimir Alexiev (Ontotext) and members of the DBpedia Association, the AKSW and the department for Business Information Systems of the University of Leipzig for their commitment in putting tremendous time and effort into getting this done.

The work on the DBpedia 2015-10 release was financially supported by the European Commission through the project ALIGNED – quality-centric software and data engineering (http://aligned-project.eu/).

 

Detailed information about the new release is available here. For more information about DBpedia, please visit our website or follow us on Facebook!

Have fun and all the best!

Yours

DBpedia Association

Have you backlinked your data yet? – A retrospective of the 6th DBpedia community meeting in The Hague

 

We thought it was about time to go orange again, meet the Dutch DBpedia Chapter, and celebrate the growing Dutch DBpedia community. Thus, following our successful US event last November, the National Library of the Netherlands hosted the 6th DBpedia community meeting in The Hague on February 12th.

First and foremost, we would like to thank TNO for organizing the pre-event and the National Library of the Netherlands, especially Menno Rasch (Director of KB operations), for sponsoring the catering during the DBpedia community meeting.

Pre-event

Before diving into DBpedia topics, we had a welcome reception on February 11th with snacks and drinks at TNO – New Babylon. Around 40 people from the DBpedia community, members of TNO and its Data Science Department, and representatives of the Platform Linked Data Netherlands engaged in vital exchanges about Linked Data topics in the Netherlands.

Sebastian Hellmann gave a short introduction to DBpedia and the recently founded DBpedia Association. After Jean-Louis Roso talked about the TNO Data Science Department and current developments and projects, Erwin Folmer presented the Platform Linked Data Netherlands (PiLOD).

A poster and demo session right after gave people from TNO the opportunity to present and discuss projects currently carried out at TNO.

Below you find a short list of the poster presentations from the pre-event:

The social gathering that followed, with snacks and drinks, encouraged talks about current developments in the DBpedia community and about ongoing projects. According to TNO representative Laura Daniel, the pre-event was very successful. She summarized the evening of the welcome reception: “It was very inspiring to see the DBpedia community in action. There were lots of interesting projects that use DBpedia as well as lively discussions on the challenges faced by the community, and of course, the event was a great opportunity for networking!”

 

Main event

Gerard Kuys, Ordina, during the opening session @ The Hague

Opening Session

Following the pre-event, the main event attracted 95 participants and featured special sessions dedicated to the DBpedia showcases, the DBpedia ontology, and the challenges of DBpedia and digital heritage.

During the opening session, Menno Rasch, host of the meeting and Director of KB operations, highlighted the importance of raising awareness of the DBpedia brand in order to build a DBpedia community.

Sebastian Hellmann, AKSW/ KILT and DBpedia Association @ the DBpedia community meeting

The newly founded DBpedia Association and the related new charter regulating organizational issues in the DBpedia community were among the focuses of the early morning hours, right before several interesting keynote presentations opened the discussion about DBpedia and its usage in the Netherlands.

Marco De Niet, DEN Foundation @ the meeting in The Hague

Marco de Niet, representative of Digital Heritage Foundation (DEN Foundation), the Dutch knowledge centre for digital heritage, talked about “the National Strategy for Digital Heritage in the Netherlands”.

Marco Brattinga and Arjen Santema from the Land Registry and Mapping Agency (Kadaster) presented a framework for describing the data and metadata in a registration in relation to a concept schema that describes what the registration is about. Apart from the ideas behind the framework, their presentation included a showcase of examples from the cadastral registration as well as the topographic map and the information node for addresses and buildings.

The morning session was closed by Paul Groth from Elsevier, who gave a presentation about knowledge graph construction and the role of DBpedia and other Wikipedia-based knowledge. He discussed the importance of structured data as a key to coordinating data in order to build better taxonomies. He also pointed to the importance of having an up-to-date, publicly available knowledge graph as a reference for constructing internal knowledge graphs.

Paul Groth, Elsevier, discussing knowledge graphs and the role of DBpedia @ the community meeting

After Lunch Track

DBpedia is one of the biggest and most important focal points of the Linked Open Data movement. Thus, the after-lunch track focused very much on DBpedia usage during the dedicated showcase session, which started with the new DBpedia & DBpedia+ Data Stack release (planned for 2016-04).

Afterwards, the session continued with further DBpedia-related discussions, in which various practical DBpedia matters were tackled, such as DBpedia in the EUROPEANA Food and Drink project, the use of DBpedia for improved vaccine information systems, and using Elasticsearch + DBpedia to maintain a searchable database of global power plants.

Afternoon Track

The afternoon track came with four DBpedia highlight sessions, namely DBpedia and Ontologies, DBpedia and Heritage, DBpedia Hands-on Development, and DBpedia and NLP. First, the DBpedia ontology group discussed possible ontology usages and presented the results of the latest DBpedia Ontology survey. In the following 75 minutes, during the DBpedia and Heritage session, special challenges and opportunities of reference data for digital heritage were addressed by experts from EUROPEANA, iMinds, RCE and KB, the National Library of the Netherlands. Third, members of the DBpedia Association and the AKSW/KILT group from Leipzig led a practical session for developers and DBpedia enthusiasts to talk about technical issues and challenges in DBpedia, and held a tutorial session for DBpedia newbies.

The end of the event was dedicated to NLP and the application of Linked Data on Language Technologies, especially entity linking, topics which are of vital importance for the research of AKSW/KILT members at the University of Leipzig.

Below, you find a list of all presentations given during the meeting.

All slides and presentations are also available on our Website and you will find more feedback and photos about the event on Twitter via #DBpediaDenHaag.

Summing up, the 6th community meeting brought together more than 95 DBpedia enthusiasts from the Netherlands and Europe, who engaged in vital conversations about interesting projects and approaches to questions and problems revolving around DBpedia, not only during the dedicated sessions but also during networking breaks. The recently founded DBpedia Association was strongly represented, with presentations from Sebastian Hellmann, Dimitris Kontokostas, Nilesh Chakraborty and Markus Freudenberg.

Finally, we would like to thank the organizers Enno Meijers, Richard Nagelmaker, Gerald Wildenbeest, Gerard Kuys, Monika Solanki and representatives of the DBpedia Association such as Dimitris Kontokostas and Sebastian Hellmann for devoting their time to the organization of the meeting and the programme. We are now looking forward to the 7th DBpedia Community Meeting, which will be held in Leipzig again, during the SEMANTiCS conference on September 15th, 2016.

For updates, just follow us on Facebook, Twitter or check the following websites: http://www.semantics.cc/ and http://wiki.dbpedia.org/.

 

GSoC 2015 is gone, long live GSoC 2016

The submission deadline for mentoring organizations to apply for the 2016 Google Summer of Code is approaching quickly. As DBpedia is again planning to be a vital part of the program, we would like to take this opportunity to give you a little recap of the projects mentored by DBpedia members during the past GSoC, which wrapped up in November 2015.

Dimitris Kontokostas, Marco Fossati, Thiago Galery, Joachim Daiber and Ruben Verborgh, members of the DBpedia community, mentored 8 great students from around the world. Following are some of the projects they completed.

Fact Extraction from Wikipedia Text by Emilio Dorigatti

DBpedia is pretty mature when dealing with Wikipedia’s semi-structured content like infoboxes, links and categories. However, unstructured content (typically text) plays the most crucial role, due to the amount of knowledge it can deliver, and few efforts have been carried out to extract structured data from it. Marco and Emilio built a fact extractor which understands the semantics of a sentence thanks to Natural Language Processing (NLP) techniques. If you feel playful, you can download the produced datasets. For more details, check out this blog post. P.S.: the project has been cited by Python Weekly and Python Trending. Mentor: Marco Fossati (SpazioDati)

Better context vectors for disambiguation by Philipp Dowling

Better Context Vectors aimed to improve the representation of context used by DBpedia Spotlight by incorporating novel methods from distributional semantics. We investigated the benefits of replacing a word-count-based method with one that uses a model based on word2vec. Our student, Philipp Dowling, implemented the model reader based on a preprocessed version of Wikipedia (leading to a few commits to the awesome library gensim) and the integration with the main DBpedia Spotlight pipeline. Additionally, we integrated a method for estimating weights for the different model components that contribute to disambiguating entities. Mentors: Thiago Galery (Analytyca), Joachim Daiber (Amsterdam Univ.), David Przybilla (Idio)
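The core idea of word2vec-based disambiguation can be sketched in a few lines: represent the context as a vector and pick the candidate entity whose embedding is most similar to it. The toy 3-d vectors and candidate names below are illustrative stand-ins, not the actual Spotlight model:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def disambiguate(context_vec, candidates):
    """Pick the candidate entity whose vector is most similar to the context."""
    return max(candidates, key=lambda name: cosine(context_vec, candidates[name]))

# Toy vectors standing in for word2vec-derived embeddings (illustrative only).
candidates = {
    "Berlin_(band)": [0.9, 0.1, 0.0],
    "Berlin": [0.1, 0.9, 0.2],
}
context = [0.2, 0.8, 0.3]  # e.g. an average of the surrounding words' vectors
print(disambiguate(context, candidates))  # → Berlin
```

In the real system the vectors have hundreds of dimensions and the similarity score is combined with other weighted model components, as described above.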

 

Wikipedia Stats Extractor by Naveen Madhire

Wikipedia Stats Extractor aimed to create a reusable tool to extract raw statistics for Named Entity Linking out of a Wikipedia dump. Naveen built the project on top of Apache Spark and Json-wikipedia, which makes the code more maintainable and faster than its previous alternative (pignlproc). Furthermore, Wikipedia Stats Extractor provides an interface which makes it easier to process Wikipedia dumps for purposes other than entity linking. Extra changes were made to the way surface-form stats are extracted, and lots of noise was removed, both of which should in principle help entity linking.
Special thanks to Diego Ceccarelli, who gave us great insight into how Json-wikipedia works. Mentors: Thiago Galery (Analytyca), Joachim Daiber (Amsterdam Univ.), David Przybilla (Idio)

 

DBpedia Live extensions by Andre Pereira

DBpedia Live provides near real-time knowledge extraction from Wikipedia. As Wikipedia scales, we needed to move our caching infrastructure from MySQL to MongoDB. This was the first task of Andre’s project. The second task was the implementation of a UI displaying the current status of DBpedia Live along with some admin utilities. Mentors: Dimitris Kontokostas (AKSW/KILT), Magnus Knuth (HPI)

 

Adding live-ness to the Triple Pattern Fragments server by Pablo Estrada

DBpedia currently has a highly available Triple Pattern Fragments interface that offloads part of the query processing from the server onto the clients. For this GSoC, Pablo developed a new feature for this server so it automatically keeps itself up to date with new data coming from DBpedia Live. It does so by periodically checking for updates and adding them to an auxiliary database. Pablo developed smart update and smart querying algorithms to manage and serve the live data efficiently. We are excited to let the project out into the wild and see how it performs in real-life use cases. Mentors: Ruben Verborgh (Ghent Univ. – iMinds) and Dimitris Kontokostas (AKSW/KILT)
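The auxiliary-database idea can be sketched roughly as follows: query results come from the static dump combined with added and removed triples collected from periodic update checks. All names and data structures here are hypothetical simplifications, not Pablo’s actual implementation:

```python
# A triple is a (subject, predicate, object) tuple.
base = {
    ("dbr:Berlin", "dbo:populationTotal", "3400000"),  # from the static dump
}
added = set()    # triples collected from DBpedia Live updates
removed = set()  # triples retracted by updates

def apply_update(additions, deletions):
    """Fold one batch of live changes into the auxiliary sets."""
    added.update(additions)
    removed.update(deletions)

def query(pattern):
    """Match a (s, p, o) pattern (None = wildcard) against base + live data."""
    live = (base | added) - removed
    return {
        t for t in live
        if all(q is None or q == v for q, v in zip(pattern, t))
    }

# A live update revises Berlin's population without rewriting the base dump.
apply_update(
    additions={("dbr:Berlin", "dbo:populationTotal", "3500000")},
    deletions={("dbr:Berlin", "dbo:populationTotal", "3400000")},
)
print(query(("dbr:Berlin", None, None)))
```

The appeal of the design is that the large static dump never changes; only the small auxiliary sets are touched on each update cycle.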

Registration for mentoring organizations for GSoC 2016 starts next month, and DBpedia will of course try to participate again. If you want to become a mentor or just have a cool idea that seems suitable, don’t hesitate to ping us via the DBpedia discussion or developer mailing lists.

Stay tuned!

Your DBpedia Association

A retrospective of the 5th DBpedia community meeting in California

A belated Happy New Year to all DBpedia enthusiasts !!!

Two weeks of 2016 have already passed, and it is about time to reflect on the past three months, which revolved around the 5th DBpedia meeting in the USA.

After four successful meetings in Poznan, Dublin, Leipzig and Amsterdam, we thought it was about time to cross the Atlantic and meet the US part of the DBpedia community. On November 5th, 2015, our 5th DBpedia community meeting was held at the world-famous Stanford University in Palo Alto, California.

First and foremost, we would like to thank Michel Dumontier, Associate Professor of Medicine at Stanford University, and his Laboratory for Biomedical Knowledge Discovery for hosting this great event and giving so many US-based DBpedia enthusiasts a platform to exchange ideas and meet in person. The event was constantly commented on and discussed, not just inside the university premises but also online via Twitter under #DBpediaCA. We would also like to thank the rest of the organizers: Pablo Mendes, Marco Fossati, Dimitris Kontokostas and Sebastian Hellmann for devoting a lot of time to planning the meeting and coordinating with the presenters.

We set out to the US with two main goals. Firstly, we wanted DBpedia and knowledge graph professionals and enthusiasts to network and discuss ideas on how to improve DBpedia. Secondly, the event aimed at finding new partners, developers and supporters to help DBpedia grow professionally, in terms of competencies and data, as well as at enlarging the DBpedia community itself to spread the word and raise awareness of the DBpedia brand.

Therefore, we invited representatives of the best-known actors in the Data community such as:

  • Michel Dumontier, Stanford
  • Anshu Jain, IBM Watson
  • Nicolas Torzec, Yahoo!
  • Yves Raimond, Netflix
  • Karthik Gomadam, Independent
  • Joakim Soderberg, Blippar
  • Alkis Simitsis, HP Labs
  • Yashar Mehdad, Yahoo! Labs

…who addressed interesting topics and, together with all the DBpedia enthusiasts, engaged in productive discussions and raised controversial questions.

Pre-event

The meeting itself was co-located with a pre-event designed as a workshop, giving the attending companies plenty of room and time to raise questions and discuss hot topics. Classification schemas and multilingualism were at the top of the list of topics most interesting to the companies invited. In this interactive setting, our guests from Evernote, BlippAR, World University and Wikimedia answered questions about the DBpedia ontology and mappings, Wikipedia categories, as well as similarities and differences with Wikidata.

Main Event

Following the pre-event, the main event attracted attendees with lightning talks from major companies of interest to the DBpedia community.

The host of the DBpedia meeting, Michel Dumontier from Stanford, opened the main event with a short introduction to his group’s focus on biomedical data. He and his group currently focus on integrating datasets to extract maximal value from data. Right at the beginning of the DBpedia meeting, Dumontier highlighted the value of already existing yet unexploited data out there.

During the meeting there were two main thematic foci, one concerning the topics companies were interested in and raised during the session. Experts from Yahoo, Netflix, Diffbot, IBM Watson and Unicode addressed issues such as fact extraction from text via NLP, knowledge base construction techniques, recommender systems leveraging data from a knowledge base, and multilingual abbreviation datasets.

The second focus of this event revolved around DBpedia and encyclopedic knowledge graphs, including augmented reality, addressed by BlippAR and by Nuance. We have summed up some of the talks for you here. Also check out the slides provided in addition to the summaries to get a deeper insight into the event.

Nicolas Torzec, Yahoo! – Wikipedia, DBpedia and the Yahoo! Knowledge Graph

Nicolas Torzec described how DBpedia played a key role in the beginning of the Knowledge Graph effort at Yahoo!. They decided to use the Extraction Framework directly, rather than the provided data dumps, which allowed them to update continuously as Wikipedia changed. Yashar Mehdad, also from Yahoo!, focused on multilingual NE detection and linking. He described how users make financial choices based on the availability of products in their local language, which highlights the importance of multilinguality (also a core objective of the DBpedia effort).

Anshu Jain, IBM Watson – Watson Knowledge Graph – DBpedia Meetup

The focus of this presentation was the IBM Watson team’s effort not to build a knowledge graph, but to build a platform for working with knowledge graphs. For them, a graph is just an abstraction, not a data structure. He also highlighted that context is very important.

 

Yves Raimond, Netflix – Knowledge Graphs @ Netflix

Yves Raimond from Netflix observed that on their platform, every impression is a recommendation. They rely on lots of machine learning algorithms, and pondered the role of knowledge graphs in that setting. Will everything (user + metadata) end up in a graph so that algorithms learn from it? Click here for the complete presentation.

Joakim Soderberg, BlippAR

Joakim Soderberg mentioned that at Blippar it’s all about the experience. They are focusing on augmented reality, which can benefit from information drawn from many sources including DBpedia.

David Martin, Nuance – Using DBpedia with Nuance

David Martin from Nuance talked about how DBpedia is used as a source of named entities. He observed that multi-role ranking is an important issue; for instance, the difference in the role of Arnold Schwarzenegger as politician or actor. Click here for the complete presentation.

Karthik Gomadam, Accenture Technology Labs – Rethinking the Enterprise Data Stack

Karthik Gomadam discussed data harmonization in the context of linked enterprise data.

Alkis Simitsis, Hewlett Packard – Complex Graph Computations over Enterprise Data

He talked about complex graph computations over enterprise data, while Georgia Koutrika from HP Labs presented their solution for fusing knowledge into recommendations.

Other topics discussed were:

  • Steven Loomis, IBM – Automatically extracted abbreviated data with DBpedia
  • Scott McLeod, World University and School – MIT Open Courseware with Wikipedia. Classes in virtual worlds.
  • Diffbot’s developers talked about structuring the Web with their product with the help of DBpedia and DBpedia Spotlight.

You can find some more presentations here:

 

Feedback from attendees and via our Twitter stream #DBpediaCA was generally very positive and insightful. The choice of invited talks was appreciated unanimously, and so was the idea of having lightning talks. In the spirit of previous DBpedia meetings, we allocated time for all attendees who were interested in speaking. Some commented that they would have liked more time to ask questions and discuss, while others thought the meeting ran too late. We will consider the trade-offs and try to improve in the next iteration. There was strong support from attendees for meeting again as soon as possible!

So now we are looking forward to the next DBpedia community meeting, which will be held on February 12, 2016 in The Hague, Netherlands. Save the date and visit the event page. We will keep you informed via the DBpedia website and blog.

Finally, we would like to thank Yahoo! for sponsoring the catering during the DBpedia community meeting. We would also like to acknowledge Google Summer of Code as the reason Marco and Dimitris were in California and for covering part of their travel expenses.

The event was initiated by the DBpedia Association. The following people received travel grants from the DBpedia Association: Marco Fossati, Dimitris Kontokostas and Joachim Daiber.