All posts by Sandra Praetor

A year with DBpedia – Retrospective Part 3

This is the final part of our journey around the world with DBpedia. This time we take you from Austria to Mountain View, California, and on to London, UK.

Come on, let’s do this.

Welcome to Vienna, Austria  – Semantics

More than 110 DBpedia enthusiasts joined our Community Meeting in Vienna on September 10th, 2018. The event was again co-located with SEMANTiCS, a very successful collaboration. Luckily, we secured two brilliant keynote speakers to open our meeting. Javier David Fernández García, Vienna University of Economics, opened the meeting with his keynote “Linked Open Data cloud – act now before it’s too late”. He reflected on challenges towards arriving at a truly machine-readable and decentralized Web of Data. Javier reviewed the current state of affairs, highlighted key technical and non-technical challenges, and outlined potential solution strategies. The second keynote speaker was Mathieu d’Aquin, Professor of Informatics at the Insight Centre for Data Analytics at NUI Galway. Mathieu, who specializes in data analytics, closed the meeting with his keynote “Dealing with Open-Domain Data”.

The 12th edition of the DBpedia Community Meeting also featured a special chapter session, chaired by Enno Meijers from the Dutch DBpedia Language Chapter. The speakers presented the latest technical and organizational developments of their respective chapters. This session served mainly as an exchange platform for the different DBpedia chapters. For the first time, representatives of the European chapters discussed problems and challenges of DBpedia from their point of view. Furthermore, each chapter’s representative presented tools, applications, and projects.

In case you missed the event, a more detailed article can be found here. All slides and presentations are also available on our website. Further insights, feedback, and photos about the event are available on Twitter via #DBpediaDay.

Welcome to Mountain View  – GSoC mentor summit

GSoC was a vital part of DBpedia’s endeavours in 2018. We had three very talented students who, with the help of our great mentors, made it to the finish line of the program. You can read about their projects and success stories in a dedicated post here.

After three successful months of mentoring, two of our mentors had the opportunity to attend the annual Google Summer of Code mentor summit. Mariano Rico and Thiago Galery represented DBpedia at the event this year. They engaged in a vital discussion about this year’s program, about lessons learned, and about highlights and drawbacks they experienced during the summer. A special focus was put on how to engage potential GSoC students as early as possible in order to get as much commitment as possible. The ideas the two mentors brought back in their suitcases will help to improve DBpedia’s part of the program for 2019. And apparently, chocolate was a very big thing there ;).

In case you have a project idea for GSoC 2019 or want to mentor a DBpedia project next year, just drop us a line via dbpedia@infai.org. Also, as we intend to participate in the upcoming edition, please spread the word amongst students, and especially female students, who fancy spending their summer coding on a DBpedia project. Thank you.

 

Welcome to London, England – Connected Data London 2018

In early November, we were invited to Connected Data London again. After our visit in 2017, this great event seems to be becoming a regular fixture in our DBpedia schedule.

Sebastian Hellmann, Executive Director of the DBpedia Association, participated as a panelist in the discussion around “Building Knowledge Graphs in the Real World”. Together with speakers from Thomson Reuters, Zalando, and Textkernel, he discussed definitions of knowledge graphs, best practices for building and using them, as well as the recent hype around them.

Visitors of CNDL2018 had the chance to grab a copy of our brand new flyer and exchange with us about the DBpedia Databus – a decentralized data publication, integration, and subscription platform. The event also gave us the opportunity to meet early adopters of our Databus. Thank you very much for that opportunity.

A year went by

2018 has gone by so fast and brought so much for DBpedia. The DBpedia Association got the chance to meet more of DBpedia’s language chapters, and we developed the DBpedia Databus to an extent that it can finally be launched in spring 2019. DBpedia is a community project relying on people, and with the DBpedia Databus we are creating a platform that enables data publishing and fosters a networked data economy around it. So stay tuned for exciting news coming up next year. Until then, we would like to thank all DBpedia enthusiasts around the world for their research with DBpedia and their support of and contributions to DBpedia. Kudos to you.

 

All that remains to say is have yourself a very merry Christmas and a dazzling New Year. May 2019 be peaceful, exciting and prosperous.

 

Yours – being in a cheerful and festive mood –

 

DBpedia Association

 

A year with DBpedia – Retrospective Part Two

Retrospective Part II. Welcome to the second part of our journey around the world with DBpedia. This time we are taking you to Greece, Germany, Australia and finally France.

Let the travels begin.

Welcome to Thessaloniki, Greece & ESWC

DBpedians from the Portuguese Chapter presented their research results during ESWC 2018 in Thessaloniki, Greece. The team around Diego Moussalem developed a demo extending MAG to support Entity Linking in 40 different languages. A special focus was put on low-resource languages such as Ukrainian, Greek, Hungarian, Croatian, Portuguese, Japanese and Korean. The demo relies on online web services which allow easy access to (their) entity linking approaches. Furthermore, it can disambiguate against DBpedia and Wikidata. Currently, MAG is used in diverse projects and has been used largely by the Semantic Web community. Check the demo via http://bit.ly/2RWgQ2M. Further information about the development can be found in a research paper, available here.

 

Welcome back to Leipzig Germany

With our new credo “connecting data is about linking people and organizations”, halfway through 2018, we finalized our concept of the DBpedia Databus. This global DBpedia platform aims at sharing the efforts of OKG governance, collaboration, and curation to maximize societal value and develop a linked data economy.

With this new strategy, we wanted to meet some DBpedia enthusiasts of the German DBpedia community. Fortunately, the LSWT (Leipzig Semantic Web Tag) 2018, hosted in Leipzig, home to the DBpedia Association, proved to be the right opportunity. It was the perfect platform to exchange with researchers, industry and other organizations about current developments and future applications of the DBpedia Databus. Apart from hosting a hands-on DBpedia workshop for newbies, we also organized a well-received WebID tutorial. Finally, the event gave us the opportunity to position the new DBpedia Databus as a global open knowledge network that aims at providing unified and global access to knowledge (graphs).

Welcome down under – Melbourne, Australia

Further research results that rely on DBpedia were presented during ACL 2018 in Melbourne, Australia, July 15th to 20th, 2018. The research built on DBpedia data from the WebNLG corpus, a challenge in which participants automatically converted non-linguistic data from the Semantic Web into a textual format. The data was then used to train a neural network model for generating referring expressions of a given entity. For example, if Jane Doe is a person’s official name, referring expressions of that person would be “Jane”, “Ms Doe”, “J. Doe”, “the blonde woman from the USA”, etc.

If you want to dig deeper but missed ACL this year, the paper is available here.

 

Welcome to Lyon, France

In July, the DBpedia Association travelled to France. With the organizational support of Thomas Riechert (HTWK, InfAI) and Inria, we finally met the French DBpedia Community in person and presented the DBpedia Databus. Additionally, we got to meet the French DBpedia Chapter, researchers and developers around Oscar Rodríguez Rocha and Catherine Faron Zucker. They presented current research results revolving around an approach to automate the generation of educational quizzes from DBpedia. They wanted to provide a useful tool to be applied in the French educational system that:

  • helps to test and evaluate the knowledge acquired by learners and…
  • supports lifelong learning on various topics or subjects. 

The French DBpedia team followed a 4-step approach:

  1. Quizzes are first formalized with Semantic Web standards: questions are represented as SPARQL queries and answers as RDF graphs.
  2. Natural language questions, answers and distractors are generated from this formalization.
  3. Different strategies are defined to extract multiple-choice questions, correct answers and distractors from DBpedia.
  4. A measure is defined of the information content of the elements of an ontology, and of the set of questions contained in a quiz.

Oscar R. Rocha and Catherine F. Zucker also published a paper explaining the detailed approach to automatically generate quizzes from DBpedia according to official French educational standards. 

 

 

Thank you to all DBpedia enthusiasts that we met during our journey.

With this journey from Europe to Australia and back, we provided you with insights into research based on DBpedia as well as a glimpse into the French DBpedia Chapter. In the final part of our journey, coming up next week, we will take you to Vienna, Mountain View and London. In the meantime, stay tuned and visit our Twitter channel or subscribe to our DBpedia Newsletter.

 

Have a great week.

Yours DBpedia Association

A year with DBpedia – A Retrospective Part One

Looking back, 2018 was a very successful year for DBpedia. First and foremost, we refined our strategy and developed our concept of the DBpedia Databus, a central communication system that allows exchanging, curating and accessing data between multiple stakeholders. The Databus simplifies working with data and will be launched in early 2019. 

Moreover, we travelled many miles in 2018, not only to visit our language chapters but also to meet enthusiasts from our community at workshops and conferences worldwide.

In the upcoming blog series, we would like to take you on a retrospective tour around the world, giving you insights into a year with DBpedia. We will start out with stopovers in Japan, Poland and Germany and will continue our journey to other continents in the following two weeks.

Sit back and read on.

Big Spring in Japan – Welcome to Miyazaki

Welcome to Miyazaki, to LREC 2018, the Language Resources and Evaluation Conference, and meet RDF2PT. No idea what that is and what it has to do with DBpedia? Read on!

The generation of natural language from RDF data has recently gained significant attention due to the continuous growth of Linked Data. Proposing the RDF2PT approach, a research team around Diego Moussalem, part of the Portuguese DBpedia Chapter, described how RDF data can be verbalized to Brazilian Portuguese texts. They highlighted the steps taken to generate Portuguese texts and addressed challenges with grammatical gender, classes, resources and properties. The results suggest that RDF2PT generates texts that can be easily understood by humans. It also helps to identify some of the challenges related to the automatic generation of Brazilian Portuguese (especially from RDF).

The full paper is available via https://arxiv.org/pdf/1802.08150.pdf.

 

Welcome to Poznan, Poland

Our community is our asset. In order to grow it and encourage contributions, the DBpedia Association continuously organizes community meetups to address the interests of our multi-faceted community. In late May, we travelled to Poland to meet Polish DBpedia enthusiasts at our meetup in Poznań. The idea was to find out what the Polish DBpedia community uses DBpedia for, what applications and tools they have, and what they are currently developing. Members of the chapter presented, amongst others, results of the primary research project “Quality of Data in DBpedia”. Attendees engaged in vital discussions about uses of DBpedia applications and tools and listened to a presentation by Professor Witold Abramowicz, chair of the Department of Information Systems at Poznań University of Economics and head of SmartBrain, who talked about opportunities and challenges of data science.

Further information on the Polish DBpedia Chapter can be found on their website.

Welcome to Leipzig, home to the DBpedia Association

For the first time ever, DBpedia was part of the German culture hackathon Coding da Vinci, held at the Bibliotheca Albertina, University Library of Leipzig University, in June 2018. In this year’s edition, we not only offered a hands-on workshop but also provided our DBpedia datasets. This data supported more than 30 cultural institutions in enriching their own datasets. In turn, hackathon participants could creatively develop new tools, apps, games, quizzes, etc. out of the data.

One of the projects that used DBpedia as a source was Birdory, a memory game using bird voices and pictures. The goal is, much like in regular memory games, to match the correct picture to the bird sound that is played. The data used for the game was taken from the Museum für Naturkunde Berlin (bird voices) as well as from DBpedia (pictures). So in case you need some me-time during Christmas gatherings, you might want to check it out via https://birdory.firebaseapp.com/.

 

In our upcoming blog post next week, we will take you to Thessaloniki, Greece, to Australia and, again, to Leipzig. In the meantime, stay tuned and visit our Twitter channel or subscribe to our DBpedia Newsletter.

 

Have a great week,

 

Yours DBpedia Association

 

Who are these DBpedia users? … (and why?)

Guest article by Victor de Boer, Vrije Universiteit Amsterdam, NL, member of NL-DBpedia

Who uses DBpedia anyway?…

This question started a research project for Frank Walraven, an Information Sciences Master student at Vrije Universiteit Amsterdam (VUA). The question came up during one of the meetings of the Dutch DBpedia chapter, of which VUA is a member.

If DBpedia users and their usage are better understood, this can lead to better servicing of those DBpedia users by, for example, prioritizing the enrichment or improvement of specific sections of DBpedia. Characterizing use(r)s of a Linked Open Dataset is an inherently challenging task because in an open web world it is difficult to tell who is accessing your digital resources.

Frank conducted his MSc project research at the Dutch National Library and used a hybrid approach utilizing both a data-driven method based on user-log analysis and a short survey to get to know the users of the dataset.

As a scope, Frank selected just the Dutch DBpedia dataset. For the data-driven part of the method, Frank used a complete user log of HTTP requests on the Dutch DBpedia. This log file consisted of over 4.5 million entries and logged both URI lookups and SPARQL endpoint requests. For this research, he only included a subset of the URI lookups.

Analysis of IP Addresses of DBpedia Users

As a first analysis step, the requests’ origin IPs were categorized. Five classes can be identified (A–E), with the vast majority of IP addresses being in class A: very large networks and bots. Most of the IP addresses in these lists could be traced back to search engine indexing bots such as those from Yahoo or Google. In classes B–E, Frank manually traced the top 30 most frequently encountered IP addresses. He concluded that even there, 60% of the requests came from bots and 10% definitely not from bots, with 30% remaining unclear.


Step II – Identification of Page Requests

The second analysis step in the data-driven method consisted of identifying which types of pages were requested most. To cluster the thousands of DBpedia URI requests, Frank retrieved the ‘categories’ of the pages. These categories are extracted from Wikipedia category links. An example is the “Android_TV” resource, which has two categories: “Google” and “Android_(operating_system)”. Following skos:broader links, a ‘level 2 category’ could also be found to aggregate to an even higher level of abstraction. As not all resources have such categories, this does not give a complete image, but it does provide some idea of the most popular categories of items requested. After normalizing for categories with large amounts of incoming links, for example the category “non-endangered animal”, the most popular categories were:

  1. Domestic & international movies,
  2. Music,
  3. Sports,
  4. Dutch & international municipality information, and
  5. Books.
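The aggregation step just described can be sketched in a few lines of Python. Everything here is illustrative: the page-to-category and skos:broader mappings below are toy stand-ins for the data extracted from the Wikipedia category links, not Frank's actual data.

```python
from collections import Counter

# Toy stand-in for Wikipedia category links (resource -> categories)
page_categories = {
    "Android_TV": ["Google", "Android_(operating_system)"],
    "Pixel_3": ["Google", "Smartphones"],
    "Kubernetes": ["Google", "Cloud_computing"],
}

# Toy stand-in for one skos:broader hop (category -> 'level 2' category)
broader = {
    "Google": "Technology_companies",
    "Android_(operating_system)": "Mobile_operating_systems",
    "Smartphones": "Mobile_phones",
    "Cloud_computing": "Internet_technology",
}

def level2_counts(requested_pages):
    """Aggregate requested pages up to level-2 categories and count them."""
    counts = Counter()
    for page in requested_pages:
        for cat in page_categories.get(page, []):
            # fall back to the category itself when no broader link exists
            counts[broader.get(cat, cat)] += 1
    return counts

print(level2_counts(["Android_TV", "Pixel_3", "Kubernetes"]))
```

A real analysis would, of course, read the request log and fetch categories from the Dutch DBpedia datasets, and would add the normalization for high-in-degree categories mentioned above.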
Survey

Additionally, Frank set up a user survey to corroborate this evidence. The survey contained questions about the how and why of the respondents’ use of the Dutch DBpedia, including the categories they were most interested in.

The survey was distributed using the Dutch DBpedia website and via Twitter. However, the endeavour only attracted 5 respondents. This illustrates the difficulty of the problem that users of the DBpedia resource are not necessarily easily reachable through communication channels. The five respondents were all quite closely related to the chapter but the results were interesting nonetheless. Most of the DBpedia users used the DBpedia SPARQL endpoint. The full results of the survey can be found through Frank’s thesis, but in terms of corroboration, the survey revealed that four out of the five categories found in the data-driven method were also identified in the top five results from the survey. The fifth one identified in the survey was ‘geography’, which could be matched to the fifth from the data-driven method.

Conclusion

Frank’s research shows that, even with a combination of data-driven and user-driven methods, characterizing DBpedia users remains a challenging problem. Yet it is indeed possible to get an indication of the most-used categories on DBpedia. Within the Dutch DBpedia Chapter, we are currently considering follow-up research questions based on Frank’s research. For further information about the work of the Dutch DBpedia chapter, please visit their website.

A big thanks to the Dutch DBpedia Chapter for supervising this research and providing insights via this post.

Yours

DBpedia Association

The Release Circle – A Glimpse behind the Scenes

As you already know, with the new DBpedia strategy our mode of publishing releases has changed. The new DBpedia release process follows a three-step approach, starting with the Extraction, moving on to ID Management, and ending with the Fusion, which finalizes the release process. Our DBpedia releases are currently published on a monthly basis. In this post, we give you insight into the individual steps of the release process and into what our developers actually do when preparing a DBpedia release.

Extraction  – Step one of the Release

The good news is, our new release mode is taking shape and has noticeably picked up speed. The 2018-08 release and, additionally, the 2018.09.12 and 2018.10.16 releases are now available in our LTS repository.

The 2018-08 release was generated on the basis of the Wikipedia datasets extracted in early August and currently comprises 136 languages. The extraction release contains the raw extracted data generated by the DBpedia extraction framework. The post-processing steps, such as data deduplication or URI normalization, are omitted and moved to later parts of the release process. Thus, we can provide direct, transparent access to the generated data at every step. Until we manage two releases per month, our data is mostly based on the second Wikipedia datasets of the previous month. In line with that, the 2018.09.12 release is based on late-August data, and the recent 2018.10.16 release is based on Wikipedia datasets extracted on September 20th. They all comprise 136 languages and contain a stable list of datasets since the 2018-08 release.

Our releases are now ready for parsing and external use. Additionally, there will be a new Wikidata-based release this week.

ID-Management – Step two of the Release

For a complete “new DBpedia” release, the DBpedia ID Management and the Fusion of the data have to be added to the process. The Databus ID Management is a process to unify the various IRIs that identify the same entities but were coined by different data providers. Taking datasets with overlapping domains of interest from multiple data providers, the set of IRIs denoting the entities in the source datasets is determined heuristically (e.g. excluding RDF/OWL types/classes).

Afterwards, these selected IRIs are assigned a numeric primary key, the ‘Singleton ID’. The core of the ID Management process happens in the next step: based on the large set of high-confidence owl:sameAs assertions in the input data, the connected components induced by the corresponding sameAs graph are computed. In other words: the groups of all entities from the input datasets that are (transitively) reachable from one another are determined. We dubbed these groups the sameAs clusters. For each sameAs cluster we pick one member as representative, which determines the ‘Cluster ID’ or ‘Global Identifier’ for all cluster members.

Apart from being an essential preparatory step for the Fusion, these Global Identifiers serve purpose in their own right as unified Linked Data identifiers for groups of Linked Data entities that should be viewed as equivalent or ‘the same thing’.
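As a rough illustration (not the actual release code, which runs on Apache Spark), the sameAs-cluster computation boils down to a union-find over the owl:sameAs pairs, with one deterministically chosen representative per cluster standing in for the Global Identifier. The IRIs below are made up for the example.

```python
def sameas_clusters(pairs):
    """Group IRIs into connected components of the owl:sameAs graph (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in pairs:
        parent[find(a)] = find(b)  # union the two components

    clusters = {}
    for iri in parent:
        clusters.setdefault(find(iri), set()).add(iri)
    # pick a deterministic representative per cluster (here: lexicographic minimum)
    return {min(members): members for members in clusters.values()}

links = [
    ("dbpedia:Berlin", "wikidata:Q64"),
    ("wikidata:Q64", "de-dbpedia:Berlin"),
    ("dbpedia:Paris", "wikidata:Q90"),
]
print(sameas_clusters(links))
```

The real pipeline distributes exactly this connected-components computation with Spark and applies its own rules for choosing the cluster representative.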

A processing workflow based on Apache Spark that performs the process described above for large quantities of RDF input data is already in place and has been run successfully for a large set of DBpedia inputs.

 

Fusion – Step three of the Release

Based on the Extraction and the ID Management, the Data Fusion finalizes the last step of the DBpedia release cycle. With the goal of improving data quality and data coverage, the process uses the DBpedia global IRI clusters to fuse and enrich the source datasets. The fused data contains all resources of the input datasets. The fusion process is based on a functional-property decision that determines the number of selected values (owl:FunctionalProperty determination). Furthermore, the value selection for these functional properties is based on a preference depending on the originating source dataset, for example preferring values from the English DBpedia over the German DBpedia.
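A minimal sketch of that value-selection step might look as follows. The preference order, source identifiers and example values are all hypothetical; the real fusion operates over the global IRI clusters with its own source ranking.

```python
# Hypothetical source-preference order (the actual release uses its own ranking).
PREFERENCE = ["en-dbpedia", "de-dbpedia", "fr-dbpedia"]

def fuse_values(values_by_source, functional):
    """Select the fused values for one property of one global entity.

    values_by_source maps a source dataset id to the values it contributes.
    For a functional property exactly one value is kept, taken from the most
    preferred source that provides one; otherwise all distinct values are merged.
    """
    if functional:
        for source in PREFERENCE:
            if values_by_source.get(source):
                return [values_by_source[source][0]]
        return []
    merged = []
    for vals in values_by_source.values():
        for v in vals:
            if v not in merged:
                merged.append(v)
    return merged

# populationTotal is functional: one value wins, taken from the preferred source
pop = {"de-dbpedia": ["3644826"], "en-dbpedia": ["3748148"]}
print(fuse_values(pop, functional=True))
```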

The enrichment improves entity property and value coverage for resources only contained in the source data. Furthermore, we create provenance data to keep track of the origin of each triple. This provenance data is also used for the HTTP-based resource view at http://global.dbpedia.org.

At the moment, the fused and enriched data is available for the generic and mapping-based extractions. More datasets are still in progress. The DBpedia fusion data is being uploaded to http://downloads.dbpedia.org/repo/dev/fusion/.

Please note we are still in the midst of beta-testing our data release tool, so in case you come across any errors, reporting them to us is much appreciated and fuels the testing process.

Further information regarding the releases progress can be found here: http://dev.dbpedia.org/

Next steps

We will add more releases to the repository on a monthly basis, aiming for a bi-weekly release mode as soon as possible. In between these intervals, any mistakes or errors you find and report in this data can be fixed for the upcoming release. Currently, the generated metadata in the DataID file is not stable; it still needs to be improved and will change in the near future. We are now working on the next release and will inform you as soon as it is published.

Yours DBpedia Association

This blog post was written with the help of our DBpedia developers Robert Bielinski, Markus Ackermann and Marvin Hofer, who were responsible for the work done with respect to the DBpedia releases. We would like to thank them for their great work.

 

Retrospective: GSoC 2018

With all the beta-testing, the evaluations of the community survey part I and part II, and the preparations for SEMANTiCS 2018, we almost lost sight of telling you about the final results of GSoC 2018. In the following, we present a short recap of this year’s students and the projects that made it to the finishing line of GSoC 2018.

 

Et Voilà

We started out with six students who committed to GSoC projects. However, over the course of the summer, some dropped out or did not pass the midterm evaluation. In the end, we had three finalists who made it through the program.

Meet Bharat Suri

… who worked on “Complex Embeddings for OOV Entities”. The aim of this project was to enhance the DBpedia Knowledge Base by enabling the model to learn from the corpus and generate embeddings for different entities, such as classes, instances and properties.  His code is available in his GitHub repository. Tommaso Soru, Thiago Galery and Peng Xu supported Bharat throughout the summer as his DBpedia mentors.

Meet Victor Fernandez

.. who worked on a “Web application to detect incorrect mappings across DBpedia’s in different languages”. The aim of his project was to create a web application and API to aid in automatically detecting inaccurate DBpedia mappings. The mappings for each language are often not aligned, causing inconsistencies in the quality of the RDF generated. The final code of this project is available in Victor’s repository on GitHub. He was mentored by Mariano Rico and Nandana Mihindukulasooriya.

Meet Aman Mehta

.. whose project aimed at building a model which allows users to query DBpedia directly using natural language, without the need for any previous experience with SPARQL. His task was to train a sequence-to-sequence neural network model to translate any natural language query (NLQ) into the corresponding SPARQL query. See the results of this project in Aman’s GitHub repository. Tommaso Soru and Ricardo Usbeck were his DBpedia mentors during the summer.

Finally, these projects will contribute to the overall development of DBpedia. We are very satisfied with the contributions and results our students produced. Furthermore, we would like to genuinely thank all students and mentors for their effort. We hope to be in touch and see a few faces again next year.

A special thanks goes out to all mentors and students whose projects did not make it through.

GSoC Mentor Summit

Now it is the mentors’ turn to take part in this year’s GSoC mentor summit, October 12th till 14th. This year, Mariano Rico and Thiago Galery will represent DBpedia at the event. Their task is to engage in a vital discussion about this year’s program, about lessons learned, and about highlights and drawbacks they experienced during the summer. Hopefully, they will return with new ideas from the exchange with mentors from other open-source projects. In turn, we hope to improve our part of the program for students and mentors.

Sit tight, follow us on Twitter and we will update you about the event soon.

Yours DBpedia Association

DBpedia Chapters – Survey Evaluation – Episode Two

Welcome back to part two of the evaluation of the surveys, we conducted with the DBpedia chapters.

Survey Evaluation – Episode Two

The second survey focused on technical matters. We asked the chapters about their usage of DBpedia services and tools, about technical problems and challenges, and about potential ways to overcome them. Have a look below.

Again, only nine out of 21 DBpedia chapters participated in this survey. And again, that means the results only represent roughly 42% of the DBpedia chapter population.

The good news is, all chapters maintain a local DBpedia endpoint. Yay! More than 55% of the chapters perform their own extraction. The rest apply a hybrid approach, reusing some datasets from DBpedia releases and additionally extracting some on their own.

Datasets, Services and Applications

In terms of frequency of dataset updates, the situation is as follows: 44.4% of the chapters update them once a year. The answers of the remaining ones differ in equal shares, depending on various factors. See the graph below.


When it comes to the maintenance of links to local datasets, most of the chapters do not have additional ones. However, some do maintain links to, for example, Greek WordNet, the National Library of Greece Authority record, Geonames.jp and the Japanese WordNet. Furthermore, some of the chapters even host other datasets of local interest, but mostly in a separate endpoint, so they keep separate graphs.

Apart from hosting their own endpoint, most chapters maintain one or more additional services such as Spotlight, LodLive or LodView.


Moreover, the chapters have developed additional applications on top of DBpedia data and services.

Besides, they also gave us some reasons why they were not able to deploy DBpedia-related services. See their replies below.


DBpedia Chapter set-up

Lastly, we asked the technical heads of the chapters what the hardest task in setting up their chapter had been. The answers, again, vary as the starting position of each chapter differed. Read a few of their replies below.

The hardest technical task for setting up the chapter was:

  • to keep virtuoso up to date
  • the chapter specific setup of DBpedia plugin in Virtuoso
  • the Extraction Framework
  • configuring Virtuoso for serving data using server’s FQDN and Nginx proxying
  • setting up the Extraction Framework, especially for abstracts
  • correctly setting up the extraction process and the DBpedia facet browser
  • fixing internationalization issues, and updating the endpoint
  • keeping the extraction framework working and up to date
  • updating the server to the specific requirements for further compilation – we work on Debian

 

Final words

With all the data and results we gathered, we will get together with our chapter coordinator to develop a strategy of how to improve technical as well as organizational issues the surveys revealed. By that, we hope to facilitate a better exchange between the chapters and with us, the DBpedia Association. Moreover, we intend to minimize barriers for setting up and maintaining a DBpedia chapter so that our chapter community may thrive and prosper.

In the meantime, spread your work and share it with the community. Do not forget to follow and tag us on Twitter (@dbpedia). You may also want to subscribe to our newsletter.

We will keep you posted about any updates and news.

Yours

DBpedia Association

DBpedia Chapters – Survey Evaluation – Episode One

DBpedia Chapters – Challenge Accepted

The DBpedia community currently comprises more than 20 language chapters, ranging from Basque and Japanese to Portuguese and Ukrainian. Managing such a variety of chapters is a huge challenge for the DBpedia Association because individual requirements are as diverse as the different languages the chapters represent. Some chapters, such as DBpediaNL, started out back in 2012. Others, like the Catalan chapter, are brand new and have different needs.

So, in order to optimize chapter development, we aim to formalize an official DBpedia Chapter Consortium. This will permit a close dialogue with the chapters in order to address all relevant matters regarding communication and organization as well as technical issues. We want to provide the community with the best basis to set up new chapters and to maintain or develop the existing ones.

Our main targets for this are to: 

  • improve general chapter organization,
  • unite all DBpedia chapters with central DBpedia,
  • promote better communication and understanding, and
  • create synergies for further developments and ease access to information about what is done by all DBpedia bodies.

As a first step, we needed to collect information about the current state of things. Hence, we conducted two surveys: one directed at chapter leads, the other at technical heads.

In this blog post, we would like to present the results of the survey conducted with chapter leads. It addressed matters of communication and organizational relevance. Unfortunately, only nine out of 21 chapters participated, so the outcome of the survey speaks for only roughly 43% of all DBpedia chapters.

Chapter Survey – Episode One

Most chapters have very little personnel committed to chapter work, for various reasons. 66% of the chapters have only one to four people involved in the core work; only one chapter has about ten people working on it.

Overall, the chapters use various marketing channels for promotion, visibility and outreach. The website as well as event participation, Twitter and Facebook are among the most popular channels they use.

The following chart shows how chapters currently communicate organizational and communication issues in their respective chapter and to the DBpedia Association.

The second chart shows that a third of the chapters favour an exchange among chapters and with the DBpedia Association via the discussion mailing list as well as regular chapter calls.

The survey results show that 66.6% of the chapters do not consider their current mode of communication efficient enough, and they think that their communication with the DBpedia Association should improve.

As pointed out before, most chapters have few personnel resources, so it is no wonder that most of them need help to improve the work and impact of their chapter. The following chart shows the kind of support chapters require to improve their overall work, organization and communication. Most noteworthy, technical, marketing and organizational support are the top three aspects the chapters need help with.

The good news is that all of the chapters maintain a DBpedia website. However, the frequency of updates varies among them; see the chart on the right.

Earlier this year, we announced that we would like to align all chapter websites with the main DBpedia website. That includes a common structure and a corporate design similar to the main one. Above all, this is important for the overall image and recognition of DBpedia in the tech community. With respect to that, we inquired whether chapters would like to participate in an alignment of the websites.

With respect to the marketing support the chapters require from the Association, more than 50% of the chapters would like to be frequently promoted via the main DBpedia Twitter channel.

Good news: just forward us your news or tag us with @dbpedia and we will share ’em.

Almost there.

Finally, we asked about the chapters' requirements for improving their work and the impact of their results.

Bottom line

All in all, we are very grateful for your contribution. This data will help us develop a strategy to work towards the targets mentioned above. We will now use it to conceptualize a small program to assist chapters in their organization and marketing endeavours. Furthermore, the information given will also help us tackle the issues that arose, implement the necessary support and improve chapter development and chapter visibility.

In episode two, we will delve into the results of the technical survey. Sit tight and follow us on Twitter, Facebook, LinkedIn or subscribe to our newsletter.

Finally, one last remark. If you want to promote news of your chapter or otherwise like to increase its visibility, you are always welcome to:

  • forward us the respective information to be promoted via our marketing channels,
  • use your own Twitter channel and tag your post with @dbpedia so we can retweet your news,
  • and always use #dbpediachapters.

Looking forward to your news.

Yours

DBpedia Association

Beta-Test Updates

While everyone at the DBpedia Association was preparing for the SEMANTiCS Conference in Vienna, we also managed to reach an important milestone regarding the beta-test for our data release tool.

First and foremost, 3500 files have already been published with the plugin. These files will be part of the new DBpedia release and are available on our LTS repository.

Secondly, the documentation of the testing has been brought into good shape. Feel free to drop by and check it out.

Thirdly, we reached our first interoperability goal: the current metadata is sufficient to produce RSS 1.0 feeds. See here for further information. We also defined a loose roadmap at the top of the readme, in which interoperability with DCAT and DCAT-AP has high priority.
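To illustrate what "metadata sufficient to produce RSS 1.0 feeds" means in practice, here is a minimal sketch of serializing release metadata as an RSS 1.0 (RDF) feed. This is not the actual Databus tooling or metadata schema; the `title`/`link`/`description` fields and example URLs are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RSS = "http://purl.org/rss/1.0/"

def make_rss10_feed(channel, items):
    """Serialize channel/item metadata as a minimal RSS 1.0 (RDF) feed."""
    ET.register_namespace("rdf", RDF)
    ET.register_namespace("", RSS)
    root = ET.Element(f"{{{RDF}}}RDF")

    # The channel element describes the feed itself.
    ch = ET.SubElement(root, f"{{{RSS}}}channel",
                       {f"{{{RDF}}}about": channel["link"]})
    for field in ("title", "link", "description"):
        ET.SubElement(ch, f"{{{RSS}}}{field}").text = channel[field]

    # Unlike RSS 2.0, RSS 1.0 requires an rdf:Seq enumerating the items.
    seq = ET.SubElement(ET.SubElement(ch, f"{{{RSS}}}items"), f"{{{RDF}}}Seq")
    for it in items:
        ET.SubElement(seq, f"{{{RDF}}}li", {f"{{{RDF}}}resource": it["link"]})
        item = ET.SubElement(root, f"{{{RSS}}}item",
                             {f"{{{RDF}}}about": it["link"]})
        ET.SubElement(item, f"{{{RSS}}}title").text = it["title"]
        ET.SubElement(item, f"{{{RSS}}}link").text = it["link"]

    return ET.tostring(root, encoding="unicode")

feed = make_rss10_feed(
    {"title": "Example dataset releases",
     "link": "https://example.org/releases",
     "description": "Announcements of new dataset versions"},
    [{"title": "example-dataset 2018.09.01",
      "link": "https://example.org/example-dataset/2018.09.01"}],
)
print(feed)
```

Because RSS 1.0 is itself RDF, a feed like this can be produced from the same dataset descriptions that drive the release tool, which is what makes it a natural first interoperability target.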

Now we have some time to support you one on one and to prepare the configurations that will help you set up your data releases. Lastly, we have already received data from DNB and SUMO, so we will start to look into these more closely.

Thanks to all the beta-testers for your nice work.

We keep you posted.

Yours

DBpedia Association

Beta-tests for the DBpedia Databus commence

Finally, we are proud to announce that the beta-testing of our data release tool for data releases on the DBpedia Databus is about to start.

In the past weeks, our developers at DBpedia have been building a new data release tool for releasing datasets on the DBpedia Databus. In that context, we are still looking for beta-testers who have a dataset they wish to release. Sign up here and benefit from increased visibility for your dataset and the work you have done.

We are now preparing the first internal test with our own dataset to ensure the data release tool is ready for the testers. During the testing process, beta-testers will discuss occurring problems, challenges and ideas for improvement via the DBpedia #releases channel on Slack to profit from each other’s knowledge and skills. Issues are documented via GitHub.

The whole testing process for the data release tool follows a 4-milestones plan:

Milestone One: Every tester needs to have a WebID to release data on the DBpedia Databus. In case you are interested in how to set up a WebID, our tutorial will help you a great deal.

Milestone Two: For their datasets, testers will generate DataIDs, which provide detailed descriptions of the datasets and their different manifestations, as well as relations to agents such as persons or organizations with regard to their rights and responsibilities.

Milestone Three: This milestone is considered achieved once an RSS feed can be generated. Additionally, bugs that arose during the previous phases should have been fixed by then. We also want to collect the testers' particular demands and wishes that would benefit the tool or the process. A second release can be attempted to check how the integrated fixes and changes work out.

Milestone Four: This milestone marks the final upload of the dataset to the DBpedia Databus, which we hope will be possible in about three weeks.

For updates on the beta-test, follow this link.

Looking forward to working with you…

Yours,

DBpedia Association

PS: In case you want to grab one of the last spots in the beta-testing team, just sign up here, get yourself a WebID and start testing.