We are pleased to announce the release of DBpedia 3.3. This release is based on Wikipedia dumps of May 2009.
The new release includes the following improvements over DBpedia 3.2:
1. more accurate abstract extraction
2. labels and abstracts in 80 languages
3. several infobox extraction bugfixes
4. new links to Dailymed, Diseasome, Drugbank, Sider, TCM
5. updated OpenCyc links
You can find the datasets here, and the RDF files here. The dataset can also be queried at our SPARQL endpoint.
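For readers who have not used a SPARQL endpoint before, here is a minimal sketch of how a label lookup might look in Python. It is an illustration, not part of the release: the endpoint URL `http://dbpedia.org/sparql` is an assumption (the announcement does not spell it out), the helper names `build_query`, `build_url` and `run_query` are hypothetical, and the example resource `Berlin` is chosen only for demonstration.

```python
import json
import urllib.parse
import urllib.request

# Assumption: public endpoint URL; the announcement itself does not state it.
ENDPOINT = "http://dbpedia.org/sparql"


def build_query(resource, lang="en", limit=1):
    """Build a SPARQL query for the rdfs:label of a DBpedia resource in one language."""
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?label WHERE {\n"
        f"  <http://dbpedia.org/resource/{resource}> rdfs:label ?label .\n"
        f'  FILTER (lang(?label) = "{lang}")\n'
        f"}} LIMIT {limit}"
    )


def build_url(query, endpoint=ENDPOINT):
    """Encode the query as a GET request URL, as the SPARQL protocol allows."""
    return f"{endpoint}?{urllib.parse.urlencode({'query': query})}"


def run_query(query, endpoint=ENDPOINT, timeout=30):
    """Send the query and parse the JSON result set (requires network access)."""
    request = urllib.request.Request(
        build_url(query, endpoint),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Example usage (not executed here because it needs network access):
#   results = run_query(build_query("Berlin", lang="de"))
#   for row in results["results"]["bindings"]:
#       print(row["label"]["value"])
```

Since labels are published in many languages, changing the `lang` argument is enough to request a different translation, provided one exists for that resource.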
After eight long months without a DBpedia release (due to a lack of Wikipedia dumps), today’s release brings us up to speed again, and we plan to release DBpedia datasets much more often in the future.
After quite some work on improving the DBpedia information extraction framework, we have released a new version of the DBpedia dataset today.
The renewed DBpedia dataset describes 1,950,000 “things”, including at least 80,000 persons, 70,000 places, 35,000 music albums and 12,000 films. It contains 657,000 links to images, 1,600,000 links to relevant external web pages and 440,000 external links into other RDF datasets. Altogether, the DBpedia dataset now consists of around 103 million RDF triples.
We worked on improving data quality to make the dataset more usable and useful to developers, and fixed many bugs submitted by our growing developer community. We also reworked our framework so that developers can extend the dataset with their own extractors.
We are grateful for all contributions and look forward to supporting new projects based on DBpedia data.