In-Use – Paper 424

Sustainable Linked Data generation: the case of DBpedia

Wouter Maroy, Anastasia Dimou, Dimitris Kontokostas, Ben De Meester, Jens Lehmann, Erik Mannens and Sebastian Hellmann

In-Use

clock_eventOctober 17, 2017, 16:30.
house Stolz 1
download Download paper (preprint)

Abstract

dbpedia ef, the generation framework behind one of the Linked Open Data cloud’s central interlinking hubs, has limitations with regard to quality, coverage and sustainability of the generated dataset. dbpedia can be further improved both on schema and data level. Errors and inconsistencies can be addressed by amending (i) the dbpedia ef; (ii) the dbpedia mapping rules; or (iii) Wikipedia itself from which it extracts information. However, even though the dbpedia ef and mapping rules are continuously evolving and several changes were applied to both of them, there are no significant improvements on the dbpedia dataset since its limitations were identified. To address these shortcomings, we propose adapting a different semantic-driven approach that decouples, in a declarative manner, the extraction, transformation and mapping rules execution. In this paper, we provide details regarding the new dbpedia ef, its architecture, technical implementation and extraction results. This way, we achieve an enhanced data generation process, which can be broadly adopted, and that improves its quality, coverage and sustainability.