Research – Paper 237

Entity Comparison in RDF Graphs

Alina Petrova, Evgeny Sherkhonov, Bernardo Cuenca Grau and Ian Horrocks

Research


October 23, 2017, 11:00.

Lehár 4


Abstract

In many applications, there is an increasing need for new types of RDF data analysis that are not covered by standard reasoning tasks such as SPARQL query answering. One such important analysis task is entity comparison, i.e., determining the similarities and differences between two given entities in an RDF graph. For instance, in an RDF graph about drugs, we may want to compare Metamizole and Ibuprofen and automatically find out that they are similar in that they are both analgesics but, in contrast to Metamizole, Ibuprofen also has a considerable anti-inflammatory effect. Entity comparison is a widely used functionality available in many information systems, such as university or product comparison websites. However, comparison is typically domain-specific and depends on a fixed set of aspects to compare. In this paper, we propose a formal framework for domain-independent entity comparison over RDF graphs. We model similarities and differences between entities as SPARQL queries satisfying certain additional properties, and propose algorithms for computing them.
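
To make the idea of "similarities and differences as queries" concrete, here is a minimal sketch in plain Python. It is not the algorithm from the paper: it only checks whether a given basic graph pattern acts as a similarity (both entities are answers) or as a difference (one entity is an answer, the other is not). It ignores blank nodes and the additional properties the abstract alludes to. The toy graph, the predicate names and the helper functions (`match`, `answers`, `is_similarity`, `is_difference`) are all assumptions made for illustration.

```python
# Toy RDF graph as a set of (subject, predicate, object) triples.
# All names are illustrative and not taken from a real dataset.
GRAPH = {
    ("Metamizole", "type", "Analgesic"),
    ("Ibuprofen", "type", "Analgesic"),
    ("Ibuprofen", "hasEffect", "AntiInflammatory"),
    ("Aspirin", "type", "Analgesic"),
    ("Aspirin", "hasEffect", "AntiInflammatory"),
}


def is_var(term):
    """Variables are strings starting with '?', as in SPARQL."""
    return term.startswith("?")


def match(patterns, binding):
    """Yield every extension of `binding` that maps all triple patterns into GRAPH."""
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in GRAPH:
        new = dict(binding)
        ok = True
        for p_term, g_term in zip(first, triple):
            if is_var(p_term):
                # Bind the variable, or check it against its earlier binding.
                if new.setdefault(p_term, g_term) != g_term:
                    ok = False
                    break
            elif p_term != g_term:
                ok = False
                break
        if ok:
            yield from match(rest, new)


def answers(patterns, var="?x"):
    """Values of `var` over all matches; `var` must occur in the patterns."""
    return {b[var] for b in match(patterns, {})}


def is_similarity(patterns, a, b):
    """Hypothetical check: both entities are answers to the query."""
    return {a, b} <= answers(patterns)


def is_difference(patterns, a, b):
    """Hypothetical check: `a` is an answer to the query but `b` is not."""
    result = answers(patterns)
    return a in result and b not in result


analgesic = [("?x", "type", "Analgesic")]
anti_inflammatory = [("?x", "hasEffect", "AntiInflammatory")]

print(is_similarity(analgesic, "Metamizole", "Ibuprofen"))
print(is_difference(anti_inflammatory, "Ibuprofen", "Metamizole"))
```

In this toy graph the pattern `?x type Analgesic` holds for both drugs, while `?x hasEffect AntiInflammatory` holds for Ibuprofen only, which mirrors the example in the abstract.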


Guest

Héctor

Though I find the approach quite well defined and potentially useful, I worry about its scalability. How well would it work to find interesting commonalities/differences in a pool of millions of entities described using a model containing tens of thousands of properties?

Guest

Alina

Hi Héctor, thanks very much for the comment! Indeed, we are currently working on scalable algorithms for both (most specific) similarities and (most general) differences. 1) Although the complexity of finding a difference query is quite high, it stems from the presence of blank nodes. In a real-world scenario we would never hit the worst case. 2) In addition, in a reasonable scenario the size/depth of the query is bounded by some small value (for the sake of readability), in which case similarity and difference computation becomes scalable.

Guest

Artem Revenko

Very interesting to compare to this approach: https://link.springer.com/chapter/10.1007/978-3-319-60438-1_61

Guest

Alina

Thanks much for the reference, Artem!

Guest

Ernesto

I see a potential application in (traditional) instance matching, where one of the tasks is to find equivalent entities.

Guest

Alina

Hi Ernesto, thank you for the suggestion! Indeed, the framework could be used for equivalent and near-equivalent instance matching and discovery.
