Poster – Paper 632

Entity Suggestion Ranking via Context Hashing

Rima Türker, Maria Koutraki, Jörg Waitelonis and Harald Sack


October 23, 2017, Poster and Demo Reception, 18:30-21:20
Festsaal 1


In text-based semantic analysis, the task of named entity linking (NEL) establishes the fundamental link between unstructured data elements and knowledge base entities. The increasing number of applications that complement web data with knowledge base entities has led to a rich toolset of NEL frameworks [4,7]. To resolve linguistic ambiguities, NEL relates the available context information via statistical analysis, e.g. term co-occurrences in large text corpora, or via graph analysis, e.g. connected component analysis on the contextually induced knowledge subgraph. The semantic document annotation achieved via NEL algorithms can furthermore be complemented, upgraded, or even substituted by manual annotation, as e.g. in [5]. For this manual annotation task, a popular approach suggests a set of potential entity candidates that fit the text fragment selected by the user, who then decides on the correct entity for the annotation. The high degree of natural language ambiguity leads to huge sets of entity candidates that must be scanned and evaluated. To speed up this process and to enhance its usability, we propose a pre-ordering of the entity candidate set for a predefined context. The complex process of NEL context analysis is often too time-consuming to be applied in an online environment. Thus, we propose to speed up the context computation via an approximation based on the offline generation of context weight vectors. For each entity, a context vector is computed beforehand and is applied like a hash to quickly compute the most likely entity candidates with respect to a given context. In this paper, the process of entity hashing via context weight vectors is introduced. Context evaluation via weight vectors is evaluated on the test case of SciHi, a web blog on the history of science providing blog posts semantically annotated with DBpedia entities.
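The idea of applying a precomputed context vector "like a hash" can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the weights are computed or which contexts are used, so the context labels, the weight values, and the names `CONTEXT_VECTORS` and `rank_candidates` below are hypothetical. The sketch only shows that ranking at annotation time can reduce to a table lookup plus a sort once the vectors exist.

```python
from typing import Dict, List

# Hypothetical context weight vectors, assumed to be computed offline.
# Keys are entity identifiers; each vector maps a context label to a weight.
# All labels and numbers here are invented for illustration only.
CONTEXT_VECTORS: Dict[str, Dict[str, float]] = {
    "dbr:Mercury_(planet)": {"astronomy": 0.90, "chemistry": 0.05, "mythology": 0.10},
    "dbr:Mercury_(element)": {"astronomy": 0.05, "chemistry": 0.95, "mythology": 0.00},
    "dbr:Mercury_(mythology)": {"astronomy": 0.20, "chemistry": 0.00, "mythology": 0.90},
}


def rank_candidates(candidates: List[str], context: str) -> List[str]:
    """Order entity candidates by their precomputed weight for one context.

    Entities or contexts missing from the table get weight 0.0, so they
    sink to the end of the list instead of raising an error.
    """
    def weight(entity: str) -> float:
        return CONTEXT_VECTORS.get(entity, {}).get(context, 0.0)

    # sorted() is stable, so candidates with equal weight keep their input order.
    return sorted(candidates, key=weight, reverse=True)


if __name__ == "__main__":
    candidates = list(CONTEXT_VECTORS)
    for entity in rank_candidates(candidates, "chemistry"):
        print(entity)
```

In this sketch the only work done online is one dictionary lookup per candidate; any expensive context analysis is assumed to have happened when the table was built.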