Demo – Paper 634

Efficient synonym search by semantic linking of multiple data sets

Kenny Knecht, Bérénice Wulbrecht, Filip Pattyn and Hans Constandt

Demo



Abstract

We describe a method that automatically selects a highly relevant subset of synonyms to broaden a keyword-based text search. Public data sets in the biomedical domain tend to provide a plethora of synonyms or alternative names: it is not uncommon to encounter a chemical or disease with more than 50 different names in data sets such as UMLS or ChEMBL. Submitting all of these names may result in inefficient searches and sometimes even in false positives. Through semantic linking of several data sets we define a heuristic that increases the power of the search while making it more efficient. We evaluated the method on the 500 most common keyword searches in DISQOVER in the first 6 months of 2017 and present the results: we obtain more than 98% of the hits by submitting only 16% of the synonyms. The method is implemented in the semantic web platform DISQOVER (www.disqover.com), where it is available to everyone. It is presented as a visual suggestion that the user can override manually at any time. Although our examples and concrete implementation focus on the biomedical databases in the publicly available DISQOVER, we stress that the method is much more generally applicable.
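
The abstract does not spell out the heuristic itself, so the following is only an illustrative sketch and not the DISQOVER implementation. It assumes one plausible linking-based rule (keep a synonym only if at least `min_datasets` linked data sets list it) and shows how the two reported quantities, the fraction of hits retrieved and the fraction of synonyms submitted, could be computed. The function names, the threshold, the data set labels, and the toy document IDs are all hypothetical.

```python
from collections import Counter


def select_synonyms(synonyms_by_dataset, min_datasets=2):
    """Keep synonyms listed by at least `min_datasets` data sets.

    Assumed rule for illustration only; the paper's heuristic may differ.
    """
    counts = Counter()
    for names in synonyms_by_dataset.values():
        # Lower-case and de-duplicate so each data set votes once per name.
        counts.update({name.lower() for name in names})
    return {name for name, n in counts.items() if n >= min_datasets}


def coverage(selected, hits_by_synonym):
    """Return (fraction of hits retrieved, fraction of synonyms submitted)."""
    all_hits = set().union(*hits_by_synonym.values())
    found = set()
    for synonym in selected:
        found |= hits_by_synonym.get(synonym, set())
    hit_fraction = len(found) / len(all_hits) if all_hits else 0.0
    synonym_fraction = (
        len(selected) / len(hits_by_synonym) if hits_by_synonym else 0.0
    )
    return hit_fraction, synonym_fraction


# Toy input: hypothetical data set labels and made-up document IDs.
synonyms_by_dataset = {
    "dataset_a": ["Aspirin", "acetylsalicylic acid", "ASA"],
    "dataset_b": ["aspirin", "Acetylsalicylic acid", "2-acetoxybenzoic acid"],
    "dataset_c": ["aspirin", "Ecotrin"],
}
hits_by_synonym = {
    "aspirin": {1, 2, 3, 4},
    "acetylsalicylic acid": {3, 4, 5},
    "asa": {5},
    "2-acetoxybenzoic acid": {6},
    "ecotrin": {2},
}

selected = select_synonyms(synonyms_by_dataset)
hit_fraction, synonym_fraction = coverage(selected, hits_by_synonym)
```

In this toy case two of the five names are submitted and five of the six documents are still retrieved, which mirrors in miniature the kind of trade-off the evaluation reports.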