Published on Tue Apr 18 2017

Semantic Similarity from Natural Language and Ontology Analysis

Sébastien Harispe, Sylvie Ranwez, Stefan Janaqi, Jacky Montmain

The aim of this monograph is to guide novices as well as researchers of these domains towards a better understanding of semantic similarity estimation. The two state-of-the-art approaches for estimating and quantifying semantic similarities/relatedness of semantic entities are presented in detail.

Abstract

Artificial Intelligence federates numerous scientific fields with the aim of developing machines able to assist human operators performing complex tasks, most of which demand high cognitive skills (e.g. learning or decision processes). Central to this quest is giving machines the ability to estimate the likeness or similarity between things in the way human beings estimate the similarity between stimuli. In this context, this book focuses on semantic measures: approaches designed for comparing semantic entities such as units of language (e.g. words, sentences) or concepts and instances defined in knowledge bases. The aim of these measures is to assess the similarity or relatedness of such semantic entities by taking into account their semantics, i.e. their meaning. Intuitively, the words tea and coffee, which both refer to stimulating beverages, will be estimated to be more semantically similar than the words toffee (a confection) and coffee, even though the latter pair has a higher syntactic similarity. The two state-of-the-art approaches for estimating and quantifying semantic similarities/relatedness of semantic entities are presented in detail: the first relies on corpora analysis and is based on Natural Language Processing techniques and semantic models, while the second is based on more or less formal, computer-readable and workable forms of knowledge such as semantic networks, thesauri or ontologies. (...) Beyond a simple inventory and categorization of existing measures, the aim of this monograph is to guide novices as well as researchers of these domains towards a better understanding of semantic similarity estimation and, more generally, semantic measures.
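
The tea/coffee/toffee contrast in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a measure taken from the monograph: the character-level ratio from the standard library's `difflib.SequenceMatcher` stands in for the surface ("syntactic") similarity of the words, and a hand-made toy taxonomy (`PARENT`, invented here) stands in for a knowledge base. The function names and the `1 / (1 + path length)` scoring are likewise illustrative choices.

```python
from difflib import SequenceMatcher

# Hypothetical toy taxonomy (child -> parent), invented for illustration only.
PARENT = {
    "tea": "stimulating_beverage",
    "coffee": "stimulating_beverage",
    "stimulating_beverage": "beverage",
    "beverage": "food",
    "toffee": "confection",
    "confection": "food",
    "food": None,  # root of the toy taxonomy
}


def surface_similarity(a: str, b: str) -> float:
    """Character-level similarity of the two strings, in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()


def ancestors(node: str) -> list:
    """Chain of nodes from `node` up to the root, `node` included."""
    chain = []
    while node is not None:
        chain.append(node)
        node = PARENT[node]
    return chain


def path_similarity(a: str, b: str) -> float:
    """1 / (1 + number of edges on the path through the closest common ancestor)."""
    steps_from_a = {n: i for i, n in enumerate(ancestors(a))}
    for steps_from_b, n in enumerate(ancestors(b)):
        if n in steps_from_a:
            return 1.0 / (1.0 + steps_from_a[n] + steps_from_b)
    return 0.0  # no common ancestor (cannot happen in a single-rooted tree)


if __name__ == "__main__":
    for pair in [("tea", "coffee"), ("toffee", "coffee")]:
        print(pair, round(surface_similarity(*pair), 3), round(path_similarity(*pair), 3))
```

With this toy data the two rankings disagree: the string-based score favors toffee/coffee, while the taxonomy-based score favors tea/coffee. That disagreement is the distinction the abstract draws between surface form and meaning.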

Fri Oct 04 2013
NLP
Semantic Measures for the Comparison of Units of Language, Concepts or Instances from Text and Knowledge Base Analysis
Semantic measures are widely used today to estimate the strength of the relationship between elements of various types. Semantic measures generalize the well-known notions of semantic similarity, semantic relatedness and semantic distance.
Sun May 15 2016
NLP
A Proposal for Linguistic Similarity Datasets Based on Commonality Lists
Similarity is a core notion that is used in psychology and two branches of linguistics: theoretical and computational. The latter makes humans assign low similarity scores to words that have nothing in common. The former makes similarity scores ambiguous, making them difficult to assess.
Sun Apr 19 2020
NLP
Evolution of Semantic Similarity -- A Survey
Estimating the semantic similarity between text data remains a challenging, open research problem in the field of Natural Language Processing. Various semantic similarity methods have been proposed over the years. This survey article traces the evolution of such methods, categorizing them based on their underlying principles.
Wed Jan 15 2014
NLP
Text Relatedness Based on a Word Thesaurus
A measure of relatedness between text segments must take into account both the lexical and the semantic relatedness. The approach exploits only a word thesaurus in order to devise implicit semantic links between words. The proposed method outperforms every lexicon-based method.
Wed Mar 23 2016
NLP
Evaluating semantic models with word-sentence relatedness
Semantic textual similarity (STS) systems are designed to encode and evaluate the semantic similarity between words, phrases, sentences, and documents. One method for assessing the quality or authenticity of semantic information encoded in these systems is by comparison with human judgments.
Fri Aug 15 2014
NLP
SimLex-999: Evaluating Semantic Models with (Genuine) Similarity Estimation
SimLex-999 explicitly quantifies similarity rather than association or relatedness. Its diversity enables fine-grained analyses of the performance of models on concepts of different types.