Published on Sun Jul 15 2018

Ontology-Based Query Expansion with Latently Related Named Entities for Semantic Text Search

Vuong M. Ngo, Tru H. Cao


Abstract

Traditional information retrieval systems represent documents and queries as sets of keywords. However, the content of a document or a query is mainly defined by both the keywords and the named entities occurring in it. Named entities have ontological features, namely their aliases, classes, and identifiers, which are hidden from their textual appearance. In addition, the meaning of a query may imply latent named entities that are related to the apparent ones in the query. We propose an ontology-based generalized vector space model for semantic text search. It exploits the ontological features of named entities and of their latently related entities to reveal the semantics of documents and queries. We also propose a framework that combines different ontologies to exploit their complementary advantages for semantic annotation and searching. Experiments on a benchmark dataset show that our model achieves better search quality than the other models.
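To make the idea concrete, the following is a minimal sketch of a multi-space vector model with query expansion by related entities. It is not the authors' formulation: the space names ("keyword", "identifier"), the linear weighting scheme, the 0.5 expansion weight, the entity identifiers, and the `related` lookup table are all hypothetical choices made for illustration. Documents and queries hold one sparse vector per feature space, and the overall similarity is assumed to be a weighted sum of per-space cosine similarities.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def combined_similarity(doc, query, weights):
    """Weighted sum of per-space cosine similarities (assumed combination rule)."""
    return sum(
        w * cosine(doc.get(space, {}), query.get(space, {}))
        for space, w in weights.items()
    )


def expand_query(query, related, expansion_weight=0.5):
    """Add entities related to the query's entities, at a lower (hypothetical) weight."""
    expanded = {space: dict(vec) for space, vec in query.items()}
    ids = expanded.setdefault("identifier", {})
    for entity in list(ids):
        for other in related.get(entity, []):
            ids.setdefault(other, expansion_weight)
    return expanded


# Toy data: all identifiers and relations below are made up for illustration.
related = {"E_vietnam": ["E_hanoi", "E_saigon"]}
doc = {"keyword": {"population": 1.0}, "identifier": {"E_hanoi": 1.0}}
query = {"keyword": {"population": 1.0}, "identifier": {"E_vietnam": 1.0}}
weights = {"keyword": 0.5, "identifier": 0.5}

before = combined_similarity(doc, query, weights)
after = combined_similarity(doc, expand_query(query, related), weights)
print(f"before expansion: {before:.3f}, after expansion: {after:.3f}")
```

In this toy case the document mentions only an entity related to the one in the query, so the identifier space contributes nothing until the query is expanded. A real system would draw the related entities from an ontology rather than a hand-written table.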

Fri Jul 20 2018
NLP
A Generalized Vector Space Model for Ontology-Based Information Retrieval
Named entities (NE) are objects that are referred to by name, such as people, organizations, and locations. Named entities and keywords are both important to the meaning of a document. We propose a generalized vector space model that combines named entities and keywords.
Fri Aug 05 2016
NLP
Bridging the Gap: Incorporating a Semantic Similarity Measure for Effectively Mapping PubMed Queries to Documents
The main approach of traditional information retrieval (IR) is to examine how many words from a query appear in a document. Unlike other similarity measures, the proposed method relies on neural word embeddings to compute the distance between words. This process helps identify related words when no direct matches are found.
Thu Apr 23 2020
NLP
Coupled intrinsic and extrinsic human language resource-based query expansion
Poor information retrieval performance has often been attributed to the query-document vocabulary mismatch problem. To alleviate this problem, query expansion processes are applied to add terms to an initial query. This requires accurate identification of the main query concepts.
Thu Nov 16 2017
Artificial Intelligence
Remedies against the Vocabulary Gap in Information Retrieval
Search engines rely heavily on term-based approaches that represent queries and documents as bags of words. A high matching degree at the term level does not necessarily mean high relevance. Documents that match no query terms may still be highly relevant.
Sun Jul 29 2018
Artificial Intelligence
Discovering Latent Information By Spreading Activation Algorithm For Document Retrieval
Spreading activation is an algorithm for finding latent information in a query by exploiting relations between nodes in an associative or semantic network. The classical spreading activation algorithm uses all relations of a node in the network, which adds unsuitable information to the query.
Thu Mar 07 2013
NLP
Concept-based indexing in text information retrieval
Traditional information retrieval systems rely on keywords to index documents and queries. We propose a novel approach for semantic indexing based on concepts identified from a linguistic resource. The resulting system is evaluated on the TIME test collection.