Paper. The following article is open access.

Selection of scientific articles according to the degree of proximity to the semantic pattern of the title and phrases of the abstract

D V Mikhaylov and G M Emelyanov

Published under licence by IOP Publishing Ltd
Citation: D V Mikhaylov and G M Emelyanov 2019 J. Phys.: Conf. Ser. 1352 012034. DOI: 10.1088/1742-6596/1352/1/012034


Abstract

This article addresses the problem of numerically evaluating how close a thematic text is to the most rational (reference) linguistic expression of the fragment of knowledge it describes. The problem is relevant to the targeted selection of textual information without loss of its useful semantic content. Practical applications include the selection of articles for publication in scientific journals and the development of training courses and educational portals. In the proposed solution, the proximity of a text to a semantic pattern (i.e. a sense standard) is assessed by dividing the words of each phrase into classes according to their TF-IDF values relative to the texts of a corpus pre-formed by an expert. The texts analyzed in the paper are the abstracts of scientific articles together with their titles. The semantic images of the texts closest to the standard are determined by the words with the highest TF-IDF values; when such words are neighbors in the linear sequence of a phrase, they are most likely related in meaning and form key combinations. The proposed numerical estimate of proximity to the standard makes it possible to rank articles by the significance of the described fragments of knowledge for a given subject area, as well as by the non-redundancy of the description itself.
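The idea of dividing the words of a phrase into classes by TF-IDF and looking for adjacent high-scoring words can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the scoring or the class-splitting rule, so the choices here are assumptions made for illustration only. A word's score is taken as its maximum TF-IDF over the corpus documents, classes are equal-width bins over the score range, and the names `tf_idf`, `score_phrase`, `split_into_classes` and `key_combinations` are hypothetical, as is the toy corpus.

```python
import math
from collections import Counter


def tf_idf(word, doc, corpus):
    """Classic TF-IDF of `word` in the tokenized `doc` relative to `corpus`."""
    if not doc:
        return 0.0
    tf = Counter(doc)[word] / len(doc)
    df = sum(1 for d in corpus if word in d)
    return tf * math.log(len(corpus) / df) if df else 0.0


def score_phrase(phrase, corpus):
    """Assumption: a word's score is its maximum TF-IDF over all corpus documents."""
    return {w: max(tf_idf(w, d, corpus) for d in corpus) for w in set(phrase)}


def split_into_classes(scores, n_classes=3):
    """Assumption: equal-width bins over the score range; class 0 holds the top scores."""
    lo, hi = min(scores.values()), max(scores.values())
    width = (hi - lo) / n_classes or 1.0  # avoid division by zero if all scores are equal
    return {w: min(int((hi - s) / width), n_classes - 1) for w, s in scores.items()}


def key_combinations(phrase, classes):
    """Adjacent word pairs in the phrase where both words belong to the top class."""
    return [(a, b) for a, b in zip(phrase, phrase[1:])
            if classes[a] == 0 and classes[b] == 0]


# Toy expert corpus and phrase (made up for illustration).
corpus = [
    "semantic pattern semantic pattern of the text".split(),
    "the selection of the articles".split(),
    "the title of the paper".split(),
]
phrase = "the semantic pattern of the title".split()

scores = score_phrase(phrase, corpus)
classes = split_into_classes(scores)
print(key_combinations(phrase, classes))
```

On this toy input, words that occur in every corpus document ("the", "of") receive a zero score and fall into the lowest class, while "semantic" and "pattern" land in the top class and, being neighbors, are reported as a candidate key combination.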


Content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
