Building ontological knowledge links based on searching and analyzing text links

D.S. Miheev

Abstract


The article discusses the main aspects of controlling the referential connectivity of a set of electronic texts, using the field of biomedicine as an example. It analyzes the difficulty of finding occurrences of one term within the text of another term's definition, and considers the broader problem of detecting potential text links in the definition of a given term. Two main problems are identified: terms appear in varying grammatical and syntactic forms (different endings, cases, prepositions), and some term names consist of several words, which requires phrase search. The priority tasks are therefore determining the stem of each word in a term's name and searching for occurrences in a way that accounts for phrases. Possible solutions based on modern text processing methods are presented: splitting into n-grams, stemming, and regular expressions. The author proposes creating a software tool that links a finite set of documents by building ontological links from the analysis of the identified text links.
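As a rough illustration of how the three techniques named in the abstract could fit together, the Python sketch below tokenizes a definition with a regular expression, reduces each word to a crude stem, and compares n-grams of stems against the stemmed term names. It is not the author's tool: the suffix list, the function names (`crude_stem`, `ngrams`, `find_links`) and the sample glossary are assumptions made for this example, and the toy stemmer merely stands in for a real algorithm such as the Snowball stemmer.

```python
import re

# Hypothetical suffix list for a toy English stemmer (an assumption for
# illustration; a real system would use a proper stemming algorithm).
SUFFIXES = ("ing", "ed", "es", "s")

# Hypothetical glossary of term names; some consist of several words.
TERMS = ["red blood cell", "hemoglobin", "bone marrow", "platelet"]


def tokenize(text):
    """Split text into lowercase words using a regular expression."""
    return re.findall(r"[^\W\d_]+", text.lower())


def crude_stem(word):
    """Strip one known suffix if at least three letters remain."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def ngrams(tokens, n):
    """Return all contiguous n-token sequences as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def find_links(definition, terms):
    """Return the terms whose stemmed name occurs in the definition."""
    stems = [crude_stem(token) for token in tokenize(definition)]
    found = []
    for term in terms:
        term_stems = tuple(crude_stem(token) for token in tokenize(term))
        # A multi-word term is matched as an n-gram of stems, so that
        # "red blood cells" still links to the term "red blood cell".
        if term_stems in set(ngrams(stems, len(term_stems))):
            found.append(term)
    return found


if __name__ == "__main__":
    text = "Anemia is a deficiency of red blood cells or of the hemoglobin they carry."
    print(find_links(text, TERMS))
```

Each term returned for a definition is a candidate text link from the defined term to the matched term; comparing stems rather than surface forms is what lets inflected occurrences match, and comparing n-grams is what handles term names made of several words.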








ISSN: 2307-8162