Semantic Classification of Russian Prepositional Phrases with Transformer Embeddings

A. V. Belyi, D. V. Boitsova, E. A. Botvineva, V. V. Vybornaya, A. M. Goncharova, O. A. Mitrofanova, A. A. Rodina


The article describes the frequency characteristics of the relationship between prepositions and their meanings in a database of Russian prepositions, and addresses the task of building an effective semantic classifier of prepositional phrases that is trained and tested on this dataset. The database was created within the framework of the project ‘Quantitative Grammar of Russian Prepositional Constructions’, developed at the Department of Mathematical Linguistics of Saint Petersburg State University. The study also draws on a corpus of 200 syntactically ambiguous sentences described in D.A. Chernova’s doctoral research “The Process of Processing Syntactically Ambiguous Sentences: A Psycholinguistic Study”. The present work proposes a novel tree-based classifier architecture consisting of a main multiclass classifier and a supportive binary classifier. This architecture significantly improves performance compared to previous work, both overall and on classes that were previously troublesome because they were frequently confused with one another. Experiments were conducted with different types of classifiers and with various embedding models for the Russian language, which were used to encode the dataset. The best solution achieves an F1-score of 0.76 using SVM classifiers and the DeepPavlov/rubert-base-cased model.
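The abstract does not specify how the main and supportive classifiers are connected, so the following is only a minimal sketch of one plausible arrangement: the multiclass classifier makes a first prediction, and any sample assigned to one of two frequently confused classes is passed to a binary classifier trained on those two classes alone. Everything in it is an assumption made for illustration: the class names `CentroidClassifier` and `TwoStageClassifier`, the routing rule, the labels "locative", "temporal" and "directive", and the two-dimensional toy vectors that stand in for transformer embeddings. The nearest-centroid model is a dependency-free placeholder, not the SVM reported in the article.

```python
from collections import defaultdict
from math import dist


class CentroidClassifier:
    """Toy stand-in for an SVM: predicts the label of the nearest class centroid."""

    def fit(self, X, y):
        groups = defaultdict(list)
        for vec, label in zip(X, y):
            groups[label].append(vec)
        # Average each feature column per class to obtain one centroid per label.
        self.centroids_ = {
            label: [sum(col) / len(vecs) for col in zip(*vecs)]
            for label, vecs in groups.items()
        }
        return self

    def predict(self, X):
        return [
            min(self.centroids_, key=lambda lab: dist(vec, self.centroids_[lab]))
            for vec in X
        ]


class TwoStageClassifier:
    """Main multiclass model plus a binary model for one confused class pair.

    Assumption: the routing rule below is hypothetical and is not taken
    from the article.
    """

    def __init__(self, main, support, confused_pair):
        self.main = main
        self.support = support
        self.confused_pair = set(confused_pair)

    def fit(self, X, y):
        # The main classifier sees every class.
        self.main.fit(X, y)
        # The supportive classifier sees only the two confused classes.
        pair_rows = [(v, lab) for v, lab in zip(X, y) if lab in self.confused_pair]
        self.support.fit([v for v, _ in pair_rows], [lab for _, lab in pair_rows])
        return self

    def predict(self, X):
        first_pass = self.main.predict(X)
        # Re-decide only those samples that landed in the confused pair.
        return [
            self.support.predict([vec])[0] if label in self.confused_pair else label
            for vec, label in zip(X, first_pass)
        ]


# Toy 2-D vectors standing in for sentence embeddings (hypothetical data).
X_train = [[0, 0], [0, 1], [5, 5], [5, 6], [10, 0], [10, 1]]
y_train = ["locative", "locative", "temporal", "temporal", "directive", "directive"]

model = TwoStageClassifier(
    main=CentroidClassifier(),
    support=CentroidClassifier(),
    confused_pair=("locative", "temporal"),
).fit(X_train, y_train)

print(model.predict([[0.2, 0.3], [4.5, 5.0], [9.0, 0.0]]))
```

Because `TwoStageClassifier` relies only on `fit` and `predict`, the placeholder models could be swapped for any classifier exposing the same two methods, for example an SVM trained on embeddings from a model such as DeepPavlov/rubert-base-cased.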



ISSN: 2307-8162