Accuracy analysis of machine learning models using vectorization methods for heterogeneous text data classification tasks
References
G. O. Young, “Synthetic structure of industrial plastics (Book style with paper title and editor),” in Plastics, 2nd ed. vol. 3, J. Peters, Ed. New York: McGraw-Hill, 1964, pp. 15–64.
Eprev A.S. Automatic classification of text documents. Mathematical structures and modeling. 2010. Issue 21, pp. 65–81.
Poletaeva N.G. Classification of machine learning systems. Bulletin of the Immanuel Kant Baltic Federal University. Series: Physical, mathematical and technical sciences. 2020. No. 1, pp. 5–22.
Fedyushkin N.A., Fedosin S.A. On the choice of methods for vectorization of textual information. Scientific and technical bulletin of the Volga region. 2019. Vol. 6, pp. 129–134.
Multi-Lingual Lyrics for Genre Classification [Online]. Available: https://www.kaggle.com/datasets/mateibejan/multilingual-lyrics-for-genre-classification. Accessed: 21.02.2022
(10)Dataset Text Document Classification [Online]. Available: https://www.kaggle.com/datasets/jensenbaxter/10dataset-text-document-classification. Accessed: 21.02.2022
Klimov D.V. Preprocessing of text messages for the metric classifier. Science symbol. 2017. No. 12, pp. 25–32.
Musaev A.A. et al. Review of modern technologies for extracting knowledge from text messages. Computer research and modeling. 2021. Vol. 13, No. 6, pp. 1291–1315. DOI: 10.20537/2076-7633-2021-13-6-1291-1315
Bolshakova E.I., Vorontsov K.V., Efremova N.E., Klyshinsky E.S., Lukashevich N.V., Sapin A.S. Automatic processing of texts in natural language and data analysis: a textbook. Moscow: Publishing House of the National Research University Higher School of Economics. 2017. 269 p.
sklearn.feature_extraction.text.HashingVectorizer, scikit-learn 1.0.2 documentation [Online]. Available: https://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.HashingVectorizer.html. Accessed: 3.04.2022
ISSN: 2307-8162