Enhancing Rasa NLU model for Vietnamese chatbot

Trang Nguyen, Maxim Shcherbakov

Abstract


Nowadays, the use of chatbots in industry and education has increased substantially. Building a chatbot system with traditional methods is less effective than applying machine learning (ML) methods. Earlier chatbots were based on finite-state machines, rules, knowledge bases, etc., but these methods have limitations. Recently, thanks to advances in natural language processing (NLP) and neural networks (NN), conversational AI systems have made significant progress in many tasks such as intent classification, entity extraction, and sentiment analysis. In this paper, we implement a Vietnamese chatbot for COVID-19 information that is capable of understanding natural language. It can generate responses, take actions for the user, and remember the context of the conversation. We use the Rasa platform to build the chatbot and present a method that uses a custom pipeline for the NLU model. In our work, we apply the pre-trained language models FastText and BERT and create a custom tokenizer for our pipelines. Applying pre-trained language models to the NLU model gives better results than training the model from scratch.
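
The abstract does not say how the custom tokenizer works, so the following is only an illustrative sketch and not the authors' implementation. Written Vietnamese separates syllables with spaces, and many words span several syllables, so a whitespace tokenizer splits words apart. The sketch uses greedy longest-match segmentation against a tiny lexicon. The lexicon, the `segment` function, and the underscore joining convention are assumptions made for illustration; a real pipeline would more likely call a trained segmenter and wrap it in a Rasa tokenizer component.

```python
import unicodedata

# Hypothetical toy lexicon of multi-syllable Vietnamese words (illustration only).
LEXICON = {
    unicodedata.normalize("NFC", w)
    for w in ("xin chào", "triệu chứng", "khẩu trang", "việt nam")
}
MAX_SYLLABLES = 3  # longest lexicon entry we try to match, counted in syllables


def segment(text, lexicon=LEXICON, max_syllables=MAX_SYLLABLES):
    """Greedy longest-match segmentation over whitespace-separated syllables."""
    syllables = unicodedata.normalize("NFC", text).split()
    tokens = []
    i = 0
    while i < len(syllables):
        # Try the longest candidate first, shrinking down to one syllable.
        for n in range(min(max_syllables, len(syllables) - i), 0, -1):
            candidate = " ".join(syllables[i:i + n])
            # A single syllable is always accepted as a fallback token.
            if n == 1 or candidate.lower() in lexicon:
                tokens.append(candidate.replace(" ", "_"))
                i += n
                break
    return tokens


if __name__ == "__main__":
    print(segment("Triệu chứng của COVID-19 là gì"))
```

In this sketch the two syllables of "Triệu chứng" (symptom) are kept together as one token, while the remaining syllables fall back to single-syllable tokens because they are not in the toy lexicon.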


ISSN: 2307-8162