Applying machine learning algorithms to ensure the quality of requirements specifications
Abstract
This article addresses the problem of ensuring the quality of requirements specifications for complex technical systems. Its purpose is to apply neural networks together with classification and clustering algorithms to check requirements specifications for consistency and individual requirements for atomicity. The working hypothesis is that neural networks can provide vector representations of textual requirement statements, which can then be used to identify inconsistencies in a specification and to check whether individual requirements are atomic. The article demonstrates three natural language processing techniques: fastText, doc2vec, and BERT. K-means clustering is used to find inconsistencies, on the assumption that requirements falling into the same cluster are potentially conflicting. Atomicity is checked with gradient boosting over decision trees. The study found that, among the methods compared, the pretrained BERT network gives the best vector representations of requirements for both tasks: clustering with k-means and classification with gradient boosting. Training a doc2vec model on requirements specifications proved impractical, because the number of requirements in a specification is usually too small for training, and fastText does not take into account the semantics of the full requirement statement. The article concludes with a comparison of the results obtained with the natural language processing methods considered.
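The clustering assumption described above (requirements that land in the same cluster are candidates for a conflict check) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses only the Python standard library, a normalised bag-of-words vector stands in for the BERT, doc2vec, or fastText embeddings, and the names `embed`, `kmeans`, and `candidate_conflicts` as well as the sample requirements are hypothetical.

```python
import math
import re
from collections import Counter
from itertools import combinations


def embed(text, vocabulary):
    """Toy stand-in for a sentence embedding: L2-normalised word counts.
    (Assumption: a real pipeline would use BERT or a similar model here.)"""
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    vec = [float(counts[w]) for w in vocabulary]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def kmeans(vectors, k, iterations=20):
    """Plain Lloyd's k-means, seeded with the first k vectors for determinism."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centroids[c])))
            for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def candidate_conflicts(requirements, k):
    """Return index pairs of requirements that share a cluster."""
    vocabulary = sorted(
        {w for r in requirements for w in re.findall(r"[a-z]+", r.lower())}
    )
    vectors = [embed(r, vocabulary) for r in requirements]
    labels = kmeans(vectors, k)
    return [
        (i, j)
        for i, j in combinations(range(len(requirements)), 2)
        if labels[i] == labels[j]
    ]


# Hypothetical sample requirements, invented for illustration only.
requirements = [
    "The pump shall start within 2 seconds",
    "The display shall use a blue background",
    "The pump shall start within 5 seconds",
    "The display shall use a green background",
]
pairs = candidate_conflicts(requirements, k=2)
print(pairs)
```

In this sketch the clustering only shortlists pairs for review. Whether two requirements in the same cluster actually contradict each other (for example, the two start-up times) still has to be decided by a separate check or by a human reviewer.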
Full Text: PDF (Russian)
ISSN: 2307-8162