Analysis and creation of network traffic datasets to detect computer attacks

V.V. Charugin, A.N. Chesalin


The paper examines how network traffic is analyzed and how network traffic datasets are formed for detecting network anomalies. It considers the NSL-KDD and UNSW-NB15 network attack datasets and identifies redundant network traffic features in them. The most significant features for anomaly detection are then selected. A new dataset of modern network attacks is created for testing machine learning algorithms. Machine learning methods (k-nearest neighbors classifier, random forest classifier, multilayer perceptron classifier, and XGBoost) are analyzed for the intrusion detection problem on both the studied datasets and the newly created one. Classification quality is evaluated with two metrics: Accuracy and F1-score. The results obtained in this work can be applied to testing machine learning methods and to developing intrusion detection systems.
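As an illustration of the two quality metrics named in the abstract, the following minimal sketch computes Accuracy and F1-score for a binary labeling in which 1 marks an attack record and 0 marks normal traffic. The label encoding, the function names, and the sample labels are assumptions made for this example only; they are not taken from the paper, which does not state here how F1-score is averaged over attack classes.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of records whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """F1 = 2*TP / (2*TP + FP + FN) for the class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denom = 2 * tp + fp + fn
    # With no positive records and no positive predictions, F1 is reported as 0.
    return 2 * tp / denom if denom else 0.0


# Hypothetical labels: 1 = attack, 0 = normal traffic (not data from the paper).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]

acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
print(f"Accuracy = {acc:.3f}, F1-score = {f1:.3f}")
```

On imbalanced traffic, where normal records far outnumber attacks, Accuracy can stay high even when many attacks are missed, which is one reason to report F1-score alongside it.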






ISSN: 2307-8162