Enhancing the performance of IoT network intrusion detection models using NF-ToN-IoT-V2 and IoTID20 datasets with chi-square feature selection

Authors

  • Nadia Thereza, Universitas Indonesia
  • Pardomuan Raja Harahap, PT. Rajawali Nusantara Indonesia

DOI:

https://doi.org/10.62420/selco.v1i2.7

Keywords:

Intrusion detection system, Internet of things, NF-ToN-IoT-V2, IoTID20, Machine learning, Feature selection

Abstract

The expansion of the Internet of Things (IoT) increases the number and size of networks as well as the volume of sensitive and private data they carry. Consequently, IoT networks have become vulnerable to various threats and attacks. Researchers have recently devised intrusion detection systems (IDSs) to detect threats and attacks on IoT networks. However, previous studies developing IDSs for IoT networks predominantly used datasets that are too limited to depict actual IoT network characteristics. This research therefore used datasets containing network flow records from real IoT networks, namely NF-ToN-IoT-V2 and IoTID20. Various machine-learning algorithms, including random forest (RF), decision tree (DT), naïve Bayes, AdaBoost, and XGBoost, were trained and evaluated on these datasets to develop intrusion detection models. We evaluated model performance based on accuracy, precision, recall, F1-score, false positive rate, training time, and testing time. We used the chi-square algorithm for feature selection to select the most relevant features. The findings indicate that chi-square feature selection improves the performance of the detection models. With chi-square feature selection, the RF model, which achieves the highest accuracy, improves its accuracy by up to 0.42% on NF-ToN-IoT-V2 testing and 0.16% on IoTID20 testing. The DT model, which has the fastest training and testing time, reduces its time utilization by 6.98% on NF-ToN-IoT-V2 testing and 29.63% on IoTID20 testing.
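
The abstract does not describe how the chi-square step was implemented, so the following is only a minimal sketch of chi-square feature scoring and top-k selection. It assumes non-negative feature values and uses the common formulation that compares per-class feature sums against the sums expected if the feature were independent of the class. The function names (`chi2_scores`, `select_k_best`), the toy flow records, and the value of `k` are hypothetical and are not taken from the paper or from either dataset.

```python
from collections import defaultdict


def chi2_scores(X, y):
    """Return one chi-square score per feature (assumes non-negative values)."""
    n_samples = len(X)
    n_features = len(X[0])
    # Observed: sum of each feature's values within each class.
    observed = defaultdict(lambda: [0.0] * n_features)
    class_count = defaultdict(int)
    for row, label in zip(X, y):
        class_count[label] += 1
        for j, value in enumerate(row):
            if value < 0:
                raise ValueError("chi-square scoring needs non-negative features")
            observed[label][j] += value
    feature_total = [sum(row[j] for row in X) for j in range(n_features)]
    scores = []
    for j in range(n_features):
        score = 0.0
        for label, count in class_count.items():
            # Expected sum if the feature were independent of the class.
            expected = feature_total[j] * count / n_samples
            if expected > 0:
                score += (observed[label][j] - expected) ** 2 / expected
        scores.append(score)
    return scores


def select_k_best(X, y, k):
    """Keep the k highest-scoring feature columns, preserving column order."""
    scores = chi2_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    keep = sorted(ranked[:k])
    return keep, [[row[j] for j in keep] for row in X]


# Hypothetical toy flow records: [packet_count, syn_flag_count, constant]
X = [[10, 0, 1], [12, 0, 1], [11, 9, 1], [9, 8, 1]]
y = [0, 0, 1, 1]  # 0 = benign, 1 = attack (illustrative labels)
keep, X_reduced = select_k_best(X, y, k=1)
print(keep)
```

In this toy example the second column separates the two classes and receives the highest score, while the constant column scores zero. Dropping low-scoring columns in this way shrinks the input to the classifier, which is consistent with the reduced training and testing times reported above.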

Published

2025-06-05