Botnets are among the most prevalent online threats, causing billions in
losses to economies worldwide. Nowadays, the increasing number of
devices connected to the Internet makes it necessary to analyze large amounts
of network traffic data. In this work, we focus on improving the performance
of botnet traffic classification by selecting the features that contribute
most to the detection rate. For this purpose we use two feature selection
techniques, Information Gain and Gini Importance, which lead to three
pre-selected subsets of five, six and seven features. Then, we evaluate the
three feature subsets along with three models, Decision Tree, Random Forest and
k-Nearest Neighbors. To test the performance of the three feature vectors and
the three models, we generate two datasets based on the CTU-13 dataset, namely
QB-CTU13 and EQB-CTU13. We measure performance as the ratio of the
macro-averaged F1 score to the computational time required to classify a
sample. The results show that the highest performance is achieved by Decision
Trees using the five-feature set, which obtained a mean F1 score of 85% while
classifying each sample in an average time of 0.78 microseconds.
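
The two quantities named in the abstract, Information Gain for ranking features and the macro-averaged F1 score divided by the per-sample classification time, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy flow labels and the example features are all hypothetical, the Information Gain shown here only handles discrete features (continuous traffic features would first need to be binned), and the exact form and units of the performance ratio are an assumption based on the wording above.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature, labels):
    """Entropy reduction obtained by splitting the labels on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    # Weighted entropy that remains after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for cls in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def f1_per_time(y_true, y_pred, seconds_per_sample):
    """Assumed performance ratio: macro F1 divided by time per sample."""
    return macro_f1(y_true, y_pred) / seconds_per_sample


# Hypothetical toy data: four flows, two candidate features.
labels = ["bot", "bot", "normal", "normal"]
protocol = ["tcp", "tcp", "udp", "udp"]   # separates the classes perfectly
port_group = ["a", "b", "a", "b"]         # carries no class information

ranking = sorted(
    {"protocol": information_gain(protocol, labels),
     "port_group": information_gain(port_group, labels)}.items(),
    key=lambda item: item[1],
    reverse=True,
)

predictions = ["bot", "normal", "normal", "normal"]
score = macro_f1(labels, predictions)
ratio = f1_per_time(labels, predictions, seconds_per_sample=2e-6)
```

In this toy setup `protocol` ranks above `port_group` because it removes all uncertainty about the class. Dividing by the per-sample time rewards a model that is both accurate and fast, which is why a slightly less accurate but much quicker classifier can come out ahead under this kind of measure.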

Authors: <a href="http://arxiv.org/find/cs/1/au:+Velasco_Mata_J/0/1/0/all/0/1">Javier Velasco-Mata</a>, <a href="http://arxiv.org/find/cs/1/au:+Gonzalez_Castro_V/0/1/0/all/0/1">Víctor González-Castro</a>, <a href="http://arxiv.org/find/cs/1/au:+Fidalgo_E/0/1/0/all/0/1">Eduardo Fidalgo</a>, <a href="http://arxiv.org/find/cs/1/au:+Alegre_E/0/1/0/all/0/1">Enrique Alegre</a>