Tsinghua Science and Technology  2021, Vol. 26 Issue (4): 387-402    doi: 10.26599/TST.2020.9010021
DGA-Based Botnet Detection Toward Imbalanced Multiclass Learning
Yijing Chen, Bo Pang, Guolin Shao*, Guozhu Wen, Xingshu Chen
College of Cybersecurity, Sichuan University, Chengdu 610065, China.
Cybersecurity Research Institute, Sichuan University, Chengdu 610065, China.

Abstract

Botnets based on the Domain Generation Algorithm (DGA) mechanism pose great challenges to current mainstream detection methods because of their strong concealment and robustness. In addition, the complexity of DGA families and the imbalance of samples continue to impede research on DGA detection. In existing work, the sample size of each DGA family is regarded as the most important determinant of the resampling proportion; thus, differences in the characteristics of various samples are ignored, and the optimal resampling effect is not achieved. In this paper, a Long Short-Term Memory-based Property and Quantity Dependent Optimization (LSTM.PQDO) method is proposed. This method uses LSTM to automatically mine the comprehensive features of DGA domain names. It iteratively updates the resampling proportion, taking into account both the original number and the characteristics of the samples, and heuristically searches around the initial solution, in the right direction, for a better one; thus, dynamic optimization of the resampling proportion is realized. The experimental results show that the LSTM.PQDO method achieves better performance than existing models on unbalanced datasets; moreover, it can serve as a reference for sample resampling tasks in similar scenarios.
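To make the idea of a resampling proportion concrete, the following is a minimal sketch of the quantity-only baseline that the abstract criticizes, followed by one heuristic adjustment step. It is not the PQDO procedure from the paper: the function names, the `smoothing` exponent, the `step` size, and the per-family scores are all assumptions introduced purely for illustration.

```python
from collections import Counter


def quantity_based_targets(labels, smoothing=0.5):
    """Derive per-family target sample counts from family sizes alone.

    Hypothetical baseline: a family of size n is rescaled to
    n_max * (n / n_max) ** smoothing, so smoothing=1.0 keeps the original
    sizes and smoothing=0.0 brings every family up to the largest one.
    """
    counts = Counter(labels)
    n_max = max(counts.values())
    return {
        family: max(n, round(n_max * (n / n_max) ** smoothing))
        for family, n in counts.items()
    }


def nudge_weakest(targets, scores, step=0.1):
    """One illustrative search step around the current solution.

    Raises the target of the family with the lowest validation score
    (e.g. per-family F1, assumed to be supplied by the caller) and leaves
    the other families unchanged.
    """
    weakest = min(scores, key=scores.get)
    updated = dict(targets)
    updated[weakest] += max(1, round(targets[weakest] * step))
    return updated


# Toy data: one large family and one rare family (made-up names).
labels = ["family_a"] * 100 + ["family_b"] * 4
targets = quantity_based_targets(labels)
adjusted = nudge_weakest(targets, {"family_a": 0.95, "family_b": 0.60})
```

In this sketch the initial targets depend only on family sizes, while the adjustment step reacts to how well each family is actually classified, which is the kind of property-aware refinement the abstract motivates.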

Received: 04 October 2019      Published: 12 January 2021
Fund: National Natural Science Foundation of China (61272447); National Entrepreneurship & Innovation Demonstration Base of China (C700011); Key Research & Development Project of Sichuan Province of China (2018G20100)
Corresponding Author: Guolin Shao
E-mail: 2016141531037@stu.scu.edu.cn; 2016141231157@stu.scu.edu.cn; sgllearn@163.com; 2016141531010@stu.scu.edu.cn; chenxsh@scu.edu.cn
About the authors:
Yijing Chen received the bachelor's degree from Sichuan University, Chengdu, China in 2020. Her research interests include deep learning for network security and big data analysis. She won the network security scholarship of Sichuan University in 2017 and 2018, the first prize of the 12th China University Computer Design Competition in 2019, and an Honorable Mention in the Mathematical Contest in Modeling in 2019, among other awards.
Bo Pang received the bachelor's degree from Sichuan University, Chengdu, China in 2020. His research interests include Web security, deep learning for Web security, and big data analysis. He has received numerous honors and awards, including fifth place in the Sichuan University AI Challenge in 2018 and the third prize of the Sichuan University Student Information Security Technology Competition.
Guolin Shao received the BS and PhD degrees from Sichuan University, Chengdu, China in 2013 and 2018, respectively. His research interests include deep learning for cyber security and big data analysis. He has published more than 16 peer-reviewed papers. He has received numerous honors and awards, including the National Cyber Security Scholarship in 2016 and 2018, the National Scholarship in 2017, the Top Ten Academic Star of Sichuan University, and the First Prize Scholarship in 2013, 2015, and 2017.
Guozhu Wen received the bachelor's degree from Sichuan University, Chengdu, China in 2020. His research interests include Web security, deep learning for Web security, and big data analysis. He has received numerous honors and awards, including fifth place in the Sichuan University AI Challenge in 2018 and the third prize of the Sichuan University Student Information Security Technology Competition.
Xingshu Chen is a full professor at the College of Cybersecurity, Sichuan University, Chengdu, China. She received the master's and PhD degrees from Sichuan University in 1999 and 2004, respectively. Her main research interests are cloud computing, big data analysis, and network security.