A DistilBERT-Based Email Phishing Detection System with SDN-Orchestrated Enforcement

Abstract

Email phishing is a sophisticated form of cybercrime in which attackers impersonate trusted parties, typically through spoofed emails, to defraud their targets. This paper proposes NetShield-Phish, a novel multi-modal phishing detection system that combines DistilBERT-based email content analysis, email text classification, and lexical feature engineering in a single late-fusion framework, coupled with Software-Defined Networking (SDN) enforcement of the threat response. The methodology employs a fine-tuned DistilBERT model for email body classification, a DistilBERT-based analyzer of email body text, and a logistic regression model over engineered lexical features. The probabilistic scores of the three models are combined through late fusion, and the decision threshold is calibrated with ROC-based optimization against a target false positive rate. The system was evaluated using stratified 4-fold cross-validation on a dataset of 28,748 emails (61% legitimate, 39% phishing). It achieved 0.98 accuracy and 0.98 macro-F1 on held-out data, and phishing recall reached 0.99 with only a small fraction of legitimate emails routed to the monitor tier. These findings indicate that multi-modal fusion with explicit calibration can deliver high recall while containing user impact, and that coupling detection with programmable SDN actions bridges the gap between lab accuracy and dependable, operations-ready defense.
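The late-fusion and threshold-calibration step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the equal fusion weights, the function names `late_fusion` and `threshold_for_target_fpr`, the toy scores, and the target false positive rate of 0.0 are all assumptions made for illustration, since the abstract specifies neither the fusion rule nor the target rate.

```python
from typing import List, Optional, Sequence


def late_fusion(scores_per_model: Sequence[Sequence[float]],
                weights: Optional[Sequence[float]] = None) -> List[float]:
    """Weighted average of per-model phishing probabilities, one value per email.

    Equal weights are an assumption; the abstract does not give the fusion rule.
    """
    n_models = len(scores_per_model)
    if weights is None:
        weights = [1.0 / n_models] * n_models
    total = sum(weights)
    return [
        sum(w * s for w, s in zip(weights, email_scores)) / total
        for email_scores in zip(*scores_per_model)
    ]


def threshold_for_target_fpr(y_true: Sequence[int],
                             fused: Sequence[float],
                             target_fpr: float) -> float:
    """Lowest threshold whose false positive rate stays within target_fpr.

    Decision rule: an email is flagged as phishing when score >= threshold.
    Returns +inf (flag nothing) if no candidate threshold meets the target.
    """
    negatives = [s for y, s in zip(y_true, fused) if y == 0]
    if not negatives:
        raise ValueError("calibration set contains no legitimate emails")
    best = float("inf")
    # Walk thresholds from high to low; FPR can only grow as the threshold drops.
    for t in sorted(set(fused), reverse=True):
        fpr = sum(s >= t for s in negatives) / len(negatives)
        if fpr > target_fpr:
            break
        best = t
    return best


# Toy calibration data (made up): 1 = phishing, 0 = legitimate.
y_true = [1, 0, 1, 0, 1, 0]
body_scores = [0.95, 0.10, 0.80, 0.30, 0.60, 0.05]     # hypothetical DistilBERT body model
text_scores = [0.90, 0.20, 0.70, 0.40, 0.50, 0.10]     # hypothetical DistilBERT text analyzer
lexical_scores = [0.85, 0.30, 0.60, 0.20, 0.70, 0.15]  # hypothetical logistic regression

fused = late_fusion([body_scores, text_scores, lexical_scores])
threshold = threshold_for_target_fpr(y_true, fused, target_fpr=0.0)
predictions = [int(s >= threshold) for s in fused]
```

Choosing the lowest threshold that still satisfies the false positive budget maximizes phishing recall under that budget, which matches the trade-off the abstract reports between high recall and limited impact on legitimate mail.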

Keywords

Software-Defined Networking, Email phishing, DistilBERT, multi-modal, ROC

