Classification Based on Decision Tree Algorithm for Machine Learning

Abstract

Decision tree classifiers are regarded as one of the most well-known methods for representing classifiers in data classification. Researchers from various fields and backgrounds, such as machine learning, pattern recognition, and statistics, have considered the problem of building a decision tree from available data. Decision tree classifiers have been applied in many ways in fields such as medical disease analysis, text classification, user smartphone classification, images, and many more. This paper provides a detailed review of decision trees. Furthermore, the specifics of each paper reviewed, such as the algorithms or approaches used, the datasets, and the outcomes achieved, are evaluated and outlined comprehensively. In addition, all of the approaches analyzed are discussed to illustrate the themes of the authors and to identify the most accurate classifiers. Finally, the uses of different types of datasets are discussed and their findings are analyzed.

Keywords

Machine Learning, Supervised, Classification, Decision Tree
