
A Comprehensive Review of Dimensionality Reduction Techniques for Feature Selection and Feature Extraction

Journal of Applied Science and Technology Trends

Abstract

As the dimensionality of data rises sharply, data mining and machine learning (ML) tasks require more efficient techniques to achieve the desired results. In recent years, researchers have therefore proposed and developed many methods to reduce high-dimensional data while attaining the required accuracy. Dimensionality reduction (DR) is used as a pre-processing step to improve the accuracy of learned features and to decrease training time; it can eliminate irrelevant data, noise, and redundant features. DR is performed with two main families of methods: feature selection (FS) and feature extraction (FE). FS is considered important because data is generated continuously at an ever-increasing rate; it can mitigate serious dimensionality problems by reducing redundancy, eliminating irrelevant data, and making results easier to comprehend. FE addresses the problem of finding the most distinctive, informative, and compact set of features in order to improve the efficiency of both data processing and data storage. This paper offers a comprehensive review of FS and FE within the scope of DR. The details of each reviewed paper, such as the algorithms or approaches used, the datasets, the classifiers, and the results achieved, are analyzed and summarized. A systematic discussion of all the reviewed methods is also provided to highlight trends among authors, to determine which method(s) significantly reduced computational time, and to identify the most accurate classifiers. Finally, the different types of both methods are discussed and the findings are analyzed.
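To make the distinction between FS and FE concrete, the following is a minimal sketch and not a method taken from any of the reviewed papers. It assumes a toy dataset with three numeric features, uses a simple variance ranking as a stand-in for a filter-style FS method, and uses projection onto the first principal component (found by power iteration) as a stand-in for FE. The function names `select_features` and `first_principal_component` are hypothetical.

```python
import math
import random


def column_means(X):
    """Mean of every column of a row-major list-of-lists matrix."""
    n, d = len(X), len(X[0])
    return [sum(row[j] for row in X) / n for j in range(d)]


def select_features(X, k):
    """Feature selection (illustrative filter): keep the k ORIGINAL columns
    with the highest sample variance. The kept features are unchanged."""
    n, d = len(X), len(X[0])
    means = column_means(X)
    var = [sum((row[j] - means[j]) ** 2 for row in X) / (n - 1) for j in range(d)]
    ranked = sorted(range(d), key=lambda j: var[j], reverse=True)
    keep = sorted(ranked[:k])
    return keep, [[row[j] for j in keep] for row in X]


def first_principal_component(X, iters=200):
    """Feature extraction (illustrative): build ONE NEW feature by projecting
    the centered data onto the leading eigenvector of the covariance matrix.
    Power iteration from a uniform start vector is assumed to converge,
    which holds for this toy data but not for every possible input."""
    n, d = len(X), len(X[0])
    means = column_means(X)
    Xc = [[row[j] - means[j] for j in range(d)] for row in X]
    cov = [[sum(r[a] * r[b] for r in Xc) / (n - 1) for b in range(d)]
           for a in range(d)]
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    scores = [sum(r[j] * v[j] for j in range(d)) for r in Xc]
    return v, scores


# Toy data (an assumption for illustration): feature 1 is roughly twice
# feature 0, and feature 2 is near-constant noise.
random.seed(0)
X = []
for _ in range(200):
    t = random.gauss(0, 1)
    X.append([t, 2 * t + random.gauss(0, 0.1), random.gauss(0, 0.01)])

keep, X_selected = select_features(X, k=2)    # subset of original columns
direction, scores = first_principal_component(X)  # one derived column

print("FS kept columns:", keep)
print("FE direction:", [round(c, 3) for c in direction])
```

In this sketch FS returns a subset of the original columns, so each retained feature keeps its meaning, while FE returns a new feature that mixes all columns. Note that the variance filter keeps both correlated columns; removing that kind of redundancy would need a different criterion, such as a correlation-based one.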

 

Keywords

dimension reduction, dimension reduction techniques, feature selection, feature extraction


