An Evaluation of Feature Selection Methods on Multi-Class Imbalance and High Dimensionality Shape-Based Leaf Image Features

Mohd Shamrie Sainin, Rayner Alfred, Faudziah Ahmad, Mohamed A.M. Lammasha


Multi-class imbalanced shape-based leaf image features require a feature subset that appropriately represents the leaf shape. Multi-class imbalance is a data classification problem in which some classes are highly underrepresented compared to others. It occurs when at least one class, known as the minority class, is represented by only a few training samples compared to the other classes, which make up the majority classes. To address this issue in shape-based leaf image feature extraction, this paper evaluates several feature selection methods available in Weka and a wrapper-based genetic algorithm feature selection method.
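The notion of a minority class described above can be made concrete with a small sketch. The Python snippet below is an illustration only, not the procedure used in the paper: the species labels are made-up data, and the helper names (`class_distribution`, `imbalance_ratio`, `minority_classes`) and the 0.5 cut-off are assumptions introduced here for the example.

```python
from collections import Counter


def class_distribution(labels):
    """Return {class: count}, ordered from most to least frequent class."""
    counts = Counter(labels)
    return dict(sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])))


def imbalance_ratio(labels):
    """Largest class size divided by smallest class size (1.0 means balanced)."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def minority_classes(labels, threshold=0.5):
    """Classes with fewer samples than threshold * mean class size.

    The threshold is a hypothetical cut-off chosen for this illustration;
    the paper does not define a numeric one.
    """
    counts = Counter(labels)
    mean_size = sum(counts.values()) / len(counts)
    return sorted(c for c, n in counts.items() if n < threshold * mean_size)


# Hypothetical leaf species labels (not data from the paper).
labels = (
    ["species_a"] * 40
    + ["species_b"] * 35
    + ["species_c"] * 5
    + ["species_d"] * 3
)

distribution = class_distribution(labels)
ratio = imbalance_ratio(labels)
minorities = minority_classes(labels)
```

With these made-up labels, two of the four classes fall below the cut-off, which is the situation in which a feature subset chosen for overall accuracy may ignore the underrepresented classes.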


Feature Selection; Multiclass Imbalance; High Dimensionality; Leaf







This work is licensed under a Creative Commons Attribution 3.0 License.

ISSN: 2180-1843

eISSN: 2289-8131