Decision Tree for Classification and Regression: A State-of-the-Art Review
Abstract
Classification and regression both fall under the prediction task of data mining. Classification techniques predict discrete values, whereas regression techniques are better suited to predicting continuous values. Analysts in research areas such as data mining, statistics, machine learning, pattern recognition, and big data analytics have preferred decision trees over other classifiers because they are simple, effective, and efficient, and their performance is competitive with that of other methods. In this paper, we extensively review many widely used state-of-the-art decision tree-based techniques for classification and regression. We present a survey of more than forty years of research on the application of decision trees to both tasks. This survey could serve as a resource for researchers interested in applying decision tree classifiers or regressors in their own work.
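The distinction between the two tasks can be illustrated with a minimal sketch, which is not taken from the paper. It fits a one-split tree (a "stump") on a single numeric feature. The choice of Gini impurity for classification and variance for regression, the function names (`gini`, `variance`, `best_split`, `fit_stump`), and the toy data are all assumptions made for illustration only.

```python
from collections import Counter
from statistics import mean


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def variance(values):
    """Population variance of a list of numbers."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def best_split(xs, ys, impurity):
    """Return the threshold on xs that minimizes the weighted impurity of ys."""
    pairs = sorted(zip(xs, ys), key=lambda p: p[0])
    best_t, best_score = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between equal feature values
        t = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for x, y in pairs if x <= t]
        right = [y for x, y in pairs if x > t]
        score = (len(left) * impurity(left) + len(right) * impurity(right)) / len(pairs)
        if score < best_score:
            best_t, best_score = t, score
    return best_t


def fit_stump(xs, ys, task):
    """Fit a one-split tree; task is 'classification' or 'regression'."""
    if task == "classification":
        impurity = gini
        leaf = lambda vals: Counter(vals).most_common(1)[0][0]  # majority class
    else:
        impurity = variance
        leaf = mean  # average target value
    t = best_split(xs, ys, impurity)
    left = leaf([y for x, y in zip(xs, ys) if x <= t])
    right = leaf([y for x, y in zip(xs, ys) if x > t])
    return lambda x: left if x <= t else right


# Hypothetical toy data: one feature, a discrete label, and a continuous target.
xs = [1, 2, 3, 10, 11, 12]
labels = ["low", "low", "low", "high", "high", "high"]
targets = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]

clf = fit_stump(xs, labels, "classification")  # leaves hold class labels
reg = fit_stump(xs, targets, "regression")     # leaves hold numeric means
print(clf(2), clf(11))
print(reg(2), reg(11))
```

The split search is the same for both tasks. Only the impurity measure and the value stored in each leaf differ: a majority class for classification, a mean for regression.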
DOI: https://doi.org/10.31449/inf.v44i4.3023
This work is licensed under a Creative Commons Attribution 3.0 License.