A novel approach to cutting decision trees


Uney-Yuksektepe F.

CENTRAL EUROPEAN JOURNAL OF OPERATIONS RESEARCH, vol.22, no.3, pp.553-565, 2014 (SCI-Expanded)

Abstract

In data mining, binary classification has a wide range of applications. Cutting Decision Tree (CDT) induction is an efficient mathematical programming based method that tries to discretize the data set at hand by using multiple separating hyperplanes. This study proposes an improvement to the CDT model by incorporating a second goal: maximizing the distance of the correctly classified instances to the misclassification region. Computational results show that the developed model achieves better classification accuracy on the Wisconsin Breast Cancer database and the Japanese Banks data set than existing piecewise-linear models in the literature. Furthermore, remarkable results are obtained on well-known benchmark data sets (BUPA Liver Disorders, Blood Transfusion and Pima Indian Diabetes) when compared to the original CDT model.
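
The idea of a second goal can be illustrated with a small sketch. This is not the mathematical programming model from the paper, which the abstract does not specify. It assumes a single candidate hyperplane `w.x = b`, labels of +1 and -1, and Euclidean distance to the hyperplane as a stand-in for the distance to the misclassification region. The function names `signed_distance` and `evaluate_cut` are hypothetical.

```python
import math

def signed_distance(w, b, x):
    """Signed Euclidean distance from point x to the hyperplane w.x = b."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) - b) / norm

def evaluate_cut(w, b, points, labels):
    """Return (number of misclassified points, smallest distance among
    correctly classified points). Labels are assumed to be +1 or -1."""
    errors = 0
    min_margin = math.inf
    for x, y in zip(points, labels):
        margin = y * signed_distance(w, b, x)
        if margin <= 0:
            errors += 1          # point lies on the wrong side (or on the cut)
        else:
            min_margin = min(min_margin, margin)
    return errors, min_margin

# Toy data: two classes on a line, both candidate cuts separate them perfectly.
points = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
labels = [-1, -1, 1, 1]

cut_a = evaluate_cut((1.0, 0.0), 2.0, points, labels)  # hyperplane x1 = 2
cut_b = evaluate_cut((1.0, 0.0), 1.5, points, labels)  # hyperplane x1 = 1.5

# Primary goal: fewer errors. Secondary goal: larger minimum distance.
best = min([("a", cut_a), ("b", cut_b)], key=lambda c: (c[1][0], -c[1][1]))
print(best[0])
```

Both cuts misclassify nothing, so the first goal alone cannot choose between them. The secondary criterion prefers the cut that keeps correctly classified points farther from the boundary.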