2010 7th National Conference on Electrical, Electronics and Computer Engineering, ELECO 2010, Bursa, Turkey, 2 - 05 December 2010, pp.544-548
Although there have been many studies on computer-aided drug design in recent years, the determination of proteins for drug candidates remains a notable area for research. The first major difficulty in this kind of problem is selecting the features that best represent the protein structure; the second is computational complexity. We use three datasets of different sizes: the Cherkasov dataset with 2684 examples and over 160 descriptors, the SDF-formatted DrugDataBank dataset with 7440 examples and over 300 descriptors, and Pharmeks Company's real drug database with over 250,000 samples. A statistical multiple ReliefF algorithm is developed to measure the quality of the attributes and to reduce the dimension of the dataset. We then apply a new approach that works on subspaces of the dataset, called the incremental decremental kernel learning model. As a result, we found that our new approach has better accuracy and lower computational complexity than traditional supervised algorithms.
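To illustrate the kind of attribute-quality measure that ReliefF builds on, the following is a minimal sketch of the basic Relief weighting scheme. It is not the statistical multiple ReliefF algorithm described above, whose details are not given here. The function name `relief_weights`, the use of one nearest hit and one nearest miss, the Manhattan distance, and the pass over every sample instead of a random subset are all assumptions made for illustration; the sketch also assumes numeric descriptors and at least two samples per class.

```python
import numpy as np

def relief_weights(X, y):
    """Basic Relief feature weights (illustrative sketch, not the paper's method).

    A feature gains weight when it differs between a sample and its nearest
    neighbour of another class (miss), and loses weight when it differs
    between a sample and its nearest neighbour of the same class (hit).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n_samples, n_features = X.shape

    # Scale every feature to [0, 1] so that differences are comparable.
    lo = X.min(axis=0)
    span = X.max(axis=0) - lo
    span[span == 0] = 1.0  # constant features: avoid division by zero
    Xs = (X - lo) / span

    weights = np.zeros(n_features)
    for i in range(n_samples):
        # Manhattan distance from sample i to every sample.
        dist = np.abs(Xs - Xs[i]).sum(axis=1)
        dist[i] = np.inf  # a sample is never its own neighbour
        same = (y == y[i])
        hit = np.argmin(np.where(same, dist, np.inf))
        miss = np.argmin(np.where(~same, dist, np.inf))
        weights += (np.abs(Xs[i] - Xs[miss]) - np.abs(Xs[i] - Xs[hit])) / n_samples
    return weights

# Toy data (hypothetical): descriptor 0 separates the classes, descriptor 1 is noise.
X_toy = [[0, 0.1], [0, 0.9], [1, 0.2], [1, 0.8], [0, 0.5], [1, 0.5]]
y_toy = [0, 0, 1, 1, 0, 1]
w_toy = relief_weights(X_toy, y_toy)
ranking = np.argsort(w_toy)[::-1]  # descriptor indices, most relevant first
```

Descriptors with low or negative weight are candidates for removal, which is how a Relief-style score can be used to reduce the dimension of a dataset before training a classifier.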