Classification of drug molecules considering their IC50 values using mixed-integer linear programming based hyper-boxes method

Armutlu, Pelin; Ozdemir, Muhittin; Yüksektepe, Fadime; Kavakli, I.; Turkay, Metin

doi:10.1186/1471-2105-9-411

Armutlu P., Ozdemir M. E., Yüksektepe F., Kavakli I. H., Turkay M.

BMC BIOINFORMATICS, 2008 (SCI-Expanded, Scopus)

Publication Type: Article / Full Article
Publication Date: 2008
DOI Number: 10.1186/1471-2105-9-411
Journal Name: BMC BIOINFORMATICS
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
Open Archive Collection: AVESİS Open Access Collection
Affiliated with Istanbul Kültür University: No

Abstract

Background: A priori analysis of the activity of drugs on a target protein by computational approaches can help narrow down drug candidates for further experimental tests. A large number of computational methods currently predict the activity of drugs on proteins. In this study, we treat activity prediction as a classification problem and aim to improve classification accuracy by introducing an algorithm that combines partial least squares regression with a mixed-integer programming based hyper-boxes classification method, in which drug molecules are classified as low active or high active according to their binding activity (IC50 values) on target proteins. We also aim to determine the most significant molecular descriptors for the drug molecules.

Results: We first apply our approach to widely known inhibitor datasets with known IC50 values, including Acetylcholinesterase (ACHE), Benzodiazepine Receptor (BZR), Dihydrofolate Reductase (DHFR), and Cyclooxygenase-2 (COX-2). The results at this stage showed that our approach consistently gives better classification accuracies than 63 other reported classification methods, such as SVM and Naive Bayes, and we were able to predict the experimentally determined IC50 values with a worst-case accuracy of 96%. To further test the applicability of this approach, we created a dataset for Cytochrome P450 C17 inhibitors and then predicted their activities with 100% accuracy.

Conclusion: Our results indicate that this approach can be used to predict the inhibitory effects of inhibitors based on their molecular descriptors. This approach will not only enhance the drug discovery process but also save the time and resources committed to it.
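
To make the hyper-box idea in the abstract concrete, the following is a minimal Python sketch and not the authors' algorithm. It assumes a single axis-aligned bounding box per class instead of boxes obtained from a mixed-integer program, it omits the partial least squares descriptor selection step, and it assigns a new molecule to the class whose box is nearest. The function names, the IC50 threshold, and the descriptor values are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence, Tuple

# A box is stored as (lower bounds, upper bounds), one entry per descriptor.
Box = Tuple[List[float], List[float]]


def label_by_ic50(ic50: float, threshold: float) -> str:
    """Map an IC50 value to an activity class.

    A lower IC50 means stronger inhibition, so values at or below the
    (hypothetical) threshold are labelled 'high' active.
    """
    return "high" if ic50 <= threshold else "low"


def fit_boxes(X: Sequence[Sequence[float]], y: Sequence[str]) -> Dict[str, Box]:
    """Build one axis-aligned bounding box per class.

    Simplification: the published method derives its boxes from a
    mixed-integer linear program; here each class simply gets the
    smallest box that encloses all of its training points.
    """
    boxes: Dict[str, Box] = {}
    for x, label in zip(X, y):
        if label not in boxes:
            boxes[label] = (list(x), list(x))
        else:
            lo, hi = boxes[label]
            for j, v in enumerate(x):
                lo[j] = min(lo[j], v)
                hi[j] = max(hi[j], v)
    return boxes


def distance_to_box(x: Sequence[float], box: Box) -> float:
    """Euclidean distance from a point to a box (0.0 if the point is inside)."""
    lo, hi = box
    return sum(max(l - v, 0.0, v - h) ** 2 for v, l, h in zip(x, lo, hi)) ** 0.5


def predict(x: Sequence[float], boxes: Dict[str, Box]) -> str:
    """Assign the class whose box is closest to the point."""
    return min(boxes, key=lambda label: distance_to_box(x, boxes[label]))


# Toy data: two made-up descriptors per molecule and made-up IC50 values.
descriptors = [[0.1, 0.2], [0.3, 0.1], [0.9, 0.8], [0.7, 0.95]]
ic50_values = [5.0, 8.0, 120.0, 300.0]
labels = [label_by_ic50(v, threshold=50.0) for v in ic50_values]
boxes = fit_boxes(descriptors, labels)
```

With these toy values, calling `predict([0.4, 0.3], boxes)` assigns the point to the nearer of the two class boxes. A single bounding box per class cannot separate classes whose descriptor ranges overlap, which is the situation the optimization-based construction of multiple non-overlapping boxes is meant to handle.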