Date of Award

2019-01-01

Degree Name

Master of Science

Department

Computational Science

Advisor(s)

Suman Sirimulla

Abstract

Predicting interactions between drugs or drug-like compounds and their targets is of high importance in the drug discovery process, as it provides important insights into therapeutic potential and possible adverse effects. Because experimental testing is expensive, laborious, and time-consuming, screening molecules computationally before performing experiments is a more cost-effective, faster, and more convenient approach. In this study, I developed computational models, leveraging machine learning techniques, to predict drug-kinase binding affinities. The predictive model is built mainly using the Random Forest (RF) machine learning method. The study focuses on kinases because of their importance as therapeutic drug targets. The dataset comprises kinase-ligand binding information collected from Drug Target Commons (DTC) and Pharos. The data were split into a training set (75%) and a test set (25%). Model performance was evaluated using several metrics; the best model achieved a correlation coefficient (R) of 0.86, a root mean square error (RMSE) of 0.52, a concordance index (CI) of 0.81, and an area under the receiver operating characteristic curve (AUC) of 0.95 during internal 10-fold cross-validation. An additional blind test was performed on the Synapse IDG-DREAM Challenge, a Drug-Kinase Binding Prediction Challenge, in which the RF model achieved an AUC of 0.68. I demonstrated that the RF model has the potential to predict the binding affinity of ligand-kinase interactions based on structural, physicochemical, and atom-pair-based two-dimensional pharmacophore fingerprints. I also compared the running time and performance of models tuned with grid search and random search. The results indicate no significant difference in model performance between grid search and random search; however, random search reduces the model building time significantly.
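For readers unfamiliar with the regression metrics named in the abstract, the following is a minimal sketch of how RMSE, the Pearson correlation coefficient (R), and the concordance index (CI) are commonly computed. The abstract does not state the exact definitions used in the thesis, so the pairwise CI below (tied predictions count as half) is an assumption, and the affinity values are hypothetical illustration data, not results from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient (R)."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predicted order matches the true order.

    Pairs with equal true values are skipped; pairs with equal predictions
    count as 0.5 (an assumed, commonly used convention).
    """
    concordant = 0.0
    comparable = 0
    n = len(y_true)
    for i in range(n):
        for j in range(i + 1, n):
            if y_true[i] == y_true[j]:
                continue  # not comparable
            comparable += 1
            d_true = y_true[i] - y_true[j]
            d_pred = y_pred[i] - y_pred[j]
            if d_pred == 0:
                concordant += 0.5
            elif (d_true > 0) == (d_pred > 0):
                concordant += 1.0
    return concordant / comparable


# Hypothetical measured and predicted binding affinities.
measured = [5.0, 6.2, 7.1, 8.3]
predicted = [5.4, 6.0, 7.5, 7.9]

print("RMSE:", rmse(measured, predicted))
print("R:", pearson_r(measured, predicted))
print("CI:", concordance_index(measured, predicted))
```

A CI of 1.0 means every comparable pair is ranked correctly, and 0.5 corresponds to random ordering. AUC is omitted here because it requires binarizing affinities at a threshold that the abstract does not specify.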

Language

en

Provenance

Received from ProQuest

File Size

56 pages

File Format

application/pdf

Rights Holder

Govinda KC
