A feature enrichment and feature selection approach to improve prediction of drug target interaction and ligand-based virtual screening
Chinese
Ph.D. ; Identifying drug targets is one of the major tasks in drug discovery. Because experimental identification of targets is very challenging, computational methods are needed to identify drug-target interactions efficiently. Traditional computational methods such as docking are based only on chemical structure, which is not available for many targets. Moreover, the underlying assumption of chemical structure-based approaches is not universally true. In this study, a feature enrichment method that integrates bioassay and chemical structure information was developed to predict drug-target interactions. A large benchmark indicated that the integrated fingerprint outperformed the chemical fingerprint. The influence of false positive hits in bioassays, as well as of algorithm-related factors, on the method was also investigated. The results indicate that prediction with the integrated fingerprint is robust to false positive hits, to the choice of classifier, and to different random splits of the datasets. ; In addition, a new machine learning algorithm was proposed for ligand-based virtual screening, where the data are imbalanced and the features describing the chemical characteristics of ligands are high-dimensional. The distinguishing characteristic of the algorithm is that it quantifies the generalization ability of screening directly through a refined loss function, and can thus assess the risk of over-fitting accurately and efficiently for imbalanced, high-dimensional data without the help of resampling methods such as cross-validation. The robustness of the algorithm was demonstrated by a simulation study and by tests on real datasets, in which it outperformed conventional algorithms in terms of screening accuracy and model interpretation.
The suggested algorithm was then used for screening potential activators of HIV-1 integrase multimerization in an independent compound library, and the virtual screening result was experimentally validated. Of the 25 compounds tested, ...