Browsing by Author "Liu, Ruifeng"
- Assessing deep and shallow learning methods for quantitative prediction of acute chemical toxicity. (0000-00-00) Liu, Ruifeng; Madore, Michael; Glover, Kyle P; Feasel, Michael G; Wallqvist, Anders
  Animal-based methods for assessing chemical toxicity are struggling to meet testing demands. In silico approaches, including machine learning methods, are promising alternatives. Recently, deep neural networks (DNNs) were evaluated and reported to outperform other machine learning methods for quantitative structure-activity relationship modeling of molecular properties. However, most of the reported performance evaluations relied on global performance metrics, such as the root mean squared error (RMSE) between the predicted and experimental values of all samples, without considering the impact of sample distribution across the activity spectrum. Here, we carried out an in-depth analysis of DNN performance for quantitative prediction of acute chemical toxicity using several datasets. We found that the overall performance of DNN models on datasets of up to 30,000 compounds was similar to that of random forest (RF) models, as measured by the RMSE and correlation coefficients between the predicted and experimental results. However, our detailed analyses demonstrated that global performance metrics are inappropriate for datasets with a highly uneven sample distribution, because they show a strong bias for the most populous compounds along the toxicity spectrum. For highly toxic compounds, DNN and RF models trained on all samples performed much worse than the global performance metrics indicated. Surprisingly, our variable nearest neighbor method, which utilizes only structurally similar compounds to make predictions, performed reasonably well, suggesting that information on close near neighbors in the training sets is a key determinant of acute toxicity predictions.
- Exploiting large-scale drug-protein interaction information for computational drug repurposing. (2014-07-03) Liu, Ruifeng; Singh, Narender; Tawa, Gregory J; Wallqvist, Anders; Reifman, Jaques
  Despite increased investment in pharmaceutical research and development, fewer and fewer new drugs are entering the marketplace. This has prompted studies in repurposing existing drugs for use against diseases with unmet medical needs. A popular approach is to develop a classification model based on drugs with and without a desired therapeutic effect. For this approach to be statistically sound, it requires a large number of drugs in both classes. However, given few or no approved drugs for the diseases of highest medical urgency and interest, different strategies need to be investigated.
- Merging applicability domains for in silico assessment of chemical mutagenicity. (2014-03-24) Liu, Ruifeng; Wallqvist, Anders
  Using a benchmark Ames mutagenicity data set, we evaluated the performance of molecular fingerprints as descriptors for developing quantitative structure-activity relationship (QSAR) models and defining applicability domains with two machine learning methods: random forest (RF) and variable nearest neighbor (v-NN). The two methods focus on complementary aspects of chemical mutagenicity and use different characteristics of the molecular fingerprints to achieve high levels of prediction accuracy. Thus, while RF flags mutagenic compounds using the presence or absence of small molecular fragments akin to structural alerts, the v-NN method uses molecular structural similarity as measured by fingerprint-based Tanimoto distances between molecules. We showed that the extended connectivity fingerprints could intuitively be used to define and quantify an applicability domain for either method. The importance of using applicability domains in QSAR modeling cannot be overstated: compounds that are outside the applicability domain do not have any close representative in the training set, and therefore we cannot make reliable predictions. Using either approach, we developed highly robust models that rival the performance of a state-of-the-art proprietary software package. Importantly, based on the complementary approach used by the methods, we showed that by combining the model predictions we raised the applicability domain from roughly 80% to 90%. These results indicated that the proposed QSAR protocol constituted a highly robust chemical mutagenicity prediction model.
- Using the Variable-Nearest Neighbor Method To Identify P-Glycoprotein Substrates and Inhibitors. (0000-00-00) Schyman, Patric; Liu, Ruifeng; Wallqvist, Anders
  Permeability glycoprotein (Pgp) is an essential membrane-bound transporter that efficiently extracts compounds from a cell. As such, it is a critical determinant of the pharmacokinetic properties of drugs. Multidrug resistance in cancer is often associated with overexpression of Pgp, which increases the efflux of chemotherapeutic agents from the cell. This, in turn, may prevent an effective treatment by reducing the effective intracellular concentrations of such agents. Consequently, identifying compounds that can either be transported out of the cell by Pgp (substrates) or impair Pgp function (inhibitors) is of great interest. Herein, using publicly available data, we developed quantitative structure-activity relationship (QSAR) models of Pgp substrates and inhibitors. These models employed a variable nearest neighbor (v-NN) method that calculated the structural similarity between molecules and hence possessed an applicability domain; that is, they used all nearest neighbors that met a minimum similarity constraint. The performance characteristics of these v-NN-based models were comparable or at times superior to those of other model constructs. The best v-NN models for identifying either Pgp substrates or inhibitors showed overall accuracies of 80% and values of 0.60 when tested on external data sets with candidate Pgp substrates and inhibitors. The v-NN prediction model with a well-defined applicability domain gave accurate and reliable results. The v-NN method is computationally efficient and requires no retraining of the prediction model when new assay information becomes available, an important feature when keeping QSAR models up to date and maintaining their performance at high levels.
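
Several of the abstracts above rely on the variable nearest neighbor (v-NN) idea: predict from every training compound within a minimum structural similarity of the query, and decline to predict when no such compound exists. The following is a minimal sketch of that idea, not the authors' implementation. It assumes fingerprints are represented as Python sets of "on" bits. The Gaussian distance weighting, the parameter names `d0` and `h`, and their default values are illustrative assumptions that the abstracts do not specify.

```python
import math


def tanimoto_distance(fp_a, fp_b):
    """Tanimoto distance (1 - similarity) between two fingerprints.

    Fingerprints are assumed to be iterables of 'on' bit indices.
    """
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    if union == 0:
        # Assumption: two empty fingerprints are treated as identical.
        return 0.0
    return 1.0 - len(a & b) / union


def vnn_predict(query_fp, training_set, d0=0.5, h=0.3):
    """Sketch of a variable nearest neighbor prediction.

    training_set: list of (fingerprint, activity) pairs.
    d0: hypothetical distance threshold defining the applicability domain.
    h:  hypothetical smoothing factor for the (assumed) Gaussian weighting.

    Returns the weighted mean activity of all neighbors within d0,
    or None if the query has no qualifying neighbor (outside the domain).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    n_neighbors = 0
    for fp, activity in training_set:
        d = tanimoto_distance(query_fp, fp)
        if d <= d0:
            w = math.exp(-((d / h) ** 2))  # closer neighbors count more
            weighted_sum += w * activity
            weight_total += w
            n_neighbors += 1
    if n_neighbors == 0 or weight_total == 0.0:
        return None
    return weighted_sum / weight_total


# Toy data: activity 1.0 = active, 0.0 = inactive (made up for illustration).
training = [
    ({1, 2, 3, 4}, 1.0),
    ({1, 2, 3, 5}, 0.0),
    ({10, 11}, 1.0),
]
inside = vnn_predict({1, 2, 3, 4}, training)   # two neighbors within d0
outside = vnn_predict({20, 21}, training)      # no neighbors: None
```

Because the prediction is computed directly from the stored compounds, adding new assay results only means appending to `training_set`. This matches the abstract's remark that the method requires no retraining.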