Title | A confidence predictor for logD using conformal regression and a support-vector machine |
---|---|
Published in | Journal of Cheminformatics, April 2018 |
DOI | 10.1186/s13321-018-0271-1 |
PubMed ID | |
Authors | Maris Lapins, Staffan Arvidsson, Samuel Lampa, Arvid Berg, Wesley Schaal, Jonathan Alvarsson, Ola Spjuth |
Abstract | Lipophilicity is a major determinant of ADMET properties and of the overall suitability of drug candidates. We have developed large-scale models to predict the water-octanol distribution coefficient (logD) of chemical compounds, aiding drug discovery projects. Using ACD/logD data for 1.6 million compounds from the ChEMBL database, models are created and evaluated with a support-vector machine with a linear kernel using conformal prediction methodology, outputting prediction intervals at a specified confidence level. The resulting model shows a predictive ability of [Formula: see text], with the best-performing nonconformity measure having a median prediction interval of [Formula: see text] log units at 80% confidence and [Formula: see text] log units at 90% confidence. The model is available as an online service via an OpenAPI interface and a web page with a molecular editor; we also publish predictive values at the 90% confidence level for 91 M PubChem structures in RDF format for download and as a URI resolver service. |
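The abstract describes conformal prediction only in outline, so a minimal sketch of how an inductive conformal regressor turns point predictions into intervals at a chosen confidence level may help. Everything here is an assumption made for illustration: the data are synthetic with a single descriptor rather than ChEMBL logD values, a least-squares line stands in for the paper's linear-kernel support-vector machine, and the nonconformity measure is the plain absolute residual, which is not necessarily the measure the paper reports as best. The function names `fit_line`, `conformal_half_width`, and `predict_interval` are hypothetical.

```python
import math
import random


def fit_line(xs, ys):
    """Least-squares fit y = a*x + b (a stand-in for the paper's linear SVM)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def conformal_half_width(cal_errors, confidence):
    """Interval half-width for the requested confidence level.

    cal_errors are absolute residuals |y - y_hat| on a calibration set
    that was held out from model fitting.
    """
    n = len(cal_errors)
    # Use the calibration score of rank ceil((n + 1) * confidence).
    rank = math.ceil((n + 1) * confidence)
    if rank > n:
        return math.inf  # too few calibration points for this confidence
    return sorted(cal_errors)[rank - 1]


def predict_interval(model, half_width, x):
    """Symmetric interval around the point prediction for descriptor x."""
    a, b = model
    y_hat = a * x + b
    return y_hat - half_width, y_hat + half_width


# Synthetic data (hypothetical; not taken from the paper).
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(600)]
ys = [0.5 * x - 1.0 + rng.gauss(0, 0.4) for x in xs]

# Split into a proper training set, a calibration set, and a test set.
train_x, train_y = xs[:300], ys[:300]
cal_x, cal_y = xs[300:500], ys[300:500]
test_x, test_y = xs[500:], ys[500:]

model = fit_line(train_x, train_y)
cal_errors = [abs(y - (model[0] * x + model[1])) for x, y in zip(cal_x, cal_y)]

half_80 = conformal_half_width(cal_errors, 0.80)
half_90 = conformal_half_width(cal_errors, 0.90)

# Fraction of test points whose true value falls inside the 90% interval.
covered = 0
for x, y in zip(test_x, test_y):
    low, high = predict_interval(model, half_90, x)
    if low <= y <= high:
        covered += 1
coverage_90 = covered / len(test_x)

print(f"80% interval width: {2 * half_80:.2f}")
print(f"90% interval width: {2 * half_90:.2f}")
print(f"empirical coverage at 90%: {coverage_90:.2f}")
```

In this sketch a higher confidence level selects a larger calibration residual, so the 90% interval is never narrower than the 80% one, which mirrors the two interval widths quoted in the abstract.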
X Demographics
Geographical breakdown
Country | Count | As % |
---|---|---|
Sweden | 2 | 13% |
United States | 2 | 13% |
United Kingdom | 2 | 13% |
Germany | 1 | 6% |
Denmark | 1 | 6% |
Japan | 1 | 6% |
Thailand | 1 | 6% |
Unknown | 6 | 38% |
Demographic breakdown
Type | Count | As % |
---|---|---|
Scientists | 8 | 50% |
Members of the public | 7 | 44% |
Science communicators (journalists, bloggers, editors) | 1 | 6% |
Mendeley readers
Geographical breakdown
Country | Count | As % |
---|---|---|
Unknown | 93 | 100% |
Demographic breakdown
Readers by professional status | Count | As % |
---|---|---|
Researcher | 21 | 23% |
Student > Ph.D. Student | 17 | 18% |
Student > Master | 16 | 17% |
Student > Bachelor | 4 | 4% |
Other | 4 | 4% |
Other | 6 | 6% |
Unknown | 25 | 27% |
Readers by discipline | Count | As % |
---|---|---|
Chemistry | 19 | 20% |
Agricultural and Biological Sciences | 7 | 8% |
Pharmacology, Toxicology and Pharmaceutical Science | 6 | 6% |
Biochemistry, Genetics and Molecular Biology | 5 | 5% |
Chemical Engineering | 4 | 4% |
Other | 15 | 16% |
Unknown | 37 | 40% |