The United States Environmental Protection Agency’s Center for Computational Toxicology and Exposure

Development of a Water Solubility Dataset to Establish Best Practices for Curating New Datasets for QSAR Modeling

posted on 2019-12-02, 15:25 authored by EPA's Center for Computational Toxicology and Exposure
The U.S. Environmental Protection Agency’s CompTox Chemicals Dashboard hosts a wide range of environmentally relevant chemical information, including physical property data suitable for QSAR/QSPR modeling. The development of these physical property datasets has generally involved the curation of publicly available experimental data. The ease of accessing these data and the overall quality of the datasets (e.g., machine-readable formatting and inclusion of experimental conditions) are highly variable. The purpose of this work is to identify the challenges associated with acquiring physical property datasets, with a focus on obtaining water solubility values for organic compounds. Common issues discovered in these data will be presented, along with solutions that can be easily implemented in a high-throughput manner. The end result will be a standard workflow that a researcher can follow when curating physical property datasets. This abstract does not necessarily represent the views or policies of the U.S. Environmental Protection Agency.

