posted on 2019-12-02, 15:25, authored by EPA's Center for Computational Toxicology and Exposure
The U.S.
Environmental Protection Agency’s CompTox Chemicals Dashboard (https://comptox.epa.gov/dashboard)
hosts a plethora of environmentally relevant chemical information, including
physical property data suitable for QSAR/QSPR modeling. The development
of these physical property datasets has generally involved the curation of
publicly available experimental data. The ease of accessing this data,
along with the overall quality of the dataset (e.g., machine-readable
formatting, inclusion of experimental conditions), is highly variable.
The purpose of this work is to identify the challenges associated with
acquiring physical property datasets, with a focus on obtaining water
solubility values for organic compounds. Common issues discovered in this
data will be presented, along with solutions that can be easily implemented in
a high-throughput manner. The end result will be a standard workflow a
researcher can follow when curating physical property datasets. This
abstract does not necessarily represent the views or policies of the U.S. Environmental
Protection Agency.