A Machine Learning Model to Estimate Toxicokinetic Half-Lives of Per- and Polyfluoro-Alkyl Substances (PFAS) in Multiple Species
Daniel E. Dawson,
Christopher Lau,
Prachi Pradeep,
Risa R. Sayre,
Richard S. Judson,
Rogelio Tornero-Velez,
John F. Wambaugh
Affiliations
Daniel E. Dawson
U.S. Environmental Protection Agency, Office of Research and Development, Center for Computational Toxicology and Exposure, 109 T.W. Alexander Drive, Research Triangle Park, NC 27711, USA
Christopher Lau
U.S. Environmental Protection Agency, Office of Research and Development, Center for Public Health and Environmental Assessment, 109 T.W. Alexander Drive, Research Triangle Park, NC 27711, USA
Prachi Pradeep
U.S. Environmental Protection Agency, Office of Research and Development, Center for Computational Toxicology and Exposure, 109 T.W. Alexander Drive, Research Triangle Park, NC 27711, USA
Risa R. Sayre
U.S. Environmental Protection Agency, Office of Research and Development, Center for Computational Toxicology and Exposure, 109 T.W. Alexander Drive, Research Triangle Park, NC 27711, USA
Richard S. Judson
U.S. Environmental Protection Agency, Office of Research and Development, Center for Computational Toxicology and Exposure, 109 T.W. Alexander Drive, Research Triangle Park, NC 27711, USA
Rogelio Tornero-Velez
U.S. Environmental Protection Agency, Office of Research and Development, Center for Computational Toxicology and Exposure, 109 T.W. Alexander Drive, Research Triangle Park, NC 27711, USA
John F. Wambaugh
U.S. Environmental Protection Agency, Office of Research and Development, Center for Computational Toxicology and Exposure, 109 T.W. Alexander Drive, Research Triangle Park, NC 27711, USA
Per- and polyfluoroalkyl substances (PFAS) are a diverse group of man-made chemicals that are commonly found in body tissues. The toxicokinetics of most PFAS are currently uncharacterized, but long half-lives (t½) have been observed in some cases. Knowledge of chemical-specific t½ is necessary for exposure reconstruction and extrapolation from toxicological studies. We used an ensemble machine learning method, random forest, to model the existing in vivo measured t½ across four species (human, monkey, rat, mouse) and eleven PFAS. Mechanistically motivated descriptors were examined, including two types of surrogates for renal transporters: (1) physiological descriptors, including kidney geometry, for renal transporter expression and (2) structural similarity of defluorinated PFAS to endogenous chemicals for transporter affinity. We developed a classification model for t½ (Bin 1: <12 h; Bin 2: <1 week; Bin 3: <2 months; Bin 4: >2 months). The model had an accuracy of 86.1% in contrast to 32.2% for a y-randomized null model. A total of 3890 compounds were within the domain of the model, and t½ was predicted using the bin medians: 4.9 h, 2.2 days, 33 days, and 3.3 years. For human t½, 56% of PFAS were classified in Bin 4, 7% were classified in Bin 3, and 37% were classified in Bin 2. This model synthesizes the limited available data to allow tentative extrapolation and prioritization.
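The step from a predicted bin to a point estimate of t½ can be illustrated with a minimal Python sketch. It is not the authors' implementation: it covers only the binning and the bin-median lookup, not the random forest itself. The bin medians are those reported above; the bin edges (12 h, 1 week, 2 months), the conversion of 2 months to 60 days, the 365.25-day year, and the names `assign_bin` and `predicted_half_life_hours` are assumptions made for illustration.

```python
# Illustrative sketch only: maps a half-life to one of four bins and
# returns the reported bin median as the point estimate.

HOURS_PER_DAY = 24.0
HOURS_PER_YEAR = 365.25 * HOURS_PER_DAY  # assumed year length

# Bin medians reported in the abstract, converted to hours.
BIN_MEDIANS_H = {
    1: 4.9,
    2: 2.2 * HOURS_PER_DAY,
    3: 33.0 * HOURS_PER_DAY,
    4: 3.3 * HOURS_PER_YEAR,
}

# Assumed upper edges (in hours) for Bins 1 to 3; anything longer is Bin 4.
# "2 months" is taken as 60 days here, which is an assumption.
BIN_UPPER_EDGES_H = [
    (1, 12.0),
    (2, 7.0 * HOURS_PER_DAY),
    (3, 60.0 * HOURS_PER_DAY),
]


def assign_bin(t_half_hours: float) -> int:
    """Return the bin (1 to 4) that a measured half-life in hours falls into."""
    if t_half_hours <= 0:
        raise ValueError("half-life must be positive")
    for bin_id, upper_edge in BIN_UPPER_EDGES_H:
        if t_half_hours < upper_edge:
            return bin_id
    return 4


def predicted_half_life_hours(bin_id: int) -> float:
    """Return the bin median (hours) used as the point estimate for a bin."""
    return BIN_MEDIANS_H[bin_id]
```

Under these assumed edges, each reported median falls inside its own bin, so `assign_bin(predicted_half_life_hours(b))` returns `b` for every bin.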