Document Type

Article

Publication Date

2025

DOI

10.1021/acs.joc.5c00343

Publication Title

Journal of Organic Chemistry

Volume

90

Issue

17

Pages

6000-6012

Abstract

A supervised machine learning model has been developed that allows for the prediction of site selectivity in late-stage C-H borylations. Model development was accomplished using literature data for the site-selective (≥95%) C-H borylation of 189 unique arene, heteroarene, and aliphatic substrates that feature a total of 971 possible sp² or sp³ C-H borylation sites. The reported experimental data were supplemented with additional chemoinformatic descriptors, computed atomic charges at the C-H borylation sites, and data from parameterization of catalytically active tris-boryl complexes resulting from the combination of seven different Ir-, Ru-, and Rh-based precatalysts with eight different ligands. Of the more than 1600 parameters investigated, the computed atomic charges (e.g., Hirshfeld, ChelpG, and Mulliken charges) on the hydrogen and carbon atoms at the site of borylation were identified as the most important features for successfully predicting whether a particular C-H bond will undergo a site-selective borylation. The overall accuracy of the developed model was 88.9% ± 2.5%, with precision, recall, and F1 scores of 92-95% for the nonborylating sites and 65-75% for the sites of borylation. The model was demonstrated to be generalizable to molecules outside of the training/test sets with an additional validation set of 12 electronically and structurally diverse systems.
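
The abstract reports overall accuracy alongside separate precision, recall, and F1 scores for nonborylating and borylating sites. The minimal sketch below shows how such per-class scores are conventionally computed for a binary site classifier, and why overall accuracy can exceed the minority-class scores when most C-H sites do not borylate. The function names and the toy labels are hypothetical and are not taken from the article or its GitHub repository.

```python
from typing import Dict, Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of sites whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_scores(
    y_true: Sequence[int], y_pred: Sequence[int], label: int
) -> Dict[str, float]:
    """Precision, recall, and F1 for one class, treating `label` as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical toy data (NOT from the article):
# 1 = site that borylates, 0 = site that does not.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

overall = accuracy(y_true, y_pred)
borylating = per_class_scores(y_true, y_pred, label=1)
nonborylating = per_class_scores(y_true, y_pred, label=0)

print(f"accuracy:       {overall:.3f}")
print(f"borylating:     {borylating}")
print(f"nonborylating:  {nonborylating}")
```

In this toy case the majority (nonborylating) class scores higher than the minority (borylating) class even though the same two sites are misclassified, which is the general pattern expected for imbalanced site-level data.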

Rights

© 2025 The Authors.

This publication is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) License.

Data Availability

Article states: "The data underlying this study are available in the published article, in its Supporting Information, and openly available on GitHub at https://github.com/LambertGroupChemistry/CH-Borylation-Model."

Original Publication Citation

Stephens, S. M., & Lambert, K. M. (2025). The importance of atomic charges for predicting site-selective Ir-, Ru-, and Rh-catalyzed C-H borylations. Journal of Organic Chemistry, 90(17), 6000-6012. https://doi.org/10.1021/acs.joc.5c00343

ORCID

0000-0002-3623-3573 (Stephens), 0000-0002-8230-2840 (Lambert)

Supporting Information

jo5c00343_si_001.pdf (9176 kB)
