Abstract

The high-throughput exploration and screening of molecules for organic electronics involves either a 'top-down' curation and mining of existing repositories, or a 'bottom-up' assembly of user-defined fragments based on known synthetic templates. Both are time-consuming approaches requiring significant resources to compute electronic properties accurately. Here, 'top-down' is combined with 'bottom-up' through automatic assembly and statistical models, thus providing a platform for the fragment-based discovery of organic electronic materials. This study generates a top-down set of 117K synthesized molecules, with their structures, electronic and topological properties, and chemical composition, and uses them as building blocks for bottom-up design. A tool is developed to automate the coupling of these building blocks at their C(sp2/sp)-H bonds, providing a fundamental link between the two dataset construction philosophies. Statistical models are trained on this dataset and a subset of resulting top-down/bottom-up compounds, enabling on-the-fly prediction of ground and excited state properties with high accuracy across organic compound space. With access to ab initio-quality optical properties, this bottom-up pipeline may be applied to any materials design campaign using existing compounds as building blocks. To illustrate this, over a million molecules are screened for singlet fission. The leading candidates provide insight into the features promoting this multiexciton-generating process.

'Top-down' and 'bottom-up' methods are combined to facilitate the fragment-based discovery of organic electronic materials. A dataset of 117K synthesized molecules is curated and used as a building block library. Statistical models are trained on this dataset, enabling accurate prediction of excited state properties. This approach allows for efficient screening of over a million molecular candidates for singlet fission.
