- Title
COMPAS-2: a dataset of cata-condensed hetero-polycyclic aromatic systems.
- Authors
Mayo Yanes, Eduardo; Chakraborty, Sabyasachi; Gershoni-Poranne, Renana
- Abstract
Polycyclic aromatic systems are highly important to numerous applications, in particular to organic electronics and optoelectronics. High-throughput screening and generative models that can help to identify new molecules to advance these technologies require large amounts of high-quality data, which is expensive to generate. In this report, we present the largest freely available dataset of geometries and properties of cata-condensed poly(hetero)cyclic aromatic molecules calculated to date. Our dataset contains ~500k molecules comprising 11 types of aromatic and antiaromatic building blocks calculated at the GFN1-xTB level and is representative of a highly diverse chemical space. We detail the structure enumeration process and the methods used to provide various electronic properties (including HOMO-LUMO gap, adiabatic ionization potential, and adiabatic electron affinity). Additionally, we benchmark against a ~50k dataset calculated at the CAM-B3LYP-D3BJ/def2-SVP level and develop a fitting scheme to correct the xTB values to higher accuracy. These new datasets represent the second installment in the COMputational database of Polycyclic Aromatic Systems (COMPAS) Project.
- Subjects
Electron affinity; Organic electronics; Ionization energy; High throughput screening (Drug development); Databases
- Publication
Scientific Data, 2024, Vol 11, Issue 1, p1
- ISSN
2052-4463
- Publication type
Article
- DOI
10.1038/s41597-024-02927-8
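
The abstract mentions a fitting scheme that corrects xTB values toward the higher-level CAM-B3LYP-D3BJ/def2-SVP reference. The record does not give the form of that scheme, so the following is only a minimal sketch that assumes a simple one-variable linear least-squares correction. The function names (`fit_linear`, `apply_correction`) and all numbers are hypothetical placeholders, not values from the COMPAS-2 dataset.

```python
# Illustrative sketch only: assumes a linear correction y = slope * x + intercept.
# The actual fitting scheme used by the authors is not described in this record.

def fit_linear(x, y):
    """Ordinary least-squares fit of y against x; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def apply_correction(values, slope, intercept):
    """Map low-level (xTB-like) values onto the higher-level scale."""
    return [slope * v + intercept for v in values]


# Synthetic placeholder data (NOT taken from the dataset): pretend HOMO-LUMO
# gaps in eV for molecules present in both the small and the large set.
xtb_gap = [1.0, 2.0, 3.0, 4.0]
dft_gap = [2.5, 4.0, 5.5, 7.0]

slope, intercept = fit_linear(xtb_gap, dft_gap)

# Apply the fitted correction to a molecule that only has an xTB value.
corrected = apply_correction([2.5], slope, intercept)
```

In practice such a fit would be trained on the molecules shared by both datasets and then applied to the remainder of the larger xTB-level set. Whether a single linear term is adequate for each property is an open assumption here.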