Publication date: Dec 20, 2021
Data analyses based on linear methods constitute the simplest, most robust, and transparent approaches to the automatic processing of large amounts of data for building supervised or unsupervised machine learning models. Principal covariates regression (PCovR) is an underappreciated method that interpolates between principal component analysis and linear regression, and can be used to conveniently reveal structure-property relations in terms of simple-to-interpret, low-dimensional maps. Here we introduce a kernelized version of PCovR and a sparsified extension, and demonstrate the performance of this approach in revealing and predicting structure-property relations in chemistry and materials science, showing a variety of examples including elemental carbon, porous silicate frameworks, organic molecules, amino acid conformers, and molecular materials.
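The interpolation between principal component analysis and linear regression mentioned above can be illustrated with a small NumPy sketch of linear PCovR in sample space. This is not the authors' code. It assumes the commonly used formulation in which a modified Gram matrix mixes `X Xᵀ` and `Ŷ Ŷᵀ` with a weight `alpha`, so that `alpha=1` reduces to PCA and `alpha=0` to plain regression. The function name `pcovr_projection`, the `ridge` regularizer, and the Frobenius-norm scaling are illustrative choices.

```python
import numpy as np


def pcovr_projection(X, Y, alpha, n_components=2, ridge=1e-8):
    """Sketch of a linear PCovR latent-space projection (sample-space form).

    X: (n_samples, n_features) array, Y: (n_samples, n_targets) array.
    The scaling and the ridge term are assumptions made for this illustration.
    """
    # Center both blocks, then scale each to unit Frobenius norm so that
    # the two terms of the mixed Gram matrix are comparable in size.
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    X = X / np.linalg.norm(X)
    Y = Y / np.linalg.norm(Y)

    # Ridge-regularized least squares: Y_hat is the linear prediction of Y from X.
    W = np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ Y)
    Y_hat = X @ W

    # Mixed Gram matrix: alpha weights the PCA-like term, (1 - alpha) the regression term.
    K = alpha * (X @ X.T) + (1.0 - alpha) * (Y_hat @ Y_hat.T)

    # Keep the leading eigenpairs; clip tiny negative eigenvalues caused by round-off.
    evals, evecs = np.linalg.eigh(K)
    order = np.argsort(evals)[::-1][:n_components]
    evals = np.clip(evals[order], 0.0, None)
    return evecs[:, order] * np.sqrt(evals)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(30, 5))
    Y = X @ rng.normal(size=(5, 1)) + 0.1 * rng.normal(size=(30, 1))
    for alpha in (0.0, 0.5, 1.0):
        T = pcovr_projection(X, Y, alpha)
        print(alpha, T.shape)
```

At `alpha=1.0` the projection coincides (up to sign) with the PCA scores of the scaled `X`, while intermediate values such as the `alpha=0.5` used for several maps in this record trade off variance retention against property regression.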
File name | MD5 | Size | Description |
---|---|---|---|
datasets.tgz | fd7f42bcd62917a994115b7dac03dbf9 | 102.7 MiB | Gzipped TAR archive containing all the datasets used in XYZ format |
arginine-kpcovr-0.55-chemiscope.json.gz | 4901d18f01498450fddf70d4f1bd0d9e | 1.1 MiB | Map created with KPCovR for the Arginine-Dipeptide dataset at alpha=0.55 using the chemiscope.org visualizer JSON format |
azaphenacenes-kpcovr-0.65-chemiscope.json.gz | 8a5d0f6f04c6c26a7c3ce9b3c0668d80 | 280.9 KiB | Map created with KPCovR for the Azaphenacenes dataset at alpha=0.65 using the chemiscope.org visualizer JSON format |
C-VII-kpcovr-0.0-chemiscope.json.gz | 500809d4a4a62b864c1dd42f2c01732c | 1.6 MiB | Map created with KPCovR for the AIRSS carbon dataset at alpha=0.0 using the chemiscope.org visualizer JSON format |
C-VII-kpcovr-0.5-chemiscope.json.gz | 1af1bd5df08b2cb21aecdd9885829a51 | 1.6 MiB | Map created with KPCovR for the AIRSS carbon dataset at alpha=0.5 using the chemiscope.org visualizer JSON format |
C-VII-kpcovr-1.0-chemiscope.json.gz | 427aa3e3a939fee3a976fa475092e4bb | 1.6 MiB | Map created with KPCovR for the AIRSS carbon dataset at alpha=1.0 using the chemiscope.org visualizer JSON format |
CSD-1000R-kpcovr-0.5-chemiscope.json.gz | 709118a8c4ec0460efda82059b0b57a0 | 1.0 MiB | Map created with KPCovR for the NMR Chemical shielding dataset at alpha=0.5 using the chemiscope.org visualizer JSON format |
DEEM-global-kpcovr-0.5-chemiscope.json.gz | 433087121bd75a693da1c51bdd91a519 | 3.0 MiB | Map created with KPCovR for global properties of DEEM zeolites at alpha=0.5 using the chemiscope.org visualizer JSON format |
DEEM-local-kpcovr-0.5-chemiscope.json.gz | 196238f8be2815f22257fe791eaa2199 | 753.9 KiB | Map created with KPCovR for local properties of DEEM zeolites at alpha=0.5 using the chemiscope.org visualizer JSON format |
qm9-12PC-kpcovr-0.5-chemiscope.json.gz | 9000a600226b8bb361eccf89b88e1613 | 3.3 MiB | Map created with KPCovR for the QM9 dataset at alpha=0.5 using the chemiscope.org visualizer JSON format |
qm9-12PC-kpcovr-1.0-chemiscope.json.gz | 604a61a5e9993b2ba7b59b04a1f6306f | 3.3 MiB | Map created with KPCovR for the QM9 dataset at alpha=1.0 using the chemiscope.org visualizer JSON format |

Version | Publication date | DOI |
---|---|---|
2021.225 (version v2) [This version] | Dec 20, 2021 | 10.24435/materialscloud:9e-3j |
2020.80 (version v1) | Jul 16, 2020 | 10.24435/materialscloud:ay-eq |
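
Since each file in this record is listed with an MD5 checksum and the maps are distributed as gzipped JSON, a downloaded copy can be checked and opened with the Python standard library alone. The following is a minimal sketch: the helper names `md5_of_file` and `load_gzipped_json` are hypothetical, the files are assumed to sit in the current working directory, and nothing is assumed about the internal layout of the chemiscope JSON beyond it being valid JSON.

```python
import gzip
import hashlib
import json
import os


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def load_gzipped_json(path):
    """Decompress a .json.gz file and parse its contents as JSON."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return json.load(handle)


if __name__ == "__main__":
    # File name and checksum as listed for the dataset archive in this record.
    archive = "datasets.tgz"
    expected = "fd7f42bcd62917a994115b7dac03dbf9"
    if os.path.exists(archive):
        status = "OK" if md5_of_file(archive) == expected else "MISMATCH"
        print(f"{archive}: checksum {status}")

    # One of the maps from this record; only its top-level keys are printed.
    map_file = "C-VII-kpcovr-0.5-chemiscope.json.gz"
    if os.path.exists(map_file):
        data = load_gzipped_json(map_file)
        if isinstance(data, dict):
            print(sorted(data.keys()))
```

Reading in chunks keeps memory use flat for the larger archive, and the existence checks let the script run harmlessly when only some of the files have been downloaded.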