source: arxiv machine learning: synergizing physically constrained mcmc and chemical-informed gaussian processes for reaction network discovery

level: research

extracting governing equations from sparse, noisy chemical time-series data is hard because reaction topology and kinetic parameters are tightly linked. a new workflow called pc-mcmc-cigp combines spike-and-slab topology sampling with hard conservation and thermodynamic screening. it also uses a chemical-informed gaussian process residual model for parameter calibration and experimental design. the key contribution is not a new mcmc or gp variant, but the integration of these pieces into a physically constrained workflow with explicit uncertainty-aware acquisition choices.
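the hard conservation screen mentioned above can be illustrated with a small element-balance check. this is a minimal sketch, not the paper's implementation: the species ordering, the `ELEMENTS` matrix, the `conserves_elements` helper, and the candidate list are all assumptions made for the example, and thermodynamic screening is not covered here.

```python
import numpy as np

# Species order (assumed for this example): H2, Br2, H, Br, HBr
SPECIES = ["H2", "Br2", "H", "Br", "HBr"]

# Element-composition matrix: rows are elements (H, Br), columns are species.
ELEMENTS = np.array([
    [2, 0, 1, 0, 1],  # H atoms per species
    [0, 2, 0, 1, 1],  # Br atoms per species
])

def conserves_elements(nu, composition=ELEMENTS):
    """Return True if the net stoichiometric vector nu balances every element."""
    return bool(np.all(composition @ np.asarray(nu) == 0))

# Hypothetical candidate reactions as net stoichiometric vectors
# (reactants negative, products positive).
CANDIDATES = {
    "Br2 -> 2 Br":         [0, -1, 0, 2, 0],
    "Br + H2 -> HBr + H":  [-1, 0, 1, -1, 1],
    "H + Br2 -> HBr + Br": [0, -1, -1, 1, 1],
    "H2 -> HBr":           [-1, 0, 0, 0, 1],  # deliberately unbalanced
}

# Hard screen: a candidate that fails the balance never enters the sampler.
screened = {name: conserves_elements(nu) for name, nu in CANDIDATES.items()}
for name, ok in screened.items():
    print(f"{name:22s} {'keep' if ok else 'reject'}")
```

because the check is a hard filter rather than a soft penalty, an unbalanced candidate gets zero prior mass instead of merely a low one.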

on the H2 + Br2 benchmark, the constrained sampler distinguished elementary radical pathways from misleading phenomenological fits. on styrene epoxidation, the cigp optimization loop improved final yield by 12.5% over a reported gp-bo baseline. the method handles the coupling between discrete reaction topology and continuous kinetic parameters with spike-and-slab priors: each candidate reaction is either switched off (the spike) or included with a free rate parameter (the slab), and the sampler explores the resulting networks while enforcing physical laws.
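to make the spike-and-slab idea concrete, the sketch below runs a metropolis sampler over binary inclusion indicators on a toy problem. it is an illustration under stated assumptions, not the paper's sampler: the observed rate is assumed linear in the candidate-reaction columns of `X`, the noise level `sigma` and slab width `tau` are assumed known, the slab coefficients are integrated out analytically, and the `allowed` mask stands in for the physical screening. all function names and data are hypothetical.

```python
import numpy as np

def log_marginal(y, X, gamma, sigma=0.1, tau=1.0):
    """Log evidence of y for topology gamma, with slab coefficients integrated out."""
    Xg = X[:, gamma]  # columns of the included candidate reactions
    cov = sigma**2 * np.eye(len(y)) + tau**2 * (Xg @ Xg.T)
    _, logdet = np.linalg.slogdet(cov)
    quad = y @ np.linalg.solve(cov, y)
    return -0.5 * (logdet + quad + len(y) * np.log(2.0 * np.pi))

def log_prior(gamma, prior_incl=0.2):
    """Independent Bernoulli prior on each inclusion indicator."""
    k = int(gamma.sum())
    return k * np.log(prior_incl) + (len(gamma) - k) * np.log(1.0 - prior_incl)

def sample_topologies(y, X, allowed, n_iter=3000, burn_in=1000, seed=0):
    """Metropolis sampler that flips one inclusion indicator per step."""
    rng = np.random.default_rng(seed)
    gamma = np.zeros(X.shape[1], dtype=bool)  # start from the empty network
    log_post = log_marginal(y, X, gamma) + log_prior(gamma)
    kept = []
    for it in range(n_iter):
        j = rng.integers(X.shape[1])
        if allowed[j]:  # hard constraint: forbidden reactions are never proposed
            proposal = gamma.copy()
            proposal[j] = not proposal[j]
            log_new = log_marginal(y, X, proposal) + log_prior(proposal)
            if np.log(rng.random()) < log_new - log_post:
                gamma, log_post = proposal, log_new
        if it >= burn_in:
            kept.append(gamma.copy())
    return np.mean(kept, axis=0)  # posterior inclusion frequency per candidate

# Synthetic data (assumption): candidates 0 and 2 are the true reactions.
data_rng = np.random.default_rng(1)
X = data_rng.normal(size=(40, 4))
beta_true = np.array([1.0, 0.0, 0.8, 0.0])
y = X @ beta_true + 0.1 * data_rng.normal(size=40)
allowed = np.array([True, True, True, False])  # candidate 3 fails the screen

inclusion = sample_topologies(y, X, allowed)
print(np.round(inclusion, 2))
```

the returned vector is the fraction of post-burn-in samples in which each candidate reaction was switched on; a masked candidate stays at zero by construction.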

the workflow is reproducible and designed for gray-box modeling, where some physical knowledge is available but the full mechanism is unknown. it uses mcmc to sample over network structures and gaussian processes to model the residuals between the mechanistic prediction and the data, and it uses the resulting uncertainty to choose the next experiments. this helps chemists and engineers discover reaction mechanisms and optimize processes with fewer experiments, which is useful for complex chemical systems where data are limited and noisy.
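the gray-box loop described above (mechanistic prediction plus a gaussian process on the residuals, with an uncertainty-aware choice of the next experiment) can be sketched in a few lines of numpy. everything here is assumed for illustration: the `mechanistic_yield` and `run_experiment` functions are made-up stand-ins, the rbf kernel and its hyperparameters are fixed by hand, and an upper confidence bound is used as the acquisition rule, which may differ from the acquisition used in the paper.

```python
import numpy as np

def rbf(a, b, length=0.2, var=0.01):
    """Squared-exponential kernel between two 1-D arrays of conditions."""
    d = a[:, None] - b[None, :]
    return var * np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(x_train, r_train, x_test, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP on residuals."""
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = rbf(x_train, x_test)
    mean = Ks.T @ np.linalg.solve(K, r_train)
    var = rbf(x_test, x_test).diagonal() - np.sum(Ks * np.linalg.solve(K, Ks), axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def mechanistic_yield(x):
    """Hypothetical gray-box yield prediction for a scaled condition x in [0, 1]."""
    return 0.6 * x * (1.5 - x)

def run_experiment(x):
    """Stand-in for a real measurement: mechanistic part plus unmodeled structure."""
    return mechanistic_yield(x) + 0.1 * np.sin(6.0 * x)

# Three experiments already run (assumed conditions).
x_train = np.array([0.1, 0.5, 0.9])
residuals = run_experiment(x_train) - mechanistic_yield(x_train)

# Predict residuals on a candidate grid and score each point with a UCB rule.
grid = np.linspace(0.0, 1.0, 101)
mean, std = gp_posterior(x_train, residuals, grid)
ucb = mechanistic_yield(grid) + mean + 2.0 * std
x_next = grid[np.argmax(ucb)]
print(f"next condition to test: {x_next:.2f}")
```

the gp only has to learn what the mechanistic model misses, so its prior variance can be kept small; the `2.0 * std` term is what pushes the next experiment toward conditions where the residual model is still uncertain.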

why it matters: this method helps scientists automatically discover chemical reaction mechanisms from limited data, reducing experimental costs and improving process yields.

