pycc.id brings hypothesis-driven equation discovery with identifiability checks

source: arxiv machine learning: pycc.id: a package for hypothesis-driven equation discovery with structural identifiability

level: research

data-driven equation discovery tries to infer differential equations from time-series data, but it often produces many models that fit the data equally well. this happens because the inverse problem is ill-posed: the data alone do not pin down a unique model. a common fix is to add known hypotheses and constraints during training, which narrows the search space. however, even with constraints, multiple candidate models can remain, and experts must manually filter them using domain knowledge.
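to make the "many models fit equally well" problem concrete, here is a minimal sketch that is not taken from the paper. it assumes a toy system (a unit harmonic oscillator observed along a single trajectory) and a hypothetical three-term candidate library. because energy is conserved along that trajectory, two different candidate equations reproduce the acceleration data to numerical precision.

```python
import numpy as np

# Toy data (an assumption for illustration): one trajectory of x'' = -x
# with x(0) = 1, v(0) = 0, so x = cos(t), v = -sin(t), x'' = -cos(t).
t = np.linspace(0.0, 10.0, 500)
x = np.cos(t)
v = -np.sin(t)
acc = -np.cos(t)

# Hypothetical candidate library for the right-hand side of x'':
# columns are x, x^3 and x * v^2.
theta = np.column_stack([x, x**3, x * v**2])

# On this trajectory x^2 + v^2 = 1, so x = x^3 + x * v^2 and the
# three columns are linearly dependent: the library has rank 2, not 3.
rank = np.linalg.matrix_rank(theta)

# Two structurally different models that fit the same data.
model_a = np.array([-1.0, 0.0, 0.0])   # x'' = -x
model_b = np.array([0.0, -1.0, -1.0])  # x'' = -x^3 - x * v^2

residual_a = np.max(np.abs(theta @ model_a - acc))
residual_b = np.max(np.abs(theta @ model_b - acc))

print("library rank:", rank)
print("max residual, model a:", residual_a)
print("max residual, model b:", residual_b)
```

the rank deficiency is what makes the regression ambiguous here. a hypothesis such as "the restoring force is linear in x" would remove the extra columns and leave a single candidate.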

a newer method uses structural skeletons based on characteristic curves to guide the search. practitioners define a skeleton that represents a family of ordinary differential equations and then add their hypotheses. this hypothesis-driven approach reduces ambiguity but still needs a way to check whether the model parameters can be uniquely identified from the data. structural identifiability analysis answers this question: it asks whether the parameters could be recovered uniquely from ideal, noise-free observations. the answer depends only on the model structure, so it can be checked before fitting the model.
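as an illustration of what a structural identifiability check can look like, the sketch below uses a generic textbook-style rank test with sympy. it is not pycc.id's API and not necessarily the method the package uses. the helper names (`lie_derivatives`, `identifiability_rank`), the two toy models, and the evaluation point are all assumptions made for this example. the idea: successive time derivatives of the observed output depend on the parameters, and if the jacobian of those derivatives with respect to the parameters has rank lower than the number of parameters, the parameters cannot all be recovered, even locally. the initial state is treated as known.

```python
import sympy as sp


def lie_derivatives(f, h, states, order):
    """Return [h, dh/dt, d2h/dt2, ...] along the flow dx/dt = f(x)."""
    derivs = [h]
    for _ in range(order):
        grad = sp.Matrix([derivs[-1]]).jacobian(states)  # 1 x n gradient
        derivs.append(sp.expand((grad * f)[0, 0]))
    return derivs


def identifiability_rank(f, h, states, params, point):
    """Rank of d(output derivatives)/d(params) at one numeric point.

    Rank == len(params) suggests local structural identifiability;
    a lower rank means some parameter combination cannot be resolved.
    The point should be generic; a special point could understate the rank.
    """
    derivs = lie_derivatives(f, h, states, order=len(params))
    jac = sp.Matrix(derivs).jacobian(params)
    return jac.subs(point).rank()


a, b, x = sp.symbols("a b x", positive=True)
# Exact rational values keep the rank computation free of rounding error.
point = {a: sp.Rational(2), b: sp.Rational(3), x: sp.Rational(5)}

# Toy model 1: dx/dt = -a*b*x, output y = x. Only the product a*b matters.
rank_product = identifiability_rank(sp.Matrix([-a * b * x]), x, [x], [a, b], point)

# Toy model 2: dx/dt = -a*x + b, output y = x. a and b act differently.
rank_affine = identifiability_rank(sp.Matrix([-a * x + b]), x, [x], [a, b], point)

print("product model rank:", rank_product, "of", 2)
print("affine model rank:", rank_affine, "of", 2)
```

in the first toy model only the product a*b influences the output, so the rank falls short of the parameter count and the model would be flagged as non-identifiable. in the second, both parameters leave distinct signatures in the output derivatives and the rank is full.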

the pycc.id package implements this workflow. it lets users define skeletons, incorporate hypotheses, and automatically test the structural identifiability of the resulting models. filtering out non-identifiable models early means no fitting effort is spent on candidates whose parameters could never be pinned down. the package integrates with existing scientific python tools and is designed for researchers who want to discover interpretable equations from data without getting lost in many plausible but unverifiable candidates.

why it matters: it helps data scientists and researchers avoid wasting time on models whose parameters cannot be uniquely determined, leading to more trustworthy equation discovery.
