proposal for serialisation of models #2787
Comments
Sounds good. Anytree also has a JSON exporter, which would be easier to implement but maybe not as generalizable. What about FMU?
FMU is nice in that it's an existing standard, and I like that. The bit I don't like is the XML description: it's machine readable but not human readable. There is a reason we program in languages and not in XML. Mind you, the "existing standard" bit is very convincing, so I'm happy to be persuaded!
Can FMU do sparse vectors or linear algebra? I can't find this.
I'm still playing around with a human-readable serialisation format similar to the above, but would suggest that for now we just go with @tinosulzer's suggestion of a JSON exporter: you basically write out every node in the expression tree as one large JSON-format tree. It's not really readable, but it will be much easier to write the exporter/importer. I think we should stick a version number in the output, increment it whenever the format needs to change (e.g. a node in the expression tree gets a new field), and make sure we support reading in all prior versions.
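A minimal sketch of the versioned JSON tree idea described above, using only the standard library. The `Node` class, the `FORMAT_VERSION` constant, and the `dumps`/`loads` helpers are hypothetical stand-ins for illustration; they are not PyBaMM's expression tree classes or any agreed format.

```python
import json

FORMAT_VERSION = 1  # hypothetical: bump whenever the on-disk layout changes


class Node:
    """Toy expression-tree node (a stand-in, not a PyBaMM class)."""

    def __init__(self, name, children=(), value=None):
        self.name = name
        self.children = list(children)
        self.value = value


def node_to_dict(node):
    """Write one node, and recursively its children, as plain dicts."""
    out = {"name": node.name, "children": [node_to_dict(c) for c in node.children]}
    if node.value is not None:
        out["value"] = node.value
    return out


def dict_to_node_v1(data):
    """Reader for version 1 of this toy format."""
    children = [dict_to_node_v1(c) for c in data["children"]]
    return Node(data["name"], children, data.get("value"))


# One reader per supported version, so older files stay loadable.
READERS = {1: dict_to_node_v1}


def dumps(root):
    return json.dumps({"version": FORMAT_VERSION, "tree": node_to_dict(root)}, indent=2)


def loads(text):
    doc = json.loads(text)
    version = doc.get("version")
    if version not in READERS:
        raise ValueError(f"unsupported format version: {version!r}")
    return READERS[version](doc["tree"])


# Round trip for the expression a + b * c
expr = Node("+", [Node("a", value=2.0), Node("*", [Node("b", value=3.0), Node("c", value=4.0)])])
text = dumps(expr)
restored = loads(text)
```

Keeping a table of readers keyed by version is one way to honour the "support all prior versions" requirement: a format change adds a new reader and leaves the old ones in place.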
An alternative to JSON is FlatBuffers (https://flatbuffers.dev/). This could simplify transferring PyBaMM models to other languages (e.g. Julia). That said, there are lots of libraries for JSON as well, so it might not simplify things so much as make the actual data transfer a lot faster! UPDATE: Supported languages are: C
There is also this: https://protobuf.dev/
Just wanted to post a quick progress update on this issue:

I first looked at Pydantic, a library which can use type annotations to generate a JSON schema for serialising Python objects. To integrate with Pydantic, PyBaMM would have to be type-hinted throughout and its classes would have to inherit from Pydantic's BaseModel class. Actually, we'd have to make use of this patch, which fixes a Pydantic issue related to property getters/setters, a pattern used frequently in PyBaMM. However, Pydantic's serialisation support doesn't work out of the box for PyBaMM, since most PyBaMM objects are not natively JSON serialisable. That is, Pydantic does not seem able to infer a serialisation from the base types alone, so we'd still have to manually extend the JSONEncoder for each PyBaMM object we wish to serialise.

While experimenting with Pydantic, I added type hints to most expression tree files. I'll include these in a separate pull request even if they end up not being required for serialisation.

Before continuing with Pydantic, I looked for more automated alternatives. JSONpickle is an obvious candidate: the library reads/writes JSON files for pickleable Python objects. The authors demonstrate cross-language support with a deserialisation module that can reconstruct Python objects in JavaScript, but similar code could be written in any language. JSONpickle also supports complex Python objects: "py/id" tags are used to handle multiple references made to the same Python object.

I ran a few tests of JSONpickle. First, I serialised an expression tree (inspired by the PyBaMM expression tree example).
This works great! Next, I tried to serialize a model object which is part of a PyBaMM simulation (inspired by this example):
Unfortunately, this code produces errors. I started debugging by writing a script that recurses through the object and tries dumping and loading each property. This reveals errors with multiple properties in the object structure. I've attached a stack dump I generated: 2023-07-06_13-06-43.txt

I'm going to continue debugging issues here while having a look at Google Protobuf as an alternative cross-platform serialisation method.
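For readers who want to try the same kind of debugging, here is a rough sketch of a recursive round-trip check. It is not the script used above: it assumes the standard-library `pickle` module as a stand-in for JSONpickle, and `find_unserialisable` and the toy `Model` class are hypothetical names introduced only for illustration.

```python
import pickle


def find_unserialisable(obj, path="obj", seen=None):
    """Return the paths of the deepest members that fail a dump/load round trip."""
    seen = set() if seen is None else seen
    if id(obj) in seen:  # guard against reference cycles
        return []
    seen.add(id(obj))

    try:
        pickle.loads(pickle.dumps(obj))
        return []  # this whole subtree round-trips fine
    except Exception:
        pass  # something below this point fails; look at the members

    if isinstance(obj, dict):
        children = [(f"{path}[{k!r}]", v) for k, v in obj.items()]
    elif isinstance(obj, (list, tuple)):
        children = [(f"{path}[{i}]", v) for i, v in enumerate(obj)]
    elif hasattr(obj, "__dict__"):
        children = [(f"{path}.{k}", v) for k, v in vars(obj).items()]
    else:
        children = []

    failures = []
    for child_path, child in children:
        failures.extend(find_unserialisable(child, child_path, seen))
    # If no member explains the failure, blame this object itself.
    return failures or [path]


class Model:
    """Toy object with one member that cannot be pickled (a lambda)."""

    def __init__(self):
        self.name = "toy"
        self.options = {"dimension": 1, "callback": lambda t: t}


problems = find_unserialisable(Model())
print(problems)
```

For the toy `Model`, the only path reported should be the lambda stored under `options["callback"]`, which narrows a failure on the whole object down to the member that causes it.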
Description
Motivation
One of the original goals of PyBaMM was to facilitate the sharing of physics-based battery models. PyBaMM has been very successful in this, allowing users to share PyBaMM models created using the Python language. However, to enable a wider "shareability" of PyBaMM models, I would propose that we need a text serialisation format that PyBaMM models can be converted to and created from. This would enable easier interoperability of PyBaMM with other solvers or tools. For example, if someone was developing a battery model in Matlab they could write it out in this format for later import into PyBaMM. Or if someone was developing a battery parameterisation tool (in any language) they could allow the import of PyBaMM models by writing a reader for our serialisation format.
Possible Implementation
I would propose that we focus on serialisation of pybamm models that are already discretised and ready to be solved, as sharing a continuum model still leaves many questions on how in particular this model should be discretised.
My proposal for a serialisation format is a text-based, human-readable language based on tensors (inspired mainly by the TACO tensor algebra compiler), an example of which is below. This is based on another project I'm working on and I'm happy to iterate on this; I just wanted to put something down to start the conversation!
Additional context
There have been a few proposals for serialisation formats for model parameters (e.g. BPX), but I would argue that the usefulness of these is very much hampered by the lack of a model serialisation format. Having a bunch of parameters means nothing unless you have a description of the model that uses these parameters. E.g. $y = \exp(-at)$ with $a = 1$ is very different to $y = \exp(-at/10)$ with $a = 1$, even though both of those models are very similar (you would describe them both as "exponential decay", just the details are different).
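To make the point concrete, here is a small numerical sketch of the two decay models above evaluated with the same parameter value. The function names `model_a` and `model_b` are made up for this illustration.

```python
import math


def model_a(t, a):
    """y = exp(-a t)"""
    return math.exp(-a * t)


def model_b(t, a):
    """y = exp(-a t / 10)"""
    return math.exp(-a * t / 10)


a = 1.0  # the same "parameter file" for both models
for t in (1.0, 5.0, 10.0):
    print(f"t={t:4.1f}  model_a={model_a(t, a):.4f}  model_b={model_b(t, a):.4f}")
```

The shared value $a = 1$ gives quite different curves, since the second model decays ten times more slowly, so the parameter alone does not determine the prediction.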