-
Notifications
You must be signed in to change notification settings - Fork 55
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore: Bump Pydantic to v2 #49
Conversation
lmmx
commented
Jan 20, 2024
- chore: upgrade Pydantic pin to 2.5.3 and fastapi to 0.109.0
- fix: comment out pre-v2 Pydantic patch calls
Adding the model validators works well but when the field validators are sent as well, an infinite recursion bug appears. In commit 853a93d I commented out the body of the Click to show pytest failure============================= test session starts ==============================
platform linux -- Python 3.11.7, pytest-7.4.4, pluggy-1.3.0
rootdir: /home/louis/dev/fal/pydantic-2-bump/projects/fal/tests
plugins: asyncio-0.23.3, anyio-4.2.0
asyncio: mode=Mode.STRICT
collected 1 item
test_pydantic.py F [100%]
=================================== FAILURES ===================================
_______________________ test_deserialise_pydantic_model ________________________
def test_deserialise_pydantic_model():
"""Test deserialisation of a Pydantic model succeeds.
The deserialisation failure mode reproduction is incompatible with pytest (see
[#29](https://github.com/fal-ai/fal/issues/29#issuecomment-1902241217) for
discussion) so we directly invoke the current Python executable on this file.
"""
subprocess_args = [sys.executable, __file__, "--run-deserialisation-test"]
proc = subprocess.run(subprocess_args, capture_output=True, text=True)
model_fields_ok = "model-field-missing-annotation" not in proc.stderr
assert model_fields_ok, "Deserialisation failed (`model_fields` not deserialised)"
# The return code should be zero or else the deserialisation failed
deserialisation_ok = proc.returncode == 0
> assert deserialisation_ok, f"Pydantic model deserialisation failed:\n{proc.stderr}"
E AssertionError: Pydantic model deserialisation failed:
E /home/louis/miniconda3/envs/fal/lib/python3.11/site-packages/pydantic/main.py:1406: RuntimeWarning: fields may not start with an underscore, ignoring "__annotations__"
E warnings.warn(f'fields may not start with an underscore, ignoring "{f_name}"', RuntimeWarning)
E Traceback (most recent call last):
E File "/home/louis/dev/fal/pydantic-2-bump/projects/fal/tests/test_pydantic.py", line 192, in <module>
E validate_deserialisation(model)
E File "/home/louis/dev/fal/pydantic-2-bump/projects/fal/tests/test_pydantic.py", line 169, in validate_deserialisation
E assert model.epochs == 30, "The `validate_epochs` field validator didn't run"
E ^^^^^^^^^^^^^^^^^^
E AssertionError: The `validate_epochs` field validator didn't run
E
E assert False
test_pydantic.py:187: AssertionError
=========================== short test summary info ============================
FAILED test_pydantic.py::test_deserialise_pydantic_model - AssertionError: Py...
============================== 1 failed in 0.15s =============================== We can reproduce it directly: $ python test_pydantic.py --run-deserialisation-test ===== DESERIALIZING =====
/home/louis/miniconda3/envs/fal/lib/python3.11/site-packages/pydantic/main.py:1406: RuntimeWarning: fields may not start with an underscore, ignoring "__annotations__"
warnings.warn(f'fields may not start with an underscore, ignoring "{f_name}"', RuntimeWarning)
===== INSTANTIATING =====
Traceback (most recent call last):
File "/home/louis/dev/fal/pydantic-2-bump/projects/fal/tests/test_pydantic.py", line 192, in <module>
validate_deserialisation(model)
File "/home/louis/dev/fal/pydantic-2-bump/projects/fal/tests/test_pydantic.py", line 169, in validate_deserialisation
assert model.epochs == 30, "The `validate_epochs` field validator didn't run"
^^^^^^^^^^^^^^^^^^
AssertionError: The `validate_epochs` field validator didn't run
In commit 3c71e6c I got the field validator in question to deserialise, but it still doesn't trigger upon instantiation $ python test_pydantic.py --run-deserialisation-test /home/louis/miniconda3/envs/fal/lib/python3.11/site-packages/pydantic/main.py:1406: RuntimeWarning: fields may not start with an underscore, ignoring "__annotations__"
warnings.warn(f'fields may not start with an underscore, ignoring "{f_name}"', RuntimeWarning)
===== DESERIALIZING =====
===== INSTANTIATING =====
Traceback (most recent call last):
File "/home/louis/dev/fal/pydantic-2-bump/projects/fal/tests/test_pydantic.py", line 219, in <module>
validate_deserialisation(model)
File "/home/louis/dev/fal/pydantic-2-bump/projects/fal/tests/test_pydantic.py", line 196, in validate_deserialisation
assert model.epochs == 30, "The `validate_epochs` field validator didn't run"
^^^^^^^^^^^^^^^^^^
AssertionError: The `validate_epochs` field validator didn't run Inspecting the '__pydantic_decorators__': DecoratorInfos(validators={},
field_validators={},
root_validators={},
field_serializers={},
model_serializers={},
model_validators={'increment': Decorator(cls_ref='__main__.Input:39874048',
cls_var_name='increment',
func=<function Input.increment at 0x7fc9be6c1580>,
shim=None,
info=ModelValidatorDecoratorInfo(mode='after'))}, whereas running the module interactively and inspecting the pprint-out of the '__pydantic_decorators__': DecoratorInfos(validators={},
field_validators={'triple_epochs': Decorator(cls_ref='__main__.Input:44759808',
cls_var_name='triple_epochs',
func=<bound method Input.triple_epochs of <class '__main__.Input'>>,
shim=None,
info=FieldValidatorDecoratorInfo(fields=('epochs',),
mode='after',
check_fields=None))},
root_validators={},
field_serializers={},
model_serializers={},
model_validators={'increment': Decorator(cls_ref='__main__.Input:44759808',
cls_var_name='increment',
func=<function Input.increment at 0x7f805f2722a0>,
shim=None,
info=ModelValidatorDecoratorInfo(mode='after'))},
computed_fields={}), In 3c44f35 I added a print out of the field validators before and after:
My sense is that I shouldn't have used the But without doing that there was a recursion bug as soon as you added a field validator 😕 Doesn't really make sense why you should access functions for model validators differently to field validators though. (There is no existing code I can find online that supplies field validators to |
After discussing the issues with field validators not running (3c71e6c) we decided to prioritise just reproducing behaviour already tested for (and a full solution to the outlined issues can come later). The new definition of done is therefore for This test raises completely different errors to the ones we wrote so far! 🙃 However, I suspect it ultimately indicates the same thing (a failure to deserialise Pydantic models), so the different error does not mean we have another bug to tackle: it's just another symptom of the same underlying problem. Test Apps error introThe test runs fine on v1 (click to show)(fal_pyd1) louis 🌟 ~/dev/fal/fal/projects/fal/tests $ python -m pytest test_apps.py
========================================== test session starts ===========================================
platform linux -- Python 3.11.7, pytest-8.0.0, pluggy-1.4.0
rootdir: /home/louis/dev/fal/fal/projects/fal/tests
plugins: anyio-4.2.0
collected 7 items
test_apps.py ....... [100%]
========================================= 7 passed in 73.91s (0:01:13) ========================================= but on this feature branch (in an environment with its Pydantic v2 dependency) we hit the following error
Which is downstream of
This happens for all 7 tests (click to show)======================================== short test summary info =========================================
ERROR test_apps.py::test_app_client - isolate.connections.common.SerializationError: Error while serializing the given object
ERROR test_apps.py::test_stateful_app_client - isolate.connections.common.SerializationError: Error while serializing the given object
ERROR test_apps.py::test_app_client_async - isolate.connections.common.SerializationError: Error while serializing the given object
ERROR test_apps.py::test_app_openapi_spec_metadata - isolate.connections.common.SerializationError: Error while serializing the given object
ERROR test_apps.py::test_app_no_serve_spec_metadata - isolate.connections.common.SerializationError: Error while serializing the given object
ERROR test_apps.py::test_app_update_app - isolate.connections.common.SerializationError: Error while serializing the given object
ERROR test_apps.py::test_app_set_delete_alias - isolate.connections.common.SerializationError: Error while serializing the given object
=========================================== 7 errors in 4.92s ============================================ Full error record (click to show)
Solving the
|
I'm going to try copying over the same dill-registered Also note that we can't set the |
… funcs can be reused in `test_apps.py`)
…apps_simple.py` (temporary)
Running the tests now does not crash (which is progress!) but also does not succeed. The app test fails with
When I go to that URL I get an auth error which may be misleading (I don't see any reason why the URL should be wrong, but when opening it in the browser the auth headers would not be passed so this may be perfectly normal to see). I want to get this fixed python -m pytest --assert=plain test_apps.py -x But it's simpler to test (and read) a test file with a single script, rather than select it with On Pydantic v1 it works fine (fal_pyd1) louis 🌟 ~/dev/fal/pydantic-2-bump/projects/fal/tests $ python -m pytest -vvv --assert=plain test_apps_simple.py -x -s --log-level=DEBUG
============================= test session starts ==============================
platform linux -- Python 3.11.7, pytest-8.0.0, pluggy-1.4.0 -- /home/louis/miniconda3/envs/fal_pyd1/bin/python3
cachedir: .pytest_cache
rootdir: /home/louis/dev/fal/pydantic-2-bump/projects/fal/tests
plugins: anyio-4.2.0
collecting ... collected 1 item
test_apps_simple.py::test_app_client PASSED
============================== 1 passed in 16.06s ============================== On Pydantic v2 it just hangs indefinitely (until it's "hung up on" by the timeout). No doubt this is too much, but I dumped the entire log
(I re-auth'd just in case there were any private credentials included here) You only get the error logs if the test fails, so I threw a trivial error and re-ran the working test (Pydantic v1) to compare the results more clearly. |
Hey @lmmx, I went into our logs and the requests I see failing with a 503 internally are just
OK, I checked with the team and actually this happens for the gateway submit flow, which is expected (we could not provision a machine). So you would have to check the sdk call logs (if it was |
…eudo-package in a temp dir that is prepended to PATH
Patch applied within a minimal test ( @isidentical noted that the 'hanging' API calls could be diagnosed as actual bugs using (P.S. thanks for the reply @chamini2 I missed it 😅 Yes PYTHONPATH=. fal --debug fn run test_apps_simple.py addition_app
This identified a bug: you can't always rely on Batuhan amended this File "/root/.cache/isolate/virtualenv/741ff6a01b8571a0cd4c10769658d96db5c871c77f13a7cb173766e3286738ab/lib/python3.11/site-packages/dill/_dill.py", line 442, in load
obj = StockUnpickler.load(self)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/tmp/tmp6t5zsbtc/pydantic_patch.py", line 78, in build_pydantic_model
model_fields["__annotations__"] = {
^
File "/tmp/tmp6t5zsbtc/pydantic_patch.py", line 79, in <dictcomp>
name: field.annotation for name, field in model_fields.items()
^^^^^^^^^^^^^^^^
AttributeError: 'dict' object has no attribute 'annotation' There is a shortcut to running these deployed functions: https://fal.run (which Batuhan sent a POST request to via
Thus Pydantic 2 is now up and running. 🚀 The next step is to drop the |
In v1 all a Pydantic model def __eq__(self, other: Any) -> bool:
if isinstance(other, BaseModel):
return self.dict() == other.dict() In v2 it became more complex and effectively based on I suspect it'd be acceptable to relax this test thought it may be wise to use it to guide further patching: fal/projects/fal/tests/stability_test.py Line 428 in 8ac53ff
It suggests we need extra logic to overwrite the type info, which is checked in The full test here can be debugged with: def test_pydantic_serialization(isolated_client):
from fal.toolkit import mainify
from pydantic import BaseModel, Field
@mainify
class MathQuery(BaseModel):
x: int = Field(gt=0, description="The first operand")
y: int = Field(gt=0, description="The second operand")
@mainify
class MathResult(BaseModel):
result: int = Field(description="The result of the operation")
@isolated_client("virtualenv", requirements=["pydantic>=2"])
def add(query: MathQuery) -> MathResult:
return MathResult(result=query.x + query.y)
result = add(MathQuery(x=1, y=2))
assert result.result == 3
breakpoint()
is_identical = result == MathResult(result=3)
assert is_identical
-> self_type = self.__pydantic_generic_metadata__['origin'] or self.__class__
(Pdb) n
> /home/louis/miniconda3/envs/fal39/lib/python3.9/site-packages/pydantic/main.py(869)__eq__()
-> other_type = other.__pydantic_generic_metadata__['origin'] or other.__class__
(Pdb) p self_type
<class 'tests.stability_test.MathResult'>
(Pdb) n
> /home/louis/miniconda3/envs/fal39/lib/python3.9/site-packages/pydantic/main.py(872)__eq__()
-> self_type == other_type
(Pdb) p other_type
<class 'tests.stability_test.test_pydantic_serialization.<locals>.MathResult'>
(Pdb) p self_type == other_type
False The internal equality check involves a type comparison def __eq__(self, other: Any) -> bool:
if isinstance(other, BaseModel):
# When comparing instances of generic types for equality, as long as all field values are equal,
# only require their generic origin types to be equal, rather than exact type equality.
# This prevents headaches like MyGeneric(x=1) != MyGeneric[Any](x=1).
self_type = self.__pydantic_generic_metadata__['origin'] or self.__class__
other_type = other.__pydantic_generic_metadata__['origin'] or other.__class__
return (
self_type == other_type
and self.__dict__ == other.__dict__
and self.__pydantic_private__ == other.__pydantic_private__
and self.__pydantic_extra__ == other.__pydantic_extra__
)
else:
return NotImplemented # delegate to the other item in the comparison |
It looks like def test_fal_storage(isolated_client):
> file = File.from_bytes(b"Hello fal storage from local", repository="fal")
stability_test.py:491:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../src/fal/toolkit/file/file/standard.py:95: in from_bytes
return cls(
../src/fal/toolkit/file/file/standard.py:47: in __init__
self._file_data = data
/home/louis/miniconda3/envs/fal39/lib/python3.9/site-packages/pydantic/main.py:770: in __setattr__
if self.__pydantic_private__ is None or name not in self.__private_attributes__: doesn't work. It seems that the private attribute isn't being set (which should get passed through as part of our post-fixes, by definition Reading the docs it seems like the proper way to do this in v2 is after the call to This test I think this 401 unauthorized is due to env vars not being set outside of the CI environment... so I think this one is actually resolved? There's also an env var test that fails... Unless I missed something to do with how they are set which is done locally. I'm tentatively marking all 4 remaining errors in |
After bumping some more versions, the remaining issues are all to do with
and others which I presume are downstream of those AttributeErrors
Confirmed: we have a serialisation omission
Possibly need to make a standalone test for The expected >>> pprint(list(vars(fal.toolkit.file.File)))
['__module__',
'__annotations__',
'__init__',
'__get_validators__',
'_File__convert_from_str',
'_from_url',
'from_bytes',
'from_path',
'as_bytes',
'save',
'model_config',
'model_post_init',
...
] The rest are dunders:
As a very simple first attempt we can put this Adding the def test_fal_file_methods(isolated_client):
@isolated_client(requirements=["pydantic==2.5.3"])
def fal_file_methods_set():
file_cls_methods = set(vars(File))
return file_cls_methods
file_cls_methods = fal_file_methods_set()
expected = {"_from_url", "from_bytes", "from_path", "as_bytes", "save"}
> assert expected.difference(file_cls_methods) == set()
E AssertionError: assert {'_from_url', 'as_bytes', 'from_path', 'from_bytes', 'save'} == set()
E Extra items in the left set:
E 'from_path'
E 'from_bytes'
E 'save'
E '_from_url'
E 'as_bytes'
E Full diff:
E - set()
E + {'_from_url', 'as_bytes', 'from_path', 'from_bytes', 'save'} |
I went back to an earlier draft and there was a patch that (allegedly) could be run standalone, and I could get a classmethod to run with that Click to show standalone reprofrom __future__ import annotations
import dataclasses
import dill
import dill._dill as dill_serialization
from pydantic import BaseModel, Field, model_validator
def build_pydantic_model(
name, base_cls, model_config, model_fields, validators, class_fields
):
"""Recreate the Pydantic model from the deserialised validator info."""
import pydantic
model_cls = pydantic.create_model(
name,
__base__=base_cls,
__validators__={
validator_name: pydantic.model_validator(mode=validator_info.mode)(
validator_func
)
for validator_name, (validator_func, validator_info) in validators.items()
},
**model_fields,
**class_fields,
)
print(f"{model_fields=}")
print(f"{class_fields=}")
return model_cls
@dill.register(type(BaseModel))
def _dill_hook_for_pydantic_models(
pickler: dill.Pickler,
pydantic_model,
) -> None:
if pydantic_model is BaseModel:
dill_serialization.save_type(pickler, pydantic_model)
return
validators = {}
decorators = pydantic_model.__pydantic_decorators__
for validator_name, decorator in decorators.model_validators.items():
validators[validator_name] = (decorator.func, decorator.info)
class_fields = {
"__annotations__": pydantic_model.__annotations__,
}
for class_field_name, class_field_value in pydantic_model.__dict__.items():
if class_field_name.startswith("_"):
continue
elif class_field_name in ("model_fields", "model_config"):
continue
elif class_field_name in pydantic_model.model_fields:
continue
elif class_field_name in validators:
continue
class_fields[class_field_name] = class_field_value
pickler.save_reduce(
build_pydantic_model,
(
pydantic_model.__name__,
pydantic_model.__bases__[0],
pydantic_model.model_config,
pydantic_model.model_fields,
validators,
class_fields,
),
)
class Input(BaseModel):
prompt: str = Field()
num_steps: int = Field(default=2, ge=1, le=10)
def get_xxx(self):
return self.num_steps * 2
@model_validator(mode="after")
def validate_num_steps(self):
print(1)
if self.num_steps % 2 != 0:
raise ValueError("num_steps must be odd")
@classmethod
def from_name(cls, name: str, num_steps: int):
return cls(prompt=f"DSLR photograph of {name}", num_steps=num_steps)
if __name__ == "__main__":
dill.settings["recurse"] = True
serialized_cls = dill.dumps(Input)
print()
print("====== DESERIALIZING =====")
print()
cls = dill.loads(serialized_cls)
print("======== RUNNING =====")
inp = cls.from_name(name="Pikachu", num_steps=6)
print(inp.prompt) Whereas this seems to have been lost. I printed out the values and the classmethod is in the ====== DESERIALIZING =====
/home/louis/miniconda3/envs/fal39/lib/python3.9/site-packages/pydantic/main.py:1406: RuntimeWarning: fields may not start with an underscore, ignoring "__annotations__"
warnings.warn(f'fields may not start with an underscore, ignoring "{f_name}"', RuntimeWarning)
model_fields={'prompt': FieldInfo(annotation=str, required=True), 'num_steps': FieldInfo(annotation=int, required=False, default=2, metadata=[Ge(ge=1), Le(le=10)])}
class_fields={'__annotations__': {'prompt': 'str', 'num_steps': 'int'}, 'get_xxx': <function Input.get_xxx at 0x7f89e9510040>, 'from_name': <classmethod object at 0x7f89e9437a90>}
======== RUNNING =====
1
DSLR photograph of Pikachu |
Summary classmethod serialisation fixed in 12a2ad8 but gives a OK lastly it seems like I already figured out that you can't just put classmethods in the model fields (it doesn't seem to work) but you can assign them (you just can't get them to work as field validators, which isn't used internally anyway so is fine for a MVP...) I already implemented an approach that uses methods = {
field_name: field_value
for field_name, field_value in model.__dict__.items()
if field_name in ["steps_x2", "from_template"]
} I suspect that I can just replace the way I was building up a list of classmethods and... put them in here? Unless this breaks anything else, this would seem to be the final piece of the puzzle. Adding
To working (fal39) louis 🌟 ~/dev/fal/pydantic-2-bump/projects/fal/tests $ vim pydantic_standalone/pydantic_patch.py
(fal39) louis 🌟 ~/dev/fal/pydantic-2-bump/projects/fal/tests $ python pydantic_test.py --run-deserialisation-test
===== DESERIALIZING =====
/home/louis/miniconda3/envs/fal39/lib/python3.9/site-packages/pydantic/main.py:1406: RuntimeWarning: fields may not start with an underscore, ignoring "__annotations__"
warnings.warn(f'fields may not start with an underscore, ignoring "{f_name}"', RuntimeWarning)
===== INSTANTIATING ===== Dunder methods are valid to deserialiseAdditionally, there are dunder methods on E.g. a trivial class >>> pprint([k for k in MyModel.__dict__ if k.startswith("_") if not k.startswith("__pydantic_")])
['__module__',
'__annotations__',
'__get_private_field__',
'__class_vars__',
'__private_attributes__',
'__weakref__',
'__doc__',
'__abstractmethods__',
'_abc_impl',
'__signature__'] Or maybe we can... Maybe we could simply:
It turns out you can distinguish between a Pydantic model with a default autogenerated (dataclass-style) louis 🌟 ~/lab/fal/pyd2/round_2 $ python -i private_attr.py
>>> "__init__" in MyModel.__dict__
False
>>>
louis 🌟 ~/lab/fal/pyd2/round_2 $ python -i private_attr_custom_init.py
>>> "__init__" in MyModel.__dict__
True I'd expect this will mean exactly what it looks like: whether the class's namespace has the method defined in it: so in fact you'd want to enumerate methods down the entire MRO until you reach the base class (which we have access to!) Therefore it's actually relevant to methods in general, and nothing to do with dunders specifically. This means we have a way to filter for legit methods I think! |
Dug into this some more and the dunders can be identified from the model metaclass MRO based on the qualname E.g. the
We can therefore test this based on the qualnames of the two classes being similar, or more robustly we can take the first part of the qualname and compare it to the class name Negative result for
Positive result for
|
The good news is: the serialisation is working! The approach to get methods involves checking the qualname, with classmethods detected based on their The bad news is: the mechanism uses just a single level of the MRO, so calls to super will trigger recursion (I naively hoped that the base classes might just work out nicely)
I suspect that we might be able to interfere with the MRO manually? If it's legit/possible? (Just spoof it, like we spoof everything else)
Note The remaining failing tests can be targeted specifically with python -m pytest integration_test.py -k "path or bytes or save or input or compressed" -s |