fix(serialization): Slice 1-D multibyte data as bytes for pwrite #181

Merged
merged 4 commits into from
Nov 28, 2024

Contributor

@Eta0 Eta0 commented Nov 28, 2024

1-D Multibyte Data Slicing

This change fixes a bug in TensorSerializer's write routine, which sliced 1-D multibyte data by elements of its original dtype rather than by individual bytes. When a write call wrote only part of a buffer, the routine sliced off the already-written prefix to retry with the remainder. Because that slice index was a byte count applied to multibyte elements, the resulting slice was wrong, and the write routine then failed with an exception.
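The retry pattern described above can be sketched as follows. This is an illustration only, not the code in tensorizer/serialization.py: `write_all` and its `max_chunk` parameter are hypothetical names, and `max_chunk` exists purely to force partial writes for the demonstration. It assumes a platform where `os.pwrite` is available (Unix).

```python
import array
import os
import tempfile


def write_all(fd, data, offset, max_chunk=None):
    """Write every byte of `data` to `fd` at `offset`, retrying partial writes.

    `max_chunk` is a hypothetical knob that caps each call so that
    partial writes can be reproduced on demand.
    """
    view = memoryview(data)
    # Slice by bytes, not by elements: cast anything that is not
    # already a 1-D unsigned-byte view.
    if view.ndim != 1 or view.format != "B":
        view = view.cast("B")
    written = 0
    while written < view.nbytes:
        end = view.nbytes if max_chunk is None else written + max_chunk
        n = os.pwrite(fd, view[written:end], offset + written)
        if n <= 0:
            raise OSError("pwrite made no progress")
        written += n
    return written


floats = array.array("d", [1.0, 2.0, 3.0, 4.0])  # 4 x float64 = 32 bytes
fd, path = tempfile.mkstemp()
try:
    total = write_all(fd, floats, 0, max_chunk=12)  # 12 + 12 + 8 bytes
    round_trip = os.pread(fd, 32, 0)
finally:
    os.close(fd)
    os.unlink(path)
```

Because `written` counts bytes, the view it indexes has to be a byte view as well; that is the invariant the fix restores.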

For example, suppose a write of a 32-byte buffer holding 4x float64 elements partially succeeds and writes 12 bytes. The routine would previously try to continue by writing buffer[12:], but that slice skips ahead by 12x float64 elements (i.e. 96 bytes) instead of 12 bytes. Finding the slice empty, it would conclude that it had no more data to write, check whether it had written everything, and fail. To fix this, it now casts buffers to an unsigned byte type before slicing if they are not one already.
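The arithmetic in this example can be reproduced with a plain `memoryview`. This is a minimal sketch using the standard library's `array` module as a stand-in for the real tensor buffer; the variable names are illustrative.

```python
import array

# 4 x float64 elements = 32 bytes, viewed with its original 8-byte item width.
buffer = memoryview(array.array("d", [1.0, 2.0, 3.0, 4.0]))
bytes_written = 12  # pretend the first write call only wrote 12 bytes

# Element-wise slicing: index 12 is far past the end of a 4-element view,
# so nothing appears to be left to write.
remaining_by_element = buffer[bytes_written:]

# Byte-wise slicing: cast to unsigned bytes first, then skip 12 bytes.
remaining_by_byte = buffer.cast("B")[bytes_written:]

print(buffer.nbytes, remaining_by_element.nbytes, remaining_by_byte.nbytes)
```

The element-wise slice is empty even though 20 bytes are still unwritten, which is the state that made the completeness check fail.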

This only affects 1-D data because multidimensional arrays passed to TensorSerializer's write routine were already correctly converted to 1-D arrays of bytes. Only arrays that were already 1-D but had an element width other than one byte were missed.
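One way a case like this can slip through is sketched below. It is an assumption for illustration, not a reconstruction of the original code: `flatten_by_ndim_only` is a hypothetical check that looks only at dimensionality, and `as_byte_view` is a hypothetical helper that also checks the element format.

```python
import struct

raw = struct.pack("4d", 1.0, 2.0, 3.0, 4.0)   # 32 bytes
grid = memoryview(raw).cast("d", shape=[2, 2])  # 2-D float64 view
flat = memoryview(raw).cast("d")                # 1-D float64 view


def flatten_by_ndim_only(view):
    # Hypothetical incomplete check: only multidimensional views are cast.
    return view.cast("B") if view.ndim != 1 else view


def as_byte_view(view):
    # Also cast 1-D views whose elements are not unsigned bytes.
    if view.ndim != 1 or view.format != "B":
        return view.cast("B")
    return view


print(len(flatten_by_ndim_only(grid)))  # length counted in bytes
print(len(flatten_by_ndim_only(flat)))  # length still counted in float64 elements
print(len(as_byte_view(flat)))          # length counted in bytes
```

With the dimensionality-only check, the 2-D view becomes 32 bytes but the 1-D view stays at 4 elements, so byte offsets cannot be used to slice it.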

Code version update to v2.9.1

This change additionally updates the code version from v2.9.0 to v2.9.1 so that this fix can be published as a bugfix release.

@Eta0 Eta0 added the bug Something isn't working label Nov 28, 2024
@Eta0 Eta0 requested a review from wbrown November 28, 2024 00:13
@Eta0 Eta0 self-assigned this Nov 28, 2024
Collaborator

@wbrown wbrown left a comment

I have questions!

tensorizer/serialization.py (resolved)
tensorizer/serialization.py (resolved)
@wbrown wbrown merged commit 2cee68a into main Nov 28, 2024
5 of 7 checks passed
@Eta0 Eta0 deleted the eta/1d-multibyte-slicing branch November 28, 2024 00:53
2 participants