Floating point imprecision and integer overflow issues still occurring on some architectures #193

farkmarnum · 2025-01-08T17:47:48Z

Summary

This is a follow-up to #188. On some architectures, romanisim is still crashing when given the following input (with env set up properly for WebbPSF, CRDS, etc):

romanisim-make-image r0099301001001001005_0001_WFI14_uncal.asdf --date 2026-01-01T00:00:00 --level 1 --sca 14 --bandpass F062 --radec 185.113124 61.636139 --roll 0.0 --ma_table_number 4 --catalog gaia-hltds.fits --webbpsf --usecrds --drop-extra-dq

This is caused by differences in how architectures handle overflow during float-to-integer conversion. I created two Docker images with Python and NumPy, one for ARM (linux/arm64) and one for AMD (linux/amd64, i.e. x86-64):

# Dockerfile:
FROM python:3.11
RUN pip install numpy
ENTRYPOINT ["/bin/bash"]

# build script:
docker build --platform linux/amd64 --tag testing-python-linux .
docker build --platform linux/arm64 --tag testing-python-mac .

The ARM image (the same architecture that Apple silicon processors use) doesn't show the integer overflow, but the AMD image does.

This matches my observations: I see the error when running in Linux containers on AWS (AMD), but not on my local MacBook (ARM).

Since converting an out-of-range float to an integer is undefined behavior in C, different processor types are free to produce different results, but it's certainly surprising!
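The exact check run in each container isn't shown above, so the following is only a sketch of one way to compare platforms. The helper name `cast_out_of_range` is hypothetical. No expected output is listed because the result depends on the architecture.

```python
import platform

import numpy as np


def cast_out_of_range():
    """Cast a float32 just past the int32 range and return the raw result."""
    value = np.array([2**31], dtype=np.float32)
    # Some NumPy versions warn about invalid casts; silence that here so the
    # raw, platform-dependent result is all we see.
    with np.errstate(invalid="ignore"):
        return int(value.astype(np.int32)[0])


# Print the machine type next to the cast result to compare containers.
print(platform.machine(), cast_out_of_range())
```

Running this in both images should make any difference between the two platforms visible side by side.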

Solution

We can resolve this by removing the cause of the integer overflow. In this case, it's not sufficient to "clip" the value to a max of 2^31 - 1, because float32 cannot represent that value exactly: it rounds to 2^31, which overflows int32:

import numpy as np

int(np.float32(2**31 - 1))
# -> 2147483648
2**31
# -> 2147483648

As floating point numbers get further from 0, their absolute precision decreases. In fact, any integer from 2^31 - 64 to 2^31 + 128 (inclusive) ends up equal to 2^31 when stored as a float32:

2**31
# -> 2147483648
int(np.float32(2**31 + -64))
# -> 2147483648
int(np.float32(2**31 + 128))
# -> 2147483648

int(np.float32(2**31 + 129))
# -> 2147483904
int(np.float32(2**31 - 65))
# -> 2147483520

This is because the distance between adjacent 32-bit floats is 128 just below 2^31 and 256 at and above it. So 2^31 - 65 rounds down to 2^31 - 128, and 2^31 + 129 rounds up to 2^31 + 256. The endpoints 2^31 - 64 and 2^31 + 128 are exact ties, and round-half-to-even sends both to 2^31.
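The spacing can also be checked directly with `np.spacing`, which returns the distance from a value to its next adjacent float. This is a small sketch; the variable names are illustrative only.

```python
import numpy as np

# Largest float32 strictly below 2**31, and 2**31 itself as a float32.
below = np.nextafter(np.float32(2**31), np.float32(0))
at = np.float32(2**31)

print(int(below))                 # 2147483520, i.e. 2**31 - 128
print(float(np.spacing(below)))   # 128.0
print(float(np.spacing(at)))      # 256.0
```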

You can use NumPy's nextafter function to get the next floating point number at the given precision (or in this case the previous one, by stepping toward 0), and use the result as the upper limit in the clip call:

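# Pass the direction as a float32 so the result stays float32 regardless of NumPy's promotion rules.
MAX_SAFE_INT = np.nextafter(np.float32(2 ** 31 - 1), np.float32(0))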
np.clip(999999999999, 0, MAX_SAFE_INT).astype('i4')
# -> np.int32(2147483520)

This solution works on both ARM and AMD architectures.
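As a sketch of how the clip might be wrapped for array inputs: the helper below is hypothetical and is not the code from the PR. The names `MAX_SAFE_F32` and `clip_to_int32` and the lower bound of 0 are assumptions carried over from the example above.

```python
import numpy as np

# Largest float32 strictly below 2**31 (hypothetical constant name).
MAX_SAFE_F32 = np.nextafter(np.float32(2**31), np.float32(0))


def clip_to_int32(values):
    """Clip values into a range that survives the float32 -> int32 cast."""
    arr = np.asarray(values, dtype=np.float32)
    # Both bounds are float32, so the clipped array stays float32.
    clipped = np.clip(arr, np.float32(0), MAX_SAFE_F32)
    return clipped.astype(np.int32)


result = clip_to_int32([-5.0, 1234.0, 2**31 - 1, 1e12])
print(result.tolist())  # [0, 1234, 2147483520, 2147483520]
```

Because every value is at most 2^31 - 128 before the cast, the conversion never leaves the int32 range, so the result should not depend on the platform. NaN inputs are not handled in this sketch.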

~~I don't have permission to create branches in this repo, otherwise I'd make a PR. Here's the patch to fix:~~

Edit: just made a PR: #194


farkmarnum mentioned this issue Jan 8, 2025

Fix floating point issues #194

Merged

schlafly closed this as completed in #194 Jan 14, 2025
