Suggest DECIMAL cql type to be returned as string #1608
Comments
I do not follow: since But even with 2-based (binary) floating-point I am not sure I get the point: yes, the number's representation as 10-based text may need to be rounded, but would use of a JSON String make any difference? How would that help?
OK, based on discussions, this is a concern about JSON handling differing between JSON Numbers and JSON Strings, so if the server side were to use 2-based FPs ( As things are, Data API uses Java
My 2 cents: I think returning decimals as numbers is the right way to go. There shouldn't be any loss of precision since JSON serialization isn't doing any arithmetic. And it seems semantically cleaner for numbers to end up as numbers in JSON. However, serializing decimals as strings wouldn't be unprecedented. For example, MongoDB's JavaScript BSON serializer only supports |
Reading through...
Not sure how this exactly relates to the issue of encoding decimal, yes there is a reason for having floats and decimals. I will note that the Decimal and Float sums above are the same, depending on the number of decimal places and this is why we have the decimal
In [5]: s = "-9.9362e-10"
In [6]: f = float(s)
In [8]: print(f"10 decimal places: {f:.10f}")
10 decimal places: -0.0000000010
In [12]: print(f"1 decimal places: {f:.1f}")
1 decimal places: -0.0
# using new rounding https://peps.python.org/pep-0682/
In [20]: print(f"1 decimal places: {f:z.1f}")
1 decimal places: 0.0
In [11]: float("-0.0") == float("0")
Out[11]: True
In [14]: float("-0.0") == int("0")
Out[14]: True
# false if we use the full exponent
In [15]: float(s) == int("0")
Out[15]: False
I am not following the example when it says "As a result, there is a lossy math when the original intent of the column type was precisely to avoid that." The values returned by the Data API are exactly the same as those returned by CQL.
These are the Mongo docs for this: https://github.com/mongodb/specifications/blob/master/source/bson-decimal128/decimal128.md and https://www.mongodb.com/docs/manual/reference/mongodb-extended-json/#mongodb-extended-json--v2-
Thoughts... not decisions, just things to be considered.
JSON view on numbers
As the JSON - Data Types page says....
My view on this is that the JSON protocol says it has a way to encode a number that is not infinite or NaN, and then says it is up to the parsing code to alert the user if there is an issue with the decoding, such as when the code cannot represent the number.
Mongo DB view on Decimal
If a decimal is declared, they are always using an EJSON value. And if I read the spec correctly, they are telling clients not to round-trip into the native decimal types.
Telephone number problem
What happens to the classic telephone number problem? The string "012345" is a string of digits, not a number.
Relaxed vs Strict
We are building the relaxed representation first, and the idea is to be as basic a JSON as we can. Any read will return the schema; if a client wants to go beyond handling a number as a JSON number, it can use the schema returned to know what is a decimal. The parsing in Python and Node.js and other libraries allows for special handling of fields and floats to handle these cases.
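As a rough illustration of the "special handling" mentioned above, Python's standard `json` module accepts a `parse_float` hook, so a client can decode JSON numbers straight into `Decimal` without ever going through a binary float. This is only a sketch: the payload and its field names are made up for the example and are not an actual Data API response.

```python
import json
from decimal import Decimal

# Hypothetical response body; field names are illustrative, not the real Data API shape.
payload = '{"id": 1, "price": 3.30, "delta": -9.9362e-10}'

# Default decoding: JSON numbers with a fraction or exponent become binary floats.
default = json.loads(payload)

# parse_float receives the raw numeric text, so Decimal sees the digits
# exactly as they were written on the wire.
exact = json.loads(payload, parse_float=Decimal)

print(type(default["price"]).__name__, default["price"])
print(type(exact["price"]).__name__, exact["price"])
print(type(exact["id"]).__name__, exact["id"])  # integers are not affected by parse_float
```

With the schema returned by a read, a client could apply this kind of hook only when a decimal column is present, and keep the default behaviour otherwise.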
Related: filed #1654 wrt "other direction" for |
Moving to on hold; I still think we should use JSON for the simple encoding and can then do this for the strict encoding format.
I thought a little bit more about this in the context of datastax/astra-db-ts#90 and I think I see the reason for returning DECIMAL, as well as potentially VARINT and LONG, as strings. Built-in JSON parsers, like JavaScript's, are often limited by the language's built-in number limits, so values that cannot safely fit in numbers may lose precision. For example, 9223372036854775807 is the max long value in Java. Consider what happens when JavaScript's
This puts clients in a potentially tricky position of needing to implement customized JSON parsing. |
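To make the precision concern concrete without a JavaScript runtime, here is a small Python sketch. It relies on the fact that a Python `float` is an IEEE 754 double, the same representation as a JavaScript Number, so converting the max long to `float` stands in for what a double-only parser would store. The `{"value": ...}` document is a made-up example, not a Data API response.

```python
import json

JAVA_LONG_MAX = 9223372036854775807  # Long.MAX_VALUE in Java, i.e. 2**63 - 1

# Simulate a parser that stores every JSON number as an IEEE 754 double.
as_double = float(JAVA_LONG_MAX)
round_tripped = int(as_double)

print(round_tripped == JAVA_LONG_MAX)  # the last digits do not survive
print(round_tripped - JAVA_LONG_MAX)   # distance from the original value

# If the value travels as a JSON string, the digits arrive untouched and
# the client chooses how to interpret them.
doc = json.loads('{"value": "9223372036854775807"}')
from_string = int(doc["value"])
print(from_string == JAVA_LONG_MAX)
```

Note that Python's own `json` module parses integer literals exactly, so the problem shows up only in parsers that map all numbers to doubles.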
The rounding issues that essentially led to creating the DECIMAL cql type would dictate that such columns be returned as strings, to let callers/users/clients do the right thing. Otherwise, unexpected results may occur.
Principle
In binary floating-point, 3.3 - 2.2 - 1.1 != 0 because of rounding issues during the calculation. Decimal holds a special representation for numbers with decimals that avoids that.
In pure CQL
This table has two columns (decimal, float) populated with the same values, which demonstrates this difference:
The DECIMAL column:
The FLOAT column:
Data API
When reading this table with the Data API (dockerized 1.0.18 to be precise), this is what I see:
As a result, there is a lossy math when the original intent of the column type was precisely to avoid that.
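The difference described under Principle can be reproduced client-side with a minimal Python sketch. It uses Python's `float` and `decimal.Decimal` as stand-ins for the FLOAT and DECIMAL behaviour; it does not query CQL or the Data API.

```python
from decimal import Decimal

# Binary floating-point: none of 3.3, 2.2, 1.1 is exactly representable,
# so the subtraction leaves a tiny non-zero residue.
float_result = 3.3 - 2.2 - 1.1

# Decimal built from strings keeps the base-10 digits exactly.
decimal_result = Decimal("3.3") - Decimal("2.2") - Decimal("1.1")

print(float_result == 0)    # False
print(decimal_result == 0)  # True
```

If a client turns a decimal value into a binary float before doing arithmetic, it ends up on the first path even though the column was declared as DECIMAL.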