Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use arguments in Number.toLocaleString #1619

Merged
merged 2 commits into from
Nov 24, 2023

Conversation

LuisMerinoP
Copy link
Contributor

Shortcut for #729.
I guess it would be better if the cultures string array could be cached in the engine.

@lahma
Copy link
Collaborator

lahma commented Aug 14, 2023

Thanks for the proposal. I think there's still some problems to solve if we're going to map directly to .NET runtime locales. Currently there's no evidence how tests pass better (and the minor detail that this can be quite costly performance-wise like you've noted).

@LuisMerinoP
Copy link
Contributor Author

LuisMerinoP commented Aug 15, 2023

Thanks for checking and your comments @lahma. I am not familiar in making languages or interpreters so apologies in advanced If I speak some nonsense or fluff.

I think there's still some problems to solve if we're going to map directly to .NET runtime locales.

Which problems are these? I find reasonable to achieve these Available locales from the host environment (.NET) which will arrange the formatting and store as a string array somewhere in the engine. But not sure on how to do this or how to foresee/evaluate the potential problems.

Currently there's no evidence how tests pass better.

Better compared what to what?

My thought on this is that all the cultural info related features (thinking mainly in number, date etc. formatting) relies on implementing Locale and Parameter Negotiation. Ideally, implementing all the abstract operations.

Either if all the abstract operations are implemented or not departing point is the AvailableLocales which is the very list we are talking about. ("[[AvailableLocales]] is a List that contains structurally valid (6.2.1) and canonicalized (6.2.2) language tags identifying the locales for which the implementation provides the functionality of the constructed objects").

With this list available in the engine, depending on the abstract operations implemented it can be decided to provide functionality bypassing the ECMAScript standard or not, or not providing the functionality at all (which I think is the current status). Would not like to waste anybody's time here so with some guidance I can try move this forward. If for any reason that doesn't make sense, lets let it be.

Thanks :)

@LuisMerinoP LuisMerinoP force-pushed the issue729-toLocaleString-withArgs branch from e444291 to ced17c5 Compare August 15, 2023 08:54
@lahma
Copy link
Collaborator

lahma commented Aug 15, 2023

Hey didn't want to depress you or anything with my feedback. We do appreciate all the help that we can get. It's more about validating the changes and features that they work as expected.

Our main tool for this is the Jint.Tests.Test262 project in solution. It has Test262Harness.settings.json which tells which set of tests should be run and which are currently excluded. When one makes changes to this file you should delete the Generated folder and let build generate the test cases again obeying the configuration (new excludes or removal of excludes).

As a use case if I edit the configuration file and remove intl402 from ExcludedDirectories I will get the "full intl402 experience". Which gives me mere 2236 failing tests. And here are the test cases interesting for this particular function:

image

Many of them seem to fail due to missing Intl. So thing is, we would need to find a way to validate these changes and improvements 🙂

@LuisMerinoP
Copy link
Contributor Author

Cool thanks. No depression at all :). Not sure how to run the tests but I'll give that a shot when I find the time and try fix. Thanks :)

@LuisMerinoP
Copy link
Contributor Author

Hi @lahma, same tests fail on main and on my branch (as far as I checked). So your point is that for the PR to be successful should be fixing some of the already failing tests, right?

@lahma
Copy link
Collaborator

lahma commented Aug 23, 2023

We need some tests to cover such new functionality, but there are problems.

I think a less terse implementation could be:

var cultureInfo = Engine.Options.Culture;
if (arguments.Length > 0 && arguments[0].IsString())
{
    var cultureArg = arguments[0].ToString();
    try
    {
        cultureInfo = CultureInfo.GetCultureInfo(cultureArg);
    }
    catch (CultureNotFoundException)
    {
        ExceptionHelper.ThrowRangeError(_realm, "Incorrect locale information provided");
    }
}

BUT, the problem is that the current solution seems to have wrong output if we compare to V8.

Welcome to Node.js v18.17.1.
Type ".help" for more information.
> (123).toLocaleString("en-US")
'123'

VS Jint's REPL output

Welcome to Jint (1.0.0.0)
Type 'exit' to leave, 'print()' to write on the console, 'load()' to load scripts.

jint> (123).toLocaleString("en-US")
"123.000"

@sebastienros
Copy link
Owner

@lahma 's code looks much simpler and the runtime library will probably already do the cache correctly.
About the current discrepancy I believe it's coming from the type, double vs int, in javascript it must know that 123 is integral, while we store these in a double variable. So dotnet will add some decimals by default.

It seems like the native js implementation removes any spare zeros. There are ways in dotnet to do that, but it's an orthogonal problem to supporting the culture as an argument.

@lahma
Copy link
Collaborator

lahma commented Aug 24, 2023

I think something like the following would produce more accurate results:

private JsValue ToLocaleString(JsValue thisObject, JsValue[] arguments)
{
    if (thisObject is not JsNumber number)
    {
        if (thisObject is not NumberInstance numberInstance)
        {
            ExceptionHelper.ThrowTypeError(_realm, "Number.prototype.toLocaleString requires that 'this' be a Number");
            return default;
        }

        number = numberInstance.NumberData;
    }

    var cultureInfo = Engine.Options.Culture;
    var culture = arguments.At(0);
    if (culture.IsString())
    {
        var cultureArg = culture.ToString();
        try
        {
            cultureInfo = CultureInfo.GetCultureInfo(cultureArg);
        }
        catch (CultureNotFoundException)
        {
            ExceptionHelper.ThrowRangeError(_realm, "Incorrect locale information provided");
        }
    }

    if (!number.IsInteger())
    {
        var d = number.AsNumber();
        if (double.IsNaN(d))
        {
            return "NaN";
        }
        if (double.IsPositiveInfinity(d) || d >= double.MaxValue)
        {
            return "Infinity";
        }

        if (double.IsNegativeInfinity(d) || d <= -double.MaxValue)
        {
            return "-Infinity";
        }
    }

    return number.IsInteger()
        ? number.AsInteger().ToString(cultureInfo)
        : number.AsNumber().ToString(cultureInfo);
}

But we still need those test cases to cover...

@LuisMerinoP
Copy link
Contributor Author

LuisMerinoP commented Aug 24, 2023

As far as I researched JavaScript will use the shortest representation that accurately represents the value. For the closest approximation of JavaScript's output across cultures, we could try combining the "g" format specifier with setting the NumberFormatInfo.NumberDecimalDigits to 0 like so:

....
var numberFormat = (NumberFormatInfo) Engine.Options.Culture.NumberFormat.Clone();
if (arguments.Length > 0 && arguments[0].IsString())
{
    var cultureArg = arguments[0].ToString();
    try
    {
        numberFormat = (NumberFormatInfo) CultureInfo.GetCultureInfo(cultureArg).NumberFormat.Clone();
        numberFormat.NumberDecimalDigits = 0;
    }
    catch (CultureNotFoundException)
    {
        ExceptionHelper.ThrowRangeError(_realm, "Incorrect locale information provided");
    }
}

return m.ToString("g", numberFormat);

We could add the tests that check that the output is the same as the one from node for the most widely used cultures and the most conflictive conversions. These would be in the NumberTests, right?

Checking with @lahma provided snippet or the arrangement I suggest above the tests will reveal if code needs further arrangements or tweaks to match JS behaviour.
Makes sense?

@lahma
Copy link
Collaborator

lahma commented Aug 24, 2023

Sounds good 👍🏻

@LuisMerinoP LuisMerinoP force-pushed the issue729-toLocaleString-withArgs branch from 5358d64 to db84e2d Compare November 19, 2023 16:26
@LuisMerinoP
Copy link
Contributor Author

Hi there, I pushed what I think makes sense for the time being. Can you have a look when you find the time @lahma? Thank you :)

@LuisMerinoP LuisMerinoP force-pushed the issue729-toLocaleString-withArgs branch 5 times, most recently from 5c619ae to d8f3a80 Compare November 19, 2023 18:01
@lahma
Copy link
Collaborator

lahma commented Nov 21, 2023

I think NET 8 has introduced some subtle difference in behavior that needs to be fixed in main first to get reliable builds.

@LuisMerinoP
Copy link
Contributor Author

I think NET 8 has introduced some subtle difference in behavior that needs to be fixed in main first to get reliable builds.

Ok thanks. Will wait for this then. Reverting my attempts to fix the test :).

@LuisMerinoP LuisMerinoP changed the title Use arguments in Number.toLocaleTring Use arguments in Number.toLocaleString Nov 21, 2023
@lahma
Copy link
Collaborator

lahma commented Nov 22, 2023

main should be green now, please rebase your branch.

@LuisMerinoP LuisMerinoP force-pushed the issue729-toLocaleString-withArgs branch from 96ba99f to 30f9839 Compare November 22, 2023 08:01
@LuisMerinoP
Copy link
Contributor Author

Any idea why the test is failing?. If the PR looks good conceptually I'll keep on trying to fix.

@lahma
Copy link
Collaborator

lahma commented Nov 22, 2023

Well I know that French people are an odd bunch, but maybe even they don't use such special characters for separators? /cc @sebastienros 😉

image

@LuisMerinoP
Copy link
Contributor Author

LuisMerinoP commented Nov 22, 2023

Well I know that French people are an odd bunch, but maybe even they don't use such special characters for separators? /cc @sebastienros 😉

I'm Spanish so maybe I agree on that hahaha. I just copy pasted what node spits out from .toLocaleString and tests pass locally on my Windows machine.
Thanks so much for trying to fix @sebastienros !

@sebastienros
Copy link
Owner

I assume windows and linux don't use the same group separator (ICU). A solution could be to patch the NumberGroupSeparator for fr-FR to set it constant, or patch the test with its value, or ignore French.

@LuisMerinoP
Copy link
Contributor Author

Maybe this https://gist.github.com/haacked/1610603 is useful. I tried this to have the logs on the CI console and check whats going on there. fyi

@LuisMerinoP
Copy link
Contributor Author

Another idea. Are we sure the fonts are installed on the windows CI? It's possible that the font used in your CI environment may not fully support or properly render this character?
This is the font: https://learn.microsoft.com/en-us/typography/font-list/consolas.

@sebastienros
Copy link
Owner

Are we sure the fonts are installed on the windows CI

That question makes me question the validity of the PR ;) Fonts have nothing to do with the failure. The char could be represented as heart emojis, the expected and actual values would still be the same. The issue comes from the fact that apparently on Linux the separator is non-breaking space, while it's the standard space on windows. On Linux locales have different settings than on windows. There are options to use the same (in the OS and in dotnet). Google for ICU, Linux and .NET at the same time.

@LuisMerinoP LuisMerinoP force-pushed the issue729-toLocaleString-withArgs branch 3 times, most recently from 08b7fa3 to a9bb596 Compare November 23, 2023 19:47
@LuisMerinoP
Copy link
Contributor Author

Hi guys, I tried a few more things with no success so commenting the failing test for the time being. Most of I tried works locally for me on a windows machine so not many ideas left while I cannot access the CI. I am happy to try things out if there are suggestions.
To move things along I commented the test with a //TODO. I guess that in future work fixing the intl402 tests and implementing all these internationalization functions according to the standard we'll stumble with the solution at some point.
Thoughts?

@lahma lahma force-pushed the issue729-toLocaleString-withArgs branch from a9bb596 to f8b3913 Compare November 24, 2023 19:10
@lahma lahma merged commit 0a58535 into sebastienros:main Nov 24, 2023
3 checks passed
@lahma
Copy link
Collaborator

lahma commented Nov 24, 2023

Thank you for your contribution, I think this will fulfill the obvious need and we can fight the ECMAScript test suite another day when needed. Now as you've contributed to the repository, next PRs will run automatically without us needing to approve the runs 😉

@LuisMerinoP
Copy link
Contributor Author

awesome! thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants