github actions failing under PDL-2.096 #18

moocow-the-bovine · 2025-01-02T11:22:43Z

github actions are failing for PDL-CCS commit a9a4882 under PDL-2.096 only

succeeded (PDL-2.095): https://github.com/moocow-the-bovine/PDL-CCS/actions/runs/12571307267
failed (PDL-2.096): https://github.com/moocow-the-bovine/PDL-CCS/actions/runs/12581459072

~~I could not reproduce this locally using the PDL git 2.096 tag PDLPorters/pdl@87bed59.~~

Problem seems to be changes to the output type & BAD-handling of ~~PDL core's~~ CCS's bandover and borover with respect to PDL core. I think the CCS BAD-handling is right, but output type is defaulting to sbyte rather than longlong.

@mohawk2 any ideas?

for the record, failing tests look like:

test_ufunc(bandover, BAD)
not ok 115 - bandover:missing=BAD:type

#   Failed test 'bandover:missing=BAD:type'
#   at CCS/t/03_ufuncs.t line 103.
#          got: 'sbyte'
#     expected: 'longlong'
# ccs_rc(PDL: SByte D [7])=PDL::CCS::Nd:sbyte(6)
#   pdims:indx(1)=[6]
#   vdims:indx(1)=[0]
#   which:indx(1,6)^T=   
#    [
#     [0 1 2 3 4 5]
#    ]
#   vals:sbyte(7)=[10 1 0 0 0 0 BAD]
#   missing:sbyte(1)=[BAD]
# 
# dense_rc(PDL: LongLong D [6])=[10 1 0 0 0 0]

The text was updated successfully, but these errors were encountered:

moocow-the-bovine · 2025-01-02T11:33:02Z

... might have to do with the CCS functions' GenericTypes, which gets set to this:

[map {$_->ppsym} grep {$_->integer} PDL::Types::types()]

... have there been (non-deterministic) changes to this order? Or to the way GenericTypes is interpreted if type-demotion is in effect? (I think the tests are just passing in double inputs and relying on auto-magic conversion).

moocow-the-bovine · 2025-01-02T11:43:14Z

I could not reproduce this locally using the PDL git 2.096 tag PDLPorters/pdl@87bed59.

can reproduce. found workaround (-> 07bb416). still think there's something weird going on.

mohawk2 · 2025-01-02T14:02:51Z

Without having checked yet, sbyte is the output type of some PDL mask functions. The bitwise and I believe the logical (and2, or2) functions only deal in integer types, which means if they get passed a double, conversion will happen.

One thing that's changed in PDL 2.096 is that where it's an available type, the output Par(s) dictate the GenericType of the xform. Considerable efforts have been made to make that not cause problems and be backward-compatible, but it's conceivable I've missed something. I'll have a look at what's happening with the CCS commits you mention above and see if there's something you can do in CCS to not be broken by this, or if it's something in PDL that needs fixing.

mohawk2 · 2025-01-02T14:14:30Z

To add to the above: longlong is the last integer type, so previously if any inputs to e.g. bandover were e.g. double, that being a higher-numbered type (currently 10, if you care) than any integer type, would "win" as selecting the xform's GenType. But because that isn't available, the actual selected type would be the last-given one (here, longlong).

The problem will be some combination of these factors, and I'd imagine that being more finicky about passed-in types will lead to success.

That "use the last-given type" thing is one reason why complex types are opt-in; they are numbered higher than all real types, so if they got selected as the GenType of e.g. a newly-complex-capable asin having been passed a Perl scalar that's an integer (say, 2), would return a mathematically correct complex number, rather than the nan that the user (Craig, in this case) had been expecting. In this case, the fix was to make double be the last-given type.

moocow-the-bovine · 2025-01-02T14:29:10Z

That "use the last-given type"

Thanks for the clarification! The GenericTypes I'm specifying works out to:

[qw(A B S U L K N P Q)]

... and if I'm reading it right, then Q aka longlong is the last given type. But I seem to be getting sbyte out (aka A) when I (somewhat nonsensically) pass in a double. Have I misunderstood the heuristic, or should I still be getting a longlong out? Or do I need to specify something else?

mohawk2 · 2025-01-02T14:41:13Z

Can you say what if any PDL xforms you're using, the whole signature of your op, and the types of all the inputs you're giving it?

moocow-the-bovine · 2025-01-02T16:47:47Z

signature (ccs_accum_bor)

    indx ixIn(Ndims,NnzIn);
         nzvalsIn(NnzIn);
         missing();
    indx N();
    indx [o]ixOut(Ndims,NnzOut);
         [o]nzvalsOut(NnzOut);
    indx [o]nOut();

minimum broken example

use PDL;
use PDL::CCS::Ufunc;

PDL::_ccs_accum_bor_int(
  $ixIn=PDL->pdl(indx, [[0]]), # indx
  $nzvalsIn=pdl([65535]),      # double
  $missing=0,                  # scalar --> this seems to be the culprit
  $N=0,                        # scalar
  $ixOut=null,                 # null
  $nzvalsOut=null,             # null
  $nOut=null                   # null
);

die("you're too short for this ride") if ($nzvalsOut->type == sbyte);

xforms

no explicit ones, but I am passing in perl scalars for $missing and $N. The "real" missing value is actually BAD in the failing test, but passing in N=0 causes the PP code to ignore it.

As it turns out, if I pass in $missing=pdl(double, 0) or even BAD itself ($missing=pdl(double, 0)->setvaltobad(0)), then I do get a longlong out.

... that suggests a more robust CCS-internal workaround (--> always pass in a $missing of the correct type), but I'm still surprised by the behavior.

moocow-the-bovine · 2025-01-03T08:31:34Z

@mohawk2 I'm closing this as resolved by the CCS-internal workaround, but I still believe that the behavior indicates a bug in PDL core's choice of output type. Feel free to investigate ... or not.

mohawk2 · 2025-01-03T16:51:20Z

@moocow-the-bovine I'm glad you've got a situation that works with both old and new. However, your feeling of surprise at the situation is an important signal, so I'd like to just examine that a bit more, and especially get your input. I'm going to "comment" this now, in case that happens to catch your attention, but I'll carry on adding to it to complete the thoughts.

First, what happened isn't precisely a choice of output type, but of the xform's "generic type" (the value of $GENERIC()), or (more conveniently) its datatype. That matters only in that the internal processing will be done "in" that datatype (based on inputs that might have been converted), then maybe be converted for each output. That might be a good or bad thing, depending on competing needs for numerical precision and performance.

Secondly, the influence on that of a provided Perl scalar is as follows: those get turned into the smallest-numbered datatype for that class of data: Perl IVs, UVs, NVs, and Math::Complex objects each become one of the relevant PDL datatypes (Math::Complex only becomes cdouble). If you'd given a 0.0 instead of 0, that would have become a float, I think - a non-allowed datatype.

Up till 2.096, if any parameter (input or output) was of a non-allowed type, that forced the xform datatype to instead be the last-given type, as discussed above. As of 2.097, as you have found, it only now does that if no allowed type was given; if one was, it will go with that. I'm inclining as a result of your surprise towards switching that logic back so any non-allowed input again does the forcing, and using the test you've given above (I can find a core equivalent) to make that the permanent behaviour.

What do you think?

moocow-the-bovine · 2025-01-03T21:23:42Z

@mohawk2
Thanks for the patience & clear explanation, and apologies for my earlier snarkiness (I do that sometimes).

Up till 2.096 [...] the last-given type
As of 2.097 [...] it only now does that if no allowed type was given

💡 ... so scalar 0 being a (sufficiently small?) IV/UV winds up determining the "native" type of the operation ... that makes a lot of sense (sorry if my terminology is incorrect or my conceptualization is off; I've not delved very deeply into the PDL internals).

If you'd given a 0.0 instead of 0, that would have become [...] a non-allowed datatype.

(symptomatically) confirmed; if I pass in $missing=0.0, I get a longlong out.

inclining as a result of your surprise towards switching that logic back [...] What do you think?

It's complicated; it's pretty clear to me that:

my test was doing something unrealistic & stupid by passing a double to an integer-types-only operation,
the "clever" CCS-internal kludge for $missing=BAD wrongly used a perl scalar instead of the "proper" logical type of the sparse ndarray (here double)

... so I think that:

I was wrong: the "clever" CCS-internal kludge was too clever and/or not clever enough,
this particular change in PDL core behavior is unlikely to bite anyone in the Real World™, and
the CCS-internal workaround (== preserving the correct type for $missing) will now indeed Do The Right Thing™ (i.e. the same thing that the corresponding PDL core functions would do with the equivalent dense ndarray).

All of that is very CCS-centric ... regarding switching that logic back in PDL core, I don't have a strong feeling one way or the other. Some thoughts:

Now that I (believe to) understand it better, I actually really like the "no allowed type given" heuristic ... that sounds to me like it should handle more (and more realistic) cases than this one more flexibly & more gracefully than the "always choose the last given type" heuristic.
really naive PDL users might not be super-careful about specifying the types of their ndarrays (I remember that it took me a while before I learned to love the pdl(TYPE, ...) syntax). Naive users are also the most likely to pass in scalars and expect PDL to DWIM ... which with the changed heuristic I guess could lead to unexpected results.
A lot of the PMCode (and CCS::Nd wrapper code) I just removed from PDL-CCS involved just this sort of logic to determine which type to pre-allocate for an output PDL with zeroes (one such removal led to this issue). The removed code was klunky and ugly and I'm glad it's gone, but the heuristic (which I modeled on my conceptualization of PDL core's behavior) was basically "choose the maximal type among those which were explicitly passed in". In this case, that was the double from $nzvalsIn ... which presumably would lead to invocation of the "unsupported type" clause in PDL core.
Maybe there's a way to prevent (atomic) perl scalars from influencing the type-selection process (not counting as "explicitly passed in" in the above sense)? But I'm guessing that the scalar-to-PDL conversion happens first and that when the type unification is being performed we no longer know which types came from PDLs and which from perl scalars, so that might not be an option. I can also imagine that handing Math::Complex scalars (which should probably count as "explicitly passed in") makes that even trickier.

Going a bit more abstract (and harking back to our earlier discussion of "type variables"), I want to conceptualize the "native type of an operation" in terms of a unification operation (== least-upper-bound over a set of (given) types with respect to a bounded complete partial order). That BCPO can be just a chain (e.g. the sort-order < on PDL::Types), but the question for me here is: "what set of types are we doing the LUB over?" ... null obviously shouldn't count, but should perl scalars? If so, what should they count as?

mohawk2 · 2025-01-03T21:59:13Z

These matters are all discussed somewhat on PDLPorters/pdl#511, I'd encourage you to take a look there too. That includes points about how to deal with Perl scalars, noted in your second and fourth bullet-points.

Your term "native" type of the specific instance of the operation seems to mean the same things as my "generic type" or "datatype". I'd encourage "datatype", but it's up to you.

I think you're likely to be right that it's only CCS that gets affected by the change of behaviour, but since it is a real change of behaviour, that's not notably beneficial, I'm going to change it back. Thank you for your careful (including some bonus highly-abstract!) thoughts.

Regarding your last point, my aim is to just let actual ndarrays passed in determine the type (including "didn't match any, go to last-given"). Then the Perl scalars would get converted appropriately.

I could imagine allowing overrides of type-selection for a given xform (maybe with a TypeSelectCode, like RedoDims), but I'm quite skeptical that the juice is worth the squeeze.

mohawk2 · 2025-01-05T03:40:42Z

Interestingly, a naive implementation of restoring that pre-2.096 behaviour (if given any non-available) broke some tests because of the Perl scalar issue: $x **= 1 forces to double even if $x was complex (it got converted to double, which is obviously wrong). I had to adjust it so it's only if the non-available type was greater than the last-given to not cause such breakage.

moocow-the-bovine · 2025-01-05T11:16:56Z

only if the non-available type was greater

that makes sense to me in terms of a "least upper bound" datatype for the operation.

moocow-the-bovine closed this as completed Jan 3, 2025

mohawk2 mentioned this issue Jan 4, 2025

Type-selection of xform is wrong when has complex Pars, and converted [o] Pars in flowing xforms don't work PDLPorters/pdl#511

Open

11 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

github actions failing under PDL-2.096 #18

github actions failing under PDL-2.096 #18

moocow-the-bovine commented Jan 2, 2025 •

edited

Loading

moocow-the-bovine commented Jan 2, 2025

moocow-the-bovine commented Jan 2, 2025 •

edited

Loading

mohawk2 commented Jan 2, 2025

mohawk2 commented Jan 2, 2025

moocow-the-bovine commented Jan 2, 2025

mohawk2 commented Jan 2, 2025

moocow-the-bovine commented Jan 2, 2025 •

edited

Loading

moocow-the-bovine commented Jan 3, 2025 •

edited

Loading

mohawk2 commented Jan 3, 2025 •

edited

Loading

moocow-the-bovine commented Jan 3, 2025

mohawk2 commented Jan 3, 2025

mohawk2 commented Jan 5, 2025

moocow-the-bovine commented Jan 5, 2025

github actions failing under PDL-2.096 #18

github actions failing under PDL-2.096 #18

Comments

moocow-the-bovine commented Jan 2, 2025 • edited Loading

moocow-the-bovine commented Jan 2, 2025

moocow-the-bovine commented Jan 2, 2025 • edited Loading

mohawk2 commented Jan 2, 2025

mohawk2 commented Jan 2, 2025

moocow-the-bovine commented Jan 2, 2025

mohawk2 commented Jan 2, 2025

moocow-the-bovine commented Jan 2, 2025 • edited Loading

signature (ccs_accum_bor)

minimum broken example

xforms

moocow-the-bovine commented Jan 3, 2025 • edited Loading

mohawk2 commented Jan 3, 2025 • edited Loading

moocow-the-bovine commented Jan 3, 2025

mohawk2 commented Jan 3, 2025

mohawk2 commented Jan 5, 2025

moocow-the-bovine commented Jan 5, 2025

moocow-the-bovine commented Jan 2, 2025 •

edited

Loading

moocow-the-bovine commented Jan 2, 2025 •

edited

Loading

moocow-the-bovine commented Jan 2, 2025 •

edited

Loading

moocow-the-bovine commented Jan 3, 2025 •

edited

Loading

mohawk2 commented Jan 3, 2025 •

edited

Loading