Discarded dataflow errors make for very hard to debug errors - shall we promote them to Panics? #6989

radeusgd · 2023-06-07T17:17:43Z

radeusgd
Jun 7, 2023
Collaborator

The story

I just spent about an hour debugging a super stupid mistake:

         Test.specify "should fail if the source table is missing some columns and the column in the target has no default value" <|
             dest_name = Name_Generator.random_name "table-notnull"
-            connection.execute_update 'CREATE TEMPORARY TABLE "' + dest_name + '" (Y INTEGER NOT NULL 42, X INTEGER)' . should_succeed
+            connection.execute_update ('CREATE TEMPORARY TABLE "' + dest_name + '" (Y INTEGER NOT NULL 42, X INTEGER)') . should_succeed

(the line marked - is the error, the line marked + is how I fixed it)

Simple: I forgot the brackets. But finding it was very hard. The should_succeed helper used in our test suite checks whether the thing it is applied to is a dataflow error, so I was sure the statement was succeeding.

I did not notice that the bracketing was different from what I expected. What the original code actually meant was more or less:

(connection.execute_update 'CREATE TEMPORARY TABLE "') + dest_name + ('" (Y INTEGER NOT NULL 42, X INTEGER)' . should_succeed)

Let's unpack it.

(connection.execute_update 'CREATE TEMPORARY TABLE "') evaluated to a dataflow error. Then + is called on it, which just propagates the error further. Next comes ('" (Y INTEGER NOT NULL 42, X INTEGER)' . should_succeed): a should_succeed on a Text constant, so it passes and just returns the constant, and using that as an argument to + on the errored value again just propagates the error. The whole line evaluates to a dataflow error, which is then discarded, so I never learn that my code failed.

The whole point of should_succeed is to ensure situations like these don't happen, but this time bracketing bit me. I guess one way to solve this is to use Problems.assume_no_problems <| ...rest-of-the-line... instead of should_succeed, which would be a bit more resilient to bracketing errors.
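To make the grouping issue concrete, here is a toy model in Python (not Enso, and not the real test-suite code). DataflowError, execute_update and should_succeed below are hypothetical stand-ins that only mimic the behavior described above: an error value that absorbs +, and a check that fails only when the value it receives is itself an error.

```python
class DataflowError:
    """Toy stand-in for a dataflow error: an error *value* that absorbs `+`."""

    def __init__(self, message):
        self.message = message

    def __add__(self, other):
        return self  # error + anything propagates the error

    def __radd__(self, other):
        return self  # anything + error propagates the error


def execute_update(sql):
    # Hypothetical stand-in for connection.execute_update: a statement that
    # does not end with ")" is treated as malformed and yields an error value.
    if not sql.endswith(")"):
        return DataflowError("malformed statement: " + sql)
    return 0


def should_succeed(value):
    # Assumed behavior: fail only if the value itself is an error.
    if isinstance(value, DataflowError):
        raise AssertionError(value.message)
    return value


dest_name = "table-notnull-1"
head = 'CREATE TEMPORARY TABLE "'
tail = '" (Y INTEGER NOT NULL 42, X INTEGER)'

# Grouping of the original line: the check only ever sees the Text constant,
# so nothing fails and the error value is silently dropped.
discarded = execute_update(head) + dest_name + should_succeed(tail)

# Grouping of the fixed line: the check sees the result of the whole call.
checked = should_succeed(execute_update(head + dest_name + tail))

# A check wrapped around the whole line catches the error even when the
# grouping inside it is the wrong one.
try:
    should_succeed(execute_update(head) + dest_name + tail)
    caught = False
except AssertionError:
    caught = True
```

In this model the first statement finishes without any failure, which is the silent behavior described in the story; only the checks that wrap the whole expression get a chance to see the error.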

The suggestion

However, this adventure reminds me of something I was thinking about quite a while ago: a discarded dataflow error is almost always a problem. It means that some operation that was meant to be side-effectful has failed, and failed silently. If it were a Panic, it would have propagated. Normally the result should be inspected, but sometimes one forgets to do so.

To fix this, I propose to alter the semantics such that statements that are not an assignment (or the last expression in a function which is its return value) check their result and if it is a dataflow error, they promote it into a Panic, possibly wrapped in a Discarded_Dataflow_Error.Panic wrapper. A discarded error is basically always an issue, so it should be indicated somehow. What do you think about this proposal?

It would not affect the IDE, as there we basically always assign the return values. But it can save us from many hard to track down bugs.
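As a rough illustration of the proposed rule, here is a toy block evaluator in Python. It is a sketch of the semantics only, not of the Enso engine: run_block, the (target, thunk) statement encoding and DiscardedDataflowErrorPanic are all hypothetical names made up for this example.

```python
class DataflowError:
    """Toy stand-in for a dataflow error value."""

    def __init__(self, message):
        self.message = message


class DiscardedDataflowErrorPanic(Exception):
    """Hypothetical panic, named after the wrapper suggested above."""


def run_block(statements):
    """Run (target, thunk) pairs; target None marks a bare expression statement.

    The value of the last statement is the block's return value.
    """
    env = {}
    result = None
    last = len(statements) - 1
    for index, (target, thunk) in enumerate(statements):
        result = thunk(env)
        if target is not None:
            env[target] = result  # assigned: the error stays an ordinary value
        elif index != last and isinstance(result, DataflowError):
            # neither assigned nor returned: promote the error to a panic
            raise DiscardedDataflowErrorPanic(result.message)
    return result


def fail(env):
    return DataflowError("update failed")


# The error is assigned, so nothing is promoted and the block returns 42.
assigned = run_block([("r", fail), (None, lambda env: 42)])

# The error is the last expression, so it is returned as a value.
returned = run_block([(None, fail)])

# The error is a bare statement in the middle of the block: it panics.
try:
    run_block([(None, fail), (None, lambda env: 42)])
    panicked = False
except DiscardedDataflowErrorPanic:
    panicked = True
```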

Alternative suggestion

Back in the day I made an even more drastic suggestion; I don't remember what the outcome was. The suggestion was to Panic whenever a method returns a value other than Nothing and that value is discarded. I.e. if I do:

fun x = x + 10
myfun x =
    fun 10
    42

Then the fun 10 invocation should fail, because I forgot to inspect its value. Side-effectful operations usually return Nothing, so discarding a Nothing should be allowed. But any other value suggests a programming error.

Of course, the programmer can always indicate that this is intentional:

fun x = x + 10
myfun x =
    _ = fun 10
    42
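The stricter rule can be modeled the same way. The Python sketch below is only a toy: run_block, DiscardedValuePanic and the statement encoding are hypothetical, Python's None plays the role of Nothing, and the target "_" plays the role of the explicit _ = ... opt-out.

```python
class DiscardedValuePanic(Exception):
    """Hypothetical panic for a discarded value other than Nothing."""


def run_block(statements):
    """Run (target, thunk) pairs; target None marks a bare expression statement.

    Any other target, including "_", counts as an assignment.
    The value of the last statement is the block's return value.
    """
    result = None
    last = len(statements) - 1
    for index, (target, thunk) in enumerate(statements):
        result = thunk()
        if target is None and index != last and result is not None:
            raise DiscardedValuePanic(repr(result))
    return result


def fun(x):
    return x + 10


log = []

# `fun 10` as a bare statement: its result is discarded, so the block panics.
try:
    run_block([(None, lambda: fun(10)), (None, lambda: 42)])
    panicked = False
except DiscardedValuePanic:
    panicked = True

# `_ = fun 10`: the discard is explicit, so the block returns 42.
explicit = run_block([("_", lambda: fun(10)), (None, lambda: 42)])

# A side-effecting call that returns None (our Nothing) may be discarded.
side_effect = run_block([(None, lambda: log.append("x")), (None, lambda: 42)])
```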

Other solutions?

What do you think? IMO this is a problem. Currently, silently discarding values, especially dataflow errors (other values too, though less problematically), can lead to bugs and programming errors that are really hard to track down. Making sure that return values, especially errors, are inspected would make it easier to find bugs and write correct programs.

Maybe you have some other suggestions for how we could solve this problem?

Akirathan · 2023-06-08T09:48:14Z

Akirathan
Jun 8, 2023
Collaborator

Your suggestion seems difficult to implement, at least at first sight. Checking whether a return value is a dataflow error is easy on the engine side, but checking whether the return value is assigned means either traversing the hierarchy of Truffle node parents until you reach an AssignmentNode, or keeping a special flag during the codegen phase (Truffle node creation) that indicates whether the nodes currently being created are inside an assignment. This feels very difficult and error-prone.

What about a compromise: add a new linter pass that reveals unused return values and turns them into warnings? Similar linter rules exist in other languages.
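For a feel of what such a pass could look like, here is a minimal sketch written against Python's own ast module rather than the Enso compiler. unused_call_results is a hypothetical helper; it reports every call that appears as a bare statement, and since it has no type information it cannot tell a call that returns nothing from one that returns a value.

```python
import ast


def unused_call_results(source):
    """Return (line, text) for every call whose result is a bare statement."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        # ast.Expr is an expression used as a statement, i.e. a discarded value.
        if isinstance(node, ast.Expr) and isinstance(node.value, ast.Call):
            findings.append((node.lineno, ast.unparse(node.value)))
    return sorted(findings)


source = """\
def myfun(x):
    fun(10)
    print(x)
    _ = fun(10)
    return 42
"""

warnings = unused_call_results(source)
```

Here warnings contains both fun(10) on line 2 and print(x) on line 3, while the explicitly discarded call on line 4 is not reported.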

1 reply

radeusgd Jun 12, 2023
Collaborator Author

> What about a compromise: add a new linter pass that reveals unused return values and turns them into warnings? Similar linter rules exist in other languages.

If we can have such a compile-time check, we can also insert, in the very same place, a node that verifies it at runtime. We do not have to traverse the Truffle node hierarchy to do so; we just need to insert the needed check at compile time. So IMO it is very well possible to implement.

The problem with the linter pass is that our type system is not powerful enough to know whether the call foo x will return Nothing or something else, so the linter pass that you are suggesting would need to warn on every call whose result is not assigned to anything. This will make us prepend _ = ... to all such calls, effectively defeating the purpose of this pass, because it will then never fire and never tell us about the discarded values.

The whole point of this check is to detect places where we forgot to do something with the return value. The linter pass idea could work, but only if we had powerful enough type inference, which we do not have right now.
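The idea of inserting the runtime check at compile time, in the same place where a linter would have warned, can be sketched with Python's ast module. This is an analogy only and says nothing about how the Enso compiler or its Truffle nodes are structured; InsertDiscardChecks, _check_discarded, compile_with_checks and DiscardedErrorPanic are hypothetical names.

```python
import ast


class DataflowError:
    """Toy stand-in for a dataflow error value."""

    def __init__(self, message):
        self.message = message


class DiscardedErrorPanic(Exception):
    """Hypothetical panic raised for a discarded error value."""


def _check_discarded(value):
    # Runtime check: only an actual error value triggers the panic.
    if isinstance(value, DataflowError):
        raise DiscardedErrorPanic(value.message)


class InsertDiscardChecks(ast.NodeTransformer):
    """Wrap every call used as a bare statement in _check_discarded(...)."""

    def visit_Expr(self, node):
        self.generic_visit(node)
        if isinstance(node.value, ast.Call):
            node.value = ast.Call(
                func=ast.Name(id="_check_discarded", ctx=ast.Load()),
                args=[node.value],
                keywords=[],
            )
        return node


def compile_with_checks(source):
    tree = InsertDiscardChecks().visit(ast.parse(source))
    ast.fix_missing_locations(tree)  # new nodes need line/column info
    return compile(tree, "<sketch>", "exec")


source = """\
kept = fail()
fail()
"""

env = {
    "fail": lambda: DataflowError("boom"),
    "_check_discarded": _check_discarded,
}
try:
    exec(compile_with_checks(source), env)
    panicked = False
except DiscardedErrorPanic:
    panicked = True
```

The assigned call on the first line runs untouched, and only the bare call on the second line is wrapped. Because the decision is made per statement during compilation, no parent-node traversal is needed at runtime in this model.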

wdanilo · 2023-06-12T23:06:05Z

wdanilo
Jun 12, 2023
Maintainer

That's a very good question. This is connected to our long-term plan for the Enso language. Basically, we should remove panics altogether and keep only dataflow errors ...

However, the question has always been whether we can do it in a performant way in the engine. We added panics only to be able to create low-level libraries quickly, but dataflow errors are a much more natural way of expressing what we want in Enso. The following ideas would then apply:

1. If a dataflow error is not handled, eg. it is assigned to a variable that is not used or is passed to a primitive function (like print), then it should propagate up and be attached to the result of the parent function output.
2. All dataflow errors attached to the output of "main" should be reported to the user.
3. There should be an operator ? introduced which should break function execution when a dataflow error occurs, allowing breaking evaluation of functions immediately, just as panics do.
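As a rough model of the third idea, the Python sketch below emulates a ?-style early exit on top of error values. Everything in it is hypothetical (q, propagates, parse_int, add), and it uses a Python exception internally only as a way to unwind the current function; it is not a statement about how the engine would implement the operator.

```python
import functools


class DataflowError:
    """Toy stand-in for a dataflow error value."""

    def __init__(self, message):
        self.message = message


class _EarlyReturn(Exception):
    """Internal signal used to leave the enclosing function early."""

    def __init__(self, error):
        self.error = error


def q(value):
    """Model of `?`: pass a normal value through, abort on an error value."""
    if isinstance(value, DataflowError):
        raise _EarlyReturn(value)
    return value


def propagates(fn):
    """Turn an early exit back into an ordinary returned error value."""

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        try:
            return fn(*args, **kwargs)
        except _EarlyReturn as early:
            return early.error

    return wrapper


def parse_int(text):
    if not text.isdecimal():
        return DataflowError("not a number: " + text)
    return int(text)


@propagates
def add(a, b):
    # Each q(...) stops the function at the first error it meets.
    return q(parse_int(a)) + q(parse_int(b))


total = add("1", "2")
failed = add("x", "2")
```

The caller of add still receives the error as a plain value, so whether to inspect it or apply the operator again remains the caller's choice.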

If we are able to do the above steps in a performant way, we should consider improving Enso and removing panics altogether. I was hoping that we would not need to revisit this topic so early, but based on @radeusgd's post, I feel this is something we should start thinking about.

What does it look like on the engine side? Could something like that be implemented in a performant way, or would we need some kind of advanced static analysis for it?

3 replies

farmaazon Jun 13, 2023
Collaborator

Ad 1. As I understand it, in this case a dataflow error becomes a sort of warning. How should we handle it in the IDE? Currently, we treat a dataflow error as the sole value of a node, and do not allow showing a visualization on such an erroneous node.
Ad 3. How should ? behave when a function returns a value (so it computed something) with dataflow errors attached?

wdanilo Jun 15, 2023
Maintainer

Ad 1. I don't fully understand the question. The error will be propagated by the engine automatically, but it should still be visible on the node. In such a case, the GUI should just display it.
Ad 3. A dataflow error is not a warning attached to data. Dataflow errors are like Rust Result values, so you can return the error from a function without a result.

farmaazon Jun 15, 2023
Collaborator

> If a dataflow error is not handled, eg. it is assigned to a variable that is not used or is passed to a primitive function (like print), then it should propagate up and be attached to the result of the parent function output.

Here you mention "attaching" it. By attaching, I understand that the (parent) function returns an output value (successfully) with attached, non-handled dataflow error. This suggests the function returns more like Result<(T, Vec<Err>), Err> than Result<T, Err>, making my question in point 3 still valid.

Again in point 2 you also mention "output with attached dataflow error", not just "returned dataflow error".
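The two result shapes being contrasted here can be written down as plain data types. The Python sketch below is only one possible reading of "attached"; Ok, Err and OkWithAttached are hypothetical names, and the values are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class Err:
    """An error returned instead of a value."""

    message: str


@dataclass
class Ok:
    """A value with nothing else attached, as in Result<T, Err>."""

    value: Any


@dataclass
class OkWithAttached:
    """A value plus unhandled errors, as in the Ok side of Result<(T, Vec<Err>), Err>."""

    value: Any
    attached: List[Err] = field(default_factory=list)


# Reading A: the parent function fails as a whole and returns only the error.
plain = Err("write failed")

# Reading B: the parent function computed 42, and the unhandled error from a
# discarded inner operation rides along with that value.
attached = OkWithAttached(42, [Err("write failed")])
```

Under reading A there is nothing ambiguous for ? to decide, since a result is either a value or an error. Under reading B the question from point 3 stands: the sketch deliberately does not define what ? should do with a value that carries attached errors.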
