Question: parsing operators and newlines #134

nikomatsakis · 2022-02-18T09:15:49Z

How to think about binary operators and newlines? Rust had the same issue to wrestle with and I suspect we want the same general answer. I'm referring to things like this:

fn foo() -> {
    if true { 1 } else { 2 }
    -5 # probably wants to return `-5`, not `-4`
}

fn foo() -> {
    a = if true { 1 } else { 2 }
    -5 # probably wants to return `-5` and set `a` to 1
}

fn foo() -> {
    a = (if true { 1 } else { 2 }
    -5) # probably wants to set `a` to `-4` and return `()`? Not sure.
}

fn foo() -> {
    a = if false { 1 } else { 2
    -5} # probably wants to set `a` to `-3`, but I'm not entirely sure, especially since the next example...
}

async fn foo() -> {
    a = if false { 1 } else { print(2).await
    -5} # ...probably wants to print the number `2` and then set `a` to `-5`, and not try to subtract `5` from `()`
}

The rule I propose:

Binary operators cannot be preceded by a newline

So that you have to write b - \n 5 and not b \n - 5. That'd be a very simple rule.
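To make the proposed rule concrete, here is a minimal sketch in Python (not Dada's actual parser). It assumes a toy language with only integer literals, `+`, and `-`, where a plain integer stands in for a statement-like expression such as `if true { 1 } else { 2 }`. The names `tokenize`, `parse_block`, `parse_operand`, and `parse_expr` are hypothetical.

```python
import re

# A token is (text, preceded_by_newline). Only integers, `+`, and `-` exist.
TOKEN = re.compile(r"(\s*)(\d+|[+-])")


def tokenize(src):
    tokens, pos = [], 0
    src = src.rstrip()
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if m is None:
            raise SyntaxError(f"unexpected input at offset {pos}")
        # Remember whether a newline appeared in the whitespace before the token.
        tokens.append((m.group(2), "\n" in m.group(1)))
        pos = m.end()
    return tokens


def parse_operand(tokens, i):
    if i >= len(tokens):
        raise SyntaxError("expected an operand")
    text, _ = tokens[i]
    if text == "-":  # unary minus
        value, i = parse_operand(tokens, i + 1)
        return -value, i
    if text == "+":
        raise SyntaxError("unexpected `+`")
    return int(text), i + 1


def parse_expr(tokens, i):
    value, i = parse_operand(tokens, i)
    while i < len(tokens):
        text, after_newline = tokens[i]
        # The rule: an operator preceded by a newline never continues
        # the current expression as a binary operator.
        if text not in ("+", "-") or after_newline:
            break
        rhs, i = parse_operand(tokens, i + 1)
        value = value + rhs if text == "+" else value - rhs
    return value, i


def parse_block(src):
    """Return the value of every expression in the sequence, in order."""
    tokens, i, values = tokenize(src), 0, []
    while i < len(tokens):
        value, i = parse_expr(tokens, i)
        values.append(value)
    return values
```

Under these assumptions, `parse_block("1 -\n5")` yields one expression with value `-4`, while `parse_block("1\n-5")` yields two expressions, `1` and `-5`.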

Other rules I can imagine:

Statement-like expressions (e.g., if), when followed by a newline, do not accept binary operators.

But I'd rather not have to reason like that, it makes the grammar really complex.

Originally posted by @nikomatsakis in #129 (comment)


nikomatsakis · 2022-02-18T09:15:58Z

cc @xffxff

nikomatsakis · 2022-02-18T09:18:51Z

Note that the rule I proposed would make this code:

fn foo() -> {
    a = (if true { 1 } else { 2 }
    -5) # probably wants to set `a` to `-4` and return `()`? Not sure.
}

set a to -5, and discard the if result.

Ah, I just remembered that I think I generally permitted newlines inside of vectors and things without a comma (I should write some tests for that...), so this would fit with that. e.g. this is legal dada right now (playground)

fn subtract(a, b) {
    a - b
}

fn main() {
    print(subtract(
        5
        3
    )).await #! OUTPUT 2
}

and hence:

fn subtract(a, b) {
    a - b
}

fn main() {
    print(subtract(
        5
        - 3
    )).await #! OUTPUT 8
}

nikomatsakis · 2022-02-18T09:19:14Z

My thinking was that we can just avoid the whole "trailing ," question altogether and use newlines. Not sure if that was a good idea. =)

brson · 2022-02-21T06:01:34Z

Given

Binary operators cannot be preceded by a newline

then

fn foo() -> {
    a = (if true { 1 } else { 2 }
    -5) # probably wants to set `a` to `-4` and return `()`? Not sure.
}

doesn't seem like it would parse, unless (Expr Expr) parses - is it going to? That would make blocks and parens, (Expr Expr) and {Expr Expr} .... the same?

Having the grammar be newline-sensitive sure doesn't appeal to me much - I didn't realize Rust did this. (edit: but now that I think about it this is probably the special rule about parsing control structures I always knew Rust had but couldn't remember the details of).

This problem seems similar to the disambiguation of tuples and function calls in #117, and could be solved the same way, where an opening paren in a function call can't be split onto a new line.

brson · 2022-02-21T06:19:29Z

fn foo() -> {
    a = (if true { 1 } else { 2 }
    -5) # probably wants to set `a` to `-4` and return `()`? Not sure.
}

Cases like this sure do look confusing.

The rules could be different inside { } vs inside ( ) or the compiler could lint against it inside ( ) in a way that would persuade people never to write such code.

brson · 2022-02-21T06:24:12Z

Another seeming solution to the binops case in particular would be to require binops to always be space-delimited, and unary ops not: 1 - 2 vs -2.
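One possible reading of this suggestion, sketched in Python: decide whether a `-` is binary or unary purely from the whitespace around it. The function name `classify_minus` and the three labels are hypothetical, and treating the start or end of the input as whitespace is an assumption of this sketch rather than something the thread specifies.

```python
def classify_minus(src, i):
    """Classify the `-` at index i as 'binary', 'unary', or 'error'."""
    assert src[i] == "-"
    # Assumption: the edges of the input count as whitespace.
    space_before = i == 0 or src[i - 1].isspace()
    space_after = i + 1 == len(src) or src[i + 1].isspace()
    if space_before and space_after:
        return "binary"  # `1 - 2`
    if not space_after:
        return "unary"   # `-2`, `1 -2`
    return "error"       # `1- 2`: neither clearly binary nor unary
```

With this reading, the `-` in `"2 }\n-5"` is classified as unary, because a newline counts as whitespace before it and no whitespace follows it. A real parser would still have to reject a `binary` result that has no left operand.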

brson · 2022-02-21T07:17:34Z

Another newline-sensitive solution that might handle multiple cases in this issue is that for every sequence of expressions both newlines and commas act as separators, with the separators having precedence over continuing to parse the current expression.
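A rough Python sketch of that idea, assuming separators are found before any expression parsing happens. The helper name `split_items` is hypothetical; it ignores nested parentheses and braces and silently drops empty pieces, both of which a real implementation would need to handle.

```python
import re


def split_items(body):
    """Split the inside of `( ... )` on commas and newlines.

    Separators win over expression continuation: each piece is later
    parsed as an expression on its own.
    """
    pieces = re.split(r"[,\n]", body)
    return [piece.strip() for piece in pieces if piece.strip()]
```

For the `subtract` call shown earlier in the thread, `split_items("\n    5\n    - 3\n")` gives `["5", "- 3"]`, so the call receives two arguments. The same precedence means `foo\n+ bar` splits into `foo` and `+ bar`, and the second piece is not a valid expression by itself.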

nikomatsakis · 2022-02-26T11:21:51Z

@brson

Rust doesn't make the grammar newline sensitive, but it distinguishes uses of things like if ... { } else { } in "statement position" from elsewhere. It further requires that a "statement-like" if (etc) has () type. That's why this program doesn't type check.

doesn't seem like it would parse, unless (Expr Expr) parses - is it going to? That would make blocks and parens, (Expr Expr) and {Expr Expr} .... the same?

Good point, I think that I meant to have () behave differently with respect to newlines than other things.

nikomatsakis · 2022-02-26T11:24:28Z

I'll have to ponder the other suggestions. I also don't love whitespace or newline-sensitive grammars, but I think it's worth trying to not have ;. It leads to some interesting places. I would like to have the grammar be "minimally" whitespace sensitive -- I think rules like "cannot be separated by whitespace" (e.g., - 5 and -5 are not the same) or "cannot have a newline" are ok. I would not want more than that because I love the ability to have an "autoformat on save" just clean up a bunch of gook I just wrote and have things line up correctly. When using Python a lot, I also found that it was easy for me to lose indentation when copy-pasting or at other times, and that could be quite confusing to debug.

brson · 2022-02-28T05:07:22Z

I think we might as well implement the rule you suggest, at least for now. I'd love to get the reference grammar and production grammar in agreement so they can be kept in sync forever. Parol has some ability to turn on and off newline sensitivity based on context, so I think it should be able to handle the rule.

Just one more thing to point out: it's been a long time since I read Code Complete but one bit that has stuck with me is the suggestion that splitting binops to a new line before the op reads better than splitting after the op. That is this:

let x = foo
    + bar
    - baz
    / qux

is easier to scan than

let x = foo +
    bar -
    baz /
    qux

and the proposed rule makes that formatting not possible.
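If the rule were adopted, a formatter could mechanically rewrite the first layout into the second. Below is a minimal Python sketch of such a pass. The function name `to_trailing_operators` is hypothetical, and the sketch assumes that a continuation line starting with an operator followed by a space is always meant as a binary operator, which is exactly the judgment the rule takes away from the parser.

```python
def to_trailing_operators(lines):
    """Move an operator that starts a line to the end of the previous line."""
    out = []
    for line in lines:
        stripped = line.lstrip()
        if out and stripped[:2] in ("+ ", "- ", "* ", "/ "):
            indent = line[: len(line) - len(stripped)]
            # Append the operator to the previous line...
            out[-1] = out[-1].rstrip() + " " + stripped[0]
            # ...and keep the operand, with its indentation, on this line.
            out.append(indent + stripped[2:])
        else:
            out.append(line)
    return out
```

Applied to the four lines of the first layout above, this produces the four lines of the second layout.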

nikomatsakis · 2022-02-28T10:03:59Z

Big +1 to getting ref / actual grammar in sync.

I did consider that the rule would mean you can't move operators to the start of the line. I thought that style wasn't as popular for some reason, but checking rustfmt's output, it does at least move operators to the beginning of the line (example).

One other consideration: requiring that binary operators be separated by whitespace would resolve the foo<T> vs foo < T ambiguity as well, right?
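A small Python sketch of how that disambiguation might look, assuming a `<` with whitespace on both sides is a comparison and a `<` with none on either side opens a generic argument list. The name `classify_less_than` and the labels are hypothetical, and multi-character operators such as `<=` or `<<` are not considered.

```python
def classify_less_than(src, i):
    """Classify the `<` at index i by the whitespace around it."""
    assert src[i] == "<"
    space_before = i > 0 and src[i - 1].isspace()
    space_after = i + 1 < len(src) and src[i + 1].isspace()
    if space_before and space_after:
        return "comparison"  # `foo < T`
    if not space_before and not space_after:
        return "generic"     # `foo<T>`
    return "ambiguous"       # `foo <T` or `foo< T`
```

The mixed cases are left as `ambiguous` here; whether they should be errors or resolved one way or the other is not settled in this discussion.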

nikomatsakis · 2022-03-04T10:59:09Z

Another thought that I had:

Maybe if true { ... } else { ... } and friends should just always require parentheses if you plan to apply an operator to them? I feel like it's kind of hard to read anyway. Some examples:

fn foo() -> {
    if true { 1 } else { 2 } - 5
    (if true { 1 } else { 2 }) - 5
}

fn foo() -> {
    if true { 1 } else { 2 }.share
    (if true { 1 } else { 2 }).share
}

Not sure, the parens don't look great. Going to leave this comment for posterity's sake at least though. :)

nikomatsakis added the question Further information is requested label Feb 18, 2022

vemoo mentioned this issue Aug 8, 2022

Dada reference grammar #17

Open
