Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve type-checking #19

Merged
merged 20 commits into from
Dec 16, 2024
Merged

Improve type-checking #19

merged 20 commits into from
Dec 16, 2024

Conversation

InnocentZero
Copy link
Contributor

Adds type-checking to the following cases:

  • block outputs (reflect the type of the last statement, none otherwise)
  • parameters in def (any if not specified, else the specified type)
  • if else (reflect the type of the blocks if the blocks are equal, if they are unequal then forbidden, and the if branch type otherwise)

Signed-off-by: innocentzero <[email protected]>
Signed-off-by: innocentzero <[email protected]>
Signed-off-by: innocentzero <[email protected]>
Signed-off-by: innocentzero <[email protected]>
Both of them are redundant now given that blocks
return the type of the last statement

Signed-off-by: innocentzero <[email protected]>
@InnocentZero
Copy link
Contributor Author

Updates:

  • Params now has a None type, since they are not expected to have types anyways.
  • Block and BLOCK_TYPE are now redundant, since blocks return the type of the last statement

Signed-off-by: innocentzero <[email protected]>
@kubouch
Copy link
Collaborator

kubouch commented Nov 27, 2024

About the if-else, if they have different types, it should be allowed and the type should be oneof<a, b>. For example,

if $cond { 4 } else { 'a' }

should have type oneof<int, string>. We can start just by casting it to any, but it definitely shouldn't throw an error.

Params having None type sounds good. Block having the last type also seems good.

@InnocentZero
Copy link
Contributor Author

Yeah, I realised that much later while going through docs and messing around in the shell, but that'd require some deeper changes. Anyways, I'll revert those changes in the next commit and hopefully add some more checks.

Signed-off-by: innocentzero <[email protected]>
Implements OneOf for if-else and def nodes

Signed-off-by: innocentzero <[email protected]>
Signed-off-by: innocentzero <[email protected]>
Signed-off-by: innocentzero <[email protected]>
Signed-off-by: innocentzero <[email protected]>
Signed-off-by: innocentzero <[email protected]>
@InnocentZero
Copy link
Contributor Author

Updates:

  • Removed the if-else checks since they were basically wrong. Now it only checks for the condition returning a boolean
  • Added a OneOf(...) type that stores an index. We use that index in Typechecker::oneof_types to get the corresponding BTreeSet of types.
  • if-else return type now supports OneOf if there are more that one types being returned. Else it will return the singleton type
  • for loop and while loop typechecking has been added
  • Added an alias test because I wanted to see the behaviour of the call node
  • Typechecking for match node is now supported. Tests, however, suggest that it's not being parsed properly atm.

@InnocentZero
Copy link
Contributor Author

InnocentZero commented Nov 29, 2024

@kubouch should I attempt to fix match parsing or would you like to check and merge it rn? The PR looks decently sized and implements a couple of major changes, so I'd want to get a review if possible.

@kubouch
Copy link
Collaborator

kubouch commented Dec 8, 2024

Let's not add too much stuff to it, the match parsing can be left for another PR.

The summary seems good, I'll try to take a look whenever possible. Any specific reason to use BTreeSet over a plain Vec?

@InnocentZero
Copy link
Contributor Author

BTreeSet makes it easier to keep unique elements. I did not want a oneof<int, int> kind of situation.

@kubouch
Copy link
Collaborator

kubouch commented Dec 8, 2024

I see, yeah, a set makes sense in this context. Any reason for the BTreeSet over a regular HashSet?

Signed-off-by: innocentzero <[email protected]>
@InnocentZero
Copy link
Contributor Author

My bad, I was working on something else that required BTreeSet and forgot that HashSet exists when I switched back to working on this.

@InnocentZero InnocentZero mentioned this pull request Dec 8, 2024
Signed-off-by: innocentzero <[email protected]>
@InnocentZero
Copy link
Contributor Author

InnocentZero commented Dec 8, 2024

I'm not sure what to make of the CI tests :/

EDIT: Nvm, figured out. I'd sort the output before printing it.

Signed-off-by: innocentzero <[email protected]>
tests/alias.nu Outdated Show resolved Hide resolved
16: unknown
17: unknown
18: unknown
14: oneof<(), unknown>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The type of missing if branch is nothing. So, if $spam { 1 } should have a return type of oneof<int, nothing>.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That () is from the else branch. The if branch does not have a type because continue and break are not checked/parsed properly.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I meant the else branch. It should be nothing, not () coming from the missing else branch. Try if false { 1 } | describe in Nushell, it'll print nothing, so we should match the behavior.

This is used to indicate the type status in case of an error to avoid
ambiguities around the usage of other types

Signed-off-by: innocentzero <[email protected]>
10: unknown
11: unknown
12: ()
9: list<unknown>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Param should also be forbidden since it won't be directly evaluated?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking we can do some better typechecking by checking each param and if it is type compatible to the argument of the function once we have the facilities for that.

} => {
self.typecheck_node(range);

// We don't need to typecheck variable after this
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not? (The comment is not very descriptive.)

Copy link
Contributor Author

@InnocentZero InnocentZero Dec 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My bad. Basically the variable's type is resolved when we typecheck the list. I think the comment is positioned wrong.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not to mention, the variable's type is being set by us after checking the type of the range. We're setting the type in the for node itself, not before.

Error (NodeId 26): incomplete expression
Error (NodeId 27): incomplete expression
Error (NodeId 28): expected: right paren ')'
Error (NodeId 30): incomplete expression
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like it's cropped?

Copy link
Contributor Author

@InnocentZero InnocentZero Dec 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed and ran cargo insta, seems like the output is the same according to git.

self.error("Blocks in looping constructs cannot return values", block);
}

self.set_node_type_id(node_id, NOTHING_TYPE);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To match the current behavior of Nushell (i.e., for i in 0..8 { print $i } | describe throwing an error because for is a statement), we should probably make it into a none type.

However, in the future, we could perhaps allow for to be an expression, and you'd be able to break a value from it, e.g., let x = for i in 1..8 { if $ > 3 { break ($i * $i) } }. In that case, nothing type is appropriate for a loop without value breaks. This applies to while and loop as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the second part, would it not be a problem if the for loop never evaluates with a list at runtime? Or do you plan to keep it oneof<nothing, type>?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now I've gone with none

src/typechecker.rs Outdated Show resolved Hide resolved
@kubouch
Copy link
Collaborator

kubouch commented Dec 15, 2024

OK, some minor comments, but otherwise looks good.

Since params are not supposed to be evaluated,
their type is changed to Forbidden

Signed-off-by: innocentzero <[email protected]>
Match not returning anything should be a nothing
type as it is an expression. The type for a for
loop is now None, which matches the current
nushell implementation.

Signed-off-by: innocentzero <[email protected]>
@kubouch
Copy link
Collaborator

kubouch commented Dec 16, 2024

OK, let's try it, thanks!

I think the match parsing might be borked and you did some fixes on it in your other branch, so let's roll with it.

@kubouch kubouch merged commit eba4f27 into nushell:main Dec 16, 2024
4 checks passed
@InnocentZero InnocentZero deleted the typechecking branch December 17, 2024 07:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants