Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[compiler] replace Begin with Let #14068

Merged
merged 13 commits into from
Feb 8, 2024
Merged

Conversation

patrick-schultz
Copy link
Collaborator

@patrick-schultz patrick-schultz commented Dec 4, 2023

Remove the Begin node, as its behavior can now be represented by the Let node. Besides removing redundant nodes, this will also make the new ssa-style text representation simpler.

The Begin node emmitter performed method splitting, emitting groups of 16 children in seperate methods. This preserves that behavior by doing a similar optimization in the Let emitter. This is a significant change in how we split generated code into methods, so we should watch out for how this affects things.

# Conflicts:
#	hail/src/main/scala/is/hail/expr/ir/Children.scala
#	hail/src/main/scala/is/hail/expr/ir/Copy.scala
#	hail/src/main/scala/is/hail/expr/ir/Emit.scala
#	hail/src/main/scala/is/hail/expr/ir/TypeCheck.scala
@patrick-schultz patrick-schultz force-pushed the end-of-begin branch 2 times, most recently from 8d8699b to e4acd13 Compare January 26, 2024 19:48
@patrick-schultz patrick-schultz marked this pull request as ready for review February 2, 2024 16:10
ehigham
ehigham previously requested changes Feb 5, 2024
Copy link
Member

@ehigham ehigham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! One less IR node :)

My comments are mostly around emitLetBindings - I can't dont really understand what it's doing and feels too low-level. Keeping track of three ints may be too much for me.

You could consider using the mark flag on BaseIR to skip emitting certain nodes - that might make the logic a bit easier to read.

FastSeq[ParamType](classInfo[Region]),
UnitInfo,
)
var newEnv = env
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This env needs to be restricted to fields and formal parameters only.
We may not bind any locals at present, but I can imagine if we did and they were used across a split, we'd get a nasty runtime error

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, make this a param of emitChunkWrapped - this ref copy just looks ugly

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This env needs to be restricted to fields and formal parameters only. We may not bind any locals at present, but I can imagine if we did and they were used across a split, we'd get a nasty runtime error

The env must have this property already, as we currently emit parts of the ir into separate methods, and simply pass in the current env. See for instance Emit.emitInSeparateMethod.

Also, make this a param of emitChunkWrapped - this ref copy just looks ugly

I'm not sure what you mean. The newEnv var is so that we can smuggle out the update env from emitChunk out of voidWithBuilder.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see.

Comment on lines +3660 to +3661
chunkStart: Int,
pos: Int,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's pos for? There seems to be three ints we're tracking at once when wouldn't two be enough to define the window of the bindings to emit in one group?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added explanatory comments.

val (curName, curIR) = let.bindings(pos)

// skip over unused streams
if (curIR.typ.isInstanceOf[TStream] && !uses.contains(curName)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've testing for some uses in more than two places, can these be folded into one?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think only by using an array builder to hold the current chunk, instead of just slicing into the bindings seq. uses is just a set, so testing for membership is fast.

ehigham
ehigham previously requested changes Feb 6, 2024
import is.hail.types.virtual.TVoid

object IsPure {
def apply(x: IR): Boolean = x match {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Die?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't actually treat Die as side effecting. For instance, in

GetField(
  MakeStruct(foo = I32(5), bar = Die())
  "foo")

we will prune "bar", even though it changes the program from dying to returning a value.

Formally, our semantics is slightly non-deterministic: there is at most one correct value result, and any number of correct exception results, and running the program is allowed to produce any of those behaviors.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems like a bug to me

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only if it violates the semantics of the given IR. We intentionally use a semantics for which this is not a bug. Preserving all potential errors would be a huge burden on the optimizer. For instance, one important optimization is pushing computation to happen after a filter. But if the computation could possibly error (eg divide by zero), then that's an invalid optimization. Or if a big struct is computed, then only some of its fields are used, we'd be forced to compute all the unused fields in case they error.

@danking danking merged commit 59fdaac into hail-is:main Feb 8, 2024
8 checks passed
@patrick-schultz patrick-schultz deleted the end-of-begin branch April 11, 2024 10:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants