Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SSA is upon us... #33

Open
pfalcon opened this issue Nov 16, 2018 · 6 comments
Open

SSA is upon us... #33

pfalcon opened this issue Nov 16, 2018 · 6 comments

Comments

@pfalcon
Copy link
Owner

pfalcon commented Nov 16, 2018

So, I pushed initial support for conversion to SSA, e.g. tests: 72a047e . To remind, I wanted to postpone introduction of SSA, to not fall into common trap when people treat as some magic device, without good understanding of the frontier where it really becomes needed, how to deal with it, what drawbacks it has, etc.

All this time I (in the background) pumped up my understanding and well, now I hoped I grasped the essence of its construction, that the one true SSA is the maximal SSA, its construction is simple and devoid of any arcane knowledge, and any other form is nothing but implementation-level optimizations, and they're thus strictly optional.

Going beyond construction is definitely on next TODOs, but first syntax for phi functions need to be set. How it's done currently (examples above) is in a textbook way, where phi just takes list of values, in the same order as predecessors (and that order is of course stable in ScratchABlock, partly exactly because of anticipation of phi() introduction).

That should work pretty well for intermediate output, but of course barely suitable as input, where predecessors aren't so explicit, and their order shouldn't have to be clear/stable either. And the input case is of course important (interoperability with LLVM is definitely on the table).

So, there're two facets of it:

  1. External syntax, i.e. how it's rendered in PseudoC source program.
    This can be as simple as normal C fuction syntax, e.g. phi(label1, $r0_1, label2, $r0_2), or something fancier, e.g. phi(label1: $r0_1, label2: $r0_2). Of course, something fancier is definitely not compatible with C syntax, and staying within its bounds is the aim of PseudoC. But then phi() is not an executable function, so having not a C syntax is somehow not completely crazy idea.
  2. Internal syntax, i.e. should there be bblock addresses retained as arguments of phi's EXPR, or not? Common sense says it's superfluous, because in internal representation, there's no ambiguity. But perhaps as a debugging measure still may be useful?

For reference, LLVM's syntax:

%indvar = phi i32 [ 0, %LoopHeader ], [ %nextindvar, %Loop ]

That order of bblock addr vs value I don't like. The most important thing in phi is not value, but the fact that it depends on predecessor block, so it would rather go first.

$indvar = phi(LoopHeader, 0, Loop, $nextindvar)
$indvar = phi(LoopHeader: 0, Loop: $nextindvar)

Well, to make it more C-like, and still preserve pair-ness, could do:

$indvar = phi((LoopHeader, 0), (Loop, $nextindvar))

Well, that's still not really a valid C syntax. Then could do:

$indvar = phi(_(LoopHeader, 0), _(Loop, $nextindvar))

But that's really going down the rabbit hole...

@pfalcon
Copy link
Owner Author

pfalcon commented Nov 16, 2018

@maximumspatium, Comments are welcome.

@maximumspatium
Copy link
Contributor

maximumspatium commented Nov 17, 2018

@maximumspatium, Comments are welcome.

Paul, there is not much I can add to your elaboration. As you said, the LLVM syntax is incompatible with C syntax and rather cumbersome to read. The other suggestions look better, although each of them has its own pros and cons:

$indvar = phi(LoopHeader, 0, Loop, $nextindvar) doesn't show clearly that even and odd parameters belong together

$indvar = phi(LoopHeader: 0, Loop: $nextindvar) is better but that key:val syntax is typical for script languages, not C

$indvar = phi((LoopHeader, 0), (Loop, $nextindvar)) that tuple notation looks rather Python-like than C-like

$indvar = phi(_(LoopHeader, 0), _(Loop, $nextindvar)) is the best option IMHO, besides the attempt to force tuples into C-syntax by using "_".

What's about this:

$indvar = phi(LoopHeader[0], Loop[$nextindvar])

Anyway, it fully conforms to the C-syntax and doesn't introduce anything superfluous. The notion of location:variable is represented as arrays with subscripts. They can be interpreted as dict[val], too.

@pfalcon
Copy link
Owner Author

pfalcon commented Nov 17, 2018

Thanks for comments!

$indvar = phi(LoopHeader[0], Loop[$nextindvar])

Well, that's creative idea, but I'm not sure how that fares on "ease of understanding" scale ;-).

I guess, so far I lean towards just having a normal C argument syntax (comma-separated).

@pfalcon
Copy link
Owner Author

pfalcon commented Nov 18, 2018

$indvar = phi(LoopHeader[0], Loop[$nextindvar])
arrays with subscripts.

Thinking about it more, "subscript" is a keyword. But what is usually used as a subscript? Well, address is. Indeed, why scheme for naming SSA vars is subscripting them with address of instructions which defined them, e.g. var4f0 is var defined at 0x4f0.

But then, the order of subscription in the example above should be the opposite:

$indvar = phi(0[LoopHeader], $nextindvar[Loop])

Now, that's more understandable. For me. May be still a quiz for outsiders. But I like it, it's now really creative with a logic behind it. Still more time to think about it of course.

@pfalcon
Copy link
Owner Author

pfalcon commented Nov 18, 2018

Another matter for own reference: the whole idea of introducing SSA was of course to track defs/uses precisely. Well, it now needs to be done. The "rename_ssa_vars" pass should be augmented to create sparse use-def, def-use chains, and, the whole point - other xform passes should be augmented to update that info as needed.

The latter may be not as easy as it seems. Or rather, previously passes didn't try to update this info because heck, in non-SSA form it's not that easy (maybe). Now it should be easier with SSA, but the question if existing passes are changed to update that info only for SSA case, is it sane? Or maybe it will be possible to update them for the general case after all? Who will write tests for all that?

There's of course Gordian cut "solution" to this issue - the whole reason why SSA is usually introduced is to have more efficient passes. So, just rewrite (i.e. duplicate) all existing passes in their SSA variant, problem solved, lol.

@pfalcon
Copy link
Owner Author

pfalcon commented Nov 18, 2018

Still didn't get it, lol. SSA construction is broken, need to restart from the beginning...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants