-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
V3 file format #26
Comments
I really like the scoped option! However it seems to be limited to just classes. Would it be possible to also extend this to methods? I also don't see a way to attach a group to a field. That would be a nice addition to either support formatting or a case where a constant is inlined to a field (ie |
Ability to mark all constants in a class (e.g. WorldEvents) as unpickable would be nice. |
Hmm, I'll think about it. The problem with this is it breaks the ability for the unpick implementation to be run without reference to the class the constants are in. Which we may decide to not be a big deal tbh but it's a nice to have. An alternative would be to generate some of the unpick mappings in yarn, like you already kind of do for name suggestions. I was already considering this for |
Switch cases, field initialisers, and perhaps field assignments in general should also be possible to unpick.
While these are in-scope for Unpick, the current implementation of uninlining is bytecode level, so it cannot perform transformations that can only be done at the source level like this. Having features in the Unpick format that Unpick cannot uninline would definitely be confusing, so we'll need to figure out some resolution to this.
v2 supports this since #11 was merged, v3 should as well
Why? This is different from v2, making it harder to port v2 mappings to v3. It also makes manually writing unpick definitions harder. |
True, but this is not the responsibility of the file format.
Maybe, but the resolution could simply be that some (or even all) types of uninlining are optional. I am anticipating there to be multiple uninliner implementations, not just unpick itself, but we are planning one in vineflower too. Btw, it's also impossible to uninline switch cases and annotation values at the bytecode level.
Ah, I wasn't aware. Scopes and concatenation support will definitely make it more useful though.
This is what I get for submitting this issue before I had fully fleshed it out. Tomorrow I am going to add much more detail about how unpick mappings are applied, but basically a group definition specifies how to transform a Java literal, and only a Java literal, into another expression. There are only
Good point, I mistakenly assumed this was already how it worked in unpick and that I wasn't changing anything here, but it is just worse than the simpler way and these complexities should be left to the unpick bytecode implementation |
They're kinds of uninlining that's useful, so ideally the file format should have some way to describe them. One option would be a "dumb" uninlining that replaces every use of a literal within the target method. I don't think this is a good way, but it clarifies that I'm not necessarily proposing specific support for switches in the file format.
We're on the same page then. There are at least 3 potential projects that might implement uninliners; Unpick itself, Vineflower, and NeoForge; they all have different needs, and I'd like to meet them if we can. Benefits Unpick too, as it likely gets Unpick more contributors. |
Switch statements and expressions can be uninlined as part of propagation, see the updated description for an example of how I would expect an unpick file to be applied |
Updated spec looks good to me! 👍 Only other thing I can think of is that when documenting flag constants, it's worth explicitly mentioning that they can have multiple bits set (e.g. |
This was implicitly covered by the description of how flags are applied (with the "minimal set" ORed together), but I added in a note explicitly specifying that they can have multiple bits set. |
I have added a syntax for non-static constants:
|
Improvements over V2:
Mth.PI
).ClientboundCustomPayloadPacket.MAX_PAYLOAD_SIZE
).Mth.PI / 3
,LENGTH - 1
).Example V3 file:
Specification
General notes
#
character and all subsequent characters until the next new line or the end of the file are ignored.Tokens
Whitespace is not part of the token stream unless it is part of the
<Indent>
token or<NewLine>
token. Its only other purpose is to separate tokens that would otherwise be joined together as the same token. Otherwise it is ignored.File structure
Semantics
Group Definitions
int
, orlong
, then the group type may beconst
orflag
. If the data type isfloat
,double
, orString
, then the group type may be onlyconst
. The data type cannot bebyte
,short
, orchar
.const
type.String
, integers or doubles if the data type of the group isfloat
ordouble
, and integers if the data type of the group isint
orlong
.Expressions
+
operator handles both addition and string concatenation, as in Java. The cast operator cannot be used to convert between strings and other types.byte
,short
, andchar
types.:instance
suffix can be added to a field expression. The instance of a constant expression is retained in the bytecode and the field access is replaced with a null check. A bytecode implementation of unpick could implement non-static field uninlining by looking for the following pattern in bytecode:invokevirtual java/lang/Object.getClass()Ljava/lang/Class; pop; <load constant>
invokestatic java/util/Objects.requireNonNull(Ljava/lang/Object;)Ljava/lang/Object; pop; <load constant>
getfield
instruction, which should preserve the instance which should have been loaded onto the stack already before the null check.Targets
byte
,short
,int
,long
,float
, anddouble
are all compatible with each other.char
is compatible withbyte
,short
,int
, andlong
.String
is only compatible with itself.Class Names
.
to separate package elements and$
to separate inner class names from outer class names. This format was chosen over the internal name to avoid potential confusion with the/
division operator.Link-time checking
Even though the V3 format is designed to be run without reference to the constant fields on the classpath, it may be useful to validate that unpick files will produce the expected outcome:
final
and is initialized to a compile time constant (JLS §15.28)./
and%
operators, validate that if both sides of the operator are integers, the right hand side does not evaluate to 0.Application
Note that this section is only one example of how constant uninlining could be implemented with these files. Implementations are free to uninline in less or more places than specified here or to use different techniques. This section is meant to give a feel for how the file format is supposed to be interpreted and reasoned about, not to dictate how the implementation is supposed to look.
Following is a description of an algorithm to uninline constants in a method at the source code level. In practice it may be undesirable to only uninline a single method in isolation, due to the existence of inner methods (via anonymous or local classes).
Identify targets
Every expression and sub-expression in the syntax tree of the method body is associated to a group. In addition, every parameter and local variable is associated to a group. All expressions and variables are initially assigned to the default group. Then:
Enclosing method parameters
Search for if the enclosing method or any method it overrides or implements is a target method in the unpick file. For each target parameter, assign the group of that parameter in the method to the group specified in the unpick file.
Enclosing method return
Search for if the enclosing method or any method it overrides or implements is a target method in the unpick file. If the group of the return value of the target method is specified, then assign the group of every expression which is the argument of a return statement to the group specified in the unpick file.
Referenced fields
For every field reference in the method, search for if the referenced field is a target field in the unpick file. If it is, assign the group of the field reference expression to the group specified in the unpick file.
Referenced method parameters
For every method call expression and
new
expression in the method, search for if the referenced method or constructor is a target method in the unpick file, or if the referenced method overrides or implements a target method in the unpick file. For each parameter of that method call, assign the group of the expression passed as that parameter to the group specified in the unpick file, if present.Referenced method returns
For every method call expression in the method, search for if the referenced method, or any method it overrides or implements, is a target method in the unpick file. If it is, and the group of the return value of the target method is specified, then assign the group of the method call expression to the group specified in the unpick file.
Pattern variables
For all variables declared as a pattern variable from destructuring a record, search for if the referenced field or accessor method is a target field or method in the unpick file. If it is, assign the group of the variable to the group specified in the unpick file.
Propagate groups
In this step, the non-default groups of expressions are repeatedly propagated to other expressions until there are no further propagations to apply. If propagation assigns a group to an expression, but that expression is already assigned to a different non-default group, this is a group conflict. How to handle group conflicts is up to the implementation, but throwing an error or warning or assigning an arbitrary group are possible implementations.
Variable propagation
If a variable is assigned a group, then that group is propagated to all references to that variable. If a variable reference is assigned a group, then that group is assigned to all other references to that variable and the variable itself.
Variable declaration propagation
The group of a variable and the expression it is assigned to as part of its declaration are propagated to each other.
Operator propagation
=
,+=
,-=
,*=
,&=
,|=
^=
,+
,-
,*
,&
,|
,^
./=
,%=
,>>=
,>>>=
,<<=
,/
,%
.==
,!=
,>
,<
,>=
,<=
.+
,-
,~
,++
(prefix and postfix),--
(prefix and postfix).Switch propagation
Substitute constants
For each literal expression in the method body:
-
expression, then apply the following steps on the unary expression rather than on the literal itself.const
, or if the group type isflag
and the literal is 0 or -1: if there are any substitutions (matching the scope of the current method) matching the value of the literal, then replace that literal expression with the expression specified by that substitution.flag
and the literal is not 0 or -1:~(expr1 | expr2 | ...)
whereexpr1
,expr2
, etc are the expressions specified by the substitutions in the negative set.expr1 | expr2 | ... | residual
, whereexpr1
,expr2
, etc are the expressions specified by the substitutions in the positive set. Do not include the residual if it is 0, otherwise apply the format of the group to the residual if specified.The text was updated successfully, but these errors were encountered: