Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Respect existing trailers (including co-author lines) when backporting #46

Merged
merged 4 commits into from
Oct 4, 2022

Conversation

Jackenmen
Copy link
Contributor

@Jackenmen Jackenmen commented Aug 8, 2021

@Jackenmen Jackenmen changed the title Make the message made by cherry_picker --continue same as regular Respect existing trailers (including co-author lines) when backporting Aug 8, 2021
@Jackenmen
Copy link
Contributor Author

A small issue with this is that for a single commit PR, GitHub UI shows the trailers that are part of that single commit and it appends the trailers on its own too which causes them to be duplicated:
image

This doesn't break the actual functionality so that I don't think it's a big issue, and CPython probably relies on miss-islington's auto-merge so it might not even be an issue there anyway.

Either way, figured this quirk with GH UI is worth mentioning.

@Jackenmen Jackenmen force-pushed the issue/439 branch 2 times, most recently from d9d9c81 to 1c7daef Compare December 25, 2021 02:44
@Jackenmen Jackenmen marked this pull request as ready for review April 15, 2022 16:10
cherry_picker/test_cherry_picker.py Outdated Show resolved Hide resolved
cherry_picker/test_cherry_picker.py Outdated Show resolved Hide resolved
"--trailer",
cherry_pick_information,
]
output = subprocess.check_output(cmd, input=updated_commit_message.encode())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(One of the comments from my previous review got lost.)

It would be helpful to add a docstring and a couple more comments to this function to explain what it does, since it's not immediately obvious.
AFAIK and by looking at the tests, it seems it retrieves the original commit message and:

  • adds a prefix at the beginning with the branch the commit is being backported to
  • adds a line in the middle mentioning the original commit id
  • adds the co-authors (and possibly other metadata?) at the end

I also wonder if there's a more straightforward way to do achieve this.
Can you use git interpret-trailers --only-trailers to extract the metadata and then do something like this?

f"""{commit_prefix}{original_commit_message}
(cherry picked from commit {self.commit_sha1})

{metadata}"""

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Welp, some value sure was lost by me not adding more comments in there in the first place... I added some now to explain the way this is supposed to work.

I also wonder if there's a more straightforward way to do achieve this. Can you use git interpret-trailers --only-trailers to extract the metadata and then do something like this?

f"""{commit_prefix}{original_commit_message}
(cherry picked from commit {self.commit_sha1})

{metadata}"""

I can look into an alternative way of doing this. The important things are:

  • the trailers cannot break after the addition of (cherry picked from commit ...)
  • adding Co-authored-by should not cause duplicates
  • the order of the trailers probably shouldn't be affected more than needed by the changes we make (currently we only insert a new trailer at the beginning of existing trailers and don't touch them further)

One of the problems is that git interpret-trailers does not have a flag that does the opposite of --only-trailers so getting the "original commit message" may not be as easy as it seems. A naive way of doing it would be to cut the end of the full message by the length of the output of git interpret-trailers but I wouldn't be too surprised if this wouldn't work that well in practice (as I said at the beginning, some value sure was lost by me not writing some more comments when I wrote this PR so I'll have to look into that).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adding Co-authored-by should not cause duplicates

If we can isolate all the trailers (and --only-trailers seems to do that) you could pass them through a set() to remove duplicates. Even if the order is lost I'm not sure it matters. If you want to preserve it, it's pretty easy to write a function that removes duplicates without losing the order.

getting the "original commit message" may not be as easy as it seems.

Recently I was looking into manually adding co-authors, and apparently you had to leave two empty lines between the message and the trailers, so an msg = full_commit_msg.rsplit('\n\n', 1)[0] might work (assuming this is enforced). Otherwise we can cut the end like you said or remove individual trailer lines from full_commit_msg.splitlines().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we can isolate all the trailers (and --only-trailers seems to do that) you could pass them through a set() to remove duplicates. Even if the order is lost I'm not sure it matters. If you want to preserve it, it's pretty easy to write a function that removes duplicates without losing the order.

cherry-picker is Python 3.7+ so this could just rely on the insertion order of dictionaries, possibly with the help of dict.fromkeys().

Recently I was looking into manually adding co-authors, and apparently you had to leave two empty lines between the message and the trailers, so an msg = full_commit_msg.rsplit('\n\n', 1)[0] might work (assuming this is enforced).

The problem with this is that we would need to write a proper detection of trailers or we risk the trailers no longer being trailers. A single split is not enough as we plan on adding a trailer and that means we need to make sure that we add a trailer with other trailers. For example, if this were the original commit message:

Commit message

some text

Then we can't just simply add it to the last part:

Commit message

some text
Co-authored-by: Author Name <[email protected]>

On the other hand, we also have to keep all trailers together so if this were the original commit message:

Commit message

Co-authored-by: Co-Author Name <[email protected]>

Then this would not be acceptable:

Commit message

Co-authored-by: Co-Author Name <[email protected]>

Co-authored-by: Author Name <[email protected]>

For reference, here's the logic that git uses: https://github.com/git/git/blob/afa70145a25e81faa685dc0b465e52b45d2444bd/trailer.c#L818-L915

Otherwise we can cut the end like you said

Based on these two:
https://github.com/git/git/blob/afa70145a25e81faa685dc0b465e52b45d2444bd/trailer.c#L1108-L1123
https://github.com/git/git/blob/afa70145a25e81faa685dc0b465e52b45d2444bd/trailer.c#L982-L990
I think there's no alteration done to the printed trailers which would make this doable but I'll have to read into it just a bit more to be sure. I didn't want to delay my response further since I might not be able to get to it at all today so I have not yet carefully examined these two linked fragments of code.

or remove individual trailer lines from full_commit_msg.splitlines().

I think this falls into same category as rsplit but I'm not sure I fully understand what you mean by this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I fully understand what you mean by this.

Given this:

$ cat msg.txt 
gh-12345: fix some broken stuff

Some stuff was broken and now
it's fixed.

Co-authored-by: Co-Author Name <[email protected]>
Co-authored-by: Author Name <[email protected]>
$ cat msg.txt | git interpret-trailers --only-trailers 
Co-authored-by: Co-Author Name <[email protected]>
Co-authored-by: Author Name <[email protected]>

You could do something like this:

>>> # get these from self.get_commit_message(self.commit_sha1).splitlines()
>>> with open('msg.txt') as f:
...   lines = [line.strip() for line in f]  # remove trailing whitespace/newlines
... 
>>> # get these from `git interpret-trailers --only-trailers`
>>> trailers = ['Co-authored-by: Co-Author Name <[email protected]>',
...             'Co-authored-by: Author Name <[email protected]>']
>>> # remove trailers from the full message
>>> for trailer in trailers:
...   lines.remove(trailer)
... 
>>> msg = '\n'.join(lines).strip()  # this is the msg without trailers
>>> trailers.append('Co-authored-by: Another Author')
>>> metadata = '\n'.join(trailers)  # these are the updated trailers
>>> print(f"""[3.10] {msg}
... (cherry-picked from somewhere)
... 
... {metadata}""")  # this is the final message
[3.10] gh-12345: fix some broken stuff

Some stuff was broken and now
it's fixed.
(cherry-picked from somewhere)

Co-authored-by: Co-Author Name <[email protected]>
Co-authored-by: Author Name <[email protected]>
Co-authored-by: Another Author

That is, if you take the full commit and remove the trailer lines you obtained with git interpret-trailers --only-trailers you'll be left with the message only. Once you have the message and the trailers you can update the trailers and recombine everything in a new msg. Do you think this will work?

cherry_picker/cherry_picker.py Outdated Show resolved Hide resolved
cherry_pick_information = f"(cherry picked from commit {self.commit_sha1})\n:"
# Here, we're inserting new Co-authored-by trailer and we *somewhat*
# abuse interpret-trailers by also adding cherry_pick_information which
# is not an actual trailer.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kind of a tangent, but maybe we should make it a trailer? Cherry-Picked-From: {self.commit_sha1}?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The message cherry picked from commit ... is the default message generated by git when we use git cherry-pick -X. So I think it should be kept that way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or this can perhaps be addressed in a separate PR :)

@Mariatta Mariatta merged commit 766f76c into python:main Oct 4, 2022
@Jackenmen Jackenmen deleted the issue/439 branch October 4, 2022 06:44
@Mariatta Mariatta added the hacktoberfest-accepted Accepted for hacktoberfest label Oct 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA signed hacktoberfest-accepted Accepted for hacktoberfest
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Respect existing co-author lines when backporting
5 participants