Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Respect existing trailers (including co-author lines) when backporting #46

Merged
merged 4 commits into from
Oct 4, 2022
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
28 changes: 24 additions & 4 deletions cherry_picker/cherry_picker.py
Original file line number Diff line number Diff line change
Expand Up @@ -296,11 +296,31 @@ def get_updated_commit_message(self, cherry_pick_branch):
commit_prefix = ""
if self.prefix_commit:
commit_prefix = f"[{get_base_branch(cherry_pick_branch)}] "
return f"""{commit_prefix}{self.get_commit_message(self.commit_sha1)}
(cherry picked from commit {self.commit_sha1})
updated_commit_message = f"{commit_prefix}{self.get_commit_message(self.commit_sha1)}"
cherry_pick_information = f"(cherry picked from commit {self.commit_sha1})\n:"
cmd = [
"git",
"interpret-trailers",
"--where",
"start",
"--trailer",
f"Co-authored-by: {get_author_info_from_short_sha(self.commit_sha1)}",
"--trailer",
cherry_pick_information,
]
output = subprocess.check_output(cmd, input=updated_commit_message.encode())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(One of the comments from my previous review got lost.)

It would be helpful to add a docstring and a couple more comments to this function to explain what it does, since it's not immediately obvious.
AFAIK and by looking at the tests, it seems it retrieves the original commit message and:

  • adds a prefix at the beginning with the branch the commit is being backported to
  • adds a line in the middle mentioning the original commit id
  • adds the co-authors (and possibly other metadata?) at the end

I also wonder if there's a more straightforward way to do achieve this.
Can you use git interpret-trailers --only-trailers to extract the metadata and then do something like this?

f"""{commit_prefix}{original_commit_message}
(cherry picked from commit {self.commit_sha1})

{metadata}"""

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Welp, some value sure was lost by me not adding more comments in there in the first place... I added some now to explain the way this is supposed to work.

I also wonder if there's a more straightforward way to do achieve this. Can you use git interpret-trailers --only-trailers to extract the metadata and then do something like this?

f"""{commit_prefix}{original_commit_message}
(cherry picked from commit {self.commit_sha1})

{metadata}"""

I can look into an alternative way of doing this. The important things are:

  • the trailers cannot break after the addition of (cherry picked from commit ...)
  • adding Co-authored-by should not cause duplicates
  • the order of the trailers probably shouldn't be affected more than needed by the changes we make (currently we only insert a new trailer at the beginning of existing trailers and don't touch them further)

One of the problems is that git interpret-trailers does not have a flag that does the opposite of --only-trailers so getting the "original commit message" may not be as easy as it seems. A naive way of doing it would be to cut the end of the full message by the length of the output of git interpret-trailers but I wouldn't be too surprised if this wouldn't work that well in practice (as I said at the beginning, some value sure was lost by me not writing some more comments when I wrote this PR so I'll have to look into that).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adding Co-authored-by should not cause duplicates

If we can isolate all the trailers (and --only-trailers seems to do that) you could pass them through a set() to remove duplicates. Even if the order is lost I'm not sure it matters. If you want to preserve it, it's pretty easy to write a function that removes duplicates without losing the order.

getting the "original commit message" may not be as easy as it seems.

Recently I was looking into manually adding co-authors, and apparently you had to leave two empty lines between the message and the trailers, so an msg = full_commit_msg.rsplit('\n\n', 1)[0] might work (assuming this is enforced). Otherwise we can cut the end like you said or remove individual trailer lines from full_commit_msg.splitlines().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we can isolate all the trailers (and --only-trailers seems to do that) you could pass them through a set() to remove duplicates. Even if the order is lost I'm not sure it matters. If you want to preserve it, it's pretty easy to write a function that removes duplicates without losing the order.

cherry-picker is Python 3.7+ so this could just rely on the insertion order of dictionaries, possibly with the help of dict.fromkeys().

Recently I was looking into manually adding co-authors, and apparently you had to leave two empty lines between the message and the trailers, so an msg = full_commit_msg.rsplit('\n\n', 1)[0] might work (assuming this is enforced).

The problem with this is that we would need to write a proper detection of trailers or we risk the trailers no longer being trailers. A single split is not enough as we plan on adding a trailer and that means we need to make sure that we add a trailer with other trailers. For example, if this were the original commit message:

Commit message

some text

Then we can't just simply add it to the last part:

Commit message

some text
Co-authored-by: Author Name <[email protected]>

On the other hand, we also have to keep all trailers together so if this were the original commit message:

Commit message

Co-authored-by: Co-Author Name <[email protected]>

Then this would not be acceptable:

Commit message

Co-authored-by: Co-Author Name <[email protected]>

Co-authored-by: Author Name <[email protected]>

For reference, here's the logic that git uses: https://github.com/git/git/blob/afa70145a25e81faa685dc0b465e52b45d2444bd/trailer.c#L818-L915

Otherwise we can cut the end like you said

Based on these two:
https://github.com/git/git/blob/afa70145a25e81faa685dc0b465e52b45d2444bd/trailer.c#L1108-L1123
https://github.com/git/git/blob/afa70145a25e81faa685dc0b465e52b45d2444bd/trailer.c#L982-L990
I think there's no alteration done to the printed trailers which would make this doable but I'll have to read into it just a bit more to be sure. I didn't want to delay my response further since I might not be able to get to it at all today so I have not yet carefully examined these two linked fragments of code.

or remove individual trailer lines from full_commit_msg.splitlines().

I think this falls into same category as rsplit but I'm not sure I fully understand what you mean by this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I fully understand what you mean by this.

Given this:

$ cat msg.txt 
gh-12345: fix some broken stuff

Some stuff was broken and now
it's fixed.

Co-authored-by: Co-Author Name <[email protected]>
Co-authored-by: Author Name <[email protected]>
$ cat msg.txt | git interpret-trailers --only-trailers 
Co-authored-by: Co-Author Name <[email protected]>
Co-authored-by: Author Name <[email protected]>

You could do something like this:

>>> # get these from self.get_commit_message(self.commit_sha1).splitlines()
>>> with open('msg.txt') as f:
...   lines = [line.strip() for line in f]  # remove trailing whitespace/newlines
... 
>>> # get these from `git interpret-trailers --only-trailers`
>>> trailers = ['Co-authored-by: Co-Author Name <[email protected]>',
...             'Co-authored-by: Author Name <[email protected]>']
>>> # remove trailers from the full message
>>> for trailer in trailers:
...   lines.remove(trailer)
... 
>>> msg = '\n'.join(lines).strip()  # this is the msg without trailers
>>> trailers.append('Co-authored-by: Another Author')
>>> metadata = '\n'.join(trailers)  # these are the updated trailers
>>> print(f"""[3.10] {msg}
... (cherry-picked from somewhere)
... 
... {metadata}""")  # this is the final message
[3.10] gh-12345: fix some broken stuff

Some stuff was broken and now
it's fixed.
(cherry-picked from somewhere)

Co-authored-by: Co-Author Name <[email protected]>
Co-authored-by: Author Name <[email protected]>
Co-authored-by: Another Author

That is, if you take the full commit and remove the trailer lines you obtained with git interpret-trailers --only-trailers you'll be left with the message only. Once you have the message and the trailers you can update the trailers and recombine everything in a new msg. Do you think this will work?

# Replace the right most-occurence of the "cherry picked from commit" string.
#
# This needs to be done because `git interpret-trailers` adds `:`
ezio-melotti marked this conversation as resolved.
Show resolved Hide resolved
# to the end of the trailer.
before, after = output.strip().decode().rsplit(f"\n{cherry_pick_information}", 1)
if not before.endswith("\n"):
# ensure that we still have a newline between cherry pick information
# and commit headline
cherry_pick_information = f"\n{cherry_pick_information}"
updated_commit_message = cherry_pick_information[:-1].join((before, after))


Co-authored-by: {get_author_info_from_short_sha(self.commit_sha1)}"""
return updated_commit_message

def amend_commit_message(self, cherry_pick_branch):
""" prefix the commit message with (X.Y) """
Expand Down
90 changes: 90 additions & 0 deletions cherry_picker/test_cherry_picker.py
Original file line number Diff line number Diff line change
Expand Up @@ -468,6 +468,96 @@ def test_normalize_short_commit_message():
)


@pytest.mark.parametrize(
"commit_message,expected_commit_message",
(
# ensure existing co-author is retained
(
"""Fix broken `Show Source` links on documentation pages (GH-3113)

Co-authored-by: PR Co-Author <[email protected]>""",
"""[3.6] Fix broken `Show Source` links on documentation pages (GH-3113)
(cherry picked from commit b9ff498793611d1c6a9b99df464812931a1e2d69)

Co-authored-by: PR Author <[email protected]>
Co-authored-by: PR Co-Author <[email protected]>""",
),
# ensure co-author trailer is not duplicated
(
"""Fix broken `Show Source` links on documentation pages (GH-3113)

Co-authored-by: PR Author <[email protected]>""",
"""[3.6] Fix broken `Show Source` links on documentation pages (GH-3113)
(cherry picked from commit b9ff498793611d1c6a9b99df464812931a1e2d69)

Co-authored-by: PR Author <[email protected]>""",
),
# ensure message is formatted properly when original commit is short
(
"Fix broken `Show Source` links on documentation pages (GH-3113)",
"""[3.6] Fix broken `Show Source` links on documentation pages (GH-3113)
(cherry picked from commit b9ff498793611d1c6a9b99df464812931a1e2d69)

Co-authored-by: PR Author <[email protected]>""",
),
# ensure message is formatted properly when original commit is long
(
"""Fix broken `Show Source` links on documentation pages (GH-3113)

The `Show Source` was broken because of a change made in sphinx 1.5.1
In Sphinx 1.4.9, the sourcename was "index.txt".
In Sphinx 1.5.1+, it is now "index.rst.txt".""",
"""[3.6] Fix broken `Show Source` links on documentation pages (GH-3113)

The `Show Source` was broken because of a change made in sphinx 1.5.1
In Sphinx 1.4.9, the sourcename was "index.txt".
In Sphinx 1.5.1+, it is now "index.rst.txt".
(cherry picked from commit b9ff498793611d1c6a9b99df464812931a1e2d69)

Co-authored-by: PR Author <[email protected]>""",
),
# ensure message is formatted properly when original commit is long
# and it has a co-author
(
"""Fix broken `Show Source` links on documentation pages (GH-3113)

The `Show Source` was broken because of a change made in sphinx 1.5.1
In Sphinx 1.4.9, the sourcename was "index.txt".
In Sphinx 1.5.1+, it is now "index.rst.txt".

Co-authored-by: PR Co-Author <[email protected]>""",
"""[3.6] Fix broken `Show Source` links on documentation pages (GH-3113)

The `Show Source` was broken because of a change made in sphinx 1.5.1
In Sphinx 1.4.9, the sourcename was "index.txt".
In Sphinx 1.5.1+, it is now "index.rst.txt".
(cherry picked from commit b9ff498793611d1c6a9b99df464812931a1e2d69)

Co-authored-by: PR Author <[email protected]>
Co-authored-by: PR Co-Author <[email protected]>""",
),
),
)
def test_get_updated_commit_message_with_trailers(commit_message, expected_commit_message):
cherry_pick_branch = "backport-22a594a-3.6"
commit = "b9ff498793611d1c6a9b99df464812931a1e2d69"

with mock.patch("cherry_picker.cherry_picker.validate_sha", return_value=True):
cherry_picker = CherryPicker("origin", commit, [])

with mock.patch(
"cherry_picker.cherry_picker.validate_sha", return_value=True
), mock.patch.object(
cherry_picker, "get_commit_message", return_value=commit_message
), mock.patch(
"cherry_picker.cherry_picker.get_author_info_from_short_sha",
return_value="PR Author <[email protected]>",
):
updated_commit_message = cherry_picker.get_updated_commit_message(cherry_pick_branch)

assert updated_commit_message == expected_commit_message


@pytest.mark.parametrize(
"input_path", ("/some/path/without/revision", "HEAD:some/non-existent/path")
)
Expand Down