Open discussion about gem5's PR Merge Policy: Should we use Merge
Squash
or Rebase
?
#261
Replies: 7 comments 12 replies
-
This is a good summary of the possibilities, thanks for bringing this discussion up, Bobby! What do you mean by ¨On Gerrit we had too many people lose code in rebase failures.¨? I like the rebase strategy exactly because of what you list as cons:
I am not a fan of merge commits, as, for me, they just clutter the log (specially when each PR is composed of few commits) generally bringing little information to the table (see above), and decentralizes the implementation (the fixes to conflicts applicable to each specific commit are squashed under the - unrelated - merge commit). Squash merges complicate reverting to fix bugs (i.e., an all-or-nothing approach) and hides knowledge, which IMHO makes it awful. Your proposed solution does not have the drawbacks I mention of squash merge... because it behaves the same as a rebase merge. Overall, I prefer rebase merges for the reasons above, but your proposal is better than the current scenario. |
Beta Was this translation helpful? Give feedback.
-
I have a slight preference for rebase merge simply because this is the most similar to what gerrit was doing, but not really any strong preferences either way. What I have seen so far:
That said, I agree with the proposal to enable another submit option for single commit PRs which will help with the log clutter. Squash/rebase merge seem to be identical in this situation to me. |
Beta Was this translation helpful? Give feedback.
-
One small thing from my point of view... when doing a "rebase always" policy with more than 1 commit in the relation chain (PR in github nomenclature) is practically impossible to revert. Often, if you bisect and find a particular commit to be the problem one commit cannot be reverted without breaking other things. Alternative, reverting the entire relation chain is not straightforward, especially if there are commits made afterward. However, reverting squashed changes is quite easy. I don't know how easy/difficult it is to revert a merged PR, but my bet is that it's easier than a rebased relation chain. Note: I agree that for a single commit rebase makes more sense than merge. Finally, IMO, "clean history" should not be a main goal. It's nice, but in practice not used nearly as often as pulling updates and merging new commits. We should optimize for the common case (i.e., our users), not the uncommon case (e.g., a minority of developers). |
Beta Was this translation helpful? Give feedback.
-
Many thanks Bobby for the discussion, I believe it is a very detailed overview of the pros and cons of every approach. I honestly have the feeling that distinguishing between a multi-commit and a single-commit PR is a great idea (and I presume GitHub allows you to treat the differently when accepting the PR). When you say
Does it mean that it is possible to configure Squash Merge to do a Squash and Rebase? So my questions are:
|
Beta Was this translation helpful? Give feedback.
-
I want to note here something I didn't outline in our original post, though @giactra touched upon it. In issues where the PR is a single commit, reviewers may add "suggestions" to his change (you may have seen this in reviews). The contributor can then click "add suggestion" which will create a new commit in the PR adding those changes. This is a great use-case for |
Beta Was this translation helpful? Give feedback.
-
I'd enabled |
Beta Was this translation helpful? Give feedback.
-
I believe this conversation has had it's day. Our merge system is still in-perfect, but I do not believe there is a merge system available which can satisfy all the criteria we have. |
Beta Was this translation helpful? Give feedback.
-
TL;DR: open discussion about which merge policy to use. I advocate enabling
Squash merge
for certain cases, particular 1-commit PRs (in this case a squash merge is essentially a rebase), and keepingMerge
for cases of multiple commits. I argue thatRebase Merge
doesn't offer us anything more.I intend this to be an open discussion. We've been using the GitHub PR model for just under 3 months now and Ive found the most contentious part is how we merge changes to the main gem5 development branch. I'll log here my observations on the pros and cons of each approach and propose my improvement, though this is just me dumping my thoughts. I'm open to altnerative ideas and feedback.
To remind everyone, we can only have a combination of these three merge policies:
Merge
: A merge commit is created, thus merging the PR commits into the develop branch.Squash Merge
: The commits in the PR are squashed in a single commit then rebased on top of the develop branch. There is no merge commit.Rebase Merge
: The PR commits are rebased on top of develop. You can think of it as cherry-picking the commits to the top of develop. There is no merge commit.Merge (CURRENT)
Right now we are using
Merge
. Here's are problems I've noticed with using justMerge
:Merge
commit message setup as the PR title and description. If you think it through this results in the merge commit message being almost identical to that of the single commit it is merging (note: the merge commit does append GitHub's#<pr id>
to the end PR title). This results in agit log
which looks like a lot of duplicates. For example his commit c0db065c and 3520c836 look identical at first glance, but one is the merge commit and the other the actual commit.Merge
have their authors set as the person who did the merge. Our policy is that Maintainers are responsible for merging and are the only ones who have permission to do so create this merge. This is our way of policing changes. This authorship may be seen as unfair credit in some cases. This is particularly annoying with the "duplicate commit message" issue mentioned in 1. For example, looking at this commit: 8d47cda it appears that I am author. However, this is just the merge commit for the change. If you click through to the PR arch-x86: Fix wrong x86 assembly #251 you can see the actual commit and the real author of the change.There's ultimately nothing we can do about (2.) if we are to use
merge
. If we create a merge commit someone needs to be the author and there's no option to cede this to anyone else. (1.) and (3.) regard the merge commit message default. We do have options here. The default merge commit messages can either be set to:arch-x86: Fix wrong x86 assembly (251)
The "Default Message" would it clear we're dealing with a merge commit and not feel we have so many "duplicates" in the repo (ergo addressing 1.). To me anything that stops the "duplicate commit message" issue from happening is an advantage. "Default message" is also likely to be too long, thereby addressing some of (3.). However, it does mean the contributor's description in the PR is not logged in git. This seems desirable at least for cases where a PR consists of multiple commits and we want the merge commit to explain how these patches work together. The "Default to PR title" doesn't have many advantages and the "Default to PR title and descriptions "is what we're using now and has already been discussed.
While these are just default messages, and Maintainers are free to edit them as they wish prior to merging, I think we have to expect most of us to go with the default that is presented. Ultimately, we can only chose one of these three defaults with
Merge
. That's it really.Squash Merge
We tried
squash merge
for a while and there were some complaints:Squash Merges
setup with "PR title and squash commit description" as our default merge commit message. Squash commits descriptions are just messy. The constituent messages concatenated together and it's just ugly to see and can make finding things in the log difficult.co-authors
in the commit body on the squashed commit. This is non-standard and won't mean much to, say,git blame
but GitHub does recognize this as authorship. The sole author goes to whoever the first commit in the PR is authored to.(1.) and (3.) may be resolvable. We can use "Use PR title and description" for
Squash Merge
. This would mean we don't get the squashes of the commit messages like we did before. How authors would feel about all their commits headers and descriptions being erased in favor of the PR description is unknown.Squash Merge
does have a few advantages though. I found it nice that:Squash Merge
is just a rebase to the develop branch.Rebase Merge
Rebase merge
has been proposed before. It is the approach most similar to the Gerrit system, but I've been hesitant to use it for a a couple of reasons:Rebase Merge
so we can't group changes and provide overarching context. At the level of ourgit log
, they are just a series of commits with nothing linking them or explaining why they were all added together.The advantage is we get a clean git history. The reason it was advocated for in the past is it avoids the creation of commits with identical changeIDs which can happen due to the aforementioned default PR description policy and default merge commit message policy causing a "duplicate commit message" when a PR just has one commit. However, I think we can find ways around that which I explain below.
** My preference: Merge for many, Squash Merge for one. **
I thing the following would be best and solves a lot of the problems we're having:
Merge
are in the case where the PR contains a single commit. In this case we just want to rebase that change on top of develop. Rebasing is only bad in the case we have multiple commits. Creating a separate merge commit a single commit is just unreasonable. As previously mentioned, doing aSquash Merge
on a single commit, while enforcing a policy where we copy the PR title and PR description to the commit message is effectively a rebase. It also has a small bonus in that it adds the PR ID to the commit header. Problems withSquash Merge
's only occur when the PR has multiple commits.Merge
makes sense. The merge commit containing the PR description gives more context to what a merge of changes actually means. This is good for explaining big features that span many commits.Given those two cases I can't think of what enabling
Rebase Merge
gives us. If it's a single commit, it's superseded by "Squash Merge". In the case of having multiple commits we miss out on having merge commits which can provide greater context as well as a link back to the PR.Rebase Merge
also means we may have to deal with messy rebase conflict. So I believeMerge
is better in this case.The issues with Gerrit ID are also lessened here. Those were only an issue in the case where a
Merge
was used for a PR with a single commit, thus resulting in two patches with the same ChangeID, or whenSquash Merge
was used on a PR with multiple commits, ergo creating a commit with multiple ChangeIDs.This doesn't solve all our problems but I think it is a step in a better direction. I would hope if this were setup maintainers would understand which merge is better to use in what circumstance.
Beta Was this translation helpful? Give feedback.
All reactions