Merging and rebasing accomplish similar goals, but go about them in different ways. Both help manage working with branches and collaborating with multiple people, but they’re not interchangeable, and rebasing can be harmful if not done properly.
What’s The Deal With Rebasing?
Rebasing is very complicated, so it’s important to understand how Git works under the hood if we’re going to make any sense of it.
Git stores your code as a series of commits. You can think of this like a chain, working backwards. Each commit references the ID of the previous commit (its parent) and records a snapshot of your files, from which Git can work out what changed since that parent. Your copy of this chain lives on your computer; your Git client doesn’t talk to any other Git clients unless it’s performing a fetch or push (pull is really just fetch + merge with your local branch), and even then, it’s only talking to a shared remote repository.
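To see the chain for yourself, here is a minimal sketch that builds a throwaway repository with two commits and prints the second commit’s raw contents. The repo path, file name, commit messages, and user details are all made up for illustration.

```shell
set -e
repo=$(mktemp -d)                          # throwaway repository (hypothetical path)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # make sure the first branch is called master
git config user.name "Example User"
git config user.email "user@example.com"

echo "one" > file.txt
git add file.txt
git commit -q -m "First commit"
first=$(git rev-parse HEAD)

echo "two" >> file.txt
git commit -q -am "Second commit"

# Print the raw commit object. Its "parent" line holds the ID of the previous commit.
git cat-file -p HEAD

# HEAD^ means "the parent of HEAD"; it resolves to the first commit's ID.
parent=$(git rev-parse HEAD^)
echo "first commit: $first"
echo "parent:       $parent"
```

The two IDs printed at the end should match, which is the link in the chain.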
Branches work a little differently. Git only stores one thing when dealing with branches: the ID of the commit at the end of the branch. In this way, you can think of them like the playhead on a record player; you put the branch head at a specific position, and it works its way back through the chain, “playing” your code and arriving at a final version. Whenever you commit, your local Git client will automatically move the playhead forward to the new commit.
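A quick way to watch the playhead move, sketched under the same assumptions as before (a throwaway repository; the variable names before and after are only for illustration):

```shell
set -e
repo=$(mktemp -d)                          # throwaway repository (hypothetical path)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo "one" > file.txt
git add file.txt
git commit -q -m "First commit"
before=$(git rev-parse master)             # the commit ID master points at right now

echo "two" >> file.txt
git commit -q -am "Second commit"
after=$(git rev-parse master)              # committing moved master to the new commit

echo "master before: $before"
echo "master after:  $after"

# List every local branch next to the single commit ID it stores.
git for-each-ref --format='%(refname:short) %(objectname)' refs/heads
```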
If you wanted to merge feature into master, you would run:
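A minimal sketch of that merge, assuming a throwaway repository in which master and feature have each gained one commit since they split. The file names and commit messages are invented, and the -m flag is only there so the example runs without opening an editor.

```shell
set -e
repo=$(mktemp -d)                          # throwaway repository (hypothetical path)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Base commit"

git checkout -q -b feature                 # start the feature branch
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Feature work"

git checkout -q master                     # meanwhile, master moves on
echo "master" > master.txt
git add master.txt
git commit -q -m "Master work"

# The merge itself: switch to master, then merge feature into it.
git checkout -q master
git merge -q -m "Merge feature into master" feature

git log --oneline --graph                  # shows the merge commit joining both lines
```

In an existing repository, only the last two git commands before the log are needed: git checkout master, then git merge feature.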
When both branches have new commits, this creates a new merge commit, and if there are any conflicts, you’ll have to solve them manually. The git merge command moves the master playhead to the new merge commit. The feature playhead stays where it was; once it’s no longer necessary, you can delete it yourself with git branch -d feature.
This method of merging code presents three problems:
There may be changes on the master branch that the feature branch would like to include, particularly if feature is taking a while to develop.
Being forced to go through the merge process every time you want to work with branches is annoying.
The commit history is messy, though this is largely an aesthetic problem.
Rebasing tries to solve these issues, to varying degrees of success. Rebasing changes where you started your branch. The whole branch is lifted up and reattached at the tip of the current master branch. The master branch is left untouched, and is free to continue receiving commits.
The commits aren’t actually moved, however, since commits are immutable. Rather, they’re copied, which results in new commit IDs. The previous commits are left stranded: no branch points at them anymore, since the playhead has moved somewhere else. They stay in your Git files for a while and can still be found through git reflog, until Git eventually garbage-collects them.
To execute this process from the command line, you would run:
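A minimal sketch of the rebase, assuming a throwaway repository where master and feature have diverged by one commit each. Because this example has no remote, the pull step is left out; the variable names old_tip and new_tip are only for illustration.

```shell
set -e
repo=$(mktemp -d)                          # throwaway repository (hypothetical path)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Base commit"

git checkout -q -b feature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Feature work"

git checkout -q master
echo "master" > master.txt
git add master.txt
git commit -q -m "Master work"

old_tip=$(git rev-parse feature)           # remember the commit feature pointed at

# The rebase itself: switch to feature, then replay its commits on top of master.
git checkout -q feature
git rebase -q master

new_tip=$(git rev-parse feature)
echo "feature before rebase: $old_tip"
echo "feature after rebase:  $new_tip"     # a different ID: the commit was copied
```

In a repository with a remote, you would bring master up to date first (for example by pulling on master) before running the last two git commands.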
This checks out the feature branch, pulls in the current changes to master, and then rebases the feature branch onto the master branch.
At this point, the code in the feature branch is now more up to date, which is the only real feature of git rebase. Rebasing does not merge branches, since it does not create any merge commits or move master’s playhead.
If you want to merge after rebasing, you’d run:
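A minimal sketch of the rebase followed by the merge, again in a throwaway repository with invented file names. The --ff-only flag is an addition for this example: it makes Git refuse to do anything other than slide master forward.

```shell
set -e
repo=$(mktemp -d)                          # throwaway repository (hypothetical path)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Base commit"

git checkout -q -b feature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Feature work"

git checkout -q master
echo "master" > master.txt
git add master.txt
git commit -q -m "Master work"

git checkout -q feature
git rebase -q master                       # feature now sits on top of master

# The merge after rebasing: switch to master, then merge feature.
git checkout -q master
git merge -q --ff-only feature

git log --oneline --graph                  # a single straight line, no merge commit
```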
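Provided master has not moved since the rebase, the result is a fast-forward: master’s playhead simply moves up to the same commit as the feature playhead, and no merge commit is needed.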
So rebasing doesn’t end up solving the problem of dealing with merges, since you’ll need to merge at the end anyway to update the master branch. The actual merge command at the end should go off without a hitch though, since the rebase already forced you to reconcile your commits with master’s changes, and any conflicts had to be resolved then. And if you’d still like to continue working on your branch, you still need to “merge” in changes by rebasing again.
Do Not Rebase Shared Branches
Remember how rebasing copies commits and leaves the originals stranded? That’s actually a major issue if you’re working with shared code. Let’s say you created the feature branch and pushed it to your repo so your coworkers can test it out. That’s entirely fine, but if one of them branched off of your branch, then when you eventually rebase, you end up with a problem.
Your coworker’s feature2 branch is now referencing an old chain of commits. Your Git client has no way of knowing this, since the feature2 branch is stored on your coworker’s computer. They also have no way of knowing that you rebased until you push your changes.
When you rebased, Git didn’t carry the feature2 branch along when it copied all the commits. Even if it could, it wouldn’t affect your coworker’s local Git repo, leaving everything out of sync. The solution here would be to rebase feature2 onto the new feature branch at the spot where it originally branched off, but that’s messy, even by Git standards, and this is just a very simple example.
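One way that cleanup could look is git rebase --onto, sketched here in a single throwaway repository standing in for both computers. The variable old_feature (the ID feature pointed at before it was rebased) and the file names are assumptions for illustration; in practice your coworker would have to find that old ID themselves.

```shell
set -e
repo=$(mktemp -d)                          # throwaway repository (hypothetical path)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Base commit"

git checkout -q -b feature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Feature work"

git checkout -q -b feature2                # the coworker branches off feature
echo "feature2" > feature2.txt
git add feature2.txt
git commit -q -m "Feature2 work"

git checkout -q master
echo "master" > master.txt
git add master.txt
git commit -q -m "Master work"

old_feature=$(git rev-parse feature)       # where feature pointed before the rebase
git checkout -q feature
git rebase -q master                       # feature2 is now built on stranded commits

# Replay only the commits after old_feature from feature2 onto the new feature.
git rebase -q --onto feature "$old_feature" feature2

git log --oneline --graph feature2
```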
Bottom line is, do not rebase if you’re not working locally.
When Is Rebasing Better Than Merging?
If your branch is going to take a while to develop, rebasing solves the issue of “branch syndrome,” where your code is way too out of date with the working master, and you need to update it to continue working. Speaking generally, you should try to avoid this problem as much as possible, but rebasing can fix it when it arises.
If you’re just making small, incremental, daily changes, you should instead work on a local master branch and use Pull Requests when you’re ready to push your changes. These use the model of topical branches, created specifically to store your code before it’s approved for a merge.
But, if you’re working on a weekly timespan, and are going to end up making multiple pull requests and merging multiple times, you can work on your code for a bit longer, rebase locally for updates, and perform one pull request at the end to cut down on the amount of testing and talking to supervisors. Rebasing is primarily a local thing, so you can do it on your staging branches without waiting for approval.
If nobody else depends on your branch, you can rebase before a branch merge to make the commit history clean and linear. Though, one could argue that traditional merging, while certainly uglier, is easier to follow and debug, since merge commits are entirely nondestructive. Either way, the rebasing needs to be done right before your changes are merged, or you may run into the issue of master being updated before your changes are approved.