Removing a whole bunch of commits from git

Imagine you have had a repository for some time and then decide to publish it (e.g. make it open source). The first n commits are garbage or contain content that you do not want to share with others. So how do we get rid of these commits? Here we go.

Back up your repo via git clone. Find the first commit you want to keep after the unwanted commits and note the hash of its direct parent, which is the last unwanted commit (bbb in the following). Then find the last commit you want to keep before the unwanted commits (aaa in the following) and replace bbb with aaa:

$ git replace bbb aaa
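If you want to try the replace step safely first, here is a minimal sketch in a throwaway repository. The repository layout, the commit messages c1 to c4 and the variable names are made up for illustration; only git replace itself comes from the step above.

```shell
#!/bin/sh
set -eu

# Throwaway repository: c1 (keep) <- c2 <- c3 (both unwanted) <- c4 (keep).
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.name "Demo"
git config user.email "demo@example.com"
for n in 1 2 3 4; do
    echo "$n" > "file$n"
    git add "file$n"
    git commit -q -m "c$n"
done

aaa=$(git rev-parse HEAD~3)   # c1: last commit to keep before the unwanted ones
bbb=$(git rev-parse HEAD~1)   # c3: parent of the first commit to keep afterwards

git replace "$bbb" "$aaa"

# History as git presents it now; c2 and c3 should no longer show up.
git log --oneline
count=$(git rev-list --count HEAD)
echo "visible commits: $count"
```

Here c2 and c3 play the role of the unwanted commits. Running git log with the --no-replace-objects option (as in git --no-replace-objects log) should still show all four commits, which is a quick way to see that nothing has been rewritten yet.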

Take a look at your repository (with git log) and check that the history looks as intended. Note that the replace command only adds a reference under .git/refs/replace, which other git commands honor when they read the history; the original commits are still there. To make the change permanent, you can use git filter-branch:

$ git filter-branch -- --all
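To convince yourself that filter-branch really baked the replacement into the history, you can repeat the experiment in a throwaway repository and then look at the history with replacements switched off. This is only a sketch: the demo repository, the commit messages and the variable names are assumptions for illustration, and FILTER_BRANCH_SQUELCH_WARNING is set because newer git versions otherwise print a warning and pause before running filter-branch.

```shell
#!/bin/sh
set -eu
export FILTER_BRANCH_SQUELCH_WARNING=1

# Throwaway repository: c1 (keep) <- c2 <- c3 (both unwanted) <- c4 (keep).
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.name "Demo"
git config user.email "demo@example.com"
for n in 1 2 3 4; do
    echo "$n" > "file$n"
    git add "file$n"
    git commit -q -m "c$n"
done

aaa=$(git rev-parse HEAD~3)
bbb=$(git rev-parse HEAD~1)
old_head=$(git rev-parse HEAD)

git replace "$bbb" "$aaa"
git filter-branch -- --all >/dev/null 2>&1

new_head=$(git rev-parse HEAD)
# Count commits while ignoring refs/replace: the short history is now real.
count=$(git --no-replace-objects rev-list --count HEAD)

echo "old tip: $old_head"
echo "new tip: $new_head"
echo "commits reachable from HEAD without replacements: $count"
```

The two tip hashes should differ, which is exactly the hash change that causes the trouble described next.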

Unfortunately, this command rewrites all descendant commits of bbb, so their hashes change and you can no longer easily rebase onto or pull from the remote branch (all your changes so far are local). Therefore use a fresh clone of your repository (e.g. copy the backup to a new folder) and delete all remote branches that contain at least one unwanted commit and all tags whose history includes one of the unwanted commits:

# delete remote branches
$ git push origin :<branch-name>
...
# delete the local branches (if still existent; use 'git branch' to check)
$ git branch -d <branch-name>
...
# delete remote tags
$ git push origin :refs/tags/<tag-name>
...
# delete local tags
$ git tag -d <tag-name>
...
# remove all the old data from the repository
$ rm -rf .git/refs/original/
$ git reflog expire --expire=now --all
$ git gc --prune=now
$ git gc --aggressive --prune=now
$ git push origin --all

If this is not possible, because no branch would survive (and you cannot delete the branch you are currently in), create a new branch from aaa and check it out in advance:

$ git checkout -b newmaster aaa
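To decide which branches and tags have to go in the cleanup above, git can list every ref whose history contains a given commit. The following is a minimal sketch in a throwaway repository; the branch names clean and feature, the tag names and the file names are hypothetical, and in a real repository you would pass the hash of an unwanted commit instead.

```shell
#!/bin/sh
set -eu

demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.name "Demo"
git config user.email "demo@example.com"

echo keep > keep.txt
git add keep.txt
git commit -q -m "keep"
git branch clean          # created before the unwanted commit
git tag v-clean

echo secret > secret.txt
git add secret.txt
git commit -q -m "unwanted"
unwanted=$(git rev-parse HEAD)
git branch feature        # created after the unwanted commit
git tag v-old

# Every local branch, remote-tracking branch and tag that contains the commit.
affected=$(git for-each-ref --contains "$unwanted" --format='%(refname)' \
    refs/heads refs/remotes refs/tags)
echo "$affected"
```

Everything that is printed would have to be deleted (or rewritten) before the old commits can really disappear; refs that are not listed, like clean and v-clean here, can stay.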

After that, the remote repository has no unwanted branches any more and all old data is deleted. Switch back to the repository where you ran the replace and filter-branch commands and remove the remote-tracking branches there (their counterparts no longer exist in the remote repository):

$ git branch -d -r origin/<branch-name>
...

Then push all your branches back to the origin (remote repository):

$ git push origin <branch-name>

Now the local and remote repositories are in sync with the new rewritten history. If you had to create the dummy branch newmaster before, you may now delete it.
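If you would rather check than trust that both sides are in sync, you can compare the local branch tip with what the remote advertises. This is a sketch against a throwaway bare repository standing in for origin; the branch name demo and the temporary directories are assumptions for illustration.

```shell
#!/bin/sh
set -eu

# A bare repository plays the role of the remote.
remote=$(mktemp -d)
work=$(mktemp -d)
git init -q --bare "$remote"

cd "$work"
git init -q .
git config user.name "Demo"
git config user.email "demo@example.com"
git remote add origin "$remote"
git checkout -q -b demo
echo hello > file.txt
git add file.txt
git commit -q -m "rewritten history"
git push -q origin demo

# Compare the local tip with the hash the remote reports for the same branch.
local_tip=$(git rev-parse refs/heads/demo)
remote_tip=$(git ls-remote origin refs/heads/demo | cut -f1)

if [ "$local_tip" = "$remote_tip" ]; then
    status="in sync"
else
    status="differs"
fi
echo "demo: $status"
```

In your real repository you would run only the last part, once per pushed branch.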

All other users who cloned the repo before the rewrite have to remove the remote-tracking branches of all branches with unwanted commits:

$ git branch -d -r origin/<branch-name>
...
# delete the local branches (if still existent; use 'git branch' to check)
$ git branch -d <branch-name>
...
# delete local tags
$ git tag -d <tag-name>
...
$ git pull
# checkout (and track) some remote branches
$ git checkout <branch-name>

Then they are done. Alternatively, delete the repository locally and create a fresh clone.

FUN! If you know a better solution that does not change the hash of so many commits and does not require removing the remote branches temporarily: PLEASE TELL ME.
