Migrating Modules/Directories Between Git Repos

Disclaimer: I am not very good at using Git. The vast majority of the time I only need a tiny subset of its features, and my usage of Git most likely resembles how I used SVN in the past. Hence this post is really a collection of resources I found online, which I document here for my own future benefit.

Working on large projects often means moving code between repos. For example, developers frequently start building an application with all the code in one repo, but as the application matures, or as parts of it are extracted into ‘framework’-like code, those parts need to move to a new repository. It is always beneficial to preserve the history of those modules, directories and files.

I came across such a situation recently and below are the steps I followed to achieve that.

Setup

For clarity, let’s assume our setup is as below:

  • SourceRepo is the repository where the code we want to migrate currently lives
  • Dir_I, Dir_II, Dir_III are the directories (with the files and subdirectories inside them) from SourceRepo that we need to migrate
  • TargetRepo is the new repository, which already exists and into which we need to migrate the above 3 directories, including their history

Steps

  • Clone the SourceRepo
git clone git@github.com:username/SourceRepo.git
cd SourceRepo
  • Remove the remote origin

This is optional and mainly done as a precaution, so that nothing is pushed to the remote origin by accident.

git remote rm origin
  • Filter the commits in all branches down to those related to the directories we want to keep

This step will blow away every other directory in the SourceRepo, only keeping the above 3 directories, along with their history.

git filter-branch --index-filter 'git rm --cached --quiet -r --ignore-unmatch -- . && git reset $GIT_COMMIT -- Dir_I Dir_II Dir_III' --tag-name-filter cat -- --all

This step will result in the final form of SourceRepo.
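Before running the filter on a real clone, it can help to try it in a sandbox. The following is a minimal sketch, assuming a throwaway repository created under a temporary directory; the directory names Dir_I and Other, the demo identity and the commit messages are made up for illustration. It keeps only Dir_I and then checks what is left.

```shell
#!/bin/sh
# Sandbox run of the filtering step on a throwaway repository.
# Everything created here (temp dir, Dir_I, Other, demo identity) is illustrative.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # newer Git versions warn and pause otherwise
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

sandbox=$(mktemp -d)
git init -q "$sandbox/SourceRepo"
cd "$sandbox/SourceRepo"
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master

# Two commits: one adds both directories, one changes only Dir_I
mkdir Dir_I Other
echo one > Dir_I/a.txt
echo two > Other/b.txt
git add .
git commit -q -m "add Dir_I and Other"
echo three >> Dir_I/a.txt
git commit -q -am "change Dir_I"

# Single quotes, so that $GIT_COMMIT is expanded by filter-branch and not by this shell
git filter-branch -f --index-filter \
  'git rm --cached --quiet -r --ignore-unmatch -- . && git reset $GIT_COMMIT -- Dir_I' \
  --tag-name-filter cat -- --all >/dev/null 2>&1

# Top-level entries left in the latest commit, and number of commits touching Dir_I
remaining=$(git ls-tree --name-only HEAD)
dir_commits=$(git rev-list --count HEAD -- Dir_I)
echo "remaining: $remaining"
echo "commits touching Dir_I: $dir_commits"
```

If the Other directory still shows up in the output, the filter did not do what was intended and the real clone should not be touched yet.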

  • Clone the TargetRepo
git clone git@github.com:username/TargetRepo.git
cd TargetRepo
  • Add the SourceRepo as a remote to the TargetRepo
git remote add localSourceRepo ../SourceRepo
  • Fetch the branches and history of the newly added remote localSourceRepo
git fetch localSourceRepo
  • Create a branch out of localSourceRepo

This branch will effectively contain all the directories that were kept from SourceRepo, along with their old Git history.

git branch temp remotes/localSourceRepo/master
  • Create a branch out of TargetRepo master

This is an optional step, but it gives us an intermediary branch in which to merge the changes from SourceRepo and TargetRepo. Once the merge and any conflict resolution are complete, someone can raise a PR to merge this branch into master.

git checkout -b fromMaster
  • Merge SourceRepo branch to TargetRepo branch

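In some cases (e.g. TargetRepo already has an identically named directory) conflicts will arise, and they can be resolved in this step. Note that Git 2.9 and later refuse to merge branches that share no common ancestor, which is the case here when TargetRepo was not created from SourceRepo; in that case, add the --allow-unrelated-histories flag to the merge command.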

git merge temp
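The fetch, branch and merge steps can also be rehearsed in a sandbox. This is only a sketch under a few assumptions: Git 2.9 or later (for --allow-unrelated-histories), two throwaway repositories created on the spot, and a hypothetical helper make_repo that is not part of Git. It checks that the files from both repositories end up on fromMaster and that the commits for Dir_I came along.

```shell
#!/bin/sh
# Sandbox run of the fetch/branch/merge steps with two throwaway repositories.
# make_repo and the file names are illustrative only.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

sandbox=$(mktemp -d)
cd "$sandbox"

# make_repo NAME FILE: create a repository NAME with one commit that adds FILE
make_repo() {
  git init -q "$1"
  (
    cd "$1"
    git symbolic-ref HEAD refs/heads/master
    mkdir -p "$(dirname "$2")"
    echo content > "$2"
    git add .
    git commit -q -m "add $2"
  )
}
make_repo SourceRepo Dir_I/a.txt   # stands in for the already filtered SourceRepo
make_repo TargetRepo README.txt

cd TargetRepo
git remote add localSourceRepo ../SourceRepo
git fetch -q localSourceRepo
git branch temp remotes/localSourceRepo/master >/dev/null
git checkout -q -b fromMaster

# The two histories share no common ancestor, so the explicit flag is needed
git merge -q --allow-unrelated-histories -m "Merge SourceRepo history" temp

# Files now present on fromMaster, and commits that touch Dir_I
merged_files=$(git ls-tree -r --name-only HEAD | sort | tr '\n' ' ')
dir_commits=$(git rev-list --count HEAD -- Dir_I)
echo "files: $merged_files"
echo "commits touching Dir_I: $dir_commits"
```

On Git versions older than 2.9 the flag does not exist and a plain git merge temp should behave the same way.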
  • Once happy with the merging, a PR can be raised to merge fromMaster branch to master
  • Cleanup

This is an optional step. Most likely you will simply delete all the temporary clones and directories that were created. But for the benefit of the reader: we no longer need the localSourceRepo remote that we added, nor the temp and fromMaster branches (assuming fromMaster has already been merged).

git checkout master
git remote rm localSourceRepo
git branch -D temp
git branch -D fromMaster

Resources