This blog post is intended to be a reference not only for you but for me, because I always forget how it works. “It”, in this case, is Git, or more specifically, GitHub. GitHub is a web site that hosts public Git repositories. It allows anyone in the world to collaborate on any GitHub project without having to worry about repository commit privileges or patch files. If you want to work on a GitHub project, you simply fork the project’s GitHub repository, creating a new GitHub repository in your name. After you make changes to your GitHub repository, you can send pull requests to other GitHub users to notify them of the changes, and they can pull your changes into their own repositories.
While GitHub does make social coding convenient, there is still a significant learning curve. I won’t discuss the process of installing Git or signing up for a GitHub account; you can find documentation for those elsewhere. The most confusing aspect of working with GitHub, in my opinion, is repository management, and that’s what I’ll explain here. My explanation will give you the steps of the GitHub workflow using an example project. I’ve chosen ClickToFlash as my example, because people are familiar with it, and I’ve contributed code to the project.
Mystery surrounds the origin of ClickToFlash. The project first appeared on Google Code, posted by an anonymous donor. Not long thereafter, the project disappeared without a trace. We still don’t know the identity of ClickToFlash’s author. (I suspect Holtzman, Holtzmann, or Holzmann.) Fortunately, several developers including ‘Wolf’ Rentzsch preserved the source code, and Rentzsch’s GitHub repository has become ‘official’. Other GitHub repositories such as my own and Simone Manganelli’s are forked from Rentzsch’s. That’s the place to start.
If you haven’t already forked ClickToFlash, you’ll see a “Fork” button on http://github.com/rentzsch/clicktoflash. When I clicked that button, it created a fork of the project at http://github.com/lapcat/clicktoflash. The fork is my public repository, which GitHub users can pull from. The catch is that I can’t work directly on my public repository, because I don’t have shell access to GitHub’s servers where the repository resides. Besides, I wouldn’t be able to run Xcode anyway. Thus, I have to clone the repository on my own local Mac. The URL of my public repository is listed on http://github.com/lapcat/clicktoflash. Actually, there are multiple URLs, but you’ll want to make sure to clone the SSH version, which is read-write. If you clone the read-only version, then you won’t be able to push changes back to the public repository.
$ git clone git@github.com:lapcat/clicktoflash.git
Your private, local clone automatically has a master branch that matches the master branch of your public, remote repository.
$ cd clicktoflash
$ git branch
* master
$ git status
# On branch master
nothing to commit (working directory clean)
The remote repository that you cloned is given the special name origin by your local clone.
$ git remote
origin
The local master branch also tracks the remote repository, so that git pull and git push automatically apply to origin when run with master checked out.
$ git remote show origin
* remote origin
  Fetch URL: git@github.com:lapcat/clicktoflash.git
  Push  URL: git@github.com:lapcat/clicktoflash.git
  HEAD branch: master
  Remote branches:
    cutting-edge tracked
    master       tracked
  Local branch configured for 'git pull':
    master merges with remote master
  Local ref configured for 'git push':
    master pushes to master (up to date)
Despite the fact that the local repository is a clone of origin, and origin is a fork of rentzsch, the local repository knows nothing of rentzsch. [Expletives censored.] If you want to pull changes from rentzsch, you need to add it as a remote repository. In this case, you can use the read-only URL, because you can’t push changes to his repository.
$ git remote add rentzsch git://github.com/rentzsch/clicktoflash.git
$ git remote
origin
rentzsch
Some people suggest upstream for the name of the remote repository, but I find this needlessly confusing. The name rentzsch tells me exactly where the changes are coming from. Unlike upstream, it’s not abstract or subject to misinterpretation.
Note that unless you pass the -f option to git remote add, the rentzsch remote is not automatically fetched, so you’ll need to fetch it manually. I also find this needlessly confusing and wish the default behavior were to fetch rather than not fetch. You might find yourself perplexed, for example, if you try to create a new branch from the remote.
$ git branch rentzsch-master rentzsch/master
fatal: Not a valid object name: 'rentzsch/master'.
$ git branch -r
  origin/HEAD -> origin/master
  origin/cutting-edge
  origin/master
$ git fetch rentzsch
$ git branch -r
  origin/HEAD -> origin/master
  origin/cutting-edge
  origin/master
  rentzsch/1.4.2-64bit
  rentzsch/cutting-edge
  rentzsch/gh-pages
  rentzsch/master
$ git branch rentzsch-master rentzsch/master
Branch rentzsch-master set up to track remote branch master from rentzsch.
I recommend that you create a branch specifically to track the forked repository, as I do in the last instruction above. Then no matter what changes you make, you can still look at the ‘official’ version of the project by checking out the rentzsch-master branch. If you use the remote branch rentzsch/master as the starting point for the local branch rentzsch-master, the local branch automatically tracks the remote repository rentzsch, just as the local master automatically tracks the remote origin.
$ git remote show rentzsch
* remote rentzsch
  Fetch URL: git://github.com/rentzsch/clicktoflash.git
  Push  URL: git://github.com/rentzsch/clicktoflash.git
  HEAD branch: master
  Remote branches:
    1.4.2-64bit  tracked
    cutting-edge tracked
    gh-pages     tracked
    master       tracked
  Local branch configured for 'git pull':
    rentzsch-master merges with remote master
  Local ref configured for 'git push':
    master pushes to master (fast-forwardable)
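If you would rather skip the separate fetch, the -f option mentioned earlier can be tried out safely in a sandbox. The following is only a sketch: the directories official, fork.git, and work are made-up local stand-ins for Rentzsch's repository, the GitHub fork, and the local clone, and nothing here touches GitHub.

```shell
#!/bin/sh
# Sandbox sketch: local directories stand in for the GitHub repositories.
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

sandbox=$(mktemp -d)
cd "$sandbox"

# "official" plays the role of rentzsch's repository (hypothetical stand-in).
git init -q official
git -C official symbolic-ref HEAD refs/heads/master
git -C official commit -q --allow-empty -m "Initial commit"

# "fork.git" plays the role of the GitHub fork; "work" is the local clone.
git clone -q --bare official fork.git
git clone -q fork.git work
cd work

# -f fetches the new remote immediately, so rentzsch/master exists right away
# and the branch command does not fail with "Not a valid object name".
git remote add -f rentzsch "$sandbox/official"
git branch rentzsch-master rentzsch/master

# Show which remote branch the new local branch tracks.
tracked=$(git rev-parse --abbrev-ref "rentzsch-master@{upstream}")
echo "rentzsch-master tracks $tracked"
```

With a real fork you would replace the local path with the read-only URL of the repository you forked from.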
When changes occur in the master branch of the remote rentzsch repository, here is the procedure for merging them:
$ git checkout rentzsch-master
$ git fetch
$ git merge rentzsch/master
$ git checkout master
$ git merge rentzsch-master
$ git push
You could use the one step git pull instead of the two steps git fetch and git merge rentzsch/master. However, I’ve heard it suggested that git pull sometimes causes problems, though that issue is beyond the scope of this blog post. Anyway, what you’re doing with these steps is first merging the remote rentzsch repository changes into the local rentzsch-master branch, then merging the local rentzsch-master branch into the local master branch, and finally pushing the local changes to the remote origin repository. The somewhat convoluted procedure is necessary because you cannot directly pull the remote rentzsch changes into origin; they have to go through the local repository.
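To convince yourself that the changes really do travel from rentzsch through the local repository to origin, you can rehearse the whole procedure in a sandbox. This is a rough sketch under a few assumptions: official, fork.git, and work are hypothetical local directories standing in for the three repositories, and I name the remote and branch explicitly in the push so the result doesn't depend on your push settings.

```shell
#!/bin/sh
# Sandbox sketch of the sync procedure; no GitHub access involved.
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

sandbox=$(mktemp -d)
cd "$sandbox"

# Hypothetical stand-ins: official = rentzsch, fork.git = origin, work = local clone.
git init -q official
git -C official symbolic-ref HEAD refs/heads/master
git -C official commit -q --allow-empty -m "Initial commit"
git clone -q --bare official fork.git
git clone -q fork.git work
git -C work remote add -f rentzsch "$sandbox/official"
git -C work branch rentzsch-master rentzsch/master

# A new change lands in the official repository.
git -C official commit -q --allow-empty -m "Upstream change"

cd work
git checkout -q rentzsch-master
git fetch -q                   # fetches from rentzsch, the remote this branch tracks
git merge -q rentzsch/master   # fast-forwards rentzsch-master
git checkout -q master
git merge -q rentzsch-master   # brings the change into local master
git push -q origin master      # publishes it to the fork

# The fork should now end with the upstream commit.
git log --oneline -1 origin/master
```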
The key to successful repository management, I believe, is to never write code on the local master branch. I’ve learned this important lesson by trial and error. In particular, if you try to merge changes from a remote repository into master while you have local changes on master that haven’t yet been pushed to origin, everything can blow up. It’s best to keep master as pure as possible. In fact, it’s best to keep all your tracking branches as pure as possible. With Git, branches are cheap. When you want to make local changes, always create and check out a new branch, and then merge the changes back into the tracking branch when you want to push.
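Here is roughly what that habit looks like in practice, again rehearsed in a sandbox. Treat it as a sketch: seed, fork.git, and work are made-up local directories, and my-fix and notes.txt are placeholder names for your own branch and files.

```shell
#!/bin/sh
# Sandbox sketch: keep master clean and do the work on a separate branch.
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

sandbox=$(mktemp -d)
cd "$sandbox"

# Hypothetical stand-ins: fork.git = origin, work = local clone.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
git -C seed commit -q --allow-empty -m "Initial commit"
git clone -q --bare seed fork.git
git clone -q fork.git work
cd work

# Do the work on a throwaway branch ("my-fix" is a made-up name).
git checkout -q -b my-fix
echo "hello" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# Until now master has not moved and nothing has been published.
# When the change is ready, merge it into the tracking branch and push.
git checkout -q master
git merge -q my-fix
git push -q origin master

# The work branch has served its purpose.
git branch -d my-fix
```

Because master only ever receives finished merges, merging from a remote into master never collides with half-done local work.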
As far as I can tell, origin by default will contain the same branches that existed in the rentzsch repository at the time you forked it. Consequently, http://github.com/lapcat/clicktoflash only has 2 branches, whereas Rentzsch’s GitHub repository has 4. In any case, the local clone only has master by default. New branches created in the local repository with local starting points are not automatically pushed to origin. This means you can safely hack on local code changes in a branch without exposing your mess to the public. If you want to work on another public ClickToFlash branch, such as rentzsch/cutting-edge instead of rentzsch/master, you’ll need to create a new local branch.
$ git branch rentzsch-cutting-edge rentzsch/cutting-edge
Branch rentzsch-cutting-edge set up to track remote branch cutting-edge from rentzsch.
$ git branch cutting-edge rentzsch-cutting-edge
$ git checkout cutting-edge
Switched to branch 'cutting-edge'
Again, we have both a branch rentzsch-cutting-edge that is a duplicate of rentzsch/cutting-edge and a branch cutting-edge that includes your changes. This mirrors the arrangement of the branches rentzsch-master and master. If origin already contains a cutting-edge branch, then git push should be sufficient to push your local changes. (Beware: in another maddening default behavior, git push will push all local branches that have a matching branch on origin, not just the currently checked out branch.) On the other hand, if origin does not yet contain a cutting-edge branch, you’ll need to use git push origin cutting-edge to create the branch on origin.
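If the push-everything default makes you nervous, naming both the remote and the branch is one way to publish exactly one branch. The sandbox sketch below assumes made-up local directories (seed, fork.git, work) in place of GitHub, and scratch is a hypothetical private branch that should stay private.

```shell
#!/bin/sh
# Sandbox sketch: publish one branch explicitly and leave the others local.
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

sandbox=$(mktemp -d)
cd "$sandbox"

# Hypothetical stand-ins: fork.git = origin, work = local clone.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
git -C seed commit -q --allow-empty -m "Initial commit"
git clone -q --bare seed fork.git
git clone -q fork.git work
cd work

git branch cutting-edge   # new local branch, not yet on origin
git branch scratch        # private experiment (made-up name)

# Naming the remote and the branch pushes only that branch,
# and creates it on origin if it does not exist there yet.
git push -q origin cutting-edge

# List the branches that origin now has.
git ls-remote --heads origin
```

The behavior of a bare git push is governed by the push.default setting. As far as I know, Git 2.0 and later default to simple, which pushes only the current branch, so the warning above mostly applies to older installations.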
I hope this mini tutorial helps you to work with GitHub more efficiently and with fewer headaches (from banging your head against the wall). If you have further questions, feel free to ask … someone else, because I don’t know the answer.