Github: Your Single Point Of Failure
To say that I love Github would be a bit of an understatement. I more than recommend it when describing code review processes. At Mozilla, the web development team uses Github for our code reviews, since line notes and pull requests work perfectly with our code review requirements. Github allows a large distributed team to work independently while still working together.
However, Github has recently experienced some issues with its performance. Thankfully, most of these issues have been minor. But they highlight a serious potential flaw in using Github for critical development processes:
Github is a single point of failure.
How Github and Git work together
Git lets every developer work independently and hold a complete copy of the repository at all times. So there’s little to no risk of data loss, apart from work on a particular programmer’s computer that hasn’t yet been shared with others. This is different from a centralized version control system like Subversion, where the full history lives only on the central server.
Github doesn’t change this model in any way, other than offering a centralized canonical repository for project commits. Each fork is essentially an independent repository that exists for that particular developer, and their local clone of that repository is another copy as well. The developer pushes their commits, and Github’s pull request process essentially attempts to merge one group of commits into another; this allows developers to share their work upstream with one another.
The potential problem for developers and companies
By design, each developer has a full copy of the repository. This means that there’s little risk of data loss for a company that uses a service like Github. The problem arises because Github has been worked into the core of a number of companies’ development processes.
For example, companies that have developers work locally and push their code to Github for deploys have no control over the infrastructure on which their code resides. If they wish to do a deploy, but Github is down, they are unable to deploy their code.
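To make the dependency concrete, here is a minimal Python sketch of a deploy step that has to fetch code before it can ship anything. Everything in it is hypothetical and exists only for illustration: the names `fetch_for_deploy`, `make_fake_fetch` and `RemoteUnavailable`, and the remote names `github` and `internal-mirror`. The fetch is simulated, so no real Git or network calls are made.

```python
from typing import Callable, Sequence, Set


class RemoteUnavailable(Exception):
    """Raised when a remote cannot be reached (hypothetical error type)."""


def fetch_for_deploy(remotes: Sequence[str], fetch: Callable[[str], str]) -> str:
    """Try each remote in order and return the first revision fetched.

    Raises RemoteUnavailable if every remote in the list fails.
    """
    errors = []
    for name in remotes:
        try:
            return fetch(name)
        except RemoteUnavailable as exc:
            errors.append(f"{name}: {exc}")
    raise RemoteUnavailable("; ".join(errors) or "no remotes configured")


def make_fake_fetch(down: Set[str]) -> Callable[[str], str]:
    """Build a simulated fetch function; remotes listed in `down` are unreachable."""
    def fetch(name: str) -> str:
        if name in down:
            raise RemoteUnavailable("unreachable")
        return f"revision-from-{name}"
    return fetch


# Simulate an outage of the only remote the deploy process knows about.
fetch = make_fake_fetch(down={"github"})
try:
    fetch_for_deploy(["github"], fetch)
except RemoteUnavailable as exc:
    print("deploy blocked:", exc)

# With a second, independently hosted remote, the same deploy can proceed.
print(fetch_for_deploy(["github", "internal-mirror"], fetch))
```

With a single remote in the list, one outage blocks the deploy even though every developer still holds a complete copy of the code. The second call only succeeds because the list contains a remote that does not share the first one’s infrastructure, which is an assumption about the setup rather than something Github provides.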
Github also appears to share infrastructure between public and private repositories, making their paid clients as susceptible to downtime as their free users. This means Github is essentially charging companies and developers for not publishing their private code to the rest of the world, but not offering any kind of SLA for uptime.
Github does offer enterprise options that allow companies to host Github on their own servers. But these options are realistic only for organizations with a decent IT budget, not for everyone. Github offers no packages that come with a known SLA.
So what is the solution?
Unfortunately, I don’t have an easy solution to this problem. Github is best-of-breed and is far superior to other tools like Google Code. The Object Oriented PHP Masterclass will rely heavily upon Github’s tools to aid students in correcting their code, and I doubt I’ll see my team change from Github to something else in the near future.
I would love to see the open source community come up with a reasonable replacement for Github, though the power of Github is in the fact that money changes hands, and that is a terribly powerful motivator for creating a beautiful product.
I believe Github will continue to grow, that their stability issues will eventually settle, and that things will improve. But I still feel uncomfortable with my (and everyone else’s) single point of failure.
Update: There is a possible GitHub replacement, known as Gitlab. Gitlab is open source, released under the MIT license, and (ironically) hosted on Github. Notably, Gitlab is built on Ruby on Rails; you should understand the security implications of Ruby on Rails before using Gitlab.