2 Git, the concept


I AM BY PROFESSION an engineer; I build and programme control systems for a living (industrial machinery, pharmaceutical manufacturing, even tritium handling systems for nuclear reactors). The software that goes in these systems is (usually) rigorously tested, and anything that goes into a regulated environment (pharmaceutical, nuclear, safety systems &c.) has quite stringent version control and traceability requirements.

Version control is important, particularly when the reactor melts down and you want to know who’s to blame†1. It’s also important when you’ve completely screwed things up and you need to go back to an earlier, less screwed up version.

Now, I write industrial software, and these applications usually have their own development environments and version tracking mechanisms that are particular to the product being used (Siemens, ABB &c.). Other things such as documents and drawings are controlled by the in-house version control system that we operate in the office.

When I started the Practical Series website, I wanted some form of version control—it was just me working on it, so I didn’t need anything too ambitious and in the early days I just kept increasing the revision number and backing up everything each time I did so.

And this was fine, up to a point. Websites are not particularly massive things, a few megabytes; and having multiple copies doesn’t tax a modern hard drive particularly. So what was the problem?

Well the main problem was that I wasn’t documenting the revisions properly; the website has a lot of files (way over a thousand) and while I had a copy of every file at every revision I didn’t necessarily know which files had changed from one revision to another.

It was also a very inefficient mechanism; there were probably some files in there (images for example) that were in at the first revision and were backed up without change in every subsequent revision up to the latest (there were over 150 when I stopped).

It also became complicated when I wanted to work on two different things at once; I might for example be correcting typos in one section that had just been proofread, while developing a new section from scratch. It was difficult to keep track of each.

So what I had worked, sort of: I could always go back to an earlier version. But it was laborious; I would have to guess which revision I wanted, unzip it and then look at the file to see if it was the right version; if it wasn’t, I had to guess another revision and do the same until I found the one I was looking for. This was OK in the early days, but at the point where I switched over to proper version control I had 78,000 files in 35,000 folders and 150 zipped revisions. The whole thing was taking up 7 GB, most of which was identical copies of files that hadn’t changed between revisions.

It was at this point that I decided I needed some proper version control, and while I could have used the office system, that didn’t feel right; the website was nothing to do with my work (more an expensive hobby really) and I didn’t want to take advantage. Neither did I want to buy the office version for home use; it was just too expensive and over the top for what I needed.

So I had a look around; there are lots of different version control systems out there, some free, some commercial applications. The question is: which one to use?

One thing that gave me a clue was the software I had used in developing the website; I had used various pieces of open source software (some CSS files, some JavaScript code) that improved or added functionality to the website (things like lightbox images and formulae on web pages). I noticed, as I researched these things, that virtually all of the people developing these applications used GitHub as their version control system.

So I decided to have a look at GitHub, and I realised that it was an online hosting service built on Git, Git being a version control system that runs on a PC or Mac.

My conclusion is that both Git and GitHub are complicated bastards that are difficult to understand, especially Git. Git in its native form uses a command line interface (just like MS-DOS and those text adventures from the eighties). Both applications also use some fairly peculiar and non-intuitive terminology that gives the whole thing a pretty steep learning curve. It all takes a fair bit of hard work to understand properly.

There are three parts to it really:

  1. The first part is understanding the concepts of how Git manages version control (the theory if you will)

  2. The second part is installing it all and making it work (and that’s not as easy as it sounds)

  3. The third part is using something better than Git’s command line to manage it all. I’ve chosen to use Brackets and some specialist extensions that make the whole thing much easier to use (it moves away from the command line environment to something more modern)

This chapter is concerned with the first of these three; I explain how Git works, how it operates as a version control system and what the hell all that peculiar terminology means.

I should say right from the start that Git and GitHub are designed by geeks for geeks, and by God does it show. They belong to a bit of a club where those in the club don’t really want to explain it to those outside; they say they do (because it’s open source, left-wing and trendy) but they don’t. There’s lots of information, but it’s all designed to be a bit intimidating and to make you feel, well, stupid. It reminds me very much of the gramophone sketch by the Not the Nine O’Clock News team. It’s the same attitude.

So if you don’t want a bag on your head, read on. I hope I’ve explained it better here.

†1 I worked for one company that shall be nameless (let’s just call them Consolidated Gyroscope); they operated a blame-free culture intended to prevent accidents by encouraging people to report the smallest incident, the idea being that if you stop the small things, you will prevent the larger things. This was fine for small things; nobody minds being told to use a coaster for their coffee cup. Bit different when the building burns down.
It did lead to a bit of a sub-culture—whilst they operated a blame-free policy, they did like to know whose fault it was.
