2 Git, the concept


Moving data between local and remote repositories

Lots more London Underground maps.

Consider the lab-01-website we had at the end of section 2.4 (figure 2.26); it looked like this:

Figure 2.27 - Final arrangement, all merged back to master branch


This is the project currently stored in the local repository on our machine.

Now let’s say we want to copy all this to a remote repository on GitHub (I explain how to set up a GitHub account in section 4.1).

We must first create an empty repository in a GitHub profile that will receive this local repository (again, I explain how to do this in section 4.2). For simplicity I will assume that the remote repository has the same name as the local repository, i.e. lab-01-website.

  • There is no requirement for local and remote repositories to have the same name, but I find it less confusing if they do.

Finally, a secure communication link must be established between the local repository and the remote repository. This is referred to as a secure shell (SSH) link and I explain how to create and test this link in section 4.2.

Assuming these three conditions are met (the GitHub profile is set up, an empty remote repository is present and an SSH link is established), we can proceed.


Sending (pushing) data to a remote repository

A local repository is copied to a remote repository with the use of the push command. In Git terminology this is referred to as pushing your repository or sometimes pushing upstream (it all sounds a bit rude).

Now we have this:

Figure 2.28 - lab-01-website—push to remote repository


All the commit points on the master branch of the local repository have been sent to the remote repository.

By default, a push applies to a single branch. If no branch is specified, Git will assume you are pushing the currently active branch (where HEAD is).
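As a rough sketch, this is what a push might look like from the Git command line. To keep it runnable without a GitHub account, a bare repository in a temporary folder stands in for the remote; that stand-in, the file name index.html, the commit message and the example user identity are all assumptions made for illustration. With GitHub, the URL given to `git remote add` would be the SSH address of the empty remote repository.

```shell
#!/bin/sh
# Sketch: push the master branch of a local repository to a remote.
# Assumption: a bare repository in a temporary folder stands in for GitHub.
set -e

# Throwaway identity so the commit works on any machine (illustrative values)
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work=$(mktemp -d)
cd "$work"

# The empty "remote" repository (on GitHub this is created through the website)
git -c init.defaultBranch=master init -q --bare remote.git

# A local repository with a single commit on master
git -c init.defaultBranch=master init -q lab-01-website
cd lab-01-website
echo "<h1>Lab 01</h1>" > index.html
git add index.html
git commit -q -m "Initial commit"

# Link the local repository to the remote under a short name
git remote add origin "$work/remote.git"

# Push master; -u records the remote branch as the upstream of local master,
# so later pushes and pulls on this branch need no extra arguments
git push -q -u origin master

# Both repositories should now report the same commit for master
local_head=$(git rev-parse master)
remote_head=$(git --git-dir="$work/remote.git" rev-parse master)
echo "local:  $local_head"
echo "remote: $remote_head"
```

Naming the branch explicitly (`git push origin master`) avoids relying on whichever branch happens to be active.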


Getting (pulling) data from a remote repository

Let’s assume that some other user has added another commit to the remote repository. The remote repository now looks like Figure 2.29:

Figure 2.29 - Modified remote repository


There is a new commit at the top of the list [ad457b6].

To get the new commit from the remote repository into the local repository we use (unsurprisingly) a pull command. It works like this:

Figure 2.30 - lab-01-website—pull from remote repository


Again, the pull command only applies to a particular branch (or the currently active branch if none is specified).
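A minimal command line sketch of a pull might look like the following. It is an illustration under assumptions: local folders stand in for GitHub and for the other user's machine, and the folder names (mine, other), file contents and commit messages are invented for the example.

```shell
#!/bin/sh
# Sketch: pull a commit that someone else added to the remote.
# Assumption: local folders stand in for GitHub and for the other user's machine.
set -e

# Throwaway identity so the commits work on any machine (illustrative values)
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work=$(mktemp -d)
cd "$work"
git -c init.defaultBranch=master init -q --bare remote.git

# Your local repository: one commit, pushed to the remote
git -c init.defaultBranch=master init -q mine
echo "<h1>Lab 01</h1>" > mine/index.html
git -C mine add index.html
git -C mine commit -q -m "Initial commit"
git -C mine remote add origin "$work/remote.git"
git -C mine push -q -u origin master

# Some other user clones the remote, adds a commit and pushes it
git clone -q "$work/remote.git" other
echo "<p>New paragraph</p>" >> other/index.html
git -C other commit -q -a -m "Add a paragraph"
git -C other push -q origin master

# Your repository is now one commit behind; pull brings it up to date
git -C mine pull -q origin master

commits=$(git -C mine rev-list --count master)
mine_head=$(git -C mine rev-parse master)
remote_head=$(git --git-dir="$work/remote.git" rev-parse master)
echo "commits on local master: $commits"
```

Because the local repository had no new commits of its own here, the pull simply moves master forward to the new commit; nothing needs merging.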


What happens if there is a conflict with the remote

This happens when your local repository is out of date with the remote repository and is a perfectly normal situation.

Let’s assume that you have pulled a copy of the repository into your local repository and you are quite happily working away on it.

In the meantime some other thoughtless bastard also pulls a copy of the remote repository onto their local machine and they also start working on something.

Now let’s assume that the other person finishes first†1 and pushes all their changes back to the remote repository.

At some point you will finish your work and try to push your changes back to the remote. You will be disappointed. Your changes will be rejected.

So, what to do?

Well, in this situation you have to pull the latest version from the remote repository and combine its changes into your local repository. This is very much like merging branches (§ 2.4).

Once you’ve pulled the remote repository and resolved any conflicts, you can now push everything back to the remote repository with your changes (unless of course someone else has modified it again while you were doing all this—bastards).

I work through a full example of this scenario in section 8.
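In the meantime, here is a small command line sketch of the rejected push and its fix. It rests on assumptions: local folders stand in for GitHub and for the other person's machine, the two users edit different files so the merge needs no manual conflict resolution, and the pull is told explicitly to merge (`--no-rebase`) rather than rebase. All file names and messages are made up.

```shell
#!/bin/sh
# Sketch: a push is rejected because the remote has moved on; pull, then push.
# Assumption: local folders stand in for GitHub and for the other user.
set -e

# Throwaway identity so the commits work on any machine (illustrative values)
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work=$(mktemp -d)
cd "$work"
git -c init.defaultBranch=master init -q --bare remote.git

# Your repository, with a first commit already on the remote
git -c init.defaultBranch=master init -q mine
echo "<h1>Lab 01</h1>" > mine/index.html
git -C mine add index.html
git -C mine commit -q -m "Initial commit"
git -C mine remote add origin "$work/remote.git"
git -C mine push -q -u origin master

# The other person copies the remote, finishes first and pushes
git clone -q "$work/remote.git" other
echo "body { margin: 0; }" > other/style.css
git -C other add style.css
git -C other commit -q -m "Add stylesheet"
git -C other push -q origin master

# You commit your own work and try to push; Git refuses
echo "<p>My change</p>" >> mine/index.html
git -C mine commit -q -a -m "Add my paragraph"
if git -C mine push -q origin master 2>/dev/null; then
    rejected=no
else
    rejected=yes
    echo "push rejected: the remote has a commit you do not have yet"
fi

# Pull the remote changes and merge them into your work, then push again
git -C mine pull -q --no-rebase --no-edit origin master
git -C mine push -q origin master

mine_head=$(git -C mine rev-parse master)
remote_head=$(git --git-dir="$work/remote.git" rev-parse master)
```

If both users had changed the same lines, the pull would stop and ask for the conflict to be resolved and committed before the final push.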

†1 This seems to be the normal state of affairs for me; I’m usually the last to finish. I used to think it was because they gave me the hard stuff to do. In the end I realised that this wasn’t the case—turns out I’m just slow.


Creating a local repository from an existing remote

This is a very common situation. There is a remote repository somewhere with a project in it that people have already been working on. You also want to work on it, so you need to copy the remote repository to a local repository on your machine.

It’s easy. Let’s say the lab-01-website repository exists on GitHub and you’ve been given user access to it and you want to copy it to your machine as a local repository.

In Git you would navigate to wherever you want the new repository folder to be located; in my case it would be here:

D:\2500 Git Projects\

Next, use the clone command. This creates a new repository folder, initialises it as a local repository and copies all the data from the remote repository into it.

We now have a local repository linked to the remote repository. The new local repository is completely up to date and matches the remote repository exactly (at least it did at the point of creation; regular pull commands must be executed to keep it up to date).

There is a worked example of this, executed through Brackets, in section 5.3.
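For comparison, a command line sketch of the same steps could look like this. It assumes a bare repository in a temporary folder as a stand-in for the GitHub repository, and a projects folder in place of my own project directory; the seed repository exists only to give the stand-in some content. With GitHub, the argument to `git clone` would be the repository's SSH address instead of a local path.

```shell
#!/bin/sh
# Sketch: create a local repository from an existing remote with git clone.
# Assumption: a bare repository in a temporary folder stands in for GitHub.
set -e

# Throwaway identity so the commit works on any machine (illustrative values)
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work=$(mktemp -d)
cd "$work"

# Stand-in for the existing remote repository, seeded with one commit
git -c init.defaultBranch=master init -q --bare lab-01-website.git
git -c init.defaultBranch=master init -q seed
echo "<h1>Lab 01</h1>" > seed/index.html
git -C seed add index.html
git -C seed commit -q -m "Initial commit"
git -C seed push -q "$work/lab-01-website.git" master

# Navigate to where the new repository folder should live, then clone.
# Git names the new folder after the repository: lab-01-website
mkdir projects
cd projects
git clone -q "$work/lab-01-website.git"

# The clone is already linked to the remote it came from
cd lab-01-website
git remote -v
git --no-pager log --oneline

remote_name=$(git remote)
clone_head=$(git rev-parse HEAD)
remote_head=$(git --git-dir="$work/lab-01-website.git" rev-parse master)
```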


A note on remote connection names

When dealing with a remote repository, that repository has a name (in the previous example it was lab-01-website). The local repository can have a completely different name; you can call it whatever you want (in my case, I always give the local and remote repositories the same name. This is just my preference; I think it avoids confusion).

The thing that joins these two repositories is a communication link, a URL (like a web address that points to the remote repository). In the previous example, the URL would be something like: git@github.com:practicalseries-lab/lab-01-website.git, which, I’m sure you’ll agree, would be a pain in the arse if you had to type it in every time you wanted to push data to the remote.

To get round this, Git gives the connection a shorter (but unique) name. By default, the first name it assigns is origin. There is nothing special about this name; you could assign any name you want. However, hardly anyone ever changes it (a bit like the master branch). Most connections are called origin. I tend to stick with it.
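A short sketch of how the connection name can be inspected and changed from the command line. The URL is the example address from above; the replacement name github is an arbitrary choice for illustration. Nothing here contacts GitHub, since adding or renaming a remote only edits the local repository's configuration.

```shell
#!/bin/sh
# Sketch: the connection name is just a label for the remote URL.
set -e

work=$(mktemp -d)
cd "$work"
git -c init.defaultBranch=master init -q lab-01-website
cd lab-01-website

# Give the long URL the short name "origin" (no network access happens here)
git remote add origin git@github.com:practicalseries-lab/lab-01-website.git

# List the connections and the URLs they point to
git remote -v

# The name can be changed at any time ("github" is an arbitrary example)
git remote rename origin github

name=$(git remote)
url=$(git config --get remote.github.url)
echo "$name -> $url"
```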


Working with remotes—best practice

This is a short list:

  1. Use the same name for local and remote repositories

  2. Update the local repository frequently

     • Store local changes (commit changes)
     • Update the local repository (pull from remote)

  3. Always update and resolve conflicts before pushing changes to the remote repository

     • Store local changes (commit changes)
     • Update the local repository (pull from remote)
     • Update the remote repository (push to remote)
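The commit, pull, push order can be wrapped in a small helper. This is only a sketch under assumptions: `sync_branch` is a hypothetical function name, it stages every change in the working folder (which may be more than you want), it merges rather than rebases on pull, and a bare repository in a temporary folder again stands in for GitHub.

```shell
#!/bin/sh
# Sketch: commit, pull, push in that order for one branch.
# sync_branch is a hypothetical helper, not a Git command.
set -e

# Throwaway identity so the commits work on any machine (illustrative values)
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

sync_branch() {
    branch=$1
    message=$2
    # 1. Store local changes (stages everything, commits only if needed)
    git add -A
    git diff --cached --quiet || git commit -q -m "$message"
    # 2. Update the local repository; stops here if a conflict needs resolving
    git pull -q --no-rebase --no-edit origin "$branch"
    # 3. Update the remote repository
    git push -q origin "$branch"
}

# Demonstration with a stand-in remote
work=$(mktemp -d)
cd "$work"
git -c init.defaultBranch=master init -q --bare remote.git
git -c init.defaultBranch=master init -q lab-01-website
cd lab-01-website
echo "<h1>Lab 01</h1>" > index.html
git add index.html
git commit -q -m "Initial commit"
git remote add origin "$work/remote.git"
git push -q -u origin master

# Make a change, then run the three steps in one go
echo "<p>More content</p>" >> index.html
sync_branch master "Add more content"

local_head=$(git rev-parse master)
remote_head=$(git --git-dir="$work/remote.git" rev-parse master)
commits=$(git rev-list --count master)
pending=$(git status --porcelain)
```

Because the script runs with `set -e`, a pull that ends in a conflict stops the helper before anything is pushed, which matches the rule of resolving conflicts first.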
