2

Git, the concept

2.5

Resetting to an older commit point

Recovering data from a previous commit point can be accomplished by using the reset command.

Reset—a word of warning

Reset is the process of switching a project to a previous commit point.

RESET IS A VERY CONFUSING ASPECT OF GIT,
IT’S A RIGHT BASTARD TO UNDERSTAND

There are three types of reset: soft, mixed and hard. The differences are to do with what happens to files in the working directory and staging area.

I will explain all of them, but before I do, I need to set up a project by way of an example that we can work with:

2.5.1

A project to explain the reset

The easiest type of reset to understand is the hard reset; it resets a project back to an earlier commit and replaces all the files in the working directory with the files that were present at the time of the specified commit.

To explain this let’s consider a simple project with just one file (test.txt). I will go through modifying the file and creating commits in a step-by-step way to demonstrate exactly what happens with a reset.

Let’s assume we have a new project and we’ve just created a new file called test.txt. This file is present in the working area only; we haven’t added or committed it yet (step 1). Next, add the file to the staging area (step 2) and then commit it to the local repository (step 3):

Action Result
Step 1 Reset - step 1
In a new repository create the file test.txt and add some text to it. This is now in the working area and is version V01 of the file.
Step 2 Reset - step 2
Add the file to put it in the staging area.
Step 3 Reset - step 3
Now commit the file to the local repository.

This gives our first commit [ac452db].

Now let’s modify the test.txt file (making it V02) and repeat the add and commit process to give our second commit [5493a7c] (step 4).

Action Result
Step 4 Reset - step 4

Modify test.txt. It is now V02.

Now add and commit the file to the local repository.

Now we do it all again for a third commit [22af4b5] and a file at V03 (step 5).

Action Result
Step 5 Reset - step 5

Modify test.txt again. It is now V03.

Now add and commit the file to the local repository to give 3 commits in total.
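If you want to follow along at the command line, steps 1 to 5 might look something like the sketch below. It assumes a Unix-style shell, a throwaway directory and a placeholder identity, and it uses the literal text V01, V02 and V03 as the file contents. Your commit hashes will not match the ones quoted in the text.

```shell
# Work in a throwaway directory so nothing real is touched.
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"        # placeholder identity (an assumption)
git config user.email "user@example.com"

# Steps 1 to 3: create test.txt (V01), stage it, commit it.
echo "V01" > test.txt
git add test.txt
git commit -q -m "Version V01"

# Step 4: modify the file (V02), then add and commit.
echo "V02" > test.txt
git add test.txt
git commit -q -m "Version V02"

# Step 5: modify the file again (V03), then add and commit.
echo "V03" > test.txt
git add test.txt
git commit -q -m "Version V03"

# Three commits, newest first.
git log --oneline
```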

  • Some of you may be wondering why I show files in the staged area after a commit, when in Figure 2.7 to Figure 2.10 I showed it as empty. In practice, the staged area does still hold the files (I showed it as empty to make the explanation easier to understand).

The staged area always holds the files, but if they are the same as the committed files, Git considers it to be empty—it isn’t, but there is nothing in there that requires action.
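You can see this for yourself. The sketch below (same assumptions as before: a throwaway directory and a placeholder identity) commits a file and then asks Git what is in the staging area and whether any of it differs from the last commit.

```shell
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"        # placeholder identity (an assumption)
git config user.email "user@example.com"
echo "V01" > test.txt
git add test.txt
git commit -q -m "Version V01"

# The staging area still lists test.txt after the commit...
git ls-files --stage

# ...but it is identical to the commit, so nothing in it requires action.
# With --quiet, git diff exits with 0 when there are no differences.
if git diff --cached --quiet; then
    echo "nothing staged that differs from the last commit"
fi
```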

Finally, let’s modify the working copy of test.txt (making it V04) and add it to the staging area (step 6). Do not commit these changes; this is work in progress and I will use it to show how the different types of reset work. We have this as the final step:

Action Result
Step 6 Reset - step 6

Modify test.txt again. It is now V04.

Now add the file to the staging area, but do not commit it.

I’ve shown the modified and staged file version in red.
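As a command-line sketch (throwaway directory, placeholder identity, and a loop standing in for steps 1 to 5), step 6 is just a modify and an add with no commit:

```shell
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"        # placeholder identity (an assumption)
git config user.email "user@example.com"
for v in V01 V02 V03; do                   # steps 1 to 5 in brief
    echo "$v" > test.txt
    git add test.txt
    git commit -q -m "Version $v"
done

# Step 6: modify the working copy (V04) and stage it, but do not commit.
echo "V04" > test.txt
git add test.txt

git status --short    # prints "M  test.txt": modified and staged
```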

Now let’s look at what a reset does.

Reset—a reassuring point

The reset process can be destructive; it can overwrite data (it’s one of the few things that Git does that can lose your data). That said:

A reset will never change or delete committed data.

Committed data is safe from the reset itself: a reset never deletes a commit point, and you can go back to it as long as it hasn’t been pruned (see the footnote in § 2.5.3).

The worst a reset can do is overwrite changes in working or staged files.

I will start with a hard reset; like a hard Brexit, it’s the option that makes most sense.

2.5.2

A hard reset

Ok, we now have a project with three commits in it. The head is on the master branch and is at the latest of those commits [22af4b5]. All well and good; this is exactly the same as the example we worked through in the previous section.

We’ve also got a modified working copy file that has also been staged but not committed (step 6 above).

A reset (any reset) moves the head to a different commit point (a bit like switching branches). A hard reset overwrites the files in the working and staged areas with the files from the commit point that we are resetting to.

For example, if we hard reset to the commit point [5493a7c] then we would get:

Action Result

Reset (hard) to [5493a7c].

Head moves to a different commit point. This overwrites the staged and working areas with the files from the [5493a7c] commit point.

Hard reset to an earlier commit point

At this point we have lost the V04 changes to test.txt that were in the working and staged areas (step 6).

  • This hard reset is a bit like changing branches; however, Git won’t let you change branches if you have modified files in the working or staged area (because they would be lost, § 2.3.2). Git has no such qualms about a reset, you will lose any uncommitted changes that are in the working or staged areas when you do a hard reset.

As with branches, the best thing to do is commit any modifications prior to a reset.
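Here is the hard reset as a sketch you can run. The setup is my own shorthand for steps 1 to 6 (throwaway directory, placeholder identity), and the variable name v02 is just for illustration; it holds whatever hash your V02 commit has, where the text has [5493a7c].

```shell
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"        # placeholder identity (an assumption)
git config user.email "user@example.com"
for v in V01 V02 V03; do                   # steps 1 to 5 in brief
    echo "$v" > test.txt
    git add test.txt
    git commit -q -m "Version $v"
done
echo "V04" > test.txt                      # step 6: staged but uncommitted work
git add test.txt

v02=$(git rev-parse HEAD~1)                # hash of the V02 commit
git reset --hard "$v02"                    # Git does not ask for confirmation

cat test.txt          # V02: the working copy has been overwritten
git status --short    # prints nothing: working and staged areas match the commit
```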

If we now do another reset (without modifying anything) back to the most recent commit point [22af4b5] we just end up back where we were at step 5:

Action Result

A second Reset (hard) to [22af4b5].

Moves the Head back to the most recent commit point.

This restores the project to the step 5 state above.

Hard reset to latest commit point
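The round trip can be sketched as follows, under the same assumptions (throwaway directory, placeholder identity, a loop for steps 1 to 5). The variable latest is a hypothetical name; it holds the hash of the V03 commit, noted before the first reset.

```shell
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"        # placeholder identity (an assumption)
git config user.email "user@example.com"
for v in V01 V02 V03; do                   # steps 1 to 5 in brief
    echo "$v" > test.txt
    git add test.txt
    git commit -q -m "Version $v"
done
echo "V04" > test.txt                      # step 6: staged but uncommitted work
git add test.txt

latest=$(git rev-parse HEAD)               # the V03 commit
git reset -q --hard HEAD~1                 # first reset: back to V02
git reset -q --hard "$latest"              # second reset: forward to V03 again

cat test.txt          # V03, not V04: the uncommitted work did not come back
git status --short    # prints nothing
```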

2.5.3

Making changes after a reset

This is a bit like going back in time and accidentally killing your own father before you were born—it leaves things hanging.

Currently we are at step 5 and our commit history is:

[22af4b5] Version V03
[5493a7c] Version V02
[ac452db] Version V01

Now hard reset back to commit point [5493a7c] just like we did in the previous section—we get exactly the same result:

Action Result

Reset (hard) to [5493a7c].

Head moves to a different commit point. This overwrites the staged and working areas with the files from the [5493a7c] commit point.

Hard reset to V02 commit point

The history shows:

[5493a7c] Version V02
[ac452db] Version V01

It’s not showing the [22af4b5] commit point.

This makes sense, the history only includes things up to the current head point.
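A quick sketch shows the difference between what the history lists and what the repository actually holds (throwaway directory and placeholder identity assumed; v03 is a hypothetical variable holding the hash of the V03 commit):

```shell
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"        # placeholder identity (an assumption)
git config user.email "user@example.com"
for v in V01 V02 V03; do                   # steps 1 to 5 in brief
    echo "$v" > test.txt
    git add test.txt
    git commit -q -m "Version $v"
done

v03=$(git rev-parse HEAD)                  # note the latest commit first
git reset -q --hard HEAD~1                 # hard reset to the V02 commit

git log --oneline                          # lists V02 and V01 only
git log --oneline "$v03"                   # naming the hash still lists all three
```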

We can still move back to the [22af4b5] commit point (we did exactly that in the previous section). The problem is: what if we make changes at this earlier commit point?

Do it: modify test.txt (this time we’ll call it V05, like the shampoo; I used V04 before, remember) and commit the changes. This is commit [8f46c7b].

Action Result

Modify test.txt again; it is now V05.

Now add and commit the file to the local repository.

Modification after a reset to an earlier commit point

And now if we look at the history:

[8f46c7b] Version V05
[5493a7c] Version V02
[ac452db] Version V01

Commit point [22af4b5] isn’t there. A bit like Stalin, we’ve rewritten history.

The missing commit point is still there†1, but it no longer fits into the chain of commits; it’s just floating around in the Git repository.

It’s perfectly possible to switch to the missing commit point [22af4b5]. If we were to reset to it [22af4b5] we would have:

Action Result

Reset (hard) to [22af4b5].

Modification after reset

The commit history being:

[22af4b5] Version V03
[5493a7c] Version V02
[ac452db] Version V01

Confusing isn’t it?
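It may be less confusing if you watch it happen. The sketch below replays this section under the usual assumptions (throwaway directory, placeholder identity, my own variable name v03 for the hash of the V03 commit). It commits V05 on top of V02, confirms that the V03 commit still exists, and then resets to it.

```shell
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"        # placeholder identity (an assumption)
git config user.email "user@example.com"
for v in V01 V02 V03; do                   # steps 1 to 5 in brief
    echo "$v" > test.txt
    git add test.txt
    git commit -q -m "Version $v"
done

v03=$(git rev-parse HEAD)                  # note the V03 commit before resetting
git reset -q --hard HEAD~1                 # back to V02

echo "V05" > test.txt                      # make a change at the earlier commit
git add test.txt
git commit -q -m "Version V05"
git log --oneline                          # V05, V02, V01: no sign of V03

git cat-file -t "$v03"                     # prints "commit": it is still stored

git reset -q --hard "$v03"                 # switch to the floating commit
git log --oneline                          # V03, V02, V01: now V05 is missing
```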

†1 You need to be careful with these floating commit points: they are deleted when the database is cleaned up (pruned in Git terminology; it’s gardening, like branches).

2.5.4

Using resets—best practice

I don’t like the idea of resetting to a previous commit point and then progressing on from there as if nothing had happened (effectively bypassing the later commits).

Let’s assume that we’ve progressed from step 1 to step 5 in the previous example and we haven’t done any resets (i.e. commit [8f46c7b] never happened). We have this:

Action Result
Step 5 Reset best practice - step 5
test.txt is at V03; there are no changes pending.

Let’s also assume that there is some problem with test.txt and we realise that the version we need is actually V01 with some slight modification. The best procedure to recover V01, modify it and commit the modifications is as follows:

Commit or discard any pending changes (in our case there aren’t any) and reset the project back to the V01 commit point with a hard reset [ac452db].

Action Result
Step 6 Reset best practice - step 6
Reset (hard) to [ac452db].

This places the V01 file in the working area.

Now either open the test.txt file and copy the contents or just copy the entire file to the clipboard.

We haven’t changed any files at this point (just copied data).

Now hard reset back to the latest commit in the chain [22af4b5].

Action Result
Step 7 Reset best practice - step 7
Reset (hard) to [22af4b5].

Next either open up the latest file, the V03 file, that is in the working area, delete everything and paste in the V01 copied contents from the clipboard, or just overwrite the file with the copied version.

Either way, the test.txt file in the working area now contains the V01 version that we copied from the step 6 hard reset.

Make any other modifications that are required and save the file (we will call this V1a for simplicity).

Action Result
Step 8 Reset best practice - step 8
Modify test.txt, it is now version V1a.

Now add and commit the changes with the commit message test.txt v1a — based on v01 with modifications. In this case it is commit [9c43e67].

Action Result
Step 9 Reset best practice - step 9
Commit the changes [9c43e67]

The commit history for this is:

[9c43e67] test.txt v1a — based on v01 with modifications
[22af4b5] Version V03
[5493a7c] Version V02
[ac452db] Version V01

This I think keeps things clearer.
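The whole procedure (steps 6 to 9) can be sketched at the command line. I'm assuming a throwaway directory and a placeholder identity, and I'm using a scratch file (the hypothetical variable saved) to stand in for the clipboard. The commit message follows the one in the text with its punctuation simplified.

```shell
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"        # placeholder identity (an assumption)
git config user.email "user@example.com"
for v in V01 V02 V03; do                   # steps 1 to 5 in brief
    echo "$v" > test.txt
    git add test.txt
    git commit -q -m "Version $v"
done

v01=$(git rev-parse HEAD~2)                # the V01 commit
v03=$(git rev-parse HEAD)                  # the latest commit in the chain

git reset -q --hard "$v01"                 # step 6: look at the project at V01
saved=$(mktemp)                            # scratch file standing in for the clipboard
cp test.txt "$saved"                       # copy only; nothing is modified here

git reset -q --hard "$v03"                 # step 7: back to the latest commit
cp "$saved" test.txt                       # paste the V01 contents over V03
echo "slight modification" >> test.txt     # step 8: this is V1a

git add test.txt                           # step 9: commit and say what you did
git commit -q -m "test.txt v1a, based on v01 with modifications"
git log --oneline                          # four commits, none skipped
```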

From an engineering point of view, it is not acceptable to just move the project back a couple of notches and then carry on as if nothing had happened. It breaks the traceability (and that’s when they send you to gaol) and it leaves the question: what happens to the commit points that got skipped? They’re still there, someone could use them, and it adds to the confusion. Just record what you did to put it right and move on, even if that means copying files and data from an earlier commit point.

Just explain what you did and why.

It’s the engineer’s code—use it wisely young Skywalker.

2.5.5

A mixed reset

Let me say now, a hard reset is the only one to use.

Boy am I gonna get some emails. They’ll all be from Linux people, they usually are†2.

My rules for resetting:

  1. Always commit or delete modified files before resetting

  2. Always use a hard reset

  3. Resetting a project should only be done to view the files as they were at that point in time, never to modify them

  4. Never modify files at any commit point other than the latest (most recent) commit (on a given branch)

That said I will explain what the other types of reset do—I just haven’t figured out what they are for.

The first one is a mixed reset. This is the default type of reset applied by Git.

Set up the project as it was in § 2.5.1 step 6. It looks like this:

Action Result
Step 6 Mixed reset - step 6

The most up to date commit has test.txt at V03.

The working and staged areas hold a modified version of test.txt at V04.

The mixed reset is a non-destructive reset. It does not overwrite the working copy but it does change the staging area and (obviously) the head. If we reset (mixed) to the V02 commit point [5493a7c] we would have:

Action Result
Step 7 Mixed reset - step 7

Reset (mixed) to [5493a7c]

It leaves the working file test.txt exactly as it was at version V04. It means we could stage and commit the V04 modification to a new commit based on the V02 commit point [5493a7c] we’ve just reset to if we wanted to. But we wouldn’t do that would we?

Sound confusing?

Well it is. I appreciate it is non-destructive; it hasn’t overwritten the V04 file in the working area. It’s also pretty useless because I can’t see any of the V02 files. None of the V02 files have been put in the working area and the working files are the only files that are visible or editable.

The test.txt file will be flagged as modified but not staged.
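To check what ends up where after a mixed reset, here is a sketch under the usual assumptions (throwaway directory, placeholder identity, a loop standing in for steps 1 to 5). It looks at the working file, the staged copy and the status in turn.

```shell
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"        # placeholder identity (an assumption)
git config user.email "user@example.com"
for v in V01 V02 V03; do                   # steps 1 to 5 in brief
    echo "$v" > test.txt
    git add test.txt
    git commit -q -m "Version $v"
done
echo "V04" > test.txt                      # step 6: staged but uncommitted work
git add test.txt

git reset -q --mixed HEAD~1                # mixed reset to the V02 commit

cat test.txt            # V04: the working copy is untouched
git show :test.txt      # V02: the staged copy now matches the commit
git status --short      # prints " M test.txt": modified but not staged
```

Since mixed is the default, leaving out `--mixed` gives the same result.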

Again, I don’t see a use for it. I’ll just wait for the Linux people to stop shouting and explain it to me.

†2 In my experience and judging by my inbox, Linux people are a bit like Jeremy Corbyn’s supporters: entirely convinced of their own moral superiority and completely dismissive of any other argument. They have a stifling certitude, an implacable self-righteousness and they are always willing to be offended.

2.5.6

A soft reset

I refer you to the comments I made about the mixed reset in the previous section.

A soft reset does even less; all it does is move the head.

Set up the project as it was in § 2.5.1 step 6. It looks like this:

Action Result
Step 6 Soft reset - step 6

The most up to date commit has test.txt at V03.

The working and staged areas hold a modified version of test.txt at V04.

The soft reset is again a non-destructive reset. It does not overwrite the working copy or staged area; all it does is move the head. If we reset (soft) to the V02 commit point [5493a7c] we would have:

Action Result
Step 7 Soft reset - step 7

Reset (soft) to [5493a7c]

It leaves the working file test.txt exactly as it was at version V04 in both the working and staged area. It means we could commit the V04 modification to a new commit based on the V02 commit [5493a7c] we’ve just reset to if we wanted to.

The test.txt file will be flagged as modified and staged.
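The same check for a soft reset, again as a sketch (throwaway directory, placeholder identity, a loop standing in for steps 1 to 5):

```shell
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"        # placeholder identity (an assumption)
git config user.email "user@example.com"
for v in V01 V02 V03; do                   # steps 1 to 5 in brief
    echo "$v" > test.txt
    git add test.txt
    git commit -q -m "Version $v"
done
echo "V04" > test.txt                      # step 6: staged but uncommitted work
git add test.txt

git reset --soft HEAD~1                    # soft reset to the V02 commit

cat test.txt              # V04: the working copy is untouched
git show :test.txt        # V04: the staged copy is untouched as well
git show HEAD:test.txt    # V02: only the head has moved
git status --short        # prints "M  test.txt": modified and staged
```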

Again, I don’t see a use for it.

2.5.7

Reset, a summary

My rules and notes for resetting are:

  1. A reset will never change or delete any committed data or commit points

  2. Always make a note of the latest commit number (hash)†3

  3. Always commit modified files before resetting

  4. Always use a hard reset

  5. Resetting a project should only be done to view the files as they were at that point in time, never to modify them

  6. It’s ok to copy files from an earlier commit (following a hard reset) and paste them into the latest commit point (ensure you explain it in the commit message)

  7. Never modify files at a commit other than the latest (most recent) commit (on a given branch)

†3 I say this because if you reset to an earlier commit, the later ones don’t show up in the history, and this makes it harder to get back to a later commit if you don’t know its commit number. There are ways around this (see § 7.2.1).
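One possible way of making that note, as a sketch: print the hash and write it down, or let Git hold it for you by putting a temporary tag on the latest commit. The tag name before-reset is my own invention, and the setup assumes a throwaway directory and a placeholder identity. This is only an illustration of rule 2, not a required procedure.

```shell
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"        # placeholder identity (an assumption)
git config user.email "user@example.com"
for v in V01 V02 V03; do                   # steps 1 to 5 in brief
    echo "$v" > test.txt
    git add test.txt
    git commit -q -m "Version $v"
done

git rev-parse --short HEAD                 # the hash to write down
git tag before-reset                       # hypothetical tag name marking the latest commit

git reset -q --hard HEAD~2                 # view the project as it was at V01
git log --oneline                          # only V01 is listed now

git reset -q --hard before-reset           # return to the latest commit by name
git tag -d before-reset                    # remove the bookmark when finished
```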


