Git Bisect and the Hunt for Bad Commits

A method to find problematic commits in large codebases.

Gojek

May 9, 2019 • 5 min read

By Vishwesh Jainkuniya

Git is the most widely used distributed version control system, and it makes collaborating with others easy. In case you are unfamiliar with the term, I have elaborated upon it in another post.

So let’s start with the problem:

Let’s say you are working on a large codebase with many people contributing to the same project and lots of commits per day. Imagine you went on a week-long vacation. Upon coming back, you find that the product (which was working fine when you left) now has some issues.

As mentioned before, it’s a huge codebase. You aren’t aware of which module is responsible for the unwanted behaviour.

What will you do?

In this post, I am going to explain a potential solution to this problem: one of my favourite git commands, git bisect.

Sorting between good and bad

Coming back to the earlier conundrum, you now have one of three options:

Option #1

Call a meeting of all the contributors and ask how this happened.

But the behaviour may be caused by the interaction of changes spread across multiple commits from different people, each of which worked fine on its own. In that case, nobody in the room may know the answer.

Also, meetings are time-consuming. 🤦‍♂️

Option #2

You’re already maintaining a commit history, so why not check each and every commit and find the root cause?

But what if there are hundreds of commits in between? Checking each and every one of them will be really hard, boring and, again, time-consuming.

Option #3

Run binary search on commits

(Yup! That same searching algorithm which you studied at high school 😅)

This is where git bisect comes into the picture. It helps you find the specific commit that broke things by walking through the commits in binary search fashion: each check halves the range of suspect commits. So now you only have to check about log2(N) commits instead of N commits (as described in option #2).
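
To make the idea concrete, here is a minimal Python sketch of the search itself. It is not how git implements bisect (git works on the commit graph, not on a flat list). It assumes a linear history ordered oldest to newest, a known good first commit, a known bad last commit, and that every commit after the first bad one is also bad. The names find_first_bad and is_bad, and the commit labels, are made up for this illustration.

```python
from typing import Callable, Sequence, Tuple


def find_first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> Tuple[str, int]:
    """Return (first bad commit, number of checks performed).

    Assumes commits[0] is known good, commits[-1] is known bad, and the
    history is linear: once a commit is bad, all later ones are bad too.
    """
    if len(commits) < 2:
        raise ValueError("need at least one good and one bad commit")
    good, bad = 0, len(commits) - 1  # indexes of the known good / known bad commits
    checks = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        checks += 1
        if is_bad(commits[mid]):
            bad = mid   # the culprit is at mid or earlier
        else:
            good = mid  # the culprit is somewhere after mid
    return commits[bad], checks


# Hypothetical history of 101 commits; "c58" is the one that broke things.
history = [f"c{i}" for i in range(101)]
culprit, checks = find_first_bad(history, lambda c: int(c[1:]) >= 58)
print(culprit, checks)
```

Here is_bad stands in for whatever you do by hand at each step: run the tests, open the app, and so on. The fewer times it has to be called, the better.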

Sounds good, how do I use it?

Keep two commit hashes (SHAs) ready:

one on which everything works fine, let’s call it good
one on which things are not behaving correctly, let’s call it bad

(in the worst case, these can be the first and the latest commits of the whole project, respectively)

Just remember three things:

git bisect start BAD GOOD, where BAD and GOOD are the references to the bad and good commits respectively.

Check the behaviour of the commit that git has checked out. If it is good,

Enter git bisect good

else,

Enter git bisect bad

and repeat this step until git prints the SHA of the first bad commit in the terminal, like this:

< … sha … > is the first bad commit

Use git bisect reset to reset and get back to the original state from where you started (via git bisect start).

In the meantime, you can also get the logs of the binary search you are performing with the help of git bisect log.

Now, let’s take an example:

Here we have a module to calculate area and circumference of different polygons. (github repo)

Following is the output of git log --oneline (basically the commit history).

Now if I rerun tests on current HEAD, they fail.

But wait, I know that these were passing at commit 861b6e0.

So, as described above, there are three options to find the culprit commit which made the tests fail.

Let’s go with option #3, the binary search, i.e. using git bisect.

Here the bad SHA is 3f5dedf and the good SHA is 861b6e0, so let’s start bisecting with git bisect start 3f5dedf 861b6e0.

This command automatically checks out 5fcc576.

Now, let’s re-run the tests.

They failed again, which means 5fcc576 is also a bad commit. So let’s go with git bisect bad.

This command automatically checks out 88ca0bd.

Re-run tests.

Still failing, so this commit is bad as well, and we have to go with git bisect bad again.

Now check again, this time on ab36264.

Success! They passed 😍

In the meantime, we can also check the bisect log with the help of git bisect log.

ab36264 is a good commit, so let’s go with git bisect good.

And here, git tells us that 88ca0bd866aeeb91b6b3f8c644b754068779488e is the culprit commit.

If we had gone with option #2, we would have ended up running our tests ~8 times, but with the help of git bisect we ran them only 3 times. This matters when you have many tests and executing them takes a long time.
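
The numbers above follow from simple arithmetic. As a rough sketch, assuming a linear history with n candidate commits (everything after the known good commit, up to and including the known bad one), a binary search needs at most ceil(log2(n)) checks. The helper name worst_case_checks is made up for this illustration, and the step count git reports is an estimate that may differ on histories with merges.

```python
def worst_case_checks(candidates: int) -> int:
    """Upper bound on checks for a binary search over `candidates` commits.

    Equals ceil(log2(candidates)), computed with integers to avoid
    floating point rounding.
    """
    if candidates < 1:
        raise ValueError("need at least one candidate commit")
    return (candidates - 1).bit_length()


for n in (8, 100, 1000):
    print(f"{n} candidate commits -> at most {worst_case_checks(n)} checks")
```

With about 8 candidates this gives 3, in line with the three test runs above. The gap widens quickly: for a thousand candidate commits the bound is still only 10 checks.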

And yay! With minimal effort, we found the commit which made the tests fail. The same approach works for any bug, as long as you can tell whether a given commit is good or bad.

Read more about git bisect at https://git-scm.com/docs/git-bisect

I love fixing bugs and found git bisect really helpful, especially to find the commit which introduced the bug.

Here’s hoping you will like it too! 👍

Keep Building & Debugging

(Note: the purpose of the code was just to explain git bisect, nothing else)

