Finding Needles in Haystacks

How Gojek’s Magneto team is building products to help automate discovery of top candidates at scale.

By Atif Haider

In just a few years, Gojek has grown from a small startup aggregating ojeks to a Super App that completes over 200 million orders a month.

Most importantly, we’ve managed to scale our systems with a relatively lean team of engineers. However, in order to sustain this growth, we need to keep adding quality engineering talent to our ranks. To help with this, we rely on our in-house recruiters, who’ve been doing a phenomenal job. 👌

But the question remains — how do we continue to keep this up at scale?

We’ve been trying to find a solution to this question, and as is usually the case when it comes to Gojek, the answer lies in innovation. Just like every other domain we operate in, we needed to innovate in tech recruitment as well.

When our team joined Gojek, we had one clear objective — apply the learnings from our previous startup to help Gojek find innovative ways to hire top engineering talent.

This is the story of how we’re doing it.

Early Days and Finding Our North Star

In the first two weeks after we joined, we consulted with engineering heads, tech recruiters, and the India Inbound Marketing team. Over the course of comparing notes from many meetings, we hoped to find patterns in how Gojek hired, and how we could automate the discovery of more people with similar traits. Undertaking this exercise was at once a rewarding and challenging experience for our team.

We got interesting insights, but also ran into conflicting data. Gojek had also recently switched to a new Applicant Tracking System (ATS), which brought its own set of unknowns. Navigating these unknowns and changing processes in an organisation with many fast-moving parts was our first order of business. In between all this, finding the formula for the kind of candidates we were looking for was like looking for a needle in a haystack. Most of the problems that we identified at these meetings turned out to be operational issues, an insight provided by our India MD, Sidu Ponnappa.

Meanwhile, we needed a name for our team which people could use to identify us. We named ourselves Magneto. To achieve our task of finding top developers, we would need the superhuman power of attraction.

Testing Our Powers

Gojek has a rigorous interview process, and we are picky about who we hire, to the point where even people outside of Gojek know how tough it is to get in.

We did get one key takeaway though. All our tests and questions are designed to identify a few common traits:

In a nutshell, we’re biased towards developers who are hands-on and can code, irrespective of their prior experience and pedigree.

Another interesting fact: according to our internal data, a whopping 50% of candidates who go through our interview process fail the first test assignment round. 😮

I was surprised to see candidates from top product companies unable to get through this round.

After looking at this daunting drop-off data, and acknowledging that we were very new to Gojek’s hiring process, we decided to take baby steps. The goal: hire 1 engineer for GoFood. We also set up some key results to track progress towards our objective.

Crafting A Solution

We knew we needed to find top developers who could make it through our recruitment process, and then reach out to them to check if they would be interested in joining us.

So we made this theme a problem statement, and divided it into two parts:
1. Discover
2. Nurture

As a first step, we pulled the profiles of engineers we had hired out of Lever (our Applicant Tracking System) and found many of them to be active on online coding platforms that reflect their passion for programming.

We finally knew where to start looking for our needle.

Our first stop was the most obvious one — Github. Then we went to StackOverflow and finally, Twitter. StackOverflow itself hosts over 10 million developers.

StackOverflow Stats

There’s a lot of publicly available information on these platforms, so we wrote a bunch of scripts to pull info about developers from Github and StackOverflow. Then we tried to map their skills, contributions, popularity, and activities.

We wrote custom algorithms for Github and StackOverflow to classify developers as qualified or unqualified, based on the criteria we were looking for. We looked at various signals to check their passion for programming, and if they passed a certain pre-defined threshold, we would mark them as qualified.
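
As a rough illustration of the threshold idea, the sketch below scores a profile from a few public signals and marks it as qualified once the score reaches a cut-off. The signal names, weights, and threshold are all hypothetical; this post does not disclose the actual criteria.

```python
from dataclasses import dataclass

# Hypothetical weights and cut-off, chosen only for illustration.
WEIGHTS = {"public_repos": 1.0, "followers": 0.5, "answers": 2.0}
QUALIFY_THRESHOLD = 50.0


@dataclass
class DeveloperSignals:
    """Public activity counts for one developer (assumed signal set)."""
    public_repos: int = 0
    followers: int = 0
    answers: int = 0


def score(signals: DeveloperSignals) -> float:
    """Return the weighted sum of the developer's signals."""
    return sum(weight * getattr(signals, name) for name, weight in WEIGHTS.items())


def is_qualified(signals: DeveloperSignals) -> bool:
    """A developer is qualified when the score reaches the threshold."""
    return score(signals) >= QUALIFY_THRESHOLD
```

Keeping the weights in one table makes it easy to tune the cut-off per platform without touching the scoring logic.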

Once a candidate was qualified, we needed to categorise them based on skills, experience, location, and other parameters. This step also involved some manual work to clean up the data.

In order to meet our first objective, we picked around 800 candidates and reached out to them through personalised email campaigns. Since our team was quite comfortable using Python, we used Python on top of PostgreSQL to write the Nurture system to run these email campaigns (despite Python not being part of Gojek’s core tech stack).
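
To give a feel for what a personalised campaign step might look like, here is a minimal sketch that fills a message template per candidate. The template wording, field names, and the `render_campaign` helper are assumptions made for illustration; in a setup like ours the rows would be read from PostgreSQL, whereas here they are plain dictionaries.

```python
from string import Template

# Hypothetical template; the wording of the real campaigns is not given here.
OUTREACH = Template(
    "Hi $first_name,\n\n"
    "We came across your $skill work on $platform and would love to tell you "
    "about engineering at Gojek.\n"
)


def render_campaign(candidates):
    """Yield (email_address, body) pairs, one per candidate row."""
    for row in candidates:
        body = OUTREACH.substitute(
            first_name=row["first_name"],
            skill=row["skill"],
            platform=row["platform"],
        )
        yield row["email"], body
```

`Template.substitute` raises a `KeyError` when a field is missing, which is useful here: a half-filled email never goes out.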

The Discover <> Nurture Flow

Smelling Success

After rigorous follow-ups, we ended up getting a decent number of interested candidates, whom we then contacted for exploratory conversations.

This was quite the enlightening experience — and we realised Gojek is perceived as a highly tech-oriented company among developers.

Over 62% of the candidates who submitted the test assignment cleared it, and by the time we finished the first cohort, we ended up hiring 3 devs.

We hit 300% of our target. 🙌

The smile of success

Automation FTW

As we succeeded in demonstrating the viability of our MVP, we had to look at the problem statement again and translate it into a scalable system. Tackling scale has always been a part of Gojek’s journey to #SuperApp status, so we decided to be prepared.

While scaling the Candidate Discovery Process, we ran into two issues:

  1. Since the discovery level was managed on spreadsheets, it required a massive manual effort to keep the data unique across services.
  2. Getting candidate info from different sources required a lot of data-sharing on multiple platforms. The result — repetitive processes and data duplication. 🤦‍♂️

Considering these challenges, we divided the process into 3 different layers:

  1. Service layer: Contains individual micro-services, each with a single behaviour: fetch details from a specific source, then score, classify, and merge them based on the data available about a candidate on different mediums.
  2. Messaging layer: At any point in time, any individual service can connect to other services to fetch more information about the candidate. For instance, while parsing StackOverflow users, we get the user’s website and Github URLs. So, the StackOverflow service will publish the candidate’s Github social ID and website URL to the Github and WebScrapper services respectively. This information about the candidate is then written to the database.
  3. Data layer: A centralised MongoDB database that stores all the data (even if it’s incomplete) from different sources, which can further be pushed to Nurture once it is verified.
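
The three layers can be sketched in a few lines. This is a toy, in-memory stand-in and not our actual services: a dictionary plays the role of the central database, a topic-to-handlers map plays the role of the messaging layer, and every function and field name is hypothetical.

```python
from collections import defaultdict

# In-memory stand-ins (assumptions): a dict for the central store and
# a topic -> handlers map for the messaging layer.
store = {}
subscribers = defaultdict(list)


def subscribe(topic, handler):
    subscribers[topic].append(handler)


def publish(topic, message):
    for handler in subscribers[topic]:
        handler(message)


def upsert(candidate_id, fields):
    """Merge new fields into the candidate's record, keeping partial data."""
    store.setdefault(candidate_id, {}).update(fields)


def stackoverflow_service(profile):
    """Store what this source knows, then ask other services for more."""
    cid = profile["candidate_id"]
    upsert(cid, {"so_reputation": profile["reputation"]})
    if profile.get("github_id"):
        publish("github", {"candidate_id": cid, "github_id": profile["github_id"]})
    if profile.get("website"):
        publish("webscraper", {"candidate_id": cid, "url": profile["website"]})


def github_service(message):
    # A real service would fetch the profile here; this only records the id.
    upsert(message["candidate_id"], {"github_id": message["github_id"]})


def webscraper_service(message):
    upsert(message["candidate_id"], {"website": message["url"]})


subscribe("github", github_service)
subscribe("webscraper", webscraper_service)
```

Because every service writes through the same `upsert`, a candidate found on several sources ends up as one merged record instead of several duplicates, which is the problem the spreadsheets could not solve.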

We also dockerized all these micro-services, integrated monitoring (Prometheus), Slack alerts & logging (Barito — thanks to our Barito team for offering a powerful logging system).

With this automated system, we processed over 38,000 tech candidates’ profiles in a very short time, and found around 10,000 qualified candidates who passed our custom criteria.

This was our Aha! Moment.

With automation, we managed to make 13 offers to developers in a very short span of time, 6 of whom have already joined us.

Launching Fount

So far we had been the customers of our own product. But this needed to be bigger. We needed to get our product out to our internal recruiters to use.

However, there was one problem.

The Nurture system was written in Python, and we primarily use Ruby, Go, Java, and Clojure at Gojek. When you rely on a tech stack not extensively used by the larger org, you end up building your own tooling systems to test and deploy on the existing infrastructure (which is already stable and working for other teams).

Therefore, we decided to re-write the Nurture system in Clojure. I had prior experience of writing programs in Common Lisp & Clojure. To my surprise, the team was also excited to try out functional programming (I will soon write a separate article on our experience in building an API in Clojure. Pinky promise!).

There were a few hiccups in getting things moving around the infra and security systems, but that was a given considering we were fairly new to the existing infrastructure. We built the Nurture API in Clojure and a frontend in Reactjs. In 30 days, we were up and running with our product.

We call it Fount.

Fount lists all the qualified discovered candidates as leads that our recruiters can then follow up on to check their interest and move into our hiring funnel. Currently, a few beta users (our internal tech recruiters) are trying out the system. 🖖

We have already redesigned Fount and this is how it looks now. Thanks to Asphalt — our very own design system at Gojek.

AI Powered Candidate Discovery Platform

The Way Forward

This was just the first step in our journey of streamlining recruitment at Gojek. We’re planning to improve our sourcing algorithms, find new sources, improve the candidate verification process, and build new features on Fount.

We’re also going to attack different tech recruiting problems and bring automation wherever possible in order to hire top developers (and offer up a great candidate experience).

That’s all from us for now. We’d love to hear your feedback in the comments.
