How we solved ‘user selection’ to help merchants win business
By Mayank Khera
GO-FOOD, our food delivery product, is one of the largest food delivery products in the world. We operate at a massive scale: a good portion of GO-JEK’s 100+ million monthly orders come through GO-FOOD. With 300K+ merchants, customers have a huge variety to choose from, but that also makes discovering restaurants a challenge. Merchants, in turn, need a way to showcase their dishes.
Instead of using traditional advertising methods, merchants can choose to promote themselves inside the GO-JEK app. Our value proposition: we will increase sales, bring in more customers and direct marketing spend wisely to the right customers. Less than 10 months after launching our promotion platform, a little more than 2% of our overall GO-FOOD orders are discounted through vouchers. That’s a phenomenal number when you realise we deliver more than 30,000 tonnes of food every month 🍔🍕🍟.
TL;DR: In this blog, we will talk about how we stitched customers’ food preferences into a promotions platform to target the right users for a merchant. In essence, selecting the right users is critical if vouchers are going to be redeemed.
How we came about the Promotions Platform
In the latter part of 2017, we realised there was a massive opportunity to distribute food vouchers to our customers. Merchants had shown an inclination to fund such promotions, and customers are always delighted to receive them. The caveat: vouchers should be highly relevant to customers, and customers shouldn’t feel spammed by too many of them. Merchants see promotions not only as a way to procure more orders and increase existing users’ transaction frequency but, importantly, as a way to acquire new customers and lure back churned customers. (This is a story in itself. Watch this space.)
In short, promotions are a win-win for all parties involved. Above all, they tie consumers and merchants into GO-JEK’s ecosystem 💰.
There are three important functions which are fulfilled by the promotions platform: create a voucher, select users, and lastly, notify the user about the voucher.
Selection of the user is the most important cog in the wheel and significantly influences the internal success metric of our campaigns: Redemption Percentage. That is, the proportion of users who actually redeemed a voucher we sent.
Redemption percentage = ( Vouchers Redeemed / Vouchers Allocated )
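As a minimal sketch, the metric can be computed as below. The function name and the input checks are illustrative assumptions, and multiplying by 100 to express the ratio as a percentage is also an assumption, since the formula above is written as a plain ratio.

```python
def redemption_percentage(vouchers_redeemed: int, vouchers_allocated: int) -> float:
    """Return redeemed vouchers as a percentage of allocated vouchers.

    Hypothetical helper: the x100 scaling is an assumption made so that the
    result reads as a percentage rather than a ratio.
    """
    if vouchers_allocated <= 0:
        raise ValueError("vouchers_allocated must be positive")
    if not 0 <= vouchers_redeemed <= vouchers_allocated:
        raise ValueError("vouchers_redeemed must be between 0 and vouchers_allocated")
    return 100.0 * vouchers_redeemed / vouchers_allocated


if __name__ == "__main__":
    # Made-up campaign numbers, purely for illustration.
    print(redemption_percentage(50, 1000))
```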
As things stand today, our team of product analysts, along with developers, has automated each of these functions to allow our sales team to run multiple campaigns at scale. As part of the Product Analyst team, we focus on tackling the user selection challenge.
Automating user selection at scale
Formulating user selection criteria required us to deeply understand customer behaviour in order to give our users the right promotion. We ran our initial set of experiments by selecting customers who were most likely to redeem the voucher based on some heuristic rulesets.
These initial experiments helped us validate what features made sense and forgo the ones which weren’t predictive of redemptions. For example, we validated early on that customers who ordered more recently with a merchant have a higher tendency to reorder given a voucher.
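A heuristic of this kind can be sketched in a few lines. This is only an illustration of a recency rule, not the actual ruleset: the 28-day window, the function name and the input shape (customer id mapped to the date of the last order with the merchant) are all assumptions.

```python
from datetime import date, timedelta
from typing import Dict, List


def recently_ordered(
    last_order_dates: Dict[str, date],
    today: date,
    max_days: int = 28,  # assumed window, not a value from the experiments
) -> List[str]:
    """Return customers whose last order with the merchant falls inside the
    window, most recent first."""
    cutoff = today - timedelta(days=max_days)
    eligible = [(d, cid) for cid, d in last_order_dates.items() if d >= cutoff]
    eligible.sort(key=lambda pair: pair[0], reverse=True)
    return [cid for _, cid in eligible]


if __name__ == "__main__":
    sample = {
        "a": date(2018, 6, 25),
        "b": date(2018, 4, 1),
        "c": date(2018, 6, 29),
    }
    print(recently_ordered(sample, today=date(2018, 6, 30)))
```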
At GO-JEK, we always look to automate and let machines do the work, which brought us to our next challenge. How can we select the best users for multiple campaigns at scale, with no intervention from analysts? To solve this problem, we built the Customer Tag Store (CTS), which stores customers’ behavioural features, such as past 4 week orders and past 4 week merchant visits, in an ElasticSearch index.
What is CTS?
The Customer Tag Store captures a user’s order frequency and in-app merchant visits for 2 time frames (past 4 and 24 weeks) and at 3 levels: GO-FOOD overall, merchant and cuisine. We also compute and store metrics to capture other customer preferences, for example, a user’s location preference, which is stored as the user’s top-5 s2ids (level 14).
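To make the shape of these features concrete, here is a minimal sketch that derives order-frequency tags for one customer over the two time frames and three levels. The `Booking` record, the field names and the output layout are assumptions for illustration; the real index schema is not described here.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Dict, List


@dataclass(frozen=True)
class Booking:
    # Hypothetical minimal view of a completed food order.
    customer_id: str
    merchant_id: str
    cuisine: str
    ordered_on: date


def order_tags(bookings: List[Booking], customer_id: str, today: date) -> Dict[str, dict]:
    """Count a customer's orders over the past 4 and 24 weeks, at the
    overall, merchant and cuisine levels."""
    tags = {}
    for weeks in (4, 24):
        cutoff = today - timedelta(weeks=weeks)
        recent = [
            b for b in bookings
            if b.customer_id == customer_id and cutoff <= b.ordered_on <= today
        ]
        tags[f"orders_{weeks}w"] = {
            "overall": len(recent),
            "by_merchant": dict(Counter(b.merchant_id for b in recent)),
            "by_cuisine": dict(Counter(b.cuisine for b in recent)),
        }
    return tags


if __name__ == "__main__":
    history = [
        Booking("u1", "m1", "padang", date(2018, 6, 20)),
        Booking("u1", "m2", "sushi", date(2018, 3, 1)),
        Booking("u1", "m1", "padang", date(2017, 1, 1)),
    ]
    print(order_tags(history, "u1", today=date(2018, 6, 30)))
```

The same pattern would apply to in-app merchant visits, with browsing events in place of bookings.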
The Evolution of the Promotions Platform
The CTS has come to its current form after multiple iterations and continuous validated learning through experimentation over the past 10 months 🕐.
The first iteration of the tag store had categorical features (low, medium, high), or tags as we call them, instead of the current numerical values. The query would essentially output segments of users that would be used as the target population for a campaign; for example, select users with high ordering-frequency and medium average order value.
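The tag-based approach can be sketched as follows. The cutoff values, feature names and helper functions are hypothetical; they only show how numerical values map to low/medium/high tags and how a segment is then selected by matching tags.

```python
from typing import Dict, List, Tuple

# Hypothetical cutoffs: (upper bound for "low", upper bound for "medium").
FREQUENCY_CUTOFFS: Tuple[float, float] = (2, 6)
ORDER_VALUE_CUTOFFS: Tuple[float, float] = (30_000, 80_000)


def to_tag(value: float, cutoffs: Tuple[float, float]) -> str:
    """Bucket a numerical value into a categorical tag."""
    low_cutoff, high_cutoff = cutoffs
    if value < low_cutoff:
        return "low"
    if value < high_cutoff:
        return "medium"
    return "high"


def select_segment(
    customers: Dict[str, Dict[str, float]],
    frequency_tag: str,
    order_value_tag: str,
) -> List[str]:
    """Return ids of customers whose tags match the requested segment."""
    return [
        cid
        for cid, features in customers.items()
        if to_tag(features["orders_4w"], FREQUENCY_CUTOFFS) == frequency_tag
        and to_tag(features["avg_order_value"], ORDER_VALUE_CUTOFFS) == order_value_tag
    ]


if __name__ == "__main__":
    sample = {
        "u1": {"orders_4w": 9, "avg_order_value": 40_000},
        "u2": {"orders_4w": 9, "avg_order_value": 100_000},
        "u3": {"orders_4w": 1, "avg_order_value": 50_000},
    }
    # Users with high ordering frequency and medium average order value.
    print(select_segment(sample, "high", "medium"))
```

Note that every user inside a segment is treated the same way, which is the limitation that motivates per-user scores.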
This cookie-cutter, rule-based segmentation approach worked pretty well until we hit a certain scale. In the longer term, we wanted the flexibility to compute user-merchant affinity scores and select the best users (and not segments of users), hence we made the shift to a numerical value store. Currently, affinity scores are calculated using a linear regression model which was trained on the historical voucher redemption data.
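Once a linear model has been fitted, scoring is just a weighted sum of a user's features for a merchant, and selection becomes "take the top k users by score". The sketch below shows only that scoring and selection step. The weights, intercept and feature names are placeholders for illustration, not trained values or the actual feature set.

```python
import heapq
from typing import Dict, List

# Placeholder coefficients; a real model would learn these from historical
# voucher redemption data.
WEIGHTS: Dict[str, float] = {
    "merchant_orders_4w": 0.30,
    "merchant_visits_4w": 0.10,
    "cuisine_orders_24w": 0.05,
}
INTERCEPT = 0.02


def affinity_score(features: Dict[str, float]) -> float:
    """Linear score: intercept plus weighted sum of features (missing = 0)."""
    return INTERCEPT + sum(
        weight * features.get(name, 0.0) for name, weight in WEIGHTS.items()
    )


def top_k_users(candidates: Dict[str, Dict[str, float]], k: int) -> List[str]:
    """Return the k customer ids with the highest affinity score."""
    return heapq.nlargest(k, candidates, key=lambda cid: affinity_score(candidates[cid]))


if __name__ == "__main__":
    sample = {
        "u1": {"merchant_orders_4w": 3},
        "u2": {"merchant_visits_4w": 2},
        "u3": {},
    }
    print(top_k_users(sample, k=2))
```

Compared with segments, a per-user score lets a campaign with a budget of k vouchers pick exactly the k most promising users.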
Not long after, we also built a Merchant Tag Store along the same lines to further augment the campaign scheduling process. It stored features like a merchant’s daily and weekly order volumes, which guide the limits on total voucher distribution, and peak hour of the day, which helps determine appropriate push notification timings for a campaign, along with some other important features.
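How such merchant features might feed into scheduling can be sketched as below. Both rules are assumptions made up for illustration (capping vouchers at a share of daily order volume, and notifying a fixed number of hours before the peak hour); the actual scheduling logic is not described in this post.

```python
def voucher_cap(daily_order_volume: int, max_share_percent: int = 20) -> int:
    """Hypothetical rule: cap distributed vouchers at a percentage of the
    merchant's daily order volume."""
    if daily_order_volume < 0 or not 0 <= max_share_percent <= 100:
        raise ValueError("invalid volume or share")
    return daily_order_volume * max_share_percent // 100


def notification_hour(peak_hour: int, lead_hours: int = 1) -> int:
    """Hypothetical rule: send the push notification `lead_hours` before the
    merchant's peak hour, wrapping around midnight (24-hour clock)."""
    if not 0 <= peak_hour <= 23:
        raise ValueError("peak_hour must be in 0..23")
    return (peak_hour - lead_hours) % 24


if __name__ == "__main__":
    print(voucher_cap(500), notification_hour(12))
```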
As our product matured and merchants started to better understand promotions as a tool, they wanted to specifically target only new users. This meant we needed to fine-tune our selection, introduce new features to capture this detail and retrain the model on them. Over time, we kept adding more features to meet such market-driven use cases and, in parallel, to improve our redemption numbers. We ran experimental campaigns to validate our hypotheses about each feature, analysed the results and worked with developers to automate any piece that led to an improvement.
Across the organisation, lean startup methodologies are embedded in each team’s work culture. The Product Analyst team is no different, and we operated on the principles of the Build, Measure and Learn cycle to evolve the tag store.
Infrastructure for Tag Generation
The first cut was literally computed locally and ingested into ElasticSearch. In a month’s time, we automated this whole process through a streaming pipeline (illustrated in the diagram below) to generate and store tags on a continual basis.
PySpark scripts and DAG configuration files form the foundational building blocks of this infrastructure.
- PySpark scripts contain the logic to clean, transform and aggregate raw food bookings and app browsing data.
- A DAG’s structure is specified in a Python configuration file which is utilised by Airflow. A DAG, or Directed Acyclic Graph, is a collection of executable tasks organised in a way that reflects their relationships and dependencies. In our case, these tasks execute the logic for transformations and aggregations.
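The idea of tasks ordered by their dependencies can be illustrated with Python's standard library alone. This is not Airflow's API and not the actual pipeline definition: the task names are hypothetical, and `graphlib` (Python 3.9+) is used here only to show how a DAG yields a valid execution order.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Hypothetical tasks: each key maps to the set of tasks it depends on.
DEPENDENCIES: Dict[str, Set[str]] = {
    "clean_bookings": set(),
    "clean_browsing": set(),
    "aggregate_order_tags": {"clean_bookings"},
    "aggregate_visit_tags": {"clean_browsing"},
    "publish_tags": {"aggregate_order_tags", "aggregate_visit_tags"},
}


def execution_order(dependencies: Dict[str, Set[str]]) -> List[str]:
    """Return one valid order in which every task runs after its dependencies.

    Raises graphlib.CycleError if the graph contains a cycle, i.e. if it is
    not actually a DAG.
    """
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for task in execution_order(DEPENDENCIES):
        print(task)
```

A scheduler such as Airflow does the same bookkeeping, and additionally handles scheduling, retries and running independent tasks in parallel.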
The Spark jobs run on Google Dataproc and the computed features are streamed to a Kafka topic. Kafka Connect then streams this data to an ElasticSearch index to complete the process. We used Kafka so that, in the future, these features can be consumed by other target systems apart from our index.
What’s Next?
We are constantly striving to deliver a superior user experience through app personalisation. Presenting users with highly relevant vouchers is a step towards that overarching goal. To further our objectives, we recently started personalising GO-FOOD search too by utilising CTS. Essentially, the aim is to show merchants in order of their relevance to the user. This is accomplished by a Learning To Rank plugin installed over ElasticSearch which computes merchant ranks for each user in real time. Stay tuned for more updates around this!
Join us! We are looking for some kick ass product analysts and data scientists who are passionate about data driven problem solving. Check out gojek.jobs for more.
If you’re specifically looking for roles in the GO-FOOD (🖖) team, here are the openings: