How We Mask Phone Numbers To Secure User Identity

Logging into an app through phone numbers makes the overall experience smoother. Here's how we go about protecting this data.

By Avinash Jaiswal

Every second app logs you in only after getting hold of your phone number. Since these numbers act as a universal identifier for an individual, and are therefore PII (Personally Identifiable Information), it matters how discreetly an organization handles them within its ecosystem.

For an app like Gojek, which serves customers in more than 20 ways, the phone number is an essential point of contact. It eases the experience when you order food, book a cab, or want to get something delivered from one point to another. It is very important for us to preserve our users' identity, and masking their phone numbers is a step in that direction.

Number Masking

Number masking is the process of hiding a user's real phone number behind a virtual number (VN) so that neither the calling party nor the receiving party gets hold of the other's real number. It lets us connect calls from one party to another without either party ever dialing the other's phone number directly.

To achieve this, a lot goes on in the backend that is never exposed to users. To them, they are simply calling a number that looks odd but is a valid phone number. The following is a brief explanation of how we achieve this at Gojek.

Steps to mask numbers

There are many use cases in Gojek where we need to mask numbers between two or more parties. Usually there are multiple actors (customers, drivers, merchants, senders, etc.) involved, but for ease of understanding I will take the simplest case: number masking between two customers and one driver. The key points to note in such a system are:

  • Two unique phone numbers of two unique customers are to be masked
  • One phone number of one driver is to be masked
  • We adopt a mutual exclusion policy such that every driver-customer pair has a unique pair of VNs assigned. This simply means that when a driver has multiple active orders, the driver is assigned a new VN for each order, different from the VNs already assigned to them. The same applies to customers.

To explain the above point let me take an example with a customer C and a driver D:

Number masking basics
  • C’s original phone number = AAA, D’s original phone number = KKK
  • C makes an order and driver D is assigned to them. The service anonymizes these numbers next.
  • C’s anonymized number = XXX, D’s anonymized number = YYY
  • So for a pair of original numbers AAA–KKK we assign XXX–YYY.
  • A new customer, N’s original number = BBB, makes an order.
  • Suppose we assign N the same driver D, who currently has an active order. N’s anonymized number = WWW.
  • Given that D now has more than one active order, a new VN, ZZZ, will be assigned as D’s anonymized number for this order.
  • So for the new pair of original numbers BBB–KKK we assign WWW–ZZZ.

If that was a bit difficult to understand at first, try going through it a second time and making a mental model of the same. You will get the hang of it quickly.
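The walkthrough above can be sketched in a few lines of Python. This is only an illustration, not our production code: the class name `VirtualNumberAllocator` and its methods are hypothetical, and it assumes the simplest way of satisfying the mutual exclusion policy, which is to never hand out a VN that is already in use by any active pair.

```python
class VirtualNumberAllocator:
    """Illustrative allocator: one unique VN pair per customer-driver pair."""

    def __init__(self, healthy_vns):
        self.free = list(healthy_vns)  # VNs not assigned to any active pair
        self.pairs = {}                # (customer_real, driver_real) -> (customer_vn, driver_vn)

    def assign(self, customer_real, driver_real):
        key = (customer_real, driver_real)
        if key in self.pairs:          # same pair asked again: return the same VNs
            return self.pairs[key]
        if len(self.free) < 2:
            raise RuntimeError("not enough healthy virtual numbers")
        customer_vn = self.free.pop(0)
        driver_vn = self.free.pop(0)
        self.pairs[key] = (customer_vn, driver_vn)
        return self.pairs[key]

    def release(self, customer_real, driver_real):
        """Return a pair's VNs to the pool once the order is over."""
        pair = self.pairs.pop((customer_real, driver_real), None)
        if pair is not None:
            self.free.extend(pair)


allocator = VirtualNumberAllocator(["XXX", "YYY", "WWW", "ZZZ"])
print(allocator.assign("AAA", "KKK"))  # ('XXX', 'YYY')
print(allocator.assign("BBB", "KKK"))  # ('WWW', 'ZZZ'): D gets a second, different VN
```

Driver KKK ends up with two different VNs (YYY and ZZZ), one per active order, which is exactly the behaviour described in the example.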

The process flow

Our service always keeps a list of healthy VNs, which is fetched at regular intervals from our providers. This ensures that we only assign active, healthy VNs to our users and offer the best possible experience when they make a call.
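One way to picture this is a small cache that re-fetches the list once it is older than a chosen interval. The sketch below is an assumption-heavy illustration: `HealthyVNCache`, `fetch_healthy_vns`, and the 60-second default are hypothetical names and values, and it refreshes lazily on read, whereas a real service might just as well refresh on a background schedule.

```python
import time


class HealthyVNCache:
    """Illustrative cache of healthy VNs, refreshed after a fixed interval."""

    def __init__(self, fetch_healthy_vns, refresh_seconds=60.0, clock=time.monotonic):
        self._fetch = fetch_healthy_vns        # hypothetical call to the provider
        self._refresh_seconds = refresh_seconds
        self._clock = clock                    # injectable so the cache is easy to test
        self._vns = []
        self._fetched_at = None

    def get(self):
        now = self._clock()
        stale = (
            self._fetched_at is None
            or now - self._fetched_at >= self._refresh_seconds
        )
        if stale:
            try:
                self._vns = list(self._fetch())
                self._fetched_at = now
            except Exception:
                # If the provider is unreachable, keep serving the last known
                # list; fail only when there has never been a successful fetch.
                if self._fetched_at is None:
                    raise
        return list(self._vns)
```

Serving the last known list when a refresh fails is a design choice made for this sketch; it trades some freshness for availability.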

The salient points involved in number masking are as follows:

The data flow

Worker:

  • When an order is booked, we consume the events for the various stages of the order from an asynchronous event-streaming log.
  • Our event consumer (worker) consumes the message and validates it. We move on to anonymization when we receive a message with a predefined status (such as driver found or order created), which acts as the trigger for the next step.
  • We assign a unique pair of VNs, from the list of healthy VNs, to the customer-driver pair and stage this data in our DB.
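The worker steps above could look roughly like the following. Everything named here is an assumption made for illustration: the event is a plain dictionary, the field names and the `DRIVER_FOUND` status string are hypothetical, and an in-memory dictionary stands in for the DB.

```python
TRIGGER_STATUSES = {"DRIVER_FOUND"}  # hypothetical name for the trigger status
REQUIRED_FIELDS = ("order_id", "status", "customer_phone", "driver_phone")


def handle_order_event(event, healthy_vns, staged):
    """Process one order event; return the staged record, or None if skipped."""
    # 1. Validate the message.
    if not all(event.get(field) for field in REQUIRED_FIELDS):
        return None
    # 2. Only the predefined status triggers anonymization.
    if event["status"] not in TRIGGER_STATUSES:
        return None
    order_id = event["order_id"]
    if order_id in staged:  # redelivered event: keep the handler idempotent
        return staged[order_id]
    # 3. Pick two healthy VNs that no staged order is using.
    in_use = {
        vn
        for record in staged.values()
        for vn in (record["customer_vn"], record["driver_vn"])
    }
    free = [vn for vn in healthy_vns if vn not in in_use]
    if len(free) < 2:
        raise RuntimeError("not enough healthy virtual numbers")
    # 4. Stage the mapping (a dict stands in for the DB here).
    record = {
        "customer_phone": event["customer_phone"],
        "driver_phone": event["driver_phone"],
        "customer_vn": free[0],
        "driver_vn": free[1],
    }
    staged[order_id] = record
    return record
```

Checking for an already staged order before assigning anything keeps the handler safe against redelivered events, which is a common property of asynchronous consumers rather than something specific to our setup.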

Server:

  • When the user has been assigned a driver, the frontend calls us to fetch the virtual numbers for that actor. This happens for both the driver and the customer.
  • The app then displays the other party’s anonymized number, which will be used to make the call.
  • It is interesting to note that, since the presence of the order in our DB depends on an async event (read: the worker), we have also implemented a fallback that fetches the details directly via the order ID.
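The server-side lookup and its fallback might be sketched as below. The function and parameter names (`get_masked_number`, `fetch_order`, `anonymize`) are hypothetical, and what happens after the order is fetched by ID (anonymizing it on the spot and staging the result) is an assumption of this sketch.

```python
def get_masked_number(order_id, role, staged, fetch_order, anonymize):
    """Return the VN that the given role ('customer' or 'driver') should dial."""
    record = staged.get(order_id)
    if record is None:
        # Fallback: the async worker has not staged this order yet,
        # so look the order up directly by its ID.
        order = fetch_order(order_id)
        if order is None:
            raise LookupError(f"unknown order: {order_id}")
        record = anonymize(order)
        staged[order_id] = record
    # Each party is shown the *other* party's virtual number.
    if role == "customer":
        return record["driver_vn"]
    if role == "driver":
        return record["customer_vn"]
    raise ValueError(f"unknown role: {role}")
```

Passing `fetch_order` and `anonymize` in as functions keeps the sketch independent of any particular order service or DB client.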

Provider:

  • To explain this better, suppose the user’s phone number = A
  • User’s anonymized number = B
  • Driver’s phone number = C
  • Driver’s anonymized number = D
  • When the user makes the call, the provider makes an API call to our service with the number pair (A-D), and we validate this against the pairs we have staged in our DB.
  • If the pair is valid, we return the number pair (B-C), for the provider to connect the call through.
  • In essence, the call which was earlier getting connected between A-C now gets routed and re-routed via A-D in the first leg and B-C in the second leg.
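The validation step can be illustrated with a small lookup function. It is a sketch under assumptions: `resolve_call` and the record field names are hypothetical, the staged pairs are passed in as a plain list, and the driver-calls-customer direction is inferred by symmetry rather than described above.

```python
def resolve_call(caller_real, dialed_vn, staged_pairs):
    """Map (caller's real number, dialed VN) to (caller's VN, callee's real number).

    Returns None when the pair is not staged, so the call is not connected.
    """
    for record in staged_pairs:
        # Customer calling the driver's virtual number: (A-D) -> (B-C).
        if caller_real == record["customer_phone"] and dialed_vn == record["driver_vn"]:
            return record["customer_vn"], record["driver_phone"]
        # Driver calling the customer's virtual number (assumed symmetric case).
        if caller_real == record["driver_phone"] and dialed_vn == record["customer_vn"]:
            return record["driver_vn"], record["customer_phone"]
    return None


staged_pairs = [
    {"customer_phone": "A", "customer_vn": "B", "driver_phone": "C", "driver_vn": "D"},
]
print(resolve_call("A", "D", staged_pairs))  # ('B', 'C')
print(resolve_call("A", "Z", staged_pairs))  # None
```

A linear scan is enough for an illustration; a real store would more likely index the staged pairs by the caller's number and the dialed VN.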

It’s interesting to note how one can uphold customers’ privacy through simple tweaks to the basic flow. The call that was earlier connected directly from point A to point C now gets routed via the new points B and D, without adding much overhead or delay. The customer never learns the driver’s real number (unless it is explicitly shared), and vice versa.
