Weaver: Sharding With Simplicity

By Rajeev Bharshetty

The story of GOJEK may have started in Indonesia, but it is a tale destined to chart many geographies across Southeast Asia. Last year, GOJEK expanded beyond the boundaries of its home nation to new markets — Vietnam, Thailand, and Singapore. This move came with its own challenges, one of them being routing requests from multiple geographies to the relevant applications, without the whole system getting affected.

This blog introduces Weaver, GOJEK’s HTTP reverse proxy with dynamic sharding strategies. Weaver was one of the critical tools that helped us expand to newer markets (like Singapore) with little hassle.

Weaver is used internally by teams to route requests to multiple application shards based on a sharding key. It enables teams to have multiple deployments of the same application, and shard the data and requests based on some pre-decided key.

Application layer sharding can help with failure isolation, scaling data intensive applications, and with availability of the overall service.

Before understanding how GOJEK uses this at scale, let’s revisit some basics of Weaver so that we are better equipped to understand its usage.

What is Weaver?

Weaver is an advanced HTTP reverse proxy built in-house at GOJEK, which supports various sharding strategies.

Features

  • Sharding requests based on headers/path/body fields.
  • Metrics on requests per route per backend.
  • Dynamic configuration of different routes (No restarts! 👍).
  • Enhanced speed.
  • Supports multiple algorithms for sharding requests (consistent hashing, modulo, S2 etc.).
  • Packaged as a single self-contained binary.
  • Extensive logging for various failures.
  • Cloud native.

Weaver Terminologies/Concepts

Let’s go through some basic Weaver terminologies to help understand the rest of the post.

Access Control List (ACL):
An ACL in Weaver is a unit of configuration that defines what needs to be done before a request is routed to the defined downstream service(s).

Backend/Shard:
A backend in Weaver is a downstream service to which requests are forwarded.

Route:
A route is a standard HTTP route with a path and HTTP method(s), which is used to match the incoming request and map it to an ACL. For example: GET v2/drivers

Matcher:
A matcher decides which value to match in an HTTP request. For example, Weaver supports matching values from the path, header, and body.
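
To make the matcher idea concrete, here is a minimal Go sketch of pulling a sharding key out of a header, the path, or a JSON body. It is not Weaver's code: `extractKey`, `sampleRequest`, the matcher names as plain strings, the path regex, and the `driver_id` body field are all assumptions made for illustration.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"regexp"
)

// extractKey is a hypothetical helper (not Weaver's API). It pulls the
// sharding key out of the part of the request named by matcher.
func extractKey(r *http.Request, matcher, expr string) (string, error) {
	switch matcher {
	case "header":
		// expr is the header name, e.g. "Driver-ID".
		if v := r.Header.Get(expr); v != "" {
			return v, nil
		}
		return "", fmt.Errorf("header %q is not set", expr)
	case "path":
		// expr is a regular expression whose first capture group is the key.
		re, err := regexp.Compile(expr)
		if err != nil {
			return "", err
		}
		m := re.FindStringSubmatch(r.URL.Path)
		if len(m) < 2 {
			return "", fmt.Errorf("path %q does not match %q", r.URL.Path, expr)
		}
		return m[1], nil
	case "body":
		// expr is a top-level JSON field name (an assumption of this sketch).
		raw, err := io.ReadAll(r.Body)
		if err != nil {
			return "", err
		}
		// Restore the body so it can still be forwarded to a backend.
		r.Body = io.NopCloser(bytes.NewReader(raw))
		var fields map[string]interface{}
		if err := json.Unmarshal(raw, &fields); err != nil {
			return "", err
		}
		v, ok := fields[expr]
		if !ok {
			return "", fmt.Errorf("body field %q is missing", expr)
		}
		return fmt.Sprint(v), nil
	}
	return "", fmt.Errorf("unknown matcher %q", matcher)
}

// sampleRequest builds an in-memory request that carries the same key in
// all three places, so each matcher can be tried against it.
func sampleRequest() *http.Request {
	body := bytes.NewBufferString(`{"driver_id": "42"}`)
	r := httptest.NewRequest(http.MethodPost, "/v2/drivers/42", body)
	r.Header.Set("Driver-ID", "42")
	return r
}

func main() {
	for _, m := range []struct{ matcher, expr string }{
		{"header", "Driver-ID"},
		{"path", `^/v2/drivers/(\d+)$`},
		{"body", "driver_id"},
	} {
		key, err := extractKey(sampleRequest(), m.matcher, m.expr)
		fmt.Println(m.matcher, key, err)
	}
}
```

Reading the body consumes it, which is why the sketch puts the bytes back before returning; a proxy that shards on body fields has to do something similar before forwarding.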

Sharding Strategies:
Weaver helps shard requests across various backends/shards. It supports multiple ways to split the incoming traffic, and the strategies are easily extensible for specific requirements.

Some of the supported sharding techniques include:

  • Simple Lookup on a Key
  • Prefix Lookup
  • Modulo
  • Consistent hashing
  • S2 based

Architecture

Figure 1: a typical architecture of Weaver

Weaver uses etcd for storing and retrieving configurations (ACLs). etcd acts as the control plane, while Weaver itself is the data plane. Weaver dynamically loads new or updated configurations from etcd, without restarts and without losing any connections.
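
One common way to get this kind of restart-free reload is to keep the active configuration behind an atomic pointer and swap it when the control plane reports a change. The Go sketch below shows only that swap. It is an assumption about the general technique, not a description of Weaver's internals: `routeTable` and `configStore` are hypothetical names, and the update is applied by hand where a real proxy would react to a change notification from etcd.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// routeTable is a hypothetical, simplified stand-in for a set of ACLs:
// route name -> backends.
type routeTable map[string][]string

// configStore holds the active routeTable. Request handlers call load on
// every request; the config watcher calls replace when a new version arrives.
type configStore struct {
	current atomic.Value // always holds a routeTable
}

func newConfigStore(initial routeTable) *configStore {
	s := &configStore{}
	s.current.Store(initial)
	return s
}

// load returns the table that is active right now. In-flight requests keep
// using the table they already loaded, so a swap does not disturb them.
func (s *configStore) load() routeTable {
	return s.current.Load().(routeTable)
}

// replace atomically publishes a new table for all subsequent requests.
func (s *configStore) replace(next routeTable) {
	s.current.Store(next)
}

func main() {
	store := newConfigStore(routeTable{"drivers-nearby": {"shard-0"}})
	inFlight := store.load() // a request that started before the update

	// Simulate the control plane publishing an updated ACL.
	store.replace(routeTable{"drivers-nearby": {"shard-0", "shard-1"}})

	fmt.Println(len(inFlight["drivers-nearby"]))     // still sees 1 backend
	fmt.Println(len(store.load()["drivers-nearby"])) // new requests see 2
}
```

The tables are treated as immutable once published: an update builds a fresh table and swaps it in, rather than editing the live one.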

Weaver uses this configuration (ACL) loaded from etcd to match the incoming request, apply the specified sharding technique, and route the request to the appropriate backend. Let’s understand this with the help of an actual example ACL:

ACL

[Image: screenshot of an example ACL]

Consider the above ACL, which is for getting nearby locations of a driver. In this case, the HTTP request is a GET with path `/gojek/drivers/nearby` and carries Driver-ID in a request header. `criterion` in the above ACL defines the HTTP method and the path regex to match an HTTP request against.

shard_config defines multiple things like backends, shard_expr, shard_func and matcher.

shard_expr is used to find the sharding key from an HTTP request.
matcher determines where to find the sharding key.
shard_func is the sharding function to be used on the sharding key to map a request to a backend.

In the example above, shard_expr is Driver-ID and matcher is header. Hence, for all requests matching the above criterion, the sharding key is extracted from the header named Driver-ID. The key is then mapped to a particular backend. The shard_func used here is a simple modulo function, which uses the Driver-ID value to pick the backend.
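
The flow just described (match on method and path, read Driver-ID from the header, apply modulo, forward) can be sketched in Go as a single `http.Handler`. This is a hypothetical illustration, not Weaver's code or its ACL schema: the `acl` struct, the stub backends, and the error responses are assumptions. The backends here are in-process handlers so the sketch runs without a network; in a real proxy each one could be an `*httputil.ReverseProxy`, which also implements `http.Handler`.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"regexp"
	"strconv"
)

// acl loosely mirrors the fields named in the text. The struct layout is
// hypothetical.
type acl struct {
	method    string         // criterion: HTTP method
	path      *regexp.Regexp // criterion: path regex
	shardExpr string         // header that carries the sharding key
	backends  []http.Handler // shards, indexed by key mod len(backends)
}

func (a *acl) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	// 1. criterion: does this ACL apply to the request?
	if r.Method != a.method || !a.path.MatchString(r.URL.Path) {
		http.NotFound(w, r)
		return
	}
	if len(a.backends) == 0 {
		http.Error(w, "no backends configured", http.StatusBadGateway)
		return
	}
	// 2. matcher + shard_expr: read the key from the header.
	id, err := strconv.ParseUint(r.Header.Get(a.shardExpr), 10, 64)
	if err != nil {
		http.Error(w, "missing or non-numeric "+a.shardExpr, http.StatusBadRequest)
		return
	}
	// 3. shard_func: modulo picks the backend, which handles the request.
	a.backends[id%uint64(len(a.backends))].ServeHTTP(w, r)
}

// stubBackend stands in for a downstream service and replies with its name.
func stubBackend(name string) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprint(w, name)
	})
}

func nearbyACL() *acl {
	return &acl{
		method:    http.MethodGet,
		path:      regexp.MustCompile(`^/gojek/drivers/nearby$`),
		shardExpr: "Driver-ID",
		backends:  []http.Handler{stubBackend("shard-0"), stubBackend("shard-1")},
	}
}

// send pushes one in-memory request through the ACL and reports the result.
func send(a *acl, method, path, driverID string) (int, string) {
	req := httptest.NewRequest(method, path, nil)
	if driverID != "" {
		req.Header.Set("Driver-ID", driverID)
	}
	rec := httptest.NewRecorder()
	a.ServeHTTP(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	a := nearbyACL()
	fmt.Println(send(a, "GET", "/gojek/drivers/nearby", "42")) // 42 % 2 == 0
	fmt.Println(send(a, "GET", "/gojek/drivers/nearby", "7"))  // 7 % 2 == 1
}
```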

More about ACLs for different sharding strategies can be found in the Weaver ACL documentation [3].

The motivation behind Weaver

You might ask: there are already numerous reverse proxy solutions available, so why build another?

Weaver started as a solution for scaling GOJEK systems, specifically for clustering and sharding of systems. It then evolved to become a generic solution for proxying across various use-cases.

Let’s dive into one of the real world use-cases where Weaver helped teams move fast and scale:

Sharding Maps Service:

At GOJEK, we use the Google Maps service for a variety of reasons. We use it to calculate the route path of our customers from their pickup location to destination, auto-complete addresses, estimate fares, and provide accurate ETAs. We have a Maps service which is an interface for all of our calls to Google. Understandably, this is one of the critical services within GOJEK.

For our Singapore expansion, we went ahead with a different cluster of the Maps service (the advantages being failure isolation, scaling data growth, and availability) and sharded the incoming requests based on country_code. All we had to do for this was to add an ACL on Weaver with sharding key as country_code and Weaver took care of routing traffic to the appropriate cluster with ease.

The architecture for this deployment looked something like this:

Weaver sharding requests based on Country code to multiple clusters
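
In code terms, routing by country_code is a plain key lookup from country to cluster. The Go sketch below shows that idea only. The cluster host names, the two-letter codes, and the choice to reject unknown countries are assumptions for illustration, not details of GOJEK's deployment.

```go
package main

import (
	"fmt"
	"strings"
)

// clusterFor returns the Maps cluster configured for a country code.
// Unknown codes are rejected rather than sent to a default cluster, so a
// misrouted request cannot load another country's shard (an assumption of
// this sketch).
func clusterFor(clusters map[string]string, countryCode string) (string, error) {
	code := strings.ToUpper(strings.TrimSpace(countryCode))
	cluster, ok := clusters[code]
	if !ok {
		return "", fmt.Errorf("no Maps cluster configured for country_code %q", countryCode)
	}
	return cluster, nil
}

func main() {
	// Hypothetical cluster addresses.
	clusters := map[string]string{
		"ID": "maps-id.internal.example",
		"SG": "maps-sg.internal.example",
	}
	fmt.Println(clusterFor(clusters, "sg"))
	fmt.Println(clusterFor(clusters, "XX"))
}
```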

The same technique of sharding incoming requests is being used for other critical services within GOJEK as well.

Core Principles for building Weaver

One of the core principles for building Weaver was simplicity. We wanted it to be simple enough that developers within GOJEK can actively contribute to it and use it for various use-cases.

Another principle was to make it generic and extensible, while still keeping GOJEK-specific use-cases in mind. Adding a new technique for sharding requests is easy to do on Weaver.

Finally, observability. All requests flowing through Weaver should be transparent to the user. Users should be able to ask questions of the system and understand its behaviour.
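
A question such as "how many requests did this route send to that backend?" comes down to a counter keyed by route and backend, which matches the per-route, per-backend metrics listed under Features. Here is a minimal in-memory Go sketch of such a counter. The names are hypothetical, and it does not show which metrics system Weaver actually reports to.

```go
package main

import (
	"fmt"
	"sync"
)

// routeBackend identifies one (route, backend) pair.
type routeBackend struct{ route, backend string }

// requestCounter counts proxied requests per route per backend and is safe
// for use from concurrent request handlers.
type requestCounter struct {
	mu     sync.Mutex
	counts map[routeBackend]int
}

func newRequestCounter() *requestCounter {
	return &requestCounter{counts: map[routeBackend]int{}}
}

// observe records one request that was routed to backend for route.
func (c *requestCounter) observe(route, backend string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[routeBackend{route, backend}]++
}

// count answers "how many requests did route send to backend so far?".
func (c *requestCounter) count(route, backend string) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts[routeBackend{route, backend}]
}

func main() {
	c := newRequestCounter()
	c.observe("GET /gojek/drivers/nearby", "shard-0")
	c.observe("GET /gojek/drivers/nearby", "shard-0")
	c.observe("GET /gojek/drivers/nearby", "shard-1")
	fmt.Println(c.count("GET /gojek/drivers/nearby", "shard-0"))
	fmt.Println(c.count("GET /gojek/drivers/nearby", "shard-1"))
}
```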

In Conclusion

Weaver has solved a very particular use-case within GOJEK (that of application layer sharding) and is now used for various business-critical flows. It helped us scale systems and teams, and GOJEK is invested in the growth and future development of this project.

Some of the future work on Weaver could include:

  • Load balancing (Weaver does not support load balancing requests across multiple backends as of now).
  • gRPC support (proxying gRPC requests).
  • Support for multiple instrumentation techniques.
  • Circuit breaking.

Weaver is now Open source 🙌

Weaver is completely open source and can be found on GitHub [5].

Please let us know what you think, and give feedback.
Contributions of any kind for improving it are also more than welcome.

References:

[1] Shard
[2] S2 Geometry
[3] Weaver ACLs
[4] Monitoring and Observability
[5] Weaver GitHub OSS

GOJEK is Indonesia’s first unicorn, and our growth has been exponential because of a team that is always innovating. Today, we process over 100 million orders a month, and roughly 350 million internal API calls every second. As you’ve gathered from this post, we also love open source, and encourage our developers to contribute to open source libraries. We’re also hiring, so if this sounds like your kind of jam, visit gojek.jobs and come help us redefine on-demand platforms in Southeast Asia 🙌