How We Pushed a Million Keys to Redis in Seconds

Dealing with a lot of keys? Redis’ Pipe Mode is your friend.

By Parampreet Singh

Hello there!

In this post, I’ll share how we populated Redis (running in a Kubernetes cluster)… in a matter of seconds.

Here’s what you can expect from this post:

1. How to connect to a Redis server running in a Kubernetes cluster?
2. What is port-forwarding?
3. How to use Redis mass insertion and push millions of keys in seconds?
4. How to generate Redis protocol?
5. How to read/parse a CSV in Ruby?

Wait, but why do I need to do this? 🤔

At Gojek, we use Redis in one of our services for caching drivers for faster lookups. When we deployed this service to new clusters, we needed to populate Redis with ~81K keys.

What we didn’t do (and what you shouldn’t do either)

Well, this. 👇

$ redis-cli -h "hostname" -p 6379 set "key" "value"

This simple and easy way of storing a key through redis-cli is okay, but not for thousands or millions of keys. You don’t want to end up waiting for hours unless you are Regina Phalange! 😛

Using a normal Redis client to perform mass insertion is not a good idea. The naive approach of sending one command after the other is slow, because you have to pay for the round trip time for every command.
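To get a feel for that cost, here is a back-of-the-envelope sketch. The round-trip times used (1 ms and 50 ms) are assumed for illustration only, not measurements from our setup, and sequential_seconds is a hypothetical helper name.

```ruby
# Rough lower bound on the time spent waiting when every command pays
# one full network round trip before the next command is sent.
def sequential_seconds(command_count, round_trip_ms)
  command_count * round_trip_ms / 1000.0
end

# Assumed round-trip times, for illustration only.
puts sequential_seconds(81_000, 1)      # ~81K keys at 1 ms each
puts sequential_seconds(1_000_000, 50)  # a million keys at 50 ms each
```

Even with a fast network, the waiting adds up linearly with the number of keys, which is exactly what pipelining avoids.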

Let’s do something different!

We will use Redis mass insertion, but before going to that, let’s talk a bit about Redis Protocol.

Redis clients communicate with the Redis server using a protocol called RESP (REdis Serialization Protocol).

With that said, let’s go write some code! I like toying around with Ruby, so this was my language of choice.

The gen_redis_proto function generates the protocol required for mass insertion.
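The function body isn't listed in this post. A minimal sketch, modelled on the example in the Redis mass insertion documentation, could look like this (treat it as one possible implementation rather than the exact code we ran):

```ruby
# Encode one Redis command in RESP: an array header with the argument
# count, then every argument as a length-prefixed bulk string.
def gen_redis_proto(*cmd)
  proto = "*#{cmd.length}\r\n".dup   # dup so the buffer is mutable
  cmd.each do |arg|
    value = arg.to_s
    # bytesize (not length) so multi-byte characters are counted correctly
    proto << "$#{value.bytesize}\r\n#{value}\r\n"
  end
  proto
end
```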

2.6.3 > puts gen_redis_proto("SET","mykey","Hello World!").inspect

Running the above command in the Ruby console gives us the following protocol string.

"*3\r\n$3\r\nSET\r\n$5\r\nmykey\r\n$12\r\nHello World!\r\n"

Well, this is how a command is represented and sent to the Redis Server through Redis Protocol.

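*<args><cr><lf>
$<len0><cr><lf>
<arg0><cr><lf>
$<len1><cr><lf>
<arg1><cr><lf>
...
$<lenN><cr><lf>
<argN><cr><lf>

Where <args> is the number of arguments, each <len> is the byte length of the argument that follows it, <cr> means "\r" (ASCII character 13), and <lf> means "\n" (ASCII character 10).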
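The script that ties this together (redis_mass_insert.rb) isn't listed in this post either. Here is a rough sketch of how such a script could read a CSV and emit one SET command per row. The column names "key" and "value", the sample rows, and the emit_mass_insert helper are all hypothetical.

```ruby
require "csv"
require "stringio"

# Encode one Redis command in RESP (same idea as gen_redis_proto).
def gen_redis_proto(*cmd)
  proto = "*#{cmd.length}\r\n".dup
  cmd.each do |arg|
    value = arg.to_s
    proto << "$#{value.bytesize}\r\n#{value}\r\n"
  end
  proto
end

# Read CSV rows from `input` and write one SET command per row to `output`.
# Assumes a header row with hypothetical "key" and "value" columns.
def emit_mass_insert(input, output)
  CSV.new(input, headers: true).each do |row|
    output.write(gen_redis_proto("SET", row["key"], row["value"]))
  end
end

# In-memory demo with made-up rows, so the sketch runs without any files.
sample = StringIO.new("key,value\ndriver:1,online\ndriver:2,offline\n")
buffer = StringIO.new
emit_mass_insert(sample, buffer)
puts buffer.string.inspect
```

In a real script you would pass an open file and standard output instead, for example emit_mass_insert(File.open("drivers.csv"), $stdout), where drivers.csv is a placeholder file name.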

We can now run this script, but there’s a catch. Our Redis server runs in a Kubernetes cluster, and we didn’t want to install Ruby and its gems inside the cluster. So now what?

Enter port-forwarding! 👍

$ kubectl -n "namespace" port-forward "pod-name" 7000:6379

Connections made to local port 7000 are forwarded to port 6379 of the pod that is running the Redis server. With this connection in place we can use our local workstation to debug the database that is running in the pod.
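Before piping thousands of commands through the tunnel, it can help to confirm that the forwarded port actually answers. A small sketch, assuming the port-forward is running on local port 7000 and the server does not require authentication; redis_ping is a hypothetical helper name.

```ruby
require "socket"

# Send a RESP-encoded PING and return the first reply line as-is.
# A healthy, unauthenticated Redis server answers with "+PONG\r\n".
def redis_ping(host, port)
  Socket.tcp(host, port, connect_timeout: 2) do |sock|
    sock.write("*1\r\n$4\r\nPING\r\n")
    sock.gets("\r\n")
  end
end

# Usage, with the port-forward running in another terminal:
#   puts redis_ping("127.0.0.1", 7000).inspect
```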

Finally, we run our script to populate Redis 😬

$ ruby redis_mass_insert.rb | redis-cli -p 7000 --pipe

All data transferred. Waiting for the last reply...
Last reply received from server.
errors: 0, replies: 81003

We ran this script and it completed in a fraction of a second!

But, how?

In Redis 2.6 and later, the redis-cli utility supports a mode called pipe mode, which was designed for mass insertion.

Under the hood of pipe mode

According to the official doc:

  • redis-cli --pipe tries to send data as fast as possible to the server.
  • At the same time it reads data when available, trying to parse it.
  • Once there is no more data to read from stdin, it sends a special ECHO command with a random 20 bytes string: we are sure this is the latest command sent, and we are sure we can match the reply checking if we receive the same 20 bytes as a bulk reply.
  • Once this special final command is sent, the code receiving replies starts to match replies with these 20 bytes. When the matching reply is reached it can exit with success.
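The steps above can be sketched in Ruby. This is a simplified illustration, not how redis-cli is actually implemented: it sends everything first and only then reads replies (redis-cli interleaves the two), it only understands simple, error, integer, and bulk replies, and the pipe_to_redis helper and its return value are hypothetical.

```ruby
require "securerandom"

# Encode one Redis command in RESP.
def gen_redis_proto(*cmd)
  proto = "*#{cmd.length}\r\n".dup
  cmd.each do |arg|
    value = arg.to_s
    proto << "$#{value.bytesize}\r\n#{value}\r\n"
  end
  proto
end

# Push pre-encoded RESP from `input` to `sock`, then send ECHO with a random
# 20-byte marker and read replies until that marker comes back.
def pipe_to_redis(input, sock)
  IO.copy_stream(input, sock)               # send all the data
  marker = SecureRandom.hex(10)             # 20 random characters
  sock.write(gen_redis_proto("ECHO", marker))

  replies = 0
  errors = 0
  while (line = sock.gets("\r\n"))
    case line[0]
    when "$"                                # bulk reply: "$<len>\r\n<data>\r\n"
      len = line[1..-1].to_i
      if len >= 0
        payload = sock.read(len + 2)        # data plus trailing CRLF
        break if payload == "#{marker}\r\n" # our ECHO came back: all done
      end
    when "-"                                # error reply
      errors += 1
    end
    replies += 1                            # the marker reply is not counted
  end
  { replies: replies, errors: errors }
end
```

Because the marker is the last thing sent, seeing it echoed back means every earlier command has already been answered.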

Naice, what’s next?

Well, I tried populating Redis locally with a million keys.

It worked like a charm, in just ~2 seconds. 😄

That’s it!

I really hope that this post gave you some new insights.

Thanks for reading! 💚

References

  1. Redis Mass Insertion
  2. Redis Protocol
  3. Port Forwarding in Kubernetes to access applications