How NOT to write an Envoy Lua filter

Here are 2¹⁰ words about how not to write the powerful, yet simple Envoy Lua filters.

By Amitosh Swain Mahapatra

First things first… What’s Envoy?

Envoy is a high-performance L4 and L7 proxy with wire-level support for multiple protocols such as HTTP, gRPC, MongoDB, DynamoDB, etc. It offers advanced load balancing features, including automatic retries, circuit breaking and rate limiting, all with the goodness of APIs for dynamically managing its entire configuration.

At Gojek, we use Envoy at multiple places — as a fronting proxy to micro-services, as an ingress point to client-facing APIs, etc. This is done with our in-house implementation of an Envoy Control Plane/Discovery Service — consul-envoy-xds, which dynamically configures clusters, listeners, routes and filters for our Envoy instances with data from Consul and Vault (for SDS).

Filters lie at the heart of Envoy request processing architecture.

They perform operations on each incoming connection at various points. Some filters execute on receiving data from upstream or downstream, and some others run after a request is processed. The filters operate as a flow chain, which can stop and subsequently continue the iteration to further filters. This allows for more complex scenarios such as rate-limiting.

Envoy Lua filters

Envoy comes with several built-in HTTP filters. However, some operations on HTTP connections cannot be implemented by the built-in filters and require custom logic. Envoy provides a generic Lua filter and a Lua API to program such filters.

Lua is a fast, embeddable language with a minimal footprint and a straightforward C API for embedding. It is widely used by game engines, embedded systems, databases (such as Redis) and many performance-critical applications such as network routers and proxies.

Now that we’ve gotten our basics right, here is…

How not to write an Envoy Lua filter

We maintain several Lua filters for various metadata and operational transformation on incoming HTTP requests. Here’s a set of quirks I encountered while writing my first Lua filter for Envoy.

Arrays start at 1 ☝️

Arrays, in contrast to C family languages, start at 1 in Lua. Having coded in Java for the last decade or so, I was accustomed to 0-based index calculations. 😅 Computing offsets from 1 simply made my head spin. It was challenging to translate algorithms that rely heavily on array lookups.

And they can have holes!

Yes, arrays can have missing elements. If the size is 10, it doesn’t always mean that the array is guaranteed to have elements from 1 to 10. Internally, Lua arrays are modelled as tables with integer keys, similar to a dictionary. The safest way to visit every element is to iterate with pairs; ipairs stops at the first missing index, and the length operator # is unreliable when there are holes.
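
A small sketch of what a hole does to iteration. The table and variable names are made up for illustration; the behaviour shown is standard Lua and applies to LuaJIT as well.

```lua
-- A table used as an array, with index 3 missing (a "hole").
local t = {}
t[1] = "a"
t[2] = "b"
t[4] = "d"

-- Indexing starts at 1, so there is nothing at index 0.
local zeroth = t[0]  -- nil

-- ipairs walks 1, 2, 3, ... and stops at the first nil value.
local seen_by_ipairs = 0
for _ in ipairs(t) do
    seen_by_ipairs = seen_by_ipairs + 1
end

-- pairs visits every key that holds a non-nil value, in no guaranteed order.
local seen_by_pairs = 0
for _ in pairs(t) do
    seen_by_pairs = seen_by_pairs + 1
end

-- ipairs sees 2 elements, pairs sees 3.
print(seen_by_ipairs, seen_by_pairs)
```

If the order matters and holes are possible, one option is to track the highest index yourself and loop with a numeric for, checking each slot for nil.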

Tables, tables everywhere

Everything in Lua, except primitives, is a table. Tables are key-value pairs similar to a hash map, but on steroids. For example, you can define an __index metamethod in a table's metatable to dynamically supply values for keys that do not exist in the table.
Lua 5.2 and later also have __pairs, which changes the way pairs iterates over the table. LuaJIT follows Lua 5.1 and honours __pairs only when it is built with its Lua 5.2 compatibility option.
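
Here is a minimal sketch of __index in action, using a hypothetical config table that falls back to a defaults table. The names are illustrative only.

```lua
-- Fallback values for keys that the config table does not define itself.
local defaults = { timeout_ms = 5000, retries = 3 }

-- __index is consulted only when the key is missing from the table.
local config = setmetatable({ retries = 1 }, {
    __index = function(_, key)
        return defaults[key]
    end,
})

local retries = config.retries        -- 1, found directly in the table
local timeout = config.timeout_ms     -- 5000, supplied by __index
local missing = config.unknown_key    -- nil, absent from both tables

-- rawget bypasses the metatable and shows what the table really holds.
local raw_timeout = rawget(config, "timeout_ms")  -- nil
```

The same mechanism is what makes the class-like pattern later in this post work.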

Truthy and Falsy ✅❌

Except for nil and false, every value is treated as true in an if check. This resembles truthiness in Python and JavaScript, with one important difference: 0 and the empty string are also truthy in Lua. It’s extremely handy while checking the return value of a function call.
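
A quick sketch to make the rule concrete. The helper is_truthy is a made-up name that simply reports which branch an if check would take.

```lua
-- Returns true when the value would pass an `if value then` check.
local function is_truthy(value)
    if value then
        return true
    end
    return false
end

-- Only nil and false fail the check.
local results = {
    zero         = is_truthy(0),      -- true
    empty_string = is_truthy(""),     -- true
    empty_table  = is_truthy({}),     -- true
    nil_value    = is_truthy(nil),    -- false
    false_value  = is_truthy(false),  -- false
}
```

So a check such as "if count then" does not tell you whether count is zero; compare against 0 explicitly when that is what you mean.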

Everything is global unless explicitly said otherwise

Like JavaScript in sloppy (non-strict) mode, if you do not prefix a variable declaration with the local keyword, it will be treated as a global, and the consequences of polluting the global scope can be severe.
This also applies when using require to load modules from Lua code.

⚠️ Never do this:

require 'some-module'
require 'another-module'

Any globals that another-module defines can override those from some-module. Instead, scope them using a local variable:

local module = require 'some-module'
module.do_something()
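
One common way to catch a forgotten local early is to put a metatable on the environment table that rejects new keys. The sketch below is an assumption-laden illustration: forbid_new_globals is a made-up helper, and it is demonstrated on a stand-in table rather than the real _G. Whether installing it on _G is safe inside Envoy is something you would need to verify, and entry points such as envoy_on_request would have to be defined before the guard is installed because they are globals themselves.

```lua
-- Makes any attempt to create a new key in `env` raise an error.
-- Assignments to keys that already exist are unaffected, because
-- __newindex only fires for missing keys.
local function forbid_new_globals(env)
    return setmetatable(env, {
        __newindex = function(_, name)
            error("accidental global '" .. tostring(name) .. "'", 2)
        end,
    })
end

-- Stand-in for _G, used here so the sketch does not touch real globals.
local env = forbid_new_globals({ existing = 1 })

-- Allowed: the key already exists.
env.existing = 2

-- Rejected: this would create a new key.
local ok, err = pcall(function()
    env.typo_counter = 0
end)
```

After this runs, ok is false and err names the offending variable.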

Did I hear classes?

Lua is a purely procedural language with no first-class notion of objects and classes. Objects can be simulated using tables. Class inheritance can be implemented to some extent using tables and meta-tables, following a prototype-based design much like the one we use in classic JavaScript.

--
-- metaclass
--
local Point = {}
--
-- constructor
--
function Point:new (x, y)
    local o = {x = x, y = y}
    setmetatable(o, self)
    self.__index = self
    return o
end
function Point:equals (p)
    return self.x == p.x and self.y == p.y
end
-- ...
--
-- creating points
--
local p1 = Point:new(10, 20)
local p2 = Point:new(30, 40)
--
-- example of a method invocation
--
p1:equals(p2)

There are many ways to define classes in Lua. This is just one of them, and not necessarily the best.

Envoy specific Lua quirks

Lua vs LuaJIT

Envoy uses LuaJIT, a Lua runtime whose JIT compiler makes Lua code crazily fast. LuaJIT targets Lua 5.1, so there are minor differences between it and the current reference Lua implementation. They become apparent only after taking a deep dive into Lua.

rm -rf your globals

Apart from polluting your global scope (which is very bad in its own right), globals bring additional memory management issues when used in Envoy. Due to a bug that existed in Envoy, they are never removed by the GC and cause a memory leak. Your server will crash soon after serving a few thousand requests. 😵

Mind your buffers

If you’re planning on reading the entire request body for inspection, you have to configure Envoy to set your buffer size such that the whole request body can be stored in the request buffer. Otherwise, you’ll silently receive truncated data.
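
As a rough sketch, reading the buffered body in a request handler could look like the following. It uses the body, length, getBytes and logInfo calls from the Envoy Lua API; the log message is made up, and the nil check is a defensive assumption for requests that carry no body.

```lua
function envoy_on_request(handle)
    -- body() suspends the script until the whole body has been buffered.
    local body = handle:body()
    if body == nil then
        return
    end

    -- getBytes takes a zero-based start index and a length.
    local payload = body:getBytes(0, body:length())

    -- Log only the size; logging the payload itself could leak data.
    handle:logInfo("buffered " .. #payload .. " bytes of request body")
end
```

If the logged size is smaller than what the client sent, the buffer limit is the first thing to check.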

HTTP calling HTTP

Envoy allows you to make HTTP calls to other hosts from inside the plugin. This reuses the existing HTTP mechanism of Envoy. Unfortunately, this means that every host or group of hosts you need to connect to has to be registered as a cluster, whether it is an external or an internal host.
⚠️ Also, make sure you do not accidentally expose internal APIs.
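
A minimal sketch of such a call, using httpCall and respond from the Envoy Lua API. The cluster name auth_service, the path and the authority are hypothetical, and the cluster would have to exist in your Envoy configuration.

```lua
function envoy_on_request(handle)
    -- Arguments: cluster name, request headers, optional body, timeout in ms.
    local headers, body = handle:httpCall(
        "auth_service",  -- hypothetical cluster name
        {
            [":method"] = "GET",
            [":path"] = "/check",              -- hypothetical path
            [":authority"] = "auth.internal",  -- hypothetical host
        },
        nil,   -- no request body
        1000)

    -- The response headers come back as a plain Lua table of strings.
    if headers[":status"] ~= "200" then
        handle:respond({ [":status"] = "403" }, "forbidden")
    end
end
```

The call blocks the script, not the worker thread: Envoy suspends the coroutine and resumes it when the response or the timeout arrives.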

And make sure to keep your logs clean

While making HTTP requests from a Lua filter, and when Envoy is configured to log in debug mode, Envoy prints the entire request & response headers and body.
⚠️ Please make sure your production code does not include unnecessary logging and does not leak any sensitive information such as keys.

HTTP body is immutable

Your plugin should ideally operate on the headers. The body is supposed to be read-only. At the time of writing, Envoy provides no standard way of mutating the request body.

…unless you use a hack like this

Dispatch an HTTP call to the same cluster with the same name, headers and a modified body. Respond to the caller using the status and body from the response after modifications.
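
The hack could be sketched roughly as follows. Everything named here is an assumption for illustration: the cluster name backend, the rewrite_body helper and its replacement rule. Only the pseudo-headers are forwarded to keep the sketch short; a real filter would copy the other headers it needs as well.

```lua
-- Hypothetical transformation applied to the request body.
local function rewrite_body(body)
    -- Parentheses drop gsub's second return value (the match count).
    return (body:gsub("draft", "final"))
end

function envoy_on_request(handle)
    -- Copy header values into plain strings before anything yields.
    local request_headers = handle:headers()
    local method = request_headers:get(":method")
    local path = request_headers:get(":path")
    local authority = request_headers:get(":authority")

    -- body() yields until the body is fully buffered.
    local original = handle:body()
    local payload = ""
    if original ~= nil then
        payload = original:getBytes(0, original:length())
    end

    -- Re-send the request with the modified body ("backend" is hypothetical).
    local upstream_headers, upstream_body = handle:httpCall(
        "backend",
        { [":method"] = method, [":path"] = path, [":authority"] = authority },
        rewrite_body(payload),
        5000)

    -- Answer the caller directly with the upstream result.
    handle:respond({ [":status"] = upstream_headers[":status"] }, upstream_body)
end
```

Because respond ends the request inside the filter, the original request never reaches the upstream through the normal route.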

Debugging! 🐞

Debugging plain Lua is already hard, as Lua has no standard debugger. The ones out there do not have dynamic breakpoints. Most of the time, print is the only way.
Debugging embedded Lua code is even more challenging, as you cannot even use those aftermarket debuggers. And in the case of Envoy, you cannot execute a print. Debugging Envoy filters usually involves a lot of handle:log* calls. The worst I remember is setting the value of a header and inspecting it using ngrep!
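
For reference, a typical log-based probe might look like this. logInfo is one of the handle:log* functions in the Envoy Lua API; the message text is made up, and the x-request-id header is only an example of something harmless to log.

```lua
function envoy_on_request(handle)
    -- get returns nil when the header is absent, hence the tostring.
    local request_id = handle:headers():get("x-request-id")
    handle:logInfo("lua filter saw request " .. tostring(request_id))
end
```

The message only shows up if Envoy's log level is at or below the level you log at.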

Thou shalt not call native code

You cannot call native code as Envoy, by default, does not export the symbols required by Lua modules written in C. If necessary, either use FFI or compile your Envoy with the Lua symbols exported. Be warned that native modules do not offer the memory-safety guarantees provided by Lua.

Testing all the things

Use a test runner.
Busted is a great test runner for Lua. Since Lua is dynamically typed, you can easily recreate all data structures required for simulating an Envoy environment.
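
As a sketch of what recreating the Envoy environment can mean, here is a hand-rolled fake handle and a hypothetical filter function tested against it. It uses plain assert so it runs anywhere; inside a Busted spec the same setup would sit in a describe/it block. The names add_trace_header and fake_handle are made up, and the fake covers only the two header methods this filter touches.

```lua
-- Hypothetical filter logic under test.
local function add_trace_header(handle)
    local headers = handle:headers()
    if headers:get("x-trace-id") == nil then
        headers:add("x-trace-id", "generated")
    end
end

-- Minimal stand-in for the Envoy stream handle.
-- Returns the fake and the table that stores its headers.
local function fake_handle(initial_headers)
    local store = {}
    for name, value in pairs(initial_headers or {}) do
        store[name] = value
    end

    local headers = {}
    function headers:get(name) return store[name] end
    function headers:add(name, value) store[name] = value end

    local handle = {}
    function handle:headers() return headers end
    return handle, store
end

-- A missing header gets filled in.
local handle, store = fake_handle({})
add_trace_header(handle)
assert(store["x-trace-id"] == "generated")
```

Keeping the filter logic in functions that only receive the handle is what makes this kind of substitution possible.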

Remember to run some kind of integration tests. Fire up a Docker container with your filter loaded. Make some requests and assert on the result. Envoy Lua has some quirks (like dangling pointers after coroutine yield) that cannot be simulated in unit tests.

Do not hold a reference to anything returned by the handle

…such as header, metadata or body objects. They are backed by C++ objects and have a different life-cycle than ordinary Lua objects. A coroutine suspend-resume makes those references invalid. This once caused a bug which was very hard to locate.
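
In practice this means copying values out as plain Lua strings and asking the handle again after anything that yields. A hedged sketch, where the cluster name audit_service, the path and the header names are all hypothetical:

```lua
function envoy_on_request(handle)
    -- Copy the value out as a plain Lua string; strings are safe to keep.
    local user = handle:headers():get("x-user")

    -- httpCall suspends the coroutine until the response arrives.
    handle:httpCall(
        "audit_service",  -- hypothetical cluster name
        {
            [":method"] = "POST",
            [":path"] = "/audit",
            [":authority"] = "audit.internal",
        },
        "user=" .. tostring(user),
        1000)

    -- Fetch a fresh headers wrapper instead of reusing one from before the call.
    handle:headers():add("x-audited", "true")
end
```

The tempting version stores handle:headers() in a local at the top and reuses it after the call, which is exactly the pattern to avoid.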

Distributing your code

The simplest scripts can be embedded directly in the configuration file. But for more extensive plugins with multiple files, that is impractical. Instead, you can create your plugin as a Lua module, distribute it using LuaRocks, the Lua package manager, and use a shim in your configuration file:

- envoy.lua: |
    local filter = require('my-lua-filter')
    function envoy_on_request(handle)
        filter.handle_request(handle)
    end

Apart from keeping your configuration file short and sweet, it allows you to develop and deploy your filter code independent of the configuration.
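
For completeness, here is a sketch of what the module behind such a shim could contain. The header it adds is made up. To keep the sketch runnable as a single file, the module is registered through package.preload; in a real rock the function body would simply be the contents of my-lua-filter.lua.

```lua
-- Stand-in for the file my-lua-filter.lua.
package.preload["my-lua-filter"] = function()
    local M = {}

    -- Entry point called by the shim; receives the Envoy stream handle.
    function M.handle_request(handle)
        handle:headers():add("x-filtered-by", "my-lua-filter")
    end

    return M
end

-- The shim stays as small as before.
local filter = require("my-lua-filter")

function envoy_on_request(handle)
    filter.handle_request(handle)
end
```

Note that the module returns a table and defines no globals, which keeps it in line with the advice on scoping above.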

Fantastic rocks and where to keep them 🖖

Now that you’re creating rocks for your filters, you need to place them at a location from which you can install them on your boxes and containers. In contrast to other package managers such as npm, LuaRocks does not need any special server implementation to host the rocks at a central location. All you need is to generate a LuaRocks manifest and upload it, along with the individual rock files, to an HTTP host.
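
A rock is described by a rockspec, which is itself a Lua file. The sketch below shows roughly what one could look like for a single-file filter. Every value in it is a placeholder: the package name, version, URL and file name are assumptions, not taken from our setup.

```lua
-- my-lua-filter-1.0-1.rockspec (all names and URLs are placeholders)
package = "my-lua-filter"
version = "1.0-1"

source = {
    -- Where LuaRocks fetches the source archive from.
    url = "https://rocks.example.internal/my-lua-filter-1.0.tar.gz",
}

description = {
    summary = "Envoy Lua filter packaged as a rock",
}

dependencies = {
    "lua >= 5.1",
}

build = {
    -- The builtin backend copies pure-Lua modules into place.
    type = "builtin",
    modules = {
        -- Module name as used in require, mapped to its source file.
        ["my-lua-filter"] = "my-lua-filter.lua",
    },
}
```

The file name has to match the package and version fields, which is why the comment at the top spells it out.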

Ending notes

Lua uses different conventions than “C” family languages that most of us are familiar with. This sometimes creates many stumbling blocks for us while we code.
Lua filters in Envoy are incredibly powerful to implement logic that cannot be expressed otherwise. Lua code is swift, and in real-world uses, it does not add noticeable latency to your requests. Sometimes the API exposed by Envoy may not be enough, and for such cases, you can look into implementing C++ filters.
