Interfacing Lua With Templates in C++11

The code for this project is hosted here

This is the first part of a 3 part series: – part 1part 2part 3

Why do I need to know this?

random C++ programmer

Template programming in C++11 is often met with furrowed brows. Techniques associated with template programming are often written off as something for the experts to worry about. While integrating Lua with the game engine I’ve been writing, I came across a nifty and useful way to use some of the new template features in C++11. Other bindings to Lua exist but I opted to write my own because the others didn’t quite do what I wanted them to (mine is still a WIP, hosted on Github here). The goal of this post is to provide a use case for template metaprogramming in C++ and showcase some of the new C++11 syntax for variadic templates (templates with a variable number of arguments). Hopefully, it will motivate you to delve even deeper into metaprogramming if you haven’t already. To read this, I assume you already know what a template is and you are somewhat comfortable with C++ syntax.

The Lua C API

To understand the problem, we need to first understand the Lua C API. C (or C++) and Lua communicate through a stack. Invoking a Lua-function from C involves the following steps:

  1. Push the function name to the stack
  2. Push each function argument to the stack
  3. Invoke lua_pcall, supplying the number of arguments and the number of return values
  4. Pop the returned values from the stack

Because the Lua API is C based, it provides a different function for each type that can be pushed or popped from the stack (for example, lua_tointeger and lua_pushboolean). Performing the above steps can be really cumbersome and result in a lot of code. Suppose we had a simple Lua file that looked like:

This is a lua function that takes two arguments, subtracts them, and returns the difference along with a string. Calling this function with the basic syntax might look like this:

This is a lot of code to call one function! Not only that, but there are plenty of ways it can go wrong. We could have mixed up the types, or popped or pushed values in the wrong order. We may have also forgotten to pop the correct number of values at the end.

Imagining the perfect interface

Here is how I’d want to call a function in Lua. First, I’d want a C++ object to encapsulate the lua_State pointer and automatically free it when it goes out of scope. Call it State for simplicity. The improved code with our hypothetical interface might look like this:

Cleaner, huh? All the magic happens in the line

1
std::tie(difference, greeting) = l.Call<int, std::string>("subtract_and_hello", 1, 3);

This binds a returned tuple of type std::tuple<int, std::string> to the int and std::string on the left. Note that nowhere did we need to specify the number or types of arguments or returned values. All of that was inferred from the function call itself.

Implementing sel::State

So now that we’ve envisioned a cleaner, more usable, and less error-prone interface, we need to set about implementing it. Let’s start with a shell of the class definition.

This gives us an object that when instantiated, will create a Lua context and automatically free it when its destructor is invoked. It also prohibits copying while permitting moving. If you aren’t familiar with move semantics, you don’t need to understand it to continue to follow along, but I highly recommend you learn that topic.

Push and Read

Now we need to be able to interface with the stack. To push to the stack, we opt for the simple approach of functions overloaded by argument.

1
2
3
void Push(const int value);
void Push(const bool value);
...

A sample implementation for one of these might look like:

1
2
3
void State::Push(int value) {
    lua_pushinteger(_l, value);
}

For reading, we can’t use the same approach because C++ does not allow function overloading based on return type (only a few languages do). Thus, we rely on a template method with signature:

1
template<T> T Read(int index) const;

To use this function, we’ll need to provide instantiations for all the basic types supported by the API. I provide two examples below.

1
2
3
4
5
6
7
8
template<>
float State::Read(int index) const {
    return lua_tonumber(_l, index);
}
template<>
bool State::Read(int index) const {
    return lua_toboolean(_l, index);
}

With these functions, we can now invoke Push<int>(4) and Read<int>(1) to push 4 to the stack and read it back.

Handling multiple values at a time

Our next step is to make it convenient to push and pop multiple values at a time. Let’s handle pushing multiple values first. To allow something like Push(1, "hi", true), we add in addition to the other push functions above, the following template function:

1
2
3
4
5
template <typename T, typename... Ts>
void Push(const T value, const Ts... values) {
    Push(value);
    Push(values...);
}

The ellipses (…) if you haven’t seen them before signify a parameter pack, making this a variadic template function. This particular function can take a variable number of arguments but must take at least one. The compiler will pattern match the first argument to this function to const T Value and it recursively calls itself until it encounters one of the specific overloads from the previous section.

Now, we can do something like

1
2
3
4
int main() {
    sel::State state;
    state.Push(1, "hi", true);
}

and the stack will contain an integer, a string, and a boolean. Now, we need to implement multi-value reads. Since usually, we want to pop these values afterwards, we’ll just implement it as a multi-value pop. This is a bit trickier because as before, we can’t rely on argument overloading since the varying types are embedded in the return value. In addition, we’d like the function to return a boxed tuple if multiple values are requested, and an unboxed normal value if a single value is requested.

To do this specialization, we’ll use a sort of intermediate struct to differentiate between the two cases. This is called a type-traits class.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
// General type trait struct
template <typename... Ts>
struct _pop {
    typedef std::tuple<Ts...> type;
    static type apply(State &s) {
        // todo
    }
};

// Specialization for singular type
template <typename T>
struct _pop<T> {
    typedef T type;
    static type apply(State &s) {
        // todo
    }
};

template <typename... T>
typename _pop<T...>::type Pop() {
    // Store the elements requested
    return _pop<T...>::apply(*this);
}

Notice that the return type of Pop is embedded in the _pop helper struct which the compiler chooses based on the number of template arguments passed. We use the auto keyword for convenience although _pop<T...>::type would have sufficed as well. The T... in the invocation of apply will expand out to all the template arguments that were passed in originally.

Implementing the singular case of _pop<T> is relatively straightforward:

1
2
3
4
5
6
7
8
9
10
11
12
template <typename T>
struct _pop<T> {
    typedef T type;
    static type apply(State &s) {
        // Read the top element (negative indices count from the top)
        T ret s.Read<T>(-1);

        // Remove it from the stack
        lua_pop(s._l, 1);
        return ret;
    }
};

For the multi-valued _pop, we’ll need to use recursion to build a tuple. A first try at an implementation might look like:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
void PopBottom<T>(State &s) {
    // Read the bottom element
    T ret = s.Read<T>(1);

    // Remove the bottom element
    lua_remove(s._l, 1);

    return ret;
}

template <typename... Ts>
struct _pop {
    typedef std::tuple<Ts...> type;
    static type apply(State &s) {
        auto ret = std::make_tuple(PopBottom<Ts>(s)...);
        lua_pop(s._l, sizeof...(Ts));
        return ret;
    }
};

However, this doesn’t work. To see why it might work at all, we note that std::make_tuple(PopBottom<Ts>(l)...) will expand to

1
std::make_tuple(PopBottom<T1>(l), PopBottom<T2>(l), ..., PopBottom<Tn>(l));

if there are n template arguments passed. While this would work if the functions were guaranteed to evaluate from left to right, the C++ standard makes no guarantees about such an evaluation order. In fact, g++ v4.8.2 will invoke each argument to make_tuple from right to left, resulting in a tuple that is the reverse of what we want! Thus, we need to use recursion to enforce the order of execution.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
template <typename... Ts>
struct _pop {
    typedef std::tuple<Ts...> type;

    // base case: creates a tuple containing one element
    template <typename T>
    static std::tuple<T> worker(const State &s, const int index) {
        return std::make_tuple(s.Read<T>(index));
    }

    // inductive case
    template <typename T1, typename T2, typename... Rest>
    static std::tuple<T1, T2, Rest...> worker(const State &s,
                                              const int index) {
        std::tuple<T1> head = std::make_tuple(s.Read<T1>(index));
        return std::tuple_cat(head, worker<T2, Rest...>(s, index + 1));
    }

    static type apply(State &s) {
        auto ret = worker<Ts...>(s, 1);
        lua_pop(s._l, sizeof...(Ts));
        return ret;
    }
};

Recursion is a must here since there is no way to iterate through a parameter pack. If you’ve programmed in Lisp, Haskell, Erlang, or something similar, the code above will look pretty familiar. Note that the template type signature <typename T1, typename T2, typename... Rest> is important to signify a function that has at least 2 template arguments since a parameter pack can have zero or more elements. If we used <typename T1, typename... Rest> instead, the code would fail to compile since it would have an ambiguous overload with the previous function with template type signature <typename T>.

With this, we can now pop multiple values from the stack!

1
2
3
4
5
6
7
8
9
10
11
int main() {
    sel::State state;
    state.Push(1, "hi", true);

    int a;
    std::string b;
    bool c;
    std::tie(a, b, c) = state.Pop<int, std::string, bool>();

    assert(a == 1 && b == "hi" && c);
}

Implementing Call

Finally, we can implement the Call member function to do what we initially intended.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
template <typename... Ret, typename... Args>
typename _pop<Ret...>::type Call(const std::string &fun,
                                 const Args&... args) {
    // Push function to stack
    lua_getglobal(_l, fun.c_str());

    const int num_args = sizeof...(Args);
    const int num_ret = sizeof...(Ret);

    // Push all arguments to the stack using our variadic Push
    Push(args...);

    // Call the function
    lua_call(_l, num_args, num_ret);

    // Return all the results and remove the from the stack
    return Pop<Ret...>();
}

This time, we have a mixture of variadic template arguments and function arguments. All the types and values passed after the function name will be bound to the type parameter pack Args... and value parameter pack args.... Again, we use the built-in variadic function sizeof... to determine the number of arguments and return values (sizeof... is easy to implement as an exercise by the way).

This gets us most of the way there, but Call will fail in a critical way if we try to call a Lua function with no arguments. To see why, we recognize that args... will be an empty parameter pack so

1
Push(args...);

will be invoked as

1
Push();

which we never implemented (we implemented Push taking at least one argument). To circumvent this, we can simply implement Push():

1
void Push() {}

Returning zero arguments

Astute readers will note that our implementation will fail if we try to call a function returning no results. This is because our Pop function expects to do some work and will return either an unboxed value or a tuple. Thus, we’ll need to modify the Pop function and the _pop type-trait struct to accomodate this.

There several ways to do this, but this way felt the most natural to me (feel free to comment if you think otherwise). First, we add a type parameter to _pop that denotes the number of values we expect to pop. The base struct signature becomes

1
2
3
4
template <size_t, typename... Ts>
struct _pop {
    ...
};

Our existing partial specialization occurs when this size_t is 1.

1
2
3
4
template <typename T>
struct _pop<1, T> {
    ...
};

Now, we need to add a new partial specialization for when the size argument is 0:

1
2
3
4
5
template <typename... Ts>
struct _pop<0, Ts...> {
    typedef void type;
    static type apply(State &s) {}
};

Now, our Pop function can return nothing if no types are required with a slight implementation tweak:

1
2
3
4
template <typname... T>
typename _pop<sizeof...(T), T...>::type Pop() {
    _pop<sizeof...(T), T...>::apply(*this);
}

Perfect forwarding

There is one slight defect with the implementation that I will discuss briefly but not fully outline. In this implementation, when parameter packs are forwarded from one function to another, the constness is preserved but not the lvalue/rvalue property. This isn’t too critical here since most of the types we are passing are basic types with the exception of std::string. However, later on, we may wish to pass C++ objects that represent tables and preserving lvalue/rvalue properties will be important for performance, lest we create unecessary copies everywhere.

The syntax to do perfectly forward a parameter pack looks like

1
2
3
4
template <typename... T>
void foo(T&&... args) {
    bar(std::forward<T>(args)...);
}

This way, if an lvalue is passed in, it will remain an lvalue when threaded through to bar. Similarly, if an rvalue is passed in, it will be passed as an rvalue as well.

Wrapping up

After all the type-fu above, we finally have a class that provides us the clean type-aware Call method we were looking for. As mentioned in the beginning, the code for this set of Lua bindings is still in progress but can be found on my github here. I don’t expect this overview to be at all comprehensive, but I hope at the very least that it has demonstrated to some degree what the C++11 type system is capable of. As a final exercise, I highly recommend looking at the disassembly of the executable created by the compiler whenever metaprogramming to understand the runtime characterstics of the code.

Comments