Interfacing Lua With Templates in C++11
The code for this project is hosted here
Why do I need to know this?
— random C++ programmer
Template programming in C++11 is often met with furrowed brows. Techniques associated with template programming are often written off as something for the experts to worry about. While integrating Lua with the game engine I’ve been writing, I came across a nifty and useful way to use some of the new template features in C++11. Other bindings to Lua exist but I opted to write my own because the others didn’t quite do what I wanted them to (mine is still a WIP, hosted on Github here). The goal of this post is to provide a use case for template metaprogramming in C++ and showcase some of the new C++11 syntax for variadic templates (templates with a variable number of arguments). Hopefully, it will motivate you to delve even deeper into metaprogramming if you haven’t already. To read this, I assume you already know what a template is and you are somewhat comfortable with C++ syntax.
The Lua C API
To understand the problem, we need to first understand the Lua C API. C (or C++) and Lua communicate through a stack. Invoking a Lua-function from C involves the following steps:
- Push the function name to the stack
- Push each function argument to the stack
lua_pcall, supplying the number of arguments and the number of return values
- Pop the returned values from the stack
Because the Lua API is C based, it provides a different function for
each type that can be pushed or popped from the stack (for example,
lua_pushboolean). Performing the above steps can
be really cumbersome and result in a lot of code. Suppose we had a
simple Lua file that looked like:
1 2 3
This is a lua function that takes two arguments, subtracts them, and returns the difference along with a string. Calling this function with the basic syntax might look like this:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37
This is a lot of code to call one function! Not only that, but there are plenty of ways it can go wrong. We could have mixed up the types, or popped or pushed values in the wrong order. We may have also forgotten to pop the correct number of values at the end.
Imagining the perfect interface
Here is how I’d want to call a function in Lua. First, I’d want a C++
object to encapsulate the
lua_State pointer and automatically free
it when it goes out of scope. Call it
State for simplicity. The
improved code with our hypothetical interface might look like this:
1 2 3 4 5 6 7 8 9 10 11 12 13 14
Cleaner, huh? All the magic happens in the line
This binds a returned tuple of type
std::tuple<int, std::string> to
std::string on the left. Note that nowhere did we need
to specify the number or types of arguments or returned values. All of
that was inferred from the function call itself.
So now that we’ve envisioned a cleaner, more usable, and less error-prone interface, we need to set about implementing it. Let’s start with a shell of the class definition.
1 2 3 4 5 6 7 8 9 10 11 12 13 14
This gives us an object that when instantiated, will create a Lua context and automatically free it when its destructor is invoked. It also prohibits copying while permitting moving. If you aren’t familiar with move semantics, you don’t need to understand it to continue to follow along, but I highly recommend you learn that topic.
Push and Read
Now we need to be able to interface with the stack. To push to the stack, we opt for the simple approach of functions overloaded by argument.
1 2 3
A sample implementation for one of these might look like:
1 2 3
For reading, we can’t use the same approach because C++ does not allow function overloading based on return type (only a few languages do). Thus, we rely on a template method with signature:
To use this function, we’ll need to provide instantiations for all the basic types supported by the API. I provide two examples below.
1 2 3 4 5 6 7 8
With these functions, we can now invoke
push 4 to the stack and read it back.
Handling multiple values at a time
Our next step is to make it convenient to push and pop multiple values
at a time. Let’s handle pushing multiple values first. To allow
Push(1, "hi", true), we add in addition to the other
push functions above, the following template function:
1 2 3 4 5
The ellipses (…) if you haven’t seen them before signify a parameter
pack, making this a variadic template function. This particular
function can take a variable number of arguments but must take at
least one. The compiler will pattern match the first argument to this
const T Value and it recursively calls itself until it
encounters one of the specific overloads from the previous section.
Now, we can do something like
1 2 3 4
and the stack will contain an integer, a string, and a boolean. Now, we need to implement multi-value reads. Since usually, we want to pop these values afterwards, we’ll just implement it as a multi-value pop. This is a bit trickier because as before, we can’t rely on argument overloading since the varying types are embedded in the return value. In addition, we’d like the function to return a boxed tuple if multiple values are requested, and an unboxed normal value if a single value is requested.
To do this specialization, we’ll use a sort of intermediate struct to differentiate between the two cases. This is called a type-traits class.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
Notice that the return type of
Pop is embedded in the
struct which the compiler chooses based on the number of template
arguments passed. We use the
auto keyword for convenience although
_pop<T...>::type would have sufficed as well. The
T... in the
apply will expand out to all the template arguments
that were passed in originally.
Implementing the singular case of
_pop<T> is relatively
1 2 3 4 5 6 7 8 9 10 11 12
For the multi-valued
_pop, we’ll need to use recursion to build a
tuple. A first try at an implementation might look like:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
However, this doesn’t work. To see why it might work at all, we note
std::make_tuple(PopBottom<Ts>(l)...) will expand to
if there are n template arguments passed. While this would work if the
functions were guaranteed to evaluate from left to right, the C++
standard makes no guarantees about such an evaluation order. In fact,
g++ v4.8.2 will invoke each argument to
make_tuple from right to
left, resulting in a tuple that is the reverse of what we want! Thus,
we need to use recursion to enforce the order of execution.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
Recursion is a must here since there is no way to iterate through a
parameter pack. If you’ve programmed in Lisp, Haskell, Erlang, or
something similar, the code above will look pretty familiar. Note that
the template type signature
<typename T1, typename T2, typename...
Rest> is important to signify a function that has at least 2 template
arguments since a parameter pack can have zero or more elements. If
<typename T1, typename... Rest> instead, the code would fail
to compile since it would have an ambiguous overload with the previous
function with template type signature
With this, we can now pop multiple values from the stack!
1 2 3 4 5 6 7 8 9 10 11
Finally, we can implement the
Call member function to do what we
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
This time, we have a mixture of variadic template arguments and
function arguments. All the types and values passed after the function
name will be bound to the type parameter pack
Args... and value
args.... Again, we use the built-in variadic function
sizeof... to determine the number of arguments and return values
sizeof... is easy to implement as an exercise by the way).
This gets us most of the way there, but
Call will fail in a critical
way if we try to call a Lua function with no arguments. To see why, we
args... will be an empty parameter pack so
will be invoked as
which we never implemented (we implemented
Push taking at least one
argument). To circumvent this, we can simply implement
Returning zero arguments
Astute readers will note that our implementation will fail if we try
to call a function returning no results. This is because our
function expects to do some work and will return either an unboxed
value or a tuple. Thus, we’ll need to modify the
Pop function and
_pop type-trait struct to accomodate this.
There several ways to do this, but this way felt the most natural to
me (feel free to comment if you think otherwise). First, we add a type
_pop that denotes the number of values we expect to
pop. The base struct signature becomes
1 2 3 4
Our existing partial specialization occurs when this
size_t is 1.
1 2 3 4
Now, we need to add a new partial specialization for when the size argument is 0:
1 2 3 4 5
Pop function can return nothing if no types are required
with a slight implementation tweak:
1 2 3 4
There is one slight defect with the implementation that I will discuss
briefly but not fully outline. In this implementation, when parameter
packs are forwarded from one function to another, the constness is
preserved but not the lvalue/rvalue property. This isn’t too critical
here since most of the types we are passing are basic types with the
std::string. However, later on, we may wish to pass C++
objects that represent tables and preserving lvalue/rvalue properties
will be important for performance, lest we create unecessary copies
The syntax to do perfectly forward a parameter pack looks like
1 2 3 4
This way, if an lvalue is passed in, it will remain an lvalue when
threaded through to
bar. Similarly, if an rvalue is passed in, it
will be passed as an rvalue as well.
After all the type-fu above, we finally have a class that provides us
the clean type-aware
Call method we were looking for. As mentioned
in the beginning, the code for this set of Lua bindings is still in
progress but can be found on my github
here. I don’t expect this overview to
be at all comprehensive, but I hope at the very least that it has
demonstrated to some degree what the C++11 type system is capable of.
As a final exercise, I highly recommend looking at the disassembly of
the executable created by the compiler whenever metaprogramming to
understand the runtime characterstics of the code.