Stop Comparing Programming Languages With Benchmarks

If I had one gripe about the programming community, it would be the oft-misguided use of benchmarks to determine which language is THE RIGHT ONE™. Benchmarks are used to justify decisions when they should not be, and they can lead a person to the wrong decision. So, programming community, this is my humble benchmark-vention to you.

Preliminaries, C pointers

C and C++ are about as fast as you’re going to get for raw computing (with Fortran in the same league). To understand philosophically why this is the case, consider the following:

int foo(int *bar) {
    return *bar;  // dereference bar with no check that it points anywhere valid
}

All we are doing in this function is accepting a pointer to an integer, then dereferencing it (reading the value that the pointer points to), and returning it. Safe programming would dictate that before dereferencing this pointer, we should (unless we know it to be safe) compare bar with nullptr. The question is, what happens if the pointer bar was not actually pointing to something? The answer is, well, anything! The foo function may return a random number, or crash entirely (and take your program down with it). It will not generate an exception!

Well, why not? Because the mantra of C and C++ is performance; they will not do any more than the user asked them to. Here, when I implemented the foo function, I didn’t ask for a check that the pointer passed in was valid, so no check was made. Having implicit error and exception handling goes against the language philosophy, which is one of granting complete control to the user with minimal hand-holding (at runtime).
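
To make that cost concrete, here is a sketch (my own, not part of the original foo; it assumes C++17 and the name foo_checked is hypothetical) of what a defensive version might look like, using std::optional to force the caller to deal with the missing-value case:

#include <optional>

// A defensive variant of foo: the null check and the optional wrapper are
// exactly the kind of overhead C and C++ refuse to impose unless asked.
std::optional<int> foo_checked(const int *bar) {
    if (bar == nullptr) {
        return std::nullopt;   // caller must now handle the "no value" case
    }
    return *bar;
}

Every call now pays for a branch, and every caller pays for unwrapping the result; that is the price of the safety net.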

Why will higher-level languages never get faster than C and C++? Well, they can’t by definition. Higher-level languages are “higher level” because they provide a safety net around primitives like pointers, either to protect the programmer or to make the code semantically easier to understand. Garbage collection is one example. There is an intrinsic cost to providing anything beyond performing the operation itself, so C and C++ will always be the “speed of light,” so to speak.
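
You can see the same “speed of light” argument play out within C++ itself. As an illustration only (the function names raw_read and checked_read are mine), the checked and unchecked accessors of std::vector differ precisely by the safety net:

#include <cstddef>
#include <vector>

// operator[] compiles down to a bare load; at() adds a bounds check on every
// call and throws std::out_of_range on failure. The difference is exactly
// the cost of the safety net.
int raw_read(const std::vector<int> &v, std::size_t i) {
    return v[i];     // no check: undefined behaviour if i is out of range
}

int checked_read(const std::vector<int> &v, std::size_t i) {
    return v.at(i);  // checked: slower, but never silently wrong
}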

Understanding Tradeoffs

There are multiple axes at play when formulating a programming language. What types of expressions and constructs is the user restricted to (what is deemed illegal, and what is left simply undefined)? How is memory handled? How are errors handled? Do I want my code to run natively or in a virtual machine? Let’s tackle each of these separately.

Permissiveness

Languages that are more permissive are either less performant or more dangerous. Consider Haskell. Haskell is very non-permissive in that all functions written must be pure (side-effect free); side effects are carefully encapsulated in monads (e.g. the IO monad). As a result, Haskell is considered fairly performant and safe. It has sacrificed permissiveness, and so is not forced to trade performance for safety.

On the flip side, consider C++ and Ruby. Both languages place virtually no restriction on what the programmer can do, and both are forced to make tradeoffs as a result. C++ does nothing to protect the user, and so exchanges safety for performance. Ruby, on the other hand, chooses to make everything an object and thus makes the opposite tradeoff, giving up performance for safety and flexibility.

Another way of thinking about this is that there is a tension between enforcing constraints at compile time versus at runtime. By imposing more constraints, the compiler can generate more optimized code. If very few constraints are imposed up front, correctness must either be checked at runtime, or not checked at all (or something in between).
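
Here is a small C++ sketch of constraints paying for themselves (the function names are mine): by promising the compiler that a computation is pure and evaluable at compile time, the work can disappear from the runtime entirely.

// The constexpr qualifier is a constraint: no I/O, no mutation of outside
// state. In exchange, the compiler may (and here, must) do the work itself.
constexpr int square(int x) { return x * x; }

static_assert(square(12) == 144, "evaluated entirely at compile time");

// The same expression without the constraint is evaluated at runtime.
int square_at_runtime(int x) { return x * x; }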

A permissive language is not necessarily better than a restrictive one and vice versa.

Memory management

Reference counting isn’t free. End of story. Any implementation of a reference-counted pointer must maintain a count of the number of references to the object. This count must be faithfully incremented each time the pointer enters a new scope and decremented when it leaves one. On each decrement, the runtime must check whether the count has reached zero, at which point it either deletes the object immediately or queues the deletion for the garbage collector later on.
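
A rough sketch of that bookkeeping using std::shared_ptr (the counts in the comments describe this particular call sequence and are illustrative only):

#include <memory>

int observe(std::shared_ptr<int> p) {        // copied in: count goes 1 -> 2
    return *p;
}                                            // p destroyed: count goes 2 -> 1

int main() {
    auto owner = std::make_shared<int>(42);  // count == 1
    int value = observe(owner);              // an increment and a decrement
    return value;                            // value == 42
}                                            // count hits 0, the int is freed

Every one of those increments and decrements is typically an atomic operation; a raw pointer pays for none of them.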

A memory managed language is not necessarily better than a language without managed memory and vice versa.

Error handling

Some languages like Erlang choose to be completely fault tolerant. Any line of code can raise an exception, and it will be captured somewhere in the process hierarchy (even without the developer handling it explicitly inline). This is also done in a way that isolates the error from other lightweight processes in the VM. Of course, there is a cost associated with this; there is no free lunch. The correct question to ask is: do the benefits outweigh the negatives? As we saw with the foo function above, not having this sort of error handling can lead to insidious bugs in code.
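
Erlang bakes this into the VM for every lightweight process. As a loose analogy only (this is not how Erlang works internally, and the name supervise is mine), a hand-rolled version of the idea in C++ might look like this:

#include <exception>
#include <functional>
#include <iostream>

// Run a task, catch whatever escapes it, and restart it rather than letting
// the failure take the whole program down. Erlang supervisors do this per
// process, without the developer writing the retry loop by hand.
void supervise(const std::function<void()> &task, int max_restarts) {
    for (int attempt = 0; attempt <= max_restarts; ++attempt) {
        try {
            task();
            return;                                   // finished normally
        } catch (const std::exception &e) {
            std::cerr << "task failed (" << e.what() << "), restarting\n";
        }
    }
}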

A fault-tolerant language is not necessarily superior to a fault-intolerant one and vice versa.

Native vs virtual execution

There are additional tradeoffs in deciding whether code must run natively or can run in a virtual machine. The main advantages a VM gives are concurrency primitives that don’t map directly to OS threads, and the ability to load code very quickly (without a full compile) and potentially even at runtime (the JVM and the Erlang VM are both capable of this, for example). The main disadvantage is that code running in a VM will generally be slower than native code.

Choosing a language the right way

First, outline your requirements and answer the following questions:


What is the expected load of my system?
What can I afford to pay to support this load?
What are the uptime requirements?
Can I tolerate my application restarting upon crash?
Do I need high-level concurrency?
Is latency more important to me, or throughput?
Do I need hot reloading?
What platforms do I need to support?
How large is my engineering team?
Do I prefer one paradigm over another (functional vs OO vs procedural)?
Am I comfortable taking on certain compile-time restrictions (like controlled side effects or immutability)?
What languages do I know already?
What languages does my team know already?
Does the language have an ecosystem of libraries that supports the domain I work in?
What is the timeline of my project?
What kind of hardware will I be using (memory, multicores, GPUs, etc.)?
Do I need to run on embedded devices?
What is my perceived priority and certainty for all the above points?

If at any point in answering these questions you are tempted to pull up a benchmark of “X language vs Y language,” please stop. That benchmark is unlikely to help you, because it obscures the tradeoffs made by the slower language - tradeoffs that may actually be important to you.

Instead, sort your responses in priority order. If load is in fact the most important thing, rule out all the languages that are not performant enough and continue (don’t just settle for the fastest immediately). Most of the time, you will end up choosing a language that sacrifices some amount of control for something nice, like lightweight threads or some other critical feature. Heck, you might just like working with somebody that happens to love Python and choose that as a result.

The important thing is that there is a language that is the “optimum fit” for the tradeoffs you are willing to make and you will be 100x happier choosing that language in the long run.

Stop the language wars

Just stop. And unless you actually intend on computing Fibonacci sequences in your product, stop showing me those stupid benchmarks. Also, note that no matter what your answers to the questions above are, the right answer will almost never be PHP (forgive me).
