Wednesday, 17 July 2013

PyPy Performance - Wow!

Those of you who follow my (very) occasional posts will know I've been working on measuring the performance of functional languages. Recently I've expanded this to include more imperative languages for comparison.

What I've done is use a simple algorithm to assign activities to resources. I have a data file with 500 activities and 10 resources. I assign 50 activities to each resource in turn, selecting the closest remaining one each time. The result should be the same for all languages, but the order in which I process the resources matters. Some languages (notably Go) iterate over maps in a random order, so I need to take this into account.
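To make the description concrete, here is a minimal Python sketch of that kind of greedy assignment. It is not the benchmark code itself: the 2D coordinates, the Euclidean distance, the function name assign and the tie-break on activity id are all assumptions made for illustration. Sorting the resource ids is one way to pin down the processing order in languages where map iteration order is not fixed.

```python
import math


def assign(activities, resources, per_resource):
    """Greedily give each resource its nearest unassigned activities.

    activities and resources map an id to an (x, y) position.
    The coordinates and the distance measure are assumptions for this sketch.
    """
    unassigned = dict(activities)  # copy so the caller's data is untouched
    result = {}
    # Sort the ids so resources are always processed in the same order,
    # whatever order the map would otherwise iterate in.
    for rid in sorted(resources):
        rx, ry = resources[rid]
        chosen = []
        for _ in range(min(per_resource, len(unassigned))):
            # Nearest remaining activity; ties are broken by activity id.
            nearest = min(
                unassigned,
                key=lambda a: (
                    math.hypot(unassigned[a][0] - rx, unassigned[a][1] - ry),
                    a,
                ),
            )
            chosen.append(nearest)
            del unassigned[nearest]
        result[rid] = chosen
    return result


# Tiny made-up data set: two resources, four activities on a line.
demo_resources = {1: (0.0, 0.0), 2: (10.0, 0.0)}
demo_activities = {1: (1.0, 0.0), 2: (9.0, 0.0), 3: (2.0, 0.0), 4: (8.0, 0.0)}
demo_result = assign(demo_activities, demo_resources, per_resource=2)
```

Each pick scans every remaining activity, so the work grows quickly with the size of the data, and the nested loops and dictionary lookups are the sort of code where a JIT can make a big difference.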

This is VERY numerically intensive and makes heavy use of collections (lists and maps).

The following table is a rundown of the results:

C++               -> 1.473s
Go                -> 2.966s
PyPy (Maps)       -> 3.180s
Scala (Imper.)    -> 3.283s
Haskell           -> 3.955s
C# (Adjusted)     -> 3.997s
O'Caml            -> 4.892s
F# (Adjusted)     -> 5.442s
Scala (Func/Imm)  -> 6.836s
PyPy              -> 6.987s
C# (Mono 3.0)     -> 10.137s
C# (Mono 2.10)    -> 10.554s
F# (Mono 3.0)     -> 13.802s
F# (Mono 2.10)    -> 16.031s
Ruby 2.0          -> 16.563s
Python 3.3        -> 25.274s
Python 3.3 (Maps) -> 27.642s
Python 2.7 (Maps) -> 31.736s
Cython            -> 32.745s
Ruby 1.9          -> 37.843s
Python 2.7        -> 43.798s
Clojure           -> 2m25.605s

I'm not going to go into the specifics of each one, but wherever possible they all use the same basic algorithm, written in a style idiomatic to the language.

From the title of this post you can see where I'm going with this. Python was by far the quickest to get to a working solution, gave the simplest code to understand, and was extremely easy to profile and optimise. Running the standard interpreter (either 2.7 or 3.3) gives reasonable results, but running PyPy puts it in the same league as Go, C++, Scala, Haskell and O'Caml.

This was very interesting!