Thursday, February 11, 2010

Polymorphism and Complex Conditionals

I ran into a situation today on an old application I inherited where I thought I might be able to take advantage of some polymorphism to refactor some complex if/else structures.  This lead me to the thought "what kind of performance impact would using polymorphism have compared to leaving this large if/else block?"

After a quick google search I came across this:

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.90.8892&rep=rep1&type=pdf

It's a research paper written by Serge Demeyer at University of Antwerp, written in 2002 where looked into that very same question.  I was honestly a little surprised by his results. 

To summarize the paper, he points out that over the years there has been great debate between programmers regarding the handling of conditional logic.  On the one hand you have programmers wanting to leave the if/else structure because the time spent on refactoring would be too great and you would have a negative performance impact on the application.  On the other hand you have programmers arguing that the code would be easier to read after refactoring, which would save a substantial amount of time on bug fixes and feature addition.  While there is a substantial amount of evidence to show that the refactored versions of programs are substantially easier to maintain, there is no conclusive evidence as to whether the program is actually slower.

For Serge's experiment, he uses  a small c++ program against several different compilers.  His results surprised me.  For the majority of the compilers, the time to process was approximately the same when using polymorphism as switch statements.  However, in all but one case the if/else structure was slower, and sometimes substantially slower. 

Serge concludes that with the advances in modern compiler theory, using virtual methods will not negatively impact performance.

In the future I myself would like to look into this same situation using Java and C# to see what kind of difference refactoring conditionals into polymorphism will have on performance with languages that have just in time compilers and an intermediate language.



tldr; Polymorphism is faster and easier to maintain than conditionals.

4 comments:

  1. I did some benchmarking of this on the JVM a while ago, comparing the cost of virtual methods to a sequence of if(x instanceof Foo) calls (the context was implementation of pattern matching in Scala. Either way a compiler would be generating it, so maintainability is a non-issue). The conclusion seemed to be:

    * For one or two branches, the compiler turned the virtual method call into the sequence of tests anyway so they were identical.
    * For a small number > 2 branches the sequence of conditionals was actually faster.
    * For a large number (I think it kicked in at around 8 but this is going to be machine specific) the virtual method call is faster.

    This isn't a terribly surprising result. The performance of the conditional is linear in the number of branches, whileas the performance of the virtual call is constant but has higher constant factors due to cache and pipeline behaviour (it's an indirect jump).

    ReplyDelete
  2. I second what David R. MacIver says about performance. However, a system should be initially implemented using polymorphism. Choosing how to implement the polymorphism control flow is an optimization. One that the compiler or runtime might be able to make for you :-)

    I wonder if such an optimization pass can be coded into a Monad instance?

    ReplyDelete
  3. While MacIver makes an interesting observations, advances in JITs could lead to even the small branch count case to be faster, especially if the VM is using runtime profiling to optimize common code paths at runtime.

    ReplyDelete
  4. I've never quite wrapped my mind around the idea that taking conditional logic off one page and spraying it around on five or six other pages is easier to maintain.

    ReplyDelete