Using yield statements for statistical calculations.
Enumerators and the yield statement in C# are a pretty powerful weapon in every developer's arsenal. In this post I am going to show how we can use enumerators as coroutines to perform sequential infinite calculations. I will also show the performance price we pay for the syntactic sugar involved in the process.
The idea is not to improve upon known patterns, but to show a different scenario where enumerators can provide us with an elegant solution, albeit not always the most efficient one.
For the example we will use code ported from C++ that generates normally distributed random numbers. The original C++ code can be found here: http://www.agner.org/random/
We started by creating 3 different versions of the code. The first is a straightforward C++-style version that looks like this:
First we initialize the only prerequisite, the uniform random number generator. To keep the performance comparison fair, all versions use the same uniform random generator.
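As a rough illustration of what such a plain version can look like, here is a minimal sketch. It assumes the Marsaglia polar method for the normal transform and uses System.Random as a stand-in for the uniform generator; the class and method names are hypothetical and the original port may differ in its details.

```csharp
using System;

public static class NormalPlain
{
    // Hypothetical helper: fills an array with `count` samples from N(mean, stdDev).
    // Assumption: Marsaglia polar method; System.Random stands in for the
    // uniform generator used in the original port.
    public static double[] Generate(int count, double mean, double stdDev, Random uniform)
    {
        var result = new double[count];
        int i = 0;
        while (i < count)
        {
            double x1, x2, w;
            do
            {
                // Pick a point uniformly in the square [-1, 1) x [-1, 1).
                x1 = 2.0 * uniform.NextDouble() - 1.0;
                x2 = 2.0 * uniform.NextDouble() - 1.0;
                w = x1 * x1 + x2 * x2;
            } while (w >= 1.0 || w < 1e-30); // reject points outside the unit circle

            w = Math.Sqrt(-2.0 * Math.Log(w) / w);

            // Each accepted point gives two independent normal samples.
            result[i++] = x1 * w * stdDev + mean;
            if (i < count) result[i++] = x2 * w * stdDev + mean;
        }
        return result;
    }
}
```

Everything happens inside one method call, so the only per-sample cost beyond the arithmetic is the call to the uniform generator.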
The next version returns an enumerator that allows us to iterate 100,000 times. It looks like this:
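A sketch of the enumerator idea, under the same assumptions as before (Marsaglia polar method, System.Random as the uniform generator, hypothetical names), could be written with yield return like this:

```csharp
using System;
using System.Collections.Generic;

public static class NormalIterator
{
    // Hypothetical iterator: lazily yields `count` samples from N(mean, stdDev).
    // Assumption: Marsaglia polar method, as a stand-in for the ported algorithm.
    public static IEnumerable<double> Generate(int count, double mean, double stdDev, Random uniform)
    {
        int produced = 0;
        while (produced < count)
        {
            double x1, x2, w;
            do
            {
                x1 = 2.0 * uniform.NextDouble() - 1.0;
                x2 = 2.0 * uniform.NextDouble() - 1.0;
                w = x1 * x1 + x2 * x2;
            } while (w >= 1.0 || w < 1e-30);

            w = Math.Sqrt(-2.0 * Math.Log(w) / w);

            // Execution pauses at each yield return and resumes here on the next MoveNext.
            yield return x1 * w * stdDev + mean;
            produced++;

            if (produced < count)
            {
                yield return x2 * w * stdDev + mean;
                produced++;
            }
        }
    }
}
```

The caller just writes foreach (double x in NormalIterator.Generate(100000, 0.0, 1.0, rng)) and never sees the bookkeeping for the second sample of each pair.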
The last one lets us inject, through a lambda expression, the random generator that we intend to use. That version has a very interesting side effect: you are free to use whatever random generator you choose.
As we can see here, the generator is quite simple: it just gets uniformly distributed random values and transforms them so that they are normally distributed.
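To make the injection idea concrete, here is a minimal sketch in which the uniform source is passed in as a Func<double>. The names are hypothetical, the transform is again assumed to be the Marsaglia polar method, and the sequence is deliberately infinite so the caller decides when to stop.

```csharp
using System;
using System.Collections.Generic;

public static class NormalWithLambda
{
    // Hypothetical infinite iterator: `uniform` is any function returning values in [0, 1).
    // Assumption: Marsaglia polar method for the transform.
    public static IEnumerable<double> Generate(Func<double> uniform, double mean, double stdDev)
    {
        while (true)
        {
            double x1, x2, w;
            do
            {
                // Two delegate calls per attempt: this is the overhead discussed in the post.
                x1 = 2.0 * uniform() - 1.0;
                x2 = 2.0 * uniform() - 1.0;
                w = x1 * x1 + x2 * x2;
            } while (w >= 1.0 || w < 1e-30);

            w = Math.Sqrt(-2.0 * Math.Log(w) / w);
            yield return x1 * w * stdDev + mean;
            yield return x2 * w * stdDev + mean;
        }
    }
}
```

With a System.Random instance named rng, a call such as NormalWithLambda.Generate(() => rng.NextDouble(), 0.0, 1.0) combined with LINQ's Take(100000) gives a bounded run, and swapping in another generator only means changing the lambda.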
The interesting property is that the execution works like a coroutine: it returns a value, and the next time it is called it continues from the position right after the yield return. The side effect is that the smaller the procedure, the higher the overhead ratio. For example, in this case we chose to inject the uniform random generator call, which is called way too often in a small timeframe, so a small overhead adds up to a lot.
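The resume-after-yield behaviour can be observed directly with a tiny iterator that records what it does. This is only a demonstration sketch with hypothetical names, unrelated to the random number code itself.

```csharp
using System;
using System.Collections.Generic;

public static class CoroutineDemo
{
    // Hypothetical iterator that logs each step so the suspension points are visible.
    public static IEnumerable<int> Steps(List<string> log)
    {
        log.Add("start");
        yield return 1;                        // suspends here after the first MoveNext
        log.Add("resumed after first yield");
        yield return 2;                        // suspends here after the second MoveNext
        log.Add("resumed after second yield");
    }
}
```

Calling Steps does not run any of the body. Each MoveNext on the enumerator runs the code up to the next yield return, and the final MoveNext runs the remaining statements and returns false.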
This is a comparative study of the execution time of each of the methods, where the lambda-injecting version takes as much as 50% more time than the plain vanilla version. These measurements would at first sight negate the idea of using the yield statement; however, the good news is that this is an extreme case. This is what is called a tight loop, where even a method call can make quite a difference.
That is shown best in the difference between TestNormalWithLambda and TestNormalEmbeeded, where the random generator is passed as a parameter. In that case there is only a 20% increase, which is pretty big for a tightly packed loop, but not too much for typical use.
In conclusion, use iterators with yield statements for iterative methods that are not meant to run in tight loops, as the readability they provide is far greater. You have lots of time to obscure your code optimizing for performance after you know your algorithm works as expected. As Knuth said: “We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil”.
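For readers who want to reproduce this kind of comparison, the following is a minimal sketch of a timing harness built on System.Diagnostics.Stopwatch. It is not the benchmark used for the figures above: the names are hypothetical, and the workload is reduced to summing uniform samples so that the iterator and delegate overhead dominates.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class TightLoopBenchmark
{
    // Plain loop: sums n uniform samples directly.
    public static double SumDirect(int n, Random rng)
    {
        double sum = 0;
        for (int i = 0; i < n; i++) sum += rng.NextDouble();
        return sum;
    }

    // Iterator with an injected lambda: each sample goes through a delegate call.
    public static IEnumerable<double> Samples(int n, Func<double> next)
    {
        for (int i = 0; i < n; i++) yield return next();
    }

    // Same sum, but every sample also costs MoveNext and Current on the enumerator.
    public static double SumViaIterator(int n, Random rng)
    {
        double sum = 0;
        foreach (double x in Samples(n, () => rng.NextDouble())) sum += x;
        return sum;
    }

    // Hypothetical timing helper: runs the action once and returns elapsed milliseconds.
    public static double TimeMs(Action action)
    {
        var sw = Stopwatch.StartNew();
        action();
        sw.Stop();
        return sw.Elapsed.TotalMilliseconds;
    }
}
```

A call such as TightLoopBenchmark.TimeMs(() => TightLoopBenchmark.SumDirect(10000000, new Random(1))) can then be compared against the same call with SumViaIterator. Results depend heavily on the runtime, build configuration and JIT warm-up, so it is worth running each measurement several times in a release build.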

