In a recent post, I showed how memory layout can affect the speed of summing a matrix. Interestingly, there are plenty of pitfalls we might fall into when writing a sum function, and memory layout is not the only one. If you haven't already, please read the previous post on summing first.

Without worrying about why just yet, let's take a look at these two functions: