Test totals (shorter is faster)
Test Platform: Nehalem Core i5, 2.67GHz (Turbo Boost enabled)
BitMagic Library 3.6.2 (not yet released)
Windows 7 32-bit: default 32-bit build with MSVC8, SSE4 optimization disabled.
Ubuntu 9.10 64-bit: GCC 4.4.1 with compile options ( -march=core2 -m64 -msse4 )
BitMagic 64-bit optimizations + SSE4 optimizations
(this combination will be supported in BitMagic Library 3.6.2).
All measurements are in seconds (shorter is faster).
As you can see, the default 32-bit build made with the stock compiler loses in almost every test to the 64-bit SSE4.2 version tuned for the platform.
Yes, this experiment was intentionally staged for Windows 7 + MSVC to lose, by using unfavorable compilation and optimization settings: stock x86 32-bit software suffers from register starvation and often ignores the presence of the SIMD SSE unit. Having said that, such resource under-utilization often happens in real life, and it is often dismissed as insignificant. From this data we can see that sometimes insignificant things can add up to measurable values.
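To make the difference between the two builds concrete, here is a minimal sketch of how the same bit-counting routine can compile to very different code depending on the target. It is not BitMagic's actual implementation: the names SKETCH_HAS_POPCNT64, popcount32_portable, count_bits_portable and count_bits are hypothetical and exist only for this illustration. The sketch assumes the compiler defines __SSE4_2__ when SSE4.2 code generation is enabled (as GCC does with -msse4); a default 32-bit build falls through to the portable path.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: __SSE4_2__ is defined when SSE4.2 code generation is enabled
// (e.g. GCC with -msse4). The 64-bit POPCNT intrinsic also needs a 64-bit target.
#if defined(__SSE4_2__) && (defined(__x86_64__) || defined(_M_X64))
#include <nmmintrin.h>  // SSE4.2 intrinsics, including _mm_popcnt_u64
#define SKETCH_HAS_POPCNT64 1
#else
#define SKETCH_HAS_POPCNT64 0
#endif

// Portable population count of one 32-bit word (classic parallel bit sum).
// This is roughly what a default 32-bit build has to settle for.
inline unsigned popcount32_portable(std::uint32_t w) {
    w = w - ((w >> 1) & 0x55555555u);
    w = (w & 0x33333333u) + ((w >> 2) & 0x33333333u);
    w = (w + (w >> 4)) & 0x0F0F0F0Fu;
    return (w * 0x01010101u) >> 24;
}

// Count set bits in a block of 32-bit words, one word at a time.
inline std::size_t count_bits_portable(const std::uint32_t* block,
                                       std::size_t n_words) {
    std::size_t total = 0;
    for (std::size_t i = 0; i < n_words; ++i)
        total += popcount32_portable(block[i]);
    return total;
}

// Same result, but on a 64-bit SSE4.2 build two words are combined and
// counted by a single hardware POPCNT instruction.
inline std::size_t count_bits(const std::uint32_t* block, std::size_t n_words) {
    assert(block != nullptr || n_words == 0);
#if SKETCH_HAS_POPCNT64
    std::size_t total = 0;
    std::size_t i = 0;
    for (; i + 1 < n_words; i += 2) {
        const std::uint64_t w =
            (static_cast<std::uint64_t>(block[i + 1]) << 32) | block[i];
        total += static_cast<std::size_t>(_mm_popcnt_u64(w));
    }
    if (i < n_words)  // odd trailing word
        total += popcount32_portable(block[i]);
    return total;
#else
    return count_bits_portable(block, n_words);
#endif
}
```

Both paths return identical counts; only the amount of work per word changes. Which path is compiled in depends entirely on the build flags, which is the kind of under-utilization described above.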