http://linux.softpedia.com/get/Programming/Libraries/BitMagic-library-53513.shtml
Monday, January 11, 2010
Saturday, January 9, 2010
BitMagic library version 3.6.3 released
This release fixes critical bug in 64-bit mode (#define BM64OPT) and improves performance of bit-counting related operations.
Parallel superscalar bit-counting
While working on version 3.6.3 collective Internet wisdom brought yet another interesting bit-counting algorithm, which turns to be efficient for 64-bit systems with superscalar execution (a form of parallelizm, where linear code with data-independent instructions is actually executed in parallel).
This code below implements population count for 4 64-bit integers. (Tested with GCC 4.4.1 64-bit Linux). This becomes part of BM 3.6.3 (when SourceForge recovers from its (I hope temporary) down).
This code below implements population count for 4 64-bit integers. (Tested with GCC 4.4.1 64-bit Linux). This becomes part of BM 3.6.3 (when SourceForge recovers from its (I hope temporary) down).
inline
unsigned bitcount64_4way(bm::id64_t x, bm::id64_t y,
bm::id64_t u, bm::id64_t v)
{
const bm::id64_t m1 = 0x5555555555555555;
const bm::id64_t m2 = 0x3333333333333333;
const bm::id64_t m3 = 0x0F0F0F0F0F0F0F0F;
const bm::id64_t m4 = 0x000000FF000000FF;
x = x - ((x >> 1) & m1);
y = y - ((y >> 1) & m1);
u = u - ((u >> 1) & m1);
v = v - ((v >> 1) & m1);
x = (x & m2) + ((x >> 2) & m2);
y = (y & m2) + ((y >> 2) & m2);
u = (u & m2) + ((u >> 2) & m2);
v = (v & m2) + ((v >> 2) & m2);
x = x + y;
u = u + v;
x = (x & m3) + ((x >> 4) & m3);
u = (u & m3) + ((u >> 4) & m3);
x = x + u;
x = x + (x >> 8);
x = x + (x >> 16);
x = x & m4;
x = x + (x >> 32);
return x & 0x000001FF;
}
Thursday, January 7, 2010
BitMagic library version 3.6.2 -- broken
Bad news, version 3.6.2 confirmed to be broken for 64-bit builds. Urgent patch is coming...
Subscribe to:
Posts (Atom)