10-06-08
Followup on the "Russian Range Coder". So, first of all let me stress once again that this thing is pointless and you should not use it.
It's an "optimization" that gives up some compression and doesn't actually make you faster at all. You should never do approximations or
optimizations unless they actually help, even if they look like they should be faster. Best to stick with full quality everywhere lest it
come back to bite you.
But I was thinking about it and I realized you can improve the renorm loop. The original loop in the case of underflow just truncates the
range down to the bottom part to make the top bits always equal to the top bits of "low". But that can be really bad in some cases, since you might
be straddling a boundary, but most of your range could be above the straddle, not below. It's easy enough to just check whether you have
more range above or below the straddle :
while(rc->range < (1<<16))
{
int above = (rc->low + rc->range) & 0xFFFF;
int below = (~rc->low) & 0xFFFF;
//int above = rc->range - 1 - below;
/*
U32 hi = rc->low + rc->range;
U32 hi_below = rc->low + below;
U32 lo_above = hi & 0xFFFF0000;
*/
if ( above > below )
{
rc->low += rc->range - above;
rc->range = above;
}
else
{
rc->range = below;
}
*rc->ptr++ = (U8) (rc->low >> (RANGE_SIZE - 8));
rc->range <<= 8;
rc->low <<= 8;
}
That is, we either push the top bits of low up to the top bits of hi, or push the top bits of hi down to the
top bits of low, depending on which gives us more code space. Again, this doesn't hurt speed at all because
it's rare. In fact, that's why you should just use a proper virtual-queue range coder that handles underflow
right.
Just to make it super clear what's happening, here is an actual example :
rc->low + rc->range 0x4d0043e4
rc->low 0x4cffa580
above 0x000043e4
below 0x00005a7f
rc->low + below 0x4cffffff
rc->low + rc->range - above 0x4d000000
Some numbers :
m:\ccc\book1 : inLen : 768,771
True Entropy : 435,042.6 : 4.527 bpb
radArith : encLen : 435,116 = 4.528 bpb
bytes exceeding entropy : 73
original RRC : encLen : 435,284 = 4.530 bpb
bytes exceeding entropy : 241
cbloom RRC : encLen : 435,216 = 4.529 bpb
bytes exceeding entropy : 173
radArith cycles: 69.6
original rc cycles: 74.2
cbloom rc cycles: 70.3
Here "radArith" is a normal Michael Schindler range coder with virtual queue. My modified "russian range coder"
has about half the waste of the original (compared to the canonical range coder, not compared to entropy).
Note that the arithmetic coders here are all using a max cumulative frequency of 16384, while the "true entropy"
uses the exact character counts in the file.
BTW these are the links to the authors of the "Russian Range Coder" - they seem to all be dead. If anyone has live links
to these people let me know and I'll hook them up.
/*
Arithmetic coder based on "carryless rangecoder" by D. Subbotin.
* http://www.softcomplete.com/algo/pack/rus-range.txt
(Original C++ code by Subbotin)
* http://hem.spray.se/mikael.lundqvist/
(C impelementation by M. Lundqvist)
* H. Okumura: C magazine, Vol.14, No.7, pp.13-35, Jul. 2002
(Tutorial of data compression with sample codes, in japanese).
*/
"10-06-08 : Followup on the "Russian Range Coder""
1 Comment -
Even those older entries are very interesting!
I looked up the missing links on the Wayback Machine and found them here:
Subbotin's range coder (http://www.softcomplete.com/algo/pack/rus-range.txt)
Mikael Lundqvist's page (http://hem.spray.se/mikael.lundqvist/)
July 31, 2020 at 12:38 AM