Post a Comment On: cbloom rants

"09-13-12 - LZNib"

12 Comments -

Blogger won3d said...

Yeah, Snappy is mostly for RPC. I think most of the pain you're seeing comes from the fact that the internal implementation doesn't actually use C++-y things like iostreams; whoever open-sourced it just jammed them in there. Personally, I don't see why you would use it over LZ4, unless you were already using it everywhere (like we are).

A while back (I might have even asked you), I was going to make a nibble-oriented LZP. Hmm...

September 13, 2012 at 9:33 PM

Blogger NeARAZ said...

Speaking of compressor APIs, did you try lzham? Thoughts on it?

September 13, 2012 at 9:45 PM

Blogger ryg said...

There's also aPLib. It was really good for a pure LZ77 without entropy encoding around 2000 or so; I haven't compared it to anything recently.

It's part of the "byte-aligned literals but bitwise control codes" camp though. Good parser (well-tweaked heuristic, not optimal), and IIRC it has unlimited window size (intended for all-in-memory decompression). And an API that doesn't suck (no code for the compressor though, but if you already have an optimal parser it's trivial to write an encoder).

Code is here: http://www.ibsensoftware.com/products_aPLib.html

September 13, 2012 at 10:50 PM

Blogger cbloom said...

@Won - yeah as I was doing this I started thinking about LZP-Nib. (I see from my email history search that you did in fact mail me about it long ago).

I suspect that you don't want to do a "pure" LZP (with only one match string), but rather use a few "ways" in the cache table, or perhaps a few bits of forward hash.

Of course this is mostly just rehashing old ideas :

http://cbloomrants.blogspot.com/2011/05/05-20-11-lzp1-variants.html

Also, QuickLZ is a forward-hash LZP (LZP1c), if I'm understanding it correctly.

September 14, 2012 at 9:55 AM
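
The idea of keeping a few "ways" per cache-table entry, rather than the single predicted position of a pure LZP, can be sketched roughly as below. This is only an illustration under assumed parameters: `LzpTable`, `LzpMatch`, `kWays`, `kHashBits`, and the 3-byte context hash are all hypothetical names and choices, not code from LZNib or any of the LZP1 variants.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical parameters, chosen for illustration only.
constexpr int kHashBits = 12;        // 4096 contexts
constexpr int kWays = 4;             // candidate positions kept per context
constexpr std::size_t kContext = 3;  // bytes of preceding context that get hashed
constexpr std::size_t kNoPos = static_cast<std::size_t>(-1);

struct LzpMatch { int way; std::size_t length; };  // way == -1 means no match

class LzpTable {
public:
    LzpTable() : slots_(std::size_t{1} << kHashBits) {
        for (auto& bucket : slots_) bucket.fill(kNoPos);
    }

    // Longest match at `pos` among the positions predicted by the preceding context.
    LzpMatch Find(const std::vector<std::uint8_t>& buf, std::size_t pos) const {
        assert(pos >= kContext && pos <= buf.size());
        LzpMatch best{-1, 0};
        const auto& bucket = slots_[Hash(buf, pos)];
        for (int w = 0; w < kWays; ++w) {
            const std::size_t cand = bucket[w];
            if (cand == kNoPos) continue;
            std::size_t len = 0;
            while (pos + len < buf.size() && cand + len < buf.size() &&
                   buf[cand + len] == buf[pos + len]) ++len;
            if (len > best.length) best = {w, len};
        }
        return best;
    }

    // Record `pos` as the newest position seen after its context; the oldest way is dropped.
    void Insert(const std::vector<std::uint8_t>& buf, std::size_t pos) {
        assert(pos >= kContext && pos <= buf.size());
        auto& bucket = slots_[Hash(buf, pos)];
        for (int w = kWays - 1; w > 0; --w) bucket[w] = bucket[w - 1];
        bucket[0] = pos;
    }

private:
    // Multiplicative hash of the 3 bytes before `pos`.
    static std::size_t Hash(const std::vector<std::uint8_t>& buf, std::size_t pos) {
        const std::uint32_t h = buf[pos - 3] | (buf[pos - 2] << 8) | (buf[pos - 1] << 16);
        return (h * 2654435761u) >> (32 - kHashBits);
    }

    std::vector<std::array<std::size_t, kWays>> slots_;
};
```

With more than one way, the encoder has to send which way it picked (2 bits for 4 ways in this sketch) alongside the match length, which is the extra cost being traded against the better match.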

Blogger cbloom said...

LZHAM needs this -

int DictLog2(int rawLen)
{
    // round log2 of the raw size up, then clamp to the dictionary sizes LZHAM accepts
    int l2 = cb::intlog2ceil( (float)rawLen );
    return cb::Clamp(l2,LZHAM_MIN_DICT_SIZE_LOG2,LZHAM_MAX_DICT_SIZE_LOG2_X86);
}

September 15, 2012 at 9:47 AM
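
For readers without the `cb::` helpers, a self-contained approximation of the same dictionary-size selection might look like the following. The constants `kMinDictSizeLog2` and `kMaxDictSizeLog2X86` are stand-ins with assumed values (15 and 26); check them against `LZHAM_MIN_DICT_SIZE_LOG2` and `LZHAM_MAX_DICT_SIZE_LOG2_X86` in your copy of `lzham.h`. `IntLog2Ceil` is a hypothetical replacement for `cb::intlog2ceil` and works on integers rather than a float.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Assumed values; verify against the real constants in lzham.h.
constexpr int kMinDictSizeLog2 = 15;     // stand-in for LZHAM_MIN_DICT_SIZE_LOG2
constexpr int kMaxDictSizeLog2X86 = 26;  // stand-in for LZHAM_MAX_DICT_SIZE_LOG2_X86

// Smallest k such that (1 << k) >= n, for n >= 1.
inline int IntLog2Ceil(std::uint32_t n)
{
    assert(n >= 1);
    int k = 0;
    while (k < 32 && (std::uint64_t{1} << k) < n) ++k;
    return k;
}

// Pick the smallest dictionary that covers the whole input, within the allowed range.
inline int DictLog2(std::uint32_t rawLen)
{
    const int l2 = IntLog2Ceil(std::max<std::uint32_t>(rawLen, 1));
    return std::min(std::max(l2, kMinDictSizeLog2), kMaxDictSizeLog2X86);
}
```

Doing the log2 in integer arithmetic avoids any rounding when a large length is converted to a 24-bit-mantissa float.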

Blogger Jyrki said...

would you be willing to try gipfeli?

http://code.google.com/p/gipfeli/

September 17, 2012 at 6:48 AM

Blogger cbloom said...

gipfeli needs at least a download .zip, and really there should be a Win32 .exe . It also appears to be entropy coded using a simple bit-packing scheme.

I believe that if you're doing bit-packing you may as well just do huffman. But perhaps if your speed/ratio is good you would convince me otherwise.

September 17, 2012 at 8:36 AM

Blogger cbloom said...

@ryg - I tested aPLib; compression is very similar to CRUSH

September 17, 2012 at 9:35 AM

Blogger cbloom said...

@gipfeli - I'm crankier than usual this morning. But god damn I'm sick of dealing with Google Code. (sourceforge and github and etc etc are no better). The main page should always just be a link to download a zip of the full source tree.

September 17, 2012 at 10:09 AM

Anonymous Anonymous said...

cbloom, if you're looking for a strong LZ77, take a look at LZMAT. It's not entirely about strength, but on some kinds of data it does extremely well; for example, I think it supports arbitrary match lengths with logarithmic length encoding.

As for aPLib, last time I checked it didn't come with encoder sources... did I miss something?

September 21, 2012 at 9:52 AM

Blogger cbloom said...

Re : aPLib - Correct, I just ran the EXE to see if the compression it achieved was interesting. It was not.

Re : LZMAT - yeah I tried it; compression ratios are somewhere around LZ4/CRUSH . It does have slightly different characteristics than the others, but it's definitely small-window. Total on my set is 14518303.

September 21, 2012 at 11:56 AM

Blogger Unknown said...

I know that this is a very old post. I just wanted to comment that aPLib is not a pure LZ77. It has a somewhat strange mechanism for re-using offsets: instead of encoding an explicit offset for a match, it can re-use the offset of the previous match.

August 24, 2018 at 5:10 PM
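
The general shape of a repeat-offset mechanism like the one described above can be sketched as a toy decoder. The `Token` layout, the `Kind` enum, and `Decode` are hypothetical and are not aPLib's actual bitstream; the sketch only shows how a "repeat" match borrows the offset of the previous match instead of carrying its own.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical token layout for illustration; NOT aPLib's real format.
enum class Kind { Literal, Match, RepMatch };

struct Token {
    Kind kind;
    std::uint8_t literal;  // used by Literal
    std::size_t offset;    // used by Match only
    std::size_t length;    // used by Match and RepMatch
};

inline std::vector<std::uint8_t> Decode(const std::vector<Token>& tokens)
{
    std::vector<std::uint8_t> out;
    std::size_t lastOffset = 0;  // 0 means no match has been decoded yet
    for (const Token& t : tokens) {
        if (t.kind == Kind::Literal) {
            out.push_back(t.literal);
            continue;
        }
        // A RepMatch carries no offset of its own; it re-uses the previous one.
        const std::size_t offset = (t.kind == Kind::RepMatch) ? lastOffset : t.offset;
        assert(offset >= 1 && offset <= out.size());
        for (std::size_t i = 0; i < t.length; ++i) {
            const std::uint8_t b = out[out.size() - offset];  // byte-wise, so overlapping copies work
            out.push_back(b);
        }
        lastOffset = offset;
    }
    return out;
}
```

For example, the tokens literal `a`, literal `b`, literal `X`, match (offset 3, length 2), literal `Y`, rep-match (length 2) decode to `abXabYab`; the final match costs no offset bits because it re-uses offset 3.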
