
Post a Comment On: cbloom rants

"09-15-12 - Some compression comparison charts"

10 Comments

Blogger won3d said...

Re: big window LZ77

I believe Adobe uses a zlib variant called "Flate" (which is of course impossible to search for) that repurposes a few unused match encodings as larger offsets. I think PDF and other formats use it.

September 16, 2012 at 10:11 AM

Blogger Chris Green said...

Are there any good multithreaded decompressors around, or is the only option to compress the file as independent chunks and then decompress all the chunks at once? A pretty slow decompressor running on 4-6 cores could outperform an optimized one that only uses a fraction of the CPU cycles available to it. Even my phone has 4 cores.

September 18, 2012 at 11:26 AM

Blogger cbloom said...

@Chris - I'm not aware of any inherently multithreaded compression algorithms.

There are lots that support chunking for parallelism (Oodle does it that way).
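The chunking approach can be sketched in a few lines. This is a hypothetical illustration, not Oodle's actual container format: the 4-byte little-endian length prefix per chunk and the 64 KB chunk size are assumptions made up for the example, and zlib stands in for whatever codec is actually used.

```python
# Sketch of chunk-parallel decompression: compress fixed-size chunks
# independently, then inflate them concurrently on decode.
# The length-prefixed framing here is invented for illustration.
import struct
import zlib
from concurrent.futures import ThreadPoolExecutor

CHUNK = 1 << 16  # 64 KB per independently-compressed chunk (an assumption)

def compress_chunked(data: bytes) -> bytes:
    out = []
    for i in range(0, len(data), CHUNK):
        c = zlib.compress(data[i:i + CHUNK])
        out.append(struct.pack("<I", len(c)) + c)  # 4-byte length prefix
    return b"".join(out)

def decompress_chunked(blob: bytes) -> bytes:
    # Walk the length prefixes serially (cheap), then decompress the
    # chunks in parallel; order is preserved by executor.map().
    chunks, pos = [], 0
    while pos < len(blob):
        (n,) = struct.unpack_from("<I", blob, pos)
        chunks.append(blob[pos + 4:pos + 4 + n])
        pos += 4 + n
    with ThreadPoolExecutor() as ex:
        return b"".join(ex.map(zlib.decompress, chunks))
```

Note the trade-off the thread is discussing: each chunk resets the model/window, so compression ratio drops a little in exchange for decode parallelism.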

Decompression is inherently serial, because each output byte needs its predecessors to be already done.

(the compression side is much easier to parallelize; generally you keep the code output still serial, but the code creation can use parallel match-finders or parallel block-sorters or whatever)

You could imagine doing something like context-mixing decompression in fine-grained parallel; e.g. run all the models on many cores and then merge them on one core, but I suspect that would be killed by overhead. (It would be a lot easier in what I call a "kernel-like" threading environment; e.g. a console where you are in full control of thread switching, so you could lock threads to cores and know they were all running.) (There would even be advantages if the cores had independent L2 caches; each model would get its own L2.)

September 18, 2012 at 12:34 PM

Blogger won3d said...

Isn't BWT, or at least suffix sorting, parallelizable?

September 18, 2012 at 7:49 PM

Blogger cbloom said...

Yeah, but only on the compress side, not the decompress side.

Though you could do a BWT where the entropy coding was done in chunks (so that decode would be parallelized) but then the un-sort was done as a single serial pass (so that you get decorrelation across chunks).
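The "un-sort as a single serial pass" point can be seen in a toy BWT. This sketch is not how a production BWT coder works (real ones avoid the naive O(n² log n) sort and don't append a literal sentinel byte this way); it just shows that the inverse transform walks a permutation one step at a time, each step depending on the previous one.

```python
# Toy BWT, assuming the input contains no 0 bytes so b"\x00" can serve
# as a unique smallest sentinel (an assumption for this sketch only).
def bwt(s: bytes) -> bytes:
    t = s + b"\x00"
    # Naive forward transform: sort all rotations, take the last column.
    rots = sorted(t[i:] + t[:i] for i in range(len(t)))
    return bytes(r[-1] for r in rots)

def ibwt(last: bytes) -> bytes:
    n = len(last)
    first = sorted(last)  # first column of the sorted rotation table
    # Stable argsort: T[j] is the row whose left-rotation is row j.
    T = sorted(range(n), key=lambda i: last[i])
    # The serial pass: each output byte is found by following the
    # permutation from the previous position, so it can't be chunked.
    out, i = [], 0  # row 0 is the rotation starting with the sentinel
    for _ in range(n - 1):
        i = T[i]
        out.append(first[i])
    return bytes(out)
```

Chunking the entropy-coded stream, as suggested above, parallelizes everything up to producing `last`; this final permutation walk is what stays serial.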

Most modern BWT decoders spend much more time in the entropy/MTF than the un-sort.

(it's also not perfect parallelization even on the compress side; the general approach is like merge sort, but the merging is rather expensive; there is some new work on better ways to parallelize the suffix sort though)

September 19, 2012 at 9:36 AM

Anonymous Anonymous said...

LZHAM has a parallel matchfinder.

September 21, 2012 at 9:40 AM

Anonymous Anonymous said...

And IIRC there have been topics on parallel reverse BWT on encode.ru.

Also, a question:
oodlelzh is a proprietary codec, right?

September 21, 2012 at 9:41 AM

Blogger cbloom said...

"LZHAM has a parallel matchfinder."

Yeah, lots of people do; that doesn't help decode speed, which is all I'm measuring at the moment. Also it's not really a "parallel compression algorithm", it's just using some parallelism to accelerate a sequential parse.

LZHAM's parallel match finder is somewhat more aggressively parallel than others, and there's source code which is pretty clean, so it's definitely worth a look.

September 21, 2012 at 11:50 AM

Anonymous Anonymous said...

>Yeah, lots of people do
Can you give a few examples?
The only ones that I'm aware of are LZMA and LZHAM.

September 21, 2012 at 12:51 PM

Blogger cbloom said...

Well, I was thinking of LZMA, LZHAM, and me ;) I assume there are more (?), but maybe not; I see FreeArc just uses LZMA's, for example.

There are parallel suffix sorters and suffix array is a pretty good way to do LZ match finding.

September 21, 2012 at 2:58 PM
