
Post a Comment On: cbloom rants

"09-06-12 - Quick WebP lossless examination"

9 Comments

Blogger Unknown said...

Awesome blog. A few random questions on HDR images: What would you suggest for lossless encoding of HDR images (16-32 bit floating point)? E.g. scientific images.

How are the B44 and PZIP compression schemes of OpenEXR? And JP2K / JPIP -- good, bad, ugly?

-- dg

September 7, 2012 at 4:13 PM

Blogger cbloom said...

That's a good question and I don't have a good answer.

I haven't looked into it much, but from what I've seen all the choices are a disaster in one way or another. (I dislike OpenEXR because it's such a mess and over-complex.)

The lossless compression ratio of JPEGXR and J2K for 16 bit is pretty poor.

There's not much better at the moment than just using LZMA with delta:2 or delta:4.
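A minimal sketch of that suggestion in Python, using the stdlib `lzma` module's delta filter (the helper names are mine; `dist` is the byte stride, so 2 for 16-bit samples, 4 for 32-bit):

```python
import lzma

def compress_u16(data: bytes) -> bytes:
    # Byte-wise delta with stride 2 (matching 16-bit samples),
    # followed by LZMA2, which must be the last filter in the chain.
    filters = [
        {"id": lzma.FILTER_DELTA, "dist": 2},
        {"id": lzma.FILTER_LZMA2, "preset": 9},
    ]
    return lzma.compress(data, format=lzma.FORMAT_XZ, filters=filters)

def decompress_u16(blob: bytes) -> bytes:
    # The xz container records the filter chain, so plain decompress works.
    return lzma.decompress(blob)
```

For 32-bit float data you would use `"dist": 4` instead.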

September 7, 2012 at 4:52 PM

Blogger Unknown said...

Heh. Why am I not overly surprised. Thanks for the insight. =)

Okay, if you had to encode 16- or 32-bit floats per channel, what would you suggest?

September 8, 2012 at 1:22 AM

Blogger Jan Wassenberg said...

It may be worthwhile to combine conventional pixel predictors with existing scientific computing approaches for decorrelating floating-point representations.
Some slightly dated references on the former (I haven't looked into this since 2010) -
2010 "Floating precision texture compression"
2009 "pFPC: A Parallel Compressor for Floating-Point Data"

September 8, 2012 at 8:25 AM

Blogger cbloom said...

Yeah, the first question with 32 bit floats is - do you really need all the values in those bits? Typically the answer is no.

The standard approach is to convert to some kind of log-int representation and then do normal predictors on those ints.
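As an illustration of the first step (my sketch, not the author's code): reinterpreting an IEEE-754 float's bits already yields a roughly logarithmic integer scale, and a small sign fix-up makes the integer ordering match the float ordering, so ordinary integer predictors and deltas apply:

```python
import struct

def float_to_monotonic_u32(f: float) -> int:
    # Reinterpret the IEEE-754 bits as a uint32. The bit pattern of a
    # positive float is already roughly log2-scaled; the sign fix-up
    # below makes the integer ordering match the float ordering.
    u = struct.unpack("<I", struct.pack("<f", f))[0]
    return (u ^ 0xFFFFFFFF) if (u & 0x80000000) else (u | 0x80000000)
```

The mapped values can then be fed to any standard spatial predictor.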

see:

http://cbloomrants.blogspot.com/2009/05/05-26-09-storing-floating-points.html

and from the comments of that post:

http://www.cc.gatech.edu/~lindstro/

http://www.cs.unc.edu/~isenburg/

in particular :

http://www.cs.unc.edu/~isenburg/lcpfpv/


September 8, 2012 at 8:59 AM

Blogger Unknown said...

Excellent -- thanks for the tips! SZIP is a common scheme used by scientific formats (HDF, etc.), but it's just lossless Rice coding. Perhaps image-based block encoding with better pixel representations may prove useful. Thanks a lot!

September 8, 2012 at 3:49 PM

Blogger Jyrki said...

Thank you for the examination of WebP lossless. Here are some comments based on your examination:

0. Coding pixel-by-pixel does not hurt compression; it improves it. Copying a 4-byte ARGB color starting from RGBA, GBAR, or BARG byte alignments is not useful for typical images. Measuring the distances (as well as the lengths) in whole pixels reduces the number of bits transmitted for each LZ77 reference.

1. We experimented with larger and more complicated predictors. The compromise between decoding speed and predictor complexity was based on benchmarking against a set of 1000 images crawled from the web. Spatial prediction takes a significant portion of decoding time. The Select predictor is computationally less complex than Paeth -- note that you need four runs of Paeth for an RGBA pixel -- and Select produces slightly better compression on the web image test corpus.
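For reference, the Select predictor can be sketched like this (a paraphrase of the spec's pseudocode, not the production code; pixels are (A, R, G, B) tuples):

```python
def select_predict(left, top, topleft):
    # Form the gradient estimate L + T - TL per channel, then pick
    # whichever of left or top is closer to it (Manhattan distance
    # summed over all four channels). This is one decision per pixel,
    # versus Paeth's separate decision per channel.
    est = [l + t - tl for l, t, tl in zip(left, top, topleft)]
    p_left = sum(abs(e - l) for e, l in zip(est, left))
    p_top = sum(abs(e - t) for e, t in zip(est, top))
    return left if p_left < p_top else top
```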

2. The 14-bit image-size limit matches WebP lossy, which is derived from the VP8 video format. The WebP team is working on WebP-Mux, a RIFF-based container that represents image fragments (tiles) as separate chunks. Larger images can be tiled into smaller images and represented with WebP-Mux.

3. The color transforms that we allow are local, and the current encoder applies them after the spatial transform, to reduce the entropy (by reducing correlations) in residuals.

4. We believe the 1-million-pixel window is a good choice: fetching from main memory (instead of the cache) is still likely to be faster and cheaper than getting new bytes transmitted over the radio.

We believe that LZX-style last-offset caching is more useful for text than for images. However, we did not experiment with it in WebP lossless, and it is possible that it would yield further gains.

The selection of Huffman coding was based on experimentation. The arithmetic coder gave us a 1% increase in compression density at 3x slower decoding speed. We considered the arithmetic coder, but chose Huffman for the released version. We believe that palette use behaves somewhat similarly to arithmetic coding: there, a combination of four literals (ARGB) is encoded with a single entropy-coded symbol.

5. and 6. The color cache is our mechanism for local palettes, so an image uses either a global palette or the color cache.
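For readers following along: the color cache indexes recently used colors by a multiplicative hash of the packed ARGB value. A sketch (the multiplier is the one from the VP8L spec; treat the rest as illustrative):

```python
def color_cache_index(argb: int, cache_bits: int) -> int:
    # Hash a packed 0xAARRGGBB value into a cache of 2**cache_bits
    # recently seen colors, using the spec's multiplicative constant.
    return ((0x1E35A7BD * argb) & 0xFFFFFFFF) >> (32 - cache_bits)
```

A hit in the cache lets the encoder send a short index instead of four literal bytes.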

7. The format itself currently supports only 8 bits per component. Higher dynamic range is planned via enhancement layers (through WebP-Mux), where different bitplanes are represented as separate image chunks in the RIFF container.

ADDENDUM. We believe there is still significant room for improvement in the WebP encoder. Specifically, the LZ77 optimization should be redone after the entropy-code clustering has completed; this might give savings of around 4% for graphical images. Also, much of the encoder's behavior is heuristic rather than exhaustive (as, for example, pngcrush is). Writing a slower, more exhaustive encoder would give some further savings for WebP lossless.

September 17, 2012 at 6:32 AM

Blogger cbloom said...

Hi Jyrki, thanks for the info.

@1 - good point about doing it once instead of 3-4 times

@4 - Yeah, I didn't mean that the 1M window was a bad thing, just pointing out the consequences. Larger windows always help compression; they just put a constraint on decoder memory use.

Just switching Huffman to arithmetic is not a fair test. The whole advantage of arithmetic coding is not saving a fraction of a bit; it's that context modeling becomes much easier, so switching to arithmetic should also add some context bits to the coding.
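To make that point concrete, here is a toy model (hypothetical, not from either codec) showing how conditioning on context shrinks the coded size an adaptive arithmetic coder could approach:

```python
import math
from collections import defaultdict

def context_entropy(bits, ctx_len):
    # Estimate the cost in bits of coding a bit string with an adaptive
    # model conditioned on the previous ctx_len bits. An arithmetic
    # coder's output approaches this sum; Huffman cannot emit
    # fractional bits per symbol or switch models this cheaply.
    counts = defaultdict(lambda: [1, 1])  # Laplace-smoothed [zeros, ones]
    total = 0.0
    for i, b in enumerate(bits):
        ctx = tuple(bits[max(0, i - ctx_len):i])
        c0, c1 = counts[ctx]
        p = (c1 if b else c0) / (c0 + c1)
        total += -math.log2(p)
        counts[ctx][b] += 1
    return total
```

On a strictly alternating bit string, the order-1 model costs far fewer bits than the order-0 model, even though both see the same 50/50 symbol frequencies.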

I guess I don't understand how ARB literals are sent. Early on the doc says

"1. Huffman coded literals, where each channel (green, alpha, red, blue) is entropy-coded independently;"

which seems to indicate that each one gets its own Huffman code, but later on it sort of implies that only G is Huffman-coded and the others are "read" somehow in a way that is not specified.

September 17, 2012 at 8:59 AM

Blogger Jyrki said...

All of ARGB have their own Huffman codes.

October 18, 2012 at 8:37 AM
