
Post a Comment On: cbloom rants

"05-14-09 - Image Compression Rambling Part 2"

7 Comments

Blogger ryg said...

There's a major caveat with H264 beating JPEG2000 at intra coding in terms of PSNR: I've never found a comparison that states which color space it was done in, which is really important. H264 PSNR is usually specified in terms of PSNR on the decoded YUV signal, since the standard doesn't cover getting from YUV to RGB. J2k decoders, however, usually give you the decoded data back in RGB. The correct way to test this would be to use the same YUV source data for both H264 and J2k, and turn off any color space conversions on the output data, but unless it's explicitly mentioned it's safe to assume it wasn't done.

Instead, the easiest way to compare them is to just convert the decoded RGB data to YUV using the same process you used to get the YUV for H264 in the first place. This puts J2k at a disadvantage since its data goes through two (slightly lossy) transforms before the PSNR gets computed.
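[The round trip ryg describes is easy to make concrete. A sketch of the full-range BT.601 ("JFIF") RGB-to-YCbCr conversion still-image codecs use; the function name is made up for illustration, the matrix constants are the standard ones:

```python
def rgb_to_ycbcr(r, g, b):
    # Full-range BT.601 ("JFIF") conversion, as used by still-image codecs:
    # Y lands in [0,255], Cb/Cr are centered at 128.
    y  =  0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b + 128.0
    cr =  0.5 * r - 0.418688 * g - 0.081312 * b + 128.0
    return y, cb, cr

# Decoded J2k output would pass through this (with rounding to integers)
# a second time before PSNR is computed -- that's the extra lossy step.
print([round(v, 6) for v in rgb_to_ycbcr(255, 255, 255)])  # [255.0, 128.0, 128.0]
```

With integer rounding at each pass, doing this twice is what puts J2k at the slight disadvantage described above.]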

There's an even bigger problem though - still image coding normally uses YCbCr normalized so that Y is in [0,255] and Cb, Cr are in [-128,127]. Video coding, however, customarily uses the D1 standard ranges, with Y in [16,235] and Cb, Cr in [16,240]. (D1 directly recorded NTSC signals, so it needed blacker-than-black levels for the blanking intervals and used whiter-than-white levels for sync markers and the like.)

In short, if you don't explicitly make sure that J2k and H264 work on exactly the same YUV input data and are compared based on the YUV outputs they produce, H264 is automatically at an advantage because it gets values with a slightly lower dynamic range; and unless you compare in RGB, J2k doesn't benefit from its extra precision in the PSNR calculation.
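[The size of that built-in advantage is easy to quantify. A back-of-the-envelope sketch; the MSE value is purely illustrative, not from any real codec run:

```python
import math

def psnr(mse, peak=255.0):
    return 10.0 * math.log10(peak * peak / mse)

mse_full = 25.0                       # hypothetical error measured on full-range Y in [0,255]
scale = 219.0 / 255.0                 # D1/BT.601 video range compresses Y into [16,235]
mse_video = mse_full * scale * scale  # the *same* relative error after range compression

print(round(psnr(mse_full), 2))   # full-range PSNR
print(round(psnr(mse_video), 2))  # video-range PSNR: ~1.32 dB higher "for free"
```

The gap is a constant 20*log10(255/219) ≈ 1.32 dB in favor of whichever codec is measured on the range-limited data, independent of the actual coding error.]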

That said, aside from the other issues with AIC that you mentioned, it also leaves out the deblocking filter and it fixes the quantizer once for the whole image (H264 can adapt it over the course of the image if beneficial).

May 15, 2009 at 8:44 PM

Blogger cbloom said...

"There's a major caveat with H264 beating JPEG2000 at intra coding in terms of PSNR: I've never found a comparision that states which color space it was done in, which is really important."

Yeah you make an excellent point; I swore long ago to never trust the PSNR numbers in papers because they mess things up so often.

You see so much stuff that's just absurd, like comparing JPEG with the perceptual quantizer matrix using a plain RMSE/PSNR metric vs. something like JPEG-2000 without perceptual quantizers.

I was assuming they compared errors in RGB, but even if they did that has its own problems.

May 16, 2009 at 12:03 AM

Blogger cbloom said...

"(H264 can adapt it over the course of the image if beneficial)."

Do you know how that works? I haven't found any papers on it. Does it signal and actually send a new Qp with a macroblock?

May 16, 2009 at 12:05 AM

Blogger ryg said...

Yep. Whenever a macroblock with a nonzero number of residual coefficients is sent, the encoder also writes "mb_qp_delta", which is the difference between the old and new qp value, encoded as a signed Exp-Golomb code (same encoding as used for motion vectors).

The primary purpose of this is definitely rate control, but since the qp->quantization error curve is anything but monotonic, this is something you'd want to consider during RD optimization as well.
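[The signed Exp-Golomb code ryg mentions is small enough to sketch. The helper names here are made up; the value-to-codeNum mapping and the bit layout follow the H.264 spec:

```python
def ue(code_num):
    # Unsigned Exp-Golomb: [M zero bits][1][M-bit suffix], M = floor(log2(code_num + 1)).
    x = code_num + 1
    m = x.bit_length() - 1
    return "0" * m + bin(x)[2:]

def se(v):
    # Signed Exp-Golomb, as used for mb_qp_delta and motion vector deltas:
    # v > 0 maps to codeNum 2*v - 1, v <= 0 maps to codeNum -2*v.
    code_num = 2 * v - 1 if v > 0 else -2 * v
    return ue(code_num)

for delta in (0, 1, -1, 2, -2):
    print(delta, se(delta))
# 0 -> "1", 1 -> "010", -1 -> "011", 2 -> "00100", -2 -> "00101"
```

Note that a delta of 0 costs only a single bit, so signaling "no qp change" per macroblock is nearly free.]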

May 16, 2009 at 4:49 AM

Blogger cbloom said...

In the long long ago I made graphs like this:

http://www.cbloom.com/src/lena_25_err.gif

demonstrating just how extremely nonlinear the RD variation with quantizer can be.
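[The non-monotonicity shows up even in one dimension. A toy sketch with arbitrary values, nothing to do with the linked graph's actual data:

```python
# Reconstruction error of a single fixed sample as the quantizer step grows.
# The error is NOT monotonic in the step size: a coarser quantizer can land
# exactly on (or nearer to) the original value than a finer one.
x = 100.0
err = {q: abs(x - round(x / q) * q) for q in range(1, 31)}
print(err[19], err[20], err[21])  # 5.0 0.0 5.0 -- q=20 hits exactly, its neighbors miss
```

So stepping the quantizer up by one can *reduce* distortion, which is exactly why a greedy "coarser step = worse quality" assumption fails in RD optimization.]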

May 17, 2009 at 8:47 PM

Blogger ryg said...

One more note on lapping: while lapping solves blocking artifacts "on paper", it falls a bit short in practice. What lapping boils down to is replacing your DCT basis functions with wider basis functions that smoothly decay towards zero some distance from the block. This is fine for the AC bands, but for DC, this means you still have per-block averages (now with smoother transitions between them). For medium-to-high quantization settings, the difference between DCs of adjacent blocks in smooth regions (blue or cloudy skies for example), can still be significant, so now you have slightly blurred visible block boundaries instead of hard block boundaries. Definitely an improvement, but still very visible.

The background of the house.bmp sample in this comparison is a good example. Also, in general, the PSNR results of HD Photo/JPEG XR are very mediocre in that comparison and others. Certainly not "very close to JPEG2000" as advertised. This SSIM-based evaluation (with the usual caveats) is outright dismal, with HD Photo lagging far behind even ordinary JPEG in quality, and also being considerably worse than JPEG-LS in the lossless benchmarks.
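[The "blurred but still block-aligned" effect is easy to reproduce in one dimension. An assumed setup in pure Python: a smooth ramp standing in for a sky gradient, coarsely quantized per-block DCs, and a simple linear crossfade standing in for the wider lapped DC basis:

```python
BLOCK, NBLOCKS, QSTEP = 8, 8, 16
ramp = [0.5 * i for i in range(BLOCK * NBLOCKS)]  # smooth gradient, like a sky

# Coarsely quantized per-block DC averages: the "LL band" at low bitrate.
dcs = [round(sum(ramp[b * BLOCK:(b + 1) * BLOCK]) / BLOCK / QSTEP) * QSTEP
       for b in range(NBLOCKS)]
print(dcs)  # [0, 0, 16, 16, 16, 16, 32, 32] -- plateaus with 16-step jumps

# Plain block transform: hard 16-level edges at the jumps.
hard = [dcs[i // BLOCK] for i in range(len(ramp))]

# Lapped-style reconstruction: crossfade each DC toward its neighbor over
# half a block on each side. The 16-step jump is now spread across one block
# width -- blurred, but still a block-aligned feature visible in flat areas.
smooth = []
for i in range(len(ramp)):
    b, t = i // BLOCK, (i % BLOCK + 0.5) / BLOCK
    if t < 0.5:
        prev = dcs[max(b - 1, 0)]
        smooth.append(prev + (t + 0.5) * (dcs[b] - prev))
    else:
        nxt = dcs[min(b + 1, NBLOCKS - 1)]
        smooth.append(dcs[b] + (t - 0.5) * (nxt - dcs[b]))
```

The hard version jumps by a full quantizer step at two block boundaries; the crossfaded one ramps at 2 per sample there, so the transition is smoother but its location is still pinned to the block grid, which is the complaint above.]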

May 22, 2009 at 10:14 AM

Blogger cbloom said...

Yeah, I agree completely.

I did a lot of experiments with lapping. In some cases it's a big win, but usually not. It does create a very obvious sort of "bilinear filtered" look. Of course it should, it's still just taking the LL band and upsampling it in a very simple local way.

Lapping also actually *hurts* compression at most bit rates because it munges together parts of the image that aren't smooth.

The obvious solution is adaptive bases. You would like adaptive bases that :

1. detect high-noise areas and reduce their size to localize more

2. detect smooth areas and increase their size so it's a broader smooth up-filter for the LL.

Of course the best known way to do this is just wavelets!

May 22, 2009 at 10:22 AM
