Post a Comment On: cbloom rants

"10-15-10 - Image Comparison Part 6 - cbwave"

6 Comments

Blogger ryg said...

How do you do the color-space conversions? Do you convert to 8-bit integer RGB first and go from there to S-CIELAB, or do you keep float intermediates? Every discrete rounding step in the pipeline introduces a bit of round-off error, and it keeps adding up. You need to use more than 8 bits per channel for your decoded images or this rounding error biases your comparison. The easiest solution (well, not easiest really, but least likely to develop unforeseen systematic biases) overall is to do the YUV->RGB in float and store floating-point RGB values. If you don't want to use floats, at least use more than 8 bits per color channel.
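(A toy sketch of the bias described above, not from the thread: round-tripping through an 8-bit intermediate adds error that a float pipeline avoids. The matrix is the standard JPEG RGB->YCbCr one; the data and the quantization step are made up, and the usual +128 chroma offset is omitted for simplicity.)

```python
import numpy as np

rng = np.random.default_rng(0)
rgb = rng.random((1000, 3))  # made-up linear values in [0,1]

# Standard JPEG/BT.601 RGB->YCbCr matrix (chroma offset omitted).
M = np.array([[ 0.299,     0.587,     0.114    ],
              [-0.168736, -0.331264,  0.5      ],
              [ 0.5,      -0.418688, -0.081312]])
Minv = np.linalg.inv(M)

def to8(x):
    return np.round(x * 255.0) / 255.0  # simulate an 8-bit intermediate

# Float pipeline: forward and inverse transform in float, quantize once.
out_float = to8(rgb @ M.T @ Minv.T)

# 8-bit pipeline: also quantize the intermediate YCbCr values.
out_int = to8(to8(rgb @ M.T) @ Minv.T)

err_float = np.abs(out_float - to8(rgb)).max()
err_int   = np.abs(out_int   - to8(rgb)).max()
assert err_int >= err_float  # the extra quantization step never helps
```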

The same thing goes for other rounding steps throughout the codec. Don't fully descale post-DCT; keep at least 2 extra bits and only remove them at the very end (usually during Y'CbCr->RGB). Similarly, consider keeping your YCbCr/YUV coefficients in more than 8 bits (usually 16 bits signed for SW implementations) and only clamp once at the end. If you use an IJG-style triangle (aka bilinear) upsampling filter, do the multiplies but hold off on the divides/shifts (if you're working in 16 bits, you have enough leftover bits to do this). You can fold the divides into your one descaling shift at the end.
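(Editorial sketch of the fold-the-divides idea, with made-up sample values: two passes of a triangle filter give a total scale of 16. Descaling after each pass truncates twice; keeping the full scale and descaling once with a rounding bias matches the exact answer.)

```python
# Two-pass triangle upsampling tap, (3a+b)/4 per pass, total scale 16.
def two_pass_early(a, b, c, d):
    h0 = (3 * a + b) // 4        # truncating descale after pass 1
    h1 = (3 * c + d) // 4
    return (3 * h0 + h1) // 4    # ...and again after pass 2

def two_pass_late(a, b, c, d):
    acc = 3 * (3 * a + b) + (3 * c + d)  # keep the full x16 scale
    return (acc + 8) >> 4                # one rounded descale at the end

a, b, c, d = 10, 11, 11, 13              # made-up pixel values
exact = (9 * a + 3 * b + 3 * c + d) / 16 # = 10.5625
assert two_pass_early(a, b, c, d) == 10  # truncated twice, drifted low
assert two_pass_late(a, b, c, d) == 11   # nearest integer to 10.5625
```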

When you finally do the shift, rounding is important. At the very least, add a rounding bias of 0.5. Even better, use some unbiased tie-breaking rule like round-to-nearest-even (not really an issue if you use FP internally or have lots of extra bits, but it's significant if you're working in fixed point and only have 2-4 extra bits). Alternatively (and easier during development), just do everything in float and only round to int at the very end. Same stuff goes for the encoder. Scale your RGB->YCbCr matrix up by a small power of 2, use more than 8 bits internally for YCbCr coefs and only remove the extra scale factor during quantization. Doing all this properly can give you an improvement of 1dB PSNR for very little effort indeed, depending on the image (it's not really visible, but RMSE/PSNR is hypersensitive to this).
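(A small illustration of the tie-breaking point, not from the thread: with only 2 extra fractional bits, exact ties are common, and always rounding ties upward adds a systematic positive bias that round-to-nearest-even avoids. The value range is arbitrary.)

```python
def round_bias(x, shift):
    # add 0.5 ulp then shift: ties always go up
    return (x + (1 << (shift - 1))) >> shift

def round_even(x, shift):
    # round-to-nearest-even tie-break
    half = 1 << (shift - 1)
    q = (x + half) >> shift
    if (x & ((1 << shift) - 1)) == half and (q & 1):
        q -= 1                   # tie landed on odd: step back to even
    return q

shift = 2                        # only 2 extra fractional bits
vals = range(64)                 # arbitrary sweep of fixed-point values
bias_up   = sum(round_bias(v, shift) * 4 - v for v in vals)
bias_even = sum(round_even(v, shift) * 4 - v for v in vals)
assert bias_up > 0                       # ties-up drifts positive
assert abs(bias_even) < abs(bias_up)     # half-even stays unbiased
```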

S-CIELAB first does RGB->XYZ, and CIE Y happens to be very closely aligned with YCbCr Y (and less so with the other transforms). If you treat chroma differently from luma, that means all other bases will smudge some of their chroma noise into CIE Y (and from there into L*) while JPEG YCbCr won't. IOW, given your choice of metric, it's no surprise that YUV comes out looking as good as it does.
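(Editorial sketch of the alignment claim: comparing the CIE Y row of the standard sRGB->XYZ matrix against the BT.601 luma weights and against a plain RGB axis. This ignores the gamma nonlinearity; the matrix rows are standard published values.)

```python
import numpy as np

cie_y   = np.array([0.2126, 0.7152, 0.0722])  # Y row of sRGB->XYZ (D65)
luma601 = np.array([0.299, 0.587, 0.114])     # JPEG/BT.601 luma weights
green   = np.array([0.0, 1.0, 0.0])           # a plain RGB basis axis

def cos_sim(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# YCbCr's luma axis is much closer to CIE Y than any RGB axis is.
assert cos_sim(cie_y, luma601) > cos_sim(cie_y, green)
assert cos_sim(cie_y, luma601) > 0.97   # nearly parallel
```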

October 15, 2010 at 9:30 PM

Blogger cbloom said...

My code is all 100% float so I don't have any of those problems.

I assume that some of the other people suffer from those problems.

The problem with the cbwave downsample is just that it uses box filters.

October 15, 2010 at 9:47 PM

Blogger cbloom said...

"that means all other bases will smudge some of their chroma noise into CIE Y (and from there into L*) while JPEG YCbCr won't. IOW, given your choice of metric, it's no surprise that YUV comes out looking as good as it does."

I'm not sure I buy this argument.

While my metric is in fact LAB, it's *float* LAB, and it's basically just a rotation of the color matrix. Rotation doesn't change L2 so it shouldn't be a large effect on error.

Obviously if you were downsampling chroma, then having your axes aligned should make a big difference, but the YUV basis wins pretty big even without downsampling - in the non-downsampled case the color channels are all treated the same way.

Something a little more complex is going on. I believe it must be something about the preferential directions of the discretization grid.

October 15, 2010 at 9:52 PM

Blogger ryg said...

"While my metric is in fact LAB, it's *float* LAB, and it's basically just a rotation of the color matrix. Rotation doesn't change L2 so it shouldn't be a large effect on error."
Wait a minute. First off, you're not usually coding in linear RGB, but YUV derived from nonlinear RGB with gamma. The first step when converting to XYZ (which is always linear) is to convert that to linear RGB (there goes linearity of the whole transform). The RGB->XYZ matrix is *not* orthogonal and hence doesn't preserve the L2 metric (or related quantities). And CIELAB takes the CIE XYZ coefficients and applies more nonlinear transforms on top (the remapping function, which does cube roots for the higher values and is linear in the lower part of the range).
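(The non-orthogonality is easy to check numerically; a sketch, using the standard rounded sRGB->XYZ matrix for D65:)

```python
import numpy as np

# Standard sRGB -> XYZ matrix (D65, rounded published values).
M = np.array([[0.4124, 0.3576, 0.1805],
              [0.2126, 0.7152, 0.0722],
              [0.0193, 0.1192, 0.9505]])

# Orthogonal would mean M.T @ M == I; it clearly is not.
assert not np.allclose(M.T @ M, np.eye(3), atol=1e-2)

# Concretely: two RGB points equidistant from the origin map to XYZ
# points at different distances, so L2 is not preserved.
a = np.array([1.0, 0.0, 0.0])   # pure red
b = np.array([0.0, 0.0, 1.0])   # pure blue
assert not np.isclose(np.linalg.norm(M @ a), np.linalg.norm(M @ b))
```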

The remapping function partially cancels out with the gamma curve, but still, it's definitely not just a rotation.

October 15, 2010 at 10:09 PM

Blogger cbloom said...

Yeah okay, that was wrong. RGB-YUV is pretty close to a rotation, though it's not actually orthonormal either. And RGB-LAB has a degamma, then a sort of regamma, but of course it doesn't preserve L2 distances - that's the whole *point* of it, to make the distances more perceptually uniform.
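("Pretty close to a rotation, but not orthonormal" can be quantified; a sketch using the standard JPEG RGB->YCbCr matrix. A rotation would have an identity Gram matrix and all singular values equal to 1.)

```python
import numpy as np

# Standard JPEG/BT.601 RGB -> YCbCr matrix (offsets omitted).
M = np.array([[ 0.299,     0.587,     0.114    ],
              [-0.168736, -0.331264,  0.5      ],
              [ 0.5,      -0.418688, -0.081312]])

G = M @ M.T                     # Gram matrix: identity iff orthonormal
assert not np.allclose(G, np.eye(3), atol=1e-3)

# Singular values of a rotation are all exactly 1; these only hover
# in the same ballpark.
s = np.linalg.svd(M, compute_uv=False)
assert s.max() / s.min() < 2.0              # rotation-ish...
assert not np.allclose(s, 1.0, atol=0.05)   # ...but not a rotation
```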

But it still doesn't make sense to me. It's not like the YUV-space data is passed directly to the error metric - it goes back into integer RGB to get written out to a BMP. I figure that step is appropriate because in practice we convert to 24-bit RGB to display on the screen, so that step should be included in the error metric.

Why, for example, is YUV so much better than KLT-FixedY, which uses the same Y but chooses its own chroma axes?

There are two different issues with colorspace choice in a lossy compressor. One is how well it decorrelates the data and simply puts it in a more compressible form. The other is how it rotates (and shears and scales) the quantization grid.
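(The decorrelation half of this is easy to demonstrate; an editorial sketch with toy 2D data: at the same quantizer step, rotating highly correlated data onto its principal axes concentrates the energy into one channel, so the quantized symbols need fewer bits - lower empirical entropy - for essentially the same distortion.)

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=20000)
# Two strongly correlated channels (made-up toy data).
data = np.stack([x + 0.1 * rng.normal(size=x.size),
                 x + 0.1 * rng.normal(size=x.size)])

R = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)  # 45 deg = KLT here

def entropy_bits(q):
    _, counts = np.unique(q, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

step = 0.5   # same quantizer step in both bases
H_raw = sum(entropy_bits(np.round(c / step)) for c in data)
H_rot = sum(entropy_bits(np.round(c / step)) for c in (R @ data))
assert H_rot < H_raw   # decorrelated axes cost fewer bits per sample
```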

I dunno, I have to think about it a bit more.

October 15, 2010 at 10:46 PM

Blogger cbloom said...

I've been thinking about this a bit.

The question is, why is it such a big advantage for cbwave to use YUV, which is more similar to the measurement basis LAB than the other color spaces.

First of all let's be clear about what's NOT going on :

1. cbwave is not downsampling chroma or in any way taking bits away from chroma.

In coders that *do* downsample chroma, it's obviously a big advantage to have your color axes aligned with the measurement axes. This is because you are killing chroma data, and if your concept of chroma is rotated from the measurement basis, the metric will judge that you are incorrectly killing some useful bits.
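(A toy 2D illustration of that misalignment penalty, not from the thread: axis 0 plays "luma" in the measurement basis and axis 1 plays "chroma", which gets discarded. Killing chroma in a basis rotated off the measurement basis leaks error into measured luma; aligned, it doesn't. All values are made up.)

```python
import numpy as np

rng = np.random.default_rng(2)
# Luma-dominant toy signal: big axis-0 variance, small axis-1 variance.
sig = rng.normal(size=(2, 5000)) * np.array([[1.0], [0.3]])

def kill_chroma_in(basis, x):
    y = basis @ x
    y[1] = 0.0           # discard the "chroma" coefficient
    return basis.T @ y   # back to the measurement basis

aligned = np.eye(2)
theta = np.deg2rad(20)   # codec basis rotated 20 degrees off
rotated = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])

# Measured-luma squared error after killing chroma in each basis.
err_aligned = ((kill_chroma_in(aligned, sig) - sig)[0] ** 2).mean()
err_rotated = ((kill_chroma_in(rotated, sig) - sig)[0] ** 2).mean()
assert err_aligned < err_rotated   # misaligned chroma kill hurts luma
```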

2. You might think it's always best to work in the measurement basis, but obviously that's not true - see the RGB colorspace results for RMSE for example. The advantage of a good decorrelating colorspace is much more valuable.


My guess is that the largest effect is the orientation of the quantization axes - in particular the luma axis, since it's the most important one.

Maybe I'll post a picture because it's hard to explain with words.

October 16, 2010 at 2:04 PM
