
Post a Comment On: cbloom rants

"07-06-09 - Small Image Compression Notes"

10 Comments -

Blogger ryg said...

Lapping: Yeah, that mirrors my experience as well. It looks nice on paper, but the gains in PSNR are relatively small and completely vanish once you compare based on SSIM or (subjective) perceptual quality. Good idea to turn "how much does lapping actually help" into an optimization problem.

"I guess the fix to this is some hacky/heuristic way to just force the lagrange optimization not to be too aggressive.": There's the Psy RDO stuff that's been added to x264 some months ago. I couldn't find details (didn't spend much time looking though), but looking at the images, what seems to be happening is that it tries to keep the amount of "noisiness", measured as e.g. sum of absolute difference between all pixels in their block and their immediate neighbors.

Put differently, you treat the low-frequency part as usual, and for the high-frequency part, you try to converge on something that has roughly the same energy but not necessarily at the right frequencies. To give a 1D example, if your DCT coefficients are "18, 4, 3, 2, 1, 1, 1, 1", such an algorithm would give you something like "18, 4, 3, 2, 0, 0, 2, 0".
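
To check the 1D example numerically, here is a small sketch. It assumes the "high-frequency part" is the last four coefficients (the split index `4` is an assumption, as is the helper name `tail_energy`) and that energy means the sum of squares.

```python
def tail_energy(coeffs, split):
    """Sum of squared coefficients from index `split` onward."""
    return sum(c * c for c in coeffs[split:])


def nonzero_count(coeffs):
    """Number of non-zero coefficients, a crude proxy for coding cost."""
    return sum(1 for c in coeffs if c != 0)


original = [18, 4, 3, 2, 1, 1, 1, 1]
substitute = [18, 4, 3, 2, 0, 0, 2, 0]

# Both tails carry the same energy, but the substitute
# has fewer non-zero values to code.
print(tail_energy(original, 4), tail_energy(substitute, 4))
print(nonzero_count(original), nonzero_count(substitute))
```

Under these assumptions both tails have energy 4, while the substitute needs five non-zero coefficients instead of eight.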

So if you have an image region that's mainly smooth with some noise on top (due to film grain, fine details, whatever), you reconstruct the smooth part properly, and for the noise you just substitute some arbitrary noise at roughly the right frequency and roughly the right intensity - playing on the fact that the human visual system is very good at detecting the presence of noise and very bad at telling one 8x8 block of noise from another.

The main question would be how many high-frequency coefficients you need to make noise still seem noisy and not let the transform basis functions shine through too much.

July 7, 2009 at 3:07 AM

Anonymous Anonymous said...

"The main question would be how many high-frequency coefficients you need to make noise still seem noisy and not let the transform basis functions shine through too much."

Maybe add some "noise" basis functions that are less correlated than the normal ones! (they'd be redundant, but...)

July 7, 2009 at 3:55 AM

Blogger won3d said...

"Yeah sure naive JPEG looks awful, but even a deblocking filter after decompress can fix that case very easily"

What do people use for JPEG deblocking? All I could find was this.

July 7, 2009 at 7:19 AM

Blogger cbloom said...

I'll write a post about deblocking, I've collected a bunch of papers on it.

"Maybe add some "noise" basis functions that are less correlated than the normal ones! (they'd be redundant, but...)"

Yeah, this is something I've been thinking about too, maybe I'll write a post instead of a long comment...

July 7, 2009 at 8:03 AM

Blogger   said...

I've done something similar before. I detect a noisy region at a particular bit-plane and just generate noise on the decompression side instead of trying to encode the noise exactly. Works really well.

July 7, 2009 at 1:53 PM

Blogger cbloom said...

Actually I just tried something totally brain-dead and it works great.

I only allow the RD optimization to consider killing the very last non-zero value, and only if it's 1 and it's not in a very important AC coefficient.

This eliminates all the unsightly variation and surprisingly preserves 90% of the RDO win. It also makes the encoder way faster.
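
A rough sketch of that restricted rule, under several assumptions: coefficients are given in scan (zigzag) order, "1" is read as magnitude 1, "very important" is modeled as any index below a hypothetical cutoff `first_droppable`, and the caller supplies a `cost` function returning the Lagrangian cost D + lambda * R. None of these names come from the actual encoder.

```python
def last_one_candidate(coeffs, first_droppable=3):
    """Index of the only coefficient RDO may zero, or None.

    The candidate must be the last non-zero value, have magnitude 1,
    and sit at or beyond `first_droppable` in scan order.
    """
    for i in range(len(coeffs) - 1, -1, -1):
        if coeffs[i] != 0:
            if abs(coeffs[i]) == 1 and i >= first_droppable:
                return i
            return None
    return None


def restricted_rdo(coeffs, cost, first_droppable=3):
    """Zero the single candidate only if that lowers cost(coeffs)."""
    i = last_one_candidate(coeffs, first_droppable)
    if i is None:
        return list(coeffs)
    trial = list(coeffs)
    trial[i] = 0
    return trial if cost(trial) < cost(coeffs) else list(coeffs)


# Toy cost for illustration only: count of non-zero coefficients.
toy_cost = lambda c: sum(1 for v in c if v != 0)
print(restricted_rdo([18, 4, 3, 2, 0, 0, 1, 0], toy_cost))
```

Because at most one trial is evaluated per block, the search is far cheaper than considering every coefficient, which fits the speedup mentioned above.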

July 7, 2009 at 3:28 PM

Anonymous Anonymous said...

I've probably posted about this here before, but for quantized DCT deblocking, I still think there's something clever to be done by knowing about the quantizer.

I.e. when you dequantize a DCT coefficient, you put it in the middle of its range -- since this will minimize error across all possible images that would have had that quantized coefficient. It would be better to instead choose a dequantized value within the allowed range which prefers decoding to an image that is "more like what you find in real images" instead of decoding to the mathematical average of all mathematically possible images.
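
One simple instance of this idea, sketched under assumptions: the quantizer rounds to nearest with step size `step`, so a quantized value `q` covers the range from `(q - 0.5) * step` to `(q + 0.5) * step`, and real-image coefficients cluster near zero, so the reconstruction is pulled toward zero by a hypothetical `bias` (a fraction of a step, below 0.5 so the result stays inside the allowed range). The papers referred to in this thread may do something quite different.

```python
def dequant_range(q, step):
    """Interval of original values that round-to-nearest maps to q."""
    return ((q - 0.5) * step, (q + 0.5) * step)


def dequant_midpoint(q, step):
    """Standard reconstruction: the middle of the interval."""
    return q * step


def dequant_biased(q, step, bias=0.25):
    """Reconstruction pulled toward zero by `bias` steps (0 <= bias < 0.5)."""
    if q == 0:
        return 0.0
    sign = 1 if q > 0 else -1
    return sign * (abs(q) - bias) * step


lo, hi = dequant_range(3, 8)
value = dequant_biased(3, 8, bias=0.25)
print(dequant_midpoint(3, 8), value, lo <= value <= hi)
```

A smarter decoder in the spirit of the comment would pick the value per coefficient, or jointly per block, from some image model rather than using one fixed bias.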

Smartly choosing the right dequantized values for multiple coefficients in a given block might make it possible to get rid of ringing; deblocking might require looking at cross-block properties, and I don't have the slightest clue where to even begin to figure out what mathematics would be involved in a decoder with either of those behaviors.

July 7, 2009 at 4:36 PM

Blogger cbloom said...

Yeah, Sean, there are papers on exactly that. I'll mention it when I write about deblocking.

July 7, 2009 at 4:38 PM

Anonymous Anonymous said...

http://ieeexplore.ieee.org/search/wrapper.jsp?arnumber=723513

July 7, 2009 at 4:43 PM

Anonymous Anonymous said...

Ah, good, there you go. I've never read any of these papers because of the pay wall, so I could never be sure what they were even about.

July 8, 2009 at 5:26 AM
