
Post a Comment On: cbloom rants

"04-04-13 - Oodle Compression on BC1 and WAV"

11 Comments -

Blogger won3d said...

You are beating RAR in one important aspect: not having an embedded exploitable bytecode engine:

http://blog.cmpxchg8b.com/2012/09/fun-with-constrained-programming.html

For deltas, one thing that Jeff said that makes lots of sense is to segregate the sign bits since they're all random, and the magnitudes end up having an exponential-ish distribution. Obviously you still have the top/bottom byte issue.

One random thing I'd been wondering about for delta-coded things was if you knew the bounds of the original signal, could you use the previous value as something that predicted the sign? If you know you're at the maximum (minimum) value there's no way to go but down (up). And the middle is all random anyway.

April 4, 2013 at 8:12 PM
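
A minimal Python sketch of the two ideas in the comment above: keeping the sign bits in their own stream, and dropping the sign entirely when the previous value sits on a known bound. The names `split_deltas` and `merge_deltas` and the `lo`/`hi` parameters are made up for illustration; this is not how Oodle or any particular codec does it.

```python
def split_deltas(samples, lo=None, hi=None):
    """Delta-code samples into separate sign and magnitude streams.

    lo/hi are the (optional, assumed known) bounds of the original signal.
    """
    signs = []       # 1 = negative, 0 = positive; only stored when needed
    magnitudes = []  # |delta| for every sample after the first
    prev = samples[0]
    for cur in samples[1:]:
        d = cur - prev
        magnitudes.append(abs(d))
        # The sign is implied for a zero delta, and also when prev is
        # pinned at a bound: from lo you can only go up, from hi only down.
        sign_known = d == 0 or prev == lo or prev == hi
        if not sign_known:
            signs.append(1 if d < 0 else 0)
        prev = cur
    return samples[0], signs, magnitudes


def merge_deltas(first, signs, magnitudes, lo=None, hi=None):
    """Inverse of split_deltas: rebuild the original samples."""
    out = [first]
    sign_bits = iter(signs)
    for m in magnitudes:
        prev = out[-1]
        if m == 0 or prev == lo:
            d = m
        elif prev == hi:
            d = -m
        else:
            d = -m if next(sign_bits) else m
        out.append(prev + d)
    return out
```

For the 8-bit sequence `[0, 3, 1, 1, 255, 250]` with bounds 0 and 255, only two sign bits are stored (for the steps -2 and +254); the other three signs are implied.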

Blogger cbloom said...

"For deltas, one thing that Jeff said that makes lots of sense is to segregate the sign bits since they're all random"

The good work on this is still CALIC and JPEG-LS. They do some clever things with post-delta distributions to ensure that the sign bit actually *is* random. Normally it is not, but if you bias correctly you can make it random. The other clever bit is to flip your post-delta values so that you always put the longer tail on the positive side (or vice-versa, doesn't matter as long as it's consistent).

"If you know you're at the maximum (minimum) value there's no way to go but down (up)"

Yeah, for sure. On 16-bit WAV I'm sure this doesn't matter because reaching the edge is so rare, but on 8-bit BMP type data it can be measurable, because pixel values do hang out around 00 and FF a lot.

April 5, 2013 at 7:11 AM
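
To make the 8-bit case concrete, here is a hedged sketch of one way to exploit the known range near 00 and FF: fold each delta into a non-negative code, interleaving signs only while both directions are actually reachable. This is a generic range-aware folding, not the CALIC or JPEG-LS bias-correction scheme mentioned above, and `fold_residual` / `unfold_residual` are hypothetical names.

```python
def fold_residual(prev, cur, lo=0, hi=255):
    """Map cur to a code in [0, hi - lo], given prev and the known range."""
    d = cur - prev
    down = prev - lo        # largest possible step downward
    up = hi - prev          # largest possible step upward
    near = min(down, up)
    if abs(d) <= near:
        # Both signs are possible: interleave 0, +1, -1, +2, -2, ...
        return 2 * d - 1 if d > 0 else -2 * d
    # Only one direction can reach this far, so the sign costs nothing.
    return near + abs(d)


def unfold_residual(prev, code, lo=0, hi=255):
    """Inverse of fold_residual: recover cur from prev and the code."""
    down = prev - lo
    up = hi - prev
    near = min(down, up)
    if code <= 2 * near:
        d = (code + 1) // 2 if code % 2 else -(code // 2)
    else:
        mag = code - near
        d = mag if up > down else -mag
    return prev + d
```

With `prev = 0` the code is simply the delta itself (`fold_residual(0, 5)` is 5), since going down is impossible; in the middle of the range it behaves like an ordinary interleaved sign fold.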

Blogger cbloom said...

"You are beating RAR in one important aspect: not having an embedded exploitable bytecode engine:"

Ah, craziness! Presumably this is why Google now makes it troublesome to send RARs in email. Grumble.

April 5, 2013 at 7:12 AM

Blogger Brian said...

"Presumably this is why Google now makes it troublesome to send RARs in email. Grumble."

Hah, though do you really use RARs? Isn't it an archaic archiving format that "no one" should use? I of course mean in preference to 7z or zip with lzma/etc. I'm just learning about archive formats in more depth, so please excuse what might be ignorance.

April 5, 2013 at 2:21 PM

Blogger cbloom said...

Sure I use RAR. It just works, and who cares about super compression ratios anymore? (*)

Disks are cheap, the internet is fast. It's not like the old days, when you didn't want to spend another $1000 for another 1 MB of hard disk, or had to send data over a 2400 baud modem; back then you really cared about max compression.

(* = of course there are lots of good reasons to care about compression, but none of them apply to ordinary home users. Content providers want to get their bandwidth down; distributors need to fit data on a DVD; mobile providers want downloads to be tiny.)

Actually as a home user I rarely use anything but zip.

The only data on my home disk that's annoying me are all the super-huge RAW photos from my digital camera. I'd like to have a RAW-specific compressor, that might actually be worth using because it would put a serious dent in the size of my backup set. (and backups are still ridiculously slow)

April 5, 2013 at 3:23 PM

Anonymous Anonymous said...

I don't think there's any evidence the rar bytecode engine is exploitable. At least there isn't any at that link.

April 5, 2013 at 6:03 PM

Blogger Aaron said...

Compressing RAWs, great idea! There's this: Rawzor, though I've never tried it.

April 6, 2013 at 1:24 PM

Blogger Trixter said...

"out[i] = in[i] - 2*in[i-1] + in[i-2]"

I'm confused; on a few tests, this is performing oddly. Did you mean this instead?

out[i] = in[i] + 2*in[i-1] - in[i-2]

I ask because the simplest linear predictor I've been using was just 2*(current sample) - (previous sample) and it seems to work more predictably than what you posted.

I just want to make sure I'm not going crazy...

April 8, 2013 at 2:20 PM

Blogger cbloom said...

I think it's right, but maybe we have a syntax difference.

The current sample is in[i]

A pure neighbor predictor is

pred = in[i-1]

and

out[i] = in[i] - pred

A linear gradient predictor is

pred = in[i-1] + (in[i-1] - in[i-2])
     = 2*in[i-1] - in[i-2]

then

out[i] = in[i] - pred
       = in[i] - (2*in[i-1] - in[i-2])
       = in[i] - 2*in[i-1] + in[i-2]

April 8, 2013 at 2:52 PM
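
The derivation above can be checked with a small Python sketch. It uses `src` for the input (`in` is a Python keyword), the function names are hypothetical, and the handling of the first two samples (predict 0, then predict the previous sample) is an assumption the thread does not specify. It also ignores fixed-width wraparound, which a real 8-bit or 16-bit filter would have to handle.

```python
def gradient_predict(src, i):
    """Prediction for src[i] from earlier samples only."""
    if i == 0:
        return 0                         # nothing to predict from (assumed)
    if i == 1:
        return src[0]                    # fall back to the neighbor predictor
    return 2 * src[i - 1] - src[i - 2]   # in[i-1] + (in[i-1] - in[i-2])


def gradient_filter(src):
    """out[i] = in[i] - pred, i.e. in[i] - 2*in[i-1] + in[i-2] for i >= 2."""
    return [x - gradient_predict(src, i) for i, x in enumerate(src)]


def gradient_unfilter(res):
    """Invert gradient_filter by rebuilding the samples left to right."""
    src = []
    for i, r in enumerate(res):
        src.append(r + gradient_predict(src, i))
    return src
```

On the linear ramp `[10, 12, 14, 16, 18]` the residuals are `[10, 2, 0, 0, 0]`: once two samples are known, a straight line is predicted exactly. The sign-flipped variant `in[i] + 2*in[i-1] - in[i-2]` gives 28, 32, 36 for the last three samples instead of zeros.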

Blogger Trixter said...

Thanks for the longhand; the predictor we use is identical given the differences in our syntax.

However, I still don't understand how the sign is changing from:

= in[i] - (2*in[i-1] - in[i-2])

...to:

= in[i] - 2*in[i-1] + in[i-2]

Again, I might be missing something, but it looks like you're writing A-(B-C) = A-B+C ...

April 9, 2013 at 2:25 PM

Blogger cbloom said...

Indeed!

April 9, 2013 at 2:55 PM
