This Theora update is a good reminder to be careful when optimizing.
Apparently they had a massively broken DCT which was just throwing away quality for no good reason.
This may seem really bone-headed but it's very very common. You may recall in some of my past posts I looked into some basic video stuff
in great detail and found that almost every major video encoder/decoder is broken on the simple stuff :
the DCT (or whatever transform)
quantization/dequantization
color space conversion
upsample/downsample
translating float math into ints
These things are all relatively trivial, but it's just SO common to get them wrong, and you throw away a bottom bit (or half a bottom bit) when
you do so. Any time you are writing a compressor, you need to write reference implementations of all these basic things that you know are right -
and check them! And then a crucial thing is : keep the reference implementation around! Ideally you would be able to switch it on from the
command line, or failing that with a build toggle, so at any time you can go back and enable the slow mode and make sure everything works as
expected.
(of course a frequent cause of trouble is that people go and grab an optimized integer implementation that they
found somewhere, and it either is bad or they use it incorrectly (eg. maybe it assumes data that's biased in a
certain way, or centered at 0, or scaled up by *8, or etc))
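As a sketch of that habit (hypothetical code, not from any particular codec): here is a float reference quantizer next to an integer-only version, with a brute-force check that the fast path agrees with the slow one. The function names and value ranges are invented for illustration:

```python
import math
import random

def quantize_ref(x, q):
    # slow reference: round-half-up of x/q, done in floating point
    return math.floor(x / q + 0.5)

def quantize_int(x, q):
    # integer-only version: floor((2x + q) / 2q) == floor(x/q + 1/2) exactly,
    # for integer x and positive integer q (Python // floors toward -inf)
    return (2 * x + q) // (2 * q)

# the crucial habit: keep the reference around and actually check against it
random.seed(1)
for _ in range(100000):
    x = random.randint(-4096, 4095)
    q = random.randint(1, 64)
    assert quantize_int(x, q) == quantize_ref(x, q), (x, q)
print("integer quantizer matches reference")
```

In a real codebase the same comparison would sit behind the command-line or build toggle described above, so the check can be re-run after every round of optimization.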
A lot of this basic stuff in video is very easy to do regression tests on (eg. take some random 8x8 data, dct, quantize, dequantize, idct, measure
the error, it should be very low) so there's no excuse to get it wrong. But even very experienced programmers do
get it wrong, because they get lazy. They might even start with a reference implementation they know is right, and
then they start optimizing and translating stuff into ints or SIMD, and they don't maintain the slow path, and somewhere
along the line a mistake slips in and they don't even know it.
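A minimal version of that regression test might look like the following (a sketch with a deliberately slow from-the-definition DCT; all names are illustrative): random 8x8 data goes through dct, quantize, dequantize, idct, and the spatial error has to stay within the bound the quantizer step implies.

```python
import math
import random

N = 8

def c(k):
    # orthonormal DCT scale factor
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

def dct2(block):
    # slow reference 2-D DCT-II, straight from the definition
    return [[c(u) * c(v) * sum(
        block[x][y]
        * math.cos((2 * x + 1) * u * math.pi / (2 * N))
        * math.cos((2 * y + 1) * v * math.pi / (2 * N))
        for x in range(N) for y in range(N))
        for v in range(N)] for u in range(N)]

def idct2(coef):
    # slow reference inverse (2-D DCT-III)
    return [[sum(
        c(u) * c(v) * coef[u][v]
        * math.cos((2 * x + 1) * u * math.pi / (2 * N))
        * math.cos((2 * y + 1) * v * math.pi / (2 * N))
        for u in range(N) for v in range(N))
        for y in range(N)] for x in range(N)]

random.seed(2)
block = [[random.randint(-128, 127) for _ in range(N)] for _ in range(N)]

# round trip with no quantization: error should be down at float epsilon
rt = idct2(dct2(block))
max_err = max(abs(rt[x][y] - block[x][y]) for x in range(N) for y in range(N))
assert max_err < 1e-9

# round trip through quantize/dequantize with step q = 1: each coefficient
# moves by at most 0.5, and the transform is orthonormal, so the spatial
# RMS error must be <= 0.5
q = 1
coef = dct2(block)
deq = [[math.floor(coef[u][v] / q + 0.5) * q for v in range(N)] for u in range(N)]
rt = idct2(deq)
rms = math.sqrt(sum((rt[x][y] - block[x][y]) ** 2
                    for x in range(N) for y in range(N)) / (N * N))
assert rms <= 0.5 + 1e-9
print("round trip ok, rms error %.4f" % rms)
```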
I've been thinking about a more difficult problem, which is : how do you deal with bugs in compression algorithms?
I don't mean bugs like "it crashes" or "the decompressor doesn't reproduce the original data" - those are the easy kind
of bugs and you just go fix them. I mean bugs that cause the compressor to not work the way you intended, and thus
not compress as much as it should.
The very hard thing about these bugs is that you can have them and not even know it; I'm sure I have a hundred of them
right now. Frequently they are tiny things like you have a less-than where you should have a less-or-equal.
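A toy illustration of how such a bug hides (a hypothetical LZ77-style parser, not any real compressor): both boundary choices below round-trip perfectly, so every correctness test passes, but the off-by-one version quietly drops the shortest matches and produces more tokens.

```python
def compress(data, min_match):
    # toy LZ77: greedy parse over a 64-byte window; tokens are
    # ('lit', byte) or ('match', distance, length)
    out = []
    i = 0
    while i < len(data):
        best_len, best_dist = 0, 0
        for j in range(max(0, i - 64), i):
            l = 0
            while i + l < len(data) and data[j + l] == data[i + l]:
                l += 1
            if l > best_len:
                best_len, best_dist = l, i - j
        if best_len >= min_match:      # the boundary the bug lives on
            out.append(('match', best_dist, best_len))
            i += best_len
        else:
            out.append(('lit', data[i]))
            i += 1
    return out

def decompress(tokens):
    out = bytearray()
    for tok in tokens:
        if tok[0] == 'lit':
            out.append(tok[1])
        else:
            _, dist, length = tok
            for _ in range(length):
                out.append(out[-dist])   # byte-by-byte handles overlap
    return bytes(out)

data = b'abcXabcYabcZ' * 20
good = compress(data, 3)   # intended boundary
bad = compress(data, 4)    # the same effect as a stray '<' for '<='
assert decompress(good) == data     # both are "correct"...
assert decompress(bad) == data
assert len(good) < len(bad)         # ...but one quietly compresses worse
```

No crash, no mismatch, nothing to see in a correctness test - only the token count gives it away, and only if you are actually looking at it.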
To avoid them really requires a level of care that most programmers never use. You have to be super vigilant.
Any time something surprises you or is a bit fishy, you can't just go "hmm that's weird, oh well, move on
to the next task". You have to stop and think and look into it. You have to gather data obsessively.
Any time you implement some new idea and it doesn't give you the compression win you expected, you can't just
say "oh well guess that didn't work", you have to treat it like a crash bug, and go set breakpoints and
watch your variables and make sure it really is doing what you think; and if it is, then you have to
gather stats about how often that code is hit and what the values are, and see where your expectations didn't match
reality.
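One cheap way to do that stat-gathering (a sketch; the decision point and names are invented for illustration) is to hang a Counter off every decision in the code path and dump it at the end of a run:

```python
from collections import Counter

stats = Counter()

def choose_token(best_len, min_match):
    # a decision point in a hypothetical compressor: count every outcome,
    # so "how often does this path actually fire?" has a real answer
    if best_len >= min_match:
        stats['match'] += 1
        stats['match_len_%d' % best_len] += 1
        return 'match'
    stats['literal'] += 1
    return 'literal'

# stand-in for a real parse; in practice this runs over your actual corpus
for best_len in [0, 0, 3, 5, 0, 2, 7]:
    choose_token(best_len, min_match=3)

for key, count in sorted(stats.items()):
    print(key, count)
```

If the new idea was supposed to fire on a third of the blocks and the counter says it fires almost never, you have found where your expectations and reality diverge.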
I've really been enjoying working on compression again. It's one of the most fun areas of programming
that exists. What makes it great :
1. Clear objective measure of success. You can measure size and speed (or whatever other criteria) and
see if you are doing well or not. (lossy compression is harder to measure this way).
2. Decompressors are one extreme of fun "hacker" programming; they have to be very lean; great decompressors
are like haikus, they're little pearls that you feel could not get any simpler without ruining them.
3. Compressors, on the other hand, can be big and slow, and you get to pull out all the big guns of algorithms
for optimization and searching and so on.
"09-26-12 - Optimizing Carefully and Compression Bugs"
3 Comments:
Re: optimization:
Many years ago, we're about to ship a 2D RTS, thousands of little animated sprites running around on a (pretty expensively calculated, pseudo-3D, kinda-textured) terrain. We do everything on the CPU, the target hardware is somewhere in the lowest triple-digits of Pentiums, and there's a mountain of complicated code that tries to find a good but still minimal set of rectangles to invalidate on the screen.
Somebody comes up with a really good optimization that should help a lot with that. Gut feeling says something like 2x; instead, the measured result is more like 5%. Oh well, we think, never trust your gut feeling, measure everything, 5% is still a win etc.
We happily ship.
A few weeks later we find that optimization really achieves a better than 2x improvement - while introducing a bug that results in all sprites being blitted twice.
September 28, 2012 at 11:00 PM
Currently doing exactly this. Got any opinions on Golomb-Rice coding and prefix coders generally?
September 29, 2012 at 1:45 AM
Currently doing exactly this. Got any opinions on Golomb-Rice coding and prefix coders generally?
September 29, 2012 at 1:46 AM