Googles appar
Huvudmeny

Post a Comment On: cbloom rants

"06-16-09 - Inverse Box Sampling"

9 Comments -

1 – 9 of 9
Anonymous Anonymous said...

My attempt at a closed form solution to the iterated bilerp, based on this general approach:

bilerp_dc.txt

I solved for both a full-width (sinc-like) thing which should match the iterated one, and then a simplified bilerp with a tunable parameter and a very local filter support (only need 3x3 DC values to do a given 8x8 block as 4 4x4 bilerps).

June 17, 2009 at 7:07 AM

Anonymous Anonymous said...

Er, correction, the simplified bilerp isn't tunable, I missed that there was another constraint on it.

June 17, 2009 at 7:22 AM

Anonymous Anonymous said...

Note that you can get a tuning factor by just having two solutions and weighting between them.

My writeup offers a "half-bilerp" approach for 3x3 taps on the DC; the best "half-bilerp" approach for 1x1 taps is the standard blocky DC reconstruction. You could weight between those, but obviously you'd have block artifacts.

So the thing might be to solve for the best approach using 5x5 taps on the DC, and then try a tuning factor between the 3x3 tap and the 5x5 tap solutions.

June 17, 2009 at 7:52 AM

Blogger ryg said...

Like a lot of image processing/computer vision problems, this one is underdetermined in an "interesting" way, and different ways of regularizing it correspond to very different algorithms.

Your approach (deconvolution) is at one extreme, with no particular assumption of how a "typical" reconstructed image should look, but very strong assumptions in the model (linearity of the reconstruction process, bandlimitedness of the box-filtered signal so deconvolution can be applied). The other extreme would be just having a huge library of "typical" images and trying to find matching patches subject to some constraints to the transitions work, like in this paper. This is equivalent to learning a statistical model of typical images and using the maximum a posteriori estimate with the downsampled image being your observation, and the main problem is that there's just too many "plausible" images to ever learn a representative sample.

...which is where slightly stronger model assumptions would help. Add some basic color correction, for example; it needs to have a very limited range since otherwise it'd introduce too much ambiguity when trying to match the lowres image. Example-driven texture synthesis algorithms could be killer here, but this is basically the inverse problem - given this repeating structure, what is the most likely pattern to have generated it, and where have I seen it before?

Damn, all very interesting, but I wouldn't know where to start :)

June 17, 2009 at 12:12 PM

Blogger cbloom said...

"The other extreme would be just having a huge library of "typical" images and trying to find matching patches subject to some constraints to the transitions work"

Yes, I've written and worked on that problem as well. It's slightly different because they generally don't require that down-sampling reproduces the original low res. It's usually called "super resolution" and there are now a lot of papers on it (better than that one).

It's definitely something on my "wish list" that I would love to work on if I had a bunch of free time. I was working on it a bit at the end of my unemployment but I got distracted by the job search and whatnot. It's a very hard problem.

BTW I believe that any linear reconstruction approach to these kinds of problems is wrong and doomed to suck.

June 17, 2009 at 12:19 PM

Blogger cbloom said...

BTW Sean's limited range linear reconstruction basis (at the end of bilerp_dc.txt) is equivalent to using linear "mexican hat" basis functions.

http://en.wikipedia.org/wiki/File:Wavelet_-_Mex_Hat.png

Obviously you can do quadratic or cubic or whatever mexican hats, but it's not a very tasty solution because they ring too much.

June 17, 2009 at 12:20 PM

Blogger cbloom said...

This is an old post on super res (slightly different problem) :

http://cbloomrants.blogspot.com/2007/11/11-14-07-1.html

I think I have better approaches & understanding now as well as finding a bunch of papers.

Kwang In Kim is a good place to start :

http://www.kyb.mpg.de/~kimki

June 17, 2009 at 12:26 PM

Anonymous Anonymous said...

"BTW I believe that any linear reconstruction approach to these kinds of problems is wrong and doomed to suck."

Maybe, but everything else seems way too expensive to actually compute, so I focused on linear just because it seems computationally feasible, and better than DC-nearest-neighbor.
Are there actually other options on the table? You appeared to write all your attempts off as too expensive.

Also, I suspect any solution to this problem is either going to be ringy or blocky; my guess it's inherent to the problem. This applies to this solution, to hats, etc. Basically, you're trying to pack a certain amount of energy into the block; either it's well-distributed over the block (blocky), or you try to keep the edges nice and then it's overconcentred in the center (ringy).

Also, the iterative solution seemed pretty reasonable to me, was it actually too ringy? My mexican hat is just a truncation of the iterative solution with the necessary cleanup to make the DC work, so I'm not sure how much worse it would be.

Second, if every 8x8 block is, in the end, independent (the only hard constraint is the average DC), then you can use different solutions for different blocks. So if the deconvolution approach is too slow, but the mexican hat is too ringy, then if you end up in a place where 'oh well, i'll just use nearest neighbor aka DC replication', you could still do mexican hat, have a 'ring detector' (color falls outside of convex hull of nearby elements) and substitute the constant DC blocks in those places.

June 17, 2009 at 1:25 PM

Blogger cbloom said...

I don't mean "linear" as in whether the basis function is piecewise linear or not - I mean linear in the "linear operator" sense - *any* basis function based approach will not be awesome.

The simplest way to get a "Non-linear" approach is to have something like a few different linear solutions, and pick some blend between them that's based on the local neighborhood. For example one obvious thing to do which I tried is to run an edge detector on the DC image and do different things when you see hard edges vs. smooth areas.

June 17, 2009 at 1:37 PM

You can use some HTML tags, such as <b>, <i>, <a>

This blog does not allow anonymous comments.

Comment moderation has been enabled. All comments must be approved by the blog author.

You will be asked to sign in after submitting your comment.