Post a Comment On: cbloom rants

"01-17-09 - Float to Int"

4 Comments -

Anonymous Anonymous said...

I assume this whole thing has been invented multiple times, but it's nice to see that Herf's stereopsis article references Hecker's article, which cites me for the '1.5' in the magic number. Yay me.

(I remember figuring that out and posting it to a mailing list he and I and Terje were on back when I was at Looking Glass, but I can't actually nail down precisely when it was.)

January 18, 2009 at 11:45 PM

Blogger Unknown said...

Has no one ever noticed that it doesn't work?

Try floor 0.9999999999999990008
sub doublemagicroundeps
0.50000001499999902066
add floatutil_xs_doublemagic
6755399441055745
in binary
0100001100111000000000000000000000000000000000000000000000000001
keep lower bits
1 (WRONG)

The correct floor 0.9999999999999990008
sub 0.25
0.7499999999999990008
add floatutil_xs_doublemagic / 2
3377699720527872.5
in binary
0100001100101000000000000000000000000000000000000000000000000001
put lower bits in int
1
shift right 1
0 (correct)

One More Time

Try floor 1.9999999999999988898
sub doublemagicroundeps
1.5000000149999990207
add floatutil_xs_doublemagic
6755399441055746
in binary
0100001100111000000000000000000000000000000000000000000000000010
keep lower bits
2 (WRONG)

The correct floor 1.9999999999999988898
sub 0.25
1.7499999999999988898
add floatutil_xs_doublemagic / 2
3377699720527873.5
in binary
0100001100101000000000000000000000000000000000000000000000000011
put lower bits in int
2
shift right 1
1 (correct)
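
The two traces above can be reproduced with a short C++ sketch. The constants are assumptions inferred from the arithmetic in the traces (6755399441055744 for floatutil_xs_doublemagic, 0.5 - 1.5e-8 for doublemagicroundeps), and the function names are hypothetical, so treat this as an illustration rather than the original ftoi_floor.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Assumed constants, inferred from the traces above (not copied from the post).
const double kDoubleMagic = 6755399441055744.0;  // 1.5 * 2^52
const double kRoundEps    = 0.5 - 1.5e-8;

// Low 32 bits of the double's bit pattern, reinterpreted as a signed int.
inline std::int32_t low_bits(double d) {
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);
    return static_cast<std::int32_t>(static_cast<std::uint32_t>(bits));
}

// Epsilon-based floor: subtract just under 0.5, then round via the magic add.
inline std::int32_t floor_eps(double v) {
    volatile double t = v - kRoundEps;  // volatile forces rounding to double
    t = t + kDoubleMagic;
    return low_bits(t);
}

// Quarter-based floor: with magic / 2 one ulp is 0.5, so the low bits count
// halves; the final shift drops the half bit. Assumes an arithmetic right
// shift for negative values (guaranteed from C++20, common before that).
inline std::int32_t floor_quarter(double v) {
    volatile double t = v - 0.25;
    t = t + kDoubleMagic / 2;
    return low_bits(t) >> 1;
}

// Usage: the two inputs from the traces.
inline void demo() {
    assert(floor_eps(0.9999999999999990008) == 1);      // off by one
    assert(floor_quarter(0.9999999999999990008) == 0);  // expected floor
    assert(floor_eps(1.9999999999999988898) == 2);      // off by one
    assert(floor_quarter(1.9999999999999988898) == 1);  // expected floor
}
```

The quarter variant spends one bit of integer range on the extra half bit, which is the price of not needing an epsilon.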

August 12, 2010 at 2:57 PM

Blogger cbloom said...

Sigh sigh sigh la la la

This only works on single-precision *floats*. The closest number to 1.0 that you can have in float is

0.99999994

and it works just fine
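
One way to check the single-precision claim is to take the largest float below an integer with std::nextafter and push it through the epsilon-based floor. This is only a sketch: the constants are assumptions inferred from the arithmetic earlier in this thread, and floor_eps and float_below are hypothetical names, not the functions from the post.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Assumed constants, inferred from the arithmetic earlier in the thread.
const double kDoubleMagic = 6755399441055744.0;  // 1.5 * 2^52
const double kRoundEps    = 0.5 - 1.5e-8;

// Epsilon-based floor on a double, returning the low 32 bits of the result.
inline std::int32_t floor_eps(double v) {
    volatile double t = v - kRoundEps;  // volatile forces rounding to double
    t = t + kDoubleMagic;
    const double r = t;
    std::uint64_t bits;
    std::memcpy(&bits, &r, sizeof bits);
    return static_cast<std::int32_t>(static_cast<std::uint32_t>(bits));
}

// Largest float strictly below a positive float x.
inline float float_below(float x) {
    return std::nextafter(x, 0.0f);
}

// Usage: floats just below an integer floor to the integer below, because
// the gap to the integer (at least 2^-24 for x >= 1) exceeds 1.5e-8.
inline void demo() {
    assert(floor_eps(float_below(1.0f)) == 0);
    assert(floor_eps(float_below(2.0f)) == 1);
    assert(floor_eps(1.0f) == 1);
}
```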

August 12, 2010 at 3:57 PM

Blogger Unknown said...

Touché, you are right. Since ftoi_floor takes a double, I figured you wanted full-range support; I may not have read the post thoroughly enough. That said, if this runs on x86 hardware and your compiler supports 80-bit floats, you should probably use long double; otherwise every operation must make a round trip to memory for truncation. Something we see with GCC's optimizer is that all float math is done internally in 80-bit floats (on the x87 FPU), so all intermediate math must either use the 80-bit version of the magic, or you compile with -mfpmath=sse and use the correct magic for each type. Fun times.
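
To make "the correct magic for each type" concrete, here is a minimal sketch that derives the round-to-integer magic from std::numeric_limits. The helper names magic_for and eval_method are hypothetical; the formula 1.5 * 2^(digits - 1) is the value at which one ulp of the type is exactly 1, which matches the double constant 6755399441055744 used earlier in this thread.

```cpp
#include <cassert>
#include <cfloat>
#include <cmath>
#include <limits>

// Hypothetical helper: 1.5 * 2^(digits - 1) for the given floating type.
template <typename T>
T magic_for() {
    return std::ldexp(T(1.5), std::numeric_limits<T>::digits - 1);
}

// FLT_EVAL_METHOD (from <cfloat>) reports how intermediates are evaluated:
// 0 means each type is evaluated in its own precision, 2 means everything
// is evaluated as long double (typical for x87 code generation).
inline int eval_method() {
    return FLT_EVAL_METHOD;
}

// Usage: the double and float magics.
inline void demo() {
    assert(magic_for<double>() == 6755399441055744.0);  // 1.5 * 2^52
    assert(magic_for<float>() == 12582912.0f);          // 1.5 * 2^23
}
```

If eval_method() returns 2, the float and double magics cannot be trusted for intermediate results, which is the situation described above.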

I mainly came at the problem from doing rounding (every kind) in SSE with full-range support for each type, and have learned more than I ever wanted to. The magic has to have the reverse sign, or bad rounding occurs in the range [magic/3, magic*4/3], plus overflow in the range [maxval - magic, maxval]. Sigh. Modulus with Euclidean division, wrapping, and mirroring were cake after figuring all this out.

My attempt:

// should be a constant or result in a constant and an fscale instruction
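// NOTE: the template argument of numeric_limits was lost in the original comment; <FloatType> is an assumption
long double magic(int precision) { return (1.5 * std::pow((long double)2, std::numeric_limits<FloatType>::digits - 2)) * std::pow((long double)2, -precision); }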

// the sse version of this ((test & -0.0) ^ value)
FloatType toggleSign(FloatType value, FloatType test) { return test < 0 ? -value : value; }

FloatType roundToEven(FloatType value, FloatType magic)
{
    magic = toggleSign(magic * 2, value);
    return (value - magic) + magic;
}

FloatType roundDirect(FloatType value, FloatType magic, bool roundUp, bool roundAtHalf, bool asymmetric)
{
    FloatType quarter = 0.25;
    // symmetric modes mirror the offsets around zero by giving them the sign of value
    quarter = asymmetric ? quarter : toggleSign(quarter, value);
    magic = asymmetric ? magic : toggleSign(magic, value);
    value = (roundUp ? value + quarter : value - quarter) - magic;
    return (roundUp ^ roundAtHalf ? value + quarter : value - quarter) + magic;
}

FloatType ceil(FloatType value) { return roundDirect(value, magic(0), true, false, true); }
FloatType floor(FloatType value) { return roundDirect(value, magic(0), false, false, true); }
FloatType trunc(FloatType value) { return roundDirect(value, magic(0), false, false, false); }
FloatType rndUp(FloatType value) { return roundDirect(value, magic(0), true, true, true); }
FloatType rndDn(FloatType value) { return roundDirect(value, magic(0), false, true, true); }
FloatType rndAway(FloatType value) { return roundDirect(value, magic(0), true, true, false); }
FloatType rndTo0(FloatType value) { return roundDirect(value, magic(0), false, true, false); }
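
For anyone who wants to try this, here is a self-contained sketch of the same idea specialised to double and checked against the standard library. It is an adaptation under assumptions, not the code above verbatim: FloatType is fixed to double, the magic is built with std::ldexp instead of std::pow, the names are hypothetical, and volatile temporaries are used to force each step to round to double.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// 1.5 * 2^(digits - 2): near this value one ulp of a double is 0.5.
inline double magic() {
    return std::ldexp(1.5, std::numeric_limits<double>::digits - 2);
}

// Returns value with the sign flipped when test is negative.
inline double toggle_sign(double value, double test) {
    return test < 0 ? -value : value;
}

// Round to nearest, ties to even, using twice the magic (ulp of 1).
inline double round_to_even(double value) {
    const double m = toggle_sign(magic() * 2, value);
    volatile double t = value - m;
    t = t + m;
    return t;
}

// Directed rounding in two quarter steps around the magic subtraction.
inline double round_direct(double value, bool roundUp, bool roundAtHalf,
                           bool asymmetric) {
    const double q = asymmetric ? 0.25 : toggle_sign(0.25, value);
    const double m = asymmetric ? magic() : toggle_sign(magic(), value);
    volatile double t = roundUp ? value + q : value - q;
    t = t - m;
    t = (roundUp != roundAtHalf) ? t + q : t - q;
    t = t + m;
    return t;
}

inline double floor_magic(double v) { return round_direct(v, false, false, true); }
inline double ceil_magic(double v)  { return round_direct(v, true, false, true); }
inline double trunc_magic(double v) { return round_direct(v, false, false, false); }

// True when all three wrappers agree with the standard library for v.
inline bool agrees_with_std(double v) {
    return floor_magic(v) == std::floor(v) &&
           ceil_magic(v) == std::ceil(v) &&
           trunc_magic(v) == std::trunc(v);
}

// Usage: the two problem inputs from earlier in the thread, plus a tie.
inline void self_check() {
    assert(agrees_with_std(0.9999999999999990008));
    assert(agrees_with_std(1.9999999999999988898));
    assert(round_to_even(2.5) == 2.0);
}
```

The sketch only holds for magnitudes well below the magic and assumes double arithmetic is not evaluated in extended precision and that the compiler is not allowed to reassociate (no -ffast-math).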

OK, that was way more than I should have written, but maybe it will save someone else from my pain.

August 13, 2010 at 6:42 AM
