Thanks for the idea. Quite implementable on other platforms, too.
August 14, 2010 at 9:51 AM
On the SPU I'm now making heavy use of a primitive op I call "memset16". By this I don't mean that it must be 16-byte aligned, but
rather that it memsets 16-byte patterns, not individual bytes :
// count is in bytes
void rrSPU_MemSet16(void * ptr,qword pattern,int count)
{
    qword * RADRESTRICT p = (qword *) ptr;
    char * end = ((char *)ptr + count );
    // store one 16-byte pattern per iteration until we reach the end
    while ( (char *)p < end )
    {
        *p++ = pattern;
    }
}
(and yes I know this could be faster; this is the simple version for readability).
The interesting thing has been that taking a 16-byte pattern as input actually makes it way more useful than normal byte memset. I can now
also memset shorts and words, floats, doubles, and vectors! So this is now the way I do any array assignment when a run of consecutive
elements all get the same value. e.g. instead of doing :
float fval = param;
float array[16];
for(int i=0;i<16;i++) array[i] = fval;
you do :
float array[16];
qword pattern = si_from_float(fval);
rrSPU_MemSet16(array,pattern,sizeof(array));
In fact it's been so handy that I'd like to have it on other platforms, at least up to U32.
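For what it's worth, here is a rough sketch of what a portable version might look like in plain C. None of these names come from the SPU code above: Pattern16, pattern16_from_u32, pattern16_from_float and memset16_portable are all made up for illustration, and the sketch assumes a 4-byte float and a byte count that is a multiple of 16. It uses memcpy for the stores, so it makes no alignment assumptions and is not tuned for speed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical portable stand-in for the SPU qword: 16 raw bytes. */
typedef struct { unsigned char bytes[16]; } Pattern16;

/* Build a 16-byte pattern by repeating one 32-bit value four times. */
Pattern16 pattern16_from_u32(uint32_t value)
{
    Pattern16 pat;
    for (size_t i = 0; i < 4; i++)
        memcpy(pat.bytes + 4 * i, &value, sizeof value);
    return pat;
}

/* Same idea for a float; assumes float is 4 bytes wide. */
Pattern16 pattern16_from_float(float value)
{
    uint32_t bits;
    assert(sizeof(float) == sizeof(uint32_t));
    memcpy(&bits, &value, sizeof bits);
    return pattern16_from_u32(bits);
}

/* Fill `count` bytes at ptr with the repeated 16-byte pattern.
   count is in bytes and is assumed to be a multiple of 16. */
void memset16_portable(void *ptr, Pattern16 pattern, size_t count)
{
    unsigned char *p = (unsigned char *)ptr;
    assert(count % 16 == 0);
    for (size_t offset = 0; offset < count; offset += 16)
        memcpy(p + offset, pattern.bytes, 16);
}
```

Usage mirrors the SPU version: build the pattern once, then call memset16_portable(array, pattern16_from_float(fval), sizeof(array)). Patterns for U8 or U16 values could be built the same way by repeating the value 16 or 8 times.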
"08-10-10 - A small note on memset16"
1 Comment -