
Post a Comment On: cbloom rants

"09-12-10 - The defficiency of Windows' multi-processor scheduler"

12 Comments
Blogger Jeff Roberts said...

We just got context switches rendering in Telemetry - it's amazing how many bad decisions it makes...

September 12, 2010 at 9:08 PM

Blogger Carsten Orthbandt said...

There's another thing timeBeginPeriod() does that is far more important in my experience: It affects the resolution of timers and time functions (e.g. GetTickCount() and timeGetTime()).
These are important fallbacks given the absence of other reliable time sources (RDTSC isn't one, as you probably know). But the default 15 ms (desktop) or 10 ms (server) resolution is way too coarse. So you often end up calling timeBeginPeriod(1) just to get better time accuracy.
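A minimal sketch of the pattern being described (Windows-only; link against winmm.lib): raise the timer resolution around timing-sensitive code, and pair every timeBeginPeriod with a timeEndPeriod, since the setting is global and reference-counted.

```c
#include <windows.h>
#include <mmsystem.h>   /* timeBeginPeriod / timeGetTime; link winmm.lib */
#include <stdio.h>

int main(void)
{
    timeBeginPeriod(1);               /* request 1 ms timer resolution */

    DWORD t0 = timeGetTime();         /* now ~1 ms accurate instead of 10-15 ms */
    Sleep(5);
    DWORD t1 = timeGetTime();
    printf("elapsed: %lu ms\n", (unsigned long)(t1 - t0));

    timeEndPeriod(1);                 /* must match the timeBeginPeriod call */
    return 0;
}
```

Note that, as the post discusses, this also changes the scheduler quantum for the whole machine, not just your process.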

September 12, 2010 at 11:55 PM

Blogger cbloom said...

Yeah, that's one of the reasons you will always see a clock res of 1 milli - somebody has done it to get more precision in their timer.

It's fucking retarded that the timer function call resolution is tied to the OS scheduler, BTW.

BTW see also

http://cbloomrants.blogspot.com/2009/04/04-23-09-telling-time.html

http://cbloomrants.blogspot.com/2009/03/03-02-09-sleep-sucks-and-vsync-woes.html

September 13, 2010 at 12:05 AM

Blogger Shelwien said...

Do you have any hints for this:
http://stackoverflow.com/questions/3280197/whats-the-best-way-to-demonstrate-the-effect-of-affinity-setting
?

September 13, 2010 at 2:53 PM

Blogger cbloom said...

I think setting affinity for a single threaded app like that is a *very* bad idea.

It might improve performance a little bit in some cases, but in other cases it will *cripple* overall machine performance.

Imagine if everyone was setting cpu affinity to the same core. Basically you've killed my multiprocessing.

The only time I use affinity is when I am making a bunch of threads myself and I want to control how they distribute onto cores.

For example in a video game you might put your main game loop on core 0, your threaded physics work on core 1, and your background loader on core 2.

Even that is not actually ideal, but you can't really do much better without more feedback from the OS which doesn't exist.
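A sketch of that layout (Windows-only; the thread functions are hypothetical stand-ins) - note this is the "threads I created myself" case, not setting affinity on a single-threaded app:

```c
#include <windows.h>

static DWORD WINAPI GameLoop(LPVOID p) { /* ... main game loop ... */ return 0; }
static DWORD WINAPI Physics(LPVOID p)  { /* ... physics work ... */   return 0; }
static DWORD WINAPI Loader(LPVOID p)   { /* ... background IO ... */  return 0; }

int main(void)
{
    HANDLE game    = CreateThread(NULL, 0, GameLoop, NULL, 0, NULL);
    HANDLE physics = CreateThread(NULL, 0, Physics,  NULL, 0, NULL);
    HANDLE loader  = CreateThread(NULL, 0, Loader,   NULL, 0, NULL);

    /* Affinity masks are bitfields: bit N set = the thread may run on core N. */
    SetThreadAffinityMask(game,    1u << 0);   /* main loop on core 0 */
    SetThreadAffinityMask(physics, 1u << 1);   /* physics on core 1 */
    SetThreadAffinityMask(loader,  1u << 2);   /* background loader on core 2 */

    WaitForSingleObject(game, INFINITE);
    return 0;
}
```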

September 13, 2010 at 4:22 PM

Blogger cbloom said...

The other big case is my Worklet work-stealing system, where I intentionally make one thread per core and lock each thread to its core. Then work stealing automatically redistributes work to the cores that have available time.

September 13, 2010 at 4:23 PM

Blogger Shelwien said...

No, the question there was how to demonstrate that rescheduling a thread to another core can hurt performance - with something simpler than a compressor.
Also, I still don't know why Windows does that even on a system with a single non-idle process.
And another question was how to lock a thread to the core it was originally allocated to - without Vista+ APIs.
Anyway, with that compressor I ended up making it a command-line option.

September 13, 2010 at 4:46 PM

Blogger cbloom said...

Yeah I understood what the question was and I'm telling you it's not something you should be doing. It's very bad practice.

September 13, 2010 at 5:50 PM

Blogger Jeff Roberts said...

So, the problem with work-stealing style designs for me is: how do you deal with the last thread?

Like, Bink breaks up the compression into 8x8 blocks - so, if you start 24 threads, they all start working away at a very fine grain.

Eventually, some other thread on the system runs and blocks, say, thread 17. The rest of my threads pitch in and finish the compression, but thread 17 is still sitting on compressing that *one* little 8x8 block.

I can still do that block myself, but that thread is just sitting there waiting to run, and possibly overwriting some memory when it finally starts back up.

So, I have to wait for that last thread just to end before I can continue processing.

Super-annoyingly, Windows could just move the thread that interrupted me to another core (since they are all idle), and then everything will finish, but I can't force that to happen.

Grrrr.

September 14, 2010 at 1:48 AM

Blogger cbloom said...

Yeah that's the issue Tim Farrar writes about.

I think that could be addressed though.

Duplicating the work and just doing it on the main thread is one option.

Another is if the main thread blocks on the async work and there are only a few left, kick those threads to high priority to make sure they get a chance to finish up.

Also with work stealers the problem can only happen with the very last task on each worker, so it is really greatly diminished.
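The priority-boost idea might look roughly like this (Windows-only; TasksRemaining, WorkerIsBusy, and workerHandle are hypothetical names for whatever your job system exposes):

```c
/* When the main thread is about to block on the async work and only a
   few tasks remain, boost the still-busy workers so the scheduler gets
   them done promptly. */
if (TasksRemaining() <= kFewTasksLeft) {
    for (int i = 0; i < numWorkers; i++)
        if (WorkerIsBusy(i))
            SetThreadPriority(workerHandle[i], THREAD_PRIORITY_HIGHEST);
}
WaitForMultipleObjects(numWorkers, workerHandle, TRUE, INFINITE);
```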

September 14, 2010 at 11:18 AM

Anonymous Anonymous said...

"Super-annoyingly, Windows could just move the thread that interrupted me to another core (since they are all idle), and then everything will finish, but I can't force that to happen."

Or it could just move the thread that's blocked to another core! That's what it wants to do, but you've hosed it by setting the thread affinity.

What happens if you set each thread's ideal processor to its current affinity setting, and don't set affinity? Or what if you set affinity for pairs of threads to pairs of cores, or set each thread's ideal processor to X and its affinity to X and X+1?
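The "soft" alternative being suggested, sketched (Windows-only; worker[] is a hypothetical array of thread handles): SetThreadIdealProcessor is only a hint, so the scheduler prefers that core but can still migrate the thread, unlike a hard affinity mask.

```c
/* Variant 1: ideal processor only - a preference, not a constraint. */
SetThreadIdealProcessor(worker[i], i);

/* Variant 2: ideal core i, but allow migration to core i+1 as well. */
SetThreadIdealProcessor(worker[i], i);
SetThreadAffinityMask(worker[i], (1u << i) | (1u << (i + 1)));
```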

September 15, 2010 at 3:40 AM

Blogger cbloom said...

For my reference, this is Farrar's post on this issue:

http://farrarfocus.blogspot.com/2010/01/pc-cpu-task-parallelism-limits.html

June 7, 2011 at 2:38 PM
