Googles appar
Huvudmeny

Post a Comment On: cbloom rants

"04-06-09 - The Work Dispatcher"

7 Comments -

1 – 7 of 7
Blogger won3d said...

So how is the dependency graph represented in the library, and how is it expressed in client code?

April 7, 2009 at 12:23 PM

Blogger cbloom said...

Each Worklet has an Id, and each Worklet can depend on a list of other Ids. Once a Worklet is fired you can't change its dependencies.

This creates a tree structure of dependencies. You know you can always run the leaves.

In practice I don't actually build a tree and there are some little clever bits to make dependency checking very fast.

It does still have overhead because you need an Id <-> Worklet map. Because of that there's also a fast path for Worklets that don't depend on anyone and that noone else can depend on, so they just get fired and don't participate in the map.

April 7, 2009 at 12:43 PM

Blogger John W. Ratcliff said...

Charles,

Great post. Is this something you are planning on open sourcing? I have been waiting to add such features to my job swarm until I need them and, unfortunately, I haven't been writing any code lately that needs it.

The batching option is something that is a really common use case and I will provide it the next time I do a source update.

John

April 7, 2009 at 1:53 PM

Blogger cbloom said...

No, this is probably not going in the open source. This is work for RAD that will be in my product Oodle, and a lot of the black art of making it really nice is tweaking the details of dealing with threads and scheduling on all the different platforms and operating systems.

For example for SPUs the internal function will be very different; it will probably just be forward-dispatched, not work-stealing.

I did open source the lock-free primitives that I did for this, which should make it pretty easy to roll your own.

April 7, 2009 at 2:02 PM

Blogger cbloom said...

BTW work-stealing self balancing is also great if your worklets take variable amounts of time that's hard to predict.

Consider for example an SPU or LRB environment where you have workers that are 100% under your control so you don't have to worry about them losing CPU time to other threads.

Say you want to fire 80 worklets to 8 workers. A forward dispatcher might just push 10 items onto each worker. That's fine if you know that all the work items take the same time, but if they take variable amounts of time it can be arbitrarily bad.

For example if one item takes time 10*T , and the other 79 all take time T, your total time will be 19*T when it only needs to be 11.1*T

Obviously to really do a forward dispatcher right you would have to have a dispatcher thread running all the time and watching the workers and redistributing the work to rebalance things. Basically you wind up writing a whole scheduler like an OS which is a pretty scary piece of code to write. Work stealing basically does a decent job of this for you for free.

April 7, 2009 at 2:07 PM

Blogger Assen said...

I hope you get email about thread comment necromancy...

What do you do once a thread is idle, and it can't find anything to steal? Basically the system has emptied itself of worklets - how do you sleep your worker threads so they wake efficiently when there's work to do?

I can't think of anything clever.
You can put them all to sleep on their own events, and signal each event from the main thread when you add the first worklet onto a worker's thread currently empty queue; but then you have to be careful to spread work, eg to assign it in round-robin order. This feels a bit wrong to me - the beauty of work-stealing should allow me to dump all the work on the first worker, and let it spread automatically.

September 17, 2011 at 4:53 PM

Blogger cbloom said...

By far the simplest and most elegant way is just to use a counting semaphore.

Each time you push a work, "Post" (inc) the semaphore. Each time you pop a work, "Wait" (dec) the semaphore.

Then if there is no work, the dec is a sleep. When you push new work, the right number of guys get woken.

In practice you want to use something like "fastsemaphore" that I posted previously so that only the inc/decs which cause thread wake/sleeps are kernel calls, and other cases are just a lockfree inc/dec.

As long as your workers only ever sleep due to the work queue being empty, this is a great and very clean solution.

In my implementation, the worker threads can sleep for other reasons, so I have a more complicated solution.

Also I believe I posted somewhere about the fact that you may want to spin a bit when your fastsemaphore detects that it is waking or sleeping a thread, because if the action can get handled very soon without a thread transition that is preferable. Obviously this depends on usage pattern, etc.

September 17, 2011 at 5:50 PM

You can use some HTML tags, such as <b>, <i>, <a>

This blog does not allow anonymous comments.

Comment moderation has been enabled. All comments must be approved by the blog author.

You will be asked to sign in after submitting your comment.