Eric's Research Logbook: The Politics of Task and Thread Scheduling and Edge-Triggered Events

I wrote earlier about tasks versus threads. The basic idea was that threads are a unit of execution and tasks are a unit of executability. Put another way, threads represent separate time streams, whereas tasks represent separable control flows. The implication is that priorities and the epoch property matter only when scheduling threads; tasks, by contrast, can be scheduled using the classic, dumb FIFO (or LIFO) methods.

The truth is more complex. I'd like to do event handling in an edge-triggered, continuation-handled manner. That is, you set a continuation to be executed in response to some particular kind of event (for instance, I/O completion, or availability of data on a socket). There are many reasons for this, which I'm not going to discuss in full detail here. The major ones are: 1) it's simple, natural, and close to what hardware actually gives you (even if OS APIs make it quite hard); 2) it avoids the locking that would be necessary in a polling (or should I say level-triggered) style, yet it does not prevent you from implementing such a style on top of it; 3) it provides an equally good interface for long-running compute-bound tasks (particularly FFI calls); and 4) it provides a means to identify control flows originating from system events, which is important, as I'll discuss momentarily.
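To make the edge-triggered, continuation-handled style concrete, here is a minimal sketch in Python. The `EventSource` class and its `on_next` and `fire` methods are hypothetical names invented for this illustration, not part of any real API or of the system described here. The point it shows is that a continuation is consumed by the event it was registered for, so a second event finds nothing to run unless someone registers again.

```python
from collections import defaultdict


class EventSource:
    """Hypothetical edge-triggered source: a continuation fires once per registration."""

    def __init__(self):
        # Event kind -> continuations waiting for the next occurrence.
        self._waiting = defaultdict(list)

    def on_next(self, kind, continuation):
        """Register a continuation to run on the next event of this kind."""
        self._waiting[kind].append(continuation)

    def fire(self, kind, payload):
        """Deliver one event; return how many continuations were run."""
        # Detach the list first, so a continuation that re-registers itself
        # is not run a second time for the same event.
        pending = self._waiting.pop(kind, [])
        for continuation in pending:
            continuation(payload)
        return len(pending)


received = []
source = EventSource()
source.on_next("readable", received.append)

first = source.fire("readable", b"hello")     # the continuation runs
second = source.fire("readable", b"ignored")  # nothing is registered any more
```

A level-triggered style can still be layered on top of this: a continuation that wants every event simply calls `on_next` again from inside its own body.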

The lock-free scheduler system I designed has a carefully designed API that facilitates a kind of voluntary context switching used in conjunction with mailbox communication. Changes in thread state aren't acknowledged until a safepoint, which enables a pattern of the form "set myself to sleep status", "change some part of state", "safepoint". For example, in monitor-style programming, I'd mark myself as sleeping, append myself to the condition variable's listeners, unlock the mutex, and then execute a safepoint. If another thread manages to catch me before the safepoint and wake me back up, then I never go to sleep. (Obviously, in the real world, I'd want to spin-wait for about the expected time cost of executing a safepoint, sleeping, and waking up again before actually going to sleep.)
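The "mark sleeping, change state, safepoint" pattern can be sketched as a small single-threaded simulation. Everything here is an assumption made for illustration: `SchedThread`, `Condition`, `wait`, and the `before_safepoint` hook are hypothetical names, and the sketch uses plain attribute writes rather than the atomic operations a real lock-free scheduler would need. The hook exists only so the demo can deterministically place a wakeup between the unlock and the safepoint.

```python
import threading
from enum import Enum


class Status(Enum):
    RUNNING = "running"
    SLEEPING = "sleeping"


class SchedThread:
    """Hypothetical scheduler-side record for one thread."""

    def __init__(self, name):
        self.name = name
        self.requested = Status.RUNNING     # most recently requested status
        self.acknowledged = Status.RUNNING  # status as of the last safepoint

    def mark_sleeping(self):
        self.requested = Status.SLEEPING

    def wake(self):
        self.requested = Status.RUNNING

    def safepoint(self):
        """Acknowledge the requested status; True means the thread really sleeps."""
        self.acknowledged = self.requested
        return self.acknowledged is Status.SLEEPING


class Condition:
    """Hypothetical condition variable: just a list of waiting threads."""

    def __init__(self):
        self.listeners = []

    def signal(self):
        if self.listeners:
            self.listeners.pop(0).wake()


def wait(thread, cond, mutex, before_safepoint=None):
    """Monitor-style wait; the caller must already hold `mutex`."""
    thread.mark_sleeping()            # 1. set myself to sleep status
    cond.listeners.append(thread)     # 2. change some part of state
    mutex.release()
    if before_safepoint is not None:  # demo-only hook to simulate a race
        before_safepoint()
    return thread.safepoint()         # 3. only now does the status take effect


mutex = threading.Lock()

quiet = Condition()
sleeper = SchedThread("sleeper")
mutex.acquire()
slept = wait(sleeper, quiet, mutex)  # nobody signals, so the sleep is acknowledged

busy = Condition()
racer = SchedThread("racer")
mutex.acquire()
raced = wait(racer, busy, mutex, before_safepoint=busy.signal)  # woken before the safepoint
```

In the first call the thread reaches its safepoint still marked as sleeping, so it would be descheduled. In the second, the signal lands before the safepoint, the requested status flips back to running, and the thread never sleeps at all.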

This all ties nicely into edge-triggered events. Tasks are really just continuations, which fits the idea of registering continuations to be run in response to events. However, you don't want to hijack the I/O thread (or interrupt handler) to do lots of computation. You want the I/O system to keep handling events as they arrive. One could just take the continuation and shove it into the set of runnable tasks. However, tasks have no notion of time or priority, and we'd like to keep it that way, or as close to it as possible (since such notions come with a cost).

Ideally, tasks responding to I/O should modify data structures, possibly schedule some I/O, and then check out. The problem is that this could involve potentially time-consuming data structure operations, or operations on lock-free structures which, strictly speaking, aren't guaranteed to ever finish. Moreover, we'd like to harness parallelism for the completion of I/O tasks too.

The answer may lie in the safepoint mechanism. We can let a task run for a set number of safepoints (or perhaps for a set amount of time), then kick it off the I/O timestream and onto the normal execution system. Ideally, this allows short-running responses to do their jobs without risking that long-running responses hijack the I/O timestream. This requires us to split the task execution system into a time-sensitive part and a time-insensitive part, but that may not be a bad thing.
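A rough sketch of the safepoint-budget idea follows, under some loudly stated assumptions: tasks are modelled as Python generators, each `yield` stands in for a safepoint, and `run_on_io_timestream`, `drain`, and `response` are hypothetical names made up for this example. It is not the design of the actual system, only one way the demotion step could look.

```python
from collections import deque


def run_on_io_timestream(task, budget, normal_queue):
    """Run a generator task for at most `budget` safepoints (yields).

    Returns True if the task finished on the I/O timestream, or False if it
    used up its budget and was moved to the normal FIFO queue.
    """
    for _ in range(budget):
        try:
            next(task)           # run until the next safepoint
        except StopIteration:
            return True          # finished within its budget
    normal_queue.append(task)    # too long-running: demote it
    return False


def drain(normal_queue):
    """Time-insensitive side: plain FIFO, each task runs to completion."""
    while normal_queue:
        task = normal_queue.popleft()
        for _ in task:
            pass


log = []


def response(name, steps):
    """Hypothetical I/O response doing `steps` units of work."""
    for i in range(steps):
        log.append((name, i))
        yield                    # safepoint


normal = deque()
short_done = run_on_io_timestream(response("short", 2), budget=3, normal_queue=normal)
long_done = run_on_io_timestream(response("long", 10), budget=3, normal_queue=normal)
demoted = len(normal)            # tasks handed to the normal system
drain(normal)
```

The short response completes without ever leaving the I/O timestream, while the long one does three units of work there and finishes the remaining seven on the ordinary FIFO side. A time-based budget would replace the loop counter with a clock check at each safepoint.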

I will write in more detail as I design more of the system.

Tuesday, September 27, 2011