This is an attempt to add the mailbox data to the GC and transactional intrinsics. I will need to further develop some transactional concepts before the transactional intrinsics are where they need to be.
On the other hand, those are for far-future work, not present-day. The idea is to make sure I don't have to redesign all the instructions at a later date.
This is consciously trying to look more like LLVM instructions...
Types:
gcallocator: A structure used to allocate GC objects. These are good for one use, and the gcalloc intrinsic yields a new allocator. The program is expected to use this, and give the latest version to safepoints.
gcstate: A collection of boolean flags representing various things. Doesn't change between safepoints.
gc, undo, buf: Each of these is a kind of log.
mbox: A thread's mailbox. At present equal to {i32 execid, gcstate state, gcallocator alloc, gc gclog }.
thread: A thread id. This is a differentiated type because it is traced by the garbage collector.
schedstate: Scheduling states. Includes at the least run, suspend, and term.
tstate: A thread's state. At present, equal to {schedstate state, i32 priority }.
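To make the aggregate types above concrete, here is a minimal Python sketch. The field layouts follow the definitions in this post. The class names, the frozenset used for gcstate flags, the flag name "marking", and the placeholder values for the allocator and the log are assumptions made only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class SchedState(Enum):
    # The post guarantees at least these three scheduling states.
    RUN = "run"
    SUSPEND = "suspend"
    TERM = "term"


@dataclass(frozen=True)
class TState:
    # {schedstate state, i32 priority}
    state: SchedState
    priority: int


@dataclass(frozen=True)
class MBox:
    # {i32 execid, gcstate state, gcallocator alloc, gc gclog}
    execid: int
    state: frozenset  # gcstate modeled as a set of flag names (assumption)
    alloc: object     # stand-in for a gcallocator (assumption)
    gclog: tuple      # stand-in for a gc log (assumption)


mbox = MBox(execid=0, state=frozenset({"marking"}), alloc=None, gclog=())
tstate = TState(state=SchedState.RUN, priority=5)
```

Both records are frozen here because the post treats an mbox as a value that is replaced rather than mutated.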
Calling Conventions:
thread: A calling convention for functions that serve as the starting point of threads. Enters with the architecture's thread pointer pointing to an initial frame, which is initialized to contain the arguments to the function. The object containing this frame will likely also contain the thread structure, though it is not required.
In general, the other calling conventions are geared towards using GC-allocated frames, and flattening them to the fullest extent possible. Functions expect to enter with their stack pointer pointing to a frame of the proper size and type. When leaving, they restore the return address and old frame pointer. I call this the "blind callee convention" since the callee has no idea where it got its frame from.
There are roughly three sub-conventions that arise from this. The first is the most basic: allocate a frame from GC, initialize it, set up the arguments, return address, and return frame, and jump to the function.
In the second, we actually use the frame of the caller. We can do this for calls to non-recursive functions.
In the third, we use a classic stack, which is a big aggregate GC object. We can just use the blind callee convention, or we can have the callee be aware that it is on a stack (and thus can allocate more frames using the stack). This is usable when the frames themselves don't escape a call (curried functions, inner functions that access variables in their parents and escape the parent, and continuations break this convention).
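The "blind callee" idea can be modeled in a few lines of Python. This is only a sketch under assumptions: Frame, Stack, and add_one are hypothetical names, and real frames would be GC objects holding a return address rather than Python objects. It shows the same callee running on a freshly allocated frame (first sub-convention) and on a frame handed out by a stack object (third sub-convention) without knowing the difference.

```python
class Frame:
    """A call frame; the callee never learns who allocated it."""

    def __init__(self, return_frame=None):
        self.return_frame = return_frame  # frame to restore on exit
        self.args = {}
        self.result = None


def add_one(frame):
    """Blind callee: works on whatever frame it was handed."""
    frame.result = frame.args["x"] + 1
    return frame.return_frame  # models restoring the caller's frame


class Stack:
    """One big object that hands out frames in LIFO order."""

    def __init__(self):
        self.frames = []

    def push(self, return_frame):
        frame = Frame(return_frame)
        self.frames.append(frame)
        return frame

    def pop(self):
        return self.frames.pop()


caller = Frame()

# First sub-convention: a fresh frame per call.
heap_frame = Frame(return_frame=caller)
heap_frame.args["x"] = 41
back_to = add_one(heap_frame)

# Third sub-convention: the frame comes from a stack object.
stack = Stack()
stack_frame = stack.push(caller)
stack_frame.args["x"] = 1
add_one(stack_frame)
stack.pop()
```

The callee body is identical in both cases; only the caller decides where the frame lives.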
Instructions:
result = spawn execid execid, gcallocator gcallocator, funptrval(initial args)
Spawn a thread. The starting function funptrval must use the thread calling convention. result is of type {thread, gcallocator }.
result = tstat execid execid, thread threadval
Get the state for threadval. The result is of type tstate.
update execid execid [, schedstate (run|suspend|term)] [, i32 pri ], thread threadval
Update state for threadval. There must be at least one state change given. For threads updating themselves, this does not take effect until the next safepoint.
result = safepoint mbox mbox
Possibly context switch or run garbage collection, acknowledge any changes to scheduling state, and generate a new mbox for subsequent calls. The mbox given becomes defunct, and using it after this point causes undefined behavior. A safepoint will spill and restore all registers to the current frame if a garbage collection or a context switch takes place.
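The single-use discipline on mboxes can be sketched as follows. This is a toy model, not the intrinsic itself: the generation counter is a hypothetical field added for illustration, and where the real instruction leaves reuse of a defunct mbox undefined, the model raises an error so the mistake is visible.

```python
class MBox:
    def __init__(self, execid, generation=0):
        self.execid = execid
        self.generation = generation  # hypothetical counter, for illustration
        self.defunct = False


def safepoint(mbox):
    """Consume mbox and return its successor."""
    if mbox.defunct:
        # The real instruction leaves this undefined; the model fails loudly.
        raise RuntimeError("use of a defunct mbox")
    mbox.defunct = True
    return MBox(mbox.execid, mbox.generation + 1)


m0 = MBox(execid=7)
m1 = safepoint(m0)  # m0 is now defunct; only m1 may be used
m2 = safepoint(m1)
```

Threading the mbox through each call this way is what lets the program always hand the latest version to the next safepoint.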
result = gcalloc [tx, ] mbox mbox, type [, options ]
Allocate an object with type type, initializing only its header. The type of result is {type*, mbox }. This call may also have the effect of executing a safepoint.
result = watch [r ][w ], type* ptrval ... [, set setval ]
Watch for activity on ptrvals. If no specific type of activity is given, rw is assumed. Append ptrvals to setval, or create a new set containing ptrvals.
touch r [w ], type* ptrval ...
result = touch [r ]w, type* ptrval ... [, (gc / undo) logval ]
Report access to ptrvals, which may be read or write. If write access is reported, then it may also be logged. If the touch is logged, then there is a result, which is a log of the same type as the log argument given. Depending on the exact log type, this instruction must be placed differently relative to the memory operation to which it corresponds. If undo logs are used, the instruction must precede any write to the locations, whereas with gc logs the instruction can be placed anywhere.
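The placement rule for undo logs follows from what an undo log has to capture. A minimal sketch, assuming an undo log is a sequence of (location, old value) pairs and that memory is a plain dictionary; touch_w_undo and this representation are illustrative assumptions, not the intrinsic's actual layout.

```python
def touch_w_undo(memory, addrs, log):
    """Return a new undo log that also records the current values at addrs."""
    return log + tuple((addr, memory[addr]) for addr in addrs)


def rollback(memory, log):
    """Restore old values, newest entry first."""
    for addr, old_value in reversed(log):
        memory[addr] = old_value


# Correct placement: touch before the write, so the old value is captured.
memory = {"a": 1, "b": 2}
log = touch_w_undo(memory, ("a", "b"), ())
memory["a"] = 99
memory["b"] = 100
rollback(memory, log)

# Wrong placement: touching after the write records the new value instead.
late_memory = {"a": 1}
late_memory["a"] = 99
late_log = touch_w_undo(late_memory, ("a",), ())
rollback(late_memory, late_log)  # cannot recover the original 1
```

In the first case rollback restores the original contents; in the second the log only ever saw the overwritten value, which is why the touch has to come first.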
result = read [gc gcstate, ] [tx, ] type* ptrval
result = read [gc gcstate, ] tx, type* ptrval, buf logval
Exactly equivalent to load, except it exists for extra transactional or garbage collection semantics. At least one of gc or tx must be present. The second variant checks a set of buffer logs first. This does not inform watch sets of the activity. If this is a read of gc-traced memory, the gcstate from the last mbox must be given.
write [gc gcstate, ] [tx ], type* ptrval, type val
result = write [gc gcstate, ] tx, type* ptrval, type val [, buf logval ]
Exactly equivalent to store, except for extra transactional or garbage collection semantics. At least one of gc or tx must be present. The second variant buffers the write in a write-buffer log instead of writing through to memory. This does not inform watch sets of the activity. If this is a write to gc-traced memory, the gcstate from the last mbox must be given.
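The buffered variants of read and write can be pictured with a small model. This sketch assumes a buf log is a mapping from location to pending value and that memory is a dictionary. It uses a single log where the instruction allows a set of them, and it includes a commit step (the commit instruction listed further down) only to show when memory finally changes. All function names are illustrative.

```python
def write_tx(addr, value, buf):
    """Buffered write: returns a new log; memory is not modified."""
    new_buf = dict(buf)
    new_buf[addr] = value
    return new_buf


def read_tx(memory, addr, buf):
    """Buffered read: consult the log first, then fall back to memory."""
    if addr in buf:
        return buf[addr]
    return memory[addr]


def commit(memory, *bufs):
    """Apply each log's writes to memory, in the order the logs are given."""
    for buf in bufs:
        memory.update(buf)


memory = {"x": 1, "y": 2}
buf = write_tx("x", 10, {})
seen_in_tx = read_tx(memory, "x", buf)  # served from the log
seen_outside = memory["x"]              # memory is still unchanged
commit(memory, buf)
```

Until the commit, only readers that pass the log see the new value; everyone else still observes the old contents of memory.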
result = validate [r ][w ], [set setval ... ]
Validate setvals, or all outstanding watched values for this thread if no sets are given. If no access mode is given, rw is assumed. result is an i1, indicating success or failure.
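One way to picture watch and validate is with per-location version counters. This post does not say how validation is implemented, so everything here is an assumption: the version table, the representation of a set as a mapping from location to the version seen at watch time, and the restriction to write activity only.

```python
versions = {}  # location -> number of writes reported so far (assumption)


def touch_w(*addrs):
    """Report write access to each location."""
    for addr in addrs:
        versions[addr] = versions.get(addr, 0) + 1


def watch(*addrs, wset=None):
    """Return a new set: wset (if given) plus addrs at their current versions."""
    new_set = dict(wset or {})
    for addr in addrs:
        new_set[addr] = versions.get(addr, 0)
    return new_set


def validate(*wsets):
    """True if no watched location has been written since it was watched."""
    return all(
        versions.get(addr, 0) == seen
        for wset in wsets
        for addr, seen in wset.items()
    )


s = watch("x", "y")
ok_before = validate(s)
touch_w("y")
ok_after = validate(s)
```

A set watched after the write validates again, since it records the newer version.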
discard [r ][w ], set setval [, set setval ... ]
Discard setvals from the watched locations for this thread.
commit buf logval [, buf logval ... ]
Perform the actions described in logvals.
rollback undo logval [, undo logval ... ]
Abort all actions described in logvals.
listen set setval [, set setval ... ]
Following the completion of this instruction, the current thread's status will be set to run whenever anyone touches any of the locations contained in setvals.
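The wake-up behavior of listen can be sketched with two tables. This is a toy model under assumptions: thread ids are strings, a set is a Python set of locations, and the listeners and sched tables are hypothetical stand-ins for whatever the runtime actually keeps.

```python
listeners = {}  # location -> set of listening thread ids
sched = {}      # thread id -> "run" | "suspend" | "term"


def listen(tid, *wsets):
    """Register tid as a listener on every location in the given sets."""
    for wset in wsets:
        for addr in wset:
            listeners.setdefault(addr, set()).add(tid)


def touch(*addrs):
    """Report access; any thread listening on a touched location is set to run."""
    for addr in addrs:
        for tid in listeners.get(addr, ()):
            sched[tid] = "run"


sched["t1"] = "suspend"
listen("t1", {"x", "y"})
touch("z")  # not listened to; t1 stays suspended
state_after_z = sched["t1"]
touch("y")  # wakes t1
state_after_y = sched["t1"]
```

The registration is persistent in this model, matching "whenever" in the description: a later touch would set the thread to run again if it had suspended in the meantime.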
There is still no conflict management. Also, the transactional instructions completely lack the kind of meta-arguments present in the gc and scheduler intrinsics. The next steps will be to develop conflict management and the necessary metastate. There also need to be semantics and operations for managing logs. Adding these will complete the intrinsic set.
Sunday, June 28, 2009
Complete GC and Scheduler Intrinsics
Labels: concurrency, runtime systems, threading, transactional memory