Friday, June 12, 2009

Verifying Linearizability with Types

A major theme of my research for the past year has been the desire to build type systems which are able to verify that certain term (or more accurately, certain objects) are linearizable. Linearizability means that something (an execution, a term, operations on an object) all take effect in a single atomic operation.

Of course, to build a type system, we'll need to be more specific. A term's behavior is described by a Hoare type of the form {P} T {Q}, meaning that the term assumes P, guarantees Q, and reduces to a value in T. As I discuss in a previous entry, the term will observe some properties and change others. If all this happens in the same step, then we have linearizability.

Of course, life is always more complicated. In optimistic concurrency control mechanisms, we usually have a read operation which gives us a value, then a validate operation, which ensures that the location hasn't changed. In LL/SC, the write and validate operations are a combined atomic operation. "Observation" happens at any point between the read and the validate, though we usually figure out our exact control flow at the validate.

So essentially, what we have are read, write, and validate operations. We can do any number of reads, so long as we validate them and throw away the values if validation fails. All validations and writes must be done in a single atomic step. But it's still more complicated than that, actually. We're allowed to do writes that don't produce observable changes (such as rebalancing a tree), and we're allowed to skip validation on reads under less obvious circumstances.

Colin Gordon and I worked out an approach for dealing with simple languages with LL/SC-style atomic memory instructions using non-dependent (yes, you heard me correctly) types, and a ridiculous amount of caffeine. Of course, we had to combine the control-flow effects of the SC instructions with the write effects. The type system itself works something like this:

Pointers to shared data have a special "guarded" type, which is linear (meaning use-exactly-once), but special. Instead, the entry in the type environment can be "split" into several other entries, which we call "tokens": read, write, and value, each of which are also linear. The tokens are consumed by operations which read from, write to, and use the value of the corresponding tokens. The type system also uses a primitive form of Hoare-types to track whether it has seen a linearization point yet.

I'm glossing over quite alot here, but it's enough to get the general idea. Now, how can this be generalized? The tracking of whether or not we've seen a linearization point yet is also critical, since it determines our ability to make observable changes to state. If you go from a pre-point state to a post-point state, you need to validate all reads you've done and perform any writes such that you perform one of the possible behaviors of the term.

All writes that aren't part of a linearization point need a proposition showing that they don't cause any observable effects (meaning that the before and after states are equivalent). The ability to discard reads that have not been validated yet depends on a slightly more intricate property. Unvalidated reads can be discarded if the result of the term is not affected by changes to the location from which the read took place. Proving this is another matter.

In Colin's and my work, we consumed the value token in a SC or check (basically a straight validation), and if the operation failed, you got kicked to a "fail" control flow path. In a more general solution, you end up with two situations. The first is failed validation, in which you need to prove that you don't use the value, or any value derived from it. I've still not worked out the details of this yet.

The second is by proving that concurrent modifications don't affect the outcome. For instance, in a linked list, you would have a proposition that hd(cons(a, b)) = b, so if you're traversing down a list, removing deleted values from the list won't affect the outcome.

This is clearly not yet worked-out, and it won't be until I build a language and start imlementing some more complex linearizable systems. But this is a lab-book, afterall.

No comments:

Post a Comment