Thursday, June 11, 2009

Proposition Spaces

I was on the road back from Brattleboro, thinking over my work on structural theories of atomicity, synchronization, and the like. I'll do a brief overview, since I'm writing the tech report on all this right now, and trying to turn it into a conference paper.

Concepts like atomicity, lock-freedom, and the like exist only as conceptual descriptions at the moment, but nothing you could use to build a type system or a model checker. To do that, you need to describe these concepts using structural rules. This is what I've been working on recently (one of the things, at least). My approach is to develop a theory of concurrency which operates on executions, or histories of states as a program is executed. A concept like "atomicity" can be stated as a structural property of executions, similar to the way a type judgment is a structural property of a program's syntax. You can also state things like synchronization rules, memory consistency models, and database serializability properties this way.

All this heavily depends on dependent Hoare type theory. Dependent types let you state arbitrary properties of a term, like "an integer less than 4, a hash table containing x but not y", and so on. Hoare types extend this with pre- and post-conditions, written {P} T {Q}, where P is a condition that must hold before executing a term, and Q is the condition that holds after. T, of course, is the term's type. All these can contain arbitrary predicates, so you can have something like "delete : { x ∈ t } t : Table → () { x ∉ t }", which describes the exact behavior of delete.

How does all this relate to concurrency and specifically, atomicity? These types describe behaviors. Some operation has a type {P} T {Q}, meaning that it changes the state of the world from P to Q and evaluates to T, or else {P} T {P}, meaning that it observes P, and evaluates to T (recall all these can contain arbitrary predicates). I make heavy use of a thing called a "definitive type" which is essentially the most restrictive type you can give a term. So the definitive type of a "contains" function on a set might be

contains : {x ∈ s} s : Set → true {x ∈ s} ∨ {x ∉ s} s : Set → false {x ∉ s}

Notice the disjunction operator ∨. If you take all the behaviors combined with this operator, you have a bunch of definitively typed possible behaviors. Each of these is a threshold.

Assuming your type system is sound, then a given term's definitive type represents a monotonically decreasing set as the term gets executed, meaning that its type encompasses a smaller and smaller set of values. (If the type ever grows, then your type systems lies) Here's the key point: once your term becomes a subtype of only one of its possible behaviors, the outcome is now fixed. This is analogous to what Herlihy calls a "univalent state". Prior to this, your term is a supertype of some set of the possible behaviors, which is analogous to a "multivalent state".

Now, finally on to what I was thinking about. When I wrote the logic rules for spotting linearization points, good behavior of locks, and such, I remarked that it was awkward, because you have to operate on pairs of states in order to say exactly what parts of a term got reduced. What would be really nice is a differentiation function (Harrison, here's your precious taking the deriviative of a data structure coming up soon). To do this you need several things: 1) what part of a term got eliminated, 2) what changed in the concrete state, and lastly but not least 3) what changed in abstract state (meaning what propositions' truth changed). 3 subsumes 2, since concrete states are just sets of very uninteresting propositions (address 1 holds true, 2 holds the address of 4, ...) Types are also propositions in a dependent system, and as we've already stated, they are monotonically decreasing, so all we only ever lose propositions. The propositions representing abstract states, of course, can change arbitrarily.

So what we're essentially setting up here is a sort of "propositional space". A point in this space is the truth assignments to all propositions. A vector in this space is our differentiation function. A program represents a vector field in this space (or a system of equations for math people). Lastly, an execution represents a single path.

Of course, all this would be just a mind-bending abstraction if it weren't for what else I'm working on: chaos and computing. You see, when you start to think about things in this way, it suddenly starts to look very much like concepts in nonlinear dynamics. You have your vector fields, bifurcations, limit cycles, stable and unstable fixed points, and so on. Undecidability implies that you probably can't predict the outcome or the position in propositional space past a certain point. Most importantly, you have an infinite number of propositions represented in a single point, but for every program, these are determined by a finite number of propositions. This is analogous to high-dimension systems which are actually determined by a small number of real dimensions.

Just as Euclidean space is defined by some set of vectors, it would seem that a propositional space would be defined by a set changes in propositions. Figure out what these are for a program, and you can probably tell an awful lot about its behavior at any given point.

I also wonder about the exact relationship between a nonlinear system's being chaotic and a program's outcome being undecidable.

No comments:

Post a Comment