Eric's Research Logbook: Garbage Collection and Chaos

Changing topics, and going to more research in progress as opposed to finalizing what I've already done. I'm shifting to my long-term work: chaos and computing.

The original question was "is garbage collection chaotic?" Now that I understand all this better, I can take a more direct approach. For a system to be chaotic, three properties need to be satisfied.

1) Sensitivity to initial conditions

Points which are close to eachother may produce substantially different trajectories. That is, subtle changes may drastically alter the result. This, I think, is the most important.

2) Topological mixing

As near as I can tell, given any two sets of points, it is possible to start with any point in one, run the system long enough, and end up with a point in the other.

3) Dense periodic orbits

Repeating orbits are continuous, or unbroken. You don't don't have "gaps" in a cycle.

Now, how does this relate to garbage collection? That depends on what you're measuring. My first thoughts were to treat a family of isomorphic memory graphs as a single point in a space. However, this will not be a chaotic system. Garbage collection as a system will consist of a number of families of isomorphic graphs with vectors all pointing towards a set of strongly-connected graphs representing the nodes reachable from the root set. Clearly this is not chaotic.

The costs of collections, however, may well be chaotic, or determined by some chaotic system. Costs can be measured in terms of nodes traversed, nodes deleted, nodes preserved, or some combination thereof. An individual move from one graph to another entails some cost.

Now, a program describes some trajectory in this space of graph-families. At some interval, the garbage collection function gets fired, which costs something. Depending on how costs get computed, two neighboring points (graphs) in the trajectory described by the program might result in radically different collection costs. So whatever function decides when to collect can potentially have a major effect on the total cost (this is something analogous to sensitivity to initial conditions). Topological mixing and dense orbits are more troublesome. They suggest the ability for a given decision function to produce arbitrarily bad behavior if the program runs long enough.

I think the more telling question is "how much data is necessary to make optimal, or within some multiple of optimal decicions about when to collect?", or perhaps the other angle, "how close to optimal can I get with a finite data set?". Can a collector actually make a decent decision, or will it produce arbitrarily bad behavior given enough time? I think answering that question would get at the original intent.

Eric's Research Logbook

Wednesday, July 1, 2009

Garbage Collection and Chaos

No comments:

Post a Comment

About Me

Followers

Blog Archive