The data management problems that KnownSpace is intended to help solve are both hard and varied. We don't know how to solve them all, but we think that if enough people work on them, each with their own approach, many of them can be solved reasonably well. We expect those solutions to be only partial, so we expect the system to keep growing and changing indefinitely.
KnownSpace, then, needs to be robust, easily modifiable by many hands across the world, and yet still interoperable across a multitude of versions. Consequently, our main architectural goal is flexibility. An ideal system would let us, at runtime, insert a completely new kind of simpleton, delete any simpleton, or replace any simpleton with a completely different one, all without having to change anything else, recompile the system, or even stop the program.
To make KnownSpace dynamically reconfigurable at runtime, all simpletons should be as simple as possible. For instance, there is a difference between simpletons that see things (Collectors), simpletons that notice what the attributes of those things are (Parsers), simpletons that compare those attributes against the attributes of things the user already likes or dislikes (Clusterers), simpletons that decide whether the user actually likes or dislikes the new things after an evaluation period (Evaluators), and simpletons that figure out how to find new things the user is likely to like (Analysts).
Breaking computations down this far makes recombining simpletons to produce new designs much easier, which makes the architecture more flexible---and more extensible.
To increase flexibility further, simpletons can appear in families. Instead of one entity purger, for example, there can be a family of purgers, all roaming over the same group of entities and marking any entities they come across with attributes of various kinds.
Writing one large, complex purger to do the entire purging job is a bad idea. Here's why:
It is too hard to design. When there is only one purger, we must decide all its actions and interactions before we can write it. If purging is distributed among many simpletons, however, it's much simpler to write several small, simple purgers whose decisions combine to determine whether an entity should be purged.
It takes too long to program. When there is only one purger, all the programmers working on it must continually communicate with each other. If purging is distributed among many simpletons, however, several programmers can independently write simple purgers without much communication at all.
It is too hard to change. When there is only one purger, changing it means overhauling (and debugging) one giant, complex purger. If purging is distributed among many simpletons, however, it's much easier to add new capabilities to the purger family simply by adding new (and simple) purgers.
It is too fragile. When there is only one purger, an error in it can mean the death of the entire program. If purging is distributed among many simpletons, however, purging does not cease when one purger dies for some reason, nor does the program as a whole crash.
This granular approach to programming increases flexibility and robustness enormously. The same lessons apply to all other simpleton families.
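As a rough illustration of the purger-family idea, here is a minimal Python sketch. It is not KnownSpace code: the dictionary model of an entity, the attribute names (`days_since_visit`, `is_duplicate`, `purge_votes`), the two-vote threshold, and every function name are assumptions made up for this example. It shows several small purgers each contributing a vote, the votes being combined into one decision, and one failing purger leaving the rest of the family running.

```python
from typing import Any, Callable, Dict, List

# Hypothetical model: an entity is simply a dictionary of attributes.
Entity = Dict[str, Any]
# Hypothetical model: a purger looks at one entity and may mark it.
Purger = Callable[[Entity], None]


def stale_purger(entity: Entity) -> None:
    """Vote to purge entities that have not been visited for over a year."""
    if entity.get("days_since_visit", 0) > 365:
        entity.setdefault("purge_votes", []).append("stale")


def duplicate_purger(entity: Entity) -> None:
    """Vote to purge entities already flagged as duplicates."""
    if entity.get("is_duplicate", False):
        entity.setdefault("purge_votes", []).append("duplicate")


def buggy_purger(entity: Entity) -> None:
    """Stand-in for a purger with a defect: it always fails."""
    raise RuntimeError("purger crashed")


def run_family(purgers: List[Purger], entities: List[Entity]) -> List[str]:
    """Let each purger roam over all entities; return the names of purgers that died."""
    failed = []
    for purger in purgers:
        try:
            for entity in entities:
                purger(entity)
        except Exception:
            # One purger dying does not stop the rest of the family.
            failed.append(purger.__name__)
    return failed


def should_purge(entity: Entity, votes_needed: int = 2) -> bool:
    """Combine the independent votes into a single purge decision."""
    return len(entity.get("purge_votes", [])) >= votes_needed


entities = [
    {"name": "old-copy", "days_since_visit": 900, "is_duplicate": True},
    {"name": "old-original", "days_since_visit": 900, "is_duplicate": False},
    {"name": "fresh", "days_since_visit": 3, "is_duplicate": False},
]
failed = run_family([stale_purger, buggy_purger, duplicate_purger], entities)
purged = [e["name"] for e in entities if should_purge(e)]
print(failed)  # ['buggy_purger']
print(purged)  # ['old-copy']
```

Adding a new purging criterion in this sketch means writing one more small function and appending it to the list; none of the existing purgers has to change.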
The fundamental operation is one of simpletons marking entities by attaching attributes to them. Each simpleton searches for a set of entities marked in some way (its "input"), does some computation and then marks some of those entities, thereby, conceptually, putting them in an "output" group. Each entity appears in a variety of groups of entities, each marked by some simpleton. Thus, the entities flow down the data stream, becoming more and more enriched the further they flow.
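The marking operation described above might be sketched as follows. This is an illustrative Python model under stated assumptions, not the actual KnownSpace interfaces: entities are plain dictionaries, a "mark" is the presence of an attribute key, and the names `select`, `collector`, `parser`, and `clusterer` are invented here (only the role names come from the text).

```python
from typing import Any, Dict, List, Set
from urllib.parse import urlsplit

# Hypothetical model: an entity is a bag of attributes.
Entity = Dict[str, Any]


def select(pool: List[Entity], mark: str) -> List[Entity]:
    """Return the group of entities that carry the given attribute."""
    return [entity for entity in pool if mark in entity]


def collector(pool: List[Entity], urls: List[str]) -> None:
    """See new things and add them to the pool, marked with 'url'."""
    for url in urls:
        pool.append({"url": url})


def parser(pool: List[Entity]) -> None:
    """Input: entities marked 'url'. Output: the same entities, also marked 'host'."""
    for entity in select(pool, "url"):
        if "host" not in entity:
            entity["host"] = urlsplit(entity["url"]).netloc


def clusterer(pool: List[Entity], liked_hosts: Set[str]) -> None:
    """Input: entities marked 'host'. Output: also marked 'similar_to_liked'."""
    for entity in select(pool, "host"):
        entity["similar_to_liked"] = entity["host"] in liked_hosts


pool: List[Entity] = []
collector(pool, ["http://example.org/a", "http://example.net/b"])
parser(pool)
clusterer(pool, {"example.org"})
for entity in pool:
    print(entity)
```

Each simpleton here knows only which attribute it looks for and which attribute it attaches. No simpleton calls another directly, so the "flow" is nothing more than entities accumulating attributes.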
Loose simpleton coupling allows dynamic runtime reconfiguration of the whole system. It's possible, for example, to introduce a new simpleton that modifies the work of another simpleton while still leaving the original simpleton in place and unchanged. Suppose it is necessary, say, to add an intermediate simpleton between two existing simpleton families. The new simpleton simply searches for the entities marked by the "upstream" simpletons and marks those entities in such a way that the "downstream" simpletons will find them. There is no need for a complete rewrite or even recompilation; only the new simpleton must be compiled. Neither of the two old families of simpletons needs to be modified.
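To make the intermediate-simpleton scenario concrete, here is a small Python sketch. Everything in it is hypothetical: the `scorer` (upstream), `recommender` (downstream), and `advert_filter` (the newcomer) are invented names, the attributes `keyword_hits`, `score`, `is_advert`, `penalized`, and `recommended` are made up, and the simple sequential `run_pass` loop stands in for whatever scheduling the real system uses.

```python
from typing import Any, Callable, Dict, List

Entity = Dict[str, Any]                      # hypothetical: a bag of attributes
Simpleton = Callable[[List[Entity]], None]   # hypothetical: works on the shared pool


def scorer(pool: List[Entity]) -> None:
    """Upstream: mark each unscored entity with a 'score'."""
    for entity in pool:
        if "score" not in entity:
            entity["score"] = entity["keyword_hits"]


def recommender(pool: List[Entity]) -> None:
    """Downstream: mark every scored entity as recommended or not."""
    for entity in pool:
        if "score" in entity:
            entity["recommended"] = entity["score"] >= 3


def advert_filter(pool: List[Entity]) -> None:
    """Newcomer: find scored adverts and re-mark their 'score' for the downstream."""
    for entity in pool:
        if "score" in entity and entity.get("is_advert") and "penalized" not in entity:
            entity["score"] = 0
            entity["penalized"] = True


def run_pass(simpletons: List[Simpleton], pool: List[Entity]) -> None:
    """Run every currently registered simpleton once over the pool."""
    for simpleton in list(simpletons):
        simpleton(pool)


pool = [
    {"title": "Miracle offer", "keyword_hits": 5, "is_advert": True},
    {"title": "Graph theory notes", "keyword_hits": 4, "is_advert": False},
]
simpletons: List[Simpleton] = [scorer, recommender]

run_pass(simpletons, pool)
before = [entity["recommended"] for entity in pool]

# Insert the new simpleton while the program keeps running;
# scorer and recommender are left untouched.
simpletons.insert(1, advert_filter)
run_pass(simpletons, pool)
after = [entity["recommended"] for entity in pool]

print(before)  # [True, True]
print(after)   # [False, True]
```

The newcomer communicates with its neighbors only through attributes on the entities, which is why neither the upstream nor the downstream function needs to know it exists. A real system with concurrently running simpletons would also have to consider ordering between the newcomer and the downstream family, which this sequential sketch sidesteps.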
This property of dynamic patchability is, we believe, a new step in software development, and should lead to extremely flexible and responsive systems for the twenty-first century.