KnownSpace Manifesto

The KnownSpace Vision

KnownSpace is a flexible and adaptive data manager of all of a user's information, whether that information is data or programs, and whether it originates on the web, via mail, news, ftp, an editor, or any other application. It is not a search engine, browser, desktop, or operating system, although it shares elements of all four programs. It is an attempt to take control of the desktop away from the major software manufacturers and put it in the hands of the world's independent software developers.

The desktop has stagnated just when the web is exploding. Search engine queries now often return millions of irrelevant pages. Those pages are not spatially arranged, clustered by topic, or distinguished in any way other than by their titles, so we have no idea of the relevance of any page before reading it. The same is true for mail and news.

After we save webpages, mail messages, news articles, ftp pages, or any pages produced with an editor or any other application, those pages then become lost on our desktops. They are not analyzed in any way, clustered according to our interests, or laid out spatially to show their similarities to other pages already there. Even after we organize the pages on our desktops by hand there is no automation to help us reorganize them, search them, navigate through them, or find more pages like them. Our computers don't help us manage our own data.

We wish to see desktops become as personalized as the individuals using them, with desktops as commodities to be bought and sold, swapped and traded, and quickly and arbitrarily copied and modified. Yet those desktops, as individual as they may be, should speak a common tongue so that anyone can exchange both data and programs with everyone. That common environment could serve to share the fruits of decades of research in artificial intelligence, user interfaces, user modeling, databases, networking, and information retrieval now locked up in research labs, starved for want of a standard way to share them.

We are working toward a future where anyone with a computer can explore the space of human-computer interactions, all within a universal, open, and free environment.

Halfway To Anywhere

There has been a lot of research over the past twenty years on better, or at least alternative, data management environments. Artificial intelligence, databases, information retrieval, data mining, user interfaces, information visualization, and user modeling have all advanced. The work done in each of those fields, however, has yet to make its way in any serious sense to the personal computer---or even to any other field within computer science.

It would benefit ordinary users to have some of the fruits of all that labor. Further, it should also benefit all those research fields, which for too long have been motivated only by pure, non-applicable research interests or by direct commercial pressures. Having such a difficult, visible, and everyday application should help to focus and propel all those fields.

If this is such a win, though, why didn't it happen before? First, building such a system is not trivial; it has taken us nearly two years to get to a reasonable prototype. Second, there is no obvious incentive for corporations to do it. Third, KnownSpace couldn't have existed before there was a plausible universal language---we feel that language is Java---nor could it have existed before there was a plausible universal digital communications medium---we feel that medium is the web. Now that those two enablers exist, another enabler like KnownSpace is inevitable simply to deal with the ever-rising tsunami of data.

The science fiction writer Robert Heinlein once observed that low-earth orbit is halfway to anywhere, a saying now widespread in NASA. It takes about as much energy to get from the earth's surface and into orbit as it does to get from orbit to anywhere in the solar system. Similarly, KnownSpace is halfway to anywhere in desktop space. As the twenty-first century's Conestoga Wagon, it will let anyone explore the space of possible human-computer interactions without waiting for a major corporation to provide such a desktop.

We do not, however, wish to destroy commercial innovation and competition---we want to level the field a bit so that poor but quicksilver individuals can compete with rich but ponderous corporations. KnownSpace lets forward-thinking programmers completely alter their personal computing environment at will. We next want to extend the franchise to non-programmers as well by building on top of KnownSpace so that anyone can arbitrarily reconfigure its structure without knowing Java, or any other programming language.

Why KnownSpace?

The argument from information overload: the web is now growing by 3 million pages a day. With a text page averaging about 4 kilobytes, that comes to 12 gigabytes a day in text alone. There is too much available information for the present data management tools to handle well; there is little automation and little parsing, organization, user modeling, data mining, and visual presentation. The user has to do almost all the hard work.

The argument from desktop ossification: today's desktop was designed in the 70s and early 80s for a certain class of machines. Thanks to exponential improvements in hardware, all the base assumptions about the computational power of that class of machines are obsolete, yet we still have the primitive desktop it engendered.

The argument from filesystems, databases, and object-oriented languages: a hierarchical file system with few file attributes is a limiting system for users to work within---it is optimized to simplify data storage but not data organization and retrieval. On the other hand, object-oriented languages present a way of thinking of data as something that can be arbitrarily attributed. Databases too are moving to an attributed-data view with object-oriented and object-relational databases. A file system is not just a storage mechanism but also a database (access plus querying on attributes as well as data). Thanks, however, to the history of computing and rapid changes in computer power, it is presently optimized for the wrong things.

The argument from Java improvements: Java offers many improvements to the traditional desktop environment with built-in components supporting networking, security, multithreading, and advanced interfaces. Further, with no pointers, no multiple inheritance, strong typing, garbage collection, a rooted hierarchy, dynamic linking, component support, and declared exceptions, it is a much cleaner and safer language than C++. Finally, its liberal licensing and hardware independence make it the cheapest portable language in existence.

The argument from research platforms: There is no common world-accessible platform to exchange code resulting from research. Building on each other's work is hard. Artificial intelligence, user interfaces, user modeling, databases, information retrieval, and data mining can all advance more quickly with a common platform to work on.

The argument from component programming: Programming needs to be made less monolithic if we're to solve more complex problems. Program parts need to be smaller, more independent, and plug-compatible to better function as mix-and-match components. Moving to such a style of programming means that many people can work on many different parts simultaneously and separately, and those parts can combine in many more ways than before. Hundreds of geographically separated people can collaborate on large projects using this style of programming with far less necessary interaction than today's monolithic style. KnownSpace is built with, and is intended to both demonstrate and support, this style of distributed programming.

The argument from open-source development: working on software with a large and open community of programmers leads to quicker bug detection and more rapid program improvement. A large pool of independent and widely distributed developers also fits well with the KnownSpace philosophy. The system evolves to fit the needs of the community developing the system itself. Eventually that development community will include non-programmers as well as programmers.

The argument from information visualization: no matter how data is fetched, parsed, and organized, it should be displayed visually so that it engages the user's visual and proprioceptive systems, making judgements about data much more reflexive and thereby easing the cognitive burden of organizing and accessing large amounts of data. Laying out high-dimensional data in a two- or three-dimensional map can improve many data management tasks: reading mail, reading news, surfing the web, deciding what movies to watch or what music to listen to. Had we such a system handling our mail, for instance, then when a new message came in we would have a rough idea of what it's about at a glance.

The KnownSpace Architecture

The KnownSpace Philosophy

As a future-oriented technology, KnownSpace values flexibility and correctness above efficiency and completeness. First, machine speed, memory, and disk space continue to explode; there's little point burdening the architecture with efficiency decisions that will seem constricting when every two years brings another fourfold improvement in the underlying hardware. Second, the problems we're trying to solve are too hard to solve at one go, so completeness makes little sense either; it's better to keep the system flexible so that others can fill in functionality when they need it.

With this in mind, the architecture is extremely decentralized. Small, independent programs (called simpletons) are loosely coupled with the data (called entities) and with each other. Programmers can dynamically attach arbitrary computations to arbitrary data. Nor is the data limited to traditional documents; it can be any data at all---for example, an entity can represent a person, a program, an organization, a date, a song, a concept, a website, a building, or any other thing, real or imagined.

The KnownSpace Kernel

KnownSpace tries to be as flexible as possible, making arbitrary changes to the system's functionality as easy as possible. To this end, the KnownSpace kernel implements a simple directed graph of entities allowing anyone to add any entity to any other entity as one of its attributes. Since an entity's value can also be the result of an arbitrary piece of code, any attribute may be dynamically generated.
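As a rough illustration of the idea in the paragraph above, the following minimal sketch models entities as nodes in a directed graph, where any entity can be attached to any other as an attribute and a value may be computed on demand. It is not the KnownSpace kernel API: the names EntityGraphSketch, Entity, of, computed, attach, and detach are all hypothetical, and the sketch uses present-day Java lambdas for brevity.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

public class EntityGraphSketch {

    /** Hypothetical entity: a named node whose value is stored or computed on demand. */
    static final class Entity {
        private final String name;
        private final Supplier<Object> value;
        // Outgoing edges of the directed graph: the entities attached as attributes.
        private final List<Entity> attributes = new ArrayList<>();

        private Entity(String name, Supplier<Object> value) {
            this.name = name;
            this.value = value;
        }

        /** An entity holding a fixed value. */
        static Entity of(String name, Object value) {
            return new Entity(name, () -> value);
        }

        /** An entity whose value is produced by running code each time it is read. */
        static Entity computed(String name, Supplier<Object> code) {
            return new Entity(name, code);
        }

        /** Attaches another entity as an attribute; returns this entity for chaining. */
        Entity attach(Entity attribute) {
            attributes.add(attribute);
            return this;
        }

        /** Detaches an attribute; returns true if it was attached. */
        boolean detach(Entity attribute) {
            return attributes.remove(attribute);
        }

        String name() {
            return name;
        }

        Object value() {
            return value.get();
        }

        List<Entity> attributes() {
            return Collections.unmodifiableList(attributes);
        }
    }

    public static void main(String[] args) {
        Entity page = Entity.of("page", "http://example.org/");
        Entity title = Entity.of("title", "Example");
        // A dynamically generated attribute: recomputed on every read.
        Entity count = Entity.computed("attributeCount", () -> page.attributes().size());

        page.attach(title).attach(count);
        System.out.println(count.name() + " = " + count.value()); // attributeCount = 2

        page.detach(title);
        System.out.println(count.name() + " = " + count.value()); // attributeCount = 1
    }
}
```

Because the computed entity reads the graph each time its value is requested, it stays current as attributes are attached and detached, which is the sense in which an attribute can be dynamically generated.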

Code is encapsulated into simpletons which (ideally) are small and simple. Each simpleton is like an enzyme attaching itself to a molecule (an entity) and suitably altering it by attaching, detaching, or modifying its attributes. Each simpleton passes data along a chain of simpletons without any of them being aware of any other simpleton. Programmers can introduce arbitrary new simpletons or entities at any time---even while the program is running.
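The enzyme analogy above might be sketched as follows. This is an illustrative assumption rather than the actual KnownSpace design: the Simpleton interface, the two example simpletons, the attribute names ("html", "title", "titleWords"), and the run loop are all invented here. Each simpleton looks only at the entity's attributes, so a chain forms without any simpleton knowing about another.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SimpletonSketch {

    /** Hypothetical stand-in for an entity: a bag of named attributes. */
    static final class Entity {
        final Map<String, Object> attributes = new LinkedHashMap<>();
    }

    /** A small unit of code that may alter an entity; returns true if it changed anything. */
    interface Simpleton {
        boolean apply(Entity entity);
    }

    /** Attaches a "title" attribute when raw "html" is present and no title exists yet. */
    static final Simpleton TITLE_EXTRACTOR = entity -> {
        Object html = entity.attributes.get("html");
        if (!(html instanceof String) || entity.attributes.containsKey("title")) {
            return false;
        }
        String text = (String) html;
        int start = text.indexOf("<title>");
        int end = text.indexOf("</title>");
        if (start < 0 || end < start) {
            return false;
        }
        entity.attributes.put("title", text.substring(start + "<title>".length(), end));
        return true;
    };

    /** Attaches a "titleWords" count when a "title" is present; knows nothing of the extractor. */
    static final Simpleton WORD_COUNTER = entity -> {
        Object title = entity.attributes.get("title");
        if (!(title instanceof String) || entity.attributes.containsKey("titleWords")) {
            return false;
        }
        String trimmed = ((String) title).trim();
        int words = trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
        entity.attributes.put("titleWords", words);
        return true;
    };

    /** Offers the entity to every simpleton repeatedly until none of them changes it. */
    static void run(Entity entity, List<Simpleton> pool) {
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Simpleton simpleton : pool) {
                changed |= simpleton.apply(entity);
            }
        }
    }

    public static void main(String[] args) {
        Entity page = new Entity();
        page.attributes.put("html", "<html><title>Known Space</title></html>");

        // Registration order is deliberately "wrong"; the chain still forms via attributes.
        List<Simpleton> pool = new ArrayList<>();
        pool.add(WORD_COUNTER);
        pool.add(TITLE_EXTRACTOR);

        run(page, pool);
        System.out.println(page.attributes.get("title"));      // Known Space
        System.out.println(page.attributes.get("titleWords")); // 2
    }
}
```

Since the pool is an ordinary list in this sketch, a new simpleton could be added between runs without touching the existing ones, loosely mirroring the idea of introducing simpletons at any time.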

The KnownSpace Backend

Behind the scenes, KnownSpace simpletons fetch data from web servers, mail servers, news servers, ftp servers, or the local disk. KnownSpace also fetches new data autonomously based on its estimate of the relative importance of that data to the user. The user may also ask it to periodically update aging data from various sources, or to do various web searches on the user's behalf.

Simpletons parse this unstructured data, building an object-oriented database on it. Other simpletons continuously remodel this database based on the user's actions. Yet other simpletons autonomously fetch new data apparently relevant to the user's interests. This adaptive model gives the frontend a visualizable model of the user's data.

The KnownSpace Frontend

KnownSpace has many faces. The idea is to support arbitrary interfaces and let evolution decide which interfaces are generated and used. Each interface is a completely separate application, totally divorced from the backend and the kernel.

Anyone can implement an interface and plug it into KnownSpace. That interface may, or may not, use the attributed data the backend has generated, and it may exploit (or not) the attributes denoting spatial structure to present a two- or three-dimensional arrangement of the data with data placement recorded deep in the system itself. Unlike today's desktops, the system, as well as the user, has a model of the space the data appears to live in and so can make inferences about user behavior based on that model. Users report a sense of "space" when using such interfaces---their data is in known locations---it is embedded in a known space.
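One way to picture pluggable interfaces that may or may not use spatial attributes is the sketch below. Everything in it is an assumption made for illustration: the Frontend interface, the Placed class with optional x and y positions, and the two toy renderers are not taken from KnownSpace itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FrontendSketch {

    /** Hypothetical view of an entity: a label plus optional position attributes. */
    static final class Placed {
        final String label;
        final Integer x; // null when no spatial attribute has been recorded
        final Integer y;

        Placed(String label, Integer x, Integer y) {
            this.label = label;
            this.x = x;
            this.y = y;
        }
    }

    /** Any interface plugs in by implementing this one method. */
    interface Frontend {
        String render(List<Placed> entities);
    }

    /** Ignores spatial attributes and simply lists the labels. */
    static final class ListFrontend implements Frontend {
        @Override
        public String render(List<Placed> entities) {
            StringBuilder out = new StringBuilder();
            for (Placed p : entities) {
                out.append(p.label).append('\n');
            }
            return out.toString();
        }
    }

    /** Uses the x and y attributes to place each label's first letter on a character grid. */
    static final class GridFrontend implements Frontend {
        private final int width;
        private final int height;

        GridFrontend(int width, int height) {
            this.width = width;
            this.height = height;
        }

        @Override
        public String render(List<Placed> entities) {
            char[][] grid = new char[height][width];
            for (char[] row : grid) {
                Arrays.fill(row, '.');
            }
            for (Placed p : entities) {
                // Skip entities with no position, an empty label, or a position off the grid.
                if (p.x == null || p.y == null || p.label.isEmpty()) {
                    continue;
                }
                if (p.x < 0 || p.x >= width || p.y < 0 || p.y >= height) {
                    continue;
                }
                grid[p.y][p.x] = p.label.charAt(0);
            }
            StringBuilder out = new StringBuilder();
            for (char[] row : grid) {
                out.append(row).append('\n');
            }
            return out.toString();
        }
    }

    public static void main(String[] args) {
        List<Placed> data = new ArrayList<>();
        data.add(new Placed("mail", 0, 0));
        data.add(new Placed("news", 2, 1));
        data.add(new Placed("draft", null, null)); // no recorded position

        // The same data, shown through two interchangeable interfaces.
        System.out.print(new ListFrontend().render(data));
        System.out.print(new GridFrontend(3, 2).render(data));
    }
}
```

Both renderers receive the same data; only the grid version consults the position attributes, and it quietly skips entities that have none.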

The State of KnownSpace

KnownSpace presently exists only in highly unstable forms. The project has been ongoing since Fall 1997, each term producing a prototype with improved functionality. The present prototype, released at the end of Spring 1999 and available as a Java jar file from the KnownSpace website, fetches, parses, and organizes data from web, email, and ftp servers, builds a user model to detect simple user patterns, and has six different visual user interfaces (although each one is presently only weakly functional). It consists of about 80,000 lines of Java 2 code and depends on Java3D, JavaMail, and Swing availability. It also requires a high-end machine to run reasonably (400 MHz and up and 200 MB and up). Other than that, it is platform independent.

At the end of Fall 1999 we will release the first open-source prototype, called KnownSpace Hydrogen, together with extensive documentation on the code and the design. Successive major releases will be named after the elements of the periodic table.

Hydrogen will save its own state, have dynamically switchable user interfaces, and have built-in web and email browsers. It will also be bundled with an optional visual interface builder, called KnownSpace Spectra, to let even non-programmers build their own (albeit primitive for now) desktop interfaces and share them with their friends.

Future KnownSpace releases will support locking and permissions, better backend databases, image analysis, file management on remote filesystems, encryption, multiple users, and a simple scripting language to further expand the pool of people able to make deep changes to the system beyond Java programmers. These releases will also be runnable as proxy servers, and will integrate support for legacy applications. Finally, they will also have more advanced artificial intelligence algorithms and more sophisticated user interfaces. We hope that early in the new millennium KnownSpace will lead to a renaissance on the desktop.