(liberte@ncsa.uiuc.edu)
Alan Braverman, NCSA
(alanb@ncsa.uiuc.edu)
Many browsers now support personal annotations associated with any document, created and readable only by the user of the browser. In addition to personal annotations, a document may have any number of public annotations created and readable by members of the public, and it may have any number of group annotations created and readable by members of many distinct groups. We can distinguish between users who may both create and read annotations, and those who may only create or only read annotations. It is conceivable to allow group annotations of public annotations, and publically readable annotations created by members of a group. Other variables to consider are who can edit or delete annotations, and who moderates or monitors group and public annotations?
A good survey of many design and implement issues for annotations is [GRA94].
To achieve scalability, the changing variable must be factored out so that the solution requires only a constant or manageable amount of the fixed resources. For example, a linear search through a list usually works fine if the list is small taking time proportional to the size of the list. But for a large number of elements, a binary search through a ordered, balanced tree is more efficient because it reduces the search time to the log of the number of elements. Some solutions can be made more scalable by adding more processors (for example, searching branches of a tree concurrently), but then scalability problems often arise in the communication between the concurrent processes.
The most significant scalability challenge for the web is the number of users, which will soon be in the billions. This would not be a problem if people limited their activities to their local machines, but the nature of the World Wide Web is that users request documents and services from servers at remote locations. This is a problem for two reasons. Requests from users to random servers add to the network traffic in proportion to the number of requests, which is proportional to the number of users. (This is mostly a problem for routers along the path taken rather than merely a bandwidth problem on the segments between routers.) The second cause of problems is that the interests of users will never be evenly distributed over all available servers, so some servers are likely to experience more load than others in proportion to the total number of users. Therefore the first cause (random remote requests) loads the network whereas the second cause (non-uniform remote requests) loads some servers more than others.
One obvious solution to the scalability problem of access to documents is caching of documents closer to users. This reduces both network traffic and server load. A similar solution is replication which is essentially caching before the documents are requested. The cost of massive replication can be reduced by using a flooding distribution such as is supported by the Network News Transfer Protocol (NNTP). Furthermore, replication can be applied to services as well as documents. If replicated servers are in clusters near the original server, this reduces the load on any one server, but it does not reduce the network traffic because all requests still go to the cluster. However, if replicated servers are distributed around the web, and if clients can automatically locate the nearest server, this would reduce network traffic as well.
As a general rule, scalability of the web requires that documents and services be moved closer to clients.
Other variables addressed by this paper that affect scalability for annotations include the following:
The major cause of problems for public annotations is that the public is essentially a very large group that includes everyone. With a critical mass of participants, there is more that everyone wants to say. To compensate for the large number of readers and writers, the number of documents that may be annotated by a public annotation server should be severly limited.
For everyone to have access to public annotations, the annotations must be available in a public place. But this public place can certainly not be a single public place for all annotations of all documents; even a small number of public places would be insufficient. A large number of public places would seem to be a possible solution, but that tends to be a waste of space and time since not everyone reads everything. This is discussed further in Other Attempts at Scalable Annotations.
One important concern we do not address here is how users find out about groups and documents with public annotations that they are likely to be interested in. This is related to searching and publication, issues which have their own scalability problems, but the problem is also related to finding out when new annotations are available, which we do address in the section What's New.
Since each server of a document references the public annotation server(s) for that document, this potentially distributes the load for serving annotations to at least as many servers as there are document servers, and possibly many more. A public annotation server may serve annotations for as many or as few document servers as desired, and it may be relocated or replicated independently of the document servers. Furthermore, an annotation server could delegate annotations of particular annotations to yet another server. Each document server could be its own public annotation server as well, but this is not a requirement.
The fact that the bodies of annotations are stored on the annotation server is a scalability problem for very large annotations and for a large number of annotations. More space is required to store them and more time is required to deliver them to clients. One solution to this problem involves posting just a URL of an annotation - the annotation itself is stored wherever the author chooses. Authors assume the cost of maintaining the storage and serving these annotations. But until most users have a server available to them, we shouldn't require it, and instead, annotation servers should restrict the size and number of annotations.
The fact that clients ask public annotation servers for the annotations is also a scalability problem. With an ever increasing number of clients asking for annotations, the same annotation server will experience increased load. Part of the solution could be to replicate annotation servers and to delegate some responsibility to other annotation servers. But the longer term scalable solution takes advantage of the fact that annotations are treated as ordinary documents retrieved via HTTP: Annotations may be cached closer to clients just as documents are, so annotation servers are only requested to deliver an annotation if it is not found in a cache. One debatable point that affects caching is whether the act of adding an annotation to a document is considered a change of the document itself or only a change to its metadata, and that leads into the next issue.
The fact that clients need to ask a document server for the location of the public annotation server is a scalability problem for the same reason that requesting the annotations from the annotation server is a scalability problem. But the reference to the annotation server could be stored in the metadata for the document, and if metadata were cachable just as any other data, clients could look first in caches and only then ask the document server.
Group annotations are distributed among all the group annotation servers. That is, each group annotation server specifies what set of documents it serves annotations for and what set of users are allowed to create and read annotations on that server. A single document could have annotations on several different group annotation servers. A server will typically be near the group of users that is served by it, such as an NCSA or MIT group for those communities of users, but it could be anywhere. To further distribute the load, a single group of users might be associated with several group annotation servers, each serving annotations for a different set of documents. Clients should be able to learn what that set of documents is for each group, as a URL pattern for example, to avoid requesting annotations from servers that will have none. This cooperative behavior will increase the scalability of group annotations.
We expect that over the life of a document, both group and public annotations may need to be expired or archived, but that the time frame for removing annotations can be considerably longer than that of articles on Usenet, for example, which need to be expired relatively quickly to make room for new articles. The annotation server for a document is essentially the archive for the communications on the topic of the document, and since readers on the web will go first to this archive (after checking caches, of course), there will tend to be less redundant and inappropriate postings, thus further reducing the volume of traffic and increasing the signal to noise ratio. Even so, selective cleanup of annotations will be required, and topics that turn out to have a lot of interest will need to be subdivided into a number of narrower topics, each with a smaller interest group.
The protocol is described informally in the following sections. A more detailed, formal presentation of the protocol is available [BRA95]. Figure 1 illustrates how a client might get a document and its public and group annotations from respective annotation servers. The description of the protocol ignores where caching might be used to improve scalability, since caching should be invisible in any case.
Alternatively, clients could include along with the request for a document a header line or "accepts" option to specify that the document server should return annotations if the document server happens to be the annotation server.
If a group annotation server is a nearby the client, the client may bypass HTTP for a more direct connection to the annotation server, or perhaps the client may access the group annotation files directly. These options are outside of our protocol.
The client requests each annotation server to return information about all appropriate annotations of the document. The client may request information about only those annotations created (or modified?) after some date, perhaps the last date that the user visited the document.
The annotation server returns this information using one of several formats as requested by the client, such as HTML with hyperlinks for each annotation, or some metadata notation. Possibly the bodies of all annotations could be returned with one request.
The client constructs a display of the annotations from this information and enables controls that allow the user to add or edit annotations. It is recommended that clients display annotations of annotations in a nested fashion to reflect the hierarchical nature of this relationship.
Alternatively, a client may request that a form be returned for the user to fill in. Our implementation of the annotation server via CGI provides such a form (similar to the HyperNews response form). If the user submits this form, the user is allowed to preview the annotation before committing to it.
Editing is similar to creating but the information that had been entered previously must be sent back to the client. By the way, one might think that public annotation servers have a scalability problem with editing because anyone can create a new annotation, but only the author of an annotation should be allowed to edit it. Therefore, to support editing, a public annotation server must store a password per user, or at least a link to an authentication server for the user. This information could be stored as part of every annotation, or in a separate database, so it grows in proportion to the number of annotations.
Annotations may be deleted by the author or by the owner of the base document. The annotation server first authenticates, and then performs the deletion. Orphaned annotations of a deleted annotation should be moved up to its parent; there they may stay until they are reclaimed or deleted.
A variation on this idea is to post just a URL of the annotation instead of the annotation itself. This scheme would save considerable space and transfer time in the usual case that annotations are not read by everyone, but both schemes suffer from the same scalability problems since they both rely on the NNTP model of replication.
The main scalability problem with the NNTP model is that with more and more annotations by an ever growing number of people, NNTP servers must either reduce the number of subscribed groups or reduce the expiration time of articles, or both. Therefore users would not have equal access to public annotations. Increasing the size of disk would allow more annotations to be stored, but soon the time to transfer all the news for one day would take over a day unless bandwidth is also increased. But moreover, even if we could store a couple days worth of news for all newsgroups, why should we want to since most of it will go unread? More and more resources would be tied up doing what a smaller and smaller percentage of users want.
Nevertheless, although NNTP is not scalable as the sole access mechanism, it still has value as a replication service. To read a particular annotation of a document, clients would first look in the appropriate newsgroup, and if it is not there then ask the annotation server.
NCSA provided a group annotation server [MOS93] that was open to the public and so it was essentially a single public annotation server for all annotations of all documents. The server was quickly overloaded and NCSA was forced to disable public access. But as a group annotation server for sufficiently constrained groups, this design is scalable, and it is basically the same as our current design.
Many annotation-like facilities already exist on the World Wide Web [LAL95]. HyperNews [LAL94], WIT, Hot Wired, and the National Performance Review Electronic Open Meeting are all examples of how to support public annotations for the purpose of discussion. All use a similar model of associating the public annotations with the thing being annotated. The difference between our built-in support for public annotations and systems that use standard HTTP 1.0 is that one of three tricks must be used. 1) A URL could go through a CGI program that fetches the real document and returns that with the annotations appended. 2) A URL could reference the real document containing a directive that causes the server to include the annotations. 3) The annotations could be added to the real document.
As the number of annotations for a document grows large, users will usually not want to see the same old annotations every time they revisit the document, nor will they want to revisit the document at all unless there is something new. Furthermore, you cannot hope to keep up with changes to a moderately large number of documents without help from tools similar to newsreaders and associated personal databases that remember which groups are of interest to you, which annotations you have seen, and what preferences you have for displaying articles. How will just the new annotations be requested, returned, and displayed to readers in a scalable manner?
We discuss two complementary approaches to the problem: polling and notification. [WEI94] discusses the trade-offs of these approaches in terms of "come get it" vs "send it everywhere" relative to broader information infrastructure issues. The Hot Wired system is a good example of how to support polling with server-side filtering, whereas the NPREOM system uses email notification. Our protocol does not yet address the problem at all, except to support filtering by date. We are exploring the notification route and implementing interim solutions with mailing lists.
If you want to periodically find out what is new for a set of documents, you could start up a client-side program that automatically polls the servers of those documents to find out which ones have something new, and generate an HTML document (or several) containing all of this information. This has been suggested by [ANDb93]. But the list of documents that a user is interested in watching should not be too large, or at least the number of new annotations should not be too large, since otherwise the user will not get through it all and the effort to find out what is new will have been wasted.
However, asking all annotation servers of interesting documents for new annotations is not useful unless there is, in fact, something new for most documents most of the time. This may be occasionally appropriate, but for documents that are annotated infrequently and for a large interested public, the server may be frequently asked if there are any new annotations when there are none. Therefore, servers may wish to inhibit excessive polling by clients in order to reduce their load.
To be scalable, the client must be able to poll for new annotations first at local caches. But if something is not in the caches, that doesn't mean there is nothing new; maybe no one else has requested the new annotations yet. Hence the NNTP model of replication before articles have been requested means that users can poll locally for new articles.
To support notification, the server could maintain a mailing list per document. For each change that requires notification, mail would be sent to each member of the list. This is not too much of a burden if the number of list members is not too large. At some point it becomes cost effective to switch over to an NNTP-style flooding distribution of the notifications, but see the discussion of the disadvantages of this approach in the section above on Other Attempts at Scalable Annotations.
A hybrid between mailing lists and NNTP is possible to get the best of both worlds. One route to creating this hybrid is to make mailing lists transparently and dynamically hierarchical with many sublists to handle remote distribution. Another route is to modify NNTP servers to dynamically subscribe and unsubscribe from newsgroups based on local or downstream demand. Ideally we need to distribute the load onto only those nodes in the web that forward the notifications only to interested readers. Each node should have only a small number of neighbors to keep its load relatively constant.