The strength of ASCII protocols for network services such as SMTP [1], NNTP [2], and IMAP [3] is their relative simplicity for debugging, trussing, etc. On the other hand, an undesirable hallmark is that each invents its own syntax for specifying requests and replies, particularly in its conventions for quoting metacharacters, dealing with line continuations, encoding binary data, and handling error conditions. XML (Extensible Markup Language 1.0, [4]) is often viewed as a way of encoding domain-specific data payloads carried over a protocol such as HTTP [6] [7] [8], but not as the protocol substrate itself. This paper presents our experience with MASP (Mediated Attribute Store Protocol), a simple, synchronous, fully XML, client-server protocol.
The important states in MASP are:
From the client side, the protocol document begins:
<?xml version="1.0"?>
<!DOCTYPE masp SYSTEM "http://www.research.att.com/~jones/masp-client.dtd">
<client-session>
The server side is similar. A session is closed with the appropriate </client-session> and </server-session> end tags. Although arbitrary markup can represent the requests and responses, we have found the following conventions to be valuable:
The following is an example of a client search request and a successful server response. Note the use of attribute value indexing. The ix attribute references previous name attributes by index.
<search id='1'> <!-- client request -->
  <typedecl>user$u</typedecl>
  <filter><![CDATA[(last_name[$u]='Burnes')]]></filter>
  <select name='face[$u]'/>
</search>

<search-response id='1'> <!-- server response -->
  <resultset>
    <typedecl>user$u</typedecl>
    <results count='2'>
      <result>
        <ids> <id>hermod0000000102</id> </ids>
        <attrvals>
          <val ix='0' name='face[$u]'><EDATA encoding='qp'>GIF87a=01=00=01=00=80=00=00=95=76=81=00 =00=00=2c=00=00=00=00=01=00=01=00=00=02=02=44=01=00=3b=00</EDATA></val>
        </attrvals>
      </result>
      <result>
        <ids> <id>hermod0000000324</id> </ids>
        <attrvals>
          <val ix='0'><EDATA encoding='qp'>GIF87a=01=00=01=00=80=00=00=95=76=81=00=00=00=4e=00 =00=00=00=01=00=01=00=00=02=02=25=09=00=3b=00</EDATA></val>
        </attrvals>
      </result>
    </results>
  </resultset>
</search-response>
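A client might unpack such a response along the following lines. This is only a minimal sketch in Python using the standard `xml.etree.ElementTree` and `quopri` modules, not the MASP implementation. The function name `decode_search_response` is hypothetical, the sample payloads are shortened stand-ins for the GIF data above, and two readings of the protocol are assumed: that a `<val>` without a `name` inherits the name last given for the same `ix`, and that whitespace inside `<EDATA>` is layout only.

```python
import quopri
import xml.etree.ElementTree as ET

# Shortened sample response; the payloads are illustrative, not the GIFs above.
RESPONSE = """\
<search-response id='1'>
  <resultset>
    <typedecl>user$u</typedecl>
    <results count='2'>
      <result>
        <ids><id>hermod0000000102</id></ids>
        <attrvals>
          <val ix='0' name='face[$u]'><EDATA encoding='qp'>GIF87a=01=00=3b</EDATA></val>
        </attrvals>
      </result>
      <result>
        <ids><id>hermod0000000324</id></ids>
        <attrvals>
          <val ix='0'><EDATA encoding='qp'>GIF87a=02=00=3b</EDATA></val>
        </attrvals>
      </result>
    </results>
  </resultset>
</search-response>
"""


def decode_search_response(xml_text):
    """Return a list of (ids, {attribute_name: bytes}) pairs, one per <result>."""
    root = ET.fromstring(xml_text)
    names = {}  # ix -> attribute name, remembered from the <val> that carried it
    rows = []
    for result in root.iter("result"):
        ids = [(e.text or "").strip() for e in result.findall("ids/id")]
        values = {}
        for val in result.findall("attrvals/val"):
            ix = val.get("ix")
            if val.get("name") is not None:
                names[ix] = val.get("name")
            edata = val.find("EDATA")
            if edata is not None and edata.get("encoding") == "qp":
                # Assumption: whitespace inside EDATA is layout, so drop it
                # before quoted-printable decoding.
                raw = "".join((edata.text or "").split())
                data = quopri.decodestring(raw.encode("ascii"))
            else:
                data = (val.text or "").encode("utf-8")
            # Fall back to the bare index if no name was ever announced.
            values[names.get(ix, ix)] = data
        rows.append((ids, values))
    return rows


rows = decode_search_response(RESPONSE)
```

Here the second result carries no `name` attribute, yet its value is still filed under `face[$u]` because the index `0` was bound to that name by the first result.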
MASP also supports complex multi-turn protocols such as SASL [5] authentication mechanisms. XML debugging comments can be observed with a tool such as the Unix truss utility without affecting the protocol operations. Syntax errors, semantic errors, resource failures, etc. cause the server to return an appropriate <error-response>, which includes an indication of permanence, an error code, and an error message. For example:
<error-response id='1' permanence='permanent' errorcode='5'>
  <![CDATA[Error parsing: parse error, column 22: '!'Bur...']]>
</error-response>
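On the client side, the three parts of an error response can be pulled apart with an ordinary XML parser. The following Python sketch assumes only what the example above shows; the `ErrorResponse` class and `parse_error_response` function are hypothetical names, and since the example only shows the value `permanent`, any other `permanence` value is simply treated as not permanent.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

SAMPLE = """<error-response id='1' permanence='permanent' errorcode='5'>
<![CDATA[Error parsing: parse error, column 22: '!'Bur...']]>
</error-response>"""


@dataclass
class ErrorResponse:
    """Hypothetical client-side view of an <error-response> element."""
    request_id: str
    permanent: bool
    errorcode: int
    message: str


def parse_error_response(xml_text):
    elem = ET.fromstring(xml_text)
    if elem.tag != "error-response":
        raise ValueError("not an error-response: %s" % elem.tag)
    return ErrorResponse(
        request_id=elem.get("id"),
        # Only the value 'permanent' appears in the example; other values
        # are treated as non-permanent here.
        permanent=elem.get("permanence") == "permanent",
        errorcode=int(elem.get("errorcode")),
        # The parser delivers CDATA content as ordinary element text.
        message=(elem.text or "").strip(),
    )


err = parse_error_response(SAMPLE)
```

Because the message travels in a CDATA section, the quote and exclamation characters in it need no further escaping.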
MASP is an entirely XML-based client-server protocol whose extensions and conventions form a very useful protocol substrate. XML offers a standard set of mechanisms for representing structured data, and many high-quality XML parsers are now available. DTDs (or XML schemas) present a clear picture of the client and server protocol syntax and, especially with a validating parser, can enforce very precise syntactic requirements. Modifying the DTDs, changing a dispatch table in the code, and testing a new feature or command is easier than modifying ad hoc parsing code or a YACC grammar.
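The dispatch-table idea can be sketched as follows. This is an illustration in Python under stated assumptions, not the MASP server code: the session document is parsed incrementally with the standard `xml.etree.ElementTree.XMLPullParser`, each completed child of `<client-session>` is treated as one request, and the handler names, the `HANDLERS` table, and the `errorcode` value for unknown requests are all hypothetical placeholders.

```python
import xml.etree.ElementTree as ET


def handle_search(request):
    # Hypothetical handler: a real server would evaluate the filter here.
    return "<search-response id='%s'/>" % request.get("id")


def handle_unknown(request):
    # The errorcode value is a placeholder, not a MASP-defined code.
    return ("<error-response id='%s' permanence='permanent' errorcode='0'>"
            "unknown request</error-response>" % request.get("id"))


# Adding a command means adding one entry here (plus the DTD change).
HANDLERS = {"search": handle_search}


def serve(chunks):
    """Feed session text as it arrives; yield one response per completed request."""
    parser = ET.XMLPullParser(events=("start", "end"))
    depth = 0

    def drain():
        nonlocal depth
        for event, elem in parser.read_events():
            if event == "start":
                depth += 1
                continue
            depth -= 1
            if depth == 1:  # a direct child of <client-session> just closed
                yield HANDLERS.get(elem.tag, handle_unknown)(elem)

    for chunk in chunks:
        parser.feed(chunk)
        yield from drain()
    parser.close()       # flush anything the parser was still holding back
    yield from drain()


# Chunk boundaries fall in arbitrary places, as they would on a socket.
chunks = [
    "<client-session><search id='1'><typedecl>user$u</type",
    "decl></search><bogus id='2'/>",
    "</client-session>",
]
responses = list(serve(chunks))
```

Keeping the per-command logic behind a table keyed by element name is what makes the "modify the DTD, add a table entry, test" cycle short; the parsing loop itself never changes.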
Most of the features that we have described for turn-taking, escaping and encoding mechanisms, error handling, attribute indexing, debugging, and session management would be generally useful for many protocols. A longer version of this paper can be found at http://www.research.att.com/~jones/www9paper.htm.
Mark Jones is a researcher at AT&T Labs. He works on information modeling, artificial intelligence, natural language processing and machine learning, particularly as these fields apply to messaging systems. Tony Hansen is a developer at AT&T Labs. He works on messaging systems, web server systems and Internet standards.