Vincent.Quint@imag.fr
Cécile Roisin (Grenoble University)
Cecile.Roisin@imag.fr
Irène Vatton (CNRS)
Irene.Vatton@imag.fr
The World-Wide Web (WWW) is a distributed hypermedia system based on the client/server model. It allows any user of the Internet to access a number of electronic documents distributed throughout the world and stored in servers. Interchange between clients and servers is based on a common representation of documents, called HTML (HyperText Markup Language). WWW users may choose different browsers for accessing and reading these documents. For displaying documents, all browsers parse the HTML files transmitted by servers and interpret the HTML tags they contain.
Many HTML documents are created with simple text editors, which do not check the syntax of the files they produce. As a consequence, many documents on the Web do not strictly conform to the HTML syntax. In addition, HTML has some features that are not fully exploited, because they are too complex to be handled with general-purpose editors.
There are other means of producing HTML documents; one of them is provided by filters. A filter allows files produced by a given document preparation system to be translated into HTML. Documents generated this way are generally HTML conformant, but when the document is intended for the Web, it is not natural to create it with a tool that ignores the Web.
In this paper, we propose a unified approach to these problems, which is based on SGML but remains fully compatible with current Web technology. We first discuss, in section 2, the advantages of the HTML document structure and the importance of strict conformance with HTML. Then, in section 3, we present the main characteristics of an authoring environment we have specifically developed for the Web. Section 4 shows how this tool can help authors to produce better documents.
HTML has been designed as a markup language that represents the structure of documents, but it is often considered by authors as presentation-oriented. In many WWW documents, markup is clearly used for obtaining the graphical form that their author had in mind, not for structuring information in an abstract way.
As HTML aims at representing a large variety of documents, it specifies very general element types that can be used in various contexts, such as paragraphs, headings, lists, etc., and it does not impose strong constraints on their structure. As a consequence, the structure represented in HTML is very simple; it allows the head and the body of a document to be separated [1], but in these two parts, almost all elements are placed at the same hierarchical level. For instance, the H1 and H6 element types, although they are supposed to represent very different levels of detail in a document (headings at levels 1 and 6, respectively), are placed at the same level in the HTML structure. Practically speaking, hierarchical depth only appears in lists.
This way of structuring documents is in fact an advantage for many documents accessed through the Web. These documents are often short and contain mainly links to other documents (anchors) organized in lists and grouped under a heading. The organization that the HTML structure can represent is well adapted to these documents.
But there are also larger documents, especially technical documents such as specifications, manuals and scientific papers. Currently, these documents are often represented in one of the two following forms:
Both categories of documents are needed on the Web. Small documents are very useful for quickly accessing pertinent information, essentially through their anchors. Technical, structured documents are needed for carrying more detailed and complex information. The problem is that HTML and most tools available on the Web are more appropriate to small documents than to technical documents. But it is often necessary to manipulate both kinds of documents simultaneously.
HTML is not simply a set of tags that authors can intersperse in the text, although some documents give this impression. There are rules, clearly stated in the reference document [Berners94] and formally expressed by an SGML DTD. But many users produce HTML documents with simple text editors, which ignore the DTD. Authors do not refer to the HTML DTD to verify what is allowed and what is not; they just test the appearance of the documents they write with their preferred browser. The result is that very few documents on the Web strictly conform to the HTML DTD, and many are rejected by an SGML parser. This is not a problem for most documents, as the most popular browsers are very liberal and do their best to display non-conforming documents, but sometimes markup is ambiguous and the interpretation may change from one browser to another.
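As a rough illustration of the kind of check that a plain text editor never performs, the following sketch uses Python's standard `html.parser` module to report end-tags that do not match the most recently opened element. It is much weaker than validation against the DTD, since it knows nothing about content models. The names `TagBalanceChecker` and `find_mismatched_end_tags`, and the set of elements whose end-tag is treated as optional, are assumptions made for this example.

```python
from html.parser import HTMLParser

# Elements whose end-tag is commonly omitted or does not exist.
# This set is an assumption for the sketch, not a list taken from the DTD.
NO_END_TAG_NEEDED = {"p", "li", "hr", "br", "img", "dt", "dd"}


class TagBalanceChecker(HTMLParser):
    """Record end-tags that do not close the most recently opened element."""

    def __init__(self):
        super().__init__()
        self.open_elements = []
        self.errors = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag names in lower case.
        if tag not in NO_END_TAG_NEEDED:
            self.open_elements.append(tag)

    def handle_endtag(self, tag):
        if tag in NO_END_TAG_NEEDED:
            return
        if self.open_elements and self.open_elements[-1] == tag:
            self.open_elements.pop()
        else:
            expected = self.open_elements[-1] if self.open_elements else None
            self.errors.append((tag, expected))


def find_mismatched_end_tags(source):
    """Return a list of (end-tag found, element actually open) pairs."""
    checker = TagBalanceChecker()
    checker.feed(source)
    checker.close()
    return checker.errors


# A heading opened as H2 but closed as H3 is reported as one mismatch.
problems = find_mismatched_end_tags("<H2>Presentation</H3>")
```

A browser would silently recover from such a mismatch; a checker of this kind at least makes the problem visible to the author.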
As an example, consider the following document. Its markup is obviously wrong but, unfortunately, such documents exist on the Web:
<HTML>
<TITLE>WWW test</TITLE>
<BODY>
<H1>Test of HTML parsers</H1>
<H2>Presentation</H3>Here is a simple paragraph
<P>
<UL>
Simple text before the first item of the unnumbered list
<LI><H3>Title within the first item of the unnumbered list</H3>
Simple text after the title.
<LI>Text of the second item of the unnumbered list</LI>
</UL>
Text after the end-tag of the unnumbered list.
<HR>
</BODY>
</HTML>
That document is rejected by a strict SGML tool such as HoTMetaL. More liberal browsers, such as Netscape and Mosaic, seem to interpret it in the same manner (they display about the same thing), but there is no guarantee that other programs will ``understand'' it in the same way.
For example, the content of the unnumbered list (from <UL> to </UL>) may be interpreted as:
Links are first class citizens in HTML. Different types of links are defined, that allow authors to specify the relationship between a source document and a target document:
Structural links are very useful for organizing large documents split into a set of small pieces (pages). They can represent the full structure of a large document and their semantics is clearly defined. Unfortunately, very few HTML documents use them. Instead, authors often use free links for allowing readers to move to the parent, next or previous document; this link is associated with an icon or a label that indicates to the end user the semantics of the link, but a program cannot make use of it.
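To make the difference concrete, the sketch below shows how a program could read structural links from LINK elements, whereas an anchor wrapped around an icon gives it nothing to work with. It relies on Python's standard `html.parser`; the class and function names, the file names, and the relation values "next" and "previous" are illustrative assumptions.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the REL and HREF attributes of LINK elements."""

    def __init__(self):
        super().__init__()
        self.relations = {}

    def handle_starttag(self, tag, attrs):
        # Only LINK elements carry machine-readable relationships;
        # anchors (A) are deliberately ignored here.
        if tag != "link":
            return
        attributes = dict(attrs)
        rel = attributes.get("rel")
        href = attributes.get("href")
        if rel and href:
            self.relations[rel.lower()] = href


def structural_links(source):
    """Return a mapping from relationship name to target URL."""
    collector = LinkCollector()
    collector.feed(source)
    collector.close()
    return collector.relations


page = (
    '<HEAD><LINK REL="next" HREF="part2.html">'
    '<LINK REL="previous" HREF="intro.html"></HEAD>'
    '<BODY><A HREF="part2.html"><IMG SRC="next.gif"></A></BODY>'
)
links = structural_links(page)
```

The anchor in the body points to the same target as the "next" link, but only the LINK element states the relationship in a form that a program can use.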
As stated above (see section 2.1), HTML documents can be written using simple text editors, with the risk of generating incorrect files. This method supposes that the author has a good knowledge of the HTML syntax. It can be used for writing simple documents, but it is not the best way of authoring complex documents.
For large documents, several filters are proposed. They transform documents from the specific format of LaTeX, FrameMaker, Word, etc. to HTML. These filters are very useful for users who usually produce documentation with these tools. They are also an efficient way to generate the HTML representation of existing documentation (user manuals, technical reports, etc.). These converters can be parameterized, thus allowing each type of document to be efficiently translated into HTML. The HTML structure of different types of documents is then different. But with filters, users cannot exploit all the specific capabilities provided by HTML, such as forms. In addition, connecting a document generated by a filter to other Web documents must be done ``by hand'', directly in HTML.
As HTML is an SGML DTD, it is tempting to use SGML editors for authoring HTML documents. But there are specific conditions on the Web which make these editors difficult to use:
HTML 3.0 will extend the HTML DTD, adding new, more complex elements, such as equations or tables. This will impose more constraints on the HTML tools, as it will become more complicated to produce these complex objects, both with simple text editors and with filters. SGML editors will then have some advantages, if they are able to edit and display these complex objects, but very few of them have this ability.
Finally, none of the available authoring tools completely fulfills authors' needs. For preparing high quality documents for the Web, a specific environment is needed. In the next section, we present the way we have built such an environment, taking advantage of existing tools for structured documents.
Building from scratch an environment that addresses all the problems stated in the previous section constitutes a huge task. Our approach is rather to adapt an existing tool that already offers most of the features needed, and that can be fine-tuned for the Web. With Grif [Quint86], a comprehensive environment for handling SGML documents, we have a good basis for building such an environment; but, to avoid the limitations of pure SGML editors, some extensions are necessary.
Basically, Grif is a structure-driven editor. Structure-driven editing is inspired by syntax-driven editing. A DTD specifies the structure of a type of document and the editor uses that specification for helping (or sometimes obliging) the user to produce a document that is an instance of that type, i.e., with a logical structure consistent with the specification given by the DTD. With that approach, a document is represented in the editor as a logical structure that organizes typed elements with attributes. This logical structure is mainly hierarchical, with additional relationships which represent non-hierarchical links between elements, such as cross-references. With these links, a structured document is also a hypertext [Quint92].
As the logical structure of a Grif document always conforms to the model of a DTD, it is possible to generically define the graphical aspect of documents: presentation rules are associated with element types and attributes and the editor applies them to the elements constituting the logical structure, thus producing the graphical aspect of the document. Presentation rules are expressed in a declarative language, called P. They specify such parameters as font, color, spacing, line length, indentation, justification, etc. Presentation rules are gathered in presentation models. A presentation model contains all rules necessary to specify the graphical aspect of all types of elements and attributes defined in a DTD. Several presentation models may be associated with a DTD; the graphical aspect of a type of document can then be changed globally, just by changing the current presentation model.
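The separation between logical structure and presentation models can be sketched as follows. This is not the P language, whose syntax is not shown in this paper; the rules are plain Python dictionaries, and the element types, property names and model names are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """A node of the logical structure: a typed element with optional text."""
    kind: str
    text: str = ""
    children: list = field(default_factory=list)


# Two hypothetical presentation models for the same document type.
# Each one associates presentation rules with element types.
SCREEN_MODEL = {
    "Heading": {"font": "Helvetica", "size": 18},
    "Paragraph": {"font": "Times", "size": 12},
}
SLIDE_MODEL = {
    "Heading": {"font": "Helvetica", "size": 36},
    "Paragraph": {"font": "Helvetica", "size": 24},
}


def render(element, model):
    """Apply a presentation model; return (text, style) pairs in document order."""
    output = []
    if element.text:
        output.append((element.text, model.get(element.kind, {})))
    for child in element.children:
        output.extend(render(child, model))
    return output


document = Element("Document", children=[
    Element("Heading", "Introduction"),
    Element("Paragraph", "The Web is a distributed hypermedia system."),
])

# The logical structure is untouched; only the model changes.
on_screen = render(document, SCREEN_MODEL)
as_slide = render(document, SLIDE_MODEL)
```

Because the rules are attached to element types rather than to individual elements, switching from one model to the other changes the whole document at once.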
Presentation models also define different views, which present in different windows the same logical structure with different graphical aspects. For instance, the table of contents of a report is a view that shows only the elements of type SectionHeading and does not use the same fonts and layout parameters as the main view. These views show a formatted representation of documents that is very similar to a printed document: neither the logical structure nor the SGML tags are directly seen by the user, who interacts with the editor through the views displayed on the screen.
Grif is not restricted to textual documents. It can also handle pictures in various formats (bitmaps, pixmaps, gif, jpeg, etc.). It can handle complex objects such as equations, tables, structured graphics, forms, etc. in exactly the same way as the rest of the document. The logical structure of these elements is defined by DTDs, and, due to the power of language P, presentation models can specify the graphical aspect of these elements.
With these basic functions, the user interface presented by Grif is not fundamentally different from the interface of an editor-formatter based on a simpler document model. Nevertheless, its high-level document model allows it to produce highly structured documents that are guaranteed to be consistent with their DTD.
Another interesting aspect of Grif is its GATE function. With GATE, an API allows any program to access the editing functions that are proposed to the human user. Thus an application can drive the editor. GATE also allows developers to extend the editor functionality: an application can specify editing events in which it is interested and the editor gives it control when these events occur. It can then execute some specific treatments that can in turn call editing functions through the API. These two mechanisms are used for developing applications based on the concept of active documents [Quint94].
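The two mechanisms (calling editing functions, and being called back on editing events) can be illustrated with a toy model. None of the names below come from the actual GATE API, which is not described in this paper; `Editor`, `register`, `insert_element` and the event name `element_inserted` are hypothetical.

```python
class Editor:
    """A toy editor that lets applications subscribe to editing events."""

    def __init__(self):
        self.elements = []
        self.handlers = {}

    def register(self, event, handler):
        """Declare that an application is interested in a given event."""
        self.handlers.setdefault(event, []).append(handler)

    def insert_element(self, kind, text=""):
        """Editing function available to the user and to applications alike."""
        element = {"kind": kind, "text": text}
        self.elements.append(element)
        # Give control to every application interested in this event.
        for handler in self.handlers.get("element_inserted", []):
            handler(self, element)
        return element


def add_heading_to_new_section(editor, element):
    """Application code: react to an event by calling an editing function."""
    if element["kind"] == "Section":
        editor.insert_element("Heading", "Untitled")


editor = Editor()
editor.register("element_inserted", add_heading_to_new_section)
editor.insert_element("Section")
kinds = [element["kind"] for element in editor.elements]
```

The handler is itself a client of the editing functions, which is the combination that makes a document "active": an editing action by the user triggers further editing by the application.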
A naive way of adapting an SGML editor like Grif to the Web consists in simply using the HTML 2.0 DTD and writing a presentation model for that DTD. Obviously, it works, but it has a number of disadvantages. Some of them have been presented above (section 2.3): many existing documents cannot be loaded, links are simply treated as ordinary attributes, etc., but there are also user interface problems. Some specific developments have therefore been carried out, for solving user interface problems and for managing incorrect structures.
A presentation model has been written with the intent to facilitate user interaction. In this presentation model, several views are defined, which allow the author to have a better understanding of the structure s/he is building. One view displays a formatted document, as in the most popular browsers. Another view displays the ``source'' form, with all tags. [2] Another view displays only anchors, allowing the user to see at a glance all links contained in the document. At any time, the user may open or close any of these views (see Fig. 1).
Figure 1 - Different views of an HTML document
Views in Grif are ``synchronized'', i.e. they display the same part of the document at the same time. When the user clicks on an element in any view, the same element is highlighted in all views where it is supposed to be displayed. In addition any view can be edited and all changes made in one view are immediately reflected in the other views. This behaviour makes structured editing very easy to use, because, with a well designed presentation model, there is always a view that shows the information needed for performing a given editing operation.
In fact, various versions of this presentation model have been written; different choices of graphic design have been made in each model, thus allowing users to choose the most convenient model for each document or each usage. Models with large characters can be used for displaying slides, models with colors for users having color displays, etc.
As the DTD is very ``flat'', the menus presented to the user when s/he wants to create an element are very long (according to the DTD, almost any element can be inserted anywhere). These menus are therefore difficult to use.
For addressing this problem, two complementary approaches have been taken. The first one consists in modifying the user interface of the editor, the second in adapting the HTML DTD to the behaviour of the editor.
The first approach takes advantage of the extensibility of Grif. In order to permit applications to tailor the editor to their own needs, when they use GATE, the user interface (menu bar, pull-down menus, pop-up menus, etc.) can be modified by an application. This possibility has been used for adapting the user interface to the structure defined by the HTML DTD.
Following the second approach, the HTML DTD has been transposed into the internal Grif structure representation. In this internal representation, elements are gathered in a different way. For instance, all sorts of headings, lists and paragraphs, which appear at the same level in the original DTD, have been separated into three groups, named headings, lists and paragraphs. Thus, instead of a unique, long menu, the editor proposes a first menu for choosing a group and, after this first choice, a second menu for choosing the element to be created.
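The effect of this grouping on the menus can be sketched as follows. The three group names come from the text above; the assignment of HTML 2.0 element types to each group is an assumption made for the example and may differ from the actual internal representation.

```python
# Hypothetical grouping of HTML 2.0 element types.
ELEMENT_GROUPS = {
    "headings": ["H1", "H2", "H3", "H4", "H5", "H6"],
    "lists": ["UL", "OL", "DL", "DIR", "MENU"],
    "paragraphs": ["P", "PRE", "ADDRESS", "BLOCKQUOTE"],
}


def flat_menu():
    """The single long menu that a flat DTD would produce."""
    return [kind for group in ELEMENT_GROUPS.values() for kind in group]


def first_menu():
    """First choice: the group."""
    return list(ELEMENT_GROUPS)


def second_menu(group):
    """Second choice: the element type within the chosen group."""
    return ELEMENT_GROUPS[group]
```

With this grouping, a user who wants a numbered list picks "lists" among three entries and then OL among the list types, instead of scanning one menu containing every element type.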
Other adaptations have been made, that allow users to handle HTML documents more efficiently. Elements such as B (boldface), EM (Emphasis), I (Italics), etc. have been replaced by attributes that can be associated with certain types of elements (the types allowed to appear as contents of these elements in the original DTD). Given the user interface of the editor, attaching attributes to elements and removing them is easier and more natural than surrounding elements with other elements: attributes are manipulated in the same way as fonts or character size in word processors.
HTML 2.0 specifies a unique type of element (INPUT) for representing several types of form components: various types of input areas (text, image, password), various types of buttons (toggle, radio, submit, reset, etc.). The difference between these components is indicated by attributes associated with the INPUT type. We have preferred to specify a different element for each type of component, which allows an author to compose a complex form more easily.
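This mapping between the single INPUT element and distinct internal element types can be sketched as a pair of tables, one used when reading a document and one when writing it. The TYPE values are those of HTML 2.0; the internal element names (`TextInput`, `Toggle`, and so on) are invented for this example.

```python
# HTML 2.0 values of the TYPE attribute, mapped to hypothetical
# internal element types.
INPUT_TYPE_TO_ELEMENT = {
    "text": "TextInput",
    "password": "PasswordInput",
    "checkbox": "Toggle",
    "radio": "RadioButton",
    "submit": "SubmitButton",
    "reset": "ResetButton",
    "image": "ImageInput",
    "hidden": "HiddenInput",
}
ELEMENT_TO_INPUT_TYPE = {
    element: input_type
    for input_type, element in INPUT_TYPE_TO_ELEMENT.items()
}


def import_input(attributes):
    """Turn the attributes of an INPUT tag into (internal type, other attributes)."""
    remaining = {name.lower(): value for name, value in attributes.items()}
    # TYPE defaults to TEXT when it is absent.
    input_type = remaining.pop("type", "text").lower()
    return INPUT_TYPE_TO_ELEMENT[input_type], remaining


def export_input(element, attributes):
    """Turn an internal form component back into the attributes of an INPUT tag."""
    return {"type": ELEMENT_TO_INPUT_TYPE[element], **attributes}


kind, attributes = import_input({"TYPE": "RADIO", "NAME": "size"})
restored = export_input(kind, attributes)
```

Because the two tables are inverses of each other, the external form keeps the single INPUT element required by the DTD while the author works with specific component types.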
With these adaptations, documents are not handled within the editor with exactly the structure specified by HTML 2.0, but, in their external form, they are strictly HTML conformant.
An authoring environment is not only used for creating new documents; it must also allow users to work on existing documents. For that purpose, it must be able to read any HTML document, even those which do not strictly conform to the HTML DTD.
Like any SGML editor, Grif checks the documents it reads and it normally accepts only documents that satisfy the DTD. But the Web is not a typical SGML environment. SGML is often considered too rigid for many Web documents. Flexibility and liberalism are required. Therefore, a specific parser is used for HTML documents. It does not follow the HTML DTD and makes its best effort to accept the structure it analyzes. When producing the internal Grif structure that corresponds to the HTML document, it generates missing elements or attributes; inconsistent sequences of tags and symbols are interpreted in such a way as to produce a more correct structure. In the worst case, invalid tags and symbols are simply ignored.
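The three behaviours described above (supplying what is missing, reinterpreting inconsistent tags, ignoring invalid ones) can be illustrated with a small tolerant tree builder. This is only a sketch built on Python's standard `html.parser`, not the parser used in Grif; the recovery rules, the `KNOWN` tag set and all names are assumptions.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
EMPTY = {"hr", "br", "img"}
KNOWN = HEADINGS | EMPTY | {
    "html", "head", "title", "body", "p", "ul", "ol", "li", "a",
}


class Node:
    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []


class LenientParser(HTMLParser):
    """Build a tree from HTML, recovering from errors instead of rejecting."""

    def __init__(self):
        super().__init__()
        self.root = Node("document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        if tag not in KNOWN:
            return  # worst case: invalid tags are simply ignored
        if tag == "li" and self.current.tag == "li":
            self.current = self.current.parent  # supply the missing </LI>
        node = Node(tag, self.current)
        self.current.children.append(node)
        if tag not in EMPTY:
            self.current = node

    def handle_endtag(self, tag):
        if tag not in KNOWN or tag in EMPTY:
            return
        node = self.current
        while node is not self.root:
            # </H3> after <H2> is read as the end of the open heading.
            same_family = tag in HEADINGS and node.tag in HEADINGS
            if node.tag == tag or same_family:
                self.current = node.parent
                return
            node = node.parent
        # No open element matches: the stray end-tag is ignored.

    def handle_data(self, data):
        if data.strip():
            self.current.children.append(data.strip())


def parse_leniently(source):
    parser = LenientParser()
    parser.feed(source)
    parser.close()
    return parser.root


def outline(node):
    """Nested (tag, children) tuples, convenient for inspecting the result."""
    if isinstance(node, str):
        return node
    return (node.tag, [outline(child) for child in node.children])


tree = outline(parse_leniently(
    "<H2>Presentation</H3><UL><LI>one<LI>two</UL><BLINK>x</BLINK>"
))
```

In this run the mismatched heading end-tag closes the H2, the first list item is closed when the second begins, and the unknown BLINK tags are dropped while their text is kept.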
The result is that the editor has to handle documents that are not fully consistent with the DTD, but, given the way it manipulates the logical structure of documents, Grif can manage such incorrect structures. Obviously, when creating new elements, these elements can only be inserted at a position that is allowed by the DTD, but existing elements do not disturb the editor if they are misplaced.
This behaviour allows the Grif editor to accept documents as they are, but it guarantees that the structure of the documents it manipulates will not become worse than it was before.
Grif not only accepts incorrect document structures, but it also offers several ways of transforming these structures, thus allowing users to improve documents.
The first possibility is provided by standard editing commands. An element that is not at the right place can be moved with the commands Cut and Paste. A missing element can be added, an invalid attribute can be removed or modified, etc.
Other commands are specifically intended for restructuring documents: Split, Merge, Change, and Surround. Split duplicates an element and moves into the new element some elements contained in the duplicated element. In HTML, this makes it possible, for instance, to divide a list into two separate lists. The opposite command is Merge, which makes a single element from two successive elements of the same type.
Commands Change and Surround are more interesting for transforming a wrong structure, especially in the case of HTML documents. The first one changes the type of an element. It proposes to the user a list of types into which the selected element can be changed and the user chooses from that list. The list contains all the types defined in the DTD that are isomorphic to the type of the selected element, i.e. all types that have the same structure. With HTML, this makes it possible to transform an unnumbered list into a numbered list, for instance, or a heading of a given level (Hn) into a heading of another level. In fact, a number of such transformations are possible in HTML documents, because the DTD defines almost flat documents.
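Split and Merge can be sketched on a very simple representation in which an element is a pair (type, list of children). The representation and the function signatures are assumptions made for illustration; in the editor, the split position comes from the user's selection.

```python
def split(element, index):
    """Split an element into two successive elements of the same type.

    Children before `index` stay in the first element; the others move
    into the new one.
    """
    kind, children = element
    return (kind, children[:index]), (kind, children[index:])


def merge(first, second):
    """Make a single element from two successive elements of the same type."""
    if first[0] != second[0]:
        raise ValueError("only elements of the same type can be merged")
    return (first[0], first[1] + second[1])


shopping = ("UL", ["bread", "milk", "nails", "screws"])
food, hardware = split(shopping, 2)
rejoined = merge(food, hardware)
```

Merge undoes Split: applying one after the other on the same elements gives back the original list.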
Command Surround creates a new element that contains existing elements. It can transform for instance a sequence of paragraphs into a list, each paragraph becoming an item of that list.
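Change and Surround can be sketched in the same spirit, with an element represented as a pair (type, list of children). The table of content models below is a drastic simplification of the DTD, written by hand for this example; two types are considered isomorphic here when their entries in that table are equal.

```python
# Hypothetical, simplified content models: the kinds of children each
# element type accepts.
CONTENT_MODEL = {
    "UL": ("LI",),
    "OL": ("LI",),
    "H1": ("#text",),
    "H2": ("#text",),
    "H3": ("#text",),
    "P": ("#text",),
    "LI": ("#text",),
}


def change_candidates(kind):
    """Types with the same content model as `kind` (excluding `kind` itself)."""
    model = CONTENT_MODEL[kind]
    return sorted(
        other for other, other_model in CONTENT_MODEL.items()
        if other_model == model and other != kind
    )


def change(element, new_kind):
    """Change the type of an element, keeping its contents."""
    kind, children = element
    if new_kind not in change_candidates(kind):
        raise ValueError(f"{kind} cannot be changed into {new_kind}")
    return (new_kind, children)


def surround(paragraphs, list_kind="UL"):
    """Create a list around paragraphs; each paragraph becomes an item."""
    return (list_kind, [("LI", children) for _, children in paragraphs])


numbered = change(("UL", [("LI", ["first"]), ("LI", ["second"])]), "OL")
as_list = surround([("P", ["first"]), ("P", ["second"])])
```

Because so many types share the same flat content model in this table, the list of candidates for a heading or a paragraph is long, which reflects the remark above about almost flat documents.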
Command Paste is also a restructuring command in a structured editor. When a user wants to paste some part of a document at a given position in the structure of that (or another) document, the operation may pose a problem. The structure of the part to be pasted may violate the structural constraints imposed by the DTD. Therefore that part must be restructured by the Paste command. This restructuring can be done in a simple way, based on conversion tables which specify how each element type is transformed into some other element types. In the case of the HTML DTD, conversion tables can be built ``by hand'', because the DTD is small and the structures it specifies are simple. The problem is more complex when restructuring must be done with any DTD and with complex DTDs. Further research is needed for addressing this general problem [Akpotsui92].
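A table-driven Paste can be sketched as follows. Both tables are tiny, hand-written assumptions: `ALLOWED_CHILDREN` stands for the constraints of the DTD and `CONVERSIONS` for a conversion table; neither reproduces the tables actually used in the editor.

```python
# Hypothetical extract of the structural constraints.
ALLOWED_CHILDREN = {
    "UL": {"LI"},
    "OL": {"LI"},
    "BODY": {"P", "H1", "UL", "OL"},
}

# Hypothetical conversion table:
# (type of the pasted element, type of the receiving element) -> new type.
CONVERSIONS = {
    ("P", "UL"): "LI",
    ("P", "OL"): "LI",
    ("H1", "UL"): "LI",
    ("LI", "BODY"): "P",
}


def paste(element, parent_kind):
    """Return the element as it should be inserted into `parent_kind`."""
    kind, children = element
    if kind in ALLOWED_CHILDREN[parent_kind]:
        return element  # already valid at this position
    try:
        return (CONVERSIONS[(kind, parent_kind)], children)
    except KeyError:
        raise ValueError(
            f"no conversion from {kind} into {parent_kind}"
        ) from None


item = paste(("P", ["some text"]), "UL")
paragraph = paste(("LI", ["some text"]), "BODY")
```

The approach works here because the table can be enumerated by hand; with an arbitrary or complex DTD the number of (source, target) pairs grows quickly, which is the general problem mentioned above.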
With these structure manipulation commands, Grif can help users to improve the structure of documents, but it does not automatically transform the structure to make it fully consistent with the DTD. This problem is still open, but most applications on the Web do not require a very rigorous structure. Nevertheless, more complex structure conversions are under study and will be proposed in the future.
The previous section presents the basic functionality of the authoring environment built for the Web on the basis of the Grif editor. With its original features, this environment provides Web users with a number of new possibilities: better use of large documents, authoring of more complex documents, easy conversion of existing documents, smooth evolution towards SGML, etc. Some of these possibilities are detailed in this section.
Most filters divide large documents into several small HTML documents, related to each other by links. This form is convenient for browsing, but not for printing and reading on paper. Specific elements (LINK with attributes REL and REV) are defined in HTML 2.0 for describing the structure of these split documents.
When an HTML document is printed, the editor uses these links in order to rebuild the whole original document. In addition, with the possibility offered by Grif to associate different presentation models with the same document, documents can be printed with a graphical aspect that is well adapted to the printed form (smaller characters, page numbers, page headers and footers, multi-column, etc.), while another presentation model is used for displaying the document on the screen, in the form of a scroll, without any page.
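Rebuilding a document from its pieces amounts to following the structural links from the first piece. The sketch below assumes that each piece has already been loaded into a dictionary holding its body and, when present, the target of its "next" link; this data layout, the relation name and the function names are assumptions for the example.

```python
def reading_order(pages, first):
    """Follow the "next" links from `first`; return the page names in order."""
    order = []
    seen = set()
    current = first
    # Stop at the last page, at a dangling link, or if the links loop.
    while current in pages and current not in seen:
        seen.add(current)
        order.append(current)
        current = pages[current].get("next")
    return order


def assemble(pages, first):
    """Concatenate the bodies of all pages in reading order."""
    return "\n".join(pages[name]["body"] for name in reading_order(pages, first))


pages = {
    "intro.html": {"next": "part1.html", "body": "Introduction"},
    "part1.html": {"next": "part2.html", "body": "First part"},
    "part2.html": {"body": "Second part"},
}
printable = assemble(pages, "intro.html")
```

The assembled text can then be formatted with a presentation model designed for paper, independently of the one used on the screen.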
Many large documents include tables and equations. HTML 3.0 will allow these documents to be fully coded in HTML. Producing them directly with Grif will be easy, as the editor is already able to manipulate such elements with a direct manipulation style of interface.
Structured editing on the Web not only makes authoring easier, but it also permits a more advanced browsing functionality. The tool can be used for managing the user environment (hot lists, bookmarks, annotations, etc.) with the full power of a structured editor. Hot lists may be hierarchically organized, for helping the user to quickly locate the URL s/he is looking for. Each annotation may be structured; it can be represented and edited as an HTML document, allowing the user to better organize his/her comments, with links between annotations or to other documents. As they are handled by a structured editor, hot lists, bookmarks and annotations can be rearranged and modified at any time, with the commands used for manipulating any document.
Homogeneity between documents of the same type may be easily achieved by defining specific types of documents. As an example, in an organization such as a department of a university, a specific DTD can be defined for personal pages, another DTD for pages presenting each research group, etc. With these DTDs, each user who writes documents is guided and all documents have the same structure.
Translating these documents into HTML is done by using the Grif export mechanism. This mechanism allows the editor to output the documents it manipulates in different formalisms, which are specified with a specific language called T (for translation). Using this language it is possible to make the editor generate HTML documents, even if their internal structure is different. Thus, different types of documents can be generated in HTML form and this form is homogeneous for all documents of the same type.
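The principle of such an export can be sketched with a translation table that maps the element types of a specific DTD to HTML tags. This is written in Python rather than in the T language, whose syntax is not shown in this paper, and the "personal page" element types are invented for the example.

```python
from html import escape

# Hypothetical translation table for a "personal page" document type.
TRANSLATION = {
    "PersonalPage": "BODY",
    "Name": "H1",
    "Position": "H2",
    "Interests": "UL",
    "Interest": "LI",
}


def export(element):
    """Translate a (type, children) element into HTML text."""
    if isinstance(element, str):
        return escape(element)  # character data: protect &, < and >
    kind, children = element
    tag = TRANSLATION[kind]
    content = "".join(export(child) for child in children)
    return f"<{tag}>{content}</{tag}>"


page = ("PersonalPage", [
    ("Name", ["A. Author"]),
    ("Position", ["Research & teaching"]),
    ("Interests", [("Interest", ["SGML"]), ("Interest", ["Hypertext"])]),
])
html_text = export(page)
```

Since every personal page goes through the same table, all of them come out with the same HTML structure, whatever the habits of their authors.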
This way of producing HTML documents can also be used for large, structured documents. A DTD can be defined for these documents, and when they are edited with Grif, they can be exported under the form of several small HTML documents, with all necessary links representing the global structure. Readers can browse them efficiently through the net, but they can also print them in a convenient way (see section 4.1).
Grif is a multi-DTD editor that can handle different types of documents simultaneously. Even with the adaptation to HTML described in section 3.2, it is not restricted to HTML documents. Thus, in the same environment, users can simultaneously handle HTML documents and any other type of SGML document.
Documents written with SGML DTDs can be converted into HTML, as indicated in section 4.3. But one can also imagine that documents written with these DTDs will be exchanged directly, without translation to HTML. This will allow other users to exploit all possibilities offered by these DTDs, for reading these documents or for editing them, when collaborative work becomes possible on the Web.
Working with different DTDs is not a problem, because structure transformations are made possible by the editor, as explained in section 3.3. Most of these transformations are generic: they can be made with any DTD, and between documents written with different DTDs.
SGML documents will be used for exchanging information within small communities for which specific DTDs are required. The HTML DTD will be used in a larger community, for communicating with any WWW user.
This will allow a soft evolution towards SGML and towards more structured documents, as proposed in [Sperberg94].
The authoring environment presented in this paper allows authors to produce better HTML documents more comfortably. Authors work on a formatted representation of documents and do not have to care about HTML syntax. This environment maintains full compatibility with existing tools available on the Web, as it can handle any HTML document, even those that do not strictly conform to the HTML DTD. But it also allows different types of documents to be written and accessed through the Web.
This approach permits a smooth transition from the current situation to a more structured representation of documents and opens the doors of the Web to SGML documents. It also allows simple documents to coexist with more complex ones. Simple things remain simple; complex things become simpler.
Grif makes the task of Web authors simpler, but readers also take advantage of this evolution towards more powerful tools.