TBK-HTML: a filter for using ToolBook as a front-end editor on the Web

Frega N., Volpentesta A., Della Gala S.

Giuda Lab. University of Calabria, ITALY

Abstract

Recently, many authoring tools have been developed to help users present their documents on the Web easily. One class of such tools consists of filters or translators from some application shell to HTML. Their development has essentially been motivated by two main reasons:

At Giuda Lab, we have developed a filter, called TBK-HTML, which converts a ToolBook document into standard HTML files. Such a filter allows one to use the ToolBook shell as a front-end editor for publishing on the Web and, in our opinion, this could increase efficiency and flexibility in the hypermedia document creation process. This paper describes the solutions to the implementation problems we faced in developing TBK-HTML, as well as its use and effectiveness.

Introduction

Accessing WWW documents through a Mosaic-like browser gives the user a huge amount of on-line information. At present, the editing of Web documents is left entirely to the server administrator, who organizes and publishes the information on his site using HTML, the usual standard for creating hypermedia documents on the Web. Letting users show their work on the Web is as important as giving them access to existing information. Knowledge of HTML can be regarded as a barrier for the many W3 users who are not yet familiar with this mark-up language, and this could hinder the diffusion of the WWW. To remove this obstacle, many tools have been developed; one class consists of front-ends for on-line composition of HTML pages ([FRG94]), another of filters or translators from some application shell to HTML. In particular, many tools of the second class have been developed for the most popular productivity application shells (especially word processors); they are generally oriented to creating HTML documents "ex novo", and in any case none of them can handle hypertext-structured documents ([DRK94],[LAM94],[DAY94]).

At Giuda Lab, we have created a tool for automatically exporting to the Web existing hypermedia documents saved in a format other than HTML, namely the TBK format. Moreover, it lets W3 developers work in a user-friendly environment naturally oriented to hypertext creation. More specifically, we have built a converter/translator from ToolBook documents to standard HTML 2.0. ToolBook was chosen as the source application for the conversion process for several reasons that seem quite relevant:

For these reasons it seems natural to join the hypermedia environment of ToolBook with that of W3. TBK-HTML could open the door to W3 for ToolBook users who have little or no knowledge of HTML and want to use their familiar application shell for publishing on the Web.

How the TBK-HTML filter works

As stated in the introduction, the filter translates existing hypermedia ToolBook documents into HTML files that can be made available on the Web.

To perform this translation we have written an application (using OpenScript, the ToolBook scripting language) [TBK94] that analyzes existing TBK books and converts them into HTML pages. The application runs inside the ToolBook environment, which makes the conversion easier and faster and keeps the filter simple to use. The procedures of our program can be grouped into three main phases: opening the TBK book and initialization, TBK object parsing, and saving the HTML documents.

These phases outline how the filter works and how the TBK file conversion occurs. After prompting the user, our application opens a TBK file and, going through the pages of the book, locates and parses the different objects on each page. These objects, which can contain text, a picture, hotwords, buttons and so on, are then converted into an HTML representation by inserting the appropriate tags. When the parsing of the page content is finished, the translation result is saved in an HTML page, and the program moves on to the next page to be analyzed. The whole process ends when all the pages of the book (or the selected ones) have been translated. As output, the user obtains an HTML document ready to be put on-line.
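The page-by-page loop described above can be sketched as follows. This is only an illustrative Python outline, not the OpenScript code of the filter; the names `TbkObject`, `TbkPage`, `convert_object` and `convert_book` are hypothetical, and only two object types are handled.

```python
from dataclasses import dataclass, field


@dataclass
class TbkObject:
    kind: str       # e.g. "field", "picture", "button" (assumed labels)
    text: str = ""  # text content, or an image file name for pictures


@dataclass
class TbkPage:
    page_id: int
    objects: list = field(default_factory=list)


def convert_object(obj):
    """Return an HTML fragment for one object (greatly simplified)."""
    if obj.kind == "field":
        return f"<P>{obj.text}</P>"
    if obj.kind == "picture":
        return f'<IMG SRC="{obj.text}">'
    return ""  # other object types are ignored in this sketch


def convert_book(pages, selected=None):
    """Translate each (selected) page into one HTML string keyed by page ID."""
    output = {}
    for page in pages:
        if selected is not None and page.page_id not in selected:
            continue  # the user asked for a subset of the book
        body = "\n".join(convert_object(o) for o in page.objects)
        output[page.page_id] = f"<HTML><BODY>\n{body}\n</BODY></HTML>"
    return output
```

Calling `convert_book(pages, selected={2})` would translate only page 2, which corresponds to the option of converting part of a book.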

Opening TBK book and initialization.

In this first phase, the filter lets the user select the book, or the part of it, to be converted. Before it starts parsing the pages of a book, the application asks the user for some information needed to create the HTML document: since the document will be accessible from the Internet, the filter needs to know the name of the server and the disk path of the output HTML files. Other internal variables, such as the number of pages to be parsed and the working directory, are also initialized at this point.

TBK objects parsing

This is the most important phase of the whole conversion process, during which the main structure of the HTML document is created. The filter parses all TBK objects, page after page, up to the end of the book. Before describing how this parsing occurs, we must stress that ToolBook is entirely object-oriented, so the conversion process has to take into account all the objects in a document as well as their properties. ToolBook organizes these objects in a hierarchy. The filter therefore starts parsing from the Book (the root object) and proceeds down through the other objects. When the program reaches the lowest-level objects, it analyzes their properties (such as text, styles, script, etc.).

In short, the parsing of the book objects can be split into the following sub-phases.
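A top-down visit of such a hierarchy can be illustrated with a short depth-first traversal. The sketch below is written in Python under simplifying assumptions: the nested-dictionary layout and the property names `text`, `fontSize` and `script` are hypothetical stand-ins for the real ToolBook object model.

```python
def walk(node, depth=0):
    """Yield (depth, kind, properties) for a node and its descendants, top-down."""
    yield depth, node["kind"], node.get("props", {})
    for child in node.get("children", []):
        yield from walk(child, depth + 1)


# A tiny hypothetical book: one page holding a field and a button.
book = {
    "kind": "book",
    "children": [
        {"kind": "page", "children": [
            {"kind": "field", "props": {"text": "Welcome", "fontSize": 24}},
            {"kind": "button", "props": {"script": "go to next page"}},
        ]},
    ],
}

# The root is visited first, the lowest-level objects last.
visited = [(depth, kind) for depth, kind, _ in walk(book)]
```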

Field and Record Field parsing

Fields and Record Fields have very similar properties. ToolBook users usually place them on the page or on the background to insert text as well as hotwords (i.e. key words for hypertext navigation). The main difference between fields and record fields is that the former are used in the foreground and the latter in the background. In either case, the filter analyzes the text contained in both objects in the same way. In this parsing phase, when the filter meets a field, the text is translated by using HTML heading tags (<H1>, <H2>, <H3>, etc.) chosen according to the font size of each paragraph. Moreover, for every hotword found, a Script Analyze procedure is launched. This procedure searches the hotword script for a link command (such as go to page, send next/back, to page, fxzoom to page, show field). When a link command is met, the procedure checks the link destination; the destination page ID is stored in a local variable and then used in the output HTML tag <A HREF=http://hostname/path/pageID.html>hotword</A>. Note that an additional HTML page is created for every hidden field.
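The two steps just described, choosing a heading tag from the font size and turning a hotword script into an anchor, can be sketched in Python as follows. The size thresholds, the simplified `go to page id N` script shape and the function names are assumptions made for illustration; the paper does not state the exact mapping or the script grammar handled by the Script Analyze procedure.

```python
import re

# Hypothetical thresholds (in points); the actual mapping is not given in the paper.
HEADING_THRESHOLDS = [(24, "H1"), (18, "H2"), (14, "H3")]


def tag_for_font_size(size):
    """Pick a heading tag for large fonts, or P for body-sized text."""
    for minimum, tag in HEADING_THRESHOLDS:
        if size >= minimum:
            return tag
    return "P"


# Matches a simplified link command such as: go to page id 7
# (assumed script shape; real OpenScript link commands vary).
LINK_COMMAND = re.compile(r"\bgo\s+to\s+page\s+id\s+(\d+)", re.IGNORECASE)


def link_target(script):
    """Return the destination page ID found in a script, or None."""
    match = LINK_COMMAND.search(script)
    return int(match.group(1)) if match else None


def anchor(hotword, script, hostname, path):
    """Wrap a hotword in an anchor when its script contains a link command."""
    page_id = link_target(script)
    if page_id is None:
        return hotword  # no link command found: leave the text as it is
    return f'<A HREF="http://{hostname}/{path}/{page_id}.html">{hotword}</A>'
```

The host name and path passed to `anchor` would be the values supplied by the user during the initialization phase.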


Figure 1

Button parsing

Buttons are controls that are normally used to define organizational links, for example to move through pages sequentially. Their scripts contain commands that perform jumps among pages; when the filter finds a button, it launches the same Script Analyze procedure used for field parsing.

Group Parsing

A group is a collection of objects that has properties of its own, like a single object, and that can be referred to by a TBK object type name. Thus, a user can create complex graphics or collections of objects and then bind them together as a group in order to work with them as a single object. The group parsing phase therefore consists of separating each group into its single objects (button, graphic, field, etc.). In this way, the properties of every object are exposed and can be analyzed as in the previous phases.
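Separating groups into single objects amounts to a recursive flattening, since a group may itself contain other groups. A minimal Python sketch, assuming a hypothetical dictionary representation in which a group lists its members under the key `members`:

```python
def flatten(objects):
    """Expand groups recursively so that only single objects remain."""
    flat = []
    for obj in objects:
        if obj["kind"] == "group":
            flat.extend(flatten(obj["members"]))  # dissolve the group
        else:
            flat.append(obj)
    return flat


# Hypothetical page content: a field plus a group that contains a nested group.
page_objects = [
    {"kind": "field", "name": "title"},
    {"kind": "group", "members": [
        {"kind": "button", "name": "next"},
        {"kind": "group", "members": [{"kind": "picture", "name": "map"}]},
    ]},
]

names = [obj["name"] for obj in flatten(page_objects)]
```

After flattening, each remaining object can be passed to the field, button or graphic parsing step according to its kind.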

Graphical object parsing

Since a TBK book usually contains graphical objects such as paintobjects and pictureobjects, the filter must handle them too. Because ToolBook and HTML manage images in different ways, the graphic conversion module of the filter required a program external to the TBK environment. This program, called GifCapture, was implemented in Visual Basic 3.0. In a ToolBook application, the graphic content of an object, for instance a picture, is an integral part of a book page and therefore cannot be exported to an external file; in an HTML document, by contrast, images are inserted in the pages through links to external files, normally in GIF format. During the analysis phase, when the filter meets a paintobject or a pictureobject, it opens a Windows DDE (Dynamic Data Exchange) channel that allows the TBK application to communicate with the GifCapture program. Once this channel has been opened, the graphical object is copied to the Windows Clipboard and pasted into GifCapture, which saves it in a GIF file. Finally, an <IMG SRC...> tag is inserted in the output HTML page.

During object parsing, the output is displayed in a window of the filter main form (see Figure 2), and at the end of the conversion it is saved in a file as an HTML document.

Saving HTML documents.

The filter analyzes one page at a time, and at the end of each page analysis it creates a file with the name "page [index-page]". Because the filter only inserts links to images in the HTML document during the analysis phase, the actual image conversion is carried out at the end of the whole book analysis. In this way the DDE channel is opened just once. The GifCapture program gives each output image file the same name that appears in the corresponding HTML page. Since the conversion process preserves the hypertext link structure, the set of all pages forms an HTML document that can be put on-line on the Web.
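The idea of writing image links first and converting all images in a single batch at the end can be illustrated as follows. This Python sketch makes several assumptions: the file naming rule in `image_name`, the `page<N>.html` file names and the placeholder `convert_images` step (which only creates empty files) are hypothetical and merely stand in for the DDE exchange with GifCapture.

```python
import os
import tempfile


def image_name(page_id, index):
    """Hypothetical naming rule shared by the HTML writer and the image converter."""
    return f"page{page_id}_img{index}.gif"


def save_pages(pages, out_dir):
    """Write one HTML file per page and return the images still to be converted."""
    pending = []
    for page_id, image_count in pages:
        tags = []
        for index in range(image_count):
            name = image_name(page_id, index)
            tags.append(f'<IMG SRC="{name}">')
            pending.append(name)  # only referenced here; converted later
        html = "<HTML><BODY>\n" + "\n".join(tags) + "\n</BODY></HTML>\n"
        with open(os.path.join(out_dir, f"page{page_id}.html"), "w") as handle:
            handle.write(html)
    return pending


def convert_images(pending, out_dir):
    """Placeholder for the single batch conversion run at the end of the book."""
    for name in pending:
        # A real converter would write GIF data; this sketch writes an empty file.
        open(os.path.join(out_dir, name), "wb").close()


# Example: page 1 has two images, page 2 has none.
with tempfile.TemporaryDirectory() as out_dir:
    pending = save_pages([(1, 2), (2, 0)], out_dir)
    convert_images(pending, out_dir)
    written = sorted(os.listdir(out_dir))
```

Because both the `<IMG SRC>` tags and the converter derive file names from the same rule, the links written during page analysis remain valid after the deferred conversion.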


Figure 2. Filter main form: Converting a TBK Page into HTML format

Conclusions

The TBK-HTML filter has been tested at the Giuda Lab, University of Calabria. Currently, some electronic books translated by the filter (in particular, a tourist guide of Calabria) are accessible on our WWW server (http://giuda.deis.unical.it). Our experience has shown the usefulness of TBK-HTML for users at any level of HTML knowledge. The filter allows one to exploit all the capabilities (for instance in handling text, voice and images) of an application shell embedded in a graphical environment such as ToolBook for Windows. At present, only the export process has been implemented in the filter. To make publishing on the Web even easier, TBK-HTML is being further developed to include a module for the import process, since it would be useful to import an HTML document and modify it in the TBK environment.


References

[TBK94]
Asymetrix Corporation, Using OpenScript, Asymetrix Corporation, Washington D.C., 1994

[DAY94]
Day J.R., Sullivan B., Spitnlink J., Solowoj E., "HTML made easy: HTML Claris XTND Translator", Advance Proceedings 2nd International WWW Conference, vol. I, pp. 139-146, Chicago, October '94

[LAM94]
Anton Lam, (anton-lam@cuhk.hk), HTML model for MS Word for Windows (CU_HTML.DOT), Computer Services Centre The Chinese University of Hong Kong, April 94

[DRK94]
Drakos N., "Latex2Html documentation", available on the Web

[FRG94]
Frega N., Volpentesta A., "A Multimedia Bulletin Board in WWW environment", Advance Electronic Proceedings 2nd International WWW Conference, Chicago, 17-20 October '94