HTML Made Easy: The XTND HTML Translator
Jonathan Ryan Day and Brian A. Sullivan
Computer Science & Engineering
Jeff Spitulnik
School of Education
Elliot Soloway
Professor of Electrical Engineering and Computer Science
Professor of Education
Highly Interactive Computing Group (HI-C)
Department of Electrical Engineering and Computer Science
University of Michigan
1101 Beal Avenue
Ann Arbor, MI 48109
Abstract
Accessing documents on the World-Wide Web using NCSA Mosaic as an interface provides an
unparalleled on-line educational information resource. Enabling users to present
their own compositions on the Web is as important as giving them access to existing
information. To make this possible, users must be able to compose documents for
the Web easily.
At the Highly Interactive Computing group (HI-C) at the University
of Michigan, we have created a tool which simplifies the publishing process.
By developing a translator which allows users to create documents in a familiar
application and save as an HTML file, we have eliminated the need to know how
to compose HTML documents. The translator also allows importation of any HTML
document for modification.
We chose to target ClarisWorks for Macintosh as the document composition application.
It is a popular application in the educational market as well as the industry
in general.
Our translator allows users to create documents and presentations containing
text styles and pictures as they normally would in ClarisWorks, and then save
their compositions as HTML files. To do so, users simply select the HTML file
format option in the 'Save As...' Standard File dialog. The translator is also
useful for converting existing ClarisWorks documents to HTML format. It
provides WYSIWYG conversion and allows the user to specify links to other files.
Once saved, the HTML files can be made accessible by uploading them to a WWW
server.
Introduction
The Highly Interactive Computing Group at the University of Michigan consists
of undergraduate and graduate students in education and computer science,
professors from both fields, and professionals.
We work closely with schools to research ways in which computers
can aid and further stimulate the educational process. Through interaction
with students, educators, and administrators at Community High School in Ann Arbor,
we are trying to determine which types of applications and tools are beneficial
for educators and students. We then develop the applications or tools and
test them in the classroom with
the high school students. This gives us information about which types of software
tools are useful and which are inappropriate or ineffective.
Importance of the World-Wide
Web and Mosaic in the classroom
The World-Wide Web makes it possible to find information on virtually
every subject from sources around the world. For students, this type of resource
is a tremendous asset in the learning process. Not only does the Web provide
information that cannot be found anywhere else; with an HTML browser such as
Mosaic, it also presents that information in a form which is new and interesting
to students and other users. From this point on, we will assume NCSA Mosaic 1.0.3
or newer for Macintosh in all discussions of an HTML browser. As the world of
information continues to grow, the Web provides a standard, easy-to-navigate
information structure, while Mosaic adds
functionality and an inviting graphical interface [1]. All of this is very
important when targeting students as users. The easier it is for them to find
the resources they need, the more likely they will be to use and benefit from
them. It is also important to realize that publishing information in the form
of HyperText Markup Language (HTML) documents placed on a Web server for worldwide
browsing is crucial in the
educational process as well. It is relatively easy to get started browsing the
Web to find a plethora of information, even as a novice computer user. However,
this constitutes only half of the two-way information sharing path that the
Web provides. Users need to be able to publish such documents for others to view.
This, however, requires individuals to have some knowledge of the HyperText
Markup Language to compose even the most basic documents.
The HTML barrier to document composition
The HyperText Markup Language is a markup language defined by a Document Type
Definition (DTD) written in the Standard Generalized Markup Language (SGML) [2].
It is the standard format for hypertext documents served by World-Wide Web servers.
An HTML document is a text document containing tags. Tags either dictate the
style of the text they surround or indicate links to other documents and to other
types of resources available on the Web, such as movie clips and audio clips.
Several stumbling blocks exist for novice users trying to begin composing
HTML documents. First, there is the task of finding information on HTML itself.
Many on-line references and primers are easily accessible on the Web, but they
vary in currency, completeness, and clarity, which complicates
the process. For users not familiar with a markup language, the whole framework
may be unfamiliar as well.
Even when armed with documentation and an idea of how HTML is structured, it is
often necessary to expend considerable time reformatting a document to achieve
the desired result. Overall, it is very time consuming to compose documents when
compared with composition through use of a familiar page layout or word processing
application.
Motivation to develop the translator
The desire to allow users to compose HTML documents without having
to learn HTML was sparked by research in the Highly Interactive Computing
Group involving K-12 curricula. We realize how important it is to open the door
to the Web for students without overwhelming them with the
task of learning HTML. We found that, in order to make the procedure of creating
HTML documents seamless for students using ClarisWorks for Macintosh,
it was necessary to develop a translator which provided transparent translation
to HTML format. It would not have been enough to develop a drag-and-drop
conversion utility or an entirely separate translation application. Rather, a
translation option accessible to the user through the 'Save As...' dialog box from
within the application was exactly the desired solution. This was our primary
motivation to develop the translator. We have, however, also realized its
potential for application outside the educational world.
Implementation
The XTND System
Developed by the Claris Corporation, the XTND System allows an application
to use a virtually unlimited number of file formats. The XTND System itself
is a document which contains several routines that are called by XTND-capable
applications. Its job is to present a list of all available translators in the
Standard File dialog, so that the user can select a translator and indicate
which document to read from or write to [3]. The XTND System then loads the
selected translator. From this point the translator does the remaining work:
it takes the available data, either from the application on export or from the
input file on import, and performs the translation. The translator is thus the
intermediate piece in the model, determining which information gets passed
from the application to a destination file or vice versa.
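The division of labor described above can be illustrated with a minimal Python sketch. It is not the XTND API, which is a Macintosh programming interface documented in [3]; the names `Translator`, `TranslatorRegistry`, `available`, and `load` are hypothetical and exist only to show the idea of listing translators, loading the chosen one, and leaving the actual conversion to it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Translator:
    """One independent translator for one file format (hypothetical model)."""
    name: str
    export: Callable[[str], str]   # application data -> file contents
    import_: Callable[[str], str]  # file contents -> application data


class TranslatorRegistry:
    """Plays the role of the XTND System: lists translators, loads one."""

    def __init__(self) -> None:
        self._translators: Dict[str, Translator] = {}

    def register(self, translator: Translator) -> None:
        self._translators[translator.name] = translator

    def available(self) -> List[str]:
        # The list the file dialog would show to the user.
        return sorted(self._translators)

    def load(self, name: str) -> Translator:
        # After loading, the translator does all remaining work.
        return self._translators[name]


registry = TranslatorRegistry()
registry.register(Translator("HTML",
                             export=lambda text: "<P>" + text,
                             import_=lambda data: data.replace("<P>", "")))
registry.register(Translator("Plain Text",
                             export=lambda text: text,
                             import_=lambda data: data))
```

Because each translator is registered separately, adding a new file format in this sketch does not require touching any existing translator.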
The modularity of the XTND System allows
each translator to be developed
independently of other translators. The ability to create import and export
translators for a specific file format without being concerned with other file
formats or translators adds much flexibility in development. For these reasons,
we chose to develop using the XTND System.
XTND HTML Translator
During the export process, the translator has
access to data structures which contain data from the application. This data
arrives in phases. If the data is text, it comes in the form of text runs, which
are continuous streams of text of like style [3]. Each text run may have any
number of the available style characteristics associated with it. These
characteristics include font type, size, style, color, and other formatting
characteristics. During export they determine which tags the translator writes
to the destination HTML text file. The translator also has access to information
about images contained in documents. An image encountered during export is
denoted by an image flag, which the translator checks at the beginning of each
phase. If an image exists, the translator receives a pointer to the image [3],
which it uses to write the image to a file. A more detailed description of how
images are handled appears in the section on functionality mapping. In short,
the translator receives information about the document being exported, then
writes the data to the destination file according to the mapping between
ClarisWorks and HTML described in the section on functionality mapping.
The import process is nearly the reverse of the export process.
During import, the translator reads data from the source file,
which is an HTML text file. The translator reads the text and interprets the
tags to determine what information is to be sent to the application. It sends
this information using data structures similar to those that served as
information sources during export. The appropriate characteristics are set for
the incoming text, which is then passed to the application to be displayed for
the user. Since the input file must be a text file, there are only text runs to
process. Depending on the tags read, however, it may be necessary to open an
image which is linked to within the HTML document. The method followed in such
cases is described in the section on functionality mapping.
The user interface
There
are several pieces which make up the translation package
for use with ClarisWorks. First, there is the XTND System itself, which comes
with all Claris XTND-capable applications and resides in the Claris folder in the
System Folder on a Macintosh. Then, the XTND HTML Translator is located in
the Claris Translators folder in the Claris folder. Finally, to add ClarisWorks-specific
functionality for handling
links and styles, there is HTML Document stationery for ClarisWorks. This
stationery document is a template which is used to aid in composing HTML documents.
It has styles specific to HTML headers, preformatted text, and literal
HTML commands listed in the Style menu as shown in Figure 1. This is done to
provide a method for the user to specify headers and styles from the familiar ClarisWorks
Style menu. Also included in the stationery is a macro with an associated
shortcut button. When the
shortcut palette is displayed, the 'Link' button appears as in Figure 2. When
the user clicks on the macro button, a link is created in the manner described
in the next section.
The
style menu additions and the shortcut macro mentioned
above are the only noticeable differences in ClarisWorks using the HTML
Document stationery as opposed to using no stationery at all. As mentioned above,
the translation is initiated
in the Standard File dialog by selecting the translator from the pop-up list
of available translators. The dialogs are very similar for both export and import
as shown in Figure 3. Therefore, without breaking from familiar user interface
guidelines on the Macintosh [5], we have added functionality not inherent
in ClarisWorks, enabling users familiar with ClarisWorks to translate their documents
to HTML format easily.
Mapping ClarisWorks and HTML functionality
Due to the differences in the types of styles and document formatting
supported by ClarisWorks and HTML, it was necessary to evaluate how to map styles
and formatting between the two representations of a document. This mapping
was the model on which the translator code structure was based. The main issues
to consider in the export mapping from ClarisWorks to HTML were document
formatting; font type, size, and style; and support for links to other resources
or documents. In the import mapping from HTML to ClarisWorks, the issues are
identical but are handled very differently due to the nature of importing versus
exporting.
In
exporting, since margins, multiple columns, page breaks, footnotes, document
headers, and footers do not have direct counterparts in HTML files, these
characteristics were ignored. We assumed the user created the document in a
single column and that footnotes
would be used for a special purpose other than as footnotes, which is described
later in this section. Since the type of font used to display an HTML document
is dependent only on the browser, the font type is irrelevant and is ignored.
The remaining text export issues deal with font attributes.
Font style
and size are important issues in both ClarisWorks and HTML. In ClarisWorks
any size font can be used, but the only method for specifying differences in font
sizes in HTML is to tag the
selected text with header level tags or other style tags, then specify different
font sizes for each header or other tag type in the preferences of the HTML
browser. This assumes use of a browser which allows users to specify such preferences,
as is true in the case of NCSA Mosaic but may not necessarily be true
for other browsers. In order to maintain a certain level of uniformity in font
size mapping we divided font sizes into six different groups, each to be tagged
as a different header. Table
1 shows the mapping between font sizes in ClarisWorks and their corresponding
header tags in HTML. Provided that the font size representations of the different
headers scale down in the HTML browser from largest font for <H1> to
smallest for <H6>, the relative font sizes will look similar, if not identical,
in an HTML browser and ClarisWorks.
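The grouping of font sizes into six header levels can be sketched in Python as follows. The point-size thresholds in `SIZE_GROUPS` are hypothetical placeholders chosen for illustration; the actual groups are the ones given in Table 1.

```python
# Hypothetical thresholds (minimum point size, header tag), largest first.
# The real boundaries are those of Table 1, not these values.
SIZE_GROUPS = [(36, "H1"), (24, "H2"), (18, "H3"), (14, "H4"), (12, "H5")]


def header_tag_for_size(point_size: int) -> str:
    """Return the header tag for the size group a font size falls into."""
    for minimum, tag in SIZE_GROUPS:
        if point_size >= minimum:
            return tag
    return "H6"  # everything smaller than the last threshold


def tag_by_size(text: str, point_size: int) -> str:
    """Surround text with the header tags chosen for its font size."""
    tag = header_tag_for_size(point_size)
    return f"<{tag}>{text}</{tag}>"
```

For example, under these assumed thresholds `tag_by_size("Title", 36)` yields `<H1>Title</H1>`, and any size below 12 points falls into the `<H6>` group.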
In terms of font styles, the translator supports
bold, italics, and underline. ClarisWorks also allows strike through, outline,
shadow, condensed, extended, superscript, subscript, and text colors; however,
these styles do not have equivalent tags in HTML. Superscript numbers and text
colors are used by the translator as link flags and literal HTML text flags, as
explained later in this section. The remaining unmatched styles are ignored, and
the translator treats the run as plain text.
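A minimal Python sketch of this style mapping is shown below. The `TextRun` class is a hypothetical stand-in for the text run structures the translator receives, the use of the `<B>`, `<I>`, and `<U>` tags is an assumption about which tags are written, and the escaping of `&`, `<`, and `>` is likewise an assumption not stated in the text.

```python
from dataclasses import dataclass
from html import escape
from typing import FrozenSet

# Supported styles and the tags assumed for them, outermost first.
SUPPORTED = (("bold", "B"), ("italic", "I"), ("underline", "U"))


@dataclass(frozen=True)
class TextRun:
    """A stretch of text of like style (hypothetical model)."""
    text: str
    styles: FrozenSet[str] = frozenset()


def export_run(run: TextRun) -> str:
    """Wrap a run in tags for supported styles; ignore all other styles."""
    html = escape(run.text, quote=False)
    # Wrap from the innermost tag outwards so that nesting is consistent.
    for style, tag in reversed(SUPPORTED):
        if style in run.styles:
            html = f"<{tag}>{html}</{tag}>"
    return html
```

A run styled only with an unsupported characteristic such as shadow comes out as plain text, which matches the behavior described above.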
Export translation of images
within documents occurs as follows. When the XTND System notifies the translator
of an image during the export process, a file titled filename.gif is created in
the current folder in the Macintosh file system, where filename is the name
specified by the user in the Save dialog. The image information is then stored
in the new file as a GIF image. An <IMG SRC = "filename.gif">
tag, which is a link to filename.gif, is written to the HTML export text file
so that the image can be inlined when viewed with
an HTML browser. This inline image link is generated on the assumption that the
image file will be located in the same directory as the HTML file itself, which
is true during export since the translator ensures this.
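The image step can be sketched in Python as below. The function name `export_image` and its parameters are hypothetical, and the sketch assumes the image data has already been encoded as GIF; the conversion itself is not shown.

```python
import tempfile
from pathlib import Path


def export_image(folder: Path, filename: str, gif_bytes: bytes) -> str:
    """Store the image beside the HTML file and return the inline image tag.

    `gif_bytes` is assumed to be GIF-encoded already.
    """
    image_path = folder / f"{filename}.gif"
    image_path.write_bytes(gif_bytes)
    # A relative reference works because both files share one folder.
    return f'<IMG SRC = "{image_path.name}">'


# Usage: write the image into a scratch folder and obtain the tag.
with tempfile.TemporaryDirectory() as scratch:
    tag = export_image(Path(scratch), "report", b"GIF89a")
```

Returning only the file name, rather than a full path, is what keeps the link valid when the HTML file and the image are uploaded to a server together.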
Even though ClarisWorks
does not provide a method
for users to add certain media such as sound files to documents, which would be
suitable in HTML, there is support for user specified links to facilitate this.
We realize that there are limitations to ClarisWorks in terms of being able
to display all possible types of media, but, with the ability to specify links
explicitly, the user still has full control to add different media and to specify
links to other HTML documents. The 'Link' macro described in The user
interface section above enables users to specify links. Clicking on the macro
button while a selection of text is highlighted causes the selected text, which
becomes the link text, to be changed to blue, underlined text and tagged with
a superscripted footnote number. Footnote text associated with that footnote
number is then created at the bottom of the page in ClarisWorks. Initially the
footnote text is 'INSERT URL HERE.' The user specifies a link to another
document on the Web by typing its
Uniform Resource Locator (URL) as the footnote text. There is also functionality
through the Style menu associated with the HTML Document stationery as described
above, to insert literal HTML text which will not be translated, but rather
echoed to the HTML export text file. A selection of text with 'Literal HTML
Text' style associated with it appears in red. This can be any valid HTML
text including plain text, tags, links to other resources, or anchors. This functionality
makes it possible to
create HTML documents in ClarisWorks which support all possible HTML tags. We
based the translator on Level 1 HTML specifications, but with the literal HTML
text option, any currently valid tags may be used.
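The two conventions just described, footnotes as link targets and literal HTML text that is echoed unchanged, can be illustrated with a short Python sketch. The `Run` class, its fields, and `export_runs` are hypothetical names; the only HTML assumed is the standard anchor form `<A HREF="...">`.

```python
from dataclasses import dataclass
from html import escape
from typing import Dict, List, Optional


@dataclass
class Run:
    """A text run as the export step might see it (hypothetical model)."""
    text: str
    footnote: Optional[int] = None  # superscript footnote number marks a link
    literal: bool = False           # 'Literal HTML Text' style


def export_runs(runs: List[Run], footnotes: Dict[int, str]) -> str:
    """Turn runs into HTML, resolving footnote numbers to URLs."""
    parts = []
    for run in runs:
        if run.literal:
            parts.append(run.text)  # echoed to the output without translation
        elif run.footnote is not None:
            url = footnotes[run.footnote]  # the footnote text holds the URL
            parts.append(f'<A HREF="{url}">{escape(run.text, quote=False)}</A>')
        else:
            parts.append(escape(run.text, quote=False))
    return "".join(parts)
```

For instance, a run `Run("NCSA", footnote=1)` with footnote 1 set to `http://www.ncsa.uiuc.edu/` becomes an anchor around the word NCSA, while `Run("<HR>", literal=True)` is written out exactly as typed.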
When importing HTML
documents into ClarisWorks using the translator, the process is fairly
straightforward. The text is read from the HTML file and the translator checks
for valid tags. Header tags, logical and physical styles, and normal text are
all handled at the time they are read.
The appropriate style flags are set, and the corresponding styles are applied
to the text read in, which is then passed to the application. Any list tags read
are simulated with tabs in ClarisWorks since it has no built-in functionality
for displaying lists. If the tag read is a link to another document, it is
displayed as a numbered footnote at the bottom of the page with blue link text
in the body above. If the tag read is a reference to an inlined image, the
translator attempts to find the image. If the search succeeds, the image is
inserted into the document being displayed in ClarisWorks. If the image file is
not found, the link text is displayed in red. The same is done for any tag
which is not understood by the translator. After importation, the user sees
text with possibly different styles, footnotes denoting any links, and red text
displaying any information which was not understood by the translator, such as
bad HTML code or unsupported tags.
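The import rules above can be sketched with Python's standard `html.parser` module. This is an illustration under stated assumptions, not the translator's code: the class `ImportSketch`, the `KNOWN` tag set, the `(text, color)` run tuples, and the `[image: ...]` placeholder are all hypothetical, and header and style handling is omitted for brevity.

```python
from html.parser import HTMLParser
from pathlib import Path
from typing import List, Tuple

# Tags this sketch treats as understood (an assumed, incomplete set).
KNOWN = {"h1", "h2", "h3", "h4", "h5", "h6", "b", "i", "u", "p",
         "a", "ul", "ol", "li", "img"}


class ImportSketch(HTMLParser):
    def __init__(self, base: Path) -> None:
        super().__init__(convert_charrefs=True)
        self.base = base                        # folder searched for images
        self.runs: List[Tuple[str, str]] = []   # (text, color)
        self.footnotes: List[str] = []          # footnote n is footnotes[n-1]
        self._in_link = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.footnotes.append(dict(attrs).get("href") or "")
            self._in_link = True
        elif tag == "li":
            self.runs.append(("\t", "black"))   # lists are simulated with tabs
        elif tag == "img":
            src = dict(attrs).get("src") or ""
            if (self.base / src).is_file():
                self.runs.append((f"[image: {src}]", "black"))
            else:                               # image not found: show in red
                self.runs.append((self.get_starttag_text(), "red"))
        elif tag not in KNOWN:                  # not understood: show in red
            self.runs.append((self.get_starttag_text(), "red"))

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_link = False

    def handle_data(self, data):
        self.runs.append((data, "blue" if self._in_link else "black"))
```

After `feed()` and `close()`, `runs` holds the text with its display color and `footnotes` holds the link targets in footnote order.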
Use and effectiveness
The XTND HTML Translator
was developed to make composition of documents as easy as possible; however,
some basic knowledge is required. A working knowledge of the Macintosh operating
system and file system structure is imperative.
It is also assumed that the user is familiar with ClarisWorks for Macintosh,
Claris stationery, use of macros
in ClarisWorks, and ClarisWorks tools and menus. It is also important to understand
how the Standard File dialogs for saving and opening documents operate.
These are reasonable requirements. The high school students with whom we conduct
research have all been introduced to these concepts through classroom
instruction and have been able to use the HTML translator successfully.
Building knowledge
The way the translator works
is very conducive to gradually learning more about HTML documents and how
to compose them. Initially, it is easy to create flat documents, that is, documents
with text styles and images but no links to other documents or resources.
Without much instruction about URLs and Web navigation, users can become familiar
with links and start including links to other documents in their own compositions
by using the provided shortcut macro in ClarisWorks. Eventually, the ability
to include any HTML text enables users to go beyond the basics and incorporate
other resources by specifying them in literal HTML text, which is written
directly to the output file without translation. This is helpful in exploring
new resources supported by HTML while still benefiting from the simplicity of
composing in a familiar, user-friendly environment.
Future directions
The XTND HTML Translator
is being developed further
to add even more flexibility in composing HTML documents. One such development
is support for QuickTime movies. ClarisWorks allows QuickTime movies to be
included in documents, but the translator does not currently support them.
Future versions will accommodate movies.
As mentioned earlier,
Level 1 HTML specifications [6] were used to model the current version of
the translator. Since Level 2 HTML documents are proliferating and support is
being added for Level 2 functionality
in HTML browsers such as NCSA Mosaic, future versions of the XTND HTML
Translator will also handle Level 2 compliant documents. As Level 3 HTML or HTML+
is further developed, it will be considered for support in future versions
of the translator as well.
We will continue testing use of the translator
in the educational environment to find ways to further increase its usefulness
and ease of use. As possible improvements to the translator and the included
HTML Document stationery are
identified, they will be implemented in subsequent releases.
Conclusion
Using the XTND HTML Translator greatly
decreases the amount of time required to compose HTML documents while also
decreasing the level of knowledge necessary to begin. It is a valuable tool for
users at all levels of HTML knowledge: it increases document composition
efficiency while still allowing
as much flexibility in manipulating text and images as ClarisWorks allows, and
it provides a means of supporting future developments in the HTML standard.
It also simplifies the task of progressing through the different aspects
of Web publishing. It is possible to publish documents on the Web with little
or no knowledge of HTML, which enables anyone with access to a Macintosh,
ClarisWorks, and the XTND HTML Translator to create HTML documents. The translator
allows more people to participate
in the growth of the Web and move away from being merely observers.
Acknowledgements
Recognition
of the need for the translator and its application in educational curricula
is due entirely to Jeff Spitulnik, School of Education, University of Michigan,
and Elliot Soloway,
Professor of Electrical Engineering and Computer Science and Professor of Education,
University of Michigan. Sean DeMonner, School of Education, University of Michigan,
collected data and instructed a group of high school students about how
to use Mosaic, what the World-Wide Web is, and how to use the XTND HTML Translator
to compose documents to be viewed in Mosaic. He is also currently using
the translator as a key tool in forming a model for establishing schools as Web
sites. Mike Farr, formerly of
Claris Corporation, now with Helios, Inc., gave us help in resolving several issues
specific to ClarisWorks during the development process. Kathy Brade, University
of Michigan, supplied information on Macintosh Toolbox routines. The entire
HI-C Group gave us hardware and software support during development of the
translator. Special thanks to the Community High School students who tested the
translator.
References
[1] Software Development Group, National Center for Supercomputing Applications,
NCSA Mosaic for Macintosh User's Guide Version 1.0, University
of Illinois at Urbana-Champaign, Champaign, IL, 1994.
[2] National Center for Supercomputing Applications, A Beginner's Guide to HTML,
University of Illinois at Urbana-Champaign, 1994.
URL: http://www.ncsa.uiuc.edu/General/Internet/WWW/HTMLPrimer.html
[3] Developer Technical Publications, XTND Programmer's Guide For
XTND 1.3 version 1.3, Apple Computer, Inc., Cupertino, CA, 1991.
[4] Claris Corporation, ClarisWorks for Macintosh Handbook, Claris
Corporation, Santa Clara, CA, 1993.
[5] Apple Computer, Inc., Macintosh
Human Interface Guidelines (Apple Technical Library), Addison-Wesley,
New York, 1992.
[6] Berners-Lee, Tim and Connolly, Dan, A specification
in hypertext, CERN, Geneva, Switzerland, 1994.
URL: http://info.cern.ch/hypertext/WWW/MarkUp/HTML.html
Authors
Jonathan Ryan Day is a senior undergraduate Computer Science student
at the University of Michigan. He has worked with the University of Michigan Libraries
Recon group maintaining data integrity in the on-line catalog searching
system. He has also worked with Learning Center, Ltd, a computer networking
and service company, developing in-house applications. Recently, he has been working
with the Highly Interactive
Computing group at the University of Michigan on several research projects.
Brian Sullivan is a senior undergraduate Computer
Engineering student at the University of Michigan. He has worked at Chase
Manhattan Bank as a systems engineer, at Northern Telecom as a system administrator,
and at the Computer Aided Engineering Network (University of Michigan) as
a network specialist. Recently, he has been working with the Highly Interactive
Computing group at the University
of Michigan on several research projects.
Jeff Spitulnik is a doctoral
student in the School of Education, University of Michigan. He has been one
of the primary links in the project involving the University of Michigan School
of Education, the Highly Interactive Computing group, and Community High School,
Ann Arbor, MI.
Elliot Soloway is a Professor in the Department of
Electrical Engineering and Computer Science, College of Engineering, University
of Michigan. He is also a Professor
in the School of Education. Previously he was an Associate Professor at
Yale University, Department of Computer Science. His area of research is currently
interactive learning environments. In particular, he is concerned with how
learners of all ages will routinely manipulate computational media in pursuit
of their learning and workplace goals. Soloway's research group has produced
MediaText, a multimedia composition environment, that is available commercially
and is being used in hundreds
of classrooms around the country. Soloway is Editor-in-Chief of the Interactive
Learning Environments journal, published by Ablex, Inc. He was a Keynote Speaker
at the ACM Conference on Computer-Human Interfaces in 1989, and has given
numerous tutorials and presentations at other CHI conferences.
_________________
This project was supported, in part, by a grant from the National
Science Foundation
[RED 9353481].
_________________
Contact:
Jonathan Ryan Day, jrday@eecs.umich.edu