People who cannot see the screen can nonetheless access most of the important information contained on most well-constructed web pages. Most information is present as text, and people can use screen reader software to read text audibly or through a refreshable braille display if they wish. However, there are many common non-textual methods of conveying information visually that are much more effective (for people who can see and understand what they see) than text. Maps, charts, graphs, diagrams, math equations, and many other common display methods often are indeed worth a thousand words. It is possible to use non-visual (audio, tactile, or audio/tactile) methods to access non-textual information. In principle, such access is possible with today's technology, although improved tactile display technologies would make it easier. In practice, access to non-textual information requires that the information itself, not just a picture of that information, be available to the browser. We review possible authoring tools and file structures that could make such non-textual information more accessible to people who cannot see or understand visual data, more useful for all users, and more compact than bit-mapped images.
All these accessibility problems arise fundamentally from the "printed page" paradigm of bit-mapped graphics and could be greatly reduced by intelligent use of the capabilities of electronic media. We present a brief overview of the concepts of "smart graphics" - "information behind a picture" and intelligent graphics browsers for accessing such information. We also discuss briefly the presentation methods by which such information could be used by readers, including those who are blind or have severe print disabilities.
Nearly every part of smart graphics technology exists today, but to our knowledge there is no complete package that incorporates everything necessary to author a smart picture, incorporate it into an electronic document, and display it intelligently. This paper is a discussion of what could be, not what is.
Charts, tables, graphs, diagrams, math equations, maps, etc., are common examples of pictures that can be represented by such object-oriented descriptions. There are undoubtedly many representations (e.g., Escher's famous drawings of three-dimensional impossibilities) that would be difficult or impossible to cast in a quantifiable, object-oriented form, but the vast majority of images intended to convey information could be so described.
The "information behind" such representations consists of a list of each object's properties and its position and orientation. Depending on which properties are included, the list may or may not be really "informative".
Consider as an example the simple equation, x equals the fraction whose numerator is y to the power n and whose denominator is 2. It is pictured as:
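In LaTeX notation, the pictured equation reads:

```latex
x = \frac{y^{n}}{2}
```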
It is within the capabilities of current optical character recognition technology to recognize that this picture shows a character x on the left, followed by a double horizontal bar, and then a horizontal line above which is a character y with a raised character n to its right. Below the horizontal line is the character 2. The double horizontal bar may or may not be recognized as an equals sign, and the raised n may or may not be recognized as a superscript of the y character. The font would probably be classified as italic. To our knowledge there is no commercial OCR software that would recognize that the structure on the right is a fraction.
An information map identifying the x, y, and n as math-font characters, the double horizontal bar as an equals sign, the n as a superscript of y, and the right-hand structure as a fraction would be very helpful. This information is equivalent to the page markup for the equation. The maximum amount of information would be a full computable object list that identifies the superscript as a power (as opposed to, for example, a superscripted index).
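As a purely illustrative sketch (not the format of any existing tool), such a full computable object list for this equation might be recorded as a nested structure; here it is expressed in Python, and all field names are hypothetical:

```python
# Hypothetical object list for the equation x = y^n / 2.
# Field names ("type", "base", "exponent", etc.) are illustrative only.
equation = {
    "type": "equation",
    "left": {"type": "variable", "name": "x", "font": "math-italic"},
    "relation": {"type": "operator", "symbol": "=", "meaning": "equals"},
    "right": {
        "type": "fraction",
        "numerator": {
            "type": "power",  # the superscript identified as a power,
            "base": {"type": "variable", "name": "y"},  # not merely a raised character
            "exponent": {"type": "variable", "name": "n"},
        },
        "denominator": {"type": "number", "value": 2},
    },
}
```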
The kind of information contained in such lists is often stored internally by the authoring program that created the picture. If the picture is created by a computational program, the program knows that the picture is of a math equation and exactly what its various parts are. If the authoring tool is a common math editor, the information is less complete, because the meaning of the superscript n is not normally defined. However, if the author has used the math editor correctly, the semantic content stored internally is equivalent to that of the printed picture.
Unfortunately, this internal information is lost if a bit-mapped image is the only "information" retained in the published electronic document. If this "information behind the picture" were made part of the electronic file, then any application smart enough to retrieve and interpret the information would have full access to it.
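How the information would travel with the picture is an open design question; one minimal sketch, assuming a hypothetical convention in which the published figure consists of a bitmap plus a JSON description of its objects, is:

```python
import json

# Hypothetical packaging: the rendered bitmap and the semantic object list
# are published together in one file. The format and field names are
# invented here for illustration; they are not an existing standard.
object_list = {
    "type": "equation",
    "text": "x = y^n / 2",  # plain-text fallback for simple readers
    "objects": ["variable x", "equals", "fraction: y to the power n over 2"],
}

with open("figure1.json", "w") as f:
    json.dump({"image": "figure1.png", "semantics": object_list}, f, indent=2)
```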
We emphasize that the process described above requires absolutely no additional effort on the part of the author. Of course it is not always possible to use authoring tools that are smart enough to know the full information content without some additional input from the author, but there are many programs other than math editors that are capable of creating information lists with little or no extra effort from authors. Examples include some chart-drawing programs, flow-diagram creators, electronic circuit diagram CAD programs, and scientific graphing programs.
If the graphics browser were also sufficiently intelligent, the reader could alter the picture in order to obtain more information or better information than could be presented in a single picture. A simple option that allows the reader to show some objects and not others would, for example, permit a reader to simplify many complex figures by viewing one part at a time.
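In software terms, such an option need be no more than a filter applied to the object list before rendering. A minimal sketch, assuming each object carries a hypothetical "layer" tag:

```python
# Minimal sketch of "show some objects and not others": each object in the
# list carries an assumed "layer" tag, and the browser presents only the
# layers the reader selects. All data here is invented for illustration.
objects = [
    {"layer": "axes", "description": "x and y axes, 0 to 10"},
    {"layer": "data-1997", "description": "sales curve for 1997"},
    {"layer": "data-1998", "description": "sales curve for 1998"},
    {"layer": "annotations", "description": "callout noting the 1998 peak"},
]

def visible(objects, selected_layers):
    """Return only the objects belonging to the layers the reader chose."""
    return [obj for obj in objects if obj["layer"] in selected_layers]

# View one data set at a time to simplify a cluttered figure.
for obj in visible(objects, {"axes", "data-1998"}):
    print(obj["description"])
```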
Many such reasons for wanting data rather than pictures of data exist in nearly all professional fields. For example, an electrical engineer who wanted to improve upon a published electronic circuit would find it extraordinarily convenient to be able simply to read the circuit information into a design program, make modifications, and proceed. This ability would not only avoid a great deal of human labor but also greatly reduce the possibility of errors resulting from human frailties in translating pictures into the information they represent.
Information lists also provide an extraordinarily flexible method of enriching information. Authors would like non-textual information to be displayed in an elegant picture whose qualitative impact is often more important than the quantitative details. Nonetheless they must somehow indicate those details, and the result is often a picture so cluttered that its message is lost. Authors often find it necessary to write many pages of description about various details of pictures. Such information could often be better included as a field keyed to the object under discussion. Details not critical to understanding the article would not need to be included in the text but could be incorporated as information accessible in a field associated with the object(s) of concern.
Authors could also make good use of display options in an intelligent graphics browser. For example, an author could provide a menu permitting several specialized views of different layers of a picture, or separate views of each data set in a group of overlapping data sets.
Smart graphics could also provide a powerful new tool for educators. Authors of pedagogic treatises could write a simple introductory-level text and supplement it with very detailed examples illustrated and explained in figures. A great deal of teaching is done through graphical illustrations, and it is known that a substantial fraction of students simply do not understand many such concepts (e.g., vectors). Students might grasp such concepts more readily if they had the ability to ask figures to reveal information.
An intelligent browser would include some keyboard- and mouse-accessible method to access the object list. Some well-defined objects such as math equations, simple tree diagrams, and many kinds of charts and graphs may be browsed and understood without ever seeing the picture. However, there are many kinds of non-textual information that can be understood only through spatial relationships. The information conveyed by such pictures is accessible only if the image can be seen.
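For the well-defined objects mentioned above, keyboard browsing can amount to a reading-order walk of the object list, with each node's description sent to speech or braille output. A minimal sketch, with the tree structure and wording assumed for illustration:

```python
# Sketch of keyboard-driven, non-visual browsing: walk the object tree in
# reading order and speak (here, print) a description of each node.
# The structure and field names are assumed, not those of any real tool.
equation = {
    "describe": "equation",
    "children": [
        {"describe": "variable x"},
        {"describe": "equals"},
        {"describe": "fraction",
         "children": [
             {"describe": "numerator: y raised to the power n"},
             {"describe": "denominator: 2"},
         ]},
    ],
}

def speak(node, depth=0):
    # A real browser would send this text to a screen reader or braille display.
    print("  " * depth + node["describe"])
    for child in node.get("children", []):
        speak(child, depth + 1)

speak(equation)
```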
For a blind person, a picture can best be "seen" tactually. Consequently, full access requires some device that can display a tactile image of the objects on the video screen. At present there is no commercially available, affordable device that provides such a tactile image. A considerable amount of R&D worldwide is being devoted to possible refreshable tactile display technologies, and we can only hope that an adequate device becomes available in the not-too-distant future. One possible device of the future is described by Fricke, and various discussions of the subject are included in the information linked to Equal Access to Software and Information (EASI).
Until that time, the quickest access method for blind people is to print a hard-copy tactile image of the desired picture and use that tactile image to locate objects and "click" on them for information. The technology of using a tactile image, an external digitizing pad, and an object description of the image accessed in this way was pioneered by Prof. Donald Parkes with his Nomad tablet. The technology is also discussed by Loetzsch.
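In software terms, this Nomad-style interaction reduces to a hit test: a tap on the tactile overlay is digitized to coordinates, and the browser returns the stored description of whichever object's region contains them. A minimal sketch, with the regions and descriptions invented for illustration:

```python
# Sketch of tactile-overlay lookup: a tap is digitized to (x, y) and the
# browser answers with the description of the object whose bounding box
# contains that point. Regions and text are illustrative only.
regions = [
    {"bbox": (0, 0, 40, 30),  "info": "x-axis: time in months"},
    {"bbox": (0, 30, 40, 60), "info": "curve A: measured temperature"},
    {"bbox": (40, 0, 80, 60), "info": "legend and units"},
]

def describe_tap(x, y):
    for region in regions:
        x0, y0, x1, y1 = region["bbox"]
        if x0 <= x <= x1 and y0 <= y <= y1:
            return region["info"]
    return "no object here"

print(describe_tap(12, 45))  # -> "curve A: measured temperature"
```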
We suspect that the use of moderately intelligent browsers and the full range of these alternative access technologies will be sufficient to make most non-textual information accessible to people with print disabilities. Accessing the more complex figures will require skill on the part of the reader in using the ability to pick and choose which objects to represent. It is certainly true, however, that not everything presented by every author will be directly accessible. Nonetheless, the need for human intervention would be a great deal smaller, and consequently information would be much more accessible, if graphics were smart enough.