Neil G. Scott,
Senior Research Engineer
The Archimedes Project, CSLI
Stanford University, Stanford, CA, 94305
This presentation will discuss and demonstrate web access using the Total Access System developed by the Archimedes Project at Stanford University. The Total Access System makes it easier for disabled individuals to use any computer or computer-based device by clearly separating access requirements from applications. Individual access needs are handled by a personal "accessor" device which connects to any computer through a standardized Total Access Port (TAP). The Archimedes Project is developing a selection of accessors which cover the needs of disabled individuals, regardless of the type or severity of their disabilities, and a selection of TAPs to provide a standard connection to any computer. An appropriate accessor enables a disabled person to use the same browsing tools as everyone else to access information on the web. Currently available accessors provide a selection of alternative input capabilities including special keyboards, speech input, head tracking, and eye tracking. Ongoing research is focused on providing blind users with alternative access to visually presented information, and deaf users with alternatives to spoken text.
In a perfect world, there would be no need to manufacture special access devices for disabled individuals. Products would automatically include all of the input and output capabilities necessary for them to be usable by anyone, regardless of particular physical and cognitive capabilities. In the real world, however, the variability in the needs of individuals with different disabilities makes it impractical to build all of the necessary input and output capabilities into a single device. One real-world need that is becoming increasingly apparent is access to the Internet and World Wide Web. Many companies are working on building accessible web tools. The Archimedes Project at Stanford University is working on a universal approach to access which automatically ensures access to the web. Stanford's Total Access System gives a disabled person the ability to use any computer in the same manner as anyone else. If a person has full access to a computer, it follows that he or she has full access to the web. This paper explains the Total Access System.
There is growing acceptance among access specialists that all products fall into one of three classes.
All applications of computers are linked in some way to information. Computers are tools which receive, process, store, retrieve and deliver information. Information is at the core of our modern society. Anyone who doesn't have adequate access to information is potentially severely disadvantaged. In itself, however, information is neither accessible nor inaccessible. It is the form in which it is represented that makes it one or the other. To succeed in today's world, individuals must have access to information in a form that they can use. Increasingly, workers' productivity and effectiveness are directly linked to how well they are able to find and process information. With appropriate access tools, disabled workers are as capable as everyone else in an information-driven society. While there may be many options for making information universally accessible, factors such as product cost and convenience to all users must be considered in making a choice.
In the past, access has been provided on a case-by-case basis. The needs and capabilities of the user are assessed by experts and the working environment is adapted to accommodate individual needs. This is a time-consuming and inefficient way to provide access and it often results in penalizing disabled users by forcing them to work at a single computer in a single location. In the course of a normal day, most of us use a variety of computers, often in quite different guises and locations, and would be greatly inconvenienced if restricted to doing everything through a single personal computer. Researchers at the Archimedes Project have developed a universal access system which directly addresses the problem of enabling a disabled person to use any computer or computer-based device. The Archimedes Total Access System (TAS) separates the special needs of a disabled user from the applications that are being used. It achieves this by breaking the access problem into two parts: one which deals with providing information to, and receiving information from, a particular user, and the other which deals with connecting to specific host computers and other electronic devices.
Archimedes researchers solved the first part of the problem by developing a personal, portable device called an accessor. An accessor enables a disabled person to perform all input functions and retrieve all output information in whatever manner or format best suits his or her needs and capabilities. They solved the second part of the problem by developing a Total Access Port (TAP) which provides external connections to all user inputs and user outputs on the host device. The two partial solutions are connected by a standardized communications protocol called the Archimedes Protocol which allows any accessor to interact with any TAP.
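The Archimedes Protocol itself is not specified in this paper, but the division of labor it implies can be sketched. In the hypothetical framing below, the accessor encodes each user action as a small self-describing message and the TAP decodes it for replay as host input; the message names and text framing are invented for illustration, not the actual protocol.

```python
# Hypothetical sketch of an accessor-to-TAP message layer. The real
# Archimedes Protocol framing is not described in this paper; the
# event kinds and line format here are invented for illustration.

def encode_event(kind, payload):
    """Accessor side: frame one input event as a text line KIND:payload."""
    if kind not in ("KEY", "MOUSE_MOVE", "MOUSE_BUTTON"):
        raise ValueError("unknown event kind: %s" % kind)
    return "%s:%s\n" % (kind, payload)

def decode_event(line):
    """TAP side: split a framed line back into (kind, payload)."""
    kind, _, payload = line.strip().partition(":")
    return kind, payload

# An accessor could translate a recognized word into a stream of key events:
frames = [encode_event("KEY", ch) for ch in "web"]
decoded = [decode_event(f) for f in frames]
```

Because any accessor speaks the same message layer to any TAP, the pairing is what gives the system its "any user, any host" property.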
The divide-and-conquer strategy provided by the Total Access System leads to many advantages over existing methods for making equipment accessible. The system consists of two components:
1. A personal accessor which handles all disability access functions, and
2. A Total Access Port (TAP) which provides a standardized interface to any type of host computer or computer-based device.
Accessors are designed to perform input and output functions in whatever manner best suits the needs and capabilities of a particular user. In addition, they may incorporate acceleration techniques such as word prediction to minimize the amount of effort required from a user to perform routine operations. Accessors can be as large or as small, as simple or as complicated, as cheap or as expensive, as is necessary to perform the required functions. Accessors can be designed to use any of the techniques that have been previously developed for making existing computers accessible. Additionally, accessors support the use of input and output techniques that would otherwise be impractical due to the computational burden they would place on a host device.
The Total Access Port provides external access to all input and output functions on a host device. It achieves this by emulating the electrical functions of the physical input and output devices connected to the host. Emulation of the physical input/output devices is necessary because it provides full control of the host device without requiring the addition of special hardware or software inside the host. While it is possible to emulate some of the I/O functions on a host computer through its serial or parallel port, this approach was rejected for the TAP because it requires special software to run on the host and it ties up a port that may be required by another device such as a modem or printer. We jokingly refer to the TAP as "stealth technology" because the host cannot tell that it is connected. There is no difference between signals coming from an accessor and those coming from physical devices such as the keyboard or mouse.
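The keyboard side of this emulation can be made concrete. The sketch below assumes PS/2 scancode set 2, in which a key sends a "make" code when pressed and an 0xF0-prefixed "break" code when released; the character table is a small illustrative subset, and the actual TAP works at the electrical level rather than in software.

```python
# Sketch of the scancode-level translation a keyboard-emulating TAP
# performs. Codes follow PS/2 scancode set 2: a make code on key press,
# then 0xF0 followed by the same code on release. Only a tiny subset
# of the key table is shown, for illustration.

SET2_MAKE = {"a": 0x1C, "b": 0x32, "c": 0x21}

def key_tap(char):
    """Return the byte sequence for pressing and releasing one key."""
    make = SET2_MAKE[char]
    return [make, 0xF0, make]  # make code, then break prefix + code

def type_string(text):
    """Expand a short string into the full press/release byte stream."""
    stream = []
    for ch in text:
        stream.extend(key_tap(ch))
    return stream
```

Because the host sees exactly the bytes a physical keyboard would send, no driver or application on the host can distinguish accessor input from manual typing.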
TAPs must provide interfaces to input functions such as keyboards and pointing devices, and output functions provided by a screen display or loud speaker. Our initial research has been focused on the input functions performed by the keyboard and mouse. After investigating a variety of options, we chose to emulate the physical and electrical characteristics of the keyboard and mouse. This leads to a stable design for the TAP since it is not possible for manufacturers to make arbitrary changes to hardware interfaces without causing major compatibility problems. In contrast, software interfaces can be quite unpredictable because they can be arbitrarily changed by any application software.
A unique TAP design is required for each type of computer because of the differences in their physical and electrical interfaces. In general, one TAP design handles all of the systems for each major manufacturer. To date, input TAPs have been produced for Sun, SGI, Mac and IBM PC computers. These four TAPs handle most of the workstations in the workplace because many of the workstation manufacturers are adopting the IBM PS/2 standard in place of their own proprietary keyboard and mouse interfaces. A new TAP is being developed to match the Universal Serial Bus which is due to appear on many new computer products. TAPs are also being developed for a variety of appliances.
TAPs were conceived as devices which provide simultaneous, external access to all input and output functions on a host system. However, because retrieving output information from a computer is considerably more difficult than entering input data, the project was divided into two phases. The first phase, which is almost complete, focused on providing reliable and transparent input to any supported workstation. The second phase, which is just now beginning, focuses on the problems of retrieving output information from the screen of a host system and translating it into whatever form is most suitable for the individual user.
Individuals who cannot see what is displayed on a screen must rely on alternative output representations such as speech or Braille for text, and sound images or haptic devices for non-textual elements. Visually impaired users may require large character displays for text or highly magnified images for graphical elements. The greatest problem in providing these alternative forms of access is obtaining the information from the computer in an appropriate format. The primary reason for this is that almost all computer software is designed to use the video screen as the default output and there is no provision for direct access to the information that is displayed. Because of this, programs which provide alternative outputs must capture the screen information from within the operating system before it is formatted for the screen. As operating systems and applications continue to become more complex, this is becoming incredibly difficult to do.
The Graphical User Interface (GUI) is the major factor contributing to the difficulties of recovering raw screen data from within a computer. Typical GUI displays are a collection of separate, rectangular windows containing textual or graphical information. Each window is created and managed independently by the operating system at the request of the different applications that are running in the machine. Each application can own windows created at different times and located anywhere on the screen. The windows may be arranged on the screen in many different ways: side-by-side like tiles, cascading patterns of overlapping rectangles, or randomly scattered with some windows fully visible and others either partly or fully obscured. At any time, only one window has the focus, meaning it is the one that the user can send information to. It is incredibly difficult to keep track of which window belongs to which application, and exactly what is visible to the user.
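The bookkeeping problem can be illustrated with a small sketch: given windows in a back-to-front stacking order, determine which application's window the user actually sees at a given screen position. The window names, owners, and coordinates below are invented for illustration.

```python
# Sketch of the window bookkeeping an access program must do: track
# which application owns each window and, given the stacking order,
# decide what is actually visible at a screen position. Rectangles
# are (left, top, right, bottom); the data is illustrative.

windows = [  # back-to-front stacking order
    {"app": "editor",   "rect": (0, 0, 600, 400)},
    {"app": "terminal", "rect": (300, 100, 700, 500)},
    {"app": "clock",    "rect": (650, 0, 800, 100)},
]

def contains(rect, x, y):
    left, top, right, bottom = rect
    return left <= x < right and top <= y < bottom

def visible_app(x, y):
    """Scan front-to-back for the topmost window containing the point."""
    for w in reversed(windows):
        if contains(w["rect"], x, y):
            return w["app"]
    return None  # bare desktop
```

Even this toy version shows why overlap is troublesome: the editor owns the pixels at (350, 150) geometrically, but the user sees the terminal there.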
An additional complication with the GUI is that much of the information is represented by graphical elements in the form of small picture icons and various symbols representing selection buttons, information boxes and slider bars for scrolling through the contents of the windows. In many situations, all of the meaningful data is contained in these graphical elements and there is no textual representation available.
Yet another complication of the GUI arises from the multi-tasking capabilities of the computer, which lead to many different windows actively displaying and changing data at the same time. In contrast to older systems in which one stream of output information was presented to the user in a sequential manner, a GUI presents many streams of output information simultaneously. Some of the changes are important, e.g., warning messages, and must be brought to the attention of the user immediately. Other changes, such as the moving hands on an image of a clock face, are part of the routine system operation which the user may or may not take any notice of.
The GUI also leads to related input problems for various groups of users. Activities known as pointing and clicking are used to convey the user's intentions to the GUI. Pointing is performed by physically moving a mechanical pointing device, typically a mouse. The pointing device causes a small cursor image, such as an arrow head, to move about the screen. Clicking is performed by pressing a physical button when the cursor is at a desired location. The meaning of a button click depends on what screen object is under the cursor when the button click occurs. Different screen objects are used to represent the various activities available to the user. Making choices from a menu, specifying the way a program is to behave, selecting fragments of text to be edited, are typical operations that are performed with a pointing device. Pointing and clicking operations require a high level of dexterity and hand-eye coordination. As a consequence, many individuals with visual impairments, blindness and/or physical disabilities find the GUI difficult or even impossible to use.
Current methods for accessing a GUI use an add-in program called a screen reader. This gathers information about what is displayed on the screen and transforms it into a format that has some meaning for the user. Typically, text is converted to synthetic speech or Braille, and graphics are either ignored or converted to some form of sound or touch representation. The information required for creating the alternative outputs is obtained by keeping track of all the messages that pass between the application program and the operating system. These messages are used to construct a text-based off-screen model of the GUI screen. The screen reader software interacts with the off-screen model instead of the GUI screen. If this sounds complicated, it is. Because of the intimate real-time relationships that must be maintained with the operating system and the applications, screen reading programs are among the most complex pieces of software currently used on computers. The growing complexity of GUI systems, and the specialized knowledge and programming resources required to create and maintain a screen reader, make it extremely difficult for access designers to keep up with the ongoing developments in operating systems. Some developers believe we are rapidly reaching a point where only large organizations like IBM or Microsoft will have the knowledge and resources to build screen reading software in a timely manner.
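A minimal sketch conveys the off-screen model idea, assuming an invented message format: intercepted draw messages are folded into a per-window text store, and the screen reader queries that store instead of the pixels. Real screen readers track far more state (fonts, colors, widget types, focus), which is where the complexity described above comes from.

```python
# Toy off-screen model. Messages intercepted between applications and
# the operating system are folded into a text store that a screen
# reader can query. The message dictionaries are invented for
# illustration; real window-system messages are far richer.

class OffScreenModel:
    def __init__(self):
        self.text = {}  # window id -> list of (position, string)

    def handle(self, msg):
        """Fold one intercepted message into the model."""
        if msg["type"] == "draw_text":
            self.text.setdefault(msg["window"], []).append(
                (msg["pos"], msg["string"]))
        elif msg["type"] == "destroy_window":
            self.text.pop(msg["window"], None)

    def read(self, window):
        """Return a window's text in position order, as speech input."""
        return " ".join(s for _, s in sorted(self.text.get(window, [])))

model = OffScreenModel()
model.handle({"type": "draw_text", "window": 1, "pos": (0, 0), "string": "File"})
model.handle({"type": "draw_text", "window": 1, "pos": (0, 40), "string": "Edit"})
```

The fragility is visible even here: any message the interceptor misses, or any new message type an operating system update introduces, silently corrupts the model.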
The Total Access System provides an alternative method for giving blind and visually impaired users access to the output from a computer. Instead of working inside the host computer, the TAS retrieves information directly from the screen display. With appropriate hardware and software, the image on the screen can be captured as a bit map and the information contained within it can be retrieved by pattern recognition and Optical Character Recognition (OCR). While this approach is quite simple in concept, it is extremely challenging to implement due to the amount of information contained in a full screen display and the rapid refresh rates that are used on a video monitor.
This direct retrieval approach is being developed for the Total Access System because it is the only strategy that will meet our design goals of being able to work with any type of host computer without interfering with its operation or requiring the addition of internal software. Several novel approaches have been formulated for reducing the amount of information that must be captured and processed in real time.
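One plausible reduction strategy, sketched here as an illustration rather than as the project's actual implementation, is tile-based change detection: split each captured frame into small tiles, checksum every tile, and re-run recognition only on tiles whose checksum changed since the previous frame. Rows of characters stand in for pixel data, and the tile size is illustrative.

```python
# Illustrative sketch of cutting down the recognition workload on
# captured screen images: checksum fixed-size tiles of each frame and
# reprocess only the tiles that changed. Frames are lists of equal-
# length strings standing in for pixel rows.

import zlib

TILE = 4  # tile width and height, in "pixels"

def tiles(frame):
    """Yield (row, col, checksum) for each TILE x TILE block."""
    for r in range(0, len(frame), TILE):
        for c in range(0, len(frame[0]), TILE):
            block = "".join(row[c:c + TILE] for row in frame[r:r + TILE])
            yield r, c, zlib.crc32(block.encode())

def dirty_tiles(prev, cur):
    """Return coordinates of tiles that differ between two frames."""
    old = {(r, c): h for r, c, h in tiles(prev)}
    return [(r, c) for r, c, h in tiles(cur) if old[(r, c)] != h]

frame_a = ["........"] * 8
frame_b = list(frame_a)
frame_b[1] = ".X......"          # one small change in the top-left tile
changed = dirty_tiles(frame_a, frame_b)
```

Since most of a typical screen is static from one refresh to the next, a scheme like this would let pattern recognition and OCR run only where something actually changed.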
The input functions of TAS have been implemented on four of the most widely used families of computers. TAPs are now commercially available for IBM PC, Macintosh, SGI, and Sun workstations. The TAPs provide identical access to all keyboard and mouse functions on each of the supported platforms.
While it is possible to create many different types of accessor, most of our development effort has been directed to speech accessors. The reason for this is quite simple. If a person can speak reasonably well, speech recognition provides the fastest and most convenient alternative to the keyboard for entering text into a computer. However, speech falls short if a large amount of pointing is required. While we have developed a speech interface that allows all mouse functions to be performed entirely by speech, it is clumsy compared to the normal use of a mouse. The limitations of the speech driven mouse have been overcome by combining speech recognition with head tracking and eye tracking. One of our researchers has been using a combination of head tracking and speech recognition for more than two years. He works with a Macintosh computer and is able to work as quickly and accurately as anyone using the keyboard and mouse. The Total Access System allows these two access technologies to be combined without any side effects or incompatibilities.
Archimedes researchers have also developed several eye tracking devices that can be used as stand-alone communicators or as accessors. A user is able to switch from one type of operation to the other merely by looking at a selection button on the display screen. When used as an accessor, the eye tracker enables a severely paralyzed person to perform all keyboard and mouse functions merely by looking at objects displayed on a computer screen. A full range of mouse functions is supported by the eye tracker but, because of the involuntary movements that occur as people shift their gaze from one place to another, it is extremely challenging to provide fast and accurate mouse control. We are now working on a system which combines both eye and head tracking with voice. We believe that this system will provide the most natural form of text entry and pointing for anyone with suitable speech and sufficient head and eye movement.
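One common way to tame those involuntary movements, sketched below as an illustration rather than as the project's actual algorithm, is a dwell filter: a "click" fires only when successive gaze samples stay within a small radius for a minimum dwell period, so brief saccades never trigger actions. The radius and sample counts are illustrative.

```python
# Illustrative dwell-click filter for eye-tracker pointing. A click
# fires only when DWELL_SAMPLES consecutive gaze samples stay within
# RADIUS pixels of the first sample of the dwell; saccades reset the
# count. Thresholds are invented for illustration.

RADIUS = 20        # allowed jitter around the dwell anchor, in pixels
DWELL_SAMPLES = 5  # consecutive in-radius samples needed to click

def dwell_clicks(samples):
    """Return the sample indices at which a dwell 'click' would fire."""
    clicks, anchor, count = [], None, 0
    for i, (x, y) in enumerate(samples):
        if anchor is not None and \
                (x - anchor[0]) ** 2 + (y - anchor[1]) ** 2 <= RADIUS ** 2:
            count += 1
            if count == DWELL_SAMPLES:
                clicks.append(i)
                anchor, count = None, 0   # require a fresh dwell next time
        else:
            anchor, count = (x, y), 1     # gaze jumped: restart the dwell

    return clicks

# One large jump followed by a steady fixation produces a single click:
gaze = [(300, 300), (10, 12), (11, 11), (12, 10), (10, 11), (11, 12), (12, 11)]
clicks = dwell_clicks(gaze)
```

The trade-off is inherent: a longer dwell period means fewer accidental clicks but slower pointing, which is one reason combining eye tracking with voice is attractive.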
At this stage, the TAPs are functioning very well and new versions are planned for additional workstations such as HP and DEC. The present accessors are limited by the state of the special input technologies such as speech recognition and eye tracking. One major problem is that new speech recognizers are designed to run in the Windows 95 environment and are slow because of the sluggish operation of Windows 95. A great deal of careful software design has been necessary to get the speed performance of the Windows 95 version of the speech accessor close to that of its DOS predecessor. Our highest priority is to develop faster accessors by eliminating the overheads imposed by the Windows 95 operating system.
Setting up a TAS system is extremely straightforward. After turning off the host computer, the keyboard and mouse cables are disconnected from the host and reconnected to marked inputs on the TAP. New keyboard and mouse cables are connected from the TAP to the input ports on the host. At this point the host can be turned back on and rebooted. Everything on the host should come up and behave as usual.
An accessor is either connected to the TAP with a serial null-modem cable or else it is used in a wireless mode to send input information directly to the TAP. All mouse and keyboard functions on the host are immediately available and the user can perform the equivalent of manual keyboard and mouse entry at this point. However, the full advantages of using the TAS require the user to train a selection of TAP Macros that allow large and complex operations on the host to be performed with the input of a single utterance or eye movement to the accessor.
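The macro idea can be sketched briefly: a single recognized utterance expands into a whole stream of keyboard and mouse events delivered to the TAP, while untrained utterances fall back to character-by-character typing. The macro names and event tuples below are invented for illustration; they are not the actual TAP Macro format.

```python
# Illustrative TAP Macro expansion: one accessor utterance stands for
# a whole sequence of host input events. Macro names and event tuples
# are invented for illustration.

MACROS = {
    "save file":    [("key", "ctrl+s")],
    "open browser": [("mouse", "double-click browser icon"),
                     ("key", "return")],
}

def expand(utterance):
    """Map a recognized utterance to the event stream it stands for.

    Trained macros expand to their stored sequence; anything else is
    typed out one character at a time.
    """
    return MACROS.get(utterance, [("key", ch) for ch in utterance])
```

This is where much of the system's speed comes from: a user who can produce only one utterance or eye movement per action still triggers arbitrarily long operations on the host.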
Training a person to use an accessor is usually quite straightforward since the accessor is performing only two main functions, i.e., receiving and processing user input, and sending messages to a TAP. Modern speech recognizers learn the characteristics of a user's voice in well under an hour, but it takes several hours for the person to become confident about how to talk to the host device. In our experience, most new users require between eight and twelve hours of one-on-one training to become proficient at using the speech recognizer. Some computer-savvy users become proficient with about four hours of training.
The Total Access System provides a viable, cost-effective method for making any computer or computer-based device accessible. While its initial purpose is to make it easier for disabled individuals to remain at work or to get into the workforce, the long-term potential is to use the system to increase productivity and to prevent computer users from developing cumulative trauma disorders (CTDs). Initial versions of the system have proven the effectiveness of the concept and we are now receiving serious attention from many large organizations, both in the US and overseas.
Ongoing work is focused on improving the speed of the accessors and developing the hardware and software necessary for enabling blind users to retrieve output information from the host without the need to connect to internal hardware or software components.