Discovering Usability Improvements for Mosaic:
Application of the Contextual Inquiry Technique with an Expert User
David R. Britton Jr., Information and Computer Science
Graduate Student
Arthur A. Reyes, Information and Computer Science
Graduate Student
Mosaic is currently the most popular World Wide Web (WWW) navigation tool and has introduced thousands of people to the WWW. Similarly, Mosaic has demonstrated many of the WWW's capabilities and helped many users form their mental models of the WWW. Because of its status, Mosaic's usability has the potential to indirectly help or hinder use and development of the WWW in different ways.
For Mosaic to remain the premier WWW navigation tool, future versions of Mosaic must rectify any current usability problems and provide new functionality. These improvements will facilitate fuller utilization of the WWW's capabilities by the largest number of users.
Expert users often develop workarounds for the usability deficiencies in the tools they use and frequently have ideas for new functionality in these tools. User evaluation techniques exist that allow this type of knowledge to be collected and analyzed. Such analyses can result in recommendations for tool usability improvements. Contextual Inquiry is a user evaluation technique that allows this knowledge to be (1) collected as a tool is used in normal working contexts, and (2) analyzed via the construction and inspection of Affinity Diagrams. We believe that the Contextual Inquiry technique provides a powerful means for evaluating the usability of WWW tools and information spaces.
This paper describes our use of the Contextual Inquiry technique to discover possible usability improvements for Mosaic for X-Windows. This process involved observing an expert as he used Mosaic to navigate information spaces in the WWW, reviewing audio- and videotapes recorded during the observation, reviewing audiotape of pre- and post-experiment interviews, constructing an Affinity Diagram to organize our understandings of how the expert used Mosaic, and deriving a prioritized list of usability improvement proposals by refining the Affinity Diagram. Because this process has so far been executed using only one Mosaic expert, it is not known if the usability improvements discovered will facilitate fuller utilization of the WWW's capabilities by the majority of Mosaic users. Nevertheless, to promote discussion and further research in the Mosaic/WWW community, we offer a list of the lessons learned from our implementation of the Contextual Inquiry technique.
This paper principally discusses the authors' use of the Contextual Inquiry technique to discover usability improvements for Mosaic for X-Windows (version 2.1). Mosaic for X-Windows is currently the most popular hypermedia navigator and browser for the World Wide Web (WWW).
The WWW is a ``wide-area hypermedia information retrieval initiative aiming to give universal access to a large universe of documents. ... What the WWW project has done is provide users on computer networks with a consistent means to access a variety of media in a simplified fashion'' [3].
A picture of the basic Mosaic Document View Window is shown in Figure 1.
We begin by examining our motivations for performing usability analysis on Mosaic and for choosing the Contextual Inquiry technique to accomplish this goal.
Next, we:
We are interested in improving the usability of the Mosaic for X-Windows WWW navigation tool (hereafter referred to as Mosaic). This interest was spawned by our experiences using Mosaic and our difficulties in finding workarounds for its weak areas. This work is highly relevant because Mosaic is currently the premier WWW navigation tool for X-Windows systems. Given the ubiquity of X-Windows systems in academia, research and industry, Mosaic has become the de facto standard in WWW navigators/browsers. Because so many people are using Mosaic, significant changes in Mosaic's usability can have high leverage in affecting the development of the WWW.
In particular, we are interested in studying the usage patterns of expert Mosaic users. This is motivated by our assumption that expert Mosaic users are more likely to actively develop the WWW (by providing information) than are novice Mosaic users. Our goal is to derive a list of concrete usability improvement recommendations for consideration by the developers of Mosaic.
Both the WWW and the information found in it can have complex structures, and both afford a broad range of human-computer interaction types (such as viewing and interacting with images and animations, reading and typing text, and so forth). Because purely mathematical usability models are limited to simpler types of human-computer interaction, we decided that observation of users was the most appropriate method for discovering Mosaic's usability problems. Because we are also interested in the application of video for user observation, we chose to follow the Contextual Inquiry technique.
The Contextual Inquiry technique is described in [1] and [2]. Contextual Inquiry is both a form of field study (or field experiment) and a data analysis technique. In Contextual Inquiry, a fundamental concept is that users are to be studied in their normal working context. Users are studied as they execute ordinary work assignments. Experimenters following the Contextual Inquiry technique observe users working and record both how the users work and the experimenters' interaction with the users. This recording can be limited to hand-written notes, but information bandwidth can be significantly increased with the use of video- and/or audiotape recordings.
According to [1] and [2], a typical application of the Contextual Inquiry technique can occur in this way: Several experimenters spend a day at an organization, interviewing and studying the users to get ``an understanding of [their] work and system usage''[2, p. 189]. Each experimenter typically interviews two users, spending about two and one half hours per user. The users are asked in advance for permission to audiotape or videotape the experiments. Once data is collected from the experiments, Affinity Diagrams are constructed by the experimenters to structure their understandings, ``develop interpretations on which the [experimenters] agree ...''[1, p. 94], and to discover weaknesses in the data. Post-experiment interviews are then conducted to address any weaknesses in the experimenters' understandings.
An Affinity Diagram is created by the experimenters by reviewing the various recordings made during the experiment and writing each idea, understanding, or question on a Post-it Note as it occurs. Hereafter, the Post-it Notes will be referred to as notes. The notes are created, arranged, and rearranged into related groups on a large sheet of paper attached to a wall. This process continues until the groups stabilize. The arrangement of the notes is the resulting Affinity Diagram and represents the structuring of the collective experimenters' understandings of how the user works.
This technique has the advantage that the experimenters can note the way in which a user has organized his or her environment, including hardware and software tools, furniture arrangement, lighting, and so forth. Experimenters strive to understand the purpose of a user's work and how objects in the user's environment support or detract from that purpose. Using videotape allows very careful analysis of a user's interaction with his or her working environment. Suchman and Trigg[6] provide an overview of the full power of analysis of videotape for understanding users' interactions with their working environments.
Because Contextual Inquiry experiments can be classified as field studies or field experiments, they are limited in the same ways as other techniques that only address a subset of the research strategies described in [4]. The process of creating an Affinity Diagram is informal, and it is possible that two different Affinity Diagrams could be created from the same set of notes.
A summary of the activities that we performed in the evaluation described in this paper and our approximate time of involvement (if known) follows. We:
This section of the paper describes the detailed procedure we used to implement the Contextual Inquiry experiment. As we review our procedure, we will note the specific differences between our procedure and the descriptions of it in [1] and [2] and comment on how these differences may have affected our results. We did not perform work modeling or work redesign [1], because we felt that these tasks were outside the scope of our focus.
To gain experience with the planned experimental procedure, we performed a pilot experiment of Mr. Reyes (himself experienced with Mosaic) exploring WWW documents listed in the ``What's New with NCSA Mosaic'' document [5] (hereafter the What's New Page). This WWW document features links to information related to the latest version of Mosaic, and also features a list of significant new documents and sites recently added to the WWW.
Mr. Britton interviewed Mr. Reyes, recorded the dialog between the two of them on audiotape, and videotaped a profile of Mr. Reyes as he sat at the workstation (showing his head, torso, and interaction with the keyboard and mouse). Prior to the interview, Mr. Reyes set up a screen capture video recorder to obtain real-time video of the workstation's screen.
This pilot experiment gave us valuable experience with the equipment, enabled us to establish working relationships with the people responsible for the facilities, and showed us how we could glean a great deal of information from the resulting videotapes.
Review of both videotapes obtained during the pilot experiment allowed us to create a detailed list of assumptions that we believed characterized expert Mosaic users. This list is published on the WWW at:
http://www.ics.uci.edu/~dbritton/2nd-Intnl-WWW-Conf/list-of-assumptions.html.
Such a list of assumptions is discussed in the Contextual Inquiry literature:
The designer begins any inquiry or system design with a set of assumptions. These assumptions concern the nature of work, opinions about design ideas, beliefs about usability, and goals for the inquiry itself. The designer begins with an existing understanding of users' work and system use. This is the designer's entering focus. [2, p. 188]
Upon reflection, we were probably more zealous in creating this list of assumptions than was necessary. We hoped that the list of assumptions would serve as basis for questions to ask the expert during our interviews with him. It turned out that the list was too long and too inflexible for the dynamic nature of the interviews.
Because this paper was originally written for a 10 week course in human-computer interaction, we felt that limiting the number of experiments with expert users would be necessary. Our plan was to go through the entire Contextual Inquiry process for a single expert user, then repeat the process for additional users if time allowed. This approach would allow us to gather more data and improve our methodology over time. Unfortunately, we only had time to perform the complete process for a single expert user. We acknowledge that ``one developer talking to one user is insufficient''[1, p. 94]. As a consequence, our results cannot be generalized.
Our expert Mosaic user is a director of a scientific computing organization at a major university. He possesses a Ph.D. in Mathematics. He is a strong advocate for and participant in the development of the WWW. The expert also works as a lecturer in the areas of computer-based educational technology, computer graphics, and programming.
At the time of our study, the expert typically worked between 10 and 11 hours on a weekday. He spent a minimum of between 2 and 3 hours and a maximum of between 6 and 7 hours per day in ``electronic communication,'' which included telephone, electronic mail, bulletin boards, mailing lists, and Internet navigation. The expert principally used Mosaic on weekends, often over a period of between 4 and 5 hours.
The expert has used Mosaic (indeed, the WWW) to establish and maintain personal relationships. He also likes to use Mosaic for more general WWW exploration and to browse the ``window dressing'' of individuals and institutions that are being ``advertised'' on the WWW.
The expert enjoys taking an active role in early user testing of tools. He likes being on the ``leading edge'' of tool development, using beta versions, because he wishes to make significant contributions to the usability of tools. The expert likes to have a year to experience the leading edge version of a tool, because this gives him the opportunity to establish a working relationship with the tool developers. In this time he makes his desires known to the developers and looks forward to tool improvement. In the meantime, he finds workarounds for deficiencies in the tools. If at the end of a year of use, the tool's developers have not made usability improvements, the expert may become dissatisfied with the tool.
Additional information about the expert may be gleaned from the Index of Understandings (discussed in a later section) derived from the Affinity Diagram.
Together, we and the expert decided that a good way to see a representative cross-section of the expert's Mosaic usage patterns would be for the expert to peruse the What's New Page. Unfortunately, because we did not maintain this focus during the experiment and because the What's New Page was not available at the time of the experiment, we failed to carry out the experiment exactly as planned.
The experiment with the expert took place in three parts in February and March of 1994. First, we interviewed the expert in his office for 1.5 hours. We recorded our dialogue with the expert on audiotape and simultaneously took notes. This pre-experiment interview is recommended by the Contextual Inquiry technique to obtain an understanding of the expert's job:
We ask them to give us an overview of their work and to discuss what specific work they will be doing during our visit. We ask for their opinions of the tools they use. [2, p. 196]
The second part of the experiment occurred immediately after the pre-experiment interview. We and the expert situated ourselves in the university's computer Demo Room near the expert's office, where we and university employees had set up audio- and videotape equipment beforehand.
This experimental environment differed from the expert's normal work environment in two ways: (1) the experiment took place in the Demo Room, where the expert works less frequently than in his office, and (2) the expert wore a headset-mounted microphone for the screen capture video recorder. We originally asked if we could conduct the experiment in the expert's office. He declined because he felt that moving the screen capture video recorder (a cumbersome, moveable rack-mounted device with two color television monitors) into his office would be too great an inconvenience. Moreover, he felt perfectly comfortable conducting the experiment in the Demo Room. At the time of the pilot experiment, one of the expert's employees supported this by telling us that the expert was familiar with the experimental setup we were to use.
We began the experiment by reminding the expert that we wanted him to explore the What's New Page. As noted above, this focus was not maintained. The experiment proceeded with the expert using Mosaic to navigate the WWW first to explore several WWW documents that were referred to him by a colleague via e-mail. During this exploration, the expert demonstrated how he uses Mosaic and a number of other tools in concert with Mosaic. As the expert worked and talked, we asked questions. He explained his ideas for how Internet navigation tools can be made more efficient, demonstrated Internet search tools that he liked, and covered a number of other diversions.
Once this session ended, we began work on our Affinity Diagram. After making initial progress (discussed in the next section of this paper), we conducted the third part of the experiment: a post-experiment interview with the expert in an attempt to answer a number of questions that arose. We recorded this interview on audiotape and took notes as well. We should note that this post-experiment interview was conducted under a number of time constraints. Therefore, not all of our questions could be addressed. Please refer to the Lessons Learned section of this paper for a more complete explanation of this problem. Nevertheless, we were fortunate to obtain feedback from the expert who reviewed an earlier draft of this article.
While a major goal of the Contextual Inquiry technique is to understand the user, our results reflect the interests and views of the expert only as we understood them. As a result, there is no guarantee that our understandings are equivalent to the participant's actual views.
Holtzblatt and colleagues typically have the audio- or videotapes of their experiments transcribed. This way, experimenters can each have an identical copy of the experiment's dialogue. Experimenters read small portions of the transcript, discuss them among themselves to share understandings, write each understanding on a note, and repeat this cycle until the transcript has been covered completely.
We watched the videotape showing the profile view of the expert and simultaneously followed the transcript. Whenever either of us had an idea, understanding, or question, the tape was stopped to allow discussion, a note was written to capture the idea, understanding, or question, and review of the videotape resumed. This process also helped us locate errors in the transcript.
We found the transcript of the experiment to be a very useful resource. We referred to it many times over the range of activities that took place after its creation. It was very easy to find a particular point of interest in the transcript; it would have been much more difficult if we had had to fast-forward and rewind audio- or videotapes during our reviews. In addition to the printed copies of the transcript, we used it on-line as well. It was particularly easy to find portions of the experiment using the search functions in emacs and vi.
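The kind of lookup we did with editor search commands can be sketched as a small Python function. Note that the sample dialogue below is invented for illustration; a real transcript would be read from a file.

```python
import re

def search_transcript(lines, pattern):
    """Yield (line_number, line) pairs for lines matching pattern,
    in the spirit of an incremental search in emacs or vi."""
    regex = re.compile(pattern, re.IGNORECASE)
    for number, line in enumerate(lines, start=1):
        if regex.search(line):
            yield number, line

# Invented sample dialogue, standing in for the real transcript.
sample = [
    "Expert: I keep everything in the Hotlist.",
    "Experimenter: How do you reorganize it?",
    "Expert: I edit the hotlist file by hand.",
]

for number, line in search_transcript(sample, r"hotlist"):
    print(f"{number}: {line}")
```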
Our Affinity Diagram was composed of a large, portable, paper sheet for holding the notes, as recommended by [2]. We mounted this sheet on a wall, and initially placed the notes (written during review of the videotape) on the diagram in no particular order.
Next, we listened to the audiotape of the pre-experiment interview. As we listened, we wrote more notes with ideas, understandings, and questions and added them to the diagram, again, in no particular order.
We then began to arrange the individual notes into groups that had a certain type of affinity for each other. Specifically, we proceeded as follows: A single note was picked at random. The person that picked the note read it aloud. We discussed the idea, understanding, or question written on the note, reviewed the different groups of notes already placed on the working area of the diagram, and placed the current note closest to the note that had the greatest affinity. If the current note had an idea, understanding, or question that did not have a clear affinity to other notes, the current note would be the first idea, understanding, or question of a new group. We intentionally avoided labeling the groups of notes at this stage, as recommended by [1] and [2], to prevent us from constraining our thoughts prematurely.
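The mechanics of this loop can be sketched in Python. We stress that everything concrete in the sketch is our own illustrative assumption: in the real procedure, affinity is a human judgment reached through discussion, not the crude word-overlap measure used here, and the threshold value and note texts are invented.

```python
import random

THRESHOLD = 0.15  # invented cutoff below which a note starts a new group

def affinity(note_a, note_b):
    """Stand-in for the experimenters' judgment of how related two notes
    are, crudely approximated here as word overlap (Jaccard similarity)."""
    words_a = set(note_a.lower().split())
    words_b = set(note_b.lower().split())
    return len(words_a & words_b) / len(words_a | words_b)

def build_affinity_groups(notes, seed=0):
    """Greedy grouping loop: pick a note at random, place it with the
    existing group containing its closest match, or start a new group."""
    pool = list(notes)
    random.Random(seed).shuffle(pool)  # a single note is picked at random
    groups = []  # groups are deliberately left unlabeled at this stage
    for note in pool:
        best_group, best_score = None, 0.0
        for group in groups:
            score = max(affinity(note, member) for member in group)
            if score > best_score:
                best_group, best_score = group, score
        if best_group is not None and best_score >= THRESHOLD:
            best_group.append(note)  # clear affinity: join that group
        else:
            groups.append([note])    # no clear affinity: start a new group
    return groups

# Invented example notes, standing in for our real Post-it Notes.
notes = [
    "hotlist entries are hard to reorganize",
    "hotlist grows too long to scan",
    "keyboard shortcuts differ across hardware",
    "keyboard shortcuts should work on every hardware setup",
]
for group in build_affinity_groups(notes):
    print(group)
```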
When our creation of the notes or construction of the Affinity Diagram caused either of us to become uncertain about a software tool or aspect of Mosaic, we used a nearby X-Windows terminal and attempted to experience the tool or aspect of Mosaic for ourselves. This helped us to achieve a fuller understanding of these areas than would have been possible otherwise.
Notes with questions that we could not answer were placed in an area of the diagram devoted to questions. These questions motivated the need to conduct a post-experiment interview with the expert as mentioned earlier. After the post-experiment interview, we listened to the audiotape, reviewed our notes from that interview, and created new notes for incorporation into our Affinity Diagram. Questions that remained unanswered were judged as to their possible relevance. If we considered a question relevant, we placed its note on the Affinity Diagram in the most appropriate location. Questions that we considered to be trivial or of little relevance had their notes removed from the diagram.
Holtzblatt and Beyer [1] say that 400 to 600 notes are written for between 5 and 8 interviews. This means that they have approximately 50 to 120 notes per interview. Our final Affinity Diagram contained 164 notes.
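The per-interview figure follows from dividing the extremes of the two ranges; a trivial check:

```python
# Fewest notes spread over the most interviews gives the low end;
# most notes over the fewest interviews gives the high end.
notes_low, notes_high = 400, 600
interviews_low, interviews_high = 5, 8

per_interview_low = notes_low / interviews_high    # 400 / 8
per_interview_high = notes_high / interviews_low   # 600 / 5
print(per_interview_low, per_interview_high)  # 50.0 120.0
```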
We found the process of constructing the Affinity Diagram enlightening. As we discussed individual and groups of ideas, understandings, and questions, we experienced breakthroughs in understanding. The following quotation expresses how we were able to form an exciting concept of a ``webizer'' function for Mosaic (explained in a later section):
As we interpret the information, we do not critique ideas. We include ideas that seem technologically impossible or impractical. We ... imagine the impossible. We do not want to constrain our creative ideas. [2, p. 201]
We reorganized the Affinity Diagram to carefully place notes in large rectangular regions. This was performed to clarify major groups. This is an idea that seemed appropriate, but was not specifically mentioned in the literature. The rectangular regions denoted the most important distinguishing characteristics of the ideas, understandings, and questions on the diagram. These regions represented possible recommendations to improve Mosaic, and the expert's personal characteristics, motivations, and tools.
An idea, understanding, or question in a rectangular region that had an affinity for an idea, understanding, or question in another region caused us to place both along the border between the regions. These were grouped and labeled. Examples of these groups include keyboard consistency across hardware configurations, negative functionality changes (what the expert does not want), personal and shared webspace creation, social catalyst mechanisms (tools and motivations for social interaction), and the Hotlist.
After all groups had been formed, those contained within a rectangular region were labeled. A photograph of the final Affinity Diagram configuration is shown in Figure 2.
Figure 3 shows the final Affinity Diagram, annotated to show the major groups (rectangular regions) of ideas, understandings, and questions, and some of the larger subgroups.
After all of the groups were labeled, each group of notes and each individual note was given an index number to facilitate the transformation of the Affinity Diagram into an Index of Understandings, which resembles an outline. The Index of Understandings contains not only the contents of the Affinity Diagram, but also descriptions of our understandings. Our approach of deriving an Index of Understandings from the Affinity Diagram deviates somewhat from that described by Holtzblatt and Beyer:
When done [constructing the Affinity Diagram], we `walk' the affinity, saying what each part is about and brainstorming design ideas for that part. These ideas can be attached directly to the affinity itself. [1, p. 95]
For us, the brainstorming of design ideas occurred incrementally as we constructed the diagram. However, we ``walked'' the diagram to write the Index of Understandings once the diagram was completed. Index numbers were assigned as follows:
Wherever there was an intersection of top-level groups, the name of the intersection (such as Personal and Shared Webspace Creation) was listed in the index for each relevant top-level group (see Figure 3):
...
...
...
After the Index of Understandings was (almost mechanically) derived from the Affinity Diagram, we refined and expanded it for readability. We also made adjustments following a review by the expert. The final Index of Understandings is published on the WWW at:
http://www.ics.uci.edu/~dbritton/2nd-Intnl-WWW-Conf/index-of-understandings.html.
Recommendations to improve Mosaic were easily derived from use of the Contextual Inquiry technique. We determined that the following prioritized list of design changes would make Mosaic more usable for the expert user we studied. We estimated list item priorities based on four rules of thumb: (1) the number of notes used in each group, (2) the amount of discussion generated (with regard to a particular list item) between the expert and us during the experiment and interviews, (3) the nature of the workarounds that the expert must use to accomplish his goals, and (4) the emphasis that the expert placed on the need for the change. Our recommendations follow:
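For illustration only, the four rules of thumb can be imagined as a rough scoring function. The field names, weights, and example data below are our own inventions; in practice we combined the four factors informally rather than numerically.

```python
def priority_score(item):
    """Combine the four rules of thumb into one number. The weights are
    invented; we actually weighed these factors informally."""
    return (item["note_count"]                # (1) notes in the group
            + item["discussion_minutes"]      # (2) discussion generated
            + 10 * item["workaround_burden"]  # (3) severity of workarounds (0-3)
            + 10 * item["expert_emphasis"])   # (4) expert's stated emphasis (0-3)

# Invented example candidates, not our actual recommendation data.
candidates = [
    {"name": "hotlist management", "note_count": 18,
     "discussion_minutes": 12, "workaround_burden": 3, "expert_emphasis": 3},
    {"name": "keyboard consistency", "note_count": 9,
     "discussion_minutes": 6, "workaround_burden": 2, "expert_emphasis": 1},
]
for item in sorted(candidates, key=priority_score, reverse=True):
    print(item["name"], priority_score(item))
```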
This section of the paper discusses difficulties that we discovered in executing the Contextual Inquiry technique. It is given in the hope that the reader will be able to build upon our experience and avoid time-consuming and/or costly mistakes. We will present our lessons learned in roughly the order we experienced them.
Conduct a pilot experiment. This will allow you to get necessary experience with the recording equipment, expose weaknesses in your experimental method, meet people with whom you will work later on, and make you more relaxed about executing the actual experiment.
Realize that how you interact with the system under study may not be the same as how the expert interacts with the system. Thus if you create a long list of assumptions about expert users based on the pilot experiment and your knowledge of the system, much of it may be invalidated when you conduct your experiment. Instead, keep your list of assumptions short and general. This way your assumptions can more easily serve as a basis for questions in the dynamic, interpersonal environment of the experiment.
If you are planning on using a particular transcription service for the first time, or if the transcription service rarely does business in the domain under study, make the transcripts yourself. Our transcript was much more expensive than the original estimate given by the transcriber. The transcriber quoted (over the phone) a per hour charge and stated a transcription rate of 4 hours of transcribing per 1 hour of audiotape. We gave the transcriber 1 hour of audiotape, but the transcription took 11 hours and 35 minutes (nearly 3 times the transcriber's estimate!). We found evidence in the transcript indicating that the transcriber was not familiar with much of the language used during the experiment. We had to make corrections, which took an additional 9 hours.
The logical affinity of a group is quickly forgotten until the group is labeled. We reviewed the unlabeled groups of notes frequently to remind ourselves of the key distinguishing characteristics each possessed, which made constructing the Affinity Diagram more difficult. Even when the notes were arranged according to the rectangular sections of the diagram, it was difficult to remember the meaning of the groups developed thus far.
Affinity Diagram construction allows powerful analysis of human dialog. It really helps to organize thoughts and enable insight. You will be surprised at the power of this technique.
Try to make sure that both you and the expert have enough time for what you need to cover. The post-experiment interview was frustrating because the expert as well as the experimenters were under time pressure. We would have liked to have taken the expert's suggestion to come back the next day. However, this was not possible under the circumstances.
We have described the Contextual Inquiry technique and our implementation of it to discover usability improvements for Mosaic for X-Windows. We reviewed the central concepts of the Contextual Inquiry technique as described by its developers, explained where our implementation differed, and provided evidence that the technique can produce useful information.
Above all we wish to thank our expert informant, not only for participating, but for reviewing an earlier draft of this paper, for offering suggestions in the design of the experiment, and for making us aware of equipment available to aid our inquiry. We wish to thank Jonathan Grudin for his guidance, helpful comments and suggestions on an earlier draft of this article, and for providing funds and equipment loans. We also wish to thank Allen V. Schiano for his help in setting up equipment.
Arthur Reyes received the B.S. degree in Aerospace Engineering from Polytechnic University in 1987. His undergraduate academic honors include graduating summa cum laude and placement on the Dean's List. He held both a Polytechnic University Scholarship and a New York State Board of Regents Scholarship. He worked as a Senior Honors Researcher and undertook original research with a professor in Polytechnic University's Department of Mechanical and Aerospace Engineering.
Mr. Reyes was employed as an Engineer with Northrop Corporation, B-2 Division from September 1987 until August 1992.
Mr. Reyes is currently a second year graduate student in the Department of Information and Computer Science at the University of California, Irvine (UCI). His research interests include multi-lingual software specifications, testing, and formal methods of software development.
David R. Britton Jr. earned a B.S. in Computer Science from the University of the Pacific (UOP) in May of 1991. He was the Outstanding Graduating Senior in the Department of Computer Science at UOP for 1991. Mr. Britton graduated magna cum laude and is a member of the Honor Society of Phi Kappa Phi.
Mr. Britton is a second year graduate student in Information and Computer Science at the University of California, Irvine. His research interests include educational technology, automatic speech recognition as applied to educational technology, and experimental design.
Please send your questions, comments, and suggestions to: dbritton@ics.uci.edu.
This document was generated using the LaTeX2HTML translator written by Nikos Drakos, Computer Based Learning Unit, University of Leeds.
The translation was initiated by dbritton@ics.uci.edu on Sat Sep 10 22:04:42 PDT 1994