CHAPTER 3: USERS, TASKS AND INFORMATION


 

"A man ought to read just as inclination leads him; for what he reads as a task will do him little good."

Boswell’s Life of Johnson

Introduction

In Chapter 1 we said that when talking about hypertext we are referring to people using information to perform some task. These concepts will be discussed further here as we look at some attributes of the user, information and task that can guide us in the design and rôle of usable hypertext applications. The complexity of issues involved and the range of applications and tasks that hypertext may be used for imply that we cannot talk of hypertext as a unified form of presentation any more than we can meaningfully describe a typical text. It is important to understand from the outset therefore that when discussing hypertext we do not seek simplistic answers to questions such as "Is hypertext better than paper?" or "Do people learn more from hypertext systems than standard texts?"

Figure 6: The important interaction variables for hypertext.

 

Users are people. They have skills, habits, motivations, intelligence, intentions and a whole host of other attributes that they bring to the computer when using hypertext. Modern cognitive psychology and ergonomics have accumulated a substantial amount of information about humans using computers. It is not the intention of this chapter to review this work (a useful non-specialist introduction can be found in Shneiderman, 1987a) but rather to introduce readers to its basic orientation so that they might better understand the relevance to hypertext.

The rapid development of information technology over the last decade or so means that to some extent we are all users (even if we are unaware of it), and contemporary thinking rightly stresses that technology should be designed with users’ needs in mind. In the hypertext domain it is likely that potential users will come from all walks of life and age groups. Hypertext applications will not exist only in libraries, schools or offices but will also be found in museums (e.g., HyperTIES), tourist information centres (e.g., Glasgow On-line) and eventually, in the home. Thus when we talk of hypertext and its uses, it is important that we try to place our discussion in a specific context by looking at the first element of our triumvirate and asking ‘who are the target users?’

The information that users deal with when interacting with contemporary computer systems varies tremendously. With hypertext such variation is equally apparent. Hypertext systems can be used to manipulate and present lengthy texts such as journal articles (McKnight, Richardson and Dillon, 1990), encyclopædias (Shneiderman, 1987b), computer programs (Monk et al., 1988) or English literature (Landow, 1990) to name but a few current applications. In fact, there is no reason why any information could not be presented in hypertext form – rather, the question is, looking at the second element of our triumvirate, ‘what sort of information would benefit from such presentation?’

Just as users and information vary, so too do the tasks that can be performed on computers. Software is (or should be) designed with specific tasks in mind which it will support, e.g., desk-top publishing, database management, process control, statistical analysis and so forth. Equally with hypertext, users will perform a variety of tasks – the third part of the triumvirate – and consequently hypermedia must be designed accordingly. In short, when we consider users, information and tasks we draw the conclusion that different implementations of the hypertext concept will be required in different domains.

It makes little sense to talk about users, information and tasks as if they were independent entities because clearly they are not. By definition, a user must be using something, i.e., performing a task, the very act of which implies information transfer. Therefore, a convenient trichotomisation is not possible. This chapter will attempt to describe the relevant issues relating to each of these elements before demonstrating how our understanding of all three is important to hypermedia.

The user as reader

Invariably, the user of a hypertext application will be reading material from a visual display unit (though authoring is increasingly possible and will be discussed in detail in Chapter 5). Thus, it is worth considering what we know about reading and its relevance to screen-displayed text. Reading is one of the most intensively studied cognitive activities, with serious investigation of it as a psychological phenomenon commencing in the last century (for a review of work at that time see Huey, 1908). Since then, researchers have analysed the processes involved in reading, from the level of eye-movements across the page to that of how readers comprehend visually presented material. It is not the intention of this chapter to summarise such work or present sections of it in great detail (a comprehensive review can be found in Beech and Colley, 1987). However a brief description of the salient aspects is relevant to the present discussion.

The psychology of reading

When reading, light enters the eye through the cornea, passes via the aqueous humour to an opening in the iris known as the pupil where it is focused by the lens. From here it passes through the vitreous humour to the innervated portion of the eye called the retina (see Figure 7). From the retina the light signal passes down the optic nerve to the brain. The retina is effectively split down the middle and light signals impinging on the outer side of the retina go to the same outer side of the brain, but those from the nasal side cross at the optic chiasma, just behind the eyes, and go to opposite sides of the brain. This is obviously a gross simplification of the processes involved but serves to highlight the complexity of the transformation of light into image.

Figure 7: The passage of light through the eye.

 

As you read the words on this page, your eye movements may feel smooth but actually consist of a series of rapid jumps and rests, termed ‘saccades’ and ‘fixations’ respectively. Saccades last approximately 25-30 milliseconds and fixations 200-250 milliseconds, so the eyes are stationary for about 90% of reading time. Readers of English normally proceed from left to right, one line at a time, with visual information uptake occurring only during fixations. The number of fixations and the time per fixation increase with text difficulty, with readers of fairly difficult text fixating up to 20 times per line, sometimes twice per word (Just and Carpenter, 1980). As this implies, regressions (i.e., backward movements of the eyes to review already fixated text) also occur. Readers can process a maximum of 10 characters either side of a fixation point, i.e., 20 characters at most (McConkie and Rayner, 1975). This is known as their perceptual span, and in readers of English it is biased to the right of the fixation point.
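The 90% figure follows directly from these timings. Taking mid-range values (fixations of around 225 ms, saccades of around 27.5 ms), the proportion of reading time during which the eyes are stationary is approximately:

\[
\frac{t_{\mathrm{fixation}}}{t_{\mathrm{fixation}} + t_{\mathrm{saccade}}} = \frac{225}{225 + 27.5} \approx 0.89
\]

that is, roughly 90%.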

While there is some agreement about the processes involved at this level, theoretical differences emerge when we come to describe letter and word recognition. It is clear that both letter and word recognition processes occur, perhaps interactively, on the basis of features such as shape, size, form and context, although the relative importance attributed to any one of these factors varies. There has been some success in modelling these processes on computers (see, for example, McClelland and Rumelhart, 1981) but a complete theoretical explanation that accounts for all the empirical observations on these phenomena has yet to be presented.

The situation is even less clear at the level of comprehension, i.e., how do people form an understanding of what they read? It seems obvious that readers draw inferences from particular sentences and form representations at different levels of what is happening in the text. Various models of comprehension have been proposed to explain this. Thorndyke (1977) proposed a set of ‘grammar rules’ by which the reader forms a structure in their mind of how the story fits together. Van Dijk and Kintsch (1983) proposed a very detailed model involving an analysis of the propositions of a text, leading to the development of a ‘macropropositional hierarchy’ influenced by the reader’s model of the situation represented in the text. More recently, Johnson-Laird (1983) and Garnham (1986) have proposed a ‘mental models’ approach to text comprehension that involves the reader representing the meaning of the text as an imaginary, updatable model in their mind.

In view of the lack of agreement on the nature of comprehension, it is not surprising that measuring comprehension has also proved problematic. It might seem a simple enough matter to ask readers a number of questions about the text they have just read, but it is difficult to know whether memory for detail is being tested rather than comprehension. Furthermore, texts may be open to interpretation, the richness of which depends on the reader’s contextual knowledge and appreciation of the author’s message – all factors which make comprehension measurement a complicated issue.

It is also clear that readers acquire knowledge of how texts are structured or organised and can use these structural models to predict the likely meaning of the text (van Dijk and Kintsch, 1983) or the location of information within it. Furthermore, readers have been shown to remember spatial location of information within a text after reading it (Rothkopf, 1971). These are findings pertinent to the development of hypertext applications and will be discussed in greater detail in Chapter 4.

Thus, the psychological study of reading shows us the complexity of the processes performed when reading even simple material such as isolated words and sentences. It is also clear that on matters such as comprehension, psychology has invested heavily in theoretical constructs which attempt to account for mental representation of the text but do not necessarily provide us with useful methods for assessing readers’ understanding of the material. These issues have relevance to the development and use of hypertext systems and will be raised in several guises throughout this book, especially when we talk about usability or applications of the technology in specific domains such as education. In the following section we discuss the potential differences between reading from paper and screen. The psychological view of reading offers an explanatory framework within which some of the findings in this area can be interpreted.

Reading from screens

Not surprisingly, a good deal of research has addressed the potential difference between reading material from paper and reading from screen. A recent review by Dillon et al. (1988) highlighted five potential differences between the media:

• Speed

• Accuracy

• Fatigue

• Comprehension

• Preference

In general, people read 20-30% slower from typical screens, where ‘typical’ is taken to mean a low-resolution 24-line display with white (or green) text on a black background (Muter et al., 1982; Wright and Lickorish, 1983; Gould and Grischkowsky, 1984). However, the emphasis in hypertext on easy selectability of links, multiple windows and so forth has meant that such packages are implemented on systems with large, high-resolution screens with black characters on a white background. Under such conditions (with the addition of anti-aliased characters) Gould et al. (1987) reported no significant difference in reading speed between screen and paper. The explanation probably lies at the level of image quality: the human eye is better able to perceive and rapidly distinguish letters and words presented on paper than those presented on more typical screens; as technology improves and screen quality approaches that of paper, reading speed differences may cease to be an issue for the hypertext user.

‘Accuracy’ of reading usually refers to performance in some form of proof-reading task, although there is debate in the literature about what constitutes a valid task. Typically, the number of errors located in an experimental text has been used as a measure (Wright and Lickorish, 1983; Gould and Grischkowsky, 1984). While it is probably true to say that few users of hypertext will be performing such routine spelling checks, many more users are likely to be searching for specific information, scanning a section of text and so forth. For the more visually or cognitively demanding tasks such as these, a performance deficit for screen-based presentation is more likely (Creed et al., 1987; Wilkinson and Robinshaw, 1987). In a study by the present authors (McKnight, Dillon and Richardson, 1990), subjects were asked to locate answers to a set of questions in a text using either a paper version, a word processor document or one of two hypertext versions. Results showed an accuracy effect favouring paper and the linear-format word processor version, suggesting a structural familiarity effect. Obviously, more experimental work comparing hypertext and paper on a range of texts and tasks is needed.

With both speed and accuracy, a performance deficit may not be immediately apparent to the user. However, the same cannot usually be said of fatigue effects. There is a popular belief that reading from screens leads to greater fatigue, so will hypertext have users reaching for the Optrex? Gould and Grischkowsky (1984) obtained responses to a 16-item "Feelings Questionnaire" which required subjects to rate their fatigue, levels of tension, mental stress and so forth after several work periods on a computer and on paper. Furthermore, various visual measurements such as flicker and contrast sensitivity, visual acuity and phoria, were taken at the beginning of the day and after each work period. Neither questionnaire responses nor visual measures showed a significant effect for presentation medium. These results led the authors to conclude that good-quality VDUs in themselves do not produce fatiguing effects, although the findings have been disputed by Wilkinson and Robinshaw (1987). Nevertheless, to suggest as these latter authors do that Gould’s equipment was ‘too good to show an effect’ throws us back on the definition of a ‘typical’ screen. Since typical for the hypertext user is likely to be of better quality than the average microcomputer screen, it suggests that visual fatigue may be no more of a problem than for a draughtsman facing a piece of white paper illuminated by fluorescent light.

The effect of presentation medium on comprehension is particularly difficult to assess because of the lack of agreement about how comprehension can best be measured. If the validity of such methods as post-task questions or standardised reading tests is accepted, it appears that comprehension is not affected by presentation medium (see, for example, Kak, 1981). However, such results typically involve the use of an ‘electronic copy’ of a single paper document. The hypertext context differs significantly in terms of both document structure and size. A hypertext document is freed from the traditional structure of printed documents and is also likely to be just one member of an inter-related library. The issue of comprehension takes on a new dimension in this context.

It is widely argued that with hypertext, the departure from linear structure makes it difficult for the user to build a ‘mental model’ of the text and increases the potential for ‘getting lost’ (although the extent to which this is also true for complex and extensive paper document systems remains unanswered). We will address issues of navigation and readers’ models in more detail in Chapter 4, but for the present it appears that the cognitive and manipulative demands of hypertext could lead to a comprehension deficit. If there is no time pressure on the user, this deficit may simply appear as a speed deficit – the user takes longer to achieve the same level of comprehension. However, if time pressure is involved a comprehension deficit is more likely to be observed. This may have particular relevance to educational applications.

No matter what the experimental findings are, a user’s preference is likely to be a determining feature in the success or failure of any technology. Several studies have reported a preference for paper over screen (e.g., Cakir et al., 1980), although some of these may now be discounted on the grounds that they date from a time when screen technology was far below current standards. Experience is likely to play a large rôle, but users who dislike technology are unlikely to gain sufficient experience to alter their attitude. Therefore the onus is on developers to design good hypertext systems using high quality screens to overcome such users’ reluctance. What seems to have been overlooked as far as formal investigation is concerned is the natural flexibility of books and paper over VDUs; books are portable, cheap, apparently "natural" in our culture, personal and easy to use. The extent to which such "common-sense" variables influence user preferences is not yet well understood.

Conclusion

To date, the work on reading from screens is useful in highlighting the likely problems that may be encountered by readers using hypertext. However, it must be noted that much of this work was carried out on poorer quality screens than are currently available on many hypertext systems. Furthermore, studies have tended to employ tasks that bear little resemblance to the type of activities which hypertext users perform. As technology improves, any differences resulting purely from image quality should disappear. This still leaves the questions of accuracy, comprehension and preference open, however, and these will be considered further in the context of the text types and tasks which hypertext might be called upon to support.

Classifying information types

We live in a world where books, newspapers, comics, magazines, manuals, reports and a whole host of other document forms are commonplace. We take such a range for granted, yet there are significant differences between these documents in terms of content, style, format, literary merit, usefulness, size and so forth. As with tasks, hypertext will inevitably have more impact on some types of document than others and will almost certainly create new document forms that are not feasible with paper. Hence it would be useful to develop a framework that would facilitate the classification of texts.

At first glance, it might appear that such a classification would be relatively easy to develop. Obvious distinctions can be drawn between fiction and non-fiction, technical and non-technical, serious and humorous, paperback and hardback and so forth, which discriminate between texts in a relatively unambiguous manner. However, such discriminations are not necessarily informative in terms of how the text is used or the readers’ views of the contents – aspects which should be apparent from any typology aiming to distinguish meaningfully between texts.

The categorisation of texts has received some attention from linguists and typographers (see Waller, 1987, for an excellent review). For example, de Beaugrande (1980) defines a text type as:

"a distinctive configuration of relational dominances obtaining between or among elements of the surface text, the textual world, stored knowledge patterns and a situation of occurrence" (p.197)

and offers the following illustrations: descriptive, narrative, argumentative, literary, poetic, scientific, didactic and conversational. However, de Beaugrande freely admits that these categories are not mutually exclusive and are not distinguishable on any one dimension. Waller adds that it is not at all clear where texts such as newspapers or advertisements fit in such a typology, and proposes instead an analysis of text types in terms of three kinds of underlying structure:

• topic structure, the typographic effects which display information about the author’s argument, e.g., headings

• artefact structure, the features determined by the physical nature of the document, e.g., page size

• access structure, features that serve to make the document usable, e.g., lists of contents.

In a more psychological vein, van Dijk and Kintsch (1983) use the term "discourse types" to describe the macrostructural regularities present in real-world texts such as crime stories or psychological research reports. According to their theory of discourse comprehension, such types facilitate readers’ predictions about the likely episodes or events in a text and thus support accurate macroproposition formation. In other words, the reader can utilise this awareness of the text’s typical form or contents to aid comprehension of the material. In their view, such types are the literary equivalent of scripts or frames and play an important rôle in their model of discourse comprehension. However, they stop short of providing a classification or typology themselves and it is not clear how this work can be extended to inform the design of hypertext documents.

From a less theoretical standpoint Wright (1980) describes texts in terms of their application domains:

• domestic (e.g., instructions for using appliances)

• functional (e.g., work-related manuals)

• advanced literacy (e.g., magazines or novels)

She uses these categories to emphasise the range of texts that exist and to highlight the fact that reading research must become aware of this tremendous diversity. Research into the presentation and reading of one type of text may have little or no relevance to other types, and may even demand separate theoretical and methodological standpoints.

It is doubtful if any one classification or typology can adequately cope with the range of paper information sources that abound in the real world. In Wright’s (1980) categorisation, for example, the distinction between functional and domestic blurs considerably when one thinks of the number of electronic gadgets now found in the home which have associated operational and trouble-shooting manuals (not least the home computer). This is not a criticism of any one classification but rather an indication that each has a limit to its range of applicability, outside of which it ceases to have relevance. Thus, one can find situations in which any typology fails to distinguish clearly between texts. We should not necessarily expect classifications designed for typographers to help developers of hypertext systems.

Some classifications aimed specifically at the hypertext domain are beginning to emerge. Wright and Lickorish (1989), for example, distinguish between texts in terms of their information structure and use this to guide the design of hypertext versions. In particular, they highlight linear, modular, hierarchic and matrix document structures and argue convincingly that such structures have implications for the type of linkages, visual appearance and navigation support that need to be provided by authors. Thus, the navigation support necessary for a linear structure (such as a set of instructions) might be a series of loops initiated by the reader through pointing, while readers of modular texts (such as an encyclopædia) will require more explicit information about where they came from and where they currently are in the information space, since looping will not be as common with such texts.

An alternative approach to text classification by Dillon and McKnight (1990) involved using a technique known as repertory grid analysis to describe readers’ views of texts. Rather than devising a formal classification, this work resulted in a framework for considering text types. They concluded that readers conceive of texts in terms of three characteristics: how they are read, why they are read and what sort of information they contain. By viewing text types according to these attributes, it is easy to distinguish between, say, a novel and a journal. The former is likely to be read serially (how), for leisure (why) and to contain general or non-technical information (what), whereas the latter is more likely to be studied or read more than once (how), for professional reasons (why) and to contain technical information which includes graphics (what). This approach can be represented in terms of a three-dimensional ‘classification space’ as shown in Figure 8. Here, three texts are distinguished according to their positions relative to the How, Why and What axes. The descriptors study-skim, work-personal and general-specific may vary and are only intended as examples. Different people may employ very different terms. However, the authors argue that they are still likely to be descriptors that pertain to the attributes How, Why and What.

Figure 8: A three-way classification of texts based on How, Why and What attributes.

 

According to this perspective any particular text may be classified in several ways depending on the reader and their information needs. Not only does this mean that a text may be seen differently by two readers but also that a reader may view a text differently according to their needs at any one time. The hypertext version should thus be designed with these principles in mind, analysing the how, why and what questions in detail before attempting to build an electronic version of a text. The authors use the example of a novel and suggest that, from a classification of this text type in terms of these three aspects, we would not expect a hypertext version to have much potential. However, if we consider the full range of how and why attributes that emerge as a result of analysing novels, one can envisage an educational context where various parts of the text are analysed and compared linguistically, thereby rendering a suitable hypertext version more useful than paper. In other words, a hypertext novel may not be suitable for reading on the train but it may be ideal for the student of English literature.
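For readers who find a concrete representation helpful, the classification space can be sketched as a simple data structure. The following fragment is our illustration, not part of Dillon and McKnight’s work; the numeric scales, axis endpoints and example values are assumptions chosen purely for the example.

```python
from dataclasses import dataclass

@dataclass
class TextProfile:
    """A text as a point in the How/Why/What classification space (hypothetical scales)."""
    how: float   # 0.0 = skimmed ... 1.0 = studied
    why: float   # 0.0 = personal/leisure ... 1.0 = work/professional
    what: float  # 0.0 = general ... 1.0 = specific/technical

def separation(a: TextProfile, b: TextProfile) -> float:
    """Euclidean distance between two texts in the classification space."""
    return ((a.how - b.how) ** 2 + (a.why - b.why) ** 2 + (a.what - b.what) ** 2) ** 0.5

# The chapter's example: a novel and a journal occupy distant corners of the space...
novel = TextProfile(how=0.1, why=0.1, what=0.2)
journal = TextProfile(how=0.9, why=0.9, what=0.9)
# ...but the same novel, studied for a literature course, moves across the space.
studied_novel = TextProfile(how=0.9, why=0.8, what=0.2)

print(f"novel vs journal:         {separation(novel, journal):.2f}")
print(f"studied novel vs journal: {separation(studied_novel, journal):.2f}")
```

The point the sketch makes is the one argued above: the classification belongs to the reader and the occasion of use, not to the text itself, so the same document can legitimately occupy two different positions.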

Conclusion

The classifications of information types that have been proposed are based on numerous criteria: underlying, typographic and macro structures, application domain, readers’ perceptions and attributes of use. No doubt other classifications will emerge, perhaps based on altogether different criteria than any described here. Structural distinctions are likely to remain important as hypertext creates new document types based on innovative electronic structures, but distinctions based on usage patterns and readers’ views will always be relevant. It is unlikely that any one classification scheme is best or suitable for all situations.

Reading as an information manipulation task

By ‘task’, we usually mean the performance of any goal-directed activity. In the context of reading, the term includes identifying, locating and processing relevant material. With texts this may involve looking for a book title and shelf number in a library catalogue, reading a novel, browsing a newspaper, proof-reading a legal contract or studying a mathematical theorem. In fact, if we define reading as the visual-to-cognitive processing of textual and graphical material in an information-seeking manner, and extend its traditional boundaries to include the procedures that necessarily precede contact between eye and source such as scanning bookshelves or selecting a correct edition of a journal (i.e., gaining access to the material), then we begin to get some idea of the diversity of tasks that are performed by readers.

At present, hypertext systems have only scratched the surface of the range of tasks they will eventually support. Indeed, hypertext applications may well eventually give rise to tasks that have not been thought of or cannot yet be effectively performed. Whatever the final outcome, it is clear that they have the potential to play a significant rôle in many reading scenarios.

The diversity of reading tasks

It is worth considering how people use the paper medium to perform information manipulation tasks if we want to understand how hypermedia may not only support them but offer advantages to the user. Laboratory studies of reading have tended, in the main, to concentrate on a very limited subset of tasks such as locating spelling errors on pages of text (Gould et al., 1987), making sense of anaphoric references in sentences (Clark, 1977) or recalling episodes in short stories (Thorndyke, 1977). Although some of this work has been invaluable in advancing our knowledge of the reader, it tells us very little about information usage in the real world or how to design hypertext systems for people.

People do not all interact with texts in the same fashion, and although texts may appear linear, this rarely ensures a serial style of reading. Furthermore, the same text may be read in different ways depending on the reader’s task. This much is intuitively obvious from self-observation. Consider the differences between reading a newspaper and a novel. The former is likely to consist of a mixture of scanning and browsing in an unplanned fashion, reacting to interesting headlines or pictures, while the latter is more likely to involve serial reading of the complete text. Next, consider the differences between reading an interesting article in the newspaper and checking a sports result or looking for the TV page. The former will involve the so-called higher-level cognitive functions of comprehension while the latter will be more of a recognition task. It is clear that while all these scenarios can be described as reading, they vary widely in terms of how and why the reader interacts with the text, as well as the text type under consideration. Characterising these differences would be useful for hypertext developers because it would suggest ways in which hypertext documents should be developed in order to optimise reading performance.

Researchers as early as Huey (1908) were aware of the diversity of tasks typically performed by readers, and to this day it has remained generally accepted that this variety is worthy of consideration. For example, on the subject of electronic journals Wright (1987) drew attention to the fact that, depending on what information is being sought, readers may employ browsing, skimming, studying, cross-checking or linear reading. However, while it is possible to compile lists of likely reading tasks and their performance characteristics, it is entirely another matter to validate them empirically.

Observing task performance

Schumacher and Waller (1985) make the point that reading research has tended to concentrate on outcome measures (such as speed or comprehension) rather than process ones (what the reader does with the text). Without doubt, the main obstacle to obtaining accurate process data is devising a suitable, non-intrusive observation method. While numerous techniques for measuring eye-movements during reading now exist, it is not at all clear from eye-movement records what the reader was thinking or trying to do at any particular time. Furthermore, use of such equipment is rarely non-intrusive, often requiring the reader to remain immobile through the use of head restraints, bite bars and so forth, or read the text one line at a time from a computer display – hardly equatable to normal reading conditions!

Less intrusive methods, such as the use of light pens in darkened environments to highlight the portion of the text currently viewed (Whalley and Fleming, 1975), or modified reading stands with semi-silvered glass which reflect the reader’s eye movements in terms of current text position to a video camera (Pugh, 1979), are examples of the lengths to which researchers have gone in order to record reading behaviour. However, none of these is ideal as they alter the reading environment, sometimes drastically, and only their staunchest advocate would describe them as non-intrusive.

Verbal protocols of people interacting with texts – asking them to ‘think aloud’ – require no elaborate equipment and can be elicited wherever a subject normally reads. In this way they are cheap, relatively naturalistic and physically non-intrusive. However, the techniques have been criticised for interfering with the normal processing involved in task performance (i.e., cognitive intrusion) and requiring the presence of an experimenter to sustain and record the verbal protocol (Nisbett and Wilson, 1977).

If we accept that a perfect method does not yet exist, it is important to understand the relative merits of those that are available. Eye-movement records have significantly aided theoretical developments in modelling reading (see, for example, Just and Carpenter, 1980) while use of the light-pen type of technique has demonstrated its worth in identifying the effects of various typographic cues on reading behaviour (see, for example, Schumacher and Waller, 1985). Verbal protocols have been effectively used by researchers to gain information on reading strategies (see, for example, Olshavsky, 1977). Nevertheless, such techniques have rarely been employed with the intention of assessing the likely effect of hypertext presentation on performance. More usually they have been used to test various paper document design alternatives or to shed light on reading performance in highly controlled experimental studies.

Where text is presented on computer screens it is possible to record and time the duration of displayed text, thus facilitating reasonable inference about what readers are doing. However, as an attempt to analyse normal reading performance, this fails because there is no way of gauging the influence of the medium on performance, i.e., reading electronic text cannot be directly equated to reading paper. A fuller treatment of these issues can be found in Wright (1987). Nevertheless, this method provides a simple and effective way of observing gross manipulations of text without making the person aware of the recording process.

Virtually all of the work on hypertext has employed such recording techniques to monitor user navigation and use of various commands. With the advent of cheap screen recording devices it is possible to record how the entire screen display changes throughout the course of the interaction, for later replay without resorting to video recordings. While this is obviously useful, it does not solve the problem of how to assess the use of paper and its importance for hypertext design. Where paper and hypertext are directly compared, although process measures may be taken with the computer and/or video cameras, the final comparison often rests on outcome measures.
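As a concrete illustration of the kind of record involved, the sketch below is our own and is not taken from any of the systems cited; node names are hypothetical. It simply time-stamps each node as it is displayed, so that dwell times and navigation paths can be reconstructed afterwards.

```python
import time

class DisplayLog:
    """Record which node is on screen and for how long (a minimal sketch)."""
    def __init__(self):
        self.visits = []        # completed (node, seconds displayed) records
        self._current = None    # (node, entry time) of the node now on screen

    def show(self, node: str) -> None:
        now = time.monotonic()
        if self._current is not None:
            name, entered = self._current
            self.visits.append((name, now - entered))  # close the previous record
        self._current = (node, now)

    def report(self) -> None:
        for node, dwell in self.visits:
            print(f"{node}: displayed for {dwell:.2f} s")

log = DisplayLog()
log.show("Contents")        # reader opens the contents node
time.sleep(0.2)             # ...views it for a moment...
log.show("Section heading") # jumping elsewhere closes the previous record
log.report()
```

Such a log captures gross manipulations (which nodes were visited, in what order, for how long) without the reader being aware of the recording, though, as noted above, it says nothing about what the reader was thinking at the time.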

Evidence for task effects

Probably because of the difficulties outlined above, research has tended not to focus directly on how people read or the range of tasks they perform with texts. Instead, investigators normally test reading in circumstances where the task is specifically designed to manipulate experimental variables. Typically these have been: proof-reading, short story comprehension, letter or word recognition, sentence recall and so forth. In itself this is a recognition of the variable effect of task on reading. However, there have been some studies which have sought to test task effects directly. Two are described here which illustrate the range from the single sentence to the full document level.

Aaronson and Ferres (1986) had subjects read 80 sentences one word at a time from a computer screen and perform either a recall or comprehension test item after each sentence. They observed differences in the time taken to read words in a sentence depending on the task being performed. Their results supported the hypotheses that readers tend to process structure for retention tasks and meaning for comprehension tasks, i.e., their cognitions are altered according to the demands of the reading task.

Verbal protocol analyses combined with observation of readers’ manipulations of text were used in a study by Dillon et al. (1989) to examine researchers’ use of academic journals. Unlike the other studies cited, the specific intention of this work was to identify the issues that needed to be addressed in developing a hypertext database of such material. The study was reasonably successful in eliciting information on individuals’ style of reading, identifying their reasons for using journals and demonstrating three strategies that were typically performed: rapid scan of the contents, browsing of certain relevant sections, and full serial reading of the text. The strategies matched the tasks being performed, e.g., rapid scanning told the reader about the suitability of the article for his purposes while full serial reading was employed when the reader wanted to study the contents of the article.

Conclusion

Task demands dictate the manner in which readers manipulate information but as yet few empirically validated performance strategies have been described. A major obstacle appears to be the lack of a suitable method for measuring such behaviour. Furthermore, any task effects demonstrated with paper documents may not transfer directly to the electronic medium. Nevertheless, the sheer diversity of tasks performed with documents necessitates some consideration of the effect of task type on the manner of usage. One hypertext implementation of a document might suit some tasks but be far from optimal for others. Without appreciating the range and manner of the tasks performed with a document, the chances of designing suitable hyper-versions are reduced.

Understanding hypertext in terms of user, task and information interactions

We stated at the beginning of this chapter that questions such as "Is hypertext better than paper?" were simplistic. The previous sections on users, tasks and information should have given some indication of why we believe this is the case. In the present section some of the emerging empirical data on hypertext will be discussed in the light of the interactions between these aspects.

To date researchers and developers have, in the main, been content to discuss the apparent advantages of hypertext systems for most tasks, occasionally describing systems they have implemented and informally presenting user reactions to them (see, for example, Marchionini and Shneiderman, 1988). Such reports are difficult to assess critically and it is easy to get carried away with the hype surrounding the new medium. If one looks at the proceedings of recent conferences on hypertext (such as Hypertext ’87, Hypertext II and so forth) one will find that such reports are in the majority, with well-controlled experimental papers few and far between. To be fair, it is possible that this state of affairs reflects the stage of development of these systems; they seem like a good idea and they are being developed, with formal studies being seen as appropriate only at a later stage. Nevertheless, in the absence of confirmatory data, any claims for the new medium should be treated with caution.

What data has emerged needs to be carefully considered in the light of the three elements discussed above since findings are often conflicting and any one study can only hope to answer a subset of the questions to be asked. The next section considers some of the experimental data in terms of comparisons that have been made between hypertext and paper, hypertext and linear electronic text and between various implementations of one hypertext.

Hypertext versus paper

Perhaps the most basic comparison is between hypertext and the traditional medium of paper. Given the revolutionary form of presentation afforded by hypertext, this comparison is of considerable importance to authors and readers, educators and learners alike.

In a widely reported study, Egan et al. (1989) compared students’ performance on a set of tasks involving a statistics text. Students used either the standard textbook or a hypertext version displayed via SuperBook, a structured browsing system, to search for specific information in the text and write essays with the text open. Incidental learning and subjective ratings were also assessed. The authors report that:

"students using SuperBook answered more search questions correctly, wrote higher quality ‘open-book’ essays, and recalled certain incidental information better than students using conventional text." (p.205)

Students also significantly preferred the hypertext version to the paper text.

At first glance this is an impressive set of findings. It seems to lend firm support to the frequently expressed view that hypertext is better than paper in the educational setting (cf. Landow, 1990). However, a closer look at the experiment is revealing. For example, with respect to the search tasks, the questions posed were varied so that their wording mentioned terms contained in the body of the text, in the headings, in both of these or neither. Not surprisingly, the largest advantage to hypertext was observed where the target information was only mentioned in the body of text (i.e., there were no headings referring to it). Here it is hardly surprising that the search facility of the computer out-performed humans! When the task was less biased against the paper condition, e.g., searching for information to which there are headings, no significant difference was observed. Interestingly, the poorest performance of all was for SuperBook users searching for information when the question did not contain specific references to words used anywhere in the text. In the absence of suitable search parameters or look-up terms, hypertext suddenly seemed less usable.

This is not a criticism of the study. To the authors’ credit, they describe their work in sufficient detail to allow one to assess it fully. Furthermore, they freely admit that an earlier study using an identical methodology showed less difference between the media (paper even proving significantly better than hypertext for certain tasks!). Only on the basis of this were modifications made to SuperBook which led to the observations reported above.

In a study by McKnight, Dillon and Richardson (1990), subjects located information in a text on the subject of wine. This was presented in one of four conditions: paper, word processor, HyperTIES or HyperCard. The tasks were designed in order to require a range of retrieval strategies on the part of the subject, i.e., while the search facility might prove useful for some questions, it could not be employed slavishly to answer all questions. This was seen as reflecting the range of activities for which such a text would be employed in the real world.

Results showed that subjects were significantly more accurate with paper than with both hypertext versions, though no effect for speed was observed. Subjects were also better able to estimate the document size with paper than with the hypertexts and spent significantly less time in the contents and index sections of the text with the paper version. The paper and word processor versions were similar in most scores, suggesting that the familiar structure inherent in both supported the subjects’ performance.

On the face of it we have two conflicting findings on the question of paper versus hypertext. However, by appreciating the users, tasks and information types employed in these two studies we can see that they are not directly comparable in anything but a superficial manner. As evidence to support a "Yes/No" answer to the question "Is hypertext better than paper?" they are obviously limited. This is not to say that no implications can be drawn. These two experiments show that each medium has its own inherent advantages and shortcomings, e.g., hypertext is better than paper when locating specific information that is contained within the body of text but seems to offer no clear advantage when readers have only an approximate idea of what they are looking for. When readers access a text for the first time on a subject for which they have no specialist knowledge and cannot formulate a precise search parameter, the familiarity of paper seems to confer certain advantages.

Hypertext versus linear electronic text

As mentioned earlier in the chapter, comparisons between paper and screen reading often favour paper because of the differences in image quality between the two media. Though hypertext systems usually run on good quality screens, it is possible that image quality variables still have an influence for tasks that require fast scanning and visual detection of material. Such differences are overcome in studies comparing hypertext implementations with linear (i.e., non-hypertext) electronic documents on identical screens.

Monk et al. (1988) report two experimental comparisons of hypertext with folding and scrolling displays for examining a computer program listing. In the first study, subjects attempted fifteen questions related to program comprehension. While no effect was observed for number of tasks answered correctly, there was a significant difference between the hypertext and scrolling browsers in terms of the rate at which tasks were performed, with hypertext proving slower. In the second study they attempted to overcome this performance deficit by providing subjects with either a map of the structure of the program or a list of section headings. The map improved performance to levels comparable with users of the scrolling browser in the first study, but the list of titles had little effect. The authors conclude that the map is a vital component of any hypertext implementation; without it too much of the user’s cognitive resources are required to navigate rather than deal with the primary task.

In another study comparing hypertext with linear electronic text, Gordon et al. (1988) asked subjects to read two articles, one in each format. Half the subjects read general interest articles with instructions for casual reading while the rest read technical articles with instructions to learn the material. Thus it can be seen that two distinct tasks and two distinct text types were employed. Performance was assessed using post-task free recall tests and question probes, while preference was assessed using a questionnaire.

Subjects answered significantly more questions correctly with the linear format than with the hypertext and also recalled significantly more of the general interest articles in linear format. Questionnaire data revealed a general 2:1 preference for the linear format with a 3:1 ratio expressing that this format required less mental effort than hypertext. Subjects frequently stated that they were used to the linear format and found hypertext intrusive on their train of thought. Similar to the Monk et al. study, Gordon et al. conclude that navigational decisions are more difficult with the hypertext and therefore disrupt the reader, i.e., cognitive intrusion occurs. The differences between formats were greater for the general interest texts than for the technical material, suggesting possible task effects, but the authors do not discuss this in any detail. One curious aspect of this study was the use of texts on the themes "Falling in love" and "Reverse sterilization" as general interest material!

Interestingly, these studies seem to suggest that the hypertext structure places an extra burden on the reader in terms of navigation and consequently leads to poorer performance. On the face of it this is not too surprising for the users and texts employed here. The subjects were familiar with the structure of the linear documents and were only constrained by the manipulation facilities available to them with the system. With the hypertext systems, though manipulation may have been faster and more direct, the subjects needed not only to learn the new document structure but also to suppress their existing model of where information was likely to be positioned. The fact that naïve users can perform equally well with hypertext versions (Monk et al.’s second study) suggests that such problems can be overcome. How performance would be altered with experience is an obvious question for further research.

Hypertext versus hypertext

Several studies have compared different implementations of hypertext documents to observe the effects of various organising principles or access mechanisms on performance. This is an important area. The term hypertext does not refer to a unitary concept. When comparisons are said to be made between hypertext and paper documents they are really being made between certain implementations of hypertext and standard versions of paper texts. Each implementation consists of one designer’s (or group of designers’) ideas about how to build the interface between users and information. To make general claims or draw conclusions about the wider relevance of hypertext in such circumstances is problematic. However, studies comparing varying implementations can shed some light on what constitutes good hypertext.

Simpson and McKnight (1990), for example, created eight versions of a hypertext document on plants, manipulating the contents list (hierarchical or alphabetic), the presence or absence of a current position indicator and the presence or absence of typographic cues. Subjects (researchers and students) read the text until they felt confident they had seen it all and then performed 10 information location tasks before attempting to construct a map of the document structure with cards. Results showed that readers using a hierarchical contents list navigated through the text more efficiently and produced more accurate maps of its structure than readers using an alphabetic index. The current position indicator and additional typographic cues were of limited utility.

Wright and Lickorish (1990) compared two types of navigation system for two different hypertexts. The navigation systems were termed Index navigation where the reader needed to return to a separate listing to specify where to move next, and Page navigation where the reader could jump directly to other "pages" from the current display. The two texts were on house-plants and supermarket prices. Twenty-four subjects read both hypertexts, 12 per navigation system, answering multiple-choice questions with the plants text and a variety of ‘GoTo’, compare and compute tasks with the supermarket text.

From their results they concluded that each navigation system had certain advantages in particular situations. For example, the paging navigation system may appear burdensome but was found to be beneficial with the house-plants text as it coupled navigation decisions with an overview of the text’s structure. However, such activity with the tasks performed on the supermarket text (where decisions about where to go were not an issue because the questions provided such information) turned out to be an extra load on working memory. As Wright and Lickorish state, "authors need to bear in mind both the structure inherent in the content material and the tasks readers will be seeking to accomplish when they are designing navigation systems for hypertext."
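The structural difference between the two systems can be made concrete in a few lines. The sketch below is ours, and the node names and link table are hypothetical; it shows only the distinction drawn above: Index navigation routes every move through a separate listing, whereas Page navigation offers jumps directly from the current display.

```python
# Hypothetical link table for a small house-plants hypertext.
links = {
    "Watering": ["Feeding", "Repotting"],
    "Feeding": ["Watering"],
    "Repotting": ["Watering"],
}

def page_navigation(current: str) -> list[str]:
    """Page navigation: candidate destinations appear on the current display."""
    return links[current]

def index_navigation(current: str) -> list[str]:
    """Index navigation: the reader first returns to a listing of every node."""
    return sorted(links)  # one extra step before any destination is chosen

print(page_navigation("Watering"))   # ['Feeding', 'Repotting']
print(index_navigation("Watering"))  # ['Feeding', 'Repotting', 'Watering']
```

The extra indirection of the index is exactly the working-memory load Wright and Lickorish observed with the supermarket tasks, while the page-based overview is what helped readers of the house-plants text.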

Conclusions

Despite the claims frequently made for hypertext, experimental comparisons reveal that it is no guarantee of better performance, be it measured in terms of speed, comprehension, range of material covered or problems solved. Some hypertext implementations are good, some bad. Hypertext has advantages when readers are performing certain tasks with particular texts but offers no benefit, or is in fact worse than paper, in others. This is not too surprising. We should not expect any one implementation to be superior to all paper documents. After all, the diversity of textual documentation exists in the main for a purpose, i.e., to support the reader. Furthermore, as readers we all have a wealth of experience in dealing with paper texts which, once learned, we apply effortlessly. Such experience, be it in the form of models of structure, rapid manipulation skills or accurate memory for the location of items in a document, is often overlooked rather than exploited by developers of hypertexts. As the technology improves, any differences based on image quality should disappear; as readers become more experienced with hypertext, initial cognitive intrusion effects should be overcome. However, there are still many other problems to be addressed before hypertext can be exploited fully. The main point to take from this chapter is that users, tasks and texts vary tremendously and only by understanding the interaction of these three aspects of document usage can real progress be made.

