CHAPTER 4: NAVIGATION THROUGH COMPLEX INFORMATION SPACES


 

"He that travelleth into a country before he hath entrance into the language, goeth to school, not to travel."

Bacon: Of Travel

 

With the advent of hypertext it has become widely accepted that the departure from the so-called ‘linear’ structure of paper increases the likelihood of readers or users becoming lost. In this chapter we will discuss this aspect of hypertext in terms of its validity, the lessons to be learned from the psychology of navigation, and the manner in which good design can minimise such problems for users of hypertext documents.

Is navigation a problem?

There is a striking consensus among many of the ‘experts’ in the field that navigation is the single greatest difficulty for users of hypertext. Frequent reference is made to "getting lost in hyperspace" (e.g., Conklin, 1987; McAleese, 1989a) which is described, in the oft-quoted line of Elm and Woods (1985), as:

"the user not having a clear conception of the relationships within the system or knowing his present location in the system relative to the display structure and finding it difficult to decide where to look next within the system." (p.927)

In other words, users do not know how the information is organised, how to find the information they seek, or even if that information is available. With paper documents there tend to be at least some standards in terms of organisation. With books, for example, contents pages are usually at the front, indices at the back and both offer some information on where items are located in the body of the text. Concepts of relative position in the text such as ‘before’ and ‘after’ have tangible physical correlates. No such correlation holds with hypertext.

There is some direct empirical evidence in the literature to support the view that navigation in hypertext can be a problem. Edwards and Hardman (1989), for example, describe a study which required subjects to search through a specially designed hypertext. In total, half the subjects reported feeling lost at some stage. Such feelings were mainly due to "not knowing where to go next" or "not knowing where they were in relation to the overall structure of the document" rather than "knowing where to go but not knowing how to get there" (descriptors provided by the authors). Unfortunately, without direct comparison of ratings from subjects reading a paper equivalent we cannot be sure such proportions are solely due to using hypertext. However it is unlikely that many readers of paper texts do not know where they are in relation to the rest of the text!

Indirect evidence comes from the numerous studies which have indicated that users have difficulties with a hypertext (e.g., many of the studies cited in the previous chapter). Hammond and Allinson (1989) speak for many when they say:

"Experience with using hypertext systems has revealed a number of problems for users…First, users get lost…Second, users may find it difficult to gain an overview of the material…Third, even if users know specific information is present they may have difficulty finding it." (p.294)

There are a few dissenting voices. Brown (1989) argues that:

"although getting lost is often claimed to be a great problem, the evidence is largely circumstantial and conflicting. In some smallish applications it is not a major problem at all." (p.2)

This quote is telling in several ways. The evidence for navigational difficulties is often circumstantial, as noted above. The applications in which Brown claims it is not a major problem are, to use his word, "smallish", and this raises an important issue with respect to hypertext. When we speak of documents being so small that a reader cannot ‘get lost’ in them or so large that navigation aids are required to use them effectively, the implication is that information occupies "space" through which readers ‘travel’ or ‘move’. Hammond and Allinson (1987) talk of the "travel metaphor" as a way of moving through a hypertext. Canter et al. (1985) speak of "routes through" a database. Even the dissenters believe that the reader or user navigates through the document, the only disagreement being the extent to which getting lost is a regular and/or serious occurrence.

The weight of evidence, be it experiential, anecdotal or empirical, suggests that navigation is an issue worthy of consideration. In the following section we will discuss what is known about the psychology of navigation in physical environments and show how this might have relevance to the ‘virtual’ worlds of information space.

The psychology of navigation

Surprisingly, for an activity that is routinely performed by all of us, navigation is not a well-studied psychological phenomenon in the same way that reading is. However, aspects relevant to the study of navigation are dealt with in some studies of spatial imagery, orientation, distance judgement and so forth. It is difficult to make a cohesive theory out of these disparate strands but some agreements do exist.

Schemata and models of generic environments

It seems obvious, for example, that we have a schema or model of the physical environment in which we find ourselves. This is acquired from experience and affords us a basic orienting frame of reference for navigatory purposes. Thus, we soon acquire schemata of towns and cities so that we know what to expect when we find ourselves in one: busy roads, numerous buildings, shopping, residential and industrial areas, many people, churches, pubs, and so on. According to Downs and Stea (1977), such frames of reference exist at all levels of scale from looking at the world in terms of east and west or First and Third Worlds, to national distinctions between north and south, urban and rural and so on down to local entities like buildings and neighbourhoods. It is precisely such models that give rise to powerful stereotypes encapsulated in the phrase "when you’ve seen one slum, you’ve seen them all" (for which ‘slum’ could be replaced with city, factory, church or whatever physical structure the speaker was dismissing).

Such frames of reference also guide our responses to the environment in terms of how we should behave. Therefore we soon realise that to interact effectively with an urban environment (e.g., to get from A to B) there are probably a variety of information sources available to us such as maps, street signs, landmarks, tourist information facilities and so forth. Roads must be crossed in certain ways, e.g., at pedestrian crossings or when there is no traffic, and you must pay if you want to use public transport. In this sense the frame of reference is identical to the concept of script (Schank and Abelson, 1976) mentioned in the previous chapter.

While schemata are effective orienting guides, in themselves they are limited. They do not reflect specific instances of any one environment and provide no knowledge of what exists outside of our field of vision. Yet humans have such knowledge of places with which they are familiar. We know our houses well enough to walk through them mentally and describe objects and colours we encounter. Schemata or frames would allow us to predict that any one house contains a kitchen towards the back or bedrooms and a bathroom upstairs, for example, but as such this would be an expectancy rather than knowledge. We could only guess at colours and objects we might find. With cities, a schema might tell us that there are numerous routes to the same place but would not enable us to describe one or predict the shortest. However, individuals who live in that city would be able to describe the routes and accurately predict their respective journey times. So what is this detailed knowledge that we acquire of our environment, and how does it emerge?

The acquisition of cognitive maps

Current theories of navigation vary and the topic is no longer the province of psychologists alone. Geographers, anthropologists and urban planners all show an interest (see, for example, Downs and Stea, 1973). However, Tolman’s (1948) paper on cognitive maps is frequently cited as seminal. True, this paper describes a number of studies on rats travelling through mazes (a constant source of amusement to those who feel that academic psychology has little relevance to real life!), but in it Tolman discusses the implications of this work for human cognition and postulates the existence of a cognitive map, internalised in the human mind, which is the analogue to the physical lay-out of the environment. In dismissing much of the then popular behaviouristic school of psychology, Tolman argues that information impinging on the brain is:

"worked over and elaborated…into a tentative cognitive-like map of the environment indicating routes and paths and environmental relationships..."

Over 40 years later, such a perspective is readily accepted and the non-psychologists among us might even wonder how such an obvious statement could ever have caused controversy. Recent experimental work takes the notion of some form of mental representation of the environment for granted, concerning itself more with how such maps are formed and manipulated. Many theorists agree that the acquisition of navigational knowledge proceeds through several developmental stages from the initial identification of landmarks in the environment to a fully formed mental map. One such developmental model has been discussed by Anderson (1980) and Wickens (1984) and is briefly described here.

According to this model, in the first instance we represent knowledge in terms of highly salient visual landmarks in the environment such as buildings, statues, etc. Thus we recognise our position in terms relative to these landmarks, e.g., our destination is near building X, or if we see statue Y then we must be near the railway station, and so forth. This knowledge provides us with the skeletal framework on which we build our cognitive map.

The next stage of development is the acquisition of route knowledge which is characterised by the ability to navigate from point A to point B, using whatever landmark knowledge we have acquired to make decisions about when to turn left or right. With such knowledge we can provide others with effective route guidance, e.g., "Turn left at the traffic lights and continue on that road until you see the Bull’s Head public house on your left, and take the next right there…" and so forth. Though possessing route knowledge, a person may still not really know much about his environment. A route might be non-optimal or even totally wasteful.

The third stage involves the acquisition of survey knowledge. This is the fully developed cognitive map that Tolman described. It allows us to give directions or plan journeys along routes we have not directly travelled as well as describe relative locations of landmarks within an environment. It allows us to know the general direction of places, e.g., "westward" or "over there" rather than "left of the main road" or "to the right of the church". In other words, it is based on a world frame of reference rather than an ego-centred one.

It is not clear if each individual develops through all stages in such a logical sequence. Obviously, landmark knowledge on its own is of little use for complex navigation, and both route and survey knowledge emerge from it as a means of coping with the complexity of the environment. However, it does not necessarily follow that once enough route knowledge is acquired it is replaced by survey knowledge. Experimental investigations have demonstrated that each is optimally suited for different kinds of tasks. For example, route knowledge is better for orientation tasks than survey knowledge, the latter being better for estimating distance or object localisation on a map (Thorndyke and Hayes-Roth, 1982; Wetherell, 1979). Route knowledge is cognitively simpler than survey knowledge but suffers the drawback of being virtually useless once a wrong step is taken (Wickens, 1984). Route knowledge, because of its predominantly verbal form, might suit individuals with higher verbal than spatial abilities, while the opposite would be the case for survey knowledge.
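The contrast between route and survey knowledge can be illustrated with a small sketch. It is only an analogy under stated assumptions: the town layout, the landmark names and both function names are hypothetical and are not drawn from any of the studies cited. Route knowledge is modelled as a single memorised sequence of landmarks, survey knowledge as a map of the whole environment that can be searched from any starting point.

```python
from collections import deque

# Hypothetical town (an assumption for illustration): each landmark maps to
# the landmarks that can be reached directly from it.
TOWN = {
    "station": ["statue", "market"],
    "statue": ["station", "church"],
    "market": ["station", "church", "pub"],
    "church": ["statue", "market", "library"],
    "pub": ["market", "library"],
    "library": ["church", "pub"],
}

# Route knowledge: one memorised sequence of landmarks and nothing else.
ROUTE = ["station", "statue", "church", "library"]


def next_step_from_route(route, here):
    """Return the next landmark on the memorised route, or None when off it."""
    if here in route[:-1]:
        return route[route.index(here) + 1]
    return None


def plan_from_survey(town, start, goal):
    """Breadth-first search over the whole map: a shortest path from any start."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbour in town[path[-1]]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None
```

A traveller who strays to the pub gets no help from the memorised route (`next_step_from_route(ROUTE, "pub")` returns `None`), whereas `plan_from_survey(TOWN, "pub", "library")` still yields a path. This mirrors the drawback attributed above to route knowledge once a wrong step is taken.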

Conclusions

While such theoretical work on navigation is primarily concerned with travels through physical space such as cities and buildings, it does offer a perspective that might prove insightful to the design of hypertext systems where navigation is conceptualised as occurring through an information space. In an attempt to relate the discussion of navigation to more directly relevant issues the following section details what is known about navigation through paper documents.

Navigation applied to paper documents

Schemata and models

When we pick up a book, we immediately have access to a whole host of information about its likely contents, size, subject matter and so forth. Even the cover alone tells us a lot about the likely style of coverage. When we open the book, we have expectations about what we will find inside the front cover such as details of where and when it was published, perhaps a dedication, and then a contents page. We know, for example, that contents listings describe the layout of the book in terms of chapters, proceeding from the front to the back. Chapters are organised around themes, and an index at the back of the book, organised alphabetically, gives more specific guidance on where items are located in the body of the text. Experienced readers know all this before even opening the text. It would strike us as odd if such structures were absent or their positions within the text were altered, e.g., the contents page was at the back or in the middle, there were no chapter divisions, or the index was not arranged alphabetically.

The same might be said of a newspaper. Typically we might expect a section on the previous day’s political news at home, foreign coverage, market developments and so forth. News of sport will be grouped together in a distinct section and there will also be a section covering that evening’s television and radio schedules. The same is probably true for most text types, i.e., there are organisational principles governing the lay-out and structure of their contents.

Some of the most impressive work in this area has been carried out by van Dijk and Kintsch (1983). We noted in the previous chapter how they have proposed a model of discourse comprehension that involves readers analysing the propositions of a text and forming a macropropositional hierarchy. According to this theory, readers acquire (through experience) schemata (which van Dijk and Kintsch term ‘superstructures’) that facilitate comprehension of material by allowing readers to predict the likely ordering and grouping of constituent elements of a body of text. To quote van Dijk (1980):

"a superstructure is the schematic form that organises the global meaning of a text. We assume that such a superstructure consists of functional categories…(and)…rules that specify which category may follow or combine with what other categories." (p.108)

Apart from categories and functional rules, van Dijk adds that a superstructure must be socio-culturally accepted, learned, used and commented upon by most adult language users of a speech community.

They have applied this theory to several text types. For example, with respect to newspaper articles they describe a schema consisting of headlines and leads (which together provide a summary), major event categories each of which is placed within a context (actual or historical), and consequences. Depending on the type of newspaper (e.g., weekly as opposed to daily) we might expect elaborated commentaries and evaluations. Experiments by Kintsch and Yarborough (1982) showed that articles written in a way that adhered to this schema resulted in better grasp of the main ideas and subject matter than ones which were re-organised to make them less schema-conforming. When given a cloze test of the articles, no difference was observed. The authors suggest that schematic structures are not particularly relevant as far as ability to remember specific details such as words is concerned, but have major importance at the macropropositional level of comprehension.

At a more global level, two studies by Dillon (1990a) tested readers’ ability to impose structure on paragraphs and sentences of text. In the first study, subjects were given paragraphs from academic journal articles and asked to organise them into one article as fast as they could. To avoid referential continuity, every second paragraph was removed. In one condition headings were provided, in the other they were absent. The results indicated that readers had little difficulty piecing the article together into gross categories of Introduction, Method, Results and Discussion (over 80% accuracy at this level) but had difficulties distinguishing the precise order at the within-section level. When provided with headings, subjects formed the same major categories but were less accurate in placing second level headings in the correct section. This suggests that experienced journal readers are capable of distinguishing isolated paragraphs of text according to their likely location within a complete article with respect to the major categories. Interestingly, this could be done without resorting to reading every word or attempting to understand the subject matter of the paper.

In the second study, subjects read a selection of paragraphs from two articles on both paper and screen and had to place each one in the general section to which they thought it belonged (Introduction, Method, Results or Discussion). Again, subjects showed a high degree of accuracy (over 80%) with the only advantage to paper being speed (subjects were significantly faster at the 5% level in the paper condition) which is probably explicable in terms of image quality variables, as outlined in the previous chapter. Taken together, these results suggest that readers do have a model of the typical journal article that allows them to gauge accurately where certain information is located. This model does not seem to be affected by presentation medium.

In this form, the model/schema/superstructure constitutes a set of expectancies about a text’s usual contents and how these are grouped and positioned relative to each other. Obviously, in advance of actually reading the text we cannot have much insight into anything more specific than this, but the generality of organisation within the multitude of texts we read in everyday life affords stability and orientation in what could otherwise be an over-complex informational environment.

Acquiring a cognitive map of the text

If picking up a new book can be compared to a stranger entering a new town (i.e., we know what each is like on the basis of previous experience and have expectancies of what we will find), how do we proceed to develop our map of the information space?

To use the analogy of navigation in physical space, we would expect that generic structures such as indices, contents, chapter headings and summaries, page numbers and so forth would be seen as landmarks that provide readers with information about where they are in a text, just as signposts, buildings and street names aid navigation in physical environments. Thus, when initially reading a text, we might notice that there are numerous figures and diagrams in certain sections, none in others, or that a very important point or detail is raised in a section containing a table of numerical values. In fact, readers often claim to experience such a sense of knowing where an item of information occurred in the body of the text even if they cannot recall that item precisely, and there is some empirical evidence to suggest that this is in fact the case.

Rothkopf (1971) carried out an experiment to test whether such occurrences had a basis in reality rather than resulting from popular myth supported by chance success. He asked people to read a 12-page extract from a book with the intention of answering questions on content afterwards. What subjects didn’t realise was that they would be asked to recall the location of information in the text in terms of its occurrence both within the page (divided into eighths) and the complete text (divided into quarters). The results showed that incidental memory for locations within any page and within the text as a whole was more accurate than chance, i.e., people could remember location information even though they were not asked to. There was also a positive correlation between location of information at the within-page level and accuracy of question answering.

There have been several follow-up studies by Rothkopf and by other investigators into this phenomenon. Zechmeister and McKillip (1972) had subjects read eight pages of text typed into blocks with four blocks per page. Subjects were asked to read the text before being tested on it. The test consisted of fill-in-the-blank questions, confidence ratings on the answers, and location of the answer on the page. Again, an effect for knowledge of location was observed which was correlated with accuracy of answers, suggesting that memory for location and for content are independent attributes of memory that can be linked for mnemonic purposes. Interestingly, no interaction of memory for location and confidence in answer was found. Further work by Zechmeister et al. (1975) and by Lovelace and Southall (1983) confirms the view that memory for spatial location within a body of text is reliable even if it is generally limited.

The analogy with navigation in a physical environment is of limited applicability beyond the level of landmark knowledge. Given the fact that the information space is instantly accessible to the reader (i.e., he can open a text at any point), the necessity for route knowledge, for example, is lessened (if not eliminated). To get from point A to point B in a text is not dependent on taking the correct course in the same way that it is in a physical three-dimensional environment. The reader can jump ahead (or back), guess, use the index or contents or just page serially through the text. Readers rarely rely on just one route or get confused if they have to start from a different point in the text in order to go to the desired location, as would be the case if route knowledge was a formal stage in their development of navigational knowledge for texts. Once you know the page number of an item you can get there as you like. Making an error is not as costly as it is in the physical world either in terms of time or effort. Furthermore, few texts are used in such a way as to require that level or type of knowledge.

A similar case can be made with respect to survey knowledge. While it seems likely that a reader experienced with a certain text can mentally envisage where information is in the body of the text, what cross-references are relevant to his purpose and so forth, we must be careful that we are still talking of navigation and not changing the level of discourse to how the argument is developed in the text or the ordering in which points are made. Without doubt, such knowledge exists, but often it is not purely navigational: what comes into play is an instantiation of several schemata, such as domain knowledge of the subject matter, interpretation of the author’s argument, and a sense of how this knowledge is organised. This is not to say that readers cannot possess survey-type knowledge of a text’s contents; rather, it is to highlight the limitations of directly mapping concepts from one domain to another on the basis of terminology alone.

The fact that we use the term navigation in both situations does not mean that they are identical activities with similar patterns of development. The simple differences in applying findings from a three-dimensional world (with visual, olfactory, auditory and powerful tactile stimuli) to a two-dimensional text (with visual and limited tactile stimuli only) and the varying purposes to which such knowledge is put in either domain are bound to have a limiting effect.

It might be that, rather than route and survey knowledge, a reader develops a more elaborated analogue model of the text based on the skeletal framework of landmark knowledge outlined earlier. Thus, as familiarity with the text grows, the reader becomes more familiar with the various landmarks in the text and their inter-relationships. In effect the reader builds a representation of the text similar to the survey knowledge of physical environments without any intermediary route knowledge but in a form that is directly representative of the text rather than of a physical domain.

The manner in which knowledge is represented mentally is a fundamental issue in cognitive psychology and one which we will not delve deeply into here. Suffice it to acknowledge that various representational forms exist and that the distinctive nature of navigation in text compared to physical environments is sufficient to require alternative representations.

Navigation applied to electronic documents

Schemata and models

The concept of a schema for an electronic information space is less clear-cut than those for physical environments or paper documents. Electronic documents have a far shorter history than paper and the level of awareness of technology among the general public is relatively primitive compared to that of paper. Exposure to information technology will almost certainly improve this state of affairs, but even among the contemporary computer-literate it is unlikely that the type of generic schematic structures that exist for paper documents have electronic equivalents of sufficient generality.

Obviously, computing technology’s short history is one of the reasons why this might be so, but it is also the case that the medium’s underlying structures do not have equivalent transparency. With paper, once the basic modus operandi of reading is acquired (e.g., page-turning, footnote identification, index usage and so forth) it retains utility for other texts produced by other publishers, other authors and for other domains. With computers, manipulation of information can differ from application to application within the same computer, from computer to computer, and from this year’s to last year’s model. Thus, using electronic information is often likely to involve the employment of schemata for systems in general (i.e., how to operate them) in a way that is not essential for paper-based information.

The qualitative differences between the schemata for paper and electronic documents can easily be appreciated by considering what you can tell about either at first glance. We have outlined the information available to paper text users in the section on paper schemata above. When we open a hypertext document, however, we do not have the same amount of information available to us. We are likely to be faced with a welcoming screen which might give us a rough idea of the contents (i.e., subject matter) and information about the authors/developers of the document, but little else. It is two-dimensional, gives no indication of size, quality of contents, age (unless explicitly stated) or how frequently it has been used (i.e., there is no dust or signs of wear and tear on it such as grubby finger-marks or underlines and scribbled comments). At the electronic document level, there is usually no way of telling even the relative size without performing some ‘query operation’. For example, in Figure 9, is the ‘WineBook’ bigger or smaller than ‘Cliff’s PhoneBook’? Without using the ‘Get Info’ command on both and comparing their sizes (given in kilobytes), there is no way of telling.

Figure 9: Electronic documents give no obvious clues to their size.

 

Performing the hypertext equivalent of opening up the text or turning the page offers no assurance that expectations will be met. Many hypertext documents offer unique structures (intentionally or otherwise) and their overall sizes are often impossible to assess in a meaningful manner (McKnight et al., 1989b). At the current stage of development, it is likely that users or readers familiar with hypertext will have a schema that includes such attributes as linked nodes of information, non-serial structures, and perhaps even potential navigational difficulties! The manipulation facilities and access mechanisms available in hypertext will probably occupy a more prominent rôle in their schemata for hypertext documents than they will for readers’ schemata of paper texts. As yet, empirical evidence for such schemata is lacking.

The fact that hypertext offers authors the chance to create numerous structures out of the same information is a further source of difficulty for users or readers. Since schemata are generic abstractions representing typicality in entities or events, the increased variance of hypertext implies that any similarities that are perceived must be at a higher level or must be more numerous than the schemata that exist for paper texts.

It seems, therefore, that users’ schemata of hypertext environments are likely to be ‘informationally leaner’ than those for paper documents. This is attributable to the recent emergence of electronic documents and comparative lack of experience interacting with them as opposed to paper texts for even the most dedicated users. The lack of standards in the electronic domain compared to the rather traditional structures of many paper documents is a further problem for schema development at this point in time.

Acquiring a cognitive map of the electronic space

As mentioned above, navigation through hypertext is considered a major issue by many designers and researchers. The roots of this issue lie in the literature on users interacting with non-hypertext databases and documents as well as with menu-driven interfaces, where it has been repeatedly shown that users can lose their way in the maze of information.

Hagelbarger and Thompson (1983) claim that when users make an incorrect selection at a deep level they tend to return to the start rather than the menu at which they erred. Research by Tombaugh and McEwen (1982) and Lee et al. (1984) indicates that the actual to minimum ratio for screens of information accessed in a successful search is 2:1, i.e., users will often access twice as many menu pages as necessary. These findings led researchers to conclude that electronic (but non-hypertext) databases can pose severe navigational problems for users.
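To make the actual-to-minimum ratio concrete, the following is a minimal sketch rather than the measure used in the studies cited. The menu hierarchy, the session and both function names are hypothetical, and it assumes that every screen displayed, including the start screen and any revisits, counts towards the total.

```python
# Hypothetical menu hierarchy (an assumption for illustration): each screen
# maps to the screens it offers as choices.
MENU = {
    "start": ["news", "travel"],
    "news": ["home", "foreign"],
    "travel": ["rail", "air"],
    "home": [],
    "foreign": [],
    "rail": [],
    "air": [],
}


def minimum_screens(menu, root, target):
    """Count the screens on the direct path from root to target, inclusive."""
    def depth(node, count):
        if node == target:
            return count
        for child in menu[node]:
            found = depth(child, count + 1)
            if found is not None:
                return found
        return None
    return depth(root, 1)


def access_ratio(visited, menu, root, target):
    """Ratio of screens actually accessed to the minimum number needed."""
    return len(visited) / minimum_screens(menu, root, target)


# A user seeking "rail" wrongly selects "news", descends one level further,
# then returns to the start screen rather than to the menu where the error occurred.
session = ["start", "news", "home", "start", "travel", "rail"]
```

In this invented session six screens are accessed where three would suffice, so `access_ratio(session, MENU, "start", "rail")` evaluates to 2.0, the same order as the 2:1 figure reported above.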

A relevant variable in navigation through menus is the method of classifying the information available in the database, i.e., how the information space is organised. Barnard et al. (1977) demonstrated that the manner of classification (such as whether it was alphabetical or relational) influenced the time taken to access targets in a menu-style task. They conclude, not surprisingly perhaps, that the users’ conceptualisation of the desired information influences the selections they make en route. Research by Snowberry et al. (1985) indicates that the main source of difficulty is the relatively weak associations which users have between category descriptors at the highest level of menu and the desired information at the lower level, i.e., there is little information in the immediate environment that aids users’ navigation decisions. This is a fault of design where little attempt is made to identify the end-user’s conceptualisation of the information space. Significantly enough, Lee et al. (1984) discovered considerable variation among experts in terms of what they believe constitutes a "good" or well-organised menu.

In terms of the model of navigational knowledge described above, we should not be surprised by such findings. They seem to be classic manifestations of behaviour based on limited knowledge. For example, returning to the start upon making an error at a deep level in the menu suggests the absence of survey type knowledge and a strong reliance on landmarks (e.g., the start screen) to guide navigation. It also lends support to the argument about route knowledge, that it becomes useless once a wrong turn is made. Making ‘journeys’ twice as long as necessary is a further example of the type of behaviour expected from people lacking a mental map of an environment and relying on only landmark and route knowledge to find their way.

Jones and Dumais (1986) empirically compared spatial memory with symbolic memory for application in the electronic domain, citing the work of Rothkopf and others as indicators that such memory might be important. In a series of three experiments, they had subjects simulate filing and retrieval operations using name, location or a combination of both stimuli as cues. As in the preceding work on texts, they found that memory for location is above chance but modest compared to memory for names and concluded that it may be of limited utility for object reference in the electronic domain.

Therefore, we know that navigational difficulties exist where users need to make decisions about location in an electronic information space. There seems to be some evidence that the first stage of knowledge about navigation is of the landmark variety and that the organising principles on which the information structure is built are important. We now turn to the more specific evidence for hypertext.

Acquiring a cognitive map of a hypertext document

The study by McKnight et al. (1990) described in the previous chapter looked at navigation in terms of the amount of time spent in the contents and/or index sections of the documents employed. They found that subjects in both hypertext conditions spent significantly greater proportions of time in the index/contents sections of the documents. We noted that this indicated a style of interaction based on jumping into parts of the text and returning to ‘base’ for further guidance – a style assumed not particularly optimal for hypertext – and concluded from this that effective navigation was difficult for non-experienced users of a hypertext document.

Once more this is an example of using landmarks in the information space as guidance. Subjects in the linear conditions (paper and word processor versions) seemed much happier to browse through the document to find information, highlighting their confidence and familiarity with the structure presented to them. Similar support for the notion of landmarks as a first level of navigational knowledge development is provided by several of the studies which have required subjects to draw or form maps of the information space after exposure to it (e.g., Simpson and McKnight, 1990). Typically, subjects can group certain sections together but often have no idea where other parts go or what they are connected to.

Unfortunately it is difficult to chart the development of navigational knowledge beyond this point. Detailed studies of users interacting with hypertext systems beyond single experimental tasks and gaining mastery over a hypertext document are thin on the ground. Edwards and Hardman (1989) claim that they found evidence for the development of survey type navigational knowledge in users exposed to a strictly hierarchical database of fifty screens for a single experimental session lasting, on average, less than 20 minutes. Unfortunately the data are not reported in sufficient detail to allow a critical assessment of such a claim, but it is possible that, given the document’s highly organised structure, comparatively small size and the familiarity of the subject area (leisure facilities in Edinburgh), such knowledge might have been observed. Obviously this is an area that needs further empirical work.

While it is clear that empirical work on hypertext is limited, numerous designers and researchers have considered the navigation issues in less experimental ways, without concerning themselves with the development of mental representations of the information space. In the following section we discuss the two major themes to have emerged from this work: the design of suitable maps, browsers and landmarks for users, and the concept of metaphor provision to aid navigation.

Providing navigational information: browsers, maps and structural cues

A graphical browser is a schematic representation of the structure of a hypertext aimed at providing the user with an easy-to-understand map of where particular information is located. According to Conklin (1987), graphical browsers are a feature of a "somewhat idealized hypertext system", recognising that not all existing systems utilise browsers but suggesting that they are desirable. A typical browser is shown in Figure 10. The idea behind a browser is that the document can be represented graphically in terms of the nodes of information and the links between them and, in some instances, that selecting a node in the browser would cause its information to be displayed.

Figure 10: A graphical browser from Apple’s HyperCard Help stack.

 

It is not difficult to see why this might be useful. Like a map of a physical environment, it shows the user what the overall information space is like, how it is linked together and consequently offers a means of moving from one information node to another. Indeed, Monk et al. (1988) have shown that even a static, non-interactive graphical representation is useful. However, for richly interconnected material or documents of a reasonable size and complexity, it is not possible to include everything in a single browser without the problem of presenting ‘visual spaghetti’ to the user. In such cases it is necessary to represent the structure in terms of levels of browsers, and at this point there is a danger that the user gets lost in the navigational support system!

Some simple variations in the form of maps or browsers have been investigated empirically. In a non-hypertext environment, Billingsley (1982) had subjects select information from a database aided by an alphabetical list of selection numbers, a map of the database structure or no aid. The map proved superior, with the no aid group performing worst.

In the hypertext domain, a number of studies by Simpson (1989) have experimentally manipulated several variables relating to structural cues and position indicators. Her subjects performed a series of tasks on articles about house-plants and herbs. In one experiment she found that a hierarchical contents list was superior to an alphabetic index and concluded that users are able to use cues from the structural representation to form maps of the document. In a second study she reported that users provided with a graphical contents list showing the relationship between various parts of the text performed better than users who only had access to a textual list. Making the contents lists interactive (i.e., selectable by pointing) also increased navigational efficiency.

Manipulating ‘last card seen’ markers produced mixed results. It might be expected that such a cue would be advantageous to all users, but Simpson reported that this cue seemed of benefit only during initial familiarisation periods and for users of non-interactive contents lists. Further experiments revealed that giving users a record of the items they had seen aided navigation, much as would be expected from the literature on physical navigation which assumes that knowledge of current position is built on knowledge of how you arrived there (Canter, 1984). In general, Simpson found that as accuracy of performance increased so did subjects’ ability to construct accurate post-task maps of the information space.

Such work is important to designers of hypertext systems. It represents a useful series of investigations into how ‘contents pages’ for hypertext documents should be designed. Admittedly, it concerned limited tasks in a small information space, but such studies are building blocks for a better understanding of the important issues in designing hypertext systems. As always, however, more research needs to be done.

The provision of metaphors

A second area of research in the domain of navigational support concerns that of metaphor provision. A metaphor offers a way of conceptualising an object or environment and in the information technology domain is frequently discussed as a means for aiding novices’ comprehension of a system or application. The most common metaphor in use is the desk-top metaphor familiar to users of the Apple Macintosh among others. Here, the user is presented with a virtual desktop on-screen and can perform routine file manipulations by opening and closing ‘folders’ and ‘documents’ and throwing them in the ‘wastepaper bin’ to delete them. Prior to this metaphor, the word processor was often conceptualised by first-time users as a typewriter.

The logic behind metaphors is that they enable users to draw on existing world knowledge in order to act on the electronic domain. As Carroll and Thomas (1982) point out:

"If people employ metaphors in learning about computing systems, the designers of those systems should anticipate and support likely metaphorical constructions to increase the ease of learning and using the system."

However, rather than anticipate likely metaphorical constructions, the general approach in the domain of hypertext has been to provide a metaphor and hope (or examine the extent to which) the user can employ it. As the term navigation suggests, the most commonly provided metaphor is that of travel.

Hammond and Allinson (1987) report on a study in which two different forms of the travel metaphor were employed: ‘go-it-alone’ travel, and the ‘guided tour’. These two forms were intended to represent different loci of control over movement through the document, the first being largely user-controlled and the second being largely system-controlled. Additionally, a map of the local part of the information structure was available from every screen, with selectable arrows at the four edges leading to further maps, frames so far visited indicated, and all frames directly selectable from the map. Hammond and Allinson stress the importance of integrating the metaphor in the design of the system, which they did, and not surprisingly they found that users were able to employ it with little difficulty.

Of course, one could simply make the electronic book look as similar to the paper book as possible. This is the approach offered by people such as Benest (1990) with his book emulator and as such seems to offer a simple conceptual aid to novice users. Two pages are displayed at a time and relative position within the text can be assessed by the thickness of pages either side which are splayed out rather like a paper document would be. Page turning can be done with a single mouse press, which results in two new pages appearing, or by holding the mouse button down and simulating ‘flicking’ through the text. The layout of typical books can also be supported by such a system, thereby exploiting the schematic representations which we know that experienced readers possess.

If that were all such a system offered it would be unlikely to succeed; it would just be a second-rate book. However, according to Benest, his book emulator provides added value that exploits the technology underlying it. For example, although references in the text are listed fully at the back of the book, they can be individually accessed by pointing at them when they occur on screen. Page numbers in contents and index sections are also selectable, thereby offering immediate access to particular portions of the text. Such advantages are typical of most hypertext applications. In his own words:

"the book presentation, with all the engrained expectations that it arouses and the simplicity with which it may be navigated, is both visually appealing and less disruptive during information acquisition, than the older ‘new medium demands a new approach’ techniques that have so far been adopted."

This may be true, but at the time of writing no supporting evidence has been presented and as we have noted earlier, in the absence of empirical data one should view all claims about hypertext with caution.

It is interesting for two reasons that Benest dismisses the ‘new medium demands a new approach’ philosophy of most hypertext theorists. Firstly, there is a good case to be made for book-type emulations according to the arguments put forward above about the schematic representations which readers possess of texts. As outlined earlier, such representations facilitate usage by providing orientation or frames of reference for naïve users. Such points have been raised in sufficient detail earlier not to require further elaboration here. Secondly, the new approach which rejects such emulations has largely been responsible for the adoption of the concept of navigation through hypertext.

In response to the first issue it is worth noting that Benest’s approach is, to our way of thinking, correct up to a point. We ourselves have been developing a hypertext journal database and have decided that, on the basis of some of our studies cited earlier on usage styles and models of academic articles, emulating the structure of the journal as it exists in paper is good design. However, we are concerned less with emulation than with the retention of useful structures. This does not extend as far as mimicking page-turning or providing splayed images of the pages underlying either opened leaf. Furthermore, while we advocate the approach of identifying relevant schematic structures for texts, we would not expect all types to retain such detailed aspects of their paper versions in hypertext. There seems little need, for example, to emulate the book form to this degree for a hypertext telephone directory. Benest does not seem, however, to draw the line between texts that might usefully exploit such emulations and those that would not, or to state what he would expect unique hypertext documents to emulate.

In response to the second point, it is worth asking ‘is there an alternative to navigation as a metaphor?’ As we have continually noted in this chapter, the dominant approach to hypertext has produced the navigation-through-space metaphor. Benest, though still talking of navigation, does so in the limited sense that it is used in the paper domain. The more typical hypertext approach embraces navigation whole-heartedly and uses it as a means of inducing orienting schemata in the user’s mind.

Hammond and Allinson (1987) discuss the merits of the metaphor approach in general and the navigation one in particular for hypertext. They argue that there are two relevant dimensions for understanding the information which metaphors convey: scope and level of description. A metaphor’s scope refers to the number of concepts that the metaphor relates to. A metaphor of broad scope in the domain of HCI is the desk-top metaphor common to many computing interfaces. Here, many of the concepts a user deals with when working on the system can be easily dealt with cognitively in terms of physical desk-top manipulations. The typewriter metaphor frequently invoked for explaining word processors is far more limited in scope. It offers a basic orientation to using word processors (i.e., you can use them to create print quality documents) but is severely limited beyond that as word processors do not behave like typewriters in many instances.

The metaphor’s level of description refers to the type of knowledge it is intended to convey. This may be very high level information such as how to think about the task and its completion, or very low, such as how to think about particular command syntax in order to remember it easily. Hammond and Allinson talk of four levels: task, semantic, lexical and physical which refer to general issues such as: "Can I do it?"; "What does this command do?"; "What does that term mean?" and "What activities are needed to achieve that?" respectively. Few, if any, metaphors convey information at all levels, but this does not prevent them being useful to users. In fact, few users ever expect metaphors to offer full scope and levels of description.

According to Hammond and Allinson, the navigation metaphor is useful in the hypertext domain: when users are offered ‘guided tours’ through an information space they do not expect physical manifestations of the metaphor to apply literally but might rely much more heavily on semantic mappings between metaphor and system. As we have attempted to outline in the present chapter, there are numerous rich mappings that can be made between the navigation metaphor and hypertext, and thus it seems sensible to use it.

Benest’s book emulation is also a metaphor for using the system and in some instances would offer a broad scope and many levels of description between the paper text and the hypertext. The fact that we can talk about navigation and book metaphors in the one system shows that mixed metaphors are even possible and (though awaiting confirmatory evidence) probably workable in some instances.

It is hard to see any other metaphors being employed in this domain. Navigation is firmly entrenched as a metaphor for discussing hypertext use, and book comparisons are unavoidable in a technology aimed at supporting many of the tasks performed with paper documentation. Whether there are other metaphors that can be usefully employed is debatable. Limited metaphors for explaining computer use to the novice user are bound to exist, and where such users find themselves working with hypertext new metaphors might find their way into the domain. For now at least, though, it seems that navigation and book emulation are here to stay.

Navigating the semantic space

One aspect of the whole navigation issue that often appears overlooked in the hypertext literature is that of the semantic space of a text or electronic document. In other words, to what extent does a user or reader need to find his way about the argument that an author creates as opposed to, or distinct from, navigating through the structure of the information?

It is probably impossible to untangle these aspects completely. We noted earlier, in the section on readers’ memory for spatial location on pages, that there was a correlation between memory for location and comprehension. This is attributed to the fact that they are independent aspects of memory which are capable of being linked for mnemonic purposes. In other words, memories may consist of a constellation of attributes in which the recall of any one attribute is facilitated by the recall of others.

In terms of hypertext it seems that the ability to navigate through the information space should in some sense be related to the user’s comprehension of the contents of the document. At the time of writing, this question has not been directly tested. Most of the studies reported in this and the preceding chapter have contented themselves with looking at ability to locate information in a document rather than considering comprehension. It is not valid to infer that, because users can locate correct information or produce accurate maps of a hypertext’s structure, they necessarily understand its contents. It is likely that readers who well understand a hypertext’s contents will also have the ability to accurately locate information in it, but there is no guarantee that the reverse holds true.

Navigation through the semantic space will not relate or map absolutely to navigation in the structural sense if authors are not good at structuring information. It is conceivable that the linking power of hypertext packages will encourage some authors merely to link everything to everything else, or keep adding further nodes of information according to some vague philosophy which espouses association as ‘natural’ and hence desirable, leaving the reader to decide what they want to access. In such situations, grasping the message and grasping the structure will not necessarily overlap.

A further distinguishing characteristic is that while we can easily compare how near ideas are in terms of location in a structure, i.e., how many links exist between two nodes or the number of selections/button presses that need to be made to access node Z from node A, we cannot offer a similar measure of semantic distance. The extent to which two ideas are related may seem intuitively easy to assess but is unlikely to have an agreed quantifiable metric (see, for example, the work of Osgood et al., 1957, and Kelly, 1955). The notion of structuring semantic space will be discussed further in the chapters on writing and education.
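To make the structural half of this comparison concrete, the following is a minimal sketch of how the number of selections needed to reach one node from another might be counted. It is an illustration only and is not drawn from any of the systems discussed: it assumes the hypertext can be described as a table of one-way links, uses a standard breadth-first search, and the node names and the `example_links` table are hypothetical.

```python
from collections import deque


def link_distance(links, start, target):
    """Return the fewest link selections needed to reach target from start.

    `links` maps each node name to the nodes directly selectable from it.
    Returns None when the target cannot be reached at all.
    """
    if start == target:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in links.get(node, ()):
            if nxt == target:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


# Hypothetical five-node document: A links to B and C, both link to D,
# and D links to Z. Links are one-way, as with many hypertext buttons.
example_links = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D"],
    "D": ["Z"],
    "Z": [],
}

selections_needed = link_distance(example_links, "A", "Z")
```

In this hypothetical structure node Z is three selections from node A, while A cannot be reached from Z at all. No comparably simple procedure suggests itself for the semantic distance between the ideas those nodes contain.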

Conclusion

The concept of navigation is a meaningful one in the hypertext domain in the sense that we can view user actions as movement through electronic space. Research in the psychology of navigation in physical environments has some relevance but needs further empirical investigation to identify the extent to which the issues it raises may map directly onto users of electronic documents. Limitations in scope and level of such a mapping need to be made explicit. The expression of navigation difficulties is rarely supported with clear evidence, however, and the need for sound empirical work here should not be underestimated. The psychological model of navigation knowledge could prove a useful research tool in these circumstances. With respect to navigation of semantic space, it seems that existing research has little to tell us and the onus is on workers in the area to gain an understanding of such concepts through their own work.

