In order to understand the principles of designing for cognitive control, one
must have a model of user behavior. Such a model for menu selection must
incorporate the basic cognitive elements involved in the menu selection. These
components include the processes of visual information search, judgment and
decision processes, choice and response production, and evaluation of feedback.
To an extent these processes follow the temporal sequence of menu selection at
the level of the frame.
At a higher level, the model of the user must concern itself with the user's strategies of search and problem solving. If an item is not found on the first path through a database, how does the user redirect the path of search? In a command menu system, how does the user minimize the number of steps to complete a task by changing the order in which subcomponents of the task are performed? The answers to these questions depend on the user's model of the system and strategies for navigation through that model. Mental models have been represented in several ways in cognitive psychology. Scripts have been used to lay out the expected series of events. Metaphors have been used to map the elements and relations from a familiar system to a less familiar one. Production rules have been used to capture the knowledge that the user may have about the workings of the system. In each case, the idea of a cognitive layout may be used to describe the way in which users may engage a particular model and cast a visual representation or layout of the model. Such a layout defines the way in which the user thinks about using the system and serves as a vehicle for formulating plans.
It will be seen that a number of models of user behavior can be formulated depending on the level of analysis and the processes of interest. There is no single unified model but rather a collection of modeling techniques that can be applied to particular situations and performance variables. This chapter will cover a number of these models and techniques as they apply to menu selection.
4.1 The Menu Selection Process
The previous chapter dealt with the menu frame as a stimulus. This section will consider the cognitive processing of that stimulus frame. The menu selection process involves a number of cognitive elements. Within a particular menu frame, the user must read the alternatives, choose the desired option, effect the choice, and finally ascertain the consequences. Across menu frames, the user must maintain a sense of direction, evaluate proximity to the goal, and effect a plan of search or problem solving strategy. This section will examine the process within the frame. A theoretical model of these processes will help to evaluate the design of menu frames.
Menu processing is both a time relevant and information relevant task. For the most part, theories have been more concerned with user response time than with information received or information transmitted. While time is an important variable, its overall impact on performance may not be great when it only accounts for a second here and there. However, the time that it takes to respond to a menu frame can be used to test models of how the user processes information received via menu labels and options. Information transmitted refers to the choices made by the user. Each time the user makes a selection, information is transmitted to the computer. Choice behavior is subject to user preferences, goals, and expectations. An adequate theory must involve both response time and information transmission.
4.1.1 Information Acquisition and Search.
Figures 4.1 and 4.2 show several information processing models. The way in which a user scans a menu frame for information depends on the task and the user's prior knowledge about the frame. Typically the user starts with either an explicitly known target or a partially specified target. If the target is explicitly known (Figure 4.1), the user engages in a visual matching process. For each alternative scanned, the process detects either a match or a mismatch. Since errors can occur, the classic two-way table of possibilities from signal detection theory (Green & Swets, 1966) obtains as shown in the bottom panel of Figure 4.1. First, processing time is generally faster for a match than for a mismatch (e.g.). Second, to the extent that any transformation on the stimulus is required to process a comparison, response time will be increased (e.g.). Third, to the extent that alternatives are similar and confusable, there will be an increase in the number of errors (e.g., Kinney, Marsetta, & Showman, 1966). Menus which use visually and semantically distinct alternatives will result in faster response times and fewer errors. In practice, however, labels are not always distinct and may lead to increased processing time and selection errors.
If the target is partially specified, the user engages an encoding and evaluation process as shown in Figure 4.2. The user must read each alternative, understand its meaning, and generate an assessment. If the selection is construed as having a correct response, the user generates a subjective likelihood that the alternative satisfies the requirements of the partial specification. If the selection is construed as a preference on the part of the user with no correct answer, he or she generates a subjective utility for the alternative as a function of its worth relative to prior goals or requirements in the specification. For example, in information retrieval, if the user is looking for the population of India, alternatives such as "History," "Demographics," "Politics," "Religion," and "Facts at a Glance" may be evaluated for their subjective likelihood of supplying the answer. On the other hand, if the user is looking for something interesting about India, the alternatives would be evaluated on the basis of user preferences. In either case, an evaluation is made and the user makes a selection on the basis of its value.
In the case of partially specified goals, users may either evaluate all of the alternatives and select the alternative having the highest evaluation (left panel of Figure 4.2) or they may select the first alternative that exceeds a predetermined criterion value (right panel of Figure 4.2). The latter strategy is called satisficing (Simon, 1976). When the cost of an error or the negative consequences of selecting a less than optimal alternative is great, users will tend to engage in a careful and complete processing of alternatives. On the other hand, when time is of the essence, users will curtail their processing and select the first alternative that exceeds a preset criterion value (Beach & Mitchell, 1978).
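The contrast between complete evaluation and satisficing can be sketched in a few lines of Python. This is only an illustration: the function names, the evaluation scores, and the criterion value are hypothetical and are not taken from the studies cited.

```python
def choose_maximizing(scores):
    """Evaluate every alternative and return the index of the highest score."""
    return max(range(len(scores)), key=lambda i: scores[i])


def choose_satisficing(scores, criterion):
    """Return the index of the first alternative whose score exceeds the criterion.

    Returns None when no alternative exceeds the criterion, in which case
    the user would have to adopt a different strategy.
    """
    for i, score in enumerate(scores):
        if score > criterion:
            return i
    return None


# Hypothetical subjective evaluations for five menu alternatives, top to bottom.
scores = [0.10, 0.60, 0.05, 0.90, 0.20]

best = choose_maximizing(scores)               # inspects all five alternatives
good_enough = choose_satisficing(scores, 0.5)  # stops at the second alternative
```

With these assumed scores, the complete evaluation picks the fourth alternative, whereas satisficing settles for the second one after reading only two items.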
One might initially suppose that novice users would search a menu by reading each item one by one from the top of the list down and stop when the desired item is reached. While this may at times be the case, the evidence is that things are not so simple (Card, 1982). Users often scan menus in an idiosyncratic manner, glancing across the list of alternatives, hoping to light upon the desired alternative.
Three alternative search models are shown in Figure 4.3. Search may be (a) a serial inspection of items, (b) a random inspection without repetition, or (c) a random inspection with replacement. A serial search requires that the user inspect each item one by one without skipping around. Random inspection without repetition allows the user to skip around, but requires the user to keep track of items already inspected. Finally, random search with replacement allows the user to skip around; but because an item may be randomly inspected over again, the search lacks efficiency.
Search strategies are also characterized by their stopping rule. In a self-terminating search, the user stops when the desired item is encountered. An exhaustive search requires the user to inspect all of the items prior to making a choice. Finally, in a redundant search after all the items have been inspected, the user still cannot make a choice and must re-inspect some items. Menus and tasks that promote self-terminating search are expected to be faster than when users must examine all items exhaustively and redundantly. Typically, self-terminating search occurs when the user has an explicitly known target in mind and need only recognize a match between the target and an item. Self-terminating search may also occur if the subject uses the strategy of satisficing. If none of the alternatives meet the criteria before the list is exhausted, no decision has been achieved, and the user must adopt a different strategy. If the user kept track of an evaluation of each alternative, he or she may pick the alternative having the highest score. But more likely than not, the user may have to go back and re-evaluate alternatives in order to weigh the pros and cons associated with items still in the running.
Even after assessing all of the alternatives, it is possible that none of them proves satisfactory. The user has exhausted the list of options and not found any that meet his or her needs. Since menu selection provides only a finite set of alternatives, the user may feel limited and frustrated. In traditional decision making, the decision maker at this point would attempt to generate new alternatives. Within the confines of menu selection, the user may need to move to some other area of the menu tree. But more often than not, the menu simply does not provide the particular alternative needed, and the user must abandon the search and try to solve the problem or find the information in a totally different manner. More will be said about this in a later section on strategies and problem solving.
The amount of time that it takes to process a menu and select an alternative depends on the processing model and the number of alternatives per menu frame. Menu processing time as a function of the number of items has become an important issue in designing efficient hierarchical menus. If broad menus require an inordinate amount of time to search, then designers are advised to limit the number of items per frame and increase the depth of the menu hierarchy. On the other hand, if each decision requires a certain amount of overhead time, then depth will add to the total time, and designers are advised to increase the breadth. Consequently, the type of search process within each frame is extremely important.
Response time for menu scanning is a function of the number of items scanned and the time required to scan each item. Lee and MacGregor (1985) present a model in which search time within a frame is a linear function of the number of alternatives. For any search there will be an expected number of alternatives that will be inspected, E(A). For an exhaustive search E(A) = a, the total number of items in the frame. With a self-terminating search, if the correct alternative is at a random position, then E(A) = (a + 1)/2. Furthermore, E(A) may be greater than a if users need to re-evaluate alternatives in order to make a choice. Lee and MacGregor assume that the total time for each choice is
S = E(A)t + k + c,
where t is the time required to read one alternative, k is the key-press time, and c is the computer response time. The type of processing model operating determines the value of E(A).
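The equation can be evaluated directly. The Python sketch below follows the definitions given above; the function names and the numerical values chosen for t, k, and c are assumptions made for illustration and are not estimates reported by Lee and MacGregor.

```python
def expected_inspected(a, search="self_terminating"):
    """Expected number of alternatives inspected, E(A), in a frame of a items."""
    if search == "exhaustive":
        return float(a)
    if search == "self_terminating":
        # Target equally likely to be at any of the a positions.
        return (a + 1) / 2
    raise ValueError("search must be 'exhaustive' or 'self_terminating'")


def selection_time(a, t, k, c, search="self_terminating"):
    """Total time per choice, S = E(A)t + k + c."""
    return expected_inspected(a, search) * t + k + c


# Hypothetical parameter values in seconds: reading time per alternative,
# key-press time, and computer response time.
t, k, c = 0.25, 0.5, 0.5

s_exhaustive = selection_time(8, t, k, c, search="exhaustive")
s_self_terminating = selection_time(8, t, k, c, search="self_terminating")
```

Under these assumed values an eight-item frame takes 3.0 seconds with exhaustive search and 2.125 seconds with self-terminating search; only the term E(A)t differs between the two.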
Lee and MacGregor's model assumes that users scan in a systematic fashion as if they were reading text. However, when alternatives are graphic or the alternatives can be recognized on the basis of graphic characteristics, the locus of search may jump around considerably. Card (1982) has proposed that users sample from a portion of the display randomly with replacement (rightmost panel of Figure 4.2). Each sample is dependent on a saccade of the eyes. The assumption of random replacement means that the user may re-examine items. This model also assumes that search is self-terminating. Card draws upon a model originally developed by Kendall and Wodinsky (1961) for searching for airplanes in the sky or for blips on a radar screen.
If p is the probability of finding the target on a single saccade and k is the number of saccades required to find the target, then the cumulative probability of finding a target in k saccades under the assumption of sampling with replacement conforms to the geometric probability distribution:
P(k) = 1 - (1 - p)^k.
Assuming that each saccade takes about the same amount of time t, the average time to detect a target will be S = t/p.
If there is one correct alternative in a list of n, the probability of finding the target on a particular saccade will be p = 1/n and S = nt.
Consequently, search time is again a linear function of the number of items. And the geometric model predicts the same average time as the Lee & MacGregor model for an exhaustive search. The major difference is that the predicted variability will be much greater in the Card model than in the Lee & MacGregor model. Lee & MacGregor emphasize reading time because they are primarily addressing videotext systems. Card emphasizes saccade and visual search time for command menus. Unfortunately, both models ignore the decision process and assume that choice time (distinct from reading or visual scanning time) does not vary with the number of alternatives. A later section will address this issue.
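The difference in variability can be made concrete with the standard mean and variance formulas for the number of items inspected. The Python sketch below is an illustration under stated assumptions: the function names are hypothetical, and the serial self-terminating case assumes the target is equally likely to be at any position.

```python
def exhaustive(n):
    """Exhaustive serial search: always n inspections, so no variability."""
    return float(n), 0.0


def serial_self_terminating(n):
    """Target position uniform on 1..n: mean (n + 1)/2, variance (n^2 - 1)/12."""
    return (n + 1) / 2, (n * n - 1) / 12


def random_with_replacement(n):
    """Geometric distribution with p = 1/n: mean 1/p, variance (1 - p)/p^2."""
    p = 1 / n
    return 1 / p, (1 - p) / p ** 2


n = 8
mean_exh, var_exh = exhaustive(n)               # 8 inspections, variance 0
mean_ser, var_ser = serial_self_terminating(n)  # 4.5 inspections, variance 5.25
mean_geo, var_geo = random_with_replacement(n)  # 8 inspections, variance 56
```

For an eight-item frame the geometric model matches the exhaustive search in its mean of eight inspections, but its variance of 56 is far larger than that of either serial model. Multiplying the means by the time per inspection t gives the corresponding search times.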
It is often the case that users have more than one possible target for which they are searching. Several different items may satisfy the requirements of the search. For example, the user may be looking for either "stop" or "quit." An extensive series of studies on visual and memory scanning (Neisser, 1963; Schneider & Shiffrin, 1977; Shiffrin & Schneider, 1977) shows the relationship between the number of possible targets and the total response time. The experimental task is analogous to menu selection. A subject is asked to search for a target in a display of characters. For example, one might be asked to search the array shown in the upper panel of Figure 4.4 and report when the target has been found. The target may be simply defined as "the letter L," or the "letters L, M or Y." In general, the greater the number of possible targets, the longer it takes to detect the one that is actually there. The upper panel of Figure 4.5 shows the idealized results of such experiments using from one to six possible targets. The results indicate that there is a linear increase in search time as the number of possible targets increases. Presumably subjects scanned each item and then compared each of the possible targets in the target set with the item. Each comparison added a constant amount of time.
The intriguing result of these studies occurs when subjects practice the same set of targets over an extended period, week after week and month after month. The results indicate that differences due to both the number of targets and the number of items scanned decrease greatly. The lower panel of Figure 4.5 shows the idealized results after practice. Schneider and Shiffrin (1977) found that subjects looking for the same targets eventually could search for four targets about as quickly as they could for one. They could also search through four characters about as quickly as they could through one.
These results indicate that scanning and recognition processes may become automatic with extensive practice. Detection becomes a rapid, effortless, and almost unconscious process. Users of menu selection systems who routinely proceed through the same processes of scanning and selection develop to a point where their response times are no longer affected by the number of items or number of targets. Furthermore, the selection process becomes so ingrained that they do not think about it. The user of a word processor or spreadsheet package over many months of use no longer operates at the level of linear scanning for items. Recognition and selection become automatic, allowing the user to think about the task at hand rather than the control of the human/computer interface.
4.1.2 Choice Process and Time. The choice process may either occur after the user has scanned and evaluated all of the alternatives or it may occur in conjunction with the scanning and recognition process. In exhaustive search, the choice process occurs only after all of the alternatives have been scanned. In self-terminating search, the choice process is engaged following the evaluation of each alternative.
When the choice is separated from the linear scanning of alternatives, it is governed by the same process as in choice reaction time experiments. In these experiments, the subject is presented with a linear array of potential stimuli and a corresponding array of response buttons. When a stimulus is presented, the subject must press the response button corresponding to the stimulus. For example, the stimulus may be a number 1, 2, 3, or 4, and the buttons may be labeled 1, 2, 3, and 4. The results of such an experiment are summarized in Figure 4.6. Response time is a linear function of the uncertainty as to which stimulus will be presented. For n equally likely stimuli, uncertainty is given by the number of bits of information, or log2(n). This relationship is known as the Hick-Hyman law (Hick, 1952; Hyman, 1953).
A log model has been proposed for menu selection by Landauer and Nachbar (1985) based on the Hick-Hyman law for choice reaction time and on Fitts' law for movement time (discussed in the next section). According to the Hick-Hyman law, the time that it takes to select one out of n items in a choice reaction time study is
S = a + b log2(n),
where a and b are constants. When applied to menu selection, the equation requires that the probability of selecting any item is equal (i.e., 1/n). This is generally not the case in real world menu systems. Menu items vary greatly in the probability of selection. When the probability is greater than 1/n, the choice time is even faster. In general,
S = a - b log2(p_i),
where p_i is the probability that alternative i is the desired alternative.
The log law predicts that users can choose among a relatively large number of alternatives rapidly since choice time is a linear function of the amount of information rather than n. However, it must be remembered that this is true only when reading time is negligible. Consequently, the log law pertains more to highly practiced command menus.
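A short Python sketch shows how slowly choice time grows under the log law and how a high selection probability shortens it. The function names and the values of the constants a and b are assumptions for illustration only; they are not estimates from Landauer and Nachbar.

```python
import math


def choice_time_equal(n, a, b):
    """Hick-Hyman law for n equally likely alternatives: S = a + b log2(n)."""
    return a + b * math.log2(n)


def choice_time_item(p_i, a, b):
    """Choice time for an alternative with selection probability p_i:
    S = a - b log2(p_i)."""
    return a - b * math.log2(p_i)


# Hypothetical constants in seconds.
a, b = 0.2, 0.15

s_8 = choice_time_equal(8, a, b)        # 3 bits of uncertainty
s_64 = choice_time_equal(64, a, b)      # 6 bits of uncertainty
s_likely = choice_time_item(0.5, a, b)  # a frequently chosen item, 1 bit
```

With these assumed constants, going from 8 to 64 alternatives only adds 0.45 seconds (0.65 versus 1.10), and an item chosen half of the time takes 0.35 seconds. Setting p_i = 1/n in the second function reproduces the first.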
For complicated choice processes in which simple matching cannot be used as a basis for choice, response times will be subject to the difficulty of the choice as well as the number of alternatives. Menu selection becomes difficult when alternatives are complex bundles of attributes and no one alternative clearly dominates the rest. The evaluation process can be time consuming and mentally taxing. It has been shown, however, that when the choice difficulty exceeds the maximum cognitive load of an individual, response time decreases. The decision maker may resolve the choice on an ad hoc basis and circumvent the evaluation process. Hogarth (1975) presents a model for response time of complex decision processes.
As noted earlier, the selection may be based on either a target match or on (a) the subjective likelihood of an alternative being correct or (b) the subjective utility of an alternative to the user. Subjective likelihoods and utilities can be scaled on the basis of choice probabilities with Luce's (1959) choice axiom. Suppose that the user is faced with a menu of n alternatives. Let p_i be the probability that alternative i is chosen out of the whole set and let p_ij be the probability that alternative i is preferred over alternative j when only the alternatives i and j are available for selection. The choice axiom has two parts which are summarized below as they might apply to menu selection:
Axiom 1. If all pairwise preferences between alternatives are imperfect (i.e., 0 < p_ij < 1 for all i and j) then a Constant Ratio Rule holds such that the probabilities of choice from any subset of all the alternatives are naturally induced from p_i according to the rules of conditional probability. For example,
p(a_1 | a_1, a_2) / p(a_2 | a_1, a_2) = p(a_1 | a_1, a_2, ..., a_n) / p(a_2 | a_1, a_2, ..., a_n).
Consequently, in selection among four alternatives if the probabilities are .10 and .40 for Alternatives 1 and 2, then the ratio of probabilities in a binary choice would also be .25 (.10/.40) and the respective probabilities would be .20 and .80.
Axiom 2. If for any alternative a_i in the total set of alternatives there is an a_j such that p_ij = 0 (i.e., such that a_i is never preferred over a_j) then a_i may be deleted from the set of possible choices (i.e., a_i is never chosen from the total set). In other words, alternatives that are never chosen may be effectively considered as not existing.
These two axioms result in a ratio scale of measurement for the alternatives. A positive real number v_i can be assigned to each member a_i such that for i = 1, ..., n,
p_i = v_i / (v_1 + v_2 + ... + v_n).
One possibility is to let v_i = p_i since the sum of the probabilities equals one. Alternatively, v_i = kp_i for any positive constant k.
Luce's choice axiom is particularly useful in predicting choice probabilities when the menu set is restricted. This occurs when menu items are not appropriate and are dropped from the list or grayed out. Restricted sets also occur when the user has already selected one or several items and found out that they are not correct or do not lead to the desired goal. The effect of Luce's choice axiom is that the probabilities are essentially normalized to the number of effective alternatives in the set. Table 4.1 shows what happens for example when there are initially 6 menu items and the set is restricted.
Table 4.1
An Example of Choice Probabilities in Restricted Menus According to Luce's Choice Axiom
The probabilities change in the restricted sets such that there is a constant
ratio between any pairs of alternatives. These probabilities can be used to
predict the user choices in restricted sets.
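The renormalization implied by the Constant Ratio Rule can be sketched in Python as follows. The function name and the menu probabilities are hypothetical; in particular, the values below are not those of Table 4.1. The first two probabilities (.10 and .40) follow the four-alternative example given under Axiom 1, and the remaining two are assumed so that the set sums to one.

```python
def restrict(probabilities, available):
    """Renormalize choice probabilities over the alternatives still available.

    Each remaining probability is divided by the sum over the restricted
    set, which keeps the ratio between any pair of alternatives constant.
    """
    total = sum(probabilities[name] for name in available)
    return {name: probabilities[name] / total for name in available}


# Hypothetical choice probabilities for a four-item menu.
full_menu = {"a1": 0.10, "a2": 0.40, "a3": 0.30, "a4": 0.20}

# Only the first two alternatives remain (e.g., the others are grayed out).
binary = restrict(full_menu, ["a1", "a2"])

# The first alternative was tried and found not to lead to the goal.
after_error = restrict(full_menu, ["a2", "a3", "a4"])
```

In the binary case the probabilities become .20 and .80, so the ratio of .25 between the two alternatives is the same as in the full menu.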
4.1.3 Response Process. In order for the user to effect a selection, he or she must produce an overt response that can be detected by the computer. Two basic types of response production are used. The user may be required to enter a code for the alternative (e.g., "press '1' for Account") or to point to the alternative using some sort of pointing device such as cursor keys, a mouse, or a touch screen. Selection by code is complicated by the fact that it requires encoding and production processes. The user must read the instructions from the screen, encode their meaning, plan an intended action, and produce the overt response on the keyboard. Selection by pointing takes advantage of the fact that the pointing response has been highly practiced since infancy. Pointing requires the user to locate the current position of the pointing device (cursor or hand), locate the position of the desired alternative, and plan a trajectory from the current position of the pointing device to the desired alternative. To the extent, however, that any transformation or translation is required, time will be increased. The touch screen requires the least translation since it is a direct eye-hand response. The use of a mouse, drawing tablet, joystick, or trackball requires a degree of transformation since the pointing device is generally on a horizontal plane whereas the menu items are on a vertical plane. Furthermore, the user must translate the extent of hand movement to movement of the cursor. With practice, these times are reduced and the use of the pointing device seems quite natural.
Pointing by cursor keys requires the greatest amount of response transformation and translation since the trajectory must be translated into a discrete sequence of moves. For simple list menus, the cursor may be positioned with only the up and down arrows. For array menus and pull down menus all four arrow keys may be required. To change direction, the user must change keys. This requires additional response time and could produce errors. While cursor key positioning is rather simple with list menus, it becomes excessively difficult with large array menus requiring the user to traverse long distances.
Studies on stimulus-response compatibility strongly suggest that the layout of the alternatives on the screen should match the physical layout of the response buttons (Fitts & Seeger, 1953; Fitts & Switzer, 1962). Without such compatibility, the user must engage a translation process to remap the location of items. The worst cases are when the layouts are reversed and when directional indicators are reversed as in mirror writing (i.e., physical movement to the right moves the cursor to the left on the screen).
Motor response time depends on the distance from the current position of the pointing device to the location of the desired target as well as the difficulty of hitting the target. For analog pointing devices the time depends on the distance to the target and the size of the target. According to Fitts' law (Fitts, 1954), the time that it takes to move to a target is a logarithmic function of the ratio of its distance and width:
R = a log(d/w) + b,
where d is the distance to the target, w is the width of the target, and a and b are constants. For analog movements motor time is inversely related to the log of the width of the target. Consequently, it would behoove designers of analog input devices to display large menu targets rather than small buttons.
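The effect of target width can be illustrated with a small Python sketch of Fitts' law as written above. Several things here are assumptions: the text does not state the base of the logarithm, so base 2 is used; the function name and the constants a and b are hypothetical; and distances and widths are given in arbitrary screen units.

```python
import math


def pointing_time(d, w, a, b):
    """Movement time R = a log(d/w) + b, with the logarithm taken to base 2
    (an assumption; the base is not specified in the text)."""
    return a * math.log2(d / w) + b


# Hypothetical constants in seconds.
a, b = 0.1, 0.3

small_target = pointing_time(d=160, w=10, a=a, b=b)  # log2(16) = 4
large_target = pointing_time(d=160, w=20, a=a, b=b)  # log2(8) = 3
```

With these assumed values, doubling the width of the target at a fixed distance reduces the predicted movement time from 0.7 to 0.6 seconds, that is, by exactly the constant a.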
Motor response time for discrete pointing using arrow keys is governed by a different process. One would expect that the time to select an alternative using arrow keys would be a linear function of the x + y distance of the cursor from the alternative:
R = a (dx + dy) + b,
where dx and dy are the x and y displacements of the cursor position from the target location and a and b are constants. Although this model is intuitive and simple, it is not entirely correct. In general with such motor movements, there is a large initial startup time or acceleration and a slowdown time or deceleration upon approaching the target. Nevertheless, the equation serves as a good first approximation for motor time.
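As a first approximation, the arrow-key model can be sketched in Python. The function name and the constants are hypothetical, and displacements are counted in key presses; absolute values are taken so that moves up or left count the same as moves down or right.

```python
def arrow_key_time(dx, dy, a, b):
    """Cursor-key positioning time R = a (dx + dy) + b.

    dx and dy are the horizontal and vertical displacements, counted in
    key presses, from the cursor to the target.
    """
    return a * (abs(dx) + abs(dy)) + b


# Hypothetical constants in seconds: time per key press and startup time.
a, b = 0.15, 0.4

list_menu = arrow_key_time(dx=0, dy=3, a=a, b=b)   # three items down a list
array_menu = arrow_key_time(dx=4, dy=6, a=a, b=b)  # across a large array menu
```

Under these assumed constants, the short list move takes 0.85 seconds and the array move 1.9 seconds, which shows how quickly the linear cost accumulates over long distances.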
For short list menus, cursor arrow keys may be faster than analog pointing devices. However, when there are a large number of alternatives, the analog device has the distinct advantage.
In both discrete and analog situations, the constant b is the time to plan and move to the response device. If the user's hand is already on the device, the time to move to the device is eliminated. However, a common complaint is that to use a pointing device one has to take one's hands off home position on the keyboard and locate either the arrow keypad or the analog device. If the majority of time is spent navigating through menus, then the home position may in practice be on the pointing device. The problem is critical only when the user must frequently alternate between devices.
4.1.4 Evaluation and Error Detection. Once the user has made a selection, the system generally provides feedback of some type. The feedback may be receipt of some information, the location of a target item, the execution of a function, or the presentation of a subsequent menu frame. The feedback may immediately indicate to the user that the selected alternative was correct or incorrect. If it is correct, the user is reinforced and the processes leading to that selection are strengthened. For example, prior to the selection, the user may have assessed only a .5 probability that the alternative would lead to the goal. Following the feedback, that probability can be updated.
On the other hand, feedback may not directly indicate whether the selection was correct. The user may have only partial knowledge about the success of the prior selection. This is particularly true in hierarchical menus. The next frame may give some indication about whether the user is on the right path; but since it is not the target item itself, the user cannot be sure that he or she is on the right path. If the feedback is positive, the user is likely to continue. If it is negative, the user may turn back depending on how unlikely the path now appears. Consequently, feedback engages another decision making process in hierarchical menu search. How this affects the search strategy will be discussed in the next section.
4.2 Problem Solving and Search Strategies
Although much goes on at the frame level of menu processing, the cognitive control of the interface is more properly positioned at a global level. How does the user plan a task that requires a series of menu selections? What is the user's strategy for effecting a search through a complex database? These questions strike at the very essence of thinking and problem solving as they apply to cognitive control of the human/computer interface.
The menu interface provides the user with options that, if applied in the right order, may achieve the goal state. For example, the goal may be to align the left edges of cubes in a 3-dimensional drawing program. The user must use the menu interface to select the cubes, select their left edges, and finally select the command to align. The exact order of operation is determined by the rules of the system. Each menu selection constitutes a move which may or may not get closer to the desired state. The steps to solving such problems include planning the solution, carrying it out, and finally checking the results. The difficulty and length of each step depend on the complexity of the problem. The drawing problem above is relatively easy and the length of each step is short. The problem of generating a 3-dimensional image of a space station using a drawing program is much more difficult yet involves the same idea.
Menu interfaces often do more than just provide options. Problems or tasks that are repetitively solved or performed in the same way can be gracefully directed by the menu. For example, in an electronic mail system there is a natural order of steps that may be incorporated in the order of menus: check if there are new messages, read the first message, respond, read the next message, respond, and so on. The menu system can incorporate this order and bypass a number of redundant steps by initially listing new messages. The user may immediately select a message to read and then select options to reply, forward, or delete. One system that explicitly attempts to incorporate the user's plan of work gives the user an "inbox" menu in which the user may either select messages to view or select other program functions. The concept is to position the user at the most likely point of entry rather than at the beginning of a hierarchical command path.
4.2.1 Heuristics. A number of heuristics, strategies, and problem solving styles have been discussed in the literature that are relevant to search in menu selection. Heuristics are plans for attacking problems. They are usually simple sequences of steps that generally work but are not guaranteed to result in a solution in the same way that an algorithm would. The advantage of heuristics is that they require a minimum of time and effort. They are cheap and dirty.
Generate-Test. The generate-test heuristic is one of the simplest heuristic strategies with only two steps (Newell & Simon, 1972): (a) generate a candidate for a solution and (b) test to see if it is actually a solution. If the candidate fails the test, the problem solver keeps generating candidates until the goal is attained. The heuristic is, however, only as effective as the process that generates the candidates. The advantage of menu selection is that the user generates responses by selecting options. Newell and Simon (1972) note four difficulties in the generate-test strategy. First, it may be difficult to generate candidates. Menu selection reduces this problem by explicitly listing a set of potential candidates. Second, it may be hard to test to see if the candidate is actually a solution. For explicit targets a simple matching test is all that is required. For partially specified goals, the test may be more complex. And for complex problems requiring a number of steps, it may be extremely difficult to evaluate if the selection is on the right path. For example, it is easy to generate a chess move, but very difficult to know if it is the best of all possible moves. Third, if there are a large number of candidates with a low probability of any one achieving the goal, the generate-test heuristic is unlikely to work. A random trial and error approach is doomed to failure in complex systems and large search spaces. On the other hand, in simple systems, the trial and error approach may work well. In fact, a number of systems advocate trial and error as a good way to start learning how to use functions. Fourth, it may be that the correct solution has a low probability of being selected by the problem solver. The user is likely to pick a number of other candidates before selecting the correct one.
Menu selection as an interface to problem solving may help to direct the problem solver to the correct solution by the order in which alternatives are listed. Highly likely candidates should be listed first and unlikely candidates are buried at the bottom of the menu.
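The effect of listing order on the generate-test heuristic can be illustrated with a minimal sketch. It assumes a user who simply tries options in the order listed; the function name generate_test and the four-item menu are hypothetical and serve only as illustration.

```python
from typing import Callable, Iterable, Optional, Tuple


def generate_test(
    candidates: Iterable[str], is_solution: Callable[[str], bool]
) -> Tuple[Optional[str], int]:
    """Try candidates in listed order; return (solution or None, tests made)."""
    tests = 0
    for candidate in candidates:      # (a) generate the next candidate
        tests += 1
        if is_solution(candidate):    # (b) test whether it is a solution
            return candidate, tests
    return None, tests


# Hypothetical menus: the same options, likely candidate first versus last.
likely_first = ["Print", "Rename", "Copy", "Delete"]
likely_last = ["Delete", "Copy", "Rename", "Print"]


def wants_print(option: str) -> bool:
    return option == "Print"


print(generate_test(likely_first, wants_print))
print(generate_test(likely_last, wants_print))
```

Under this assumption the same target costs one test when it heads the list and four when it is buried at the bottom.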
One of the greatest problems with the generate-test heuristic is that often a candidate cannot be evaluated until it is completely generated. For many problems this is inefficient. The problem solver may be able to evaluate partial solutions. For example, in solving a crossword puzzle, one does not fill in all of the spaces and then check to see if it is the correct solution. Instead one looks for and evaluates partial solutions along the way. Furthermore, there is a great utility to breaking the problem into subgoals. The problem-reduction approach (Nilsson, 1971) reduces the overall size of the search space. Menu systems that organize search into a series of substeps can make effective use of the problem-reduction approach. Rather than searching an index of all newspaper articles, the system may break the search into the substeps: (a) select the year, (b) select the topic, and (c) finally search through the remaining articles. Similarly in a drawing program, problem-reduction may be implemented by allowing the user to construct elementary objects as subgoals. These objects may then be selected for use to achieve more complex goals.
Hill Climbing. A similar strategy takes as its metaphor the idea of hill climbing. One can climb to the top of a simple hill (monotonically increasing in height from any point) blindfolded by merely taking each step such that it results in a higher position than before. Similarly, in the formal strategy of hill climbing, the problem solver selects each move such that an evaluation function results in a higher value than the previous move. Ultimately, one assumes that the goal has been reached when no move can be found that increases the function. In a menu selection system, each menu selection constitutes a move. The resulting frame generally provides informative feedback indicating if the user is getting "hotter" or "colder."
More formally, assume that for each selection i the user has an expected value of the feedback that will be received, e(Fi). When the user evaluates the feedback, it results in a subjective value, s(Fi). These two values are compared. If e(Fi) - s(Fi) is less than a criterion value ci, then the user will proceed. If it is not, the user will terminate or redirect the search path. It is expected that the value of ci will depend on the depth of search and on the ease of redirecting the path in a more profitable direction. Users will probably be more and more reluctant to shift off the path the further they have committed themselves to a particular course. Consequently, the further down the tree and closer to the terminal level, the greater ci. Moreover, if there was another alternative in a previous menu judged to have a high likelihood of leading to the goal, the value of ci will be reduced. Users will shift to another path if it requires little extra in the way of repositioning. The option to move back to the previous menu frame allows repositioning at a local level; whereas, the option to move back to the top of the menu allows repositioning at a distal but fixed level. Very few systems allow for user-set repositioning by way of markers. An innovative technique would be to allow the user to define markers to be placed at various points along the search, as one might drop bread crumbs on a path through a maze to find one's way back. Search could be repositioned by selecting one of the markers and restarting from there.
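The proceed-or-redirect rule can be sketched as follows. The comparison of e(Fi) - s(Fi) with ci comes from the text. The linear form of the criterion function, its parameter names, and its default weights are assumptions made purely for illustration, since the text states only the direction in which ci changes.

```python
def criterion(
    depth: int,
    best_alternative: float,
    base: float = 1.0,
    depth_gain: float = 0.5,
    alternative_gain: float = 1.0,
) -> float:
    """Hypothetical c_i: grows with search depth, shrinks when an earlier
    menu held an alternative judged likely to lead to the goal."""
    return base + depth_gain * depth - alternative_gain * best_alternative


def should_proceed(expected: float, subjective: float, c: float) -> bool:
    """Proceed when the shortfall e(F_i) - s(F_i) is less than c_i."""
    return (expected - subjective) < c


# The same disappointing feedback (a shortfall of 2.0) in three situations.
print(should_proceed(5.0, 3.0, criterion(depth=1, best_alternative=0.0)))
print(should_proceed(5.0, 3.0, criterion(depth=4, best_alternative=0.0)))
print(should_proceed(5.0, 3.0, criterion(depth=4, best_alternative=2.0)))
```

With these assumed weights the user abandons the path near the top of the tree, stays on it when four levels deep, and abandons it again when an attractive alternative was seen earlier.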
Studies of hill climbing indicate that problem solvers tend to concentrate on only one attribute at a time in selecting their next move rather than selecting moves that change several attributes and achieve the goal in fewer moves (Norman, 1983). In a data base search of a library catalog, this would be analogous to searching first on the basis of author's name to reduce the set and then switching to search on the basis of title. Ultimately, the user may need to switch back to a name search if the title search does not result in a find.
Hill climbing is an effective strategy only if the evaluation function is well behaved and there is only one global maximum. If this is not the case, the problem solver may only find the solution by taking a detour in which the evaluation function goes down for one or several moves before it rises again. Moreover, if local maxima exist, the problem solver may get trapped at what appears to be the solution, but is not truly the optimal selection.
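The local-maximum trap can be shown in a small sketch. It assumes a one-dimensional landscape in which each position has at most two neighbours; the evaluation values are invented for illustration.

```python
from typing import Sequence


def hill_climb(values: Sequence[float], start: int) -> int:
    """Step to the better neighbour until no neighbour improves the
    evaluation function; return the index where the climb stops."""
    position = start
    while True:
        neighbours = [
            i for i in (position - 1, position + 1) if 0 <= i < len(values)
        ]
        if not neighbours:
            return position
        best = max(neighbours, key=lambda i: values[i])
        if values[best] <= values[position]:
            return position           # no move increases the function
        position = best


# Hypothetical evaluation values: local maximum at index 1, global at index 4.
landscape = [1, 3, 2, 4, 6, 5]
print(hill_climb(landscape, start=0))
print(hill_climb(landscape, start=2))
```

Starting at index 0 the climber halts on the local maximum, because reaching the global maximum would require a move that first lowers the evaluation.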
Test-Operate-Test-Exit. One of the basic ideas behind the generate-test heuristic is that of feedback. The problem solver monitors the current state and generates responses to change that state to satisfy some criterion. But many tasks require a more complicated strategy. Miller, Galanter, and Pribram (1960) discuss a strategy that not only incorporates the idea of feedback but also the hierarchical structure of interlocking component processes. This plan is called TOTE for Test-Operate-Test-Exit. A simple plan for hammering nails is shown in the left panel of Figure 4.7. The object is to hammer a nail until it is flush with the surface. The first stage is to test the nail. If it sticks up, then one goes to the second stage; otherwise one stops. The second stage is to test the hammer. If it is down, one lifts it up; otherwise one goes to the third stage. The third stage is to strike the nail, after which one goes to the first stage.
Many tasks involve just this sort of combination of feedback and hierarchical structuring of components. The right panel of Figure 4.7 shows the same sort of TOTE for a data entry task. The first stage is to check the inbox for data. If there is data, then one goes to the second stage; otherwise one stops. The second stage is to test the entry field. If it is not the correct one, the correct field is selected; otherwise one goes to the third stage. The third stage is to enter the data, after which one goes to the first stage again.
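The data entry TOTE can be sketched as two nested test loops. The representation of the inbox as a queue of (field, value) pairs and the returned log of operations are assumptions made for illustration; they are not taken from Figure 4.7.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple


def run_data_entry(inbox: Deque[Tuple[str, str]]) -> List[str]:
    """Run the data entry TOTE and return a log of the operations performed."""
    log: List[str] = []
    selected: Optional[str] = None
    while inbox:                          # Test: is there data in the inbox?
        field, value = inbox[0]
        while selected != field:          # Test: is the correct field selected?
            selected = field              # Operate: select the correct field
            log.append(f"select {field}")
        inbox.popleft()                   # Operate: enter the data
        log.append(f"enter {value}")
    return log                            # Exit: the inbox is empty


# Hypothetical inbox contents.
work = deque([("name", "Ada"), ("name", "Bob"), ("city", "Oslo")])
print(run_data_entry(work))
```

The inner test is skipped whenever the correct field is already selected, which mirrors the hammer test in the nail example.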
The value of hierarchical plans for solving problems has been emphasized by Simon (1969). To illustrate the advantage of hierarchical structure, Simon presents a parable about two watchmakers, Tempus and Hora. Both make watches consisting of 1000 parts. Tempus builds his watches in one assembly of 1000 parts. However, if he is interrupted in the middle of the assembly by a customer, the partially assembled watch falls apart into its original pieces. Hora's watches are built in units of 10 pieces. Ten single parts make a unit, 10 units make a larger component, and the 10 components make the entire watch. If Hora is interrupted he loses only a small portion of the unfinished watch. Simon estimated that Tempus will lose an average of 20 times as much work per interruption as Hora. Although problem solvers may not suffer from the problem of losing prior work, hierarchical structure may prove beneficial in that the problem solver needs to think about only a limited number of elements at a time.
Menu interfaces are particularly germane to problem solving involving simple hierarchies. Menu systems can incorporate the hierarchical nature of the task into their own structure. The question is whether they do so in an effective and efficient manner that facilitates performance.
Means-Ends Analysis. Another important strategy in problem solving is the means-ends analysis. This heuristic involves a number of components already discussed. In the means-ends analysis the problem solver works on one goal at a time. If that goal cannot be achieved, the problem solver sets a subgoal of removing the obstacle that blocks that goal. The problem solver constantly monitors the difference between the current state and the goal state desired. If a difference exists, an attempt is made to generate a response that will reduce that difference. Simon (1969, p. 112) summarizes the means-ends analysis as follows: "Given a desired state of affairs and an existing state of affairs, the task of an adaptive organism is to find the difference between these two states and then to find the correlating process that will erase the difference."
A typical problem for the computer user that would be addressed by the means-ends analysis might be the task of viewing File X. But this goal is blocked because the file is not loaded and the user doesn't know the path to that file. Consequently, the first subgoal is to find File X. This may be solved by getting a directory listing. But that can only be accomplished if the program listing the directory can be run. So the sub-subgoal is to run the directory program. Running the directory program may itself entail the solution of a number of sub-subgoals. Once File X is found, the second subgoal is to run a browsing program to view the file. Again, a number of sub-subgoals may have to be achieved to accomplish this.
The means-ends analysis can be mentally taxing when the user must keep track of a number of embedded subgoals. Although the computer may easily store such goals on a push-down-pop-up stack, it is not so easy for the user. However, solutions to problems requiring the means-ends analysis may be facilitated by the use of hierarchical menu selection and event trapping menus. The main advantage of the menu interface is the ability to traverse goal and subgoal states up and down the hierarchy of operations and to access menu options from numerous points during the interaction. The menu interface reduces the memory load on the user by keeping track of goal states (e.g., location in the hierarchical structure) and prompting the user for input appropriate to each subgoal. Furthermore, the hierarchical menu structure may allow the user to solve subgoals in an order more amenable to human thinking rather than in an order dictated by formal problem solving. For example, in order to view File X in the problem above, the user of a menu selection system might first select a browsing program from a menu before knowing the location of File X. The browsing program prompts the user for a file. A means-ends analysis reveals that the user should have solved the problem of the location of File X first. The user would then have to back out of the browsing program in order to find the file. However, a menu interface that allows the user to select concurrent processes frees the user from having to perform tasks in a predetermined order. While in the browsing program, the user could select the file directory program, find the file, and pass its location to the browsing program.
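The File X problem can be written as a small means-ends sketch in which each blocked goal pushes its preconditions as subgoals. The operator table, the goal names, and the function achieve are hypothetical; the table is also assumed to contain no circular preconditions.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical operator table: goal -> (action achieving it, preconditions).
OPERATORS: Dict[str, Tuple[str, List[str]]] = {
    "File X viewed": (
        "view File X in the browsing program",
        ["File X located", "browsing program running"],
    ),
    "File X located": ("get a directory listing", ["directory program running"]),
    "directory program running": ("run the directory program", []),
    "browsing program running": ("run the browsing program", []),
}


def achieve(goal: str, state: Set[str], plan: List[str], depth: int = 1) -> int:
    """Achieve goal by first achieving each unmet precondition as a subgoal.
    Appends actions to plan and returns the deepest goal-stack level reached."""
    if goal in state:
        return depth - 1                  # nothing to do, nothing pushed
    action, preconditions = OPERATORS[goal]
    deepest = depth
    for subgoal in preconditions:         # remove each obstacle in turn
        deepest = max(deepest, achieve(subgoal, state, plan, depth + 1))
    plan.append(action)                   # the difference can now be erased
    state.add(goal)
    return deepest


steps: List[str] = []
stack_depth = achieve("File X viewed", set(), steps)
print(steps, stack_depth)
```

The returned depth is the number of embedded goals that must be held at once, which is the memory load a menu interface can carry on the user's behalf.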
4.2.2 User Strategies and Styles. The types of strategies that users employ and their effectiveness depend not only on the tasks but on characteristics of the users. For example, users vary in their repertoire of generating solutions--an attribute important in the generate-test heuristic. Users differ greatly in their ability and willingness to plan ahead--an attribute important in the means-ends analysis. Finally, users differ in the degree with which they will pursue a particular course of action before they give up. This attribute is particularly important in menu selection since problem solving often involves a search through a hierarchical menu structure.
Search behavior by problem solvers has been characterized as either (a) shallow and broad or (b) narrow and deep. A shallow problem solver is likely to survey a wide number of possible solutions but explore them only superficially. This type of problem solver considers a solution and if it is not immediately apparent that it will lead to the solution, drops the alternative and turns to another one. This user is likely to look only 1 or 2 levels deep before going back to the top again. On the other hand, the narrow and deep problem solver is likely to limit his or her search to only a few alternatives and explore them in depth. In this style, the problem solver picks one path and follows it out until it either results in the solution or its potential is completely exhausted. Different types of problems are more conducive to solution by different styles of problem solving. The shallow and broad strategy is more appropriate for placing pieces in a jigsaw puzzle; whereas, the deep and narrow strategy is more appropriate for solving the Tower of Hanoi puzzle or playing chess. In a similar way different types of menu systems are more conducive to different types of search strategy. The shallow and broad strategy seems appropriate for the broad range of easily scannable alternatives provided by pull down menus. The deep and narrow strategy would be more appropriate for navigation in a complex data base.
A mixture of these strategies is known as progressive deepening. In this strategy each alternative is explored to a certain level. If no solution is found, alternatives are pursued to a further level. Thus, a progressive deepening is conducted in search. This type of strategy is particularly useful when there are a fair number of possible solutions and it is not clear to what extent each alternative must be explored to determine its suitability. Unfortunately, it is hard to see how users of a hierarchical menu system could effectively use this strategy unless they could keep track of previous search depths and jump to those points quickly. However, menu interfaces that allow concurrent searches or that allow the users to place bookmarks for fast return can facilitate the progressive deepening strategy.
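Progressive deepening corresponds closely to what is usually called iterative deepening search, and can be sketched on a small menu tree. The tree MENU and both function names are hypothetical; the sketch also assumes the user restarts from the top of the menu at each new depth limit.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical menu hierarchy: frame -> options that lead to further frames.
MENU: Dict[str, List[str]] = {
    "Main": ["File", "Edit"],
    "File": ["Open", "Print"],
    "Edit": ["Find"],
    "Find": ["Replace"],
}


def depth_limited(node: str, target: str, limit: int, visits: List[str]) -> bool:
    """Explore from node no deeper than limit, recording every frame visited."""
    visits.append(node)
    if node == target:
        return True
    if limit == 0:
        return False
    return any(
        depth_limited(child, target, limit - 1, visits)
        for child in MENU.get(node, [])
    )


def progressive_deepening(
    root: str, target: str, max_depth: int
) -> Tuple[Optional[int], List[str]]:
    """Explore every alternative to depth 0, then 1, and so on."""
    visits: List[str] = []
    for limit in range(max_depth + 1):
        if depth_limited(root, target, limit, visits):
            return limit, visits
    return None, visits


found_at, trail = progressive_deepening("Main", "Find", max_depth=3)
print(found_at, len(trail))
```

The length of the trail shows the cost noted above: shallow frames are revisited on every pass unless the interface lets the user jump straight back to earlier search depths.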
Search styles can be much more complex than merely varying on depth. A number of patterns and additional factors have been characterized by Canter, Rivers, and Storrs (1985) for navigation through complex data structures. Canter et al. define six indices that can be used to quantify search patterns. The patterns are based on paths, rings, loops, and spikes in data base traversal (see Figure 4.8).
Pathiness : A path is any route through the data that does not visit any node twice. It starts at one point and terminates at another. Menu traversal may be characterized by many short paths (high pathiness) or few long paths (low pathiness).
Ringiness : A ring is a route through the data which returns to the node from which it started. Since a ring has a home base, it may be thought of as an "outing." Such a ring may include other rings. Menu traversal may be characterized by many rings returning to home base (high ringiness) or few rings (low ringiness).
Loopiness : A loop is a ring which contains no other rings. A loop is a simple ring and is distinguished by the fact that no node is visited twice except the home base.
Spikiness : A spike is a route through the data which goes out to a node and returns exactly the way it came. Hierarchical data bases are likely to result in high spikiness since one would traverse the hierarchy down and retrace the path back out.
NV/NT : The ratio of the number of nodes visited (NV) to the total number of nodes available in the system (NT) gives the proportion of available nodes utilized by the user. A high NV/NT ratio indicates a more comprehensive coverage of the data base.
NV/NS : The ratio of the number of different nodes visited (NV) to the total number of visits to nodes (NS) gives the proportion of first time visits. A low NV/NS ratio indicates a high degree of repetitive visits to nodes.
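The definitions above translate fairly directly into tests on a recorded route, as in the following sketch. It classifies single routes and computes the two ratios; it does not attempt to reproduce how Canter et al. aggregate routes into the pathiness, ringiness, loopiness, and spikiness scores, and the function names and the sample session are hypothetical.

```python
from typing import List, Sequence


def is_path(route: Sequence[str]) -> bool:
    """A path visits no node twice."""
    return len(set(route)) == len(route)


def is_ring(route: Sequence[str]) -> bool:
    """A ring returns to the node from which it started."""
    return len(route) > 1 and route[0] == route[-1]


def is_loop(route: Sequence[str]) -> bool:
    """A loop is a ring in which only the home base is visited twice."""
    return is_ring(route) and is_path(route[:-1])


def is_spike(route: Sequence[str]) -> bool:
    """A spike goes out along a path and returns exactly the way it came."""
    outbound = list(route[: len(route) // 2 + 1])
    return (
        len(route) >= 3
        and len(route) % 2 == 1
        and list(route) == list(route)[::-1]
        and is_path(outbound)
    )


def nv_nt(visits: Sequence[str], total_nodes: int) -> float:
    """Proportion of the available nodes that were visited."""
    return len(set(visits)) / total_nodes


def nv_ns(visits: Sequence[str]) -> float:
    """Proportion of visits that were first-time visits."""
    return len(set(visits)) / len(visits)


# Hypothetical session in a system assumed to have 10 nodes.
session: List[str] = ["Main", "File", "Open", "File", "Main", "Edit", "Main"]
print(nv_nt(session, total_nodes=10), nv_ns(session))
print(is_spike(session[:5]), is_loop(["Main", "File", "Edit", "Main"]))
```

A spike is a ring but not a loop, since its intermediate nodes are visited twice on the way back.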
Canter et al. (1985) use these definitions to further characterize search strategies.
Scanning : When users are scanning, they tend to cover a large area of the menu system, but without going into great detail or depth. Scanning will result in long spikes and short loops which traverse through the database but do not extend very far into it. It is characterized by a high proportion of nodes visited relative to the total number of nodes available.
Browsing : Users may be happy to go wherever the data leads them. Users will pursue a path as long as it sustains their interest. Browsing behavior may be characterized by many long loops and a few large rings.
Searching : When users are searching for a particular target, the pattern may include ever-increasing spikes with a few loops. It is also characterized by a high redundancy of nodes revisited relative to the total number of different nodes visited.
Exploring : Many different paths of medium or short length suggest that the users are trying to grasp the extent and nature of the database. They may be attempting to gain a global map of the menu system.
Wandering : Users may wander more or less randomly through the database. The unstructured journey will lead to many medium-sized rings.
These strategies give an idea of the different types of search patterns that may occur as a function of motivational factors. Although Canter et al. characterize them in terms of the indices defined earlier, it remains to be seen whether one can delineate the type of search strategy based on the six indices.
4.3 Cognitive Layouts of Mental Models
Problem solving is governed by the way in which the problem space is represented. One may represent a problem mathematically, another visually, and still another metaphorically. Studies of problem solving behavior suggest that the key to problem solving is more often having the proper representation rather than the ideal strategy (Wickelgren, 1974).
When the user plans a task involving a menu selection system, both the menu and the task domain comprise the problem space. The menu representation and the task representation help to define the way in which the user thinks about the problem. The term "mental model" has been used loosely to refer to these representations in the sense that the user adopts a conceptual model of computer operations that may relate abstract ideas (e.g., storage registers, I/O drivers, etc.) to concrete things (e.g., mailboxes, TV channels, etc.). The user's mental model of the system has been defined and illustrated numerous ways in the literature (Norman & Draper, 1986). Representations may take the form of metaphors, schemata, scripts, or cognitive layouts. These representations are by no means exhaustive or nonoverlapping. However, they serve to characterize the way in which users think about cognitive control of the human/computer interface.
4.3.1 Menu Selection as a Metaphor. The purpose of a metaphor as a literary device is to transfer the reader's concrete knowledge about a familiar thing to an unfamiliar subject being written about. The author draws upon the wealth of existing knowledge to shed light on a novel topic. It has been suggested that the same process of transference be capitalized upon in human/computer interaction. Carroll and Mack (1985) ask, "Can interfaces be designed to take advantage of the metaphors new users generate spontaneously as they apply their prior knowledge to this novel learning situation?" The extent to which the design suggests and actually conforms to the metaphor determines the amount of transference of knowledge. Part of the knowledge transferred is prior experience in problem solving strategies. For example, if the metaphor for a telecommunication program is a scrolling teletype, then the user would infer that text that scrolled off the top of the screen can be viewed again by scrolling backwards.
In most metaphors for computer operations, the base is more familiar but the novel area is more functionally rich. The typewriter is well understood, but its functionality is considerably less than the word processor. Similarly, the card catalog is well understood, but its functionality is considerably less than the online catalog. However, when it comes to metaphors for human/computer interaction the functionality of the base is greater than that of the computer application. Natural language has greater familiarity and functionality than computer command languages. In a similar, but more structured way, restaurant menu selection has a greater familiarity and functionality than computer menu selection.
Computer menu selection is, quintessentially, a metaphor. The original knowledge base is that of ordering items in a restaurant. However, the correspondence of elements between the two domains runs deeper than a superficial application of the metaphor. Webster (1976) gives the following definition for a menu:
menu n. [ Fr., small, detailed, from L. minutus, pp. of minuere, to lessen, from minor, less] 1. a detailed list of the foods served at a meal; bill of fare. 2. the foods served.
The menu presents a finite set of items available at the establishment. The customer then makes a selection and informs the server. The order is then prepared and served to the customer. In a similar vein the computer displays a detailed list of options available using that program. Current applications of computer menu selection bear a strong correspondence to restaurant menus and reinforce the metaphor. However, as with every metaphor, there are certain aspects that may either be deficient or enhanced in the target domain.
Norman and Chin (1989) provide a comparison between restaurant menus and computer menus used in common software packages such as Lotus 123(TM). Their sample of computer programs contained nearly three times as many selectable items as the restaurant menus. However, the ratio of pages in restaurant menus to frames in computer menus did not match that ratio. Restaurant menus contained considerably fewer pages with many more items per page. The organization of items also varied. The number of items per category level (i.e., per frame) was somewhat greater for restaurant menus for first level categories but fewer for bottom level. At the top level, restaurant menus averaged 11 categories (e.g., appetizers, sandwiches, main courses, beverages, etc.); whereas, computer menus averaged only 7.8 (e.g., Print, Rename, Copy, Delete, Run, Exit, etc.). At the bottom choice level, restaurant menus averaged 2.5 items (generally several sizes such as large, medium or small); whereas, computer menus averaged 4.7 (e.g., a list of font sizes or baud rates).
Table 4.2
A Comparison of Restaurant Menus (n = 56) and Computer Menus (n = 4)
Note. The computer programs were Smartcom II(TM), Lotus 123(TM), Wordstar 3.31(TM), Procomm 2.4(TM) and Word Perfect 4.2(TM). (From Norman & Chin, 1989).
Furthermore, Norman and Chin identify a number of common aspects between restaurant and computer menus (see Table 4.3).
Table 4.3
A Comparison of the Aspects of Restaurant Menus and Computer Menus
Both restaurant and computer menus often provide not only the names of items but also descriptions, definitions, and pictures or icons to help in making an informed choice. Both typically provide alternate response modes so that an item may be selected by name, number, or pointing. Both organize items along some line to help in the search process. Restaurant menus are customized for breakfast, lunch, and dinner, by day of the week, by type of customer (adult or child), by nationality, and by type of food. Similarly, computer menus are customized from one computer application to another; and one picks the package that will be the most functional for the task at hand. The selection of a restaurant serves to restrict the set of options available to the customer and thereby helps to focus the decision process. Similarly the selection of a particular software package restricts the functionality, but provides a finite set of options that helps to simplify the interaction. The menu, consequently, conveys the speciality of the restaurant or software package.
Both restaurant and computer menus can allow for complex selection. The restaurant allows the customer to order (a) multiple items, (b) several of any one item, and (c) combination platters. In a similar manner, some computer applications provide pick lists, multiple selections, and pre-programmed selections. Both allow for a form of menu bypass. If the customer is familiar with the menu, there is no need to refer to it when ordering. Similarly, many computer menus incorporate this feature with jump ahead or menu bypass commands (Laverson, Norman, & Shneiderman, 1987).
Restaurant menus handle specials of the day by clipping them to the standard menu or displaying them on a chalkboard. Such specials take advantage of seasonal variation, market fluctuations, or the whim of the chef. Some timeshare systems make use of this too. The variety of the menu adds interest to the menu for regular users who are looking for something new. In addition, computer systems may take advantage of prevailing system conditions such as access to the LAN, the printer, or other system resources.
While there are many aspects in common, there are several aspects of the computer menu that do not correspond to the static menus of restaurants. These result primarily from the dynamic nature of the computer and its ability to update its display. Although restaurant menus typically organize and group items by course or food category, they are limited in terms of the number of levels that can be meaningfully displayed. Computer menus have the capability of organizing and displaying items in a hierarchical structure with unlimited depth.
Another aspect that computer menus have over most restaurant menus is the dynamic ability to add or delete items from the menu depending on the current state of the system. For example, Macintosh(TM) pull down menus display grayed out items if they are not currently available. In contrast, the restaurant customer may place an order only to be informed the kitchen has run out of that item. In a sense this is equivalent to generating an error message. The computer menu has the capability of avoiding such errors.
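State-dependent availability can be sketched by attaching an availability test to each item. This is a minimal illustration and not a description of how any particular system implements grayed out items; the class MenuItem, the state keys, and the three items are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

State = Dict[str, bool]


@dataclass
class MenuItem:
    label: str
    available: Callable[[State], bool]    # decides whether the item is live


def render(items: List[MenuItem], state: State) -> List[Tuple[str, bool]]:
    """Return (label, enabled) pairs; disabled items would be shown grayed out."""
    return [(item.label, item.available(state)) for item in items]


def select(items: List[MenuItem], state: State, label: str) -> str:
    """Accept a selection only if the item is currently available."""
    for item in items:
        if item.label == label:
            if not item.available(state):
                raise ValueError(f"{label} is not available")
            return label
    raise KeyError(label)


# Hypothetical items and system state.
FILE_MENU = [
    MenuItem("New", lambda s: True),
    MenuItem("Save", lambda s: s["document_open"]),
    MenuItem("Print", lambda s: s["document_open"] and s["printer_online"]),
]
print(render(FILE_MENU, {"document_open": False, "printer_online": True}))
```

Because the display already reflects the system state, the unavailable choice is never offered as a live option, unlike the restaurant order that is refused after the fact.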
Despite the power and versatility of the computer, restaurant menus still possess two major features not yet shared by computer menu systems. The first is the appeal and complexity of graphic layout. Visual inspection of the restaurant menu in comparison to the computer menu reveals that the typical computer menu is extremely information lean, displaying only alphanumeric lists of items. In contrast, the restaurant menu may display tantalizing pictures, descriptions of entrees, and stylized type. The graphic layout of a menu not only helps to organize items but it also conveys additional information about the items. The intent of the restaurant menu, however, is not only to inform, but also to sell. Eye catching graphics help to do this. Computer menus are only beginning to exploit this aspect of the metaphor. For example, HyperCard(TM) provides a menu system that allows designers full graphics capability.
Perhaps the greatest deficiency of computer menus is not in the menu itself but in the absence of a sub-metaphor; namely, the server. In the restaurant the server facilitates communication between the customer and the kitchen. The counterpart in a computer system might be a natural language parser in conjunction with a menu to provide the user with extensive online help concerning the choices on the menu. The server possesses a great deal of knowledge about the menu and the relationships between items and functions as an intelligent database. Customers may query about items by aspects such as cost and food type.
As computer menus become more and more complex, the user needs such an expert system analogous to the server to assist in navigating and pruning the menu tree. For example, a server could perform complex relational searches by specified conditions and generate a shortened menu or narrow the user's search in a large menu. The server acts as a constantly available context-dependent help facility. The server is called upon for suggestions, definitions of terms, and even directions. Likewise, online help in computer menu selection is needed to provide context-dependent help.
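The server's role in generating a shortened menu can be sketched as a filter over item attributes. The attributes follow the cost and food type queries mentioned above; the class Dish, the function shorten_menu, and the sample dishes and prices are invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Dish:
    name: str
    food_type: str
    cost: float


def shorten_menu(
    menu: List[Dish],
    food_type: Optional[str] = None,
    max_cost: Optional[float] = None,
) -> List[str]:
    """Return only the items that satisfy every condition that was specified."""
    return [
        dish.name
        for dish in menu
        if (food_type is None or dish.food_type == food_type)
        and (max_cost is None or dish.cost <= max_cost)
    ]


# Hypothetical menu data.
MENU_ITEMS = [
    Dish("Garden salad", "appetizer", 6.0),
    Dish("Club sandwich", "sandwich", 9.5),
    Dish("Grilled salmon", "main course", 18.0),
    Dish("Pasta primavera", "main course", 12.0),
]
print(shorten_menu(MENU_ITEMS, food_type="main course", max_cost=15.0))
```

The same kind of query applied to a large menu tree would prune the branches that cannot satisfy the user's conditions before any selection is made.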
Overall, the restaurant menu generates a powerful metaphor for human/computer interaction. The user's understanding of the interface and proficiency in using it is for the most part enhanced by the metaphor; however, at times it may also limit thinking. Norman and Draper (1987) caution that metaphors as mental models may fix the way in which a user thinks about the interface. If the concept of a computer menu is limited to restaurant menus, which are essentially single linear menus, it may be difficult to understand how the selection of one item eliminates the availability of other items.
4.3.2 Schemata and Scripts. Metaphors transfer knowledge from one base or media to another. Schemata attempt to capture the structural representation of knowledge. A schema is a diagrammatic outline of something that conveys its essential characteristics. One understands incoming information to the extent that it conforms to one's schemata or ways of knowing. If it fits a predefined pattern, it can be understood and incorporated into the knowledge base. If it doesn't, it is gibberish. Most information fits somewhere in between perfect conformity and total chaos. Consequently, information is filtered and modified by existing schemata so that it fits with one's understanding of things (Bartlett, 1932). Furthermore, missing information may be inferred as required by the schema. For example, a schema for a menu requires that there are options, a method of selection, and a result. When one encounters different types of menus, an attempt is made to understand them in terms of the overriding menu schema. Pull-down menus become meaningful when the user understands that options are displayed by selecting a pull-down from the menu bar and evoked by moving the cursor down to the desired item.
A special type of schema is the script (also called an event schema). A script is an expected or stereotypical sequence of actions and events. Schank and Abelson (1977) give an example of the restaurant script. They describe a normal pattern of actions as listed in Figure 4.9. The stereotypical script, however, may vary from one instance to another. For example, at a fast food restaurant, the script is changed so that one orders the food and pays before being seated and eating. This sequence allows the customer to leave with the food or right after finishing rather than having to wait for the check.
In the same way, computer users learn scripts for how one interacts with computers, application programs, and particularly menus. Menu selection provides a simple script for cognitive control: Read the options, decide on a selection, input the alternative, and evaluate the result. The script varies somewhat for different menu structures but its simplicity makes for a powerful and compelling user interface.
Scripts also apply to the wider context of a session of interaction to perform a task such as working with a spreadsheet or a word processor. The script may start with how the program opens files and initializes the environment; then how functions are performed; and finally, how files are closed before one exits the program. Scripts not only help users plan their actions, they also help to evaluate the course of action. The user has an expectation of the proper flow of events. When they do not conform to those expectations, the user knows that something is wrong. However, as with the restaurant, certain variations are tolerated. Some programs require specification of a file name prior to entering data rather than upon completion. Some programs also periodically save the contents into this file during the session to prevent accidental loss due to system failure while others require the user to explicitly save the file before exiting the program.
Experienced users acquire a number of scripts through their interaction with different programs. Cognitive control of the interface is facilitated by well worn paths that conform to these scripts. It is suggested that designers take advantage of prior expectations of users. New programs that violate accepted scripts for whatever good reasons may not be understood or accepted by users.
4.3.3 Cognitive Layouts of Menus. Mental models whether metaphors, schemata, or scripts are in the mind of the user. Often they remain there dormant as the user muddles along step by step or frame by frame without engaging any planning or problem solving approach. Norman, Weldon, & Shneiderman (1986) have proposed that performance may be facilitated if such user models take on a visual form that engages the appropriate cognitive layout. Formally, a cognitive layout is defined as a mental representation of the elements and relationships in a system that conform to a cognitive model of operations and is tied to the surface layout of elements on the display. Norman et al. give the example of the three box human memory model as one of many possible cognitive layouts. Users may conceive of the system having a short term sensory store (a buffer), a short term memory (working memory), and a long term memory (file storage). When the surface layout of the system (its graphic representation) matches that of the user's cognitive layout, the user's involvement with and understanding of the system will be maximized. Norman et al. applied this idea to operation of multiple windows and screens. Windows may promote the human memory model by displaying incoming information in one window, working contents or clipboard information in another window, and a directory of files in a third window.
In essence the graphic layout should engage the way in which the user conceptualizes the operation of the program. The problem is that too often menus hide the organization and structure of the tree rather than explicitly using it to the benefit of the interface. A number of cognitive layouts present themselves as possible models for how the user thinks of menu interaction. Each layout has its strengths and weaknesses. The particular surface layout used to drive the interface may be able to emphasize the strengths and make accommodations for the weaknesses.
A number of menu systems have used graphic layouts that have suggested different types of metaphors. Several of these are discussed below.
Road Map. The cognitive layout of a menu selection system may be a map. As such the user views menu traversal as navigation. The road map layout associates menu frames with junctions in the road; alternatives are different locations or roads to those locations. The user is engaged in the process of determining routes between points. Initially, the user may search for possible routes by exploring alternatives branching out from the current location. But one may also work backwards from the destination. In general, search starts from highly familiar points and proceeds from there. Once a route has been found and repeatedly used, it becomes habitual, even when shortcuts may be available. The value of the map is to display a graphic representation showing all of the major locations and connectors. When implemented on a computer, there is an additional advantage. The user may be able to select a point on the map and jump to that location in the system. The map is itself merely a large menu, but the cognitive model conveyed is much more powerful since the user is aware of both the location and the connections of items. Although the road map is appropriate for a number of systems, only a few actually present a surface layout that conveys that idea. HyperCard(TM) gives one instance of a road map layout in its help system (see Figure 4.10).
Tree. A related cognitive layout is that of a tree with branches or inversely a tree with roots. These layouts confine the user's cognitive layout to a hierarchical menu. The tree layout dictates directional menu traversal from a central node (the root) to increasing levels of specificity. The directional nature of the tree is pervasive and is reinforced by much of the terminology used in menu traversal. In many cases interaction may need to be guided by the hierarchical nature of the database and it does not make sense to go from one location to another without at least conceptually referring to the hierarchical location of a node. In other cases, the hierarchy may be a superficial or arbitrary clustering of items (e.g., a catalog of gift items). When this is the case, the hierarchy may prove to be more of a burden than a strength. The general layout of the hierarchical menu requires that the user back out of a branch and return to the root before traversing back out to another branch.
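The cost of this back-out-and-descend rule can be sketched in a few lines. This is only an illustration, assuming that each node is identified by its path of labels from the root and that every move up or down one level counts as one step; the function name and the gift-catalog labels are hypothetical.

```python
from typing import Sequence

def steps_via_root(current: Sequence[str], target: Sequence[str]) -> int:
    """Steps to move between two nodes when the user must return to the root.

    Each node is given as its path of menu labels from the root (the root
    itself is the empty path). Backing out costs one step per level, and
    descending to the target costs one step per level.
    """
    return len(current) + len(target)

# Hypothetical gift catalog: both items sit three levels below the root.
here = ("Gifts", "Kitchen", "Mugs")
there = ("Gifts", "Garden", "Planters")
print(steps_via_root(here, there))  # 3 steps up plus 3 steps down
```

A road map layout that lets the user jump directly to a displayed location would replace this count with a single selection.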
Smorgasbord. Another layout is that of the Swedish Smorgasbord. All of the options are spread out before the user. There is clustering and organization of items, but anything may be sampled. While there may be a linear layout of items, there is no sense of rigid menu traversal, rather one of simultaneous availability. Other layouts make use of the artist's palette and the worker's tool box. Parameter settings (hues, shades, fonts, lines, etc.) are laid out in a meaningful order and are simultaneously available. Functions (text, graphic objects, grabbers, etc.) are also laid out for direct selection. The major strength of the smorgasbord layout is that experienced users learn the locations of items and can make rapid selections despite the very large number of options that may be available.
These layouts are by no means exhaustive of the number of cognitive layouts that may be effectively used to engage the user. The challenge of good design is not so much to invent new interfaces but to borrow existing cognitive layouts from the world of common knowledge and thought.
4.4 Summary
Although the cognitive processing of the user imposes a number of limiting factors on menu selection performance, that same ability to process and control is what drives the interaction. The user must search for information, encode the meaning of alternatives, assess the alternatives, make a choice and effect a response. All of these processes are governed by the laws of human information processing. Good menu design takes into consideration such human factors to increase speed and reduce errors.
At another level, however, the user enters not as a limiting force but as a driving force. The user is a problem solver with goals, strategies, and styles of attack. As such, the computer interface becomes a medium for effecting solutions. Theories of human problem solving suggest that the problem solver's understanding and representation of the problem domain aids in solution. To this end good user interface design should convey a sense of meaning and engage schemata that lend themselves to solutions of the tasks being performed.
4.1 The Menu Selection Process
The previous chapter dealt with the menu frame as a stimulus. This section will consider the cognitive processing of that stimulus frame. The menu selection process involves a number of cognitive elements. Within a particular menu frame, the user must read the alternatives, choose the desired option, effect the choice, and finally ascertain the consequences. Across menu frames, the user must maintain a sense of direction, evaluate proximity to the goal, and effect a plan of search or problem solving strategy. This section will examine the process within the frame. A theoretical model of these processes will help to evaluate the design of menu frames.
Menu processing is both a time-relevant and an information-relevant task. For the most part, theories have been more concerned with user response time than with information received or information transmitted. While time is an important variable, its overall impact on performance may not be great when it only accounts for a second here and there. However, the time that it takes to respond to a menu frame can be used to test models of how the user processes information received via menu labels and options. Information transmitted refers to the choices made by the user. Each time the user makes a selection, information is transmitted to the computer. Choice behavior is subject to user preferences, goals, and expectations. An adequate theory must involve both response time and information transmission.
4.1.1 Information Acquisition and Search.
Figures 4.1 and 4.2 show several information processing models. The way in which a user scans a menu frame for information depends on the task and the user's prior knowledge about the frame. Typically the user starts with either an explicitly known target or a partially specified target. If the target is explicitly known (Figure 4.1), the user engages in a visual matching process. For each alternative scanned, the process detects either a match or a mismatch. Since errors can occur, the classic two-way table of possibilities from signal detection theory (Green & Swets, 1966) obtains as shown in the bottom panel of Figure 4.1. First, it is generally the case that the processing time is faster for a match than for a mismatch (e.g.). Second, to the extent that any transformation on the stimulus is required to process a comparison, response time will be increased (e.g.). Third, to the extent that alternatives are similar and confusable, there will be an increase in the number of errors (e.g., Kinney, Marsetta, & Showman, 1966). Menus which use visually and semantically distinct alternatives will result in faster response times and fewer errors. In practice, however, labels are not always distinct and may lead to increased processing time and selection errors.
If the target is partially specified, the user engages an encoding and evaluation process as shown in Figure 4.2. The user must read each alternative, understand its meaning, and generate an assessment. If the selection is construed as having a correct response, the user generates a subjective likelihood that the alternative satisfies the requirements of the partial specification. If the selection is construed as a preference on the part of the user with no correct answer, he or she generates a subjective utility for the alternative as a function of its worth relative to prior goals or requirements in the specification. For example, in information retrieval, if the user is looking for the population of India, alternatives such as "History," "Demographics," "Politics," "Religion," and "Facts at a Glance" may be evaluated for their subjective likelihood of supplying the answer. On the other hand, if the user is looking for something interesting about India, the alternatives would be evaluated on the basis of user preferences. In either case, an evaluation is made and the user makes a selection on the basis of its value.
In the case of partially specified goals, users may either evaluate all of the alternatives and select the alternative having the highest evaluation (left panel of Figure 4.2) or they may select the first alternative that exceeds a predetermined criterion value (right panel of Figure 4.2). The latter strategy is called satisficing (Simon, 1976). When the cost of an error or the negative consequences of selection of a less than optimal alternative is great, users will tend to engage in a careful and complete processing of alternatives. On the other hand, when time is of the essence, users will curtail their processing and select the first alternative that exceeds a preset criterion value (Beach & Mitchell, 1978).
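The two strategies can be contrasted in a short sketch. The function names, the criterion value, and the subjective likelihoods assigned to the menu labels are all hypothetical; they serve only to show how the two rules can pick different alternatives from the same frame.

```python
from typing import Optional, Sequence

def choose_best(values: Sequence[float]) -> int:
    """Evaluate every alternative and return the index of the highest value."""
    return max(range(len(values)), key=lambda i: values[i])

def choose_satisficing(values: Sequence[float], criterion: float) -> Optional[int]:
    """Return the index of the first alternative whose value exceeds the
    criterion, or None if the list is exhausted without a choice."""
    for i, value in enumerate(values):
        if value > criterion:
            return i
    return None

# Hypothetical subjective likelihoods that each label leads to the
# population of India.
labels = ["History", "Demographics", "Politics", "Religion", "Facts at a Glance"]
likelihood = [0.05, 0.60, 0.05, 0.05, 0.70]

best = choose_best(likelihood)                  # scans all five items
first_ok = choose_satisficing(likelihood, 0.5)  # stops at the second item
print(labels[best], "|", labels[first_ok])
```

When no alternative exceeds the criterion, the satisficing function returns None, which corresponds to the case in which the user must adopt a different strategy.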
One might initially suppose that novice users would search a menu by reading each item one by one from the top of the list down and stop when the desired item is reached. While this may at times be the case, the evidence is that things are not so simple (Card, 1982). Users often scan menus in an idiosyncratic manner, glancing across the list of alternatives, hoping to light upon the desired alternative.
Three alternative search models are shown in Figure 4.3. Search may be (a) a serial inspection of items, (b) a random inspection without repetition, or (c) a random inspection with replacement. A serial search requires that the user inspect each item one by one without skipping around. Random inspection without repetition allows the user to skip around, but requires the user to keep track of items already inspected. Finally, random search with replacement allows the user to skip around; but because an item may be randomly inspected over again, the search lacks efficiency.
Search strategies are also characterized by their stopping rule. In a self-terminating search, the user stops when the desired item is encountered. An exhaustive search requires the user to inspect all of the items prior to making a choice. Finally, in a redundant search after all the items have been inspected, the user still cannot make a choice and must re-inspect some items. Menus and tasks that promote self-terminating search are expected to be faster than when users must examine all items exhaustively and redundantly. Typically, self-terminating search occurs when the user has an explicitly known target in mind and need only recognize a match between the target and an item. Self-terminating search may also occur if the subject uses the strategy of satisficing. If none of the alternatives meet the criteria before the list is exhausted, no decision has been achieved, and the user must adopt a different strategy. If the user kept track of an evaluation of each alternative, he or she may pick the alternative having the highest score. But more likely than not, the user may have to go back and re-evaluate alternatives in order to weigh the pros and cons associated with items still in the running.
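The three inspection orders can be compared under a self-terminating stopping rule with a small simulation. This is a sketch under stated assumptions: a single explicitly known target placed at a uniformly random position, perfect recognition on inspection, and hypothetical function and model names.

```python
import random

def inspections(n: int, target: int, model: str, rng: random.Random) -> int:
    """Number of items inspected before a self-terminating search stops."""
    if model == "serial":
        return target + 1                  # read from the top down to the target
    if model == "random_no_repeat":
        order = list(range(n))
        rng.shuffle(order)                 # skip around, never revisiting an item
        return order.index(target) + 1
    if model == "random_replace":
        count = 1
        while rng.randrange(n) != target:  # any item may be inspected again
            count += 1
        return count
    raise ValueError(f"unknown model: {model}")

def mean_inspections(n: int, model: str, trials: int = 20000, seed: int = 0) -> float:
    """Average number of inspections over many simulated searches."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        target = rng.randrange(n)          # target position is equally likely anywhere
        total += inspections(n, target, model, rng)
    return total / trials

for model in ("serial", "random_no_repeat", "random_replace"):
    print(model, round(mean_inspections(8, model), 2))
```

For a frame of n items, the first two models should average about (n + 1)/2 inspections and the third about n, which is the loss of efficiency attributed to inspection with replacement.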
Even after assessing all of the alternatives, it is possible that none of them proves satisfactory. The user has exhausted the list of options and not found any that meet his or her needs. Since menu selection provides only a finite set of alternatives, the user may feel limited and frustrated. In traditional decision making, the decision maker at this point would attempt to generate new alternatives. Within the confines of menu selection, the user may need to move to some other area of the menu tree. But more often than not, the menu simply does not provide the particular alternative needed, and the user must abandon the search and try to solve the problem or find the information in a totally different manner. More will be said about this in a later section on strategies and problem solving.
The amount of time that it takes to process a menu and select an alternative depends on the processing model and the number of alternatives per menu frame. Menu processing time as a function of the number of items has become an important issue in designing efficient hierarchical menus. If broad menus require an inordinate amount of time to search, then designers are advised to limit the number of items per frame and increase the depth of the menu hierarchy. On the other hand, if each decision requires a certain amount of overhead time, then depth will add to the total time, and designers are advised to increase the breadth. Consequently, the type of search process within each frame is extremely important.
Response time for menu scanning is a function of the number of items scanned and the time required to scan each item. Lee and MacGregor (1985) present a model in which search time within a frame is a linear function of the number of alternatives. For any search there will be an expected number of alternatives that will be inspected, E(A). For an exhaustive search E(A) = a, the total number of items in the frame. With a self-terminating search, if the correct alternative is at a random position, then E(A) = (a + 1)/2. Furthermore, E(A) may be greater than a if users need to re-evaluate alternatives in order to make a choice. Lee and MacGregor assume that the total time for each choice is
S = E(A)t + k + c,
where t is the time required to read one alternative, k is the key-press time, and c is the computer response time. The type of processing model operating determines the value of E(A).
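The equation can be turned into a small calculator. The sketch below uses the two values of E(A) given above; the parameter values and the extension to a whole hierarchy (total time taken as the number of levels multiplied by the time per frame, with every frame having the same breadth) are assumptions made for illustration, not figures from Lee and MacGregor.

```python
def expected_inspected(a: int, stopping: str = "self_terminating") -> float:
    """E(A): expected number of alternatives inspected in a frame of a items."""
    if stopping == "exhaustive":
        return float(a)
    if stopping == "self_terminating":
        return (a + 1) / 2
    raise ValueError(f"unknown stopping rule: {stopping}")

def choice_time(a: int, t: float, k: float, c: float,
                stopping: str = "self_terminating") -> float:
    """S = E(A)t + k + c for a single menu frame."""
    return expected_inspected(a, stopping) * t + k + c

def levels_needed(n_items: int, a: int) -> int:
    """Number of levels required to reach n_items with a alternatives per frame."""
    levels, reachable = 0, 1
    while reachable < n_items:
        reachable *= a
        levels += 1
    return levels

def total_time(n_items: int, a: int, t: float, k: float, c: float,
               stopping: str = "self_terminating") -> float:
    """Assumed total search time: one choice per level of the hierarchy."""
    return levels_needed(n_items, a) * choice_time(a, t, k, c, stopping)

# Hypothetical parameters in seconds: reading, key press, computer response.
T, K, C = 0.25, 0.5, 0.5
for a in (2, 4, 8, 16, 64):
    print(a, levels_needed(4096, a), round(total_time(4096, a, T, K, C), 3))
```

With these made-up parameters and 4096 items, the total is smallest at a = 8 among the breadths tried. Different values of t, k, and c shift the balance between breadth and depth.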
Lee and MacGregor's model assumes that users scan in a systematic fashion as if they were reading text. However, when alternatives are graphic or the alternatives can be recognized on the basis of graphic characteristics, the locus of search may jump around considerably. Card (1982) has proposed that users sample from a portion of the display randomly with replacement (rightmost panel of Figure 4.2). Each sample is dependent on a saccade of the eyes. The assumption of random replacement means that the user may re-examine items. This model also assumes that search is self-terminating. Card draws upon a model originally developed by Kendall and Wodinsky (1961) for searching for airplanes in the sky or for blips on a radar screen.
If p is the probability of finding the target on a single saccade and k is the number of saccades required to find the target, then the cumulative probability of finding a target in k saccades under the assumption of sampling with replacement conforms to the geometric probability distribution:
P(k) = 1 - (1 - p)^k.
Assuming that each saccade takes about the same amount of time t, the average time to detect a target will be S = t/p.
If there is one correct alternative in a list of n, the probability of finding the target on a particular saccade will be p = 1/n and S = nt.
Consequently, search time is again a linear function of the number of items. And the geometric model predicts the same average time as the Lee & MacGregor model for an exhaustive search. The major difference is that the predicted variability will be much greater in the Card model than in the Lee & MacGregor model. Lee & MacGregor emphasize reading time because they are primarily addressing videotext systems. Card emphasizes saccade and visual search time for command menus. Unfortunately, both models ignore the decision process and assume that choice time (distinct from reading or visual scanning time) does not vary with the number of alternatives. A later section will address this issue.
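The geometric model is easy to evaluate numerically. The sketch below computes the probability of having found the target within k saccades, 1 - (1 - p) raised to the power k, the mean search time t/p, and the standard deviation of the number of saccades, which for a geometric distribution is the square root of (1 - p) divided by p. The function names and the values n = 8 and t = 0.25 seconds are hypothetical.

```python
import math

def prob_found_within(k: int, p: float) -> float:
    """Cumulative probability of finding the target within k saccades."""
    return 1.0 - (1.0 - p) ** k

def mean_search_time(n: int, t: float) -> float:
    """Mean time t/p with p = 1/n, that is, n times t."""
    p = 1.0 / n
    return t / p

def sd_saccades(n: int) -> float:
    """Standard deviation of the number of saccades under the geometric model."""
    p = 1.0 / n
    return math.sqrt(1.0 - p) / p

n, t = 8, 0.25  # hypothetical: eight items, a quarter second per saccade
print(round(prob_found_within(n, 1.0 / n), 3))  # chance of success within n saccades
print(mean_search_time(n, t))                   # mean search time in seconds
print(round(sd_saccades(n), 2))                 # spread in the number of saccades
```

An exhaustive serial scan always inspects exactly n items, so its count has no spread at all, whereas here the standard deviation is nearly as large as the mean of n saccades. This is one way to see why the predicted variability differs so much between the two models.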
It is often the case that users have more than one possible target for which they are searching. Several different items may satisfy the requirements of the search. For example, the user may be looking for either "stop" or "quit." An extensive series of studies on visual and memory scanning (Neisser, 1963; Schneider & Shiffrin, 1977; Shiffrin & Schneider, 1977) show the relationship between the number of possible targets and the total response time. The experimental task is analogous to menu selection. A subject is asked to search for a target in a display of characters. For example, one might be asked to search the array shown in the upper panel of Figure 4.4 and report when the target has been found. The target may be simply defined as "the letter L," or the "letters L, M or Y." In general, the greater the number of possible targets, the longer it takes to detect the one that is actually there. The upper panel of Figure 4.5 shows the idealized results of such experiments using from one to six possible targets. The results indicate that there is a linear increase in search time as the number of possible targets increases. Presumably subjects scanned each item and then compared each of the possible targets in the target set with the item. Each comparison added a constant amount of time.
The intriguing result of these studies occurs when subjects practice the same set of targets over an extended period, week after week and month after month. The results indicate that differences due to both the number of targets and the number of items scanned decrease greatly. The lower panel of Figure 4.5 shows the idealized results after practice. Schneider and Shiffrin (1977) found that subjects looking for the same targets eventually could search for four targets about as quickly as they could for one. They could also search through four characters about as quickly as they could through one.
These results indicate that scanning and recognition processes may become automatic with extensive practice. Detection becomes a rapid, effortless, and almost unconscious process. Users of menu selection systems who routinely proceed through the same processes of scanning and selection develop to a point where their response times are no longer affected by the number of items or number of targets. Furthermore, the selection process becomes so ingrained that they do not think about it. The user of a word processor or spreadsheet package over many months of use no longer operates at the level of linear scanning for items. Recognition and selection become automatic, allowing the user to think about the task at hand rather than the control of the human/computer interface.
4.1.2 Choice Process and Time. The choice process may either occur after the user has scanned and evaluated all of the alternatives or it may occur in conjunction with the scanning and recognition process. In exhaustive search, the choice process occurs only after all of the alternatives have been scanned. In self-terminating search, the choice process is engaged following the evaluation of each alternative.
When the choice is separated from the linear scanning of alternatives, it is governed by the same process as in choice reaction time experiments. In these experiments, the subject is presented with a linear array of potential stimuli and a corresponding array of response buttons. When a stimulus is presented, the subject must press the response button corresponding to the stimulus. For example, the stimulus may be a number 1, 2, 3, or 4, and the buttons may be listed as 1, 2, 3, and 4. The results of such an experiment are summarized in Figure 4.6. Response time is a linear function of the uncertainty as to which stimulus will be presented. For equally likely stimuli, uncertainty is given by the number of bits of information or log2(n). This relationship is known as the Hick-Hyman law (Hick, 1952; Hyman, 1953).
A log model has been proposed for menu selection by Landauer and Nachbar (1985) based on the Hick-Hyman law for choice reaction time and on Fitts' law for movement time (discussed in the next section). According to the Hick-Hyman law, the time that it takes to select one out of n items in a choice reaction time study is
S = a + b log2(n),
where a and b are constants. When applied to menu selection, the equation requires that the probability of selecting any item is equal (i.e., 1/n). This is generally not the case in real-world menu systems. Menu items vary greatly in the probability of selection. When the probability of an item is greater than 1/n, the choice time for that item is even faster:
S = a - b log2(p_i),
where p_i is the probability that alternative i is the desired alternative.
The log law predicts that users can choose among a relatively large number of alternatives rapidly since choice time is a linear function of the amount of information rather than n. However, it must be remembered that this is true only when reading time is negligible. Consequently, the log law pertains more to highly practiced command menus.
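Both forms of the log law, a + b log2(n) for equally likely items and a - b log2(p) for an item selected with probability p, can be checked against each other in a few lines. The constants a and b below are hypothetical placeholders, not empirical estimates.

```python
import math

def hick_time_equal(n: int, a: float = 0.2, b: float = 0.15) -> float:
    """Choice time for n equally likely alternatives: a + b * log2(n)."""
    return a + b * math.log2(n)

def hick_time_item(p_i: float, a: float = 0.2, b: float = 0.15) -> float:
    """Choice time for an item selected with probability p_i: a - b * log2(p_i)."""
    return a - b * math.log2(p_i)

print(round(hick_time_equal(8), 3))      # eight equally likely items
print(round(hick_time_item(1 / 8), 3))   # same value, since p_i = 1/8
print(round(hick_time_item(0.5), 3))     # a frequently chosen item is faster
print(round(hick_time_equal(64), 3))     # eight times as many items adds only 3b
```

The two forms agree when every item has probability 1/n, and going from 8 to 64 items adds only three more units of b, which is the sense in which a large number of alternatives can still be chosen among rapidly.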
For complicated choice processes in which simple matching cannot be used as a basis for choice, response times will be subject to the difficulty of the choice as well as the number of alternatives. Menu selection becomes difficult when alternatives are complex bundles of attributes and no one alternative clearly dominates the rest. The evaluation process can be time consuming and mentally taxing. It has been shown, however, that when the choice difficulty exceeds the maximum cognitive load of an individual, response time decreases. The decision maker may resolve the choice on an ad hoc basis and circumvent the evaluation process. Hogarth (1975) presents a model for response time of complex decision processes.
As noted earlier, the selection may be based either on a target match or on (a) the subjective likelihood of an alternative being correct or (b) the subjective utility of an alternative to the user. Subjective likelihoods and utilities can be scaled on the basis of choice probabilities with Luce's (1959) choice axiom. Suppose that the user is faced with a menu of n alternatives. Let p_i be the probability that alternative i is chosen out of the whole set and let p_ij be the probability that alternative i is preferred over alternative j when only the alternatives i and j are available for selection. The choice axiom has two parts which are summarized below as they might apply to menu selection:
Axiom 1. If all pairwise preferences between alternatives are imperfect (i.e., 0 < p_ij < 1 for all i and j), then a Constant Ratio Rule holds such that the probabilities of choice from any subset of all the alternatives are naturally induced from p_i according to the rules of conditional probability. For example,
p(a_1 | a_1, a_2) / p(a_2 | a_1, a_2) = p(a_1 | a_1, a_2, ..., a_n) / p(a_2 | a_1, a_2, ..., a_n).
Consequently, in selection among four alternatives if the probabilities are .10 and .40 for Alternatives 1 and 2, then the ratio of probabilities in a binary choice would also be .25 (.10/.40) and the respective probabilities would be .20 and .80.
Axiom 2. If for any alternative a_i in the total set of alternatives there is an a_j such that p_ij = 0 (i.e., such that a_i is never preferred over a_j), then a_i may be deleted from the set of possible choices (i.e., a_i is never chosen from the total set). In other words, alternatives that are never chosen may be effectively considered as not existing.
These two axioms result in a ratio scale of measurement for the alternatives. A positive real number v_i can be assigned to each member a_i such that for i = 1, ..., n,
p_i = v_i / Σ_{j=1}^{n} v_j.
Luce's choice axiom is particularly useful in predicting choice probabilities when the menu set is restricted. This occurs when menu items are not appropriate and are dropped from the list or grayed out. Restricted sets also occur when the user has already selected one or several items and found out that they are not correct or do not lead to the desired goal. The effect of Luce's choice axiom is that the probabilities are essentially normalized to the number of effective alternatives in the set. Table 4.1 shows what happens for example when there are initially 6 menu items and the set is restricted.
Table 4.1
An Example of Choice Probabilities in Restricted Menus According to Luce's Choice Axiom
Total Set: Option | Total Set: Probability | Restricted Set A: Option | Restricted Set A: Probability | Restricted Set B: Option | Restricted Set B: Probability |
---|---|---|---|---|---|
1 | .10 | 1 | .18 | 1 | .14 |
2 | .15 | 2 | .27 | 2 | .21 |
3 | .20 | 3 | .37 | 3 | .29 |
4 | .10 | 4 | .18 | 4 | .14 |
5 | .30 | -- | -- | -- | -- |
6 | .15 | -- | -- | 6 | .22 |
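The normalization in Table 4.1 can be reproduced with a short sketch: each remaining option keeps its original probability divided by the sum of the probabilities of the options still available. The function name is hypothetical; the probabilities are those of the Total Set column.

```python
from typing import Dict, Iterable

def restrict(probs: Dict[int, float], available: Iterable[int]) -> Dict[int, float]:
    """Renormalize choice probabilities over the options still available."""
    options = list(available)
    total = sum(probs[o] for o in options)
    return {o: probs[o] / total for o in options}

total_set = {1: 0.10, 2: 0.15, 3: 0.20, 4: 0.10, 5: 0.30, 6: 0.15}
set_a = restrict(total_set, [1, 2, 3, 4])     # options 5 and 6 removed
set_b = restrict(total_set, [1, 2, 3, 4, 6])  # option 5 removed

for name, subset in (("A", set_a), ("B", set_b)):
    print(name, {o: round(p, 3) for o, p in subset.items()})
```

The ratio between any two surviving options is unchanged by the restriction, as the Constant Ratio Rule requires. Unrounded, option 3 in Set A comes to about .364 and option 6 in Set B to about .214; the tabled values of .37 and .22 appear to have been adjusted so that each column sums to 1.00.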
4.1.3 Response Process. In order for the user to effect a selection, he or she must produce an overt response that can be detected by the computer. Two basic types of response production are used. The user may be required to enter a code for the alternative (e.g., "press '1' for Account") or to point to the alternative using some sort of pointing device such as cursor keys, a mouse, or a touch screen. Selection by code is complicated by the fact that it requires encoding and production processes. The user must read the instructions from the screen, encode their meaning, plan an intended action, and produce the overt response on the keyboard. Selection by pointing takes advantage of the fact that the pointing response has been highly practiced since infancy. Pointing requires the user to locate the current position of the pointing device (cursor or hand), locate the position of the desired alternative, and plan a trajectory from the current position of the pointing device to the desired alternative. To the extent, however, that any transformation or translation is required, time will be increased. The touch screen requires the least translation since it is a direct eye-hand response. The use of a mouse, drawing tablet, joystick, or trackball requires a degree of transformation since the pointing device is generally on a horizontal plane whereas the menu items are on a vertical plane. Furthermore, the user must translate the extent of hand movement to movement of the cursor. With practice, these times are reduced and the use of the pointing device seems quite natural.
Pointing by cursor keys requires the greatest amount of response transformation and translation since the trajectory must be translated into a discrete sequence of moves. For simple list menus, the cursor may be positioned with only the up and down arrows. For array menus and pull down menus all four arrow keys may be required. To change direction, the user must change keys. This requires additional response time and could produce errors. While cursor key positioning is rather simple with list menus, it becomes excessively difficult with large array menus requiring the user to traverse long distances.
Studies on stimulus-response compatibility strongly suggest that the layout of the alternatives on the screen should match the physical layout of the response buttons (Fitts & Seeger, 1953; Fitts & Switzer, 1962). Without such compatibility, the user must engage a translation process to remap the location of items. The worst cases are when the layouts are reversed and when directional indicators are reversed as in mirror writing (i.e., physical movement to the right moves the cursor to the left on the screen).
Motor response time depends on the distance from the current position of the pointing device to the location of the desired target as well as the difficulty of hitting the target. For analog pointing devices the time depends on the distance to the target and the size of the target. According to Fitts' law (Fitts, 1954), the time that it takes to move to a target is a logarithmic function of the ratio of its distance and width:
R = a log(d/w) + b,
where d is the distance to the target, w is the width of the target, and a and b are constants. For analog movements motor time is inversely related to the log of the width of the target. Consequently, it would behoove designers of analog input devices to display large menu targets rather than small buttons.
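The effect of target width can be illustrated with a brief sketch of the equation R = a log(d/w) + b. The base of the logarithm is not stated in the text, so base 2 is assumed here; the constants and the pixel dimensions are likewise hypothetical.

```python
import math

def fitts_time(d: float, w: float, a: float = 0.1, b: float = 0.3) -> float:
    """Movement time R = a * log2(d / w) + b (base 2 is an assumption)."""
    return a * math.log2(d / w) + b

# Hypothetical menu target 160 pixels from the cursor.
small = fitts_time(d=160, w=10)  # small button
large = fitts_time(d=160, w=20)  # target twice as wide
print(round(small, 3), round(large, 3), round(small - large, 3))
```

Under these assumptions, each doubling of the target width at a fixed distance saves a constant amount of time equal to a.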
Motor response time for discrete pointing using arrow keys is governed by a different process. One would expect that the time to select an alternative using arrow keys would be a linear function of the x + y distance of the cursor from the alternative:
R = a (dx + dy) + b,
where dx and dy are the x and y displacements of the cursor position from the target location and a and b are constants. Although this model is intuitive and simple, it is not entirely correct. In general with such motor movements, there is a large initial startup time or acceleration and a slow down time or deceleration upon approaching the target. Nevertheless, the equation serves as a good first approximation for motor time.
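As a first approximation, the linear model R = a (dx + dy) + b can be sketched for an array menu by counting key presses. Positions are assumed to be (row, column) cells with one key press per cell, and the constants are hypothetical.

```python
from typing import Tuple

Cell = Tuple[int, int]  # (row, column) position in an array menu

def key_presses(cursor: Cell, target: Cell) -> int:
    """Number of arrow-key presses: dx + dy between cursor and target."""
    (cursor_row, cursor_col), (target_row, target_col) = cursor, target
    return abs(target_row - cursor_row) + abs(target_col - cursor_col)

def cursor_key_time(cursor: Cell, target: Cell,
                    a: float = 0.15, b: float = 0.2) -> float:
    """R = a * (dx + dy) + b, with a as time per press and b as startup time."""
    return a * key_presses(cursor, target) + b

# Cursor in the top-left cell, target three rows down and two columns across.
print(key_presses((0, 0), (3, 2)))
print(round(cursor_key_time((0, 0), (3, 2)), 3))
```

The sketch ignores the startup and slow-down effects mentioned above, so it should be read only as the linear approximation itself.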
For short list menus, cursor arrow keys may be faster than analog pointing devices. However, when there are a large number of alternatives, the analog device has the distinct advantage.
In both discrete and analog situations, the constant b is the time to plan and move to the response device. If the user's hand is already on the device, the time to move to the device is eliminated. However, a common complaint is that to use a pointing device one has to take one's hands off home position on the keyboard and locate either the arrow keypad or the analog device. If the majority of time is spent navigating through menus, then the home position may in practice be on the pointing device. The problem is critical only when the user must frequently alternate between devices.
4.1.4 Evaluation and Error Detection. Once the user has made a selection, the system generally provides feedback of some type. The feedback may be receipt of some information, the location of a target item, the execution of a function, or the presentation of a subsequent menu frame. The feedback may immediately indicate to the user that the selected alternative was correct or incorrect. If it is correct, the user is reinforced and the processes leading to that selection are strengthened. For example, prior to the selection, the user may have assessed only a .5 probability that the alternative would lead to the goal. Following the feedback, that probability can be updated.
On the other hand, feedback may not directly indicate whether the selection was correct. The user may have only partial knowledge about the success of the prior selection. This is particularly true in hierarchical menus. The next frame may give some indication about whether the user is on the right path; but since it is not the target item itself, the user cannot be sure that he or she is on the right path. If the feedback is positive, the user is likely to continue. If it is negative, the user may turn back depending on how unlikely the path now appears. Consequently, feedback engages another decision making process in hierarchical menu search. How this affects the search strategy will be discussed in the next section.
4.2 Problem Solving and Search Strategies
Although much goes on at the frame level of menu processing, the cognitive control of the interface is more properly positioned at a global level. How does the user plan a task that requires a series of menu selections? What is the user's strategy for effecting a search through a complex database? These questions strike at the very essence of thinking and problem solving as they apply to cognitive control of the human/computer interface.
The menu interface provides the user with options that, if applied in the right order, may achieve the goal state. For example, the goal may be to align the left edges of cubes in a 3-dimensional drawing program. The user must use the menu interface to select the cubes, select their left edges, and finally select the command to align. The exact order of operation is determined by the rules of the system. Each menu selection constitutes a move which may or may not get closer to the desired state. The steps to solving such problems include planning the solution, carrying it out, and finally checking the results. The difficulty and length of each step depend on the complexity of the problem. The drawing problem above is relatively easy and the length of each stage is short. The problem of generating a 3-dimensional image of a space station using a drawing program is much more difficult yet involves the same idea.
Menu interfaces often do more than just provide options. Problems or tasks that are repetitively solved or performed in the same way can be gracefully directed by the menu. For example, in an electronic mail system there is a natural order of steps that may be incorporated in the order of menus: check if there are new messages, read the first message, respond, read the next message, respond, and so on. The menu system can incorporate this order and bypass a number of redundant steps by initially listing new messages. The user may immediately select a message to read and then select options to reply, forward, or delete. One system that explicitly attempts to incorporate the user's plan of work gives the user an "inbox" menu in which the user may either select messages to view or select other program functions. The concept is to position the user at the most likely point of entry rather than at the beginning of a hierarchical command path.
4.2.1 Heuristics. A number of heuristics, strategies, and problem solving styles have been discussed in the literature that are relevant to search in menu selection. Heuristics are plans for attacking problems. They are usually simple sequences of steps that generally work but, unlike an algorithm, are not guaranteed to result in a solution. The advantage of heuristics is that they require a minimum of time and effort. They are quick and dirty.
Generate-Test. The generate-test heuristic is one of the simplest heuristic strategies with only two steps (Newell & Simon, 1972): (a) generate a candidate for a solution and (b) test to see if it is actually a solution. If the candidate fails the test, the problem solver keeps generating candidates until the goal is attained. The generate-test heuristic is, however, only as effective as the problem solver is at generating potential responses. The advantage of menu selection is that the user generates responses by selecting options. Newell and Simon (1972) note four difficulties in the generate-test strategy. First, it may be difficult to generate candidates. Menu selection reduces this problem by explicitly listing a set of potential candidates. Second, it may be hard to test to see if the candidate is actually a solution. For explicit targets a simple matching test is all that is required. For partially specified goals, the test may be more complex. And for complex problems requiring a number of steps, it may be extremely difficult to evaluate if the selection is on the right path. For example, it is easy to generate a chess move, but very difficult to know if it is the best of all possible moves. Third, if there are a large number of candidates with a low probability of any one achieving the goal, the generate-test heuristic is unlikely to work. A random trial and error approach is doomed to failure in complex systems and large search spaces. On the other hand in simple systems, the trial and error approach may work well. In fact, a number of systems advocate trial and error as a good way to start learning how to use functions. Fourth, it may be that the correct solution has a low probability of being selected by the problem solver. The user is likely to pick a number of other candidates before selecting the correct one.
Menu selection as an interface to problem solving may help to direct the problem solver to the correct solution by the order in which alternatives are listed. Highly likely candidates should be listed first and unlikely candidates placed at the bottom of the menu.
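The effect of listing order on generate-test search can be sketched in a few lines of Python. The menu labels, the probabilities, and the helper names `generate_test` and `expected_tests` are hypothetical, and the sketch assumes that the user tests alternatives strictly in the order listed.

```python
def generate_test(candidates, is_solution):
    """Try candidates in listed order; return (solution, number_of_tests)."""
    for tests, candidate in enumerate(candidates, start=1):
        if is_solution(candidate):
            return candidate, tests
    return None, len(candidates)


def expected_tests(probabilities):
    """Expected number of tests when the item in position k (1-based)
    is the target with probability probabilities[k - 1]."""
    return sum(k * p for k, p in enumerate(probabilities, start=1))


menu = ["Copy", "Rename", "Print", "Delete"]
print(generate_test(menu, lambda item: item == "Print"))

# Assumed target probabilities, listed most likely first and then reversed.
likely_first = [0.5, 0.3, 0.15, 0.05]
likely_last = list(reversed(likely_first))
print(expected_tests(likely_first), expected_tests(likely_last))
```

With these made-up probabilities, listing the likely candidates first roughly halves the expected number of tests compared with the reversed order.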
One of the greatest problems with the generate-test heuristic is that often a candidate cannot be evaluated until it is completely generated. For many problems this is inefficient. The problem solver may be able to evaluate partial solutions. For example, in solving a crossword puzzle, one does not fill in all of the spaces and then check to see if it is the correct solution. Instead one looks for and evaluates partial solutions along the way. Furthermore, there is a great utility to breaking the problem into subgoals. The problem-reduction approach (Nilsson, 1971) reduces the overall size of the search space. Menu systems that organize search into a series of substeps can make effective use of the problem-reduction approach. Rather than searching an index of all newspaper articles, the system may break the search into the substeps: (a) select the year, (b) select the topic, and (c) finally search through the remaining articles. Similarly in a drawing program, problem-reduction may be implemented by allowing the user to construct elementary objects as subgoals. These objects may then be selected for use to achieve more complex goals.
Hill Climbing. A similar strategy takes as its metaphor the idea of hill climbing. One can climb to the top of a simple hill (monotonically increasing in height from any point) blindfolded by merely taking each step such that it results in a higher position than before. Similarly, in the formal strategy of hill climbing, the problem solver selects each move such that an evaluation function results in a higher value than the previous move. Ultimately, one assumes that the goal has been reached when no move can be found that increases the function. In a menu selection system, each menu selection constitutes a move. The resulting frame generally provides informative feedback indicating if the user is getting "hotter" or "colder."
More formally, assume that for each selection i the user has an expected value of the feedback that will be received, e(Fi). When the user evaluates the feedback, it results in a subjective value, s(Fi). These two values are compared. If e(Fi) - s(Fi) is less than a criterion value ci, then the user will proceed. If it is not, the user will terminate or redirect the search path. It is expected that the value of ci will depend on the depth of search and on the ease of redirecting the path in a more profitable direction. Users will probably be more and more reluctant to shift off the path the further they have committed themselves to a particular course. Consequently, the further down the tree and closer to the terminal level, the greater ci. Moreover, if there was another alternative in a previous menu judged to have a high likelihood of leading to the goal, the value of ci will be reduced. Users will shift to another path if it requires little extra in the way of repositioning. The option to move back to the previous menu frame allows repositioning at a local level; whereas, the option to move back to the top of the menu allows repositioning at a distal but fixed level. Very few systems allow for user-set repositioning by way of markers. An innovative technique would be to allow the user to define markers to be placed at various points along the search as one might drop bread crumbs on a path through a maze to find one's way back. Search could be repositioned by selecting one of the markers and restarting from there.
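The decision rule above can be made concrete with a small Python sketch. The comparison e(Fi) - s(Fi) < ci is taken from the text; the linear form of `criterion_value` and all of its weights are assumptions introduced only to show the two qualitative effects described (ci grows with depth and shrinks when an attractive alternative exists).

```python
def criterion_value(depth, alt_likelihood=0.0,
                    base=0.2, depth_gain=0.1, alt_weight=0.3):
    """Hypothetical criterion c_i: larger deeper in the hierarchy,
    smaller when a promising alternative was seen on a previous menu."""
    return max(0.0, base + depth_gain * depth - alt_weight * alt_likelihood)


def should_continue(expected, subjective, criterion):
    """Rule from the text: proceed if e(F_i) - s(F_i) < c_i."""
    return (expected - subjective) < criterion


# The same disappointing feedback (expected 0.8, received 0.4) at two depths.
shallow = should_continue(0.8, 0.4, criterion_value(depth=1))
deep = should_continue(0.8, 0.4, criterion_value(depth=3))
# Deep in the tree, but a likely alternative was noticed earlier.
deep_with_alt = should_continue(0.8, 0.4,
                                criterion_value(depth=3, alt_likelihood=0.9))
print(shallow, deep, deep_with_alt)
```

In this sketch the user turns back near the top of the hierarchy, persists when three levels down, and turns back again once a strong alternative lowers the criterion.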
Studies of hill climbing indicate that problem solvers tend to concentrate on only one attribute at a time in selecting their next move rather than selecting moves that change several attributes and achieve the goal in fewer moves (Norman, 1983). In a data base search of a library catalog, this would be analogous to searching first on the basis of author's name to reduce the set and then switching to search on the basis of title. Ultimately, the user may need to switch back to a name search if the title search does not result in a find.
Hill climbing is an effective strategy only if the evaluation function is well behaved and there is only one global maximum. If this is not the case, the problem solver may only find the solution by taking a detour in which the evaluation function goes down for one or several moves before it rises again. Moreover, if local maxima exist, the problem solver may get trapped at what appears to be the solution, but is not truly the optimal selection.
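The trap posed by local maxima is easy to demonstrate. The following Python sketch runs a simple hill climber over a one-dimensional list of evaluation values; the values and the function name `hill_climb` are invented for illustration and do not come from any study cited here.

```python
def hill_climb(values, start):
    """Move to the better neighbor until no neighbor improves the value.

    values: evaluation function sampled at positions 0..len(values)-1.
    Returns the position where the climb stops.
    """
    pos = start
    while True:
        neighbors = [i for i in (pos - 1, pos + 1) if 0 <= i < len(values)]
        if not neighbors:
            return pos
        best = max(neighbors, key=lambda i: values[i])
        if values[best] <= values[pos]:
            return pos  # no move increases the evaluation function
        pos = best


# Two peaks: a local maximum at index 2 and the global maximum at index 6.
landscape = [1, 3, 5, 4, 2, 6, 9, 7]
print(hill_climb(landscape, start=0))  # stops on the lower peak
print(hill_climb(landscape, start=4))  # reaches the higher peak
```

Starting from the left, the climber halts on the lower peak because reaching the higher one would require passing through positions with lower values, which is the detour described above.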
Test-Operate-Test-Exit. One of the basic ideas behind the generate-test heuristic is that of feedback. The problem solver monitors the current state and generates responses to change that state to satisfy some criterion. But many tasks require a more complicated strategy. Miller, Galanter, & Pribram (1960) discuss a strategy that not only incorporates the idea of feedback but also the hierarchical structure of interlocking component processes. This plan is called TOTE for Test-Operate-Test-Exit. A simple plan for hammering nails is shown in the left panel of Figure 4.7. The object is to hammer a nail until it is flush with the surface. The first stage is to test the nail. If it sticks up, then one goes to the second stage; otherwise one stops. The second stage is to test the hammer. If it is down, one lifts it up; otherwise one goes to the third stage. The third stage is to strike the nail after which one goes to the first stage.
Many tasks involve just this sort of combination of feedback and hierarchical structuring of components. The right panel of Figure 4.7 shows the same sort of TOTE for a data entry task. The first stage is to check the inbox for data. If there is data, then one goes to the second stage; otherwise one stops. The second stage is to test the entry field. If it is not the correct one, the correct field is selected; otherwise one goes to the third stage. The third stage is to enter the data after which one goes to the first stage again.
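The data entry plan just described can be written as a short loop. This Python sketch is one possible reading of the three stages; the record format (field, value), the function name `enter_data`, and the action log are assumptions made for the example.

```python
def enter_data(inbox, form, current_field=None):
    """TOTE-style data entry. Consumes `inbox`, fills `form` in place,
    and returns the list of actions taken.

    inbox: list of (field, value) records waiting to be entered.
    form: dict mapping field names to entered values.
    """
    actions = []
    while inbox:                        # Test 1: is there data in the inbox?
        field, value = inbox[0]
        if current_field != field:      # Test 2: is the correct field selected?
            current_field = field       # Operate: select the correct field
            actions.append(("select", field))
            continue                    # go back and test again
        form[field] = value             # Operate: enter the data
        inbox.pop(0)
        actions.append(("enter", field))
    return actions                      # Exit: the inbox is empty


form = {}
log = enter_data([("name", "Ann"), ("name", "Bob"), ("age", "30")], form)
print(log)
print(form)
```

Note that the field is selected only when the test fails, so consecutive records for the same field cost one selection rather than two.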
The value of hierarchical plans for solving problems has been emphasized by Simon (1969). To illustrate the advantage of hierarchical structure, Simon presents a parable about two watchmakers, Tempus and Hora. Both make watches consisting of 1000 parts. Tempus builds his watches in one assembly of 1000 parts. However, if he is interrupted in the middle of the assembly by a customer, the partially assembled watch falls apart into its original pieces. Hora's watches are built in units of 10 pieces. Ten single parts make a unit, 10 units make a larger component, and the 10 components make the entire watch. If Hora is interrupted he loses only a small portion of the unfinished watch. Simon estimated that Tempus will lose an average of 20 times as much work per interruption as Hora. Although problem solvers may not suffer from the problem of losing prior work, hierarchical structure may prove beneficial in that the problem solver needs to think about only a limited number of elements at a time.
Menu interfaces are particularly germane to problem solving involving simple hierarchies. Menu systems can incorporate the hierarchical nature of the task into their own structure. The question is whether they do so in an effective and efficient manner that facilitates performance.
Means-Ends Analysis. Another important strategy in problem solving is the means-ends analysis. This heuristic involves a number of components already discussed. In the means-ends analysis the problem solver works on one goal at a time. If that goal cannot be achieved, the problem solver sets a subgoal of removing the obstacle that blocks that goal. The problem solver constantly monitors the difference between the current state and the goal state desired. If a difference exists, an attempt is made to generate a response that will reduce that difference. Simon (1969, p. 112) summarizes the means-ends analysis as follows: "Given a desired state of affairs and an existing state of affairs, the task of an adaptive organism is to find the difference between these two states and then to find the correlating process that will erase the difference."
A typical problem for the computer user that would be addressed by the means-ends analysis might be the task of viewing File X. But this goal is blocked because the file is not loaded and the user doesn't know the path to that file. Consequently, the first subgoal is to find File X. This may be solved by getting a directory listing. But that can only be accomplished if the program listing the directory can be run. So the sub-subgoal is to run the directory program. Running the directory program may itself entail the solution of a number of sub-subgoals. Once File X is found, the second subgoal is to run a browsing program to view the file. Again, a number of sub-subgoals may have to be achieved to accomplish this.
The means-ends analysis can be mentally taxing when the user must keep track of a number of embedded subgoals. Although the computer may easily store such goals on a push-down-pop-up stack, it is not so easy for the user. However, solutions to problems requiring the means-ends analysis may be facilitated by the use of hierarchical menu selection and event trapping menus. The main advantage of the menu interface is the ability to traverse goal and subgoal states up and down the hierarchy of operations and to access menu options from numerous points during the interaction. The menu interface reduces the memory load on the user by keeping track of goal states (e.g., location in the hierarchical structure) and prompting the user for input appropriate to each subgoal. Furthermore, the hierarchical menu structure may allow the user to solve subgoals in an order more amenable to human thinking rather than in an order dictated by formal problem solving. For example, in order to view File X in the problem above, the user of a menu selection system might first select a browsing program from a menu before knowing the location of File X. The browsing program prompts the user for a file. A means-ends analysis reveals that the user should have solved the problem of the location of File X first. The user would then have to back out of the browsing program in order to find the file. However, a menu interface that allows the user to select concurrent processes frees the user from having to perform tasks in a predetermined order. While in the browsing program, the user could select the file directory program, find the file, and pass its location to the browsing program.
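The push-down stack of goals mentioned above can be sketched directly. In this Python example the goal names paraphrase the File X problem, while the `preconditions` table, the function name `means_ends`, and the assumption that the goal structure contains no cycles are introduced purely for illustration.

```python
def means_ends(goal, preconditions, satisfied=None):
    """Work on one goal at a time, pushing a subgoal whenever the current
    goal is blocked. Returns (order_achieved, max_stack_depth).

    preconditions: dict mapping a goal to the goals that must hold first.
    Assumes the precondition structure is acyclic.
    """
    satisfied = set() if satisfied is None else set(satisfied)
    stack, order, max_depth = [goal], [], 1
    while stack:
        current = stack[-1]
        blockers = [p for p in preconditions.get(current, [])
                    if p not in satisfied]
        if blockers:
            stack.append(blockers[0])       # subgoal: remove the first obstacle
            max_depth = max(max_depth, len(stack))
        else:
            satisfied.add(current)          # goal achieved; pop back up
            order.append(stack.pop())
    return order, max_depth


preconditions = {
    "view File X": ["find File X", "run browsing program"],
    "find File X": ["get directory listing"],
    "get directory listing": ["run directory program"],
}
order, depth = means_ends("view File X", preconditions)
print(order)
print("deepest goal stack:", depth)
```

The maximum stack depth is a crude stand-in for the number of embedded goals the user would otherwise have to hold in memory, which is the load a menu interface can take over by tracking the user's position in the hierarchy.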
4.2.2 User Strategies and Styles. The types of strategies that users employ and their effectiveness depend not only on the tasks but on characteristics of the users. For example, users vary in their repertoire of generating solutions--an attribute important in the generate-test heuristic. Users differ greatly in their ability and willingness to plan ahead--an attribute important in the means-ends analysis. Finally, users differ in the degree with which they will pursue a particular course of action before they give up. This attribute is particularly important in menu selection since problem solving often involves a search through a hierarchical menu structure.
Search behavior by problem solvers has been characterized as either (a) shallow and broad or (b) narrow and deep. A shallow problem solver is likely to survey a wide number of possible solutions but explore them only superficially. This type of problem solver considers an alternative and, if it is not immediately apparent that it will lead to the solution, drops it and turns to another one. This user is likely to look only 1 or 2 levels deep before going back to the top again. On the other hand, the narrow and deep problem solver is likely to limit his or her search to only a few alternatives and explore them in depth. In this style, the problem solver picks one path and follows it out until it either results in the solution or its potential is completely exhausted. Different types of problems are more conducive to solution by different styles of problem solving. The shallow and broad strategy is more appropriate for placing pieces in a jigsaw puzzle; whereas, the deep and narrow strategy is more appropriate for solving the Tower of Hanoi puzzle or playing chess. In a similar way different types of menu systems are more conducive to different types of search strategy. The shallow and broad strategy seems appropriate for the broad range of easily scannable alternatives provided by pull down menus. The deep and narrow strategy would be more appropriate for navigation in a complex data base.
A mixture of these strategies is known as progressive deepening. In this strategy each alternative is explored to a certain level. If no solution is found, alternatives are pursued to a further level. Thus, a progressive deepening is conducted in search. This type of strategy is particularly useful when there are a fair number of possible solutions and it is not clear to what extent each alternative must be explored to determine its suitability. Unfortunately, it is hard to see how users of a hierarchical menu system could effectively use this strategy unless they could keep track of previous search depths and jump to those points quickly. However, menu interfaces that allow concurrent searches or that allow the users to place bookmarks for fast return can facilitate the progressive deepening strategy.
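Progressive deepening corresponds closely to what is usually called iterative deepening in the search literature. The Python sketch below applies it to a small, made-up menu tree; the tree, the function names, and the visit log are all hypothetical.

```python
def depth_limited(tree, node, target, limit, visits):
    """Depth-first search from `node`, going at most `limit` levels down."""
    visits.append(node)
    if node == target:
        return True
    if limit == 0:
        return False
    return any(depth_limited(tree, child, target, limit - 1, visits)
               for child in tree.get(node, []))


def progressive_deepening(tree, root, target, max_depth):
    """Explore every alternative to depth 0, then 1, then 2, and so on.
    Returns (depth_at_which_found_or_None, list_of_frames_visited)."""
    visits = []
    for limit in range(max_depth + 1):
        if depth_limited(tree, root, target, limit, visits):
            return limit, visits
    return None, visits


menu_tree = {
    "Main": ["File", "Edit"],
    "File": ["Open", "Print"],
    "Edit": ["Copy", "Paste"],
}
found_at, frames = progressive_deepening(menu_tree, "Main", "Copy", max_depth=2)
print(found_at, len(frames))
```

The visit log shows the cost noted in the text: each deeper pass revisits the frames already seen, which is why bookmarks or concurrent searches would help a human user apply this strategy.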
Search styles can be much more complex than merely varying on depth. A number of patterns and additional factors have been characterized by Canter, Rivers, and Storrs (1985) for navigation through complex data structures. Canter et al. define six indices that can be used to quantify search patterns. The patterns are based on paths, rings, loops, and spikes in data base traversal (see Figure 4.8).
Pathiness : A path is any route through the data that does not visit any node twice. It starts at one point and terminates at another. Menu traversal may be characterized by many short paths (high pathiness) or few long paths (low pathiness).
Ringiness : A ring is a route through the data which returns to the node from which it started. Since a ring has a home base, it may be thought of as an "outing." Such a ring may include other rings. Menu traversal may be characterized by many rings returning to home base (high ringiness) or few rings (low ringiness).
Loopiness : A loop is a ring which contains no other rings. A loop is a simple ring and is distinguished by the fact that no node is visited twice except the home base.
Spikiness : A spike is a route through the data which goes out to a node and returns exactly the way it came. Hierarchical data bases are likely to result in high spikiness since one would traverse the hierarchy down and retrace the path back out.
NV/NT : The ratio of the number of nodes visited (NV) to the total number of nodes available in the system (NT) gives the proportion of available nodes utilized by the user. A high NV/NT ratio indicates a more comprehensive coverage of the data base.
NV/NS : The ratio of the number of different nodes visited (NV) to the total number of visits to nodes (NS) gives the proportion of first time visits. A low NV/NS ratio indicates a high degree of repetitive visits to nodes.
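The two ratio indices are straightforward to compute from a log of visited nodes. The Python sketch below assumes the log is simply a list of node names in the order visited; the function name `coverage_indices` and the sample log are invented for the example.

```python
def coverage_indices(visit_log, total_nodes):
    """Return (NV/NT, NV/NS) for a traversal.

    visit_log: node names in the order visited (repeats allowed).
    total_nodes: NT, the number of nodes available in the system.
    """
    if not visit_log or total_nodes <= 0:
        raise ValueError("need a non-empty log and a positive node count")
    nv = len(set(visit_log))   # NV: number of different nodes visited
    ns = len(visit_log)        # NS: total number of visits
    return nv / total_nodes, nv / ns


log = ["Main", "File", "Main", "Edit", "Copy", "Edit", "Main"]
nv_nt, nv_ns = coverage_indices(log, total_nodes=10)
print(f"NV/NT = {nv_nt:.2f}, NV/NS = {nv_ns:.2f}")
```

Here four of ten nodes are covered, and four of the seven visits are first-time visits, so the traversal is both narrow in coverage and fairly repetitive.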
Canter et al. (1985) use these definitions to further characterize search strategies.
Scanning : When users are scanning, they tend to cover a large area of the menu system, but without going into great detail or depth. Scanning will result in long spikes and short loops which traverse through the database but do not extend very far into it. It is characterized by a high proportion of nodes visited relative to the total number of nodes available.
Browsing : Users may be happy to go wherever the data leads them. Users will pursue a path as long as it sustains their interest. Browsing behavior may be characterized by many long loops and a few large rings.
Searching : When users are searching for a particular target, the pattern may include ever-increasing spikes with a few loops. It is also characterized by a high redundancy of nodes revisited relative to the total number of different nodes visited.
Exploring : Many different paths of medium or short length suggest that the users are trying to grasp the extent and nature of the database. They may be attempting to gain a global map of the menu system.
Wandering : Users may wander more or less randomly through the database. The unstructured journey will lead to many medium-sized rings.
These strategies give an idea of the different types of search patterns that may occur as a function of motivational factors. Although Canter et al. characterize them in terms of the indices defined earlier, it remains to be seen whether one can delineate the type of search strategy based on the six indices.
4.3 Cognitive Layouts of Mental Models
Problem solving is governed by the way in which the problem space is represented. One may represent a problem mathematically, another visually, and still another metaphorically. Studies of problem solving behavior suggest that the key to problem solving is more often having the proper representation rather than the ideal strategy (Wickelgren, 1974).
When the user plans a task involving a menu selection system, both the menu and the task domain comprise the problem space. The menu representation and the task representation help to define the way in which the user thinks about the problem. The term "mental model" has been used loosely to refer to these representations in the sense that the user adopts a conceptual model of computer operations that may relate abstract ideas (e.g., storage registers, I/O drivers, etc.) to concrete things (e.g., mailboxes, TV channels, etc.). The user's mental model of the system has been defined and illustrated in numerous ways in the literature (Norman & Draper, 1986). Representations may take the form of metaphors, schemata, scripts, or cognitive layouts. These representations are by no means exhaustive or nonoverlapping. However, they serve to characterize the way in which users think about cognitive control of the human/computer interface.
4.3.1 Menu Selection as a Metaphor. The purpose of a metaphor as a literary device is to transfer the reader's concrete knowledge about a familiar thing to an unfamiliar subject being written about. The author draws upon the wealth of existing knowledge to shed light on a novel topic. It has been suggested that the same process of transference be capitalized upon in human/computer interaction. Carroll and Mack (1985) ask, "Can interfaces be designed to take advantage of the metaphors new users generate spontaneously as they apply their prior knowledge to this novel learning situation?" The extent to which the design suggests and actually conforms to the metaphor determines the amount of transference of knowledge. Part of the knowledge transferred is prior experience in problem solving strategies. For example, if the metaphor for a telecommunication program is a scrolling teletype, then the user would infer that text that scrolled off the top of the screen can be viewed again by scrolling backwards.
In most metaphors for computer operations, the base is more familiar but the novel area is more functionally rich. The typewriter is well understood, but its functionality is considerably less than that of the word processor. Similarly, the card catalog is well understood, but its functionality is considerably less than that of the online catalog. However, when it comes to metaphors for human/computer interaction the functionality of the base is greater than that of the computer application. Natural language has greater familiarity and functionality than computer command languages. In a similar, but more structured way, restaurant menu selection has a greater familiarity and functionality than computer menu selection.
Computer menu selection is, quintessentially, a metaphor. The original knowledge base is that of ordering items in a restaurant. However, the correspondence of elements between the two domains runs deeper than a superficial application of the metaphor. Webster (1976) gives the following definition for a menu:
menu n. [ Fr., small, detailed, from L. minutus, pp. of minuere, to lessen, from minor, less] 1. a detailed list of the foods served at a meal; bill of fare. 2. the foods served.
The menu presents a finite set of items available at the establishment. The customer then makes a selection and informs the server. The order is then prepared and served to the customer. In a similar vein the computer displays a detailed list of options available using that program. Current applications of computer menu selection bear a strong correspondence to restaurant menus and reinforce the metaphor. However, as with every metaphor, there are certain aspects that may either be deficient or enhanced in the target domain.
Norman and Chin (1989) provide a comparison between restaurant menus and computer menus used in common software packages such as Lotus 1-2-3(TM) (see Table 4.2). Their sample of computer programs contained nearly three times as many selectable items as the restaurant menus. However, the ratio of pages in restaurant menus to frames in computer menus did not match that ratio. Restaurant menus contained considerably fewer pages with many more items per page. The organization of items also varied. The number of items per category level (i.e., per frame) was somewhat greater for restaurant menus at the first level but smaller at the bottom level. At the top level, restaurant menus averaged 11 categories (e.g., appetizers, sandwiches, main courses, beverages, etc.); whereas, computer menus averaged only 7.8 (e.g., Print, Rename, Copy, Delete, Run, Exit, etc.). At the bottom choice level, restaurant menus averaged 2.5 items (generally several sizes such as large, medium or small); whereas, computer menus averaged 4.7 (e.g., a list of font sizes or baud rates).
Table 4.2
A Comparison of Restaurant Menus (n = 56) and Computer Menus (n = 4)
Attribute | Restaurant | Computer Programs |
---|---|---|
Total Number of Selectable Items | 119.0 (83.1) | 316.4 (158.5) |
Number of Pages (Frames) | 3.8 (3.5) | 70.2 (65.7) |
Number of 1st Level Categories/Items | 11.0 (6.3) | 7.8 (3.6) |
Average Items per 1st Level Category | 8.0 (4.7) | 6.0 (3.2) |
Average Items per 2nd Level Category | 3.4 (1.7) | 5.4 (3.2) |
Average Items per Bottom Level Category | 2.5 (3.0) | 4.7 (3.0) |
Furthermore, Norman and Chin identify a number of common aspects between restaurant and computer menus (see Table 4.3).
Table 4.3
A Comparison of the Aspects of Restaurant Menus and Computer Menus
Attribute | Restaurant | Computer |
---|---|---|
Selection mode | name, number, pointing | letter, number, pointing |
Information about options | name, description, price, picture | definition, explanation, icon |
Organization | by course, type of food | hierarchical clustering, alphabetic, etc. |
Customization | by time, day of week, type of food | by application, experience of user |
Complexity of selection | multiple items, combinations | pick lists, predefined configurations |
Menu bypass | order when seated | jump ahead commands |
Menu specials | chef's choice, seasonals | enhancements, premium, options |
Both restaurant and computer menus often provide not only the names of items but also descriptions, definitions, and pictures or icons to help in making an informed choice. Both typically provide alternate response modes so that an item may be selected by name, number, or pointing. Both organize items along some line to help in the search process. Restaurant menus are customized for breakfast, lunch, and dinner, by day of the week, by type of customer (adult or child), by nationality, and by type of food. Similarly, computer menus are customized from one computer application to another; and one picks the package that will be the most functional for the task at hand. The selection of a restaurant serves to restrict the set of options available to the customer and thereby helps to focus the decision process. Similarly the selection of a particular software package restricts the functionality, but provides a finite set of options that helps to simplify the interaction. The menu, consequently, conveys the speciality of the restaurant or software package.
Both restaurant and computer menus can allow for complex selection. The restaurant allows the customer to order (a) multiple items, (b) several of any one item, and (c) combination platters. In a similar manner, some computer applications provide pick lists, multiple selections, and pre-programmed selections. Both allow for a form of menu bypass. If the customer is familiar with the menu, there is no need to refer to it when ordering. Similarly, many computer menus incorporate this feature with jump ahead or menu bypass commands (Laverson, Norman, & Shneiderman, 1987).
Restaurant menus handle specials of the day by clipping them to the standard menu or displaying them on a chalk board. Such specials take advantage of seasonal variation, market fluctuations, or the whim of the chef. Some timeshare systems make use of this too. The variety adds interest to the menu for regular users who are looking for something new. In addition, computer systems may take advantage of prevailing system conditions such as access to the LAN, the printer, or other system resources.
While there are many aspects in common, there are several aspects of the computer menu that do not correspond to the static menus of restaurants. These result primarily from the dynamic nature of the computer and its ability to update its display. Although restaurant menus typically organize and group items by course or food category, they are limited in terms of the number of levels that can be meaningfully displayed. Computer menus have the capability of organizing and displaying items in a hierarchical structure with unlimited depth.
Another advantage that computer menus have over most restaurant menus is the dynamic ability to add or delete items from the menu depending on the current state of the system. For example, Macintosh(TM) pull-down menus display grayed-out items if they are not currently available. In contrast, the restaurant customer may place an order only to be informed that the kitchen has run out of that item. In a sense this is equivalent to generating an error message. The computer menu has the capability of avoiding such errors.
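The idea of state-dependent availability can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `MenuItem` class, the `is_available` predicate, the `EDIT_MENU` items, and the state keys are all hypothetical names, not taken from any particular menu system.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class MenuItem:
    label: str
    # Hypothetical predicate: True when the item can be used in the given state.
    is_available: Callable[[Dict[str, bool]], bool]


def render_menu(items: List[MenuItem], state: Dict[str, bool]) -> List[Tuple[str, bool]]:
    """Return (label, enabled) pairs; a disabled item would be drawn grayed out."""
    return [(item.label, item.is_available(state)) for item in items]


def can_select(items: List[MenuItem], state: Dict[str, bool], label: str) -> bool:
    """An item can be evoked only if it exists and is currently enabled."""
    for item in items:
        if item.label == label:
            return item.is_available(state)
    return False


# Hypothetical edit menu whose items depend on the current system state.
EDIT_MENU = [
    MenuItem("Copy", lambda s: s.get("has_selection", False)),
    MenuItem("Paste", lambda s: s.get("clipboard_full", False)),
    MenuItem("Select All", lambda s: True),
]
```

Because unavailable items are marked before the user chooses, the "kitchen has run out" error never has to be reported after the fact.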
Despite the power and versatility of the computer, restaurant menus still possess two major features not yet shared by computer menu systems. The first is the appeal and complexity of graphic layout. Visual inspection of the restaurant menu in comparison to the computer menu reveals that the typical computer menu is extremely information lean, displaying only alphanumeric lists of items. In contrast, the restaurant menu may display tantalizing pictures, descriptions of entrees, and stylized type. The graphic layout of a menu not only helps to organize items but it also conveys additional information about the items. The intent of the restaurant menu, however, is not only to inform, but also to sell. Eye catching graphics help to do this. Computer menus are only beginning to exploit this aspect of the metaphor. For example, HyperCard(TM) provides a menu system that allows designers full graphics capability.
Perhaps the greatest deficiency of computer menus is not in the menu itself but in the absence of a sub-metaphor; namely, the server. In the restaurant the server facilitates communication between the customer and the kitchen. The counterpart in a computer system might be a natural language parser in conjunction with a menu to provide the user with extensive online help concerning the choices on the menu. The server possesses a great deal of knowledge about the menu and the relationships between items and functions as an intelligent database. Customers may query about items by aspects such as cost and food type.
As computer menus become more and more complex, the user needs such an expert system analogous to the server to assist in navigating and pruning the menu tree. For example, a server could perform complex relational searches by specified conditions and generate a shortened menu or narrow the user's search in a large menu. The server acts as a constantly available context-dependent help facility. The server is called upon for suggestions, definitions of terms, and even directions. Likewise, online help in computer menu selection is needed to provide context-dependent help.
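As a rough sketch of how a "server" might generate a shortened menu from specified conditions, the Python fragment below filters items by food type and cost, the two query aspects mentioned above. The `Dish` record, the `shorten_menu` function, and its parameters are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Dish:
    name: str
    food_type: str
    cost: float


def shorten_menu(
    menu: List[Dish],
    food_type: Optional[str] = None,
    max_cost: Optional[float] = None,
) -> List[Dish]:
    """Return only the dishes that satisfy every condition that was specified."""
    shortened = []
    for dish in menu:
        if food_type is not None and dish.food_type != food_type:
            continue  # wrong category for this query
        if max_cost is not None and dish.cost > max_cost:
            continue  # too expensive for this query
        shortened.append(dish)
    return shortened
```

A query such as `shorten_menu(menu, food_type="seafood", max_cost=15.0)` would present the user with a pruned list rather than the full menu; omitting a condition leaves that aspect unconstrained.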
Overall, the restaurant menu generates a powerful metaphor for human/computer interaction. The user's understanding of the interface and proficiency in using it are for the most part enhanced by the metaphor; however, at times it may also limit thinking. Norman and Draper (1987) caution that metaphors as mental models may fix the way in which a user thinks about the interface. If the concept of a computer menu is limited to restaurant menus, which are essentially single linear menus, it may be difficult to understand how the selection of one item eliminates the availability of other items.
4.3.2 Schemata and Scripts. Metaphors transfer knowledge from one base or media to another. Schemata attempt to capture the structural representation of knowledge. A schema is a diagrammatic outline of something that conveys its essential characteristics. One understands incoming information to the extent that it conforms to our schemata or ways of knowing. If it fits a predefined pattern, it can be understood and incorporated into the knowledge base. If it doesn't, it is gibberish. Most information fits somewhere in between perfect conformity and total chaos. Consequently, information is filtered and modified by existing schemata so that it fits with our understanding of things (Bartlett, 1932). Furthermore, missing information may be inferred as required by the schema. For example, a schema for a menu requires that there are options, a method of selection, and a result. When one encounters different types of menus, an attempt is made to understand them in terms of the overriding menu schema. Pull-down menus become meaningful when the user understands that options are displayed by selecting a pull-down from the menu bar and evoked by moving the cursor down to the desired item.
A special type of schema is the script (also called an event schema). A script is an expected or stereotypical sequence of actions and events. Schank and Abelson (1977) give an example of the restaurant script. They describe a normal pattern of actions as listed in Figure 4.9. The stereotypical script, however, may vary from one instance to another. For example, at a fast food restaurant, the script is changed so that one orders the food and pays before being seated and eating. This sequence allows the customer to leave with the food or right after finishing rather than having to wait for the check.
In the same way, computer users learn scripts for how one interacts with computers, application programs, and particularly menus. Menu selection provides a simple script for cognitive control: Read the options, decide on a selection, input the alternative, and evaluate the result. The script varies somewhat for different menu structures but its simplicity makes for a powerful and compelling user interface.
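The four steps of this script can be written out as a small Python sketch. It assumes hypothetical callables `decide` (the user's choice process) and `respond` (the system's feedback), plus an `expected` result against which the user evaluates the outcome; none of these names come from the original text.

```python
from typing import Callable, List, Sequence, Tuple


def run_selection_script(
    options: Sequence[str],
    decide: Callable[[Sequence[str]], str],
    respond: Callable[[str], str],
    expected: str,
) -> List[Tuple[str, object]]:
    """Walk through the script: read, decide, input, evaluate."""
    trace: List[Tuple[str, object]] = []
    trace.append(("read", list(options)))        # read the options
    choice = decide(options)
    trace.append(("decide", choice))             # decide on a selection
    if choice not in options:
        raise ValueError(f"{choice!r} is not on the menu")
    feedback = respond(choice)
    trace.append(("input", feedback))            # input the alternative
    trace.append(("evaluate", feedback == expected))  # evaluate the result
    return trace
```

The returned trace makes the order of the steps explicit; a variation of the script for a different menu structure would change the steps recorded, not the overall pattern.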
Scripts also apply to the wider context of a session of interaction to perform a task such as working with a spreadsheet or a word processor. The script may start with how the program opens files and initializes the environment; then how functions are performed; and finally, how files are closed before one exits the program. Scripts not only help users plan their actions, they also help to evaluate the course of action. The user has an expectation of the proper flow of events. When they do not conform to those expectations, the user knows that something is wrong. However, as with the restaurant, certain variations are tolerated. Some programs require specification of a file name prior to entering data rather than upon completion. Some programs also periodically save the contents into this file during the session to prevent accidental loss due to system failure while others require the user to explicitly save the file before exiting the program.
Experienced users acquire a number of scripts through their interaction with different programs. Cognitive control of the interface is facilitated by well-worn paths that conform to these scripts. It is suggested that designers take advantage of prior expectations of users. New programs that violate accepted scripts for whatever good reasons may not be understood or accepted by users.
4.3.3 Cognitive Layouts of Menus. Mental models, whether metaphors, schemata, or scripts, are in the mind of the user. Often they remain there dormant as the user muddles along step by step or frame by frame without engaging any planning or problem-solving approach. Norman, Weldon, & Shneiderman (1986) have proposed that performance may be facilitated if such user models take on a visual form that engages the appropriate cognitive layout. Formally, a cognitive layout is defined as a mental representation of the elements and relationships in a system that conform to a cognitive model of operations and is tied to the surface layout of elements on the display. Norman et al. give the example of the three box human memory model as one of many possible cognitive layouts. Users may conceive of the system as having a short term sensory store (a buffer), a short term memory (working memory), and a long term memory (file storage). When the surface layout of the system (its graphic representation) matches that of the user's cognitive layout, the user's involvement with and understanding of the system will be maximized. Norman et al. applied this idea to operation of multiple windows and screens. Windows may promote the human memory model by displaying incoming information in one window, working contents or clipboard information in another window, and a directory of files in a third window.
In essence the graphic layout should engage the way in which the user conceptualizes the operation of the program. The problem is that too often menus hide the organization and structure of the tree rather than explicitly using it to the benefit of the interface. A number of cognitive layouts present themselves as possible models for how the user thinks of menu interaction. Each layout has its strengths and weaknesses. The particular surface layout used to drive the interface may be able to emphasize the strengths and make accommodations for the weaknesses.
A number of menu systems have used graphic layouts that have suggested different types of metaphors. Several of these are discussed below.
Road Map. The cognitive layout of a menu selection system may be a map. As such the user views menu traversal as navigation. The road map layout associates menu frames with junctions in the road; alternatives are different locations or roads to those locations. The user is engaged in the process of determining routes between points. Initially, the user may search for possible routes by exploring alternatives branching out from the current location. But one may also work backwards from the destination. In general, search starts from highly familiar points and proceeds from there. Once a route has been found and repeatedly used, it becomes habitual even when shortcuts may be available. The value of the map is to display a graphic representation showing all of the major locations and connectors. When implemented on a computer, there is an additional advantage. The user may be able to select a point on the map and jump to that location in the system. The map is itself merely a large menu, but the cognitive model conveyed is much more powerful since the user is aware of both the location and the connections of items. Although the road map is appropriate for a number of systems, only a few actually present a surface layout that conveys that idea. HyperCard(TM) gives one instance of a road map layout in its help system (see Figure 4.10).
Tree. A related cognitive layout is that of a tree with branches or inversely a tree with roots. These layouts confine the user's cognitive layout to a hierarchical menu. The tree layout dictates directional menu traversal from a central node (the root) to increasing levels of specificity. The directional nature of the tree is pervasive and is reinforced by much of the terminology used in menu traversal. In many cases interaction may need to be guided by the hierarchical nature of the database and it does not make sense to go from one location to another without at least conceptually referring to the hierarchical location of a node. In other cases, the hierarchy may be a superficial or arbitrary clustering of items (e.g., a catalog of gift items). When this is the case, the hierarchy may prove to be more of a burden than a strength. The general layout of the hierarchical menu requires that the user back out of a branch and return to the root before traversing back out to another branch.
Smorgasbord. Another layout is that of the Swedish smorgasbord. All of the options are spread out before the user. There is clustering and organization of items, but anything may be sampled. While there may be a linear layout of items, there is no sense of rigid menu traversal, rather one of simultaneous availability. Other layouts make use of the artist's palette and the worker's toolbox. Parameter settings (hues, shades, fonts, lines, etc.) are laid out in a meaningful order and are simultaneously available. Functions (text, graphic objects, grabbers, etc.) are also laid out for direct selection. The major strength of the smorgasbord layout is that experienced users learn the locations of items and can make rapid selections despite the very large number of options that may be available.
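The cost of backing out to the root can be made concrete with a small Python sketch. The nested-dictionary tree, the `CATALOG` contents, and the function names are hypothetical, and the step count assumes the strict case described above, in which every move between items passes through the root.

```python
from typing import Dict, List, Optional, Tuple

Tree = Dict[str, "Tree"]

# Hypothetical gift catalog arranged as a hierarchical menu.
CATALOG: Tree = {
    "Gifts": {
        "Kitchen": {"Mugs": {}, "Aprons": {}},
        "Garden": {"Planters": {}},
    }
}


def find_path(tree: Tree, target: str, prefix: Tuple[str, ...] = ()) -> Optional[List[str]]:
    """Depth-first search for the path of labels from the root to target."""
    for label, children in tree.items():
        path = prefix + (label,)
        if label == target:
            return list(path)
        found = find_path(children, target, path)
        if found is not None:
            return found
    return None


def steps_via_root(tree: Tree, start: str, goal: str) -> int:
    """Count moves when the user must back out to the root, then descend again."""
    start_path = find_path(tree, start)
    goal_path = find_path(tree, goal)
    if start_path is None or goal_path is None:
        raise KeyError("item not in menu tree")
    # Depth below the root is the path length minus one.
    return (len(start_path) - 1) + (len(goal_path) - 1)
```

Under this assumption, moving from "Mugs" to "Aprons" costs as many steps as moving from "Mugs" to "Planters", even though the first two items sit in the same branch. This is one way an arbitrary hierarchy can become a burden.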
These layouts are by no means exhaustive of the number of cognitive layouts that may be effectively used to engage the user. The challenge of good design is not so much to invent new interfaces but to borrow existing cognitive layouts from the world of common knowledge and thought.
4.4 Summary
Although the cognitive processing of the user imposes a number of limiting factors on menu selection performance, that same ability to process and control is what drives the interaction. The user must search for information, encode the meaning of alternatives, assess the alternatives, make a choice and effect a response. All of these processes are governed by the laws of human information processing. Good menu design takes into consideration such human factors to increase speed and reduce errors.
At another level, however, the user enters not as a limiting force but as a driving force. The user is a problem solver with goals, strategies, and styles of attack. As such, the computer interface becomes a medium for effecting solutions. Theories of human problem solving suggest that the problem solver's understanding and representation of the problem domain aid in solution. To this end, good user interface design should convey a sense of meaning and engage schemata that lend themselves to solutions of the tasks being performed.