LEKULE: Depth vs. Breadth of Hierarchical Menu Trees

Menu selection systems have the power of providing a seemingly unlimited number of choices to the user by generating deeper and more complex menu structures. The power of hierarchical menus is inherent in the exponential growth of the possible number of terminal nodes as a function of level. Menus can access thousands of items. Figure 8.1 gives an example of menu frames to access the hierarchy of living animals. The example shown has eight levels proceeding from subkingdoms down to genera. There are approximately 5,000 genera in this classification. If the menu system continued down to species, it would include approximately one million items. Other menu systems use hierarchical menu structures to access medical and personnel files, demographic data, bibliographic references, encyclopedic information, geographical locations, etc. Even systems using indices and keywords may use hierarchical menus to a certain extent.

Unfortunately, the added complexity due to the depth of menu is not without cost. Menu designers must carefully weigh the options and consider a number of tradeoffs between factors such as the breadth and depth of the tree. This chapter will review empirical studies on the effect of depth vs. breadth of the menu structure on performance in highly predictable and learned menus. At each choice point in such menus the user is highly certain as to the "correct" choice when searching for a target. Performance is primarily a function of the speed with which the user can traverse the menu from the top to the target node. The following chapter will explore search behavior in menus that are not perfectly predictable. The user may have to make educated guesses as to the "correct" choice leading to the target. Performance in such menus is primarily a function of the number of selections and backtracking required to find a target. These studies investigate search patterns and user strategies. Finally, it should be mentioned that the studies reviewed in these two chapters are primarily concerned with the effect of menu structure on performance and not on the relationship between menu content and structure. This important issue will be dealt with in Chapter 10. Nevertheless, it will be seen that the formal structure of a menu has a substantial effect on performance. The results of these studies will help to tell designers how to structure the tree.

8.1 Depth vs. Breadth Tradeoff
An extremely important issue in the design of menu systems is the depth vs. breadth trade-off. For a given number of choices, should they all be presented on one screen or should they be divided among several screens? Figure 8.2 gives an example of two ways of setting up an automatic teller machine. In one case, all of the choices are shown on one screen. In the second case, the alternatives are subdivided such that the customer must make a series of choices. Which organization results in fewer errors, reduced time, and greater customer satisfaction and acceptance?

A number of factors come into play when users are faced with systems that vary in their depth and breadth. With broad menus, visual search time becomes an important factor. Users must scan long lists of alternatives to locate the desired item. In Chapter 6, we looked at how the organization of menu items can greatly improve search time. Broad menus also place an added burden on response selection. If numbers are used, two digits will be required. If letters are used, mnemonic abbreviations are harder to construct and often require more letters. If cursor positioning is used, the user must on the average move the cursor a greater distance. Furthermore, if a mouse is used the user may have to hit a smaller target. Consequently, both search time and response time are expected to increase with the length of the menu.
On the other hand, with deep menus a greater number of choices are required. Each choice entails visual search, decision, and response selection. Although each choice may require less time per frame, there are more frames to contend with. Furthermore, and perhaps most damaging, greater menu depth brings greater uncertainty as to the location of target items. Vaguely worded alternatives and ambiguous terms obscure the defining set of submenus. Consequently, users tend to get lost.
Early design guidelines suggested that the number of alternatives in a frame should be kept to a minimum in adherence to the principle of "frame simplicity" (Robertson, McCracken, & Newell, 1981). In systems such as ZOG and PROMIS (Problem Oriented Medical Information System), each frame presents only a "few sentences of text and no more than half a dozen options" (Robertson et al., 1981, p. 465). Similar guidelines that promoted the "lean, clean, green screen" notion were taken to imply that the number of alternatives per screen should be kept to a minimum. This meant that designers then had to increase the depth of the tree in order to accommodate more items.
Hardware limitations also served to limit the number of alternatives presented at a time. With slow transmission speeds on time-sharing systems, designers wanted to avoid long lists of alternatives. Moreover, small CRT screens could only display a limited amount of text at one time. Small screens and slow transmission led designers of early systems to opt for menus with no more than 8 alternatives at each level. Consequently, designers found it necessary to reduce the breadth of the tree and increase its depth.
On the other hand, there were those who argued that menu depth should be limited. Sensitive to the difficulty of remembering the path and of the increase in time required to make a series of choices, Calhoun (1978) suggested that "no function or piece of data should be more than four switch hits removed from the first menu or display."
The decision to emphasize depth or breadth in menu design has been largely a function of intuitions about the users rather than based on empirical performance results. Shneiderman (1980) points out the need for experimental tests on the parameters of menu selection for questions such as: "How many choices are appropriate for a single menu?" and "Does a deep menu lead to loss of orientation?" (p. 241). Clearly, it became necessary to do empirical research to determine what the optimal combination of depth and breadth should be.
The trade-off of depth versus breadth can be reduced to the trade-off between increased decision time due to breadth versus a decreased number of choices. The total time that it takes for the user to respond to a series of menu choices is given by the following equation:
Total Response Time = Σ {u(n_i) + s(n_i)}, 8.1
where u(n_i) is the user response time to select from among n items at Level i and s(n_i) is the computer response time at Level i. In general, the breadth of trees in existing menu systems varies throughout the structure so that the trade-off between depth and breadth cannot be expressed in a simple way. However, for constant symmetric trees with N terminal nodes, the relationship between depth d and breadth n is given by the equations:
N = n^d, 8.2
log N = d log n, and 8.3
d = log N / log n. 8.4
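As a quick numerical check of Equations 8.2 through 8.4, the following minimal sketch computes the depth of a constant symmetric tree for several breadths. The function name depth_for and the choice of N = 64 terminal nodes are illustrative assumptions, not part of the original derivation.

```python
import math

def depth_for(total_items: int, breadth: int) -> float:
    """Depth d of a constant symmetric tree, d = log N / log n (Equation 8.4)."""
    return math.log(total_items) / math.log(breadth)

# Illustrative case: N = 64 terminal nodes at several breadths.
depths = {n: depth_for(64, n) for n in (2, 4, 8, 64)}
for n, d in depths.items():
    print(f"breadth {n:2d} -> depth {d:.2f}")
```

For N = 64 the depths come out as 6, 3, 2, and 1 levels. Breadths that are not exact roots of N give fractional depths, which is why the relationship is exact only for constant symmetric trees.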
If u(n_i) could be specified, the optimum of the trade-off function could be determined. Two general functions relating response time to menu breadth have been proposed in the literature as a result of theories about the cognitive processes of menu search, choice, and selection. A linear function is the result of a model proposed by Lee and MacGregor (1985) and extended by Paap and Roske-Hofstrand (1986) and by MacGregor, Lee, and Lam (1986). Alternatively, a logarithmic function is the result of models proposed by Landauer and Nachbar (1985) and by Card (1982). It will be seen that these two models result in completely different solutions to the optimal breadth of a menu tree. Although the cognitive processing of menu alternatives is expected to vary with type of menu, task, and experience, empirical evidence to date supports the logarithmic model. Nevertheless, designers should be especially careful when generalizing theoretical and empirical results to specific applications in the depth versus breadth issue due to the influence of unexplored factors.

8.2 The Linear Model
Response time u(n_i) may simply be a linear function of the number of alternatives n_i. This model may result from a number of simplifying assumptions about the cognitive processes of search, selection time, and computer response time. Lee and MacGregor (1985) discuss a decision model for menu search in videotext databases. They assume that the time that it takes for a user to make a selection will be an additive function of the number of alternatives that the user reads multiplied by the time that it takes to read an alternative plus the key-press time and computer response time:
t(n_i) = E(n) t + k + c, 8.5
where E(n) is the expected number of alternatives read, t is the reading time per alternative, k is the key-press time, and c is the computer response time. The value of E(n) depends on whether the user adopts an exhaustive or a self-terminating search strategy. For an exhaustive search, the user reads all of the alternatives in a menu frame before making a choice; consequently, E(n) = n. For a self-terminating search, the user stops as soon as an appropriate alternative is encountered. The expected value of the number of alternatives read would then be E(n) = (n + 1)/2.
The total response time is then given by multiplying depth and response time per menu frame as given from Equations 8.4 and 8.5:
t(N) = {(log N) / (log n)} (E(n) t + k + c). 8.6
Although a direct solution for the optimum values of depth and breadth does not exist, Lee and MacGregor used numerical methods to find the solution. Figure 8.3 illustrates the relationship between breadth and total response time. As the number of alternatives per frame increases from 2, the total time first decreases and then increases again. The optimal number of alternatives is about 4 for slow readers (t = 1 sec) and about 7 for fast readers (t = .25 sec) when k = 1 sec and c = .5 sec.

Using numerical methods, the optimum number of menu alternatives per frame was determined for various combinations of the parameters for reading time, key-press time and computer response time. These are shown in Table 8.1. What is rather astounding about these tables is that within reasonable values for the parameters the optimal number of alternatives per frame is between 4 and 7. However, Lee and MacGregor's (1985) assumptions may be overly restrictive, particularly for organized or highly familiar menus.
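The numerical search for the optimum of Equation 8.6 can be sketched as a brute-force scan over integer breadths. This is only an illustration under stated assumptions, not the procedure Lee and MacGregor actually used: the function names, the upper limit of 64 alternatives, and the database size N = 10,000 are all hypothetical. Because log N multiplies every candidate equally, the choice of N does not affect which breadth is best.

```python
import math

def expected_reads(n: int, strategy: str) -> float:
    """Expected number of alternatives read in a frame of n items."""
    if strategy == "exhaustive":
        return float(n)          # E(n) = n
    if strategy == "self_terminating":
        return (n + 1) / 2       # E(n) = (n + 1)/2
    raise ValueError(f"unknown strategy: {strategy}")

def total_time(N: int, n: int, t: float, k: float, c: float, strategy: str) -> float:
    """Equation 8.6: depth (log N / log n) times response time per frame."""
    depth = math.log(N) / math.log(n)
    return depth * (expected_reads(n, strategy) * t + k + c)

def optimal_breadth(N, t, k, c, strategy, max_breadth=64):
    """Scan integer breadths 2..max_breadth and return the fastest one."""
    return min(range(2, max_breadth + 1),
               key=lambda n: total_time(N, n, t, k, c, strategy))

N = 10_000  # hypothetical database size; it cancels out of the comparison
slow = optimal_breadth(N, t=1.0, k=1.0, c=0.5, strategy="exhaustive")
fast = optimal_breadth(N, t=0.25, k=1.0, c=0.5, strategy="exhaustive")
fast_st = optimal_breadth(N, t=0.25, k=1.0, c=0.5, strategy="self_terminating")
print(slow, fast, fast_st)
```

With k = 1 sec and c = .5 sec, the scan returns 4 alternatives for slow readers and 7 for fast readers under exhaustive search, and 10 for fast readers under self-terminating search, which agrees with the corresponding cells of Table 8.1.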
Paap and Roske-Hofstrand (1986) have extended the range of search strategies beyond exhaustive and self-terminating searches to include any proportion of items that need to be examined. They suggest that when searching well practiced menus or when the alternatives in a menu are organized into categories, the scope of the search may be substantially reduced. For a well practiced list, users may learn approximately where to look for an item. Consequently, they need to scan only a small portion of the total number of items. To account for this, Paap and Roske-Hofstrand introduce the parameter 1/f to indicate the proportion of items that need to be read before the user terminates the search. The expected number of items read is then:
E(n_i) = (n + 1)/f. 8.7
When f = 2, the scope is the whole list (the ordinary self-terminating search); when f = 3, the scope is reduced to two-thirds of the list; and when f = 4, it is reduced to one half. Adding scope to Equation 8.6 results in the following:
t(N) = {(log N) / (log n)} ({(n + 1)/f} t + k + c). 8.8
The optimal breadth is quite sensitive to 1/f. For reasonable values of reading time, key-press time, and computer response time, the optimal breadth may increase substantially beyond 8 alternatives per frame.
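The sensitivity of the optimal breadth to the scope parameter can be illustrated with a minimal sketch of Equation 8.8. The function names, the search limit of 200 alternatives, and the particular parameter values (t = .25, k = 1, c = .5) are assumptions chosen for illustration. The constant factor log N is omitted because it does not change which breadth minimizes the total time.

```python
import math

def scoped_cost(n: int, f: float, t: float, k: float, c: float) -> float:
    """Equation 8.8 without the constant factor log N."""
    reads = (n + 1) / f                      # Equation 8.7
    return (reads * t + k + c) / math.log(n)

def optimal_breadth(f, t, k, c, max_breadth=200):
    """Scan integer breadths 2..max_breadth and return the fastest one."""
    return min(range(2, max_breadth + 1),
               key=lambda n: scoped_cost(n, f, t, k, c))

# Optimal breadth for progressively narrower search scopes.
optima = {f: optimal_breadth(f, t=0.25, k=1.0, c=0.5) for f in (2, 3, 4, 8)}
for f, n in optima.items():
    print(f"f = {f}: optimal breadth = {n}")
```

Under these assumed values the optimum is 10 alternatives when f = 2 and 15 when f = 4, and it keeps growing as the scope narrows further.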

Table 8.1
Optimum Number of Alternatives per Menu Frame for Exhaustive and Self-Terminating Search Strategies (From Lee & MacGregor, 1985)

		Computer Response Time (c)
Key-Press Time (k)	Reading Time (t)	0.50	0.60	0.90	1.35
Exhaustive Search
0.50	0.25	6	6	6/7	7
	0.50	4	4	5	5/6
	1.00	4	4	4	4
	2.00	3	3	3	4

1.00	0.25	7	7	7/8	8
	0.50	5	5	5	6
	1.00	4	4	4	5
	2.00	3	3	4	4
Self-Terminating Search
0.50	0.25	8	8	9	11
	0.50	6	6	6/7	7
	1.00	4	4	5	5/6
	2.00	4	4	4	4

1.00	0.25	10	10	11	12/13
	0.50	7	7	7/8	8
	1.00	5	5	5	6
	2.00	4	4	4	5

Grouping of items in a menu would also be expected to reduce the scope of search. When items are placed into groups, the visual search process may be in two stages. First the user searches group labels until the desired group is located. Then he or she searches within that group until the desired item is located. The total number of items that need to be read given a two-stage self-terminating search would be:
E(I_i) = (g + 1)/2 + {(b/g) + 1}/2, 8.9
where g is the number of groups and b is the number of items in the frame (the breadth). Paap and Roske-Hofstrand introduce a simplifying assumption that the number of groups should be approximately the square root of the number of items. Then the total time to search for an item is given by:
t(N) = {(log N) / (log n)} ([(g+1)/2 + {(b/g)+1}/2 ] t + k + c). 8.10
When items are organized into groups such as this, the optimal breadth of the menu increases dramatically as shown in Table 8.2.
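The effect of grouping can be explored with a minimal sketch of Equations 8.9 and 8.10. It assumes that b is the number of items per frame, that g is set to exactly the square root of b (so it need not be a whole number), and that the search is limited to 200 items per frame; the function names are hypothetical. It is not claimed to reproduce every cell of Table 8.2, since the original computation may have handled rounding differently.

```python
import math

def grouped_reads(b: float, g: float) -> float:
    """Equation 8.9: expected reads in a two-stage self-terminating search."""
    return (g + 1) / 2 + (b / g + 1) / 2

def grouped_cost(b: int, t: float, k: float, c: float) -> float:
    """Equation 8.10 without the constant factor log N, with g = sqrt(b)."""
    g = math.sqrt(b)
    return (grouped_reads(b, g) * t + k + c) / math.log(b)

def ungrouped_cost(b: int, t: float, k: float, c: float) -> float:
    """Self-terminating search of an ungrouped frame, for comparison."""
    return ((b + 1) / 2 * t + k + c) / math.log(b)

def optimal_breadth(cost, t, k, c, max_breadth=200):
    """Scan integer breadths 2..max_breadth and return the fastest one."""
    return min(range(2, max_breadth + 1), key=lambda b: cost(b, t, k, c))

grouped = optimal_breadth(grouped_cost, t=0.25, k=0.5, c=0.5)
ungrouped = optimal_breadth(ungrouped_cost, t=0.25, k=0.5, c=0.5)
print(f"ungrouped optimum: {ungrouped}, grouped optimum: {grouped}")
```

For t = .25, k = .5, and c = .5 the ungrouped optimum is 8 items per frame, whereas the grouped optimum is several times larger, in the high thirties.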
MacGregor, Lee, and Lam (1986) acknowledge that for command menus which may access fewer than 100 items and which are well practiced by users, the optimum number of items per frame probably exceeds 8. However, they note that for videotext menus accessing databases in excess of 10,000 documents, it is unlikely that users will learn the location of menu items and restrict the scope of their search. They contend that in such cases, the decision process will be complicated and that users may not only engage in an exhaustive search but may also need to read items over again to compare one with another. Such a redundant search may result when the user has read all of the alternatives and several appear to be plausible choices. MacGregor et al. suggest that such redundant search processes may be frequent with broader menus that have more plausible options. Consequently, for videotext menus at least, one should restrict the number of items to about 5.

Table 8.2
Optimum Number of Alternatives per Frame for Grouped Menus Given a Self-Terminating Search Strategy (From Paap & Roske-Hofstrand, 1986)

		Computer Response Time (c)
Key-Press Time (k)	Reading Time (t)	0.50	0.60	0.90	1.35
0.50	0.25	38	41	49	63
	0.50	25	26	30	36
	1.00	19	20	22	24
	2.00	16	17	17	18

1.00	0.25	52	55	64	78
	0.50	32	33	37	43
	1.00	22	23	25	27
	2.00	18	18	19	20

However, it should be remembered that the linear model may not be at all appropriate. The time that it takes to read an item may not be constant either within menu frames or across menu levels. As the user begins to read items in a frame, he or she may start out slowly and speed up as the context of the choice becomes clearer. Furthermore, the familiarity of alternatives probably varies with the depth of the tree. Top level menus are seen more frequently and hence will be familiar to the user. Lower level menus may be seen rarely, or even for the first time. Finally, in writing menu items the length of alternatives may vary with the depth of the tree. Superordinate categories typically require longer descriptors. Specific items may require only brief phrases.
The linear model ignores decision time despite the fact that decision time is known to be a function of the number of alternatives. If decision time is a linear function of n_i, then the model still holds. However, the results of choice reaction time studies suggest that decision and key-press times are best described by a log model.

8.3 The Log Model
Under limited conditions u(n_i) can be estimated by the Hick-Hyman law for choice reaction time and Fitts' law for movement time. If selection is made by a choice among responses rather than a serial scan-and-match process, the Hick-Hyman law (Hyman, 1953; Welford, 1980) states that
dt = c + k log(n_i), 8.11
where c and k are constants and n_i is the number of equally likely alternatives at Level i of the tree. Fitts' law (Fitts, 1954) specifies the movement time to hit a target of width w from a distance d:
mt = c + k log(d/w). 8.12
These functions were tested in a study by Landauer and Nachbar (1985) who varied the number of alternatives per screen to select either a number or word out of 4096 possible. Responses were made on a touch screen. The physical width of the alternative was proportional to 1/n; and it was assumed that the distance was the same for all alternatives; consequently:
mt = c + k log(n). 8.13
Moreover, it was assumed that the decision time and movement time are additive components in response selection and that u(n_i) does not depend on depth; consequently:
t(n) = dt + mt = c + k log(n). 8.14
Landauer and Nachbar (1985) varied the breadth of a menu for locating the 4096 sequential numbers or alphabetized words across 2, 4, 8, and 16 alternatives per level. The exact experimental conditions of the study are important for understanding the results. On each trial, a goal number or word was presented in the middle of a main screen as well as on a second screen to the left. Participants initiated the search by touching anywhere on the main screen. The screen went blank for 1 second. After the delay, an auditory signal occurred and 5 to 33 alternating blue and red horizontal stripes from .5 to 3.5 inches wide appeared on the screen. For integers, ranges were shown by their extreme high and low values on the blue stripes. Participants were to choose the range that contained the goal by touching the stripe between the two values. For words, the procedure was the same except that words replaced numbers in the ranges according to their alphabetical order. Errors were not permitted. Participants had to select the correct bar, at which point a second auditory signal sounded and the stripe flashed white. Responses were timed from the onset signal to the success signal. After the range was chosen, the main screen went blank for 1 second and another set of ranges was presented, and so on until the final screen, on which the sequential integers or words themselves were shown rather than ranges. Goal numbers and words were chosen with the restriction that they never appeared as high or low range values. Eight participants served in all conditions in a counterbalanced order. Two sessions in each condition were given on separate days to observe changes with practice.
Figure 8.4 shows the mean response time per selection as a function of the number of alternatives. The lines show the predicted functions based on the Hick-Hyman and Fitts' laws, and the points indicate the observed values. It is clear that the log function fits the data quite well.

Figure 8.4, however, shows only the response time per choice. Total time to locate a target is given by the number of choices multiplied by the time per choice. For a symmetric tree, the number of choices necessary to locate a target is log_n N, where N is the total number of terminal items (4096). The response time per choice is given by Equation 8.14. Multiplying these two results in
u(N) = (log_n N)(c + k log(n)), or
u(N) = k(log N) + c(log_n N). 8.15
The left hand term does not depend on the tree structure, only on the number of terminal items. The right hand term gets smaller as n gets larger. The result is that increased breadth of the tree reduces the overall response time. The constant c, which indicates the time added per choice, determines the magnitude of the effect. When c is large, increased depth becomes more detrimental.
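The behavior of Equation 8.15 can be illustrated with a minimal sketch. The constants c = 1.0 and k = 0.5 seconds are hypothetical values chosen only to make the arithmetic easy to follow; they are not the values estimated by Landauer and Nachbar. Logarithms are taken to base 2 for the time per choice.

```python
import math

def total_time_log_model(N: int, n: int, c: float, k: float) -> float:
    """Number of choices (log_n N) times time per choice (c + k log2 n)."""
    choices = math.log(N) / math.log(n)
    return choices * (c + k * math.log2(n))

C, K = 1.0, 0.5   # hypothetical constants, in seconds
N = 4096          # number of terminal items in the Landauer and Nachbar task
times = {n: total_time_log_model(N, n, C, K) for n in (2, 4, 8, 16)}
for n, seconds in times.items():
    print(f"{n:2d} alternatives per level -> {seconds:.1f} sec total")
```

With these assumed constants the totals are 18, 12, 10, and 9 seconds for 2, 4, 8, and 16 alternatives per level. The structure-independent term k log N contributes 6 seconds in every case, and the remaining c log_n N term shrinks from 12 seconds to 3 seconds as breadth increases.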
The observed times in the Landauer and Nachbar study confirmed these predictions. Figure 8.5 shows the cumulative times for numbers and words for each degree of branching. The larger the degree of branching, the faster the total time to locate the target. Two further features of this graph are also interesting. First, response times increased slightly with depth. Landauer and Nachbar hypothesize that as the ranges become narrower the decisions become more difficult. Second, response time for the last decision is much faster than for the steps leading up to it. The authors note that this is probably because an identity match with the goal at the last level is faster than the order comparison required on earlier screens.

The results of Landauer and Nachbar are extremely important. They suggest that for certain menus, breadth should be increased to a practical maximum. However, their results may be confined to the laboratory conditions particular to their studies. Most hierarchical menu systems are not sequential numbers or alphabetized words. Furthermore, most menu alternatives are ill-defined categorical names rather than ordered ranges. It is quite possible that the logarithmic functions hypothesized for response times do not generalize to other sets of items. Studies using menu selection in hierarchical data bases shed light on the generality of these findings for the depth versus breadth issue.
Support for a log model also comes from studies on visual search. It is not necessarily the case that users scan the list of alternatives systematically from top to bottom. Instead they may randomly sample items. Card (1982) discusses a visual search model originally developed by Kendall and Wodinsky (1960). If the user is searching for a target item, there is a probability of finding the target on each of a number of saccades. If each saccade requires a fixed amount of time, the total time that it takes to locate the target is a logarithmic function of the number of alternatives.

8.4 Total User Response Time in Hierarchical Data Bases
Early evidence that increased depth of menus was detrimental came from a review of information retrieval studies. Tombaugh and McEwen (1982) concluded that when searching for information, users were very likely to choose menu items that did not lead to the desired information and that they tended to give up without locating the information on a high proportion of searches.
Observations of users working with a large network menu selection system indicated that often users became confused about where they were in the system. Robertson, McCracken, and Newell (1981) described the performance of operators using a menu selection system named ZOG as follows: "Users readily get lost in using ZOG. The user does not know where he is, how to get where he wants to go, or what to do; he feels lost and may take excessively long to respond. This happens in all sorts of nets, especially complex nets or nets without regular structure" (p. 483).
Finally, informal studies on menu driven teleterminals by Hagelbarger and Thompson (1983) indicated that users took progressively longer to respond as they progressed further down the tree. When users selected a wrong alternative at any point, they became confused and rather than backing up the tree one level to correct the error, they would return to the main menu.
Allen (1983) investigated the effect of menu depth on response times and error rates at each level of the tree. He found that response times at each level became longer for searches deeper in the tree. Subjects also made more errors when searching deeper in the tree.
In order to understand what was happening in hierarchical menus, a series of controlled laboratory studies investigated the effect of varying depth and breadth while holding the number of terminal nodes constant. Miller (1981) used a constant number of 64 items arranged in symmetric hierarchical menus of 2⁶, 4³, 8², and 64¹. The items were nouns and proper nouns in a semantic hierarchy formed by superset-subset associations. Figure 8.6 shows the words arranged in the 2⁶ condition and Figure 8.7 shows the 4³ condition. Selections were made on a response panel consisting of push buttons adjacent to rectangular viewing holes mounted directly over a CRT display. In the case of the 64¹ menu, buttons were above or below groups of eight words and subjects only indicated which group the word was in.

Subjects studied word hierarchy diagrams prior to being tested so that memory and choice uncertainty would be minimized. Each trial proceeded as follows: (a) a goal word was presented in the middle of the screen for 2 seconds; (b) the screen blanked for 1.5 seconds and then the first set of choices was presented; (c) when the subject responded, the screen was blanked for .5 seconds; and (d) a 4 second rest period occurred between trials. Total response time was recorded as the sum of the response times for each selection. System response time was not counted. Subjects were tested on four blocks of the 64 words. If a subject selected a wrong choice, the word "error" appeared and the trial was repeated at a later point. Subjects were encouraged to work "as quickly as possible without making errors."
The results are shown in Figure 8.8. Total response time was fastest for menus 4³ and 8² and slowest for menus 2⁶ and 64¹ as shown by the U-shaped curve in the figure. Subjects using menus 2⁶ or 64¹ took approximately twice as much total time to get to the target. On the other hand, as shown in the bottom line, response time per menu choice increased as a function of breadth. Subjects using menus 2⁶ and 4³ took about 1 second to respond; subjects using menu 8² took 1.3 seconds; and subjects using menu 64¹ took 5.3 seconds. Consequently, menu 4³ was faster than 2⁶ since response time per choice was about the same, but menu 2⁶ required twice as many choices. Broader menus required longer search times. Consequently, menu 64¹ was worse than 8² since the response time per choice for 64¹ was much more than twice as long as for menu 8².
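The U-shaped pattern follows directly from multiplying the number of choices by the approximate time per choice. The short sketch below reconstructs the totals from the rounded per-choice times quoted in the text; these are illustrative products, not the measured totals plotted in Figure 8.8, and the dictionary keys are simply labels for the four menu structures.

```python
# (number of levels, approximate seconds per choice) for each menu structure
miller = {
    "2^6": (6, 1.0),
    "4^3": (3, 1.0),
    "8^2": (2, 1.3),
    "64^1": (1, 5.3),
}

# Approximate total response time = levels x time per choice.
totals = {name: levels * per_choice
          for name, (levels, per_choice) in miller.items()}
for name, seconds in totals.items():
    print(f"{name:>4}: about {seconds:.1f} sec")

fastest = min(totals, key=totals.get)
```

The products are about 6.0, 3.0, 2.6, and 5.3 seconds, so the two intermediate structures take roughly half as long as the deepest and the broadest menus.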

Percent errors also indicated that the 8² menu resulted in superior performance, showing less than one percent errors. More errors occurred with menus 2⁶ and 4³, showing 7.6 and 6.6 percent errors, respectively. The 64¹ menu resulted in 2.9 percent errors.
Miller concludes that "a menu hierarchy of two levels was the fastest, produced the fewest errors, showed the least variability, and was the easiest to learn. If for some reason two levels and eight choices per level cannot satisfy system requirements, expansion in breadth is recommended over expansion in depth."
Unfortunately, several shortcomings in the design of Miller's experiment serve to invalidate the results for the 64¹ menu and undermine the generality of the conclusion. It turns out that in all of the menus the semantic categories remained intact throughout the hierarchy except for menu 64¹. The 64 items were presented in eight columns of eight items. The eight items were not from the same category but were drawn from four different categories. Consequently, performance was probably impaired due to the lack of categorical organization. Subjects had to search an essentially random ordering of words.
In order to test for this possibility, Snowberry, Parkinson, and Sisson (1983) replicated Miller's experiment and included a comparison between a categorical organization versus random display of the 64 items in the broadest menu. The same items were used. In addition the response requirement was changed so as to be the same across all conditions. Subjects had to enter a two digit response code to select an item. Response times were measured from the onset of the display to the entry of the first digit so that deep menus would not be penalized by the longer response times due to multiple key entry.
Snowberry et al.'s results for user response times are shown in Figure 8.9. The results replicate Miller's findings with the exception that the categorized display of 64 items is slightly superior to the 8² menu. Clearly, a random ordering of words results in search times that degrade performance well below that of the 8² menu, whereas a categorical ordering of the 64 words yields slightly better performance. Overall, the results indicate that search time is a decreasing linear function of log2 of the number of items up to at least 64. Results for accuracy of choice also support the superiority of broad menus. Snowberry et al. note that differences in response time and accuracy were not eliminated with practice. Consequently, broad menus can be expected to be superior for experienced users as well as novice users.

A potential problem with both Miller's and Snowberry et al.'s studies is that the difficulty of finding targets may not be equal across different hierarchical structures. Some categorical structures may be more natural and meaningful than others. To recategorize them reduces the semantic associations and increases the uncertainty of choice at the upper levels. Furthermore, certain category names may obscure their members in some menus. The higher error rates for the 2⁶ and 4³ conditions may actually indicate that the menus were not well matched. Aware of this problem, Snowberry et al. tested an additional group of subjects with specific instructions to avoid errors common in the original test. This group achieved substantially reduced errors with only a slightly slower response time. Half of this group continued for 128 trials and further reduced errors and increased speed. Nevertheless, both speed and accuracy were still below those achieved using broader menus. Although the different menu structures probably differed in terms of the "goodness" of their categorization of words, it would appear that such differences do not undermine the conclusions.
Kiger (1984) extended the research on depth vs. breadth by investigating a simulated database of information sources and commercial services on a videotext type system. Sixty-four services were clustered into five different tree structures: 2⁶, 4³, 8², 4x16, and 16x4. Figure 8.10 gives an example of the 4³ menu.

Twenty-two subjects searched for 16 targets in each of the 5 structures. The menu structure was randomly changed every two trials so that subjects would experience all of the structures across different trials.
Both rankings of preference and ratings of ease of use indicated that menus 8² and 4³ were most preferred and menus 16x4 and 2⁶ were least preferred. Total user response time was approximately the same for menus 4x16, 16x4 and 8² and significantly faster than for menus 4³ and 2⁶. Menu 2⁶ was significantly slower than all other structures. Finally, the menu 2⁶ resulted in the most errors. Kiger concludes that the 2⁶ menu is slowest, least accurate, hardest to use and least preferred. On the other hand, menu 8² appears superior.
Schultz and Curran (1986) have confirmed the advantage of menu breadth over depth. They compared performance on a one-level, full menu versus a three-level, paged menu of user-familiar functions on a prototype system. Items within menus were either alphabetically arranged or randomly ordered. Even when system response time was subtracted out, search times were 30% faster with the full menu (16.4 sec) than the paged menu (24.0 sec). Randomly ordered paged menus were particularly slow on the first block of trials. With practice, performance on the randomly ordered menus was at least as good as for the alphabetized menus. Schultz et al. note that menu structure and ordering are interactively related in early use of the system. The practical conclusion is that the number of pages in a menu system should be minimized and that items should be arranged in an orderly fashion on the page.
The evidence to this point indicates that where possible, designers should avoid depth and increase the breadth of choice. Certainly a deep binary tree is to be avoided. Menus with 8 alternatives are definitely preferable. Furthermore, early thinking that broad menus were to be avoided at all costs has been proven false. Broad menus of 64 items may indeed be superior to two levels of 8 alternatives. However, a number of other factors begin to emerge. Some menu structures may convey a better semantic categorization of the items.

8.5 Selection Time as a Function of Menu Depth
The time required to select among a set of alternatives depends not only on the number of alternatives, but also on how far the user has traversed down the menu hierarchy. It was noted by Landauer and Nachbar (1985) that selection times increase slightly with depth. In their particular task this finding was probably due to the increased difficulty in deciding whether or not an item was included in narrower and narrower ranges. Selection time for the last decision at the terminal node, on the other hand, was much faster due to an identity match of the target.
For videotext systems Kiger reports quite different results. Given a constant number of alternatives, selection time decreased with depth. Figure 8.11 shows selection time per frame as a function of depth for 5 menu structures. The 2⁶, 4³, and 8² structures show a steady decrease from Level 1 to the lowest level of the tree. The 16x4 menu also shows a decrease but this is due in part to the smaller number of alternatives at Level 2. The only reversal occurred for the 4x16 menu where a longer selection time at Level 2 was probably due to the larger number of alternatives at that level. Kiger notes that the longer selection time at Level 1 may be due to time required for (a) initial orientation to the menu, (b) think time planning a path, and (c) selection among general categories. Fast response times at the terminal menu frames may have been due to target matches.

8.6 Factors of System Speed and User Response Time
Most of the research on the trade-off between depth and breadth has investigated time as the primary measure of performance. As noted in Chapter 7, overall performance is a function of system performance and user performance. When the system is very slow, it ultimately determines the overall time. For example, when the transmission speed of the system is 30 cps, a typical menu of eight items on the Source(TM) takes approximately 8 seconds to display. Consider an 8³ menu with a total of 512 terminal items. If the user traverses the tree to a depth of three levels and takes only 3 seconds of selection time per level, the search would require a total of 33 seconds (three levels, each with 8 seconds of display plus 3 seconds of selection). On the other hand, without a hierarchical menu, if all 512 items were transmitted in one broad menu at 30 cps, it would require about 8.5 minutes just to display all of the items. Depth leads to a considerable savings!
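The arithmetic in this example can be checked with a few lines of Python. This is only a sketch of the calculation described above: the one-second-per-item display rate is inferred from the statement that eight items take about 8 seconds at 30 cps, and the function names are illustrative rather than taken from the chapter.

```python
# Worked arithmetic for the 30 cps example in the text.
# Assumption: display time scales linearly with the number of items,
# at about 1 second per item (8 items take about 8 seconds).
SECONDS_PER_ITEM = 8 / 8
USER_SECONDS_PER_LEVEL = 3  # selection time per level, as given in the text


def deep_menu_seconds(breadth, depth):
    """Time to display and select from `depth` frames of `breadth` items."""
    display = breadth * SECONDS_PER_ITEM
    return depth * (display + USER_SECONDS_PER_LEVEL)


def broad_menu_display_seconds(total_items):
    """Time just to display every item in one single broad frame."""
    return total_items * SECONDS_PER_ITEM


hierarchical = deep_menu_seconds(breadth=8, depth=3)  # 3 * (8 + 3) seconds
flat = broad_menu_display_seconds(8 ** 3)             # 512 items in one frame

print(f"8^3 menu: {hierarchical:.0f} seconds")
print(f"single 512-item menu: {flat / 60:.1f} minutes to display")
```

Note that the broad-menu figure covers display time only; any user search time in the 512-item frame would come on top of it.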
But what happens when the transmission and display rates are markedly improved? If transmission and display are essentially instantaneous, then it is user performance that determines speed. The total time that it takes to traverse the menu to the desired item is given by the following equation:
Total Time = Σ_i [s(n_i) + u(n_i)],   (8.16)
where s(n_i) is the system time for the transmission and display of n items at Level i and u(n_i) is the user time to select a response out of n items at Level i. Total time is summed across the number of levels. Total time may also be broken into the components due to the system and the user separately:
System Response Time = Σ_i s(n_i),   (8.17)
User Response Time = Σ_i u(n_i).   (8.18)
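Equations 8.16 to 8.18 translate directly into code. The sketch below assumes that a menu path is described by the number of items in the frame at each level, and that s and u are supplied as functions of that number. The particular s and u used in the example are hypothetical placeholders, not values from any of the studies cited.

```python
from typing import Callable, Sequence

TimeFn = Callable[[int], float]


def system_response_time(levels: Sequence[int], s: TimeFn) -> float:
    """Equation 8.17: system time summed over the frames on the path."""
    return sum(s(n) for n in levels)


def user_response_time(levels: Sequence[int], u: TimeFn) -> float:
    """Equation 8.18: user selection time summed over the frames."""
    return sum(u(n) for n in levels)


def total_time(levels: Sequence[int], s: TimeFn, u: TimeFn) -> float:
    """Equation 8.16: system plus user time, summed across levels."""
    return sum(s(n) + u(n) for n in levels)


# Hypothetical timing functions, for illustration only.
def example_s(n: int) -> float:
    return 0.5 * n          # seconds to transmit and display n items


def example_u(n: int) -> float:
    return 1.0 + 0.25 * n   # seconds to choose among n items


path = [16, 4]  # a 16x4 menu: 16 items at Level 1, 4 items at Level 2
print(system_response_time(path, example_s))
print(user_response_time(path, example_u))
print(total_time(path, example_s, example_u))
```

Setting s to a function that always returns zero gives the case of essentially instantaneous display, in which total time reduces to user response time.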
Sisson, Parkinson, & Snowberry (1986) calculated the communication times for 4 menu configurations with a constant number of items but varying in depth, for transmission speeds of 10 to 1920 cps. In addition, Sisson et al. added user times for search and response as empirically determined by Snowberry et al. (1983). The broad menu of all 64 items was superior for transmission speeds of 960 cps and faster. The 8² menu had the best time for speeds from 60 to 480 cps, and the 4³ menu had the best times for the slowest speeds of 10 to 30 cps. The deepest menu, 2⁶, was never optimal.
With faster transmission and display rates, it would seem that broader menus become more efficient, that is, if user search and choice times are not inordinately longer with broad menus.
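The way the best structure shifts with transmission speed can be illustrated with a small sweep. The sketch below is not a reproduction of the Sisson et al. calculations: the characters per item, the base selection time, and the logarithmic user-time model are all hypothetical values chosen only to show the kind of crossover that is possible, and different assumptions will move or remove the crossover points.

```python
import math

# Hypothetical parameters, chosen for illustration only.
CHARS_PER_ITEM = 16          # characters transmitted per menu item
USER_BASE_SECONDS = 1.0      # fixed cost of responding to one frame
USER_SECONDS_PER_BIT = 0.25  # extra cost per doubling of alternatives

# Four ways to arrange 64 terminal items: (items per frame, levels).
STRUCTURES = {"64": (64, 1), "8^2": (8, 2), "4^3": (4, 3), "2^6": (2, 6)}


def system_time(n, cps):
    """Seconds to transmit one frame of n items at `cps` characters/sec."""
    return n * CHARS_PER_ITEM / cps


def user_time(n):
    """Assumed logarithmic selection time for a frame of n items."""
    return USER_BASE_SECONDS + USER_SECONDS_PER_BIT * math.log2(n)


def total_time(breadth, depth, cps):
    """Total time over `depth` identical frames of `breadth` items."""
    return depth * (system_time(breadth, cps) + user_time(breadth))


def fastest_structure(cps):
    """Name of the structure with the smallest total time at this speed."""
    return min(STRUCTURES, key=lambda name: total_time(*STRUCTURES[name], cps))


for cps in (10, 30, 120, 480, 1920):
    print(cps, fastest_structure(cps))
```

Under these assumed values the deeper 4³ menu is fastest at the slowest speeds, the 8² menu at intermediate speeds, and the single 64-item menu at the fastest speed. Replacing the logarithmic user-time model with one that grows steeply with the number of alternatives removes the advantage of the broad menu, which is the caveat stated in the text.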
8.7 Summary
Given a set number of terminal nodes in a hierarchical menu, the designer faces the important trade-off issue of depth vs. breadth. Within the constraints of system response and screen display time, size of screen, and meaningful grouping of items, there is a certain amount of leeway as to whether to present long lists of alternatives or subdivide them into shorter groups. Although the literature is somewhat mixed as to the "optimal" number of alternatives per frame, several principles are clear.
For linearly organized lists such as numbers, alphabetized entries, letters of the alphabet, and months of the year, one should increase breadth to the maximum practical level. Visual search is optimized because the intact organization of the list aids the user, to the extent that user response time approximates a logarithmic function of the number of alternatives.
When there is no inherent linear ordering of alternatives, users may scan items sequentially. When this is the case, user response time may approximate a linear function and the optimal number of alternatives will fall between 3 and 12 depending on various user characteristics and system parameters. However, if multiple levels of the hierarchical menu can be displayed in an organized manner in one frame, response time can be reduced with much broader menus. The overriding principle is to provide organization, whether linear or hierarchical, to the user as a vehicle for visually locating target items.
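The contrast between the logarithmic and linear cases can be made concrete with a small calculation of user response time alone. The sketch below assumes a menu of 4096 terminal items with the same number of alternatives in every frame, and uses hypothetical coefficients for both response-time functions. The specific optimum it reports depends entirely on those assumed values.

```python
import math


def levels_needed(total_items, breadth):
    """Smallest depth at which breadth**depth reaches total_items."""
    depth, reach = 0, 1
    while reach < total_items:
        reach *= breadth
        depth += 1
    return depth


def linear_user_time(n, base=2.0, per_item=0.25):
    """Hypothetical sequential scan: time grows with each item read."""
    return base + per_item * n


def log_user_time(n, base=2.0, per_bit=0.25):
    """Hypothetical organized list: time grows with log2 of list size."""
    return base + per_bit * math.log2(n)


def best_breadth(total_items, per_frame_time, candidates):
    """Breadth that minimizes summed user time over all levels."""
    return min(
        candidates,
        key=lambda b: levels_needed(total_items, b) * per_frame_time(b),
    )


TOTAL_ITEMS = 4096
CANDIDATES = [2, 4, 8, 16, 64, 4096]

print(best_breadth(TOTAL_ITEMS, linear_user_time, CANDIDATES))  # 8
print(best_breadth(TOTAL_ITEMS, log_user_time, CANDIDATES))     # 4096
```

With the linear function, the fixed cost per frame penalizes depth and the per-item cost penalizes breadth, so an intermediate breadth wins. With the logarithmic function, the per-item component sums to the same amount for every structure, so only the fixed cost per frame differs and the broadest menu is always fastest.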
It may very well be that the depth vs. breadth trade-off issue is really misplaced and that the transcending issue is that of effectively revealing menu organization to users, while reducing the number of frames and responses required to locate target items. This issue will be developed further in the next chapter on search behavior in hierarchical menu trees.

12 Nov 2015
