An Analysis of Two (or Three) Models of Visual Attention Allocation
Michael D. Fleetwood (firstname.lastname@example.org) Thesis Advisor: Michael D. Byrne (email@example.com)
Department of Psychology,
Rice University, 6100 Main St., Houston, TX 77005 USA
influence when and where a person will look next. Moray (1986) identified a number of factors that, under the right conditions, will influence the monitoring strategy of observers. The research proposed here will explicitly investigate four of those factors: the rate at which exogenous uncertainty is generated by the monitored process; the probability that, while the observer is viewing one source, another may show a critical value; the payoff matrix associated with missing or detecting critical values; and the cost of making an observation. Under the scenario that we will be using, we would like to determine the extent to which these factors contribute to the attention allocation patterns of observers and to incorporate the factors into the models accordingly.
Because of the number and complexity of the computational models in this domain, only a subset deemed to be representative is examined here. From the visual sampling domain, the seminal model developed by Senders (1964, 1983) is examined because of its influence in the domain and its power in predicting human performance. The primary parameter of Senders’s model is the bandwidth, or information generation rate, of a signal. A more recent model, the SEEV model developed by Wickens and colleagues (2001), is a relatively simple model that has been shown to be an accurate predictor of visual attention by airline pilots. Each letter in the model’s acronym represents a parameter of the model (Salience of the information source; Effort to make an observation; Expectancy, or information rate of the source; Value of the source relative to others). Regarding general models of attention, the Information Foraging model (Pirolli & Card, 1999) has been evaluated as a model of general attention in a wide variety of contexts, and its potential as a model of visual sampling will be evaluated here.
Its primary parameters are the amount of information gained from a source, the effort required to examine a source, and the effort required to switch attention to another information source.
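To illustrate how a bandwidth-driven account turns signal properties into predictions, the following minimal sketch applies the sampling-theorem idea generally associated with Senders's (1964) model: a signal with bandwidth W Hz must be sampled at least 2W times per second to be reconstructed. The bandwidth values are hypothetical, the function names are ours, and the assumption that glances are distributed in proportion to the required sampling rates (with equal dwell durations) is a simplification made for illustration only.

```python
def required_sampling_rates(bandwidths_hz):
    """Minimum samples per second for each signal under the sampling theorem (2W)."""
    return [2.0 * w for w in bandwidths_hz]


def predicted_attention_shares(bandwidths_hz):
    """Share of glances per instrument, ASSUMING glances are proportional to 2W
    and that every glance has the same duration (a simplification)."""
    rates = required_sampling_rates(bandwidths_hz)
    total = sum(rates)
    return [r / total for r in rates]


# Hypothetical bandwidths (Hz) for four ammeters; not values from the proposed study.
bandwidths = [0.05, 0.10, 0.20, 0.40]
rates = required_sampling_rates(bandwidths)
shares = predicted_attention_shares(bandwidths)

for w, r, s in zip(bandwidths, rates, shares):
    print(f"W = {w:.2f} Hz   required rate = {r:.2f}/s   predicted share = {s:.3f}")
```

Under these assumptions the instrument with the highest bandwidth receives the largest share of glances, which is the qualitative pattern a bandwidth-only model predicts regardless of the value or effort associated with each instrument.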
This work is concerned with examining, in a formal quantitative manner, what human observers look at and what the objects of their gaze tell them. Researchers have developed a number of analytical models designed to describe and predict the allocation of human attention. The proposed research aims to compare and evaluate the predictions made by three such models and to refine the models further where possible. Many models have been designed to predict the allocation of human attention in the visual domain. One class of such models has focused on visual sampling or monitoring behavior in supervisory control tasks. These models use sampling or scanning as a dependent variable (e.g., Moray, 1986; Senders, 1964, 1983; Sheridan, 1970). In these models, the observer is not looking for a static target but is instead supervising a series of dynamic processes, such as temperature gauges or aircraft movements, and the key dependent variable is the proportion of visual attention distributed to various “areas of interest” (AOIs) as a function of the quantitative properties of those AOIs. Psychologists have also developed a number of models designed to describe and predict the general allocation of human attention. These general attention models do not deal with vision per se, but rather consider attention to be a resource and are concerned with how people allocate this resource. Allocation is usually measured in terms of the amount of time people devote to a task or source of information. Each model has its own set of advantages and disadvantages as a predictor of human performance. In the research proposed herein, we will examine the ability of a model of general attention to predict attention allocation in the visual sampling domain.
The Current Research
We propose to examine how the attention allocation predictions made by different types of models fare in a common task. The research has two primary objectives. The first is to determine how the model predictions differ from each other. More specifically, we will examine how models designed to predict the allocation of attention at a coarser grain of analysis fare at a perceptual level. The second objective is to make some refinements to the models where possible. One of the overarching goals of the line of research dedicated to modeling visual attention is to determine the factors that
The basic paradigm to be employed in our evaluation of the three models is similar to that developed by Senders and colleagues (Senders, 1964, 1983). Observers will be placed in front of a computer screen displaying four ammeters located in the four corners of the screen. The task of the observer will be to monitor the array of ammeters and to press a button whenever the pointer of any ammeter
enters a “danger zone.” The eye movements of participants will be the principal dependent measure recorded. In Senders’s study, the meters were driven by quasi-random forcing functions such that their movements appeared random to observers. Senders was primarily concerned with the bandwidth of the instrument as a determinant of sampling behavior, and thus manipulated only the bandwidth of the four instruments. As a result, the bandwidth of the instrument and the alarm frequency (how often the pointer enters the “danger zone”) were inextricably linked, i.e., the higher the bandwidth, the greater the alarm frequency. The system we will be using, developed by Alex Kirlik and Alex Kosorukoff at the University of Illinois, dissociates bandwidth and alarm frequency. Thus we will be able to determine the extent to which each variable drives sampling behavior. In addition to dissociating bandwidth and alarm frequency, two further manipulations will be performed. The first will be to associate a value with the detection of an alarm on each of the different dials. A different number of points will be associated with each dial, and the observer will earn points by correctly detecting an alarm on each dial. The task of the observer will be to earn the maximum number of points. The purpose of the manipulation is to determine the influence that the relative importance of an instrument (as determined by the number of points received for detecting an alarm) has on the observer’s attention allocation strategies. Although not explicitly addressed in Senders’s model, the SEEV model and the Information Foraging model both predict that the relative value of the information source will influence sampling behavior. The final manipulation will vary the effort required by observers to make an observation.
This will simply involve altering the distance between the ammeters, such that more or less effort will be required of observers to shift their attention from one dial to another. Again, this is not explicitly addressed in Senders’s model, but the SEEV model and the Information Foraging model both predict that the effort required to observe an information source will influence sampling behavior.
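As a hedged illustration of how the manipulated factors might enter a SEEV-style prediction, the sketch below scores each dial with an additive combination of salience, effort, expectancy, and value, and normalizes the scores into predicted shares of attention. The additive form, the unit weights, and the ordinal parameter values are assumptions made for illustration; the text above names the four parameters but does not give the model's equation, and all identifiers are hypothetical.

```python
def seev_scores(sources, weights):
    """Score each source as s*S - ef*EF + ex*EX + v*V (ASSUMED additive form)."""
    return {
        name: (weights["salience"] * p["salience"]
               - weights["effort"] * p["effort"]
               + weights["expectancy"] * p["expectancy"]
               + weights["value"] * p["value"])
        for name, p in sources.items()
    }


def seev_shares(sources, weights):
    """Normalize non-negative scores into predicted proportions of attention."""
    clipped = {name: max(score, 0.0)
               for name, score in seev_scores(sources, weights).items()}
    total = sum(clipped.values())
    if total == 0:
        # No source has a positive score: fall back to a uniform prediction.
        return {name: 1.0 / len(clipped) for name in clipped}
    return {name: score / total for name, score in clipped.items()}


# Hypothetical ordinal parameter values for four dials (not data from the study).
dials = {
    "upper_left":  {"salience": 1, "effort": 1, "expectancy": 1, "value": 1},
    "upper_right": {"salience": 1, "effort": 1, "expectancy": 2, "value": 1},
    "lower_left":  {"salience": 1, "effort": 2, "expectancy": 2, "value": 3},
    "lower_right": {"salience": 1, "effort": 3, "expectancy": 3, "value": 2},
}
unit_weights = {"salience": 1.0, "effort": 1.0, "expectancy": 1.0, "value": 1.0}

shares = seev_shares(dials, unit_weights)
for name, share in shares.items():
    print(f"{name:12s} predicted share = {share:.3f}")
```

In this sketch, raising a dial's point value or its expectancy increases its predicted share, while increasing the distance (effort) needed to reach it decreases the share, which mirrors the direction of the effects the manipulations are intended to test.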
one of these constraints, bandwidth. However, Senders’s model does not account for the additional experimental manipulations of the relative value of the instruments and the effort of observation. We expect the Information Foraging model to show consistent and possibly even improved performance in predicting human behavior as the additional experimental manipulations are performed. Each of these considerations is accounted for in the model, and the power of the model to account for complex data may be limited in the earlier manipulations. Regarding the SEEV model, we expect it to maintain a consistent level of behavior prediction. Each of the manipulations is accounted for in the model; however, the model has traditionally been evaluated at a relatively coarse grain of analysis (the proportion of time spent observing different AOIs), and its accuracy may suffer when it is extrapolated to more specific measures, such as the average dwell time. Regarding possible refinements to the models, there is a clear opportunity to refine the original Senders model by incorporating the parameters of alarm frequency, relative value, and effort of observation. Indeed, Senders anticipated these concerns with his original model and has proposed several additional models that may account for them (Senders, 1983). To our knowledge, these newer Senders models have not been evaluated experimentally, and the proposed research will provide an opportunity to do so. The SEEV model incorporates each of the proposed manipulations as parameters, and the proposed research may involve supplementing the basic algorithm in order to account for finer-grained inputs. Regarding the Information Foraging model, it has not been evaluated as a model of perception, and the proposed research will evaluate its adequacy in this domain.
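The coarse- and fine-grained measures discussed here (proportion of time per AOI, average dwell time, glance frequency, and transitions between instruments) can all be derived from an ordered record of dwells. The following is a minimal sketch under stated assumptions: the input format (a list of AOI label and duration pairs, one per glance), the AOI labels, and the function name are hypothetical, and glance frequency is expressed here as glances per second of total viewing time.

```python
from collections import Counter, defaultdict


def summarize_dwells(dwells):
    """Summarize an ordered list of (aoi, duration_s) dwells, one entry per glance."""
    if not dwells:
        raise ValueError("at least one dwell is required")

    total_time = sum(duration for _, duration in dwells)
    time_per_aoi = defaultdict(float)
    glances_per_aoi = Counter()
    for aoi, duration in dwells:
        time_per_aoi[aoi] += duration
        glances_per_aoi[aoi] += 1

    # Count transitions between consecutive dwells on different AOIs.
    transitions = Counter()
    for (src, _), (dst, _) in zip(dwells, dwells[1:]):
        if src != dst:
            transitions[(src, dst)] += 1

    return {
        "proportion_time": {a: t / total_time for a, t in time_per_aoi.items()},
        "mean_dwell_s": {a: time_per_aoi[a] / glances_per_aoi[a] for a in time_per_aoi},
        "glances_per_s": {a: n / total_time for a, n in glances_per_aoi.items()},
        "transitions": dict(transitions),
    }


# Hypothetical dwell record: AOI label and dwell duration in seconds.
example = [("A", 0.5), ("B", 0.25), ("A", 0.75), ("C", 0.5)]
summary = summarize_dwells(example)
for measure, values in summary.items():
    print(measure, values)
```

Keeping all four measures in one pass over the same record makes it straightforward to compare a model that predicts only proportions of time against one that also makes claims about dwell durations or scan paths.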
Moray, N. (1986). Monitoring behavior and supervisory control. In K. R. Boff, L. Kaufman, & J. P. Thomas (Eds.), Handbook of perception and human performance, Vol. II (pp. 40-1 to 40-51). New York: Wiley & Sons.
Pirolli, P., & Card, S. (1999). Information foraging. Psychological Review, 106, 643-675.
Senders, J. W. (1964). The human operator as a monitor and controller of multidegree of freedom systems. IEEE Transactions on Human Factors in Electronics, HFE-5, 2-6.
Senders, J. W. (1983). Visual scanning processes. Netherlands: University of Tilburg Press.
Sheridan, T. B. (1970). On how often the supervisor should sample. IEEE Transactions on Systems Science and Cybernetics, SSC-6(2), 140-145.
Wickens, C. D., Helleberg, J., Goh, J., Xu, X., & Horrey, W. J. (2001). Pilot task management: Testing an attentional expected value model of visual scanning (ARL-01-14/NASA-01-7). Savoy, IL: University of Illinois, Aviation Research Lab.
The aggregate aspects of the eye tracking data that will be evaluated for comparison with the models are: the proportion of time spent observing each instrument, the average dwell time per instrument, the frequency with which each instrument is observed, and the transition matrix of glances from instrument to instrument. With respect to the first stated goal of the research, comparing the predictions made by the three models, we expect the Senders model to make relatively less accurate predictions as the task becomes more complex. In the first proposed experiment, only two factors determine how accurately the model predicts human behavior: bandwidth and alarm frequency. The Senders model has been shown to predict sampling behavior well given at least