9

Summary, Further Discussions and Conclusions

Visual attention is of importance in both biological and engineering areas. The goal of most research on visual attention related to this book is to construct a model that can simulate the visual attention mechanism in human or animal brains and to derive an algorithm for engineering applications. Different kinds of visual attention models or algorithms have been proposed during the past few decades in both biological and engineering areas. However, the focal points of the studies in the two areas are different.

Biologists and psychologists are more interested in understanding the human perceptual capability. Studies in anatomy and physiology have clarified much of the human visual system, especially the information processing from the retina to the primary visual cortex (area V1); however, many issues remain controversial or unclear, such as how the brain searches for required objects in the visual field and how it performs visual attention. Physiologists and psychologists have suggested hypotheses and theories to predict visual attention in the brain, and these have been validated in many biological experiments that measure cell activity in animal brains and in psychological experiments that present man-made visual paradigms to human observers. Feature integration theory (FIT) [1], guided search (GS) [2, 3] and synchronized oscillation theory [4], mentioned earlier in this book, are consistent with biological visual attention.

Scientists and engineers in the computer vision and signal processing areas direct more of their efforts towards the efficiency and effectiveness of the information processing of the human visual system. Their motivation is to build computational models that make use of biological and/or psychological findings, or the associated theory, in their applications. Thus, these computational models have explicit inputs, computational equations and algorithms, and can be implemented in software or hardware. It is worth noting that visual attention models built on pure computer vision algorithms are not considered in great detail in this book. Some pure computer vision (engineering) methods of visual attention have been mentioned because of their applications in computer vision, for example large object detection (Sections 7.2.2 and 7.2.3) and image retrieval (Section 7.4.2), but they are not based on psychological experiments and do not follow biological rules. In this book, we have emphasized the computational models that lie in the intersection of the biological and engineering categories.

The literature of the past decades contains a wide variety of computational models based upon both biological and engineering aspects. Recently, Tsotsos in his book [5] organized the computational models in the aforementioned intersection area into four classes: selective routing models, temporal tagging models, emergent attention models and saliency map models. The selective routing model, later named the selective tuning (ST) model in his book, is composed of multilayer neurons with feed-forward and feedback routing, and aims to determine which neurons and routings best represent the input, given a task to be performed. His book mainly presents the selective tuning models. The temporal tagging model considers the relationship among neural oscillations based on synchronized firing activity, and in emergent attention modelling the knowledge is represented in the form of cell populations. These three classes of model operate at the neuronal level, so they are more biologically plausible in structure and make it easier to add a top-down bias to one or a few neurons; however, their computational complexity is high and they are not easy to combine with the existing theory in statistical signal processing, because statistical formulae do not work at the neuron level.

Saliency map models based on blocks have the merits of simplicity and of being convenient to combine with other methods in statistical signal processing and computer vision. In this book, we concentrate on saliency map models and their applications. Temporal tagging and an emergent attention model, as different top-down types, are introduced in Sections 5.1 and 5.6 (see Section 9.1 below for a summary).

Typical saliency map models have been presented in this book because they are important to research students, scientists and engineers working in the relevant areas. In addition, this book is aimed at beginners who are just entering related research fields, and it can also serve as a bridge for communication between engineers and biologists.

In this final chapter, we first summarize the content of the whole book and highlight the connections between chapters and sections; we then discuss several critical issues of visual attention from both biological and engineering perspectives; finally, we present some conclusions.
