Preface

Humans perceive the outside world through information obtained from five sense organs (ears, eyes, nose, tongue and skin), and human behaviour results from the information processed in the brain. The human brain is the product of evolution through a long process of natural selection and survival, in the course of which the brain, through interaction with the external world and with other species, has evolved into a comprehensive information processing system. The human brain is the most complex and ingenious system that we know of, and no artificial system can compare with it in terms of information processing. The study of the human brain is therefore extremely challenging. Of all the information processing subsystems in the brain, the visual processing system plays the most important role because more than 70% of outside information comes from the visual sense. Thus, the human visual system (HVS) has been researched biologically more than any other information processing system in the brain, and this has resulted in an independent branch of research. However, before the middle of the twentieth century, most of the research on the HVS was based on qualitative observations and experiments, rather than theoretical or quantitative studies.

On the other hand, researchers in physics, mathematics and information science have long hoped to build a machine that simulates the complex visual processing functions of the brain. They are interested in the achievements of brain research on the biological side of the HVS, and have tried to understand how information processing works in the HVS in order to create a brain-like system for engineering applications. The development of artificial intelligence and artificial neural networks for visual pattern recognition is a typical example of simulating brain functions. Researchers with physics and mathematics backgrounds have helped turn qualitative biological studies into quantitative or theoretical ones. Computational neuroscience, biophysics and biomathematics have been developed to simulate brain function at the neuronal level and to describe it with mathematical equations; the aim is to build computational models that fit the recorded data of brain cells. One influential work on visual computational theory was the book Vision, published by Marr in the 1980s, which applied mathematics and physics to visual processing. Although some of the contents and points of view in that book no longer seem correct, its influence in both biological and engineering areas continues to this day. Since then, a good number of models of quantitative visual computing have been suggested and developed.

Selective visual attention is a common behaviour of humans and animals when they observe objects in the visual field, and it has long attracted research in physiology, psychology, neuroscience, biophysics, biomathematics, information science and computer science. Biologists have explored the mechanism of visual attention by observing and analysing experimental data, asking, for example, which parts of the brain are involved in visual attention and how the different visual areas are connected when visual attention occurs. Computational neuroscientists have built computational models that simulate the structure and processing of the HVS and fit the experimental data of psychology and physiology. These computational models can validate the mechanism of visual attention qualitatively. Also, engineers and information scientists have explored the computational ability to simulate human vision and tackled the tough issues of computer vision and image processing. These experts have also contributed by building computational models that incorporate engineering theories and methodologies. In other words, some mechanisms that are unclear to those studying the brain have been replaced by information processing methods through the use of engineering models. These applications may in turn inspire and help biologists to explore and understand the functions of the brain further.

As mentioned above, visual attention in the HVS is related to multiple disciplines, so research that relies on a single discipline is difficult. What is more, research on visual attention covers a large span, from pure biology based on observations and qualitative experiments, through the building of theoretical models and the formulation of quantitative methods, to practical models that combine other methods for more immediate engineering applications. Thus, visual attention modelling is an interdisciplinary field that needs cooperation from experts working in different areas. Obviously, this is not easy since there are large knowledge gaps among the different disciplines. For example, a biologist cannot express a problem in the nomenclature used by an expert in information science, and vice versa. Furthermore, the investigation strategies and backgrounds of the disciplines differ, which makes it difficult to interpret the findings of different disciplines and for the disciplines to be complementary. More importantly, knowledge in some disciplines (such as biology, physiology and psychology) is often obtained with a single stimulus (or a few stimuli), while a practical model usually needs to deal simultaneously with a huge number of stimuli.

This book is mainly targeted at researchers, engineers and students of physics, mathematics, information science and computer science who are interested in visual attention and the HVS. It is not easy for colleagues in these disciplines to learn the biological nomenclature, research strategies and implications of findings when reading the books and papers scattered in the literature for the relevant parts of biology and psychology. The purpose of this book therefore is to provide a communication bridge.

Organization of this Book

The development of visual attention studies has had three phases: biological studies, computational models and then their applications. This book therefore follows these three phases as its three major parts, followed by a summary chapter.

Part I includes two chapters that give the fundamental concepts, experimental facts and theories of biology and psychology, as well as some principles of information science.

To be more specific, the first two chapters of this book introduce the related background including the biological concepts of visual attention, the experimental results of physiology and psychology, the anatomical structure of the HVS and then some important theories in visual attention. In addition, the relevant theories of statistical signal processing are briefly presented.

In Part II, some typical visual attention computational models, related to the concepts and theories presented in Part I, are introduced in Chapters 3, 4 and 5. A large number of computational models have been built in the past few decades. There are two extreme categories of models: (1) purely biological models, which simulate the structure of the anatomy and fit the recorded data on cells at the neuronal level; (2) pure computer vision models, which are not based on psychological experiments and do not follow biological rules. Biological models are too complex to be used in applications and, more crucially, they do not capture higher level perception well (obviously perception is not only about cells at the neuronal level), so they cannot tackle practical problems effectively. On the other hand, pure computer vision models lack biological or psychological grounding, and they are not our main emphasis here, since visual attention is closely related to biology and psychology. Therefore, the two extreme categories of models will not be considered as the core in this book; instead, we mainly concern ourselves with the computational models that have a related biological basis and are effective for applications. Chapters 3 and 4 present bottom-up computational models in the spatial and frequency domains, respectively, and Chapter 5 introduces top-down computational models. Chapter 6 presents databases and methods for benchmark testing different computational models. The performance estimation and benchmarking of computational models discussed there will provide the means for testing new models, comparing different models or selecting appropriate models in practice.

In this book several typical saliency-map computational models for both bottom-up and top-down processing are presented. Each model has a biological basis, or its computational results are consistent (at least partly) with biological facts. Bottom-up models in the frequency domain are presented in more detail in a separate chapter (Chapter 4) since they usually have higher computing speed and more easily meet the real-time requirements of engineering applications.

Chapters 7 and 8, in Part III, demonstrate several application examples in two important areas: computer vision and image processing. Overall, this book provides many case studies on how to solve various problems based on both scientific principles and practical requirements.

The summary in Chapter 9 provides the connection between chapters and sections, several controversial issues in visual attention, suggestions for possible future work and some final overall conclusions.

Readers who are interested in visual attention, the HVS and the building of new computational models should read Parts I and II, in order to learn how to build the relevant computational models corresponding to biological/psychological facts and how to test and compare one model with others. We suggest that readers who want to use computational visual attention models in their applications should read Parts II and III, since several different types of computational models (with some computer code as references) can be found in Part II, while the way to apply visual attention models in different projects is explained in Part III. Readers who hope to do further research on visual attention and its modelling might also read Chapter 9, where some controversial issues in both biology and information science are discussed for further exploration. Of course, readers can select particular chapters or sections for more careful reading according to their requirements, and we suggest that they read the summary in Chapter 9 and especially look at Figures 9.1 and 9.2 for an overview of all the techniques presented in this book.

Acknowledgements

Finally, we wish to express our gratitude to the many people who, in one way or another, have helped with the process of writing this book. Firstly, we are grateful to the many visual attention researchers in different disciplines, because their original contributions form the foundation of visual attention and its modelling, and therefore make this book possible. We are grateful to John K. Tsotsos, Minhoo Lee, Delian Wang, Nevrez Imamoglu and Manish Narwaria, who provided suggestions or checked some sections or chapters of the book. We appreciate the help of Anmin Liu and Yuming Fang for the inclusion of their research work and for proofreading the related chapters. Anmin Liu also assisted us by obtaining permission to use some figures in the book from the original authors or publishers. We would like to thank the students and staff members of the research groups in the School of Information Science and Technology, Fudan University, China, and the School of Computer Engineering, Nanyang Technological University, Singapore, for their research work in modelling and applications of visual attention and for drawing some figures in this book. The related research has been supported by the National Science Foundation of China (Grant 61071134) and the Singapore Ministry of Education Academic Research Fund (AcRF) Tier 2 (Grant T208B1218).

We are particularly grateful to the Editors: James Murphy, who helped to initiate this project, and Clarissa Lim and Shelley Chow, who looked through our manuscript and provided useful comments and feedback. Shelley Chow was always patient and supportive, notwithstanding our underestimates of the time and effort required to complete this book.

Liming Zhang
Weisi Lin
