Preface

System identification is a widely used method for building a mathematical model of a physical plant from measured data, and it arises in many practical engineering situations. In general, system identification involves three key elements: the data, the model, and the criterion. The goal of identification is then to choose, from a set of candidate models, the one that best fits the data according to a given criterion. The criterion function is thus a key factor in system identification: it evaluates how well a model matches the actual plant and, in general, serves as the objective function for developing identification algorithms. Identification performance, including convergence speed, steady-state accuracy, robustness, and computational complexity, depends directly on the criterion function.

Well-known identification criteria include the least squares (LS) criterion, the minimum mean square error (MMSE) criterion, and the maximum likelihood (ML) criterion. These criteria provide successful engineering solutions to most practical problems and remain prevalent in system identification today. However, they have shortcomings that limit their general use. For example, LS and MMSE consider only the second-order moments of the error, and identification performance can degrade when the data are non-Gaussian (e.g., multimodal, heavy-tailed, or of finite range). The ML criterion requires knowledge of the conditional probability density function of the observed samples, which is unavailable in many practical situations; in addition, the computational complexity of ML estimation is usually high. Thus, seeking new criteria beyond second-order statistics and the likelihood function is attractive in system identification problems.
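The sensitivity of second-order criteria to non-Gaussian noise can be illustrated with a small numerical sketch (our own illustration, not taken from the book): ordinary least squares identifies a hypothetical two-tap linear plant accurately under Gaussian noise, but its estimate degrades markedly when noise of the same scale is drawn from a heavy-tailed (Cauchy) distribution. The plant weights and noise scale below are arbitrary choices for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical two-tap plant to be identified.
w_true = np.array([1.0, -0.5])

N = 500
X = rng.normal(size=(N, 2))                      # input regressors

# Same linear plant, two noise distributions with the same scale parameter.
d_gauss = X @ w_true + 0.1 * rng.normal(size=N)            # Gaussian noise
d_cauchy = X @ w_true + 0.1 * rng.standard_cauchy(size=N)  # heavy-tailed noise

# Least squares estimates under each noise model.
w_g, *_ = np.linalg.lstsq(X, d_gauss, rcond=None)
w_c, *_ = np.linalg.lstsq(X, d_cauchy, rcond=None)

print("LS error, Gaussian noise:", np.linalg.norm(w_g - w_true))
print("LS error, Cauchy noise:  ", np.linalg.norm(w_c - w_true))
```

Under Gaussian noise the LS estimate is near-optimal, while under Cauchy noise a few outliers dominate the squared-error objective and pull the estimate away from the true weights.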

In recent years, criteria based on information theoretic descriptors of entropy and dissimilarity (divergence, mutual information) have attracted considerable attention and become an emerging area of study in signal processing and machine learning. Information theoretic criteria (or briefly, information criteria) can capture higher order statistics and the information content of signals rather than simply their energy. Many studies suggest that information criteria do not suffer from the limitations of the Gaussian assumption and can improve performance in many realistic scenarios. Combined with nonparametric estimators of entropy and divergence, many adaptive identification algorithms have been developed, including practical gradient-based batch or recursive algorithms, fixed-point algorithms (which require no step size), and other advanced search algorithms. Although many elegant results and techniques have emerged over the past few years, to date no book has been devoted to a systematic study of system identification under information theoretic criteria. The primary focus of this book is to provide an overview of these developments, with emphasis on nonparametric estimators of information criteria and gradient-based identification algorithms. Most of the contents of this book originally appeared in the authors' recent papers.
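To make one such approach concrete, the following minimal sketch (our own illustration, with hypothetical parameter choices for the plant weights, kernel width sigma, and step size mu) performs batch gradient ascent on the empirical information potential of the errors, i.e., the Parzen-window estimate underlying Renyi's quadratic error entropy, to identify a two-tap FIR plant under heavy-tailed noise.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-tap FIR plant and heavy-tailed (Student-t) measurement noise.
w_true = np.array([1.0, -0.5])
N = 200
X = rng.normal(size=(N, 2))
d = X @ w_true + 0.1 * rng.standard_t(df=1.5, size=N)

def mee_fit(X, d, sigma=1.0, mu=0.2, iters=300):
    """Batch gradient ascent on the empirical information potential
    V(w) = (1/N^2) sum_ij exp(-(e_i - e_j)^2 / (2 sigma^2)),
    whose maximization is equivalent to minimizing Renyi's
    quadratic entropy of the errors e_i = d_i - w^T x_i."""
    N, m = X.shape
    w = np.zeros(m)
    dX = X[:, None, :] - X[None, :, :]           # pairwise regressor differences
    for _ in range(iters):
        e = d - X @ w
        de = e[:, None] - e[None, :]             # pairwise error differences
        k = np.exp(-de**2 / (2 * sigma**2))      # Gaussian kernel values
        # Gradient of V with respect to w (ascent direction).
        grad = np.einsum('ij,ijk->k', k * de, dX) / (N**2 * sigma**2)
        w = w + mu * grad
    return w

w_hat = mee_fit(X, d)
print("MEE estimate:", w_hat)
```

Because the Gaussian kernel downweights large pairwise error differences, the resulting estimate is far less affected by noise outliers than a squared-error fit would be.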

The book is divided into six chapters: the first chapter introduces information theoretic criteria and the state-of-the-art techniques; the second chapter presents the definitions and properties of several important information measures; the third chapter gives an overview of information theoretic approaches to parameter estimation; the fourth chapter discusses system identification under the minimum error entropy criterion; the fifth chapter focuses on minimum information divergence criteria; and the sixth chapter turns to mutual information-based criteria.

It is worth noting that information criteria can be used not only for system parameter identification but also for system structure identification (e.g., model selection). Akaike's information criterion (AIC) and the minimum description length (MDL) are two famous information criteria for model selection. Several books already cover AIC and MDL, so we do not discuss them in detail here. Although most of the methods in this book are developed specifically for system parameter identification, the basic principles behind them are universal. Some of the methods, with little modification, can be applied to blind source separation, independent component analysis, time series prediction, classification, and pattern recognition.

This book will be of interest to graduate students, professionals, and researchers who wish to improve the performance of traditional identification algorithms or to explore new approaches to system identification, as well as to those interested in adaptive filtering, neural networks, kernel methods, and online machine learning.

The authors are grateful to the National Natural Science Foundation of China and the National Basic Research Program of China (973 Program), which funded this book. We are also grateful to Elsevier for their patience over the past year while we worked on this book, and we acknowledge the support and encouragement of our colleagues and friends.

Xi’an

P.R. China

March 2013
