References

1. Zadeh LA. From circuit theory to system theory. Proc IRE. 1962;50(5):856–865.

2. Eykhoff P. System Identification—Parameter and State Estimation. London: John Wiley & Sons, Inc.; 1974.

3. Ljung L. Convergence analysis of parametric identification methods. IEEE Trans Automat Control. 1978;23:770–783.

4. Ljung L. System Identification: Theory for the User. second ed. Upper Saddle River, New Jersey: Prentice Hall PTR; 1999.

5. Zhang P. Model selection via multifold cross validation. Ann Stat. 1993;299–313.

6. H. Akaike, Information theory and an extension of the maximum likelihood principle, Proceedings of the Second International Symposium on Information Theory, 1973, pp. 267–281.

7. Akaike H. A new look at the statistical model identification. IEEE Trans Automat Control. 1974;19(6):716–723.

8. Schwarz G. Estimating the dimension of a model. Ann Stat. 1978;6:461–464.

9. Rissanen J. Modeling by shortest data description. Automatica. 1978;14:465–471.

10. Barron A, Rissanen J, Yu B. The minimum description length principle in coding and modeling. IEEE Trans Inf Theory. 1998;44(6):2743–2760.

11. Rojas CR, Welsh JS, Goodwin GC, Feuer A. Robust optimal experiment design for system identification. Automatica. 2007;43(6):993–1008.

12. Liu W, Principe J, Haykin S. Kernel Adaptive Filtering: A Comprehensive Introduction. Wiley; 2010.

13. Wolberg J. Data Analysis Using the Method of Least Squares: Extracting the Most Information from Experiments. Berlin, Germany: Springer; 2006.

14. Kailath T, Sayed AH, Hassibi B. Linear Estimation. New Jersey: Prentice Hall; 2000.

15. Aldrich J. R.A. Fisher and the making of maximum likelihood 1912–1922. Stat Sci. 1997;12(3):162–176.

16. Hald A. On the history of maximum likelihood in relation to inverse probability and least squares. Stat Sci. 1999;14(2):214–222.

17. Tikhonov AN, Arsenin VY. Solutions of Ill-Posed Problems. Washington: Winston & Sons; 1977.

18. Widrow B, Stearns SD. Adaptive Signal Processing. Englewood Cliffs, NJ: Prentice Hall; 1985.

19. Haykin S. Adaptive Filter Theory. Englewood Cliffs, NJ: Prentice Hall; 2002.

20. Haykin SS, Widrow B, eds. Least-Mean-Square Adaptive Filters. New York: Wiley; 2003.

21. Sherman S. Non-mean-square error criteria. IRE Trans Inf Theory. 1958;4:125–126.

22. Brown JL. Asymmetric non-mean-square error criteria. IRE Trans Automat Control. 1962;7:64–66.

23. Zakai M. General error criteria. IEEE Trans Inf Theory. 1964;10(1):94–95.

24. Hall EB, Wise GL. On optimal estimation with respect to a large family of cost functions. IEEE Trans Inf Theory. 1991;37(3):691–693.

25. Ljung L, Soderstrom T. Theory and Practice of Recursive Identification. Cambridge, MA: MIT Press; 1983.

26. Walach E, Widrow B. The least mean fourth (LMF) adaptive algorithm and its family. IEEE Trans Inf Theory. 1984;30(2):275–283.

27. Walach E. On high-order error criteria for system identification. IEEE Trans Acoust. 1985;33(6):1634–1635.

28. Douglas SC, Meng THY. Stochastic gradient adaptation under general error criteria. IEEE Trans Signal Process. 1994;42:1335–1351.

29. Al-Naffouri TY, Sayed AH. Adaptive filters with error nonlinearities: mean-square analysis and optimum design. EURASIP J Appl Signal Process. 2001;4:192–205.

30. Pei SC, Tseng CC. Least mean p-power error criterion for adaptive FIR filter. IEEE J Sel Areas Commun. 1994;12(9):1540–1547.

31. Shao M, Nikias CL. Signal processing with fractional lower order moments: stable processes and their applications. Proc IEEE. 1993;81(7):986–1009.

32. Nikias CL, Shao M. Signal Processing with Alpha-Stable Distributions and Applications. New York: Wiley; 1995.

33. Rousseeuw PJ, Leroy AM. Robust Regression and Outlier Detection. New York: John Wiley & Sons, Inc.; 1987.

34. Chambers JA, Tanrikulu O, Constantinides AG. Least mean mixed-norm adaptive filtering. Electron Lett. 1994;30(19):1574–1575.

35. Tanrikulu O, Chambers JA. Convergence and steady-state properties of the least-mean mixed-norm (LMMN) adaptive algorithm. IEE Proc Vis Image Signal Process. 1996;143(3):137–142.

36. Chambers J, Avlonitis A. A robust mixed-norm adaptive filter algorithm. IEEE Signal Process Lett. 1997;4(2):46–48.

37. Boel RK, James MR, Petersen IR. Robustness and risk-sensitive filtering. IEEE Trans Automat Control. 2002;47(3):451–461.

38. Lo JT, Wanner T. Existence and uniqueness of risk-sensitive estimates. IEEE Trans Automat Control. 2002;47(11):1945–1948.

39. Delopoulos AN, Giannakis GB. Strongly consistent identification algorithms and noise insensitive MSE criteria. IEEE Trans Signal Process. 1992;40(8):1955–1970.

40. C.Y. Chi, W.T. Chen, Linear prediction based on higher order statistics by a new criterion, Proceedings of the Sixth IEEE SP Workshop on Statistical Signal and Array Processing, 1992.

41. Chi CY, Chang WJ, Feng CC. A new algorithm for the design of linear prediction error filters using cumulant-based MSE criteria. IEEE Trans Signal Process. 1994;42(10):2876–2880.

42. Feng CC, Chi CY. Design of Wiener filters using a cumulant based MSE criterion. Signal Process. 1996;54:23–48.

43. Cover TM, Thomas JA. Elements of Information Theory. Chichester: John Wiley & Sons, Inc.; 1991.

44. Shannon CE. A mathematical theory of communication. Bell Syst Tech J. 1948;27:379–423, 623–656.

45. J.P. Burg, Maximum entropy spectral analysis, Proceedings of the Thirty-Seventh Annual International Meeting of the Society of Exploration Geophysicists, Oklahoma City, OK, 1967.

46. Lagunas MA, Santamaria ME, Figueiras AR. ARMA model maximum entropy power spectral estimation. IEEE Trans Acoust. 1984;32:984–990.

47. Ihara S. Maximum entropy spectral analysis and ARMA processes. IEEE Trans Inf Theory. 1984;30:377–380.

48. Kay SM. Modern Spectral Estimation: Theory and Application. Englewood Cliffs, NJ: Prentice Hall; 1988.

49. Linsker R. Self-organization in perceptual networks. Computer. 1988;21:105–117.

50. Linsker R. How to generate ordered maps by maximizing the mutual information between input and output signals. Neural Comput. 1989;1:402–411.

51. R. Linsker, Deriving receptive fields using an optimal encoding criterion, in: S.J. Hanson (Ed.), Proceedings of Advances in Neural Information Processing Systems, 1993, pp. 953–960.

52. Deco G, Obradovic D. An Information-Theoretic Approach to Neural Computing. New York: Springer-Verlag; 1996.

53. Haykin S. Neural Networks: A Comprehensive Foundation. Englewood Cliffs, NJ: Prentice Hall, Inc.; 1999.

54. Comon P. Independent component analysis, a new concept? Signal Process. 1994;36(3):287–314.

55. Lee TW, Girolami M, Sejnowski T. Independent component analysis using an extended infomax algorithm for mixed sub-Gaussian and super-Gaussian sources. Neural Comput. 1999;11(2):409–433.

56. Lee TW, Girolami M, Bell AJ. A unifying information-theoretic framework for independent component analysis. Comput Math Appl. 2000;39(11):1–21.

57. Erdogmus D, Hild II KE, Rao YN, Principe JC. Minimax mutual information approach for independent component analysis. Neural Comput. 2004;16:1235–1252.

58. Cardoso JF. Infomax and maximum likelihood for blind source separation. IEEE Signal Process Lett. 1997;4:109–111.

59. Yang HH, Amari SI. Adaptive online learning algorithms for blind separation: maximum entropy minimum mutual information. Neural Comput. 1997;9:1457–1482.

60. Pham DT. Mutual information approach to blind separation of stationary sources. IEEE Trans Inf Theory. 2002;48(7):1–12.

61. Zadeh MB, Jutten C. A general approach for mutual information minimization and its application to blind source separation. Signal Process. 2005;85:975–995.

62. Principe JC, Xu D, Fisher JW. Information theoretic learning. In: Haykin S, ed. Unsupervised Adaptive Filtering. New York: Wiley; 2000.

63. Principe JC, Xu D, Zhao Q, et al. Learning from examples with information theoretic criteria. J VLSI Signal Process Syst. 2000;26:61–77.

64. Principe JC. Information Theoretic Learning: Renyi's Entropy and Kernel Perspectives. New York: Springer; 2010.

65. Fisher JW. Nonlinear Extensions to the Minimum Average Correlation Energy Filter. USA: University of Florida; 1997.

66. Xu D. Energy, Entropy and Information Potential for Neural Computation. USA: University of Florida; 1999.

67. Erdogmus D. Information Theoretic Learning: Renyi's Entropy and Its Applications to Adaptive System Training. USA: University of Florida; 2002.

68. Erdogmus D, Principe JC. From linear adaptive filtering to nonlinear information processing. IEEE Signal Process Mag. 2006;23(6):15–33.

69. Zaborszky J. An information theory viewpoint for the general identification problem. IEEE Trans Automat Control. 1966;11(1):130–131.

70. Van Trees HL. Detection, Estimation, and Modulation Theory, Part I. New York: John Wiley & Sons; 1968.

71. Snyder DL, Rhodes IB. Filtering and control performance bounds with implications on asymptotic separation. Automatica. 1972;8:747–753.

72. Galdos JI. A Cramer–Rao bound for multidimensional discrete-time dynamical systems. IEEE Trans Automat Control. 1980;25:117–119.

73. Friedlander B, Francos J. On the accuracy of estimating the parameters of a regular stationary process. IEEE Trans Inf Theory. 1996;42(4):1202–1211.

74. Stoica P, Marzetta TL. Parameter estimation problems with singular information matrices. IEEE Trans Signal Process. 2001;49(1):87–90.

75. Seidman LP. Performance limitations and error calculations for parameter estimation. Proc IEEE. 1970;58:644–652.

76. Zakai M, Ziv J. Lower and upper bounds on the optimal filtering error of certain diffusion processes. IEEE Trans Inf Theory. 1972;18(3):325–331.

77. Galdos JI. A rate distortion theory lower bound on desired function filtering error. IEEE Trans Inf Theory. 1981;27:366–368.

78. Washburn RB, Teneketzis D. Rate distortion lower bound for a special class of nonlinear estimation problems. Syst Control Lett. 1989;12:281–286.

79. Weidemann HL, Stear EB. Entropy analysis of estimating systems. IEEE Trans Inf Theory. 1970;16(3):264–270.

80. Duncan TE. On the calculation of mutual information. SIAM J Appl Math. 1970;19:215–220.

81. Guo D, Shamai S, Verdu S. Mutual information and minimum mean-square error in Gaussian Channels. IEEE Trans Inf Theory. 2005;51(4):1261–1282.

82. Zakai M. On mutual information, likelihood-ratios and estimation error for the additive Gaussian channel. IEEE Trans Inf Theory. 2005;51(9):3017–3024.

83. Binia J. Divergence and minimum mean-square error in continuous-time additive white Gaussian noise channels. IEEE Trans Inf Theory. 2006;52(3):1160–1163.

84. T.E. Duncan, B. Pasik-Duncan, Estimation and mutual information, Proceedings of the Forty-Sixth IEEE Conference on Decision and Control, New Orleans, LA, USA, 2007, pp. 324–327.

85. Verdu S. Mismatched estimation and relative entropy. IEEE Trans Inf Theory. 2010;56(8):3712–3720.

86. Weidemann HL, Stear EB. Entropy analysis of parameter estimation. Inf Control. 1969;14:493–506.

87. Tomita Y, Ohmatsu S, Soeda T. An application of the information theory to estimation problems. Inf Control. 1976;32:101–111.

88. Kalata P, Priemer R. Linear prediction, filtering, and smoothing: an information theoretic approach. Inf Sci (Ny). 1979;17:1–14.

89. Minamide N. An extension of the entropy theorem for parameter estimation. Inf Control. 1982;53(1):81–90.

90. Janzura M, Koski T, Otahal A. Minimum entropy of error principle in estimation. Inf Sci (Ny). 1994;79:123–144.

91. Chen TL, Geman S. On the minimum entropy of a mixture of unimodal and symmetric distributions. IEEE Trans Inf Theory. 2008;54(7):3166–3174.

92. Chen B, Zhu Y, Hu J, Zhang M. On optimal estimations with minimum error entropy criterion. J Franklin Inst. 2010;347(2):545–558.

93. Chen B, Zhu Y, Hu J, Zhang M. A new interpretation on the MMSE as a robust MEE criterion. Signal Process. 2010;90(12):3313–3316.

94. Chen B, Principe JC. Some further results on the minimum error entropy estimation. Entropy. 2012;14(5):966–977.

95. Chen B, Principe JC. On the smoothed minimum error entropy criterion. Entropy. 2012;14(11):2311–2323.

96. Parzen E. On estimation of a probability density function and mode. In: Time Series Analysis Papers. San Diego, CA: Holden-Day, Inc.; 1967.

97. Silverman BW. Density Estimation for Statistics and Data Analysis. NY: Chapman & Hall; 1986.

98. Devroye L, Lugosi G. Combinatorial Methods in Density Estimation. New York: Springer-Verlag; 2000.

99. Santamaria I, Erdogmus D, Principe JC. Entropy minimization for supervised digital communications channel equalization. IEEE Trans. Signal Process. 2002;50(5):1184–1192.

100. Erdogmus D, Principe JC. An error-entropy minimization algorithm for supervised training of nonlinear adaptive systems. IEEE Trans. Signal Process. 2002;50(7):1780–1786.

101. Erdogmus D, Principe JC. Generalized information potential criterion for adaptive system training. IEEE Trans Neural Netw. 2002;13:1035–1044.

102. Erdogmus D, Principe JC. Convergence properties and data efficiency of the minimum error entropy criterion in Adaline training. IEEE Trans. Signal Process. 2003;51:1966–1978.

103. Erdogmus D, Hild II KE, Principe JC. Online entropy manipulation: stochastic information gradient. IEEE Signal Process Lett. 2003;10:242–245.

104. Morejon RA, Principe JC. Advanced search algorithms for information-theoretic learning with kernel-based estimators. IEEE Trans Neural Netw. 2004;15(4):874–884.

105. Han S, Rao S, Erdogmus D, Jeong KH, Principe JC. A minimum-error entropy criterion with self-adjusting step-size (MEE-SAS). Signal Process. 2007;87:2733–2745.

106. Chen B, Zhu Y, Hu J. Mean-square convergence analysis of ADALINE training with minimum error entropy criterion. IEEE Trans Neural Netw. 2010;21(7):1168–1179.

107. Guo LZ, Billings SA, Zhu DQ. An extended orthogonal forward regression algorithm for system identification using entropy. Int J Control. 2008;81(4):690–699.

108. Kullback S. Information Theory and Statistics. New York: John Wiley & Sons; 1959.

109. Kulhavý R. A Kullback–Leibler distance approach to system identification. Annu Rev Control. 1996;20:119–130.

110. Matsuoka T, Ulrych TJ. Information theory measures with application to model identification. IEEE Trans Acoust. 1986;34(3):511–517.

111. Cavanaugh JE. A large-sample model selection criterion based on Kullback’s symmetric divergence. Stat Probab Lett. 1999;42:333–343.

112. Seghouane AK, Bekara M. A small sample model selection criterion based on the Kullback symmetric divergence. IEEE Trans. Signal Process. 2004;52(12):3314–3323.

113. Seghouane AK, Amari SI. The AIC criterion and symmetrizing the Kullback–Leibler divergence. IEEE Trans Neural Netw. 2007;18(1):97–106.

114. Seghouane AK. Asymptotic bootstrap corrections of AIC for linear regression models. Signal Process. 2010;90(1):217–224.

115. Baram Y, Sandell NR. An information theoretic approach to dynamic systems modeling and identification. IEEE Trans Automat Control. 1978;23(1):61–66.

116. Baram Y, Beeri Y. Stochastic model simplification. IEEE Trans Automat Control. 1981;26(2):379–390.

117. Tugnait JK. Continuous-time stochastic model simplification. IEEE Trans Automat Control. 1982;27(4):993–996.

118. Leland R. Reduced-order models and controllers for continuous-time stochastic systems: an information theory approach. IEEE Trans Automat Control. 1999;44(9):1714–1719.

119. Leland R. An approximate-predictor approach to reduced-order models and controllers for distributed-parameter systems. IEEE Trans Automat Control. 1999;44(3):623–627.

120. Chen B, Hu J, Zhu Y, Sun Z. Parameter identifiability with Kullback–Leibler information divergence criterion. Int. J. Adapt. Control Signal Process. 2009;23(10):940–960.

121. Weinstein E, Feder M, Oppenheim AV. Sequential algorithms for parameter estimation based on the Kullback–Leibler information measure. IEEE Trans Acoust. 1990;38(9):1652–1654.

122. Krishnamurthy V. Online estimation of dynamic shock-error models based on the Kullback–Leibler information measure. IEEE Trans Automat Control. 1994;39(5):1129–1135.

123. Stoorvogel AA, Van Schuppen JH. Approximation problems with the divergence criterion for Gaussian variables and Gaussian process. Syst Control Lett. 1998;35:207–218.

124. Stoorvogel AA, Van Schuppen JH. System identification with information theoretic criteria. In: Bittanti S, Picci G, eds. Identification, Adaptation, Learning. Berlin: Springer; 1996.

125. Pu L, Hu J, Chen B. Information theoretical approach to identification of hybrid systems. In: Hybrid Systems: Computation and Control. Berlin, Heidelberg: Springer; 2008:650–653.

126. Chen B, Zhu Y, Hu J, Sun Z. Adaptive filtering under minimum information divergence criterion. Int J Control Autom Syst. 2009;7(2):157–164.

127. Chandra SA, Taniguchi M. Minimum α-divergence estimation for ARCH models. J Time Series Anal. 2006;27(1):19–39.

128. Pardo MC. Estimation of parameters for a mixture of normal distributions on the basis of the Cressie and Read divergence. Commun Stat-Simul C. 1999;28(1):115–130.

129. Cressie N, Pardo L. Minimum φ-divergence estimator and hierarchical testing in loglinear models. Stat Sin. 2000;10:867–884.

130. Pardo L. Statistical Inference Based on Divergence Measures. Boca Raton, FL: Chapman & Hall/CRC; 2006.

131. Feng X, Loparo KA, Fang Y. Optimal state estimation for stochastic systems: an information theoretic approach. IEEE Trans Automat Control. 1997;42(6):771–785.

132. Mustafa D, Glover K. Minimum Entropy H∞ Control. Lecture Notes in Control and Information Sciences, vol. 146. Berlin: Springer-Verlag; 1990.

133. Yang J-M, Sakai H. A robust ICA-based adaptive filter algorithm for system identification. IEEE Trans Circuits Syst Express Briefs. 2008;55(12):1259–1263.

134. Durgaryan IS, Pashchenko FF. Identification of objects by the maximal information criterion. Autom Remote Control. 2001;62(7):1104–1114.

135. Chen B, Hu J, Li H, Sun Z. Adaptive filtering under maximum mutual information criterion. Neurocomputing. 2008;71(16):3680–3684.

136. Chen B, Zhu Y, Hu J, Príncipe JC. Stochastic gradient identification of Wiener system with maximum mutual information criterion. IET Signal Process. 2011;5(6):589–597.

137. Liu W, Pokharel PP, Principe JC. Correntropy: properties and applications in non-Gaussian signal processing. IEEE Trans. Signal Process. 2007;55(11):5286–5298.

138. A. Singh, J.C. Principe, Using correntropy as a cost function in linear adaptive filters, in: International Joint Conference on Neural Networks (IJCNN’09), IEEE, 2009, pp. 2950–2955.

139. S. Zhao, B. Chen, J.C. Principe, Kernel adaptive filtering with maximum correntropy criterion, in: The 2011 International Joint Conference on Neural Networks (IJCNN), IEEE, 2011, pp. 2012–2017.

140. Bessa RJ, Miranda V, Gama J. Entropy and correntropy against minimum square error in offline and online three-day ahead wind power forecasting. IEEE Trans Power Systems. 2009;24(4):1657–1666.

141. J.W. Xu, D. Erdogmus, J.C. Principe, Minimizing Fisher information of the error in supervised adaptive filter training, in: Proceedings of the ICASSP, 2004, pp. 513–516.

142. Sakamoto Y, Ishiguro M, Kitagawa G. Akaike Information Criterion Statistics. Dordrecht, Netherlands: Reidel Publishing Company; 1986.

143. Burnham KP, Anderson DR. Model Selection and Multimodel Inference: A Practical Information Theoretic Approach. second ed. New York: Springer-Verlag; 2002.

144. Grunwald PD. The Minimum Description Length Principle. Cambridge, MA: MIT Press; 2007.

145. N. Tishby, F.C. Pereira, W. Bialek, The information bottleneck method. arXiv preprint physics/0004057, 2000.

146. Kolmogorov AN. Three approaches to the quantitative definition of information. Probl Inform Transm. 1965;1:4–7.

147. Johnson O. Information Theory and the Central Limit Theorem. London: Imperial College Press; 2004.

148. Jaynes ET. Information theory and statistical mechanics. Phys Rev. 1957;106:620–630.

149. Kapur JN, Kesavan HK. Entropy Optimization Principles with Applications. Academic Press, Inc.; 1992.

150. Ormoneit D, White H. An efficient algorithm to compute maximum entropy densities. Econom Rev. 1999;18(2):127–140.

151. Wu X. Calculation of maximum entropy densities with application to income distribution. J Econom. 2003;115(2):347–354.

152. A. Renyi, On measures of entropy and information. Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, 1961, pp. 547–561.

153. Havrda J, Charvat F. Concept of structural α-entropy. Kybernetika. 1967;3:30–35.

154. Varma RS. Generalizations of Renyi's entropy of order α. J Math Sci. 1966;1:34–48.

155. Arimoto S. Information-theoretic considerations on estimation problems. Inf Control. 1971;19:181–194.

156. Salicru M, Menendez ML, Morales D, Pardo L. Asymptotic distribution of (h, φ)-entropies. Commun Stat Theory Methods. 1993;22:2015–2031.

157. Rao M, Chen Y, Vemuri BC, Wang F. Cumulative residual entropy: a new measure of information. IEEE Trans Inf Theory. 2004;50(6):1220–1228.

158. Zografos K, Nadarajah S. Survival exponential entropies. IEEE Trans Inf Theory. 2005;51(3):1239–1246.

159. Chen B, Zhu P, Príncipe JC. Survival information potential: a new criterion for adaptive system training. IEEE Trans. Signal Process. 2012;60(3):1184–1194.

160. Mercer J. Functions of positive and negative type, and their connection with the theory of integral equations. Philos Trans R Soc London. 1909;209:415–446.

161. Vapnik V. The Nature of Statistical Learning Theory. New York: Springer; 1995.

162. Scholkopf B, Smola AJ. Learning with Kernels: Support Vector Machines, Regularization, Optimization and Beyond. Cambridge, MA, USA: MIT Press; 2002.

163. Whittle P. The analysis of multiple stationary time series. J R Stat Soc B. 1953;15(1):125–139.

164. Dempster AP, Laird NM, Rubin DB. Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc B. 1977;39(1):1–38.

165. McLachlan GJ, Krishnan T. The EM Algorithm and Extensions. NJ: Wiley-Interscience; 2008.

166. Aiazzi B, Alparone L, Baronti S. Estimation based on entropy matching for generalized Gaussian PDF modeling. IEEE Signal Process Lett. 1999;6(6):138–140.

167. Pham DT. Entropy of a variable slightly contaminated with another. IEEE Signal Process Lett. 2005;12:536–539.

168. Chen B, Hu J, Zhu Y, Sun Z. Information theoretic interpretation of error criteria. Acta Automatica Sin. 2009;35(10):1302–1309.

169. Chen B, Príncipe JC. Maximum correntropy estimation is a smoothed MAP estimation. IEEE Signal Process Lett. 2012;19(8):491–494.

170. Rubinstein RY. Simulation and the Monte Carlo Method. New York: Wiley; 1981.

171. Styblinski MA, Tang TS. Experiments in nonconvex optimization: stochastic approximation with function smoothing and simulated annealing. Neural Netw. 1990;3:467–483.

172. Edmonson W, Srinivasan K, Wang C, Principe J. A global least mean square algorithm for adaptive IIR filtering. IEEE Trans Circuits Syst. 1998;45:379–384.

173. Goodwin GC, Payne RL. Dynamic System Identification: Experiment Design and Data Analysis. New York: Academic Press; 1977.

174. Moore E. On properly positive Hermitian matrices. Bull Amer Math Soc. 1916;23(59):66–67.

175. Aronszajn N. The theory of reproducing kernels and their applications. Cambridge Philos Soc Proc. 1943;39:133–153.

176. Engel Y, Mannor S, Meir R. The kernel recursive least-squares algorithm. IEEE Trans. Signal Process. 2004;52:2275–2285.

177. Liu W, Pokharel P, Principe J. The kernel least mean square algorithm. IEEE Trans. Signal Process. 2008;56:543–554.

178. Liu W, Principe J. Kernel affine projection algorithm. EURASIP J Adv Signal Process. 2008; Article ID 784292, 12 pages. doi:10.1155/2008/784292.

179. Platt J. A resource-allocating network for function interpolation. Neural Comput. 1991;3:213–225.

180. Richard C, Bermudez JCM, Honeine P. Online prediction of time series data with kernels. IEEE Trans. Signal Process. 2009;57:1058–1066.

181. Liu W, Park Il, Principe JC. An information theoretic approach of designing sparse kernel adaptive filters. IEEE Trans Neural Netw. 2009;20:1950–1961.

182. Chen B, Zhao S, Zhu P, Principe JC. Quantized kernel least mean square algorithm. IEEE Trans Neural Netw Learn Syst. 2012;23(1):22–32.

183. Beirlant J, Dudewicz EJ, Gyorfi L, van der Meulen EC. Nonparametric entropy estimation: an overview. Int J Math Statist Sci. 1997;6(1):17–39.

184. Vasicek O. A test for normality based on sample entropy. J Roy Statist Soc B. 1976;38(1):54–59.

185. Singh A, Príncipe JC. Information theoretic learning with adaptive kernels. Signal Process. 2011;91(2):203–213.

186. D. Erdogmus, J.C. Principe, S.-P. Kim, J.C. Sanchez, A recursive Renyi’s entropy estimator, in: Proceedings of the Twelfth IEEE Workshop on Neural Networks for Signal Processing, 2002, pp. 209–217.

187. Wu X, Stengos T. Partially adaptive estimation via the maximum entropy densities. J Econom. 2005;8:352–366.

188. Chen B, Zhu Y, Hu J, Zhang M. Stochastic information gradient algorithm based on maximum entropy density estimation. ICIC Exp Lett. 2010;4(3):1141–1145.

189. Zhu Y, Chen B, Hu J. Adaptive filtering with adaptive p-power error criterion. Int J Innov Comput Inf Control. 2011;7(4):1725–1738.

190. Chen B, Principe JC, Hu J, Zhu Y. Stochastic information gradient algorithm with generalized Gaussian distribution model. J Circuit Syst Comput. 2012;21.

191. Varanasi MK, Aazhang B. Parametric generalized Gaussian density estimation. J Acoust Soc Amer. 1989;86(4):1404–1415.

192. Kokkinakis K, Nandi AK. Exponent parameter estimation for generalized Gaussian probability density functions with application to speech modeling. Signal Process. 2005;85:1852–1858.

193. S. Han, J.C. Principe, A fixed-point minimum error entropy algorithm, in: Proceedings of the Sixteenth IEEE Signal Processing Society Workshop on Machine Learning for Signal Processing, 2006, pp.167–172.

194. S. Han, A family of minimum Renyi’s error entropy algorithm for information processing, Doctoral dissertation, University of Florida, 2007.

195. Chen S, Billings SA, Grant PM. Recursive hybrid algorithm for non-linear system identification using radial basis function networks. Int J Control. 1992;55:1051–1070.

196. Sayed A. Fundamentals of Adaptive Filtering. New York: Wiley; 2003.

197. Clark GA, Mitra SK, Parker SR. Block implementation of adaptive digital filters. IEEE Trans Acoust Speech Signal Process. 1981;ASSP-29(3):744–752.

198. Bershad NJ, Bonnet M. Saturation effects in LMS adaptive echo cancellation for binary data. IEEE Trans Acoust Speech Signal Process. 1990;38(10):1687–1696.

199. T.Y. Al-Naffouri, A. Zerguine, M. Bettayeb, Convergence analysis of the LMS algorithm with a general error nonlinearity and an iid input, in: Proceedings of the Asilomar Conference on Signals, Systems, and Computers, vol. 1, 1998, pp. 556–559.

200. Chen B, Hu J, Pu L, Sun Z. Stochastic gradient algorithm under (h,φ)-entropy criterion. Circuits Syst. Signal Process. 2007;26(6):941–960.

201. Gibson JD, Gray SD. MVSE adaptive filtering subject to a constraint on MSE. IEEE Trans Circuits Syst. 1988;35(5):603–608.

202. Rao BLSP. Asymptotic Theory of Statistical Inference. New York: Wiley; 1987.

203. Kaplan D, Glass L. Understanding Nonlinear Dynamics. New York: Springer-Verlag; 1995.

204. J.M. Kuo, Nonlinear dynamic modeling with artificial neural networks, Ph.D. dissertation, University of Florida, Gainesville, 1993.

205. Luenberger DG. Linear and Nonlinear Programming. Reading, MA: Addison-Wesley; 1973.

206. Wang LY, Zhang JF, Yin GG. System identification using binary sensors. IEEE Trans Automat Control. 2003;48(11):1892–1907.

207. Harvey AC, Fernandez C. Time series for count data or qualitative observations. J Bus Econ Stat. 1989;7:407–417.

208. Al-Osh M, Alzaid A. First order integer-valued autoregressive INAR(1) process. J Time Series Anal. 1987;8(3):261–275.

209. Brannas K, Hall A. Estimation in integer-valued moving average models. Appl Stoch Model Bus Ind. 2001;17(3):277–291.

210. Weiss CH. Thinning operations for modeling time series of counts—a survey. AStA Adv Stat Anal. 2008;92(3):319–341.

211. Chen B, Zhu Y, Hu J, Principe JC. Δ-Entropy: definition, properties and applications in system identification with quantized data. Inf Sci. 2011;181(7):1384–1402.

212. Janzura M, Koski T, Otahal A. Minimum entropy of error estimation for discrete random variables. IEEE Trans Inf Theory. 1996;42(4):1193–1201.

213. Silva LM, Felgueiras CS, Alexandre LA, Marques J. Error entropy in classification problems: a univariate data analysis. Neural Comput. 2006;18:2036–2061.

214. Ozertem U, Uysal I, Erdogmus D. Continuously differentiable sample-spacing entropy estimation. IEEE Trans Neural Netw. 2008;19:1978–1984.

215. Larranaga P, Lozano JA. Estimation of Distribution Algorithms: A New Tool for Evolutionary Computation. Boston: Kluwer Academic Publishers; 2002.

216. Rothenberg TJ. Identification in parametric models. Econometrica. 1971;39:577–591.

217. Grewal MS, Glover K. Identifiability of linear and nonlinear dynamic systems. IEEE Trans Automat Control. 1976;21(6):833–837.

218. Tse E, Anton JJ. On the identifiability of parameters. IEEE Trans Automat Control. 1972;17(5):637–646.

219. Glover K, Willems JC. Parameterizations of linear dynamical systems: canonical forms and identifiability. IEEE Trans Automat Control. 1974;19(6):640–646.

220. van der Vaart AW. Asymptotic Statistics. Cambridge Series in Statistical and Probabilistic Mathematics. New York: Cambridge University Press; 1998.

221. Doob JL. Stochastic Processes. New York: John Wiley; 1953.

222. van de Geer SA. Empirical Processes in M-Estimation. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge: Cambridge University Press; 2000.

223. Barron AR, Gyorfi L, van der Meulen EC. Distribution estimation consistent in total variation and in two types of information divergence. IEEE Trans Inf Theory. 1992;38(5):1437–1454.

224. Papoulis A, Pillai SU. Probability, Random Variables, and Stochastic Processes. fourth ed. New York: McGraw-Hill Companies, Inc.; 2002.

225. Karny M. Towards fully probabilistic control design. Automatica. 1996;32(12):1719–1722.

226. Karny M, Guy TV. Fully probabilistic control design. Syst Control Lett. 2006;55:259–265.

227. Wang H. Robust control of the output probability density functions for multivariable stochastic systems with guaranteed stability. IEEE Trans Automat Control. 1999;44(11):2103–2107.

228. Wang H. Bounded Dynamic Stochastic Systems: Modeling and Control. New York: Springer-Verlag; 2000.

229. Wang H, Yue H. A rational spline model approximation and control of output probability density functions for dynamic stochastic systems. Trans Inst Meas Control. 2003;25(2):93–105.

230. Sala-Alvarez J, Vázquez-Grau G. Statistical reference criteria for adaptive signal processing in digital communications. IEEE Trans. Signal Process. 1997;45(1):14–31.

231. Meyer ME, Gokhale DV. Kullback–Leibler information measure for studying convergence rates of densities and distributions. IEEE Trans Inf Theory. 1993;39(4):1401–1404.

232. R. Vidal, B. Anderson, Recursive identification of switched ARX hybrid models: exponential convergence and persistence of excitation, Proceedings of the Forty-Third IEEE Conference on Decision and Control (CDC), 2004.

233. C.-A. Lai, Global optimization algorithms for adaptive infinite impulse response filters, Ph.D. Dissertation, University of Florida, 2002.

234. B. Chen, J. Hu, H. Li, Z. Sun, Adaptive FIR filtering under minimum error/input information criterion, The Seventeenth IFAC World Congress, Seoul, Korea, July 2008, pp. 3539–3543.

235. Kailath T, Hassibi B. Linear Estimation. NJ: Prentice Hall; 2000.

236. Hyvarinen A, Karhunen J, Oja E. Independent Component Analysis. New York: Wiley; 2001.

237. Cardoso JF, Laheld BH. Equivariant adaptive source separation. IEEE Trans. Signal Process. 1996;44(12):3017–3030.

238. Schetzen M. The Volterra and Wiener Theories of Nonlinear Systems. New York: Wiley; 1980.

239. Bershad NJ, Celka P, Vesin JM. Stochastic analysis of gradient adaptive identification of nonlinear systems with memory for Gaussian data and noisy input and output measurements. IEEE Trans. Signal Process. 1999;47:675–689.

240. Celka P, Bershad NJ, Vesin JM. Stochastic gradient identification of polynomial Wiener systems: analysis and application. IEEE Trans. Signal Process. 2001;49:301–313.
