Adaptive Convex Combination Filter under Minimum Error Entropy Criterion

Siyuan Peng¹, Zongze Wu², Yajing Zhou², Badong Chen³
¹ Electronic and Information Engineering, South China University of Technology, Guangzhou, China
² Institute of Automation and Radio Engineering, Guangzhou University of Technology, Guangzhou, China
³ Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University, Xi'an, China
E-mail: [email protected], [email protected]

Abstract—Minimum error entropy (MEE) is a robust adaptation criterion that has been successfully applied to adaptive filtering, and it can outperform the well-known minimum mean square error (MSE) criterion, especially in the presence of non-Gaussian noise. However, adaptive algorithms under MEE are still subject to a compromise between convergence speed and steady-state mean square deviation (MSD). To address this issue, we propose in this paper an adaptive convex combination filter under MEE (CMEE), which is derived by using a convex combination of two MEE-based adaptive algorithms with different step-sizes. Monte Carlo simulation results confirm that the new algorithm achieves fast convergence while maintaining desirable steady-state performance.

Keywords: MEE; CMEE; non-Gaussian noise

I. INTRODUCTION

Due to its mathematical tractability and simplicity, both in terms of ease of implementation and computational load, the least mean square (LMS) algorithm, which is based on the minimum mean square error (MSE) criterion, has been widely used in many applications, such as signal processing, system identification, acoustic echo cancellation, and blind equalization [1]. As a simple stochastic-gradient-descent algorithm, however, LMS suffers from a low convergence rate, especially when the input signal is correlated. To alleviate this problem, combination schemes have been successfully applied to improve convergence speed while preserving good steady-state performance.

Most existing combination adaptive filtering algorithms have been proposed based on the well-known MSE criterion. For instance, Martinez-Ramon et al. developed a combination of one fast and one slow LMS filter that was effective at combining fast convergence and low misadjustment [2], and a detailed mean-square performance analysis of this scheme, covering both stationary and nonstationary situations and based on energy conservation arguments, was developed in [3]. Das and Chakraborty introduced an alternative method to deal with variable sparsity by using an adaptive convex combination of the LMS and zero-attractor LMS algorithms [4], and Arenas-García and Figueiras-Vidal proposed an adaptive combination of proportionate filters for sparse echo cancellation [5]. Moreover, several adaptive combination algorithms based on the recursive least squares (RLS) algorithm were developed in [6-8].

It is well known that most of these MSE-based linear adaptive filters achieve optimality only when the underlying system is linear and Gaussian. In many practical applications, however, the Gaussian assumption does not hold. The minimum error entropy (MEE) criterion, on the other hand, provides a robust alternative for non-Gaussian signal processing [9-10]. The MEE, with a nonparametric Parzen-window estimator that computes the error entropy directly from the error samples, can be used for adaptive system training.
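As a concrete illustration of this idea, the following minimal Python sketch (our illustration, not code from the cited works; the function name and sample data are ours) computes the quadratic information potential, whose negative logarithm is the quadratic error entropy, directly from a batch of error samples:

```python
import numpy as np

def quadratic_information_potential(errors, sigma=1.0):
    """Parzen-window estimate of V(e) = (1/N^2) * sum_i sum_j k_sigma(e(i) - e(j)),
    using a Gaussian kernel of bandwidth sigma."""
    e = np.asarray(errors, dtype=float)
    diff = e[:, None] - e[None, :]                 # all pairwise differences e(i) - e(j)
    kernel = np.exp(-diff**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)
    return kernel.mean()                           # average over all N^2 pairs

# Concentrated errors give a higher QIP (lower entropy) than impulsive, spread-out errors
rng = np.random.default_rng(0)
print(quadratic_information_potential(0.1 * rng.standard_normal(200)))
print(quadratic_information_potential(rng.standard_t(df=1.5, size=200)))
```

Since concentrated errors yield a larger information potential, maximizing it drives the error samples together, which is the training principle used below.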
Compared with the MSE criterion, the MEE criterion can achieve better performance in adaptive filtering, especially when the system is contaminated by non-Gaussian noise [11-14]. A key problem of MEE-based adaptive filtering algorithms is that the selection of the step-size requires a compromise between convergence speed and accuracy. Combination approaches provide an effective way to deal with this issue.

In this paper, we propose a novel adaptive convex combination filter under MEE (CMEE), which consists of two independently running filters under MEE with different step-sizes. The proposed algorithm aims to obtain both fast convergence speed (from the faster filter) and low misadjustment (from the slower filter). We also compare the proposed algorithm with the convex combination filter under the maximum correntropy criterion (CMCC) [15] in the simulations, and the results confirm the superior performance of the new algorithm.

The rest of the paper is organized as follows. In Section II, we briefly introduce the MEE criterion and then develop the CMEE algorithm. Simulation results are shown in Section III. Finally, Section IV gives the conclusion.

II. CONVEX COMBINATION FILTER UNDER MINIMUM ERROR ENTROPY CRITERION

A. Minimum Error Entropy (MEE) Criterion

Consider a linear system in which the input vector $X(n) = [x(n-M+1), \ldots, x(n-1), x(n)]^T$ is sent over a finite-impulse-response (FIR) channel with parameter (weight) vector $W^* = [w_1^*, w_2^*, \ldots, w_M^*]^T$ ($M$ is the size of the channel memory). The desired signal is then

$$d(n) = W^{*T} X(n) + v(n) \qquad (1)$$

where $(\cdot)^T$ stands for the transpose operator, and $v(n)$ denotes the measurement noise at instant $n$. Let $W(n) = [w_1(n), w_2(n), \ldots, w_M(n)]^T$ be the weight vector of an adaptive filter. The instantaneous error can then be computed as

$$e(n) = d(n) - y(n) = d(n) - W^T(n) X(n) \qquad (2)$$

where $y(n)$ denotes the output signal of the adaptive filter. Consider the error $e(n)$ as a random variable with probability density function (PDF) $p_e(\cdot)$. The quadratic Renyi's entropy of the error is [9-14]

$$H_{R2}(e) = -\log \int p_e^2(\xi)\, d\xi = -\log V(e) \qquad (3)$$

where $V(e) = \int p_e^2(\xi)\, d\xi$ is the quadratic information potential (QIP). Clearly, minimizing the quadratic Renyi's entropy is equivalent to maximizing the QIP, so the optimal weight vector under MEE can be obtained as

$$W_{\mathrm{opt}} = \arg\max_{W} V(e) \qquad (4)$$

According to [9-14], a nonparametric estimator of $V(e)$ is

$$\hat{V}(e) = \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} \kappa_\sigma(e(i) - e(j)) \qquad (5)$$

where $N$ is the number of samples, and $\kappa_\sigma(\cdot)$ is a Gaussian kernel function with bandwidth $\sigma$. Based on (5), a weight vector update equation under MEE is as follows:

$$W(n+1) = W(n) + \mu \frac{\partial \hat{V}(e)}{\partial W(n)} = W(n) + \frac{\mu}{2L^2\sigma^2} \sum_{i=n-L+1}^{n} \sum_{j=n-L+1}^{n} \kappa_\sigma(e(i,j))\, e(i,j)\, [X(i) - X(j)] \qquad (6)$$

where $\mu$ is the step-size parameter, $e(i,j) = e(i) - e(j)$, and $L$ is the sliding data length.

Fig. 1 illustrates a performance surface of the QIP for a two-dimensional system. As one can see, the QIP performance surface does not display a constant curvature (as the MSE performance surface does), being flat in a region away from the optimum. In fact, the QIP weights the terms in the sum differently, which in general provides solutions that can be better than the MSE solution, particularly in the presence of impulsive noise [9].

Figure 1. Performance surface of the quadratic information potential.
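To make the update (6) concrete, the following minimal Python sketch (our own implementation under the definitions above, not reference code; the function names are ours) performs one gradient-ascent step on the windowed QIP estimate:

```python
import numpy as np

def gaussian_kernel(x, sigma):
    """Gaussian kernel k_sigma used in (5) and (6)."""
    return np.exp(-x**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)

def mee_update(W, X_win, d_win, mu, sigma):
    """One step of (6): W <- W + mu/(2 L^2 sigma^2) * sum_ij k(e(i,j)) e(i,j) [X(i) - X(j)].
    X_win is an (L, M) array holding the last L input vectors, d_win the desired samples."""
    L = X_win.shape[0]
    e = d_win - X_win @ W                          # window errors e(i)
    e_ij = e[:, None] - e[None, :]                 # pairwise differences e(i, j)
    X_ij = X_win[:, None, :] - X_win[None, :, :]   # pairwise input differences X(i) - X(j)
    grad = np.einsum('ij,ijm->m', gaussian_kernel(e_ij, sigma) * e_ij, X_ij)
    return W + mu / (2 * L**2 * sigma**2) * grad
```

Sliding the window over the incoming data and calling this update at every instant reproduces the stochastic information gradient behavior described above.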
B. CMEE Algorithm

The CMEE algorithm is derived by combining two independently running MEE-based adaptive filters with different step-sizes (see Fig. 2).

Figure 2. Adaptive convex combination of two adaptive filters.

According to [15-18], the overall output and the corresponding overall weight vector can be calculated by

$$y(n) = \lambda(n) y_1(n) + (1 - \lambda(n)) y_2(n) \qquad (7)$$

$$W(n) = \lambda(n) W_1(n) + (1 - \lambda(n)) W_2(n) \qquad (8)$$

where $\lambda(n)$ stands for a mixing coefficient, and $y_1(n)$, $y_2(n)$, $W_1(n)$, and $W_2(n)$ denote, respectively, the outputs and the weight vectors of the fast filter and the slow filter. The goal is to make the mixing coefficient $\lambda(n)$ as close to 1 as possible when the algorithm starts, and as close to 0 as possible when the algorithm begins to converge to steady state. Since $\lambda(n)$ is restricted to the interval (0, 1), one can define it via a sigmoidal function, that is,

$$\lambda(n) = \mathrm{sgm}[a(n)] = \frac{1}{1 + e^{-a(n)}} \qquad (9)$$

Instead of using the MSE criterion, we use the MEE cost to update the parameter $a(n)$ according to the following gradient-based algorithm:

$$a(n+1) = a(n) + \mu_a \frac{\partial \hat{V}(e)}{\partial a(n)} = a(n) + \frac{\mu_a}{2L^2\sigma^2} \sum_{i=n-L+1}^{n} \sum_{j=n-L+1}^{n} \kappa_\sigma(e(i,j))\, e(i,j)\, \Delta(i,j) \qquad (10)$$

where $\mu_a$ is a step-size parameter, and

$$\Delta(i,j) = [y_1(i) - y_2(i)]\, \lambda(i)(1 - \lambda(i)) - [y_1(j) - y_2(j)]\, \lambda(j)(1 - \lambda(j)) \qquad (11)$$

In order to prevent the proposed algorithm from stopping its adaptation, one can restrict the value of $a(n)$ to a certain interval $[-a^+, a^+]$, such that the value of $\lambda(n)(1 - \lambda(n))$ is not too close to 0 (more details on this truncation procedure can be found in [15-16]).

Following [15-16], the performance of the convex combination scheme can be further improved by taking advantage of the faster filter to speed up the convergence of the slower one. The modified update rule for $W_2(n)$ can then be expressed as

$$W_2(n+1) = \alpha \Big[ W_2(n) + \frac{\mu_2}{2L^2\sigma^2} \sum_{i=n-L+1}^{n} \sum_{j=n-L+1}^{n} \kappa_\sigma(e_2(i,j))\, e_2(i,j)\, [X(i) - X(j)] \Big] + (1 - \alpha) W_1(n) \qquad (12)$$

where $\alpha$ denotes a smoothing factor. The pseudocode of the proposed CMEE algorithm is presented in Table I.

TABLE I. CMEE ALGORITHM

Parameters: $\mu_1$, $\mu_2$, $\mu_a$, $\sigma$, $\alpha$, $a^+$, and $L$
Initialization: $a(0) = 0$, $W_1(0) = W_2(0) = 0$
Update: for $n = 1, 2, \ldots$
    $y_1(n) = W_1^T(n) X(n)$, $y_2(n) = W_2^T(n) X(n)$
    $y(n) = \lambda(n) y_1(n) + (1 - \lambda(n)) y_2(n)$
    $e_1(n) = d(n) - y_1(n)$, $e_2(n) = d(n) - y_2(n)$, $e(n) = d(n) - y(n)$
    $W_1(n+1) = W_1(n) + \frac{\mu_1}{2L^2\sigma^2} \sum_{i=n-L+1}^{n} \sum_{j=n-L+1}^{n} \kappa_\sigma(e_1(i,j))\, e_1(i,j)\, [X(i) - X(j)]$
    $W_2(n+1) = \alpha \big[ W_2(n) + \frac{\mu_2}{2L^2\sigma^2} \sum_{i=n-L+1}^{n} \sum_{j=n-L+1}^{n} \kappa_\sigma(e_2(i,j))\, e_2(i,j)\, [X(i) - X(j)] \big] + (1 - \alpha) W_1(n)$
    $a(n+1) = a(n) + \frac{\mu_a}{2L^2\sigma^2} \sum_{i=n-L+1}^{n} \sum_{j=n-L+1}^{n} \kappa_\sigma(e(i,j))\, e(i,j)\, \Delta(i,j)$
    if $a(n+1) > a^+$: $a(n+1) = a^+$
    else if $a(n+1) < -a^+$: $a(n+1) = -a^+$
    end
    $\lambda(n+1) = 1 / (1 + e^{-a(n+1)})$
    $W(n+1) = \lambda(n+1) W_1(n+1) + (1 - \lambda(n+1)) W_2(n+1)$
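The following Python sketch summarizes one CMEE iteration as listed in Table I. It is a minimal illustration under the definitions above, not reference code: the helper names are ours, and for simplicity the current $\lambda(n)$ is applied across the whole window rather than per sample as in (11).

```python
import numpy as np

def gaussian_kernel(x, sigma):
    return np.exp(-x**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)

def mee_grad(W, X_win, d_win, sigma):
    """The double sum of (6) without the step-size prefactor."""
    e = d_win - X_win @ W
    e_ij = e[:, None] - e[None, :]
    X_ij = X_win[:, None, :] - X_win[None, :, :]
    return np.einsum('ij,ijm->m', gaussian_kernel(e_ij, sigma) * e_ij, X_ij)

def cmee_step(W1, W2, a, X_win, d_win, mu1, mu2, mu_a, sigma, alpha, a_max):
    """One CMEE iteration following Table I; returns updated W1, W2, a and combined W."""
    L = len(d_win)
    c = 1.0 / (2 * L**2 * sigma**2)
    lam = 1.0 / (1.0 + np.exp(-a))                     # mixing coefficient, eq. (9)
    y1, y2 = X_win @ W1, X_win @ W2                    # outputs of the two component filters
    e = d_win - (lam * y1 + (1 - lam) * y2)            # overall errors, eq. (7)

    W1_new = W1 + mu1 * c * mee_grad(W1, X_win, d_win, sigma)   # fast filter, large mu1
    W2_new = alpha * (W2 + mu2 * c * mee_grad(W2, X_win, d_win, sigma)) \
             + (1 - alpha) * W1                        # slow filter with weight transfer, eq. (12)

    h = (y1 - y2) * lam * (1 - lam)                    # sensitivity of e(i) to a(n), up to sign
    e_ij = e[:, None] - e[None, :]
    h_ij = h[:, None] - h[None, :]                     # Delta(i, j) of eq. (11)
    a_new = a + mu_a * c * np.sum(gaussian_kernel(e_ij, sigma) * e_ij * h_ij)   # eq. (10)
    a_new = float(np.clip(a_new, -a_max, a_max))       # truncation keeps lam(1-lam) away from 0

    lam_new = 1.0 / (1.0 + np.exp(-a_new))
    return W1_new, W2_new, a_new, lam_new * W1_new + (1 - lam_new) * W2_new     # eq. (8)
```

Repeatedly calling cmee_step with the most recent L input/desired pairs reproduces the procedure of Table I.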
III. SIMULATION RESULTS

In this section, we compare the convergence performance of the proposed CMEE algorithm and the MEE algorithm in non-Gaussian noise. In the following simulations, unless stated explicitly, the input is a white Gaussian random sequence with zero mean and unit variance. The unknown system is randomly generated with 20 taps, and the adaptive filter has the same structure. The measurement noise is assumed to be mixed-Gaussian, defined as [19-20]

$$v(n) \sim (1 - \beta)\, N(\mu_1, \sigma_1^2) + \beta\, N(\mu_2, \sigma_2^2) \qquad (13)$$

where $\beta$ is a mixture coefficient, and $N(\mu_i, \sigma_i^2)$ $(i = 1, 2)$ denote Gaussian distributions with mean values $\mu_i$ and variances $\sigma_i^2$. In this simulation, the second Gaussian component (with much larger variance) models strong impulsive noise. The mean square deviation (MSD) is adopted as the performance measure, given by

$$\mathrm{MSD} = E\big[\|W^* - W(n)\|^2\big] \qquad (14)$$

The sliding data length is $L = 20$, and the smoothing factor is $\alpha = 0.998$. The step-size $\mu_a$ and the truncation parameter $a^+$ are both set to 4. Simulation results are averaged over 50 independent Monte Carlo runs, and in each simulation, 8000/4000 iterations are run to ensure that the algorithms reach steady state.

First, we investigate the convergence curves of CMEE and MEE with different step-sizes under mixed-Gaussian noise. Simulation results are shown in Fig. 3, and accordingly, Fig. 4 illustrates the evolution of the mixing coefficient $\lambda(n)$. In this simulation the kernel width is $\sigma = 1.0$, and the parameters $(\mu_1, \mu_2, \sigma_1^2, \sigma_2^2, \beta)$ of the mixed-Gaussian distribution are set to (0, 0, 0.01, 10, 0.2). From Fig. 3 and Fig. 4, we can see that: 1) compared with the MEE algorithm, the CMEE algorithm obtains much better performance in terms of convergence rate or accuracy; 2) the CMEE algorithm gradually switches from the fast filter to the slow filter.

Figure 3. Convergence curves of CMEE and MEE with different step-sizes.

Figure 4. Evolution of the mixing coefficient $\lambda(n)$ in CMEE.

Second, we illustrate the convergence curves of CMEE and MEE with different kernel widths. The noise is the same as in the previous simulation. Simulation results are shown in Fig. 5. Again, CMEE achieves fast convergence while keeping good accuracy.

Figure 5. Convergence curves of CMEE and MEE with different kernel widths.

Third, we make a comparison with the CMCC algorithm. Simulation results are shown in Fig. 6. The noise is binary, taking the value -1 or 1 (each with probability 0.5). The parameters of the CMEE and CMCC algorithms are set to $\mu_1 = 0.2$, $\mu_2 = 0.035$, $\sigma = 1.0$ and $\mu_1 = 0.05$, $\mu_2 = 0.01$, $\sigma = 1.5$, respectively. As one can see, compared with the CMCC algorithm, the proposed algorithm achieves a slightly lower MSD, at the cost of higher computational complexity.

Figure 6. Convergence curves of CMEE and CMCC.
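For completeness, the following minimal Python sketch (ours; the default parameter values mirror the settings (0, 0, 0.01, 10, 0.2) stated above) shows how the mixed-Gaussian noise of (13) can be drawn and how the MSD of (14) can be evaluated for a single run:

```python
import numpy as np

rng = np.random.default_rng(42)

def mixed_gaussian_noise(n, beta=0.2, mu=(0.0, 0.0), var=(0.01, 10.0)):
    """Draw n samples from (1 - beta) N(mu_1, var_1) + beta N(mu_2, var_2), eq. (13)."""
    impulsive = rng.random(n) < beta                   # which samples hit the outlier component
    std = np.where(impulsive, np.sqrt(var[1]), np.sqrt(var[0]))
    return np.where(impulsive, mu[1], mu[0]) + std * rng.standard_normal(n)

def msd_db(W_true, W_est):
    """MSD of eq. (14) for a single run, in dB; average over Monte Carlo runs as above."""
    return 10 * np.log10(np.sum((np.asarray(W_true) - np.asarray(W_est)) ** 2))

# Example setup matching the experiments: a random 20-tap system and white Gaussian input
W_star = rng.standard_normal(20)
v = mixed_gaussian_noise(8000)
```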
IV. CONCLUSION

A novel adaptive convex combination filter under the MEE criterion, called CMEE, has been developed. Compared with the traditional MEE algorithm, the proposed algorithm can achieve both fast convergence speed and low misadjustment in the presence of non-Gaussian noise, and its superior performance has been confirmed by simulation results.

ACKNOWLEDGMENT

This work was supported in part by the 973 Program under grant no. 2015CB351703 and the NSF of China under grants no. 61271210 and no. 61372152.

REFERENCES

[1] A. H. Sayed, Fundamentals of Adaptive Filtering, Wiley, Hoboken, NJ, USA, 2003.
[2] M. Martinez-Ramon, J. Arenas-García, A. Navia-Vazquez, and A. R. Figueiras-Vidal, "An adaptive combination of adaptive filters for plant identification," in Proc. 14th Int. Conf. Digital Signal Processing, Santorini, Greece, 2002, pp. 1195–1198.
[3] J. Arenas-García, A. R. Figueiras-Vidal, and A. H. Sayed, "Mean-square performance of a convex combination of two adaptive filters," IEEE Trans. Signal Process., vol. 54, no. 3, pp. 1078–1090, Mar. 2006.
[4] B. K. Das and M. Chakraborty, "Sparse adaptive filtering by an adaptive convex combination of the LMS and the ZA-LMS algorithms," IEEE Trans. Circuits Syst. I: Reg. Papers, vol. 61, no. 5, pp. 1499–1507, May 2014.
[5] J. Arenas-García and A. R. Figueiras-Vidal, "Adaptive combination of proportionate filters for sparse echo cancellation," IEEE Trans. Audio, Speech, Language Process., vol. 17, no. 6, pp. 1087–1098, Aug. 2009.
[6] J. Arenas-García, M. Martinez-Ramon, A. Navia-Vazquez, and A. R. Figueiras-Vidal, "Plant identification via adaptive combination of transversal filters," Signal Process., vol. 86, pp. 2430–2438, Sep. 2006.
[7] M. Niedzwiecki, "Identification of nonstationary stochastic systems using parallel estimation schemes," IEEE Trans. Autom. Control, vol. 35, no. 3, pp. 329–334, Mar. 1990.
[8] M. Niedzwiecki, "Multiple-model approach to finite memory adaptive filtering," IEEE Trans. Signal Process., vol. 40, no. 2, pp. 470–473, Feb. 1992.
[9] J. C. Principe, Information Theoretic Learning: Renyi's Entropy and Kernel Perspectives, Springer, New York, NY, USA, 2010.
[10] B. Chen, Y. Zhu, J. Hu, and J. C. Principe, System Parameter Identification: Information Criteria and Algorithms, Elsevier, Amsterdam, Netherlands, 2013.
[11] D. Erdogmus and J. C. Principe, "An error-entropy minimization algorithm for supervised training of nonlinear adaptive systems," IEEE Trans. Signal Process., vol. 50, no. 7, pp. 1780–1786, Jul. 2002.
[12] B. Chen, Y. Zhu, and J. Hu, "Mean-square convergence analysis of ADALINE training with minimum error entropy criterion," IEEE Trans. Neural Netw., vol. 21, pp. 1168–1179, 2010.
[13] B. Chen, J. Hu, L. Pu, and Z. Sun, "Stochastic gradient algorithm under (h, phi)-entropy criterion," Circuits Syst. Signal Process., vol. 26, pp. 941–960, 2007.
[14] B. Chen and J. C. Principe, "On the smoothed minimum error entropy criterion," Entropy, vol. 14, no. 11, pp. 2311–2323, Nov. 2012.
[15] L. Shi and Y. Lin, "Convex combination of adaptive filters under the maximum correntropy criterion in impulsive interference," IEEE Signal Process. Lett., vol. 21, no. 11, pp. 1385–1388, 2014.
[16] J. Arenas-García, V. Gómez-Verdejo, and A. R. Figueiras-Vidal, "New algorithms for improved adaptive convex combination of LMS transversal filters," IEEE Trans. Instrum. Meas., vol. 54, pp. 2239–2249, 2005.
[17] M. T. M. Silva and V. H. Nascimento, "Improving the tracking capability of adaptive filters via convex combination," IEEE Trans. Signal Process., vol. 56, no. 7, pp. 3137–3149, Jul. 2008.
[18] J. Arenas-García and A. R. Figueiras-Vidal, "Adaptive combination of normalized filters for robust system identification," Electronics Letters, vol. 41, no. 15, pp. 874–875, Jul. 2005.
[19] W. Ma, H. Qu, G. Gui, L. Xu, J. Zhao, and B. Chen, "Maximum correntropy criterion based sparse adaptive filtering algorithms for robust channel estimation under non-Gaussian environments," Journal of the Franklin Institute, vol. 352, pp. 2708–2727, Apr. 2015.
[20] S. Zhao, B. Chen, and J. C. Principe, "Kernel adaptive filtering with maximum correntropy criterion," in Proc. Int. Joint Conf. Neural Networks, pp. 2012–2017, Aug. 2011.