
http://www.paper.edu.cn

A Fast Neural Network Back Propagation Algorithm Based on Dynamics Analogy

Wang Teng*[1,3], Yan Guirong[1], Han Yuhang[2]

1 State Key Laboratory of Mechanical Structural Strength and Vibration, Xi'an Jiaotong University, Xi'an 710049, China; 2 China Academy of Engineering Physics, Mianyang 62900, China; 3 Taiyuan Branch of the China Coal Mining Research Institute, Taiyuan 030006, China

Abstract. A new fast neural network back propagation algorithm, Multi-Degree-of-Freedom Back Propagation (MDFBP), is proposed to improve the convergence rate of the Backpropagation Momentum algorithm. The underlying idea is to regard the evolution of the weight coefficients of a neural network as the movement of a particle on the MSE supersurface in a gravitational field; by using Taylor's approximation, the differential equations of motion can be simplified to a linear Multi-Degree-of-Freedom system at each time step. The stability analysis of the MDFBP algorithm is presented, and the parameters are determined from linear system theory based on the stability condition. The simulation results show that the initial convergence rate of MDFBP is much faster than that of the Backpropagation Momentum algorithm. The significance of the paper is that it points out that a multi-layer neural network is locally equivalent to a time-varying quadratic function.

Keywords: Backpropagation; learning algorithm; Multi-Degree-of-Freedom Back Propagation algorithm; Multi-layer neural networks

1 Introduction

The multilayer feed-forward neural network is the most popular connectionist model and has played a central role in applications of neural networks. The standard

The Natural Science Foundation of China, Grant No. 10276032, supports this research. Wang Teng was with the State Key Laboratory of Mechanical Structural Strength and Vibration, Xi'an Jiaotong University, Xi'an, between 2002 and 2006, and is now with the Taiyuan Branch of the China Coal Mining Research Institute, China, 030006 (phone: +86 0351 7685007; fax: +86 0351 7685564; e-mail: wangteng40@gmail.com). Yan Guirong, Prof., is with the State Key Laboratory of Mechanical Structural Strength and Vibration, Xi'an Jiaotong University, Xi'an, China. * Corresponding author. E-mail address: wangteng40@gmail.com (Wang Teng)


back propagation algorithm (SBP) suffers from a number of problems [1-7]. The main problems are: it is extremely slow, training performance is sensitive to the initial conditions, it may be trapped in local minima before converging to a solution, oscillations may occur during learning (this usually happens when the learning rate is high), and, if the error function is shallow, the gradient is very small leading to small weight changes. For these reasons in the past few years, a number of improvements to SBP have been proposed in the literature. The inclusion of a momentum term has been found to increase the rate of convergence dramatically [8]. That is, the modification of the weight vector at the current time step depends on both the current gradient and the weight change of the previous step. Intuitively, the rationale for the use of the momentum term is that the steepest descent is particularly slow when there is a long and narrow valley in the error function surface. In this situation, the direction of the gradient is almost perpendicular to the long axis of the valley. The system thus oscillates back and forth in the direction of the short axis, and only moves very slowly along the long axis of the valley. The momentum term helps average out the oscillation along the short axis while at the same time adds up contributions along the long axis [8]. This BPM (Back propagation Momentum) algorithm is popular and used for many applications. Unfortunately, its convergence rate is relatively slow, especially for networks with more than one hidden layer. The reason for this is the saturation behavior of the activation function used for the network layers. 
Since the output of a unit lies in the saturation area, the corresponding descent gradient takes a very small value, even if the output error is large, leading to very little progress in the weight adjustment [9]. The problem of improving the efficiency and convergence rate of the BPM algorithm has been investigated by a number of researchers. In general, there are two underlying ideas. The first is to vary the LR (learning rate) and MF (momentum factor) based on the observation of the error signals. Jacobs [10] proposed dynamically varying the LR and MF by a fixed step based on the observation of the error signals. This approach has been proven to work well in many cases, but the fixed step may lead the system to jump to undesirable areas of weight space, resulting in dramatic divergence. The individual LR for each component of the weight vector needs to be determined separately and, as with the SBP algorithm, convergence rates are very slow. In addition, a large number of trial runs are required before arriving at the right learning parameter values. The second idea is to use optimization theory. Charalambous proposed the conjugate gradient (CG) algorithm [11]; the CG method has been shown to be superior to steepest descent in most applications [12]. However, the conjugate gradient method requires more storage of intermediate results than the momentum method, and is nonlocal in the sense that the information needed to update a weight is not all contained in the pre- and post-synaptic units of that weight. This makes the algorithm less biologically plausible and harder to implement in hardware. In addition, the conjugate gradient method is less robust than the momentum method when the error surface is


relatively flat, and when it is very different from a quadratic form in most parts of parameter space [14]; perhaps for these reasons, the momentum method appears to be dominant in the connectionist learning literature. Ooyen and Nienhuis [13] proposed a different error cost function and the use of a second-order Newton's method to optimize it. This approach uses the pseudo-inverse of the Hessian matrix; however, the storage requirements increase quadratically with the number of weights, making this method impractical for large problems. Another disadvantage of most acceleration techniques is that they must often be tuned to fit a particular application. Regarding rigorous studies of the BPM mechanism, Qian [14] points out that in the limit of continuous time, the momentum parameter is analogous to the mass of Newtonian particles that move through a viscous medium in a conservative force field on the surface. The behavior of the system near a local minimum is equivalent to a set of coupled and damped harmonic oscillators. The momentum term improves the speed of convergence by bringing some eigencomponents of the system closer to critical damping. By utilizing a discrete approximation to this continuous system, Qian also derives the conditions for stability of the algorithm. Torii and Hagan [22] analyze the effect of momentum on steepest descent training for quadratic performance functions and demonstrate that there always exists a momentum coefficient that will stabilize the steepest descent algorithm, regardless of the value of the learning rate, and show how the value of the momentum coefficient changes the convergence properties of the algorithm. The paper [15] pointed out that the so-called momentum method, much used in the neural network literature as an acceleration of the backpropagation method, is a stationary version of the conjugate gradient method. Connections with the continuous optimization method known as the heavy ball with friction are also made.
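For reference, the BPM update that these analyses study combines the current negative gradient with the previous weight change. A minimal sketch on a toy quadratic valley follows; the learning rate, momentum factor, and error surface here are illustrative assumptions, not values from this paper:

```python
import numpy as np

def bpm_step(w, grad, prev_dw, lr, mf):
    """One Backpropagation-with-Momentum (BPM) update: the new step
    combines the current negative gradient with the previous step."""
    dw = -lr * grad + mf * prev_dw
    return w + dw, dw

# Demo on a long, narrow quadratic valley E(w) = (50*w0^2 + w1^2)/2,
# where plain steepest descent oscillates across the short axis.
w = np.array([1.0, 1.0])
dw = np.zeros(2)
for _ in range(200):
    grad = np.array([50.0 * w[0], 1.0 * w[1]])  # gradient of E
    w, dw = bpm_step(w, grad, dw, lr=0.02, mf=0.8)
```

With these values both eigencomponents of the coupled map have spectral radius below one, so the iterate is driven toward the minimum along both the short and the long axis of the valley.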
In fact, early papers on 'continuous iterative methods' and 'analog computing' [16-18] proposed the use of dynamical systems to compute the solution of various optimization problems. Specifically, [16] investigates the idea of using a dynamical system that represents a Heavy Ball with Friction (HBF), moving under Newtonian dynamics in a conservative force field. Later, a more detailed analysis of this system was carried out by Attouch et al. [19-21]. Even though papers [13-15] related the BPM algorithm to the HBF, the continuous version of the BPM algorithm is actually a first-order system (the SBP algorithm) combined with a damping force, not a true second-order system (HBF). Even so, their analyses showed the possibility of developing a second-order BP algorithm from dynamical theory. Hence the main aim of this paper is to develop a new algorithm by dynamical theory that reduces the training time by increasing the convergence rate and decreasing learning stalls, whilst maintaining the simplicity and efficiency of the SBP algorithm. The underlying idea is to consider the evolution of the weight coefficients as the movement of a particle on the MSE (Mean Square Error) supersurface in a gravitational field; by using Taylor's approximation, the particle's differential equations of motion can be simplified to a linear Multi-Degree-of-Freedom system at each time step, and the parameters are determined according to linear vibration theory. A new fast neural network back propagation algorithm, MDFBP (Multi-Degree-of-Freedom Back Propagation), is proposed, and its parameters are determined according to linear system


theory. The simulation results show that the initial convergence rate of MDFBP is faster than that of the Backpropagation Momentum algorithm. The paper is organized as follows. In the next section, the MDFBP algorithm is derived via the particle dynamics analogy. In Section 3, the stability analysis of the MDFBP algorithm is presented, and the parameters are determined by vibration theory and the stability condition. The simulation results are presented in Section 4. Conclusions are drawn in Section 5.

2 MDFBP algorithm derivation

2.1 A novel idea

A multilayer feed-forward neural network can be described as a nonlinear function

y = f(w, x),

where y ∈ R, w ∈ R^N, and x ∈ R^M; here N is the total number of weight coefficients and M is the length of the input vector. The corresponding desired output (target) and network output for the kth input vector x_k are d_k and y_k, respectively. The least-squares objective function in the weight space of the network is

MSE = E(e_k^2) = E\left((d_k - y_k)^2\right).

The MSE function defines a supersurface, and the BP algorithm for multilayer networks is a gradient descent procedure used to minimize this least-squares objective function. We can imagine that the supersurface is located in a gravitational field, and that a particle on the surface moves under the influence of gravity. As the particle's mechanical energy is exhausted, the particle will move to and stay at the lowest point of the supersurface. Therefore, the movement of the particle traces a path from the initial point to the optimal point. A new algorithm can be developed from the particle's differential equation of motion.

2.2 Derivation of the particle differential equation

The iteration process for the optimal vector can be regarded as a time series. The function f can be linearized about w_{j-1} at time j-1 by the Taylor formula. The corresponding network output for the kth input vector x_k at time j is

y_{k,j} = f(w_j, x_k) \approx f(w_{j-1}, x_k) + f'_w(w_{j-1}, x_k)^T (w_j - w_{j-1}) + o\left(\|w_j - w_{j-1}\|\right),    (1)

where

f'_w(w_{j-1}, x_k) = \left. \frac{\partial f}{\partial w} \right|_{w = w_{j-1}},

and o(\|w_j - w_{j-1}\|) is the remainder term of Taylor's formula. A vector w^*_j can always be found from equation (1) such that

d_k - f(w_{j-1}, x_k) - o\left(\|w_j - w_{j-1}\|\right) = f'_w(w_{j-1}, x_k)^T (w^*_j - w_{j-1}).

Then, we have

d_k - y_{k,j} = d_k - f(w_{j-1}, x_k) - f'_w(w_{j-1}, x_k)^T (w_j - w_{j-1}) - o\left(\|w_j - w_{j-1}\|\right)
            = \left(d_k - f(w_{j-1}, x_k) - o\left(\|w_j - w_{j-1}\|\right)\right) - f'_w(w_{j-1}, x_k)^T (w_j - w_{j-1})
            = f'_w(w_{j-1}, x_k)^T (w^*_j - w_{j-1}) - f'_w(w_{j-1}, x_k)^T (w_j - w_{j-1})
            = f'_w(w_{j-1}, x_k)^T (w^*_j - w_j).

Then, in the neighborhood of w j , the function MSE can be described as:

y_j = MSE = E(e_{k,j}^2) = E\left((d_k - y_{k,j})^2\right) = (w^*_j - w_j)^T E\left(f'_w(w_{j-1}, x_k) f'_w(w_{j-1}, x_k)^T\right)(w^*_j - w_j).

Let v = w_j - w^*_j and R_j = E\left(f'_w(w_{j-1}, x_k) f'_w(w_{j-1}, x_k)^T\right); then

y_j = v^T R_j v.    (2)

Equation (2) shows that the network at time j can be regarded as a quadratic function. Apparently, w *j is a transient optimum point of the quadratic function at time j (see Fig.1).
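For a linear model f(w, x) = w^T x the gradient f'_w = x does not depend on w, so R_j = E(x x^T) and the quadratic form of equation (2) holds exactly. A minimal numerical sketch follows; the data and the weight vectors are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))         # input vectors x_k (rows)
w_star = np.array([1.0, -2.0, 0.5])   # plays the role of the optimum w*_j
d = X @ w_star                        # targets d_k

# For f(w, x) = w^T x the gradient f'_w = x, so R_j = E(x x^T):
R = X.T @ X / len(X)

w = np.zeros(3)                       # current weights w_j
v = w - w_star                        # v = w_j - w*_j

mse = np.mean((d - X @ w) ** 2)       # MSE computed directly
quad = v @ R @ v                      # quadratic form of equation (2)
assert np.isclose(mse, quad)
```

For a genuinely nonlinear network the equality is only approximate, holding in the neighborhood of w_j where the Taylor linearization is valid.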


Fig. 1. The motion of a particle on the supersurface in a gravitational field

Now we imagine that the supersurface is located in a gravitational field (see Fig. 1). A particle of mass m on the surface moves under the influence of gravity. At time j, the particle's kinetic energy T, potential energy U and Lagrange function L [23] are

T = \frac{m}{2}\left(\dot{y}_j^2 + \dot{v}^T \dot{v}\right), \qquad U = m g y_j, \qquad L = T - U,

respectively, where g is the gravitational constant. Since y_j = v^T R_j v, we have dy_j/dv = 2R_j v. Substituting these expressions for T, U and L into the Lagrange equation [23] and simplifying yields

\ddot{v} + (g + \ddot{y}_j)\frac{dy_j}{dv} = 0.

Because of (2), this can be rewritten as

\ddot{v} + 2(g + \ddot{y}_j) R_j v = 0.    (3)


2.3 Derivation of the new algorithm

During the interval from time j to j+1, with w^*_j and R_j frozen, equation (3) can be considered a linear system. Introducing a viscous force into (3) and letting \mu_j = g + \ddot{y}_j, we have

\ddot{v} + 2\xi_j \dot{v} + 2\mu_j R_j v = 0,    (4)

where ξ_j is a damping parameter. Because w_j, w^*_j and R_j are frozen, we have

\ddot{v} = w_{j+1} - 2w_j + w_{j-1}, \qquad \frac{dy_j}{dv} = 2R_j v = -2E\left(e_{k,j} f'_w(w_j, x_k)\right).

Then equation (4) can be rewritten as

w_{j+1} = \frac{2}{1+\xi_j} w_j - \frac{1-\xi_j}{1+\xi_j} w_{j-1} + \frac{2\mu_j}{1+\xi_j} E\left(e_{k,j} f'_w(w_j, x_k)\right),    (5)

which is called the MDFBP algorithm. Because \lim_{j\to\infty} w_{j+1} = \lim_{j\to\infty} w_j = \lim_{j\to\infty} w_{j-1}, we have

\lim_{j\to\infty} E\left(e_{k,j} f'_w(w_j, x_k)\right) = 0,

which shows that equation (5) converges to a local minimum of the network.
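A minimal sketch of the update (5) on a linear least-squares problem, where E(e_{k,j} f'_w) reduces to the averaged error-weighted input; the model, the data, and the values of ξ_j and μ_j are assumptions chosen to lie inside the stability region derived in Section 3, not values from the paper:

```python
import numpy as np

def mdfbp_step(w_curr, w_prev, grad_term, xi, mu):
    """One MDFBP iteration, equation (5):
    w_{j+1} = 2/(1+xi) w_j - (1-xi)/(1+xi) w_{j-1}
              + 2 mu/(1+xi) E(e_{k,j} f'_w(w_j, x_k))."""
    a = 1.0 + xi
    return (2.0 / a) * w_curr - ((1.0 - xi) / a) * w_prev + (2.0 * mu / a) * grad_term

# Demo: identify a linear model d = x^T w*, for which f'_w(w, x) = x.
rng = np.random.default_rng(1)
X = rng.normal(size=(400, 3))
w_star = np.array([2.0, -1.0, 0.5])
d = X @ w_star

w_prev = np.zeros(3)
w_curr = np.zeros(3)
mu, xi = 0.5, 1.0                      # assumed values inside the stability region
for _ in range(80):
    e = d - X @ w_curr                 # errors e_{k,j}
    grad_term = X.T @ e / len(X)       # E(e_{k,j} f'_w) for the linear model
    w_curr, w_prev = mdfbp_step(w_curr, w_prev, grad_term, xi, mu), w_curr
```

For a multilayer network, grad_term would instead be accumulated by the chain rule, exactly as the gradient is in the SBP algorithm.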

3 Selection of parameters

3.1 The stability condition of the MDFBP algorithm

The parameters ξ_j and μ_j must satisfy a certain stability condition in order to achieve stable convergence of the mean weight vector towards its optimal value. During a sampling period, equation (4) can be considered a linear system. According to the theory of vibration [23], by substituting v = Q_j l (and correspondingly for \dot{v} and \ddot{v}), where Q_j is the matrix of eigenvectors of R_j, Q_j^T R_j Q_j = \Lambda_j = \mathrm{diag}(\lambda_{1,j}, \lambda_{2,j}, \ldots, \lambda_{N,j}), and l = [l_1, l_2, \ldots, l_N]^T, equation (4) can be rewritten as

\ddot{l} + 2\xi_j \dot{l} + 2\mu_j \Lambda_j l = 0.    (6)

Let \lambda_{\min,j} = \lambda_{1,j} < \lambda_{2,j} < \cdots < \lambda_{N,j} = \lambda_{\max,j}.

The differential equation for the ith mode of vibration [23] is

\ddot{l}_i + 2\xi_j \dot{l}_i + 2\mu_j \lambda_{i,j} l_i = 0,    (7)

where l_i is the ith component of the vector l and λ_{i,j} is the ith eigenvalue of the autocorrelation matrix R_j. We can consider (7) in the discrete case. According to the central difference method, (7) can be rewritten as

\hat{l}_{i,j+1} = \frac{2}{1+\xi_j}\hat{l}_{i,j} - \frac{1-\xi_j}{1+\xi_j}\hat{l}_{i,j-1} - \frac{2\mu_j}{1+\xi_j}\lambda_{i,j}\hat{l}_{i,j}.

Supplying the dummy equation \hat{l}_{i,j} = \hat{l}_{i,j}, we can rewrite this in matrix form:

\begin{pmatrix} \hat{l}_{i,j} \\ \hat{l}_{i,j+1} \end{pmatrix} = A \begin{pmatrix} \hat{l}_{i,j-1} \\ \hat{l}_{i,j} \end{pmatrix} = A^j \begin{pmatrix} \hat{l}_{i,0} \\ \hat{l}_{i,1} \end{pmatrix},    (8)

where \hat{l}_{i,j} stands for the discrete value of l_i at time j, and the matrix A is given by

A = \begin{pmatrix} 0 & 1 \\ -\dfrac{1-\xi_j}{1+\xi_j} & \dfrac{2(1-\mu_j\lambda_{i,j})}{1+\xi_j} \end{pmatrix}.

The convergence of \hat{l}_{i,j} is determined by the eigenvalues of the matrix A:

\lambda_{i,1,2} = \frac{1-\mu_j\lambda_{i,j}}{1+\xi_j} \pm \sqrt{\left(\frac{1-\mu_j\lambda_{i,j}}{1+\xi_j}\right)^2 - \frac{1-\xi_j}{1+\xi_j}}.    (9)

To ensure convergence, we require |\lambda_{i,1}| < 1 and |\lambda_{i,2}| < 1. We then obtain the stability condition of MDFBP (for reasons of length, the details are shown in Appendix A):

\left\{ 0 < \mu_j\lambda_{\max,j} < 2,\;\; \xi_j > 0,\;\; (1-\mu_j\lambda_{\max,j})^2 + \xi_j^2 \ge 1 \right\} \;\cup\; \left\{ (1-\mu_j\lambda_{\max,j})^2 + \xi_j^2 < 1,\;\; 0 < \xi_j < 1 \right\}.    (10)
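Condition (10) can be spot-checked numerically by forming the iteration matrix A of equation (8) for a single mode and testing whether both eigenvalues lie inside the unit circle. The sample values of ξ_j and μ_j λ_{i,j} below are assumptions chosen to land inside and outside the region:

```python
import numpy as np

def iteration_matrix(xi, mu_lam):
    """Matrix A of equation (8) for one mode, with mu_lam = mu_j * lambda_{i,j}."""
    a = 1.0 + xi
    return np.array([[0.0, 1.0],
                     [-(1.0 - xi) / a, 2.0 * (1.0 - mu_lam) / a]])

def spectral_radius(xi, mu_lam):
    return float(np.max(np.abs(np.linalg.eigvals(iteration_matrix(xi, mu_lam)))))

# A point satisfying (10): mu*lambda = 0.5 in (0, 2), xi = 1 > 0,
# and (1 - 0.5)^2 + 1^2 = 1.25 >= 1  ->  stable
assert spectral_radius(1.0, 0.5) < 1.0
# A point violating (10): mu*lambda = 2.5 > 2  ->  unstable
assert spectral_radius(0.5, 2.5) >= 1.0
```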

(10)

3.2 Selection of the parameters ξ_j and μ_j

Vibration theory [23] points out that for a damped linear vibration system, its oscillation dies out fastest when the relative damping factor equals the critical value (one). In general, the eigenvalues of a Multi-Degree-of-Freedom system differ from one another, so it is impossible to make all modes work at critical damping with a single damping factor; we can only make all modes work as close to critical damping as possible. The damping factor of the ith mode in (7) is

\xi_{i,j} = \frac{\xi_j}{\sqrt{2\mu_j\lambda_{i,j}}},

and there is the relation

\frac{\xi_j}{\sqrt{2\mu_j\lambda_{\min,j}}} > \xi_{i,j} = \frac{\xi_j}{\sqrt{2\mu_j\lambda_{i,j}}} > \frac{\xi_j}{\sqrt{2\mu_j\lambda_{\max,j}}}.

If

\xi_j \ge \sqrt{2\mu_j\lambda_{\max,j}},

then all modes will converge in the over-damped regime. In general, we choose

\xi_j = \rho\sqrt{2\mu_j\lambda_{\max,j}}, \qquad \rho \ge 1,    (11)

for ξ_j, so that all modes work near critical damping and a faster convergence rate is obtained. Noting that ρ ≥ 1, from equation (10) we have

0 < \mu_j < \frac{2}{\lambda_{\max,j}}.

In general, the calculation of the largest eigenvalue of the correlation matrix is quite expensive. From equation (2), we know that the correlation matrix R_j is positive definite, so we can use the trace of R_j as an approximation of λ_max,j. To prevent overflow in the iteration, let

\mu_j = r_j \frac{g + \ddot{y}_j}{\left|g + \ddot{y}_j\right|_{\max}}.
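Since the trace of a positive (semi-)definite matrix bounds its largest eigenvalue from above, and tr(R_j) = E(‖f'_w‖^2) can be accumulated as an average of squared gradient norms without ever forming R_j, the approximation is cheap. A sketch with hypothetical gradient samples:

```python
import numpy as np

rng = np.random.default_rng(2)
grads = rng.normal(size=(1000, 5))   # sampled gradients f'_w(w_{j-1}, x_k), one per row

R = grads.T @ grads / len(grads)     # R_j = E(f'_w f'_w^T)
lam_max = np.linalg.eigvalsh(R)[-1]  # exact largest eigenvalue (expensive for big N)

# tr(R_j) = E(||f'_w||^2): a running average of squared gradient norms,
# available without forming the N x N matrix R_j.
trace_est = np.mean(np.sum(grads ** 2, axis=1))

assert lam_max <= trace_est + 1e-12  # the trace bounds lambda_max from above
```

Because the trace overestimates λ_max,j, the resulting μ_j is conservative with respect to the bound 0 < μ_j < 2/λ_max,j.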

In summary, the MDFBP algorithm can be stated as

\begin{cases}
w_{j+1} = \dfrac{2}{1+\xi_j} w_j - \dfrac{1-\xi_j}{1+\xi_j} w_{j-1} + \dfrac{2\mu_j}{1+\xi_j} E\left(e_{k,j} f'_w(w_j, x_k)\right), \\
\mu_j = r_j \dfrac{g+\ddot{y}_j}{\left|g+\ddot{y}_j\right|_{\max}}, \quad e_{k,j} = d_k - y_{k,j}, \quad \ddot{y}_j = E\left(e_{k,j}^2 - 2e_{k,j-1}^2 + e_{k,j-2}^2\right), \\
\xi_j = \rho\sqrt{2\mu_j\lambda_{\max,j}}, \quad \lambda_{\max,j} \approx E\left(f'_w(w_{j-1}, x_k)^T f'_w(w_{j-1}, x_k)\right), \quad \rho \ge 1, \\
y_j = E(e_{k,j}^2) = E\left((d_k - y_{k,j})^2\right), \quad y_{k,j} = f(w_j, x_k), \quad 0 < \mu_j < \dfrac{2}{\lambda_{\max,j}},
\end{cases}    (12)

where f'_w(w_{j-1}, x_k) and f'_w(w_j, x_k) can be calculated with the chain rule, as used in the SBP algorithm. From equations (12) and (4), we can see that the MDFBP algorithm is very different from the BPM algorithm, because its continuous version is a true second-order dynamical system. The computational complexity of the new algorithm is almost the same as that of the BPM algorithm.

4 Simulations

Fig. 2. The neural network to be identified



Fig. 3. The weight coefficients (a) and MSE (b) curves of MDFBP and BPM algorithms


The network in Fig. 2 is the plant to be identified; it has the weights and offset values w^1_{1,1} = 10, w^1_{2,1} = 10, b^1_1 = -5, b^1_2 = 5, w^2_{1,1} = 1, w^2_{1,2} = 1, b^2 = 1. We use a model with the same structure as the plant in Fig. 2 to identify it; w^1_{1,1} and w^2_{1,1} are the weight coefficients to be identified. The input sample is a sequence of 40 values evenly distributed between -2 and 2. Figure 3 shows the simulation results of the MDFBP and BPM algorithms. The iteration parameters of the MDFBP algorithm are r_j = 0.89\lambda_{\max,j} and g = 0.01\lambda_{\max,j}; those of the BPM algorithm are MF = 0.8 and LR = 479. Figure 3 shows that the initial convergence rate of the MDFBP algorithm is much faster than that of the BPM algorithm; after the initial stage (j > 600), however, the convergence rate of the former becomes slower than that of the latter. This indicates that the MDFBP algorithm is slow when the particle is in a long and narrow valley of the error-function surface; the problem is expected to be overcome by exploiting the particle's initial velocity.

5 Conclusions

In this paper, a new algorithm is presented to improve the convergence rate of the Backpropagation Momentum algorithm. The underlying idea is to regard the evolution of the weight coefficients of a neural network as the movement of a particle on the MSE supersurface in a gravitational field; by using Taylor's approximation, the particle's differential equations can be simplified to a linear Multi-Degree-of-Freedom system at each time step, and the parameters are determined according to linear system theory. The simulation results show that the initial convergence rate of the MDFBP algorithm is much faster than that of the BPM algorithm, while its later convergence rate is slower. This means that the new algorithm has great potential to improve the convergence rate of the Backpropagation Momentum algorithm. The next step is to use the initial velocity to improve the later-stage convergence rate of the MDFBP algorithm.

References

1. Martin T. Hagan, Howard B. Demuth, Neural Network Design, PWS Publishing Company, 1996.
2. S. E. Fahlman, An empirical study of learning speed in back-propagation networks, Tech. Rep. CMU-CS-88-162, Carnegie Mellon University, Pittsburgh, 1988.
3. M. Riedmiller, Advanced supervised learning in multi-layer perceptrons: from backpropagation to adaptive learning algorithms, Computer Standards and Interfaces, vol. 16, no. 3 (1994), pp. 265-275.
4. M. Riedmiller and H. Braun, A direct adaptive method for faster backpropagation learning: the RPROP algorithm, IEEE International Conference on Neural Networks (ICNN 1993), pp. 586-591.
5. D. Sarkar, Methods to speed up error back-propagation learning algorithm, ACM Computing Surveys, vol. 27, no. 4 (1995), pp. 519-542.
6. F. F. Silva and L. B. Almeida, Speeding backpropagation, in Advanced Neural Computers, Elsevier Science Publishers B.V. (North-Holland), 1990, pp. 151-158.
7. Amr Radi and Riccardo Poli, Discovery of backpropagation learning rules using genetic programming, IEEE International Conference on Evolutionary Computation, 1998, pp. 371-375.
8. J. L. McClelland and D. E. Rumelhart, Parallel Distributed Processing, Vol. 2, MIT Press, Cambridge, MA, 1986.
9. Y. H. Zweiri, J. F. Whidborne, L. D. Seneviratne, A three-term backpropagation algorithm, Neurocomputing 50 (2003) 305-318.
10. R. A. Jacobs, Increased rates of convergence through learning rate adaptation, Neural Networks 1(4) (1988) 295-307.
11. C. Charalambous, Conjugate gradient algorithm for efficient training of artificial neural networks, IEE Proceedings, vol. 139, no. 3 (1992) 301-310.
12. W. H. Press, S. A. Teukolsky, W. T. Vetterling, B. P. Flannery, Numerical Recipes in C, Cambridge University Press, Cambridge, UK, 1992.
13. A. van Ooyen, B. Nienhuis, Improving the convergence of the backpropagation algorithm, Neural Networks 5 (1992) 465-471.
14. Ning Qian, On the momentum term in gradient descent learning algorithms, Neural Networks 12 (1999) 145-151.
15. Amit Bhaya, Eugenius Kaszkurewicz, Steepest descent with momentum for quadratic functions is a version of the conjugate gradient method, Neural Networks 17 (2004) 65-71.
16. B. T. Polyak, Some methods of speeding up the convergence of iterative methods, USSR Computational Mathematics and Mathematical Physics, 4(5) (1964) 1-17.
17. M. V. Rybashov, Method of differential equations in the problem of finding the extremum of a function using analog computers, Automation and Remote Control, 30(5) (1969) 181-194.
18. Y. Z. Tsypkin, Adaptation and Learning in Automatic Systems, Mathematics in Science and Engineering, Vol. 73, Academic Press, New York (first published in Russian as Adaptatsia i obuchenie v avtomaticheskikh sistemakh, Nauka, Moscow, 1968).
19. H. Attouch, X. Goudou, P. Redont, The heavy ball with friction method, I. The continuous dynamical system: global exploration of the global minima of a real-valued function by asymptotic analysis of a dissipative dynamical system, Communications in Contemporary Mathematics, 2(1) (2000) 1.
20. A. Cabot, Inertial gradient-like dynamical system controlled by a stabilizing term, Journal of Optimization Theory and Applications, vol. 120, no. 2 (2004), pp. 275-303.
21. Hedy Attouch and Marc-Olivier Czarnecki, Asymptotic control and stabilization of nonlinear oscillators with non-isolated equilibria, Journal of Differential Equations 179 (2002) 278-310.
22. Manabu Torii and Martin T. Hagan, Stability of steepest descent with momentum for quadratic functions, IEEE Transactions on Neural Networks, vol. 13, no. 3, May 2002.
23. W. T. Thomson, Theory of Vibration with Applications, Prentice-Hall, 1996.


APPENDIX A. The Stability Analysis of the MDFBP Algorithm

We can consider each mode of (6) separately in the discrete case. Because μ_j is constant during a sampling period, (6) in the discrete case can be considered a linear system. According to the central difference method, the ith equation of (6) can be rewritten as

\hat{l}_{i,j+1} = \frac{2}{1+\xi_j}\hat{l}_{i,j} - \frac{1-\xi_j}{1+\xi_j}\hat{l}_{i,j-1} - \frac{2\mu_j}{1+\xi_j}\lambda_{i,j}\hat{l}_{i,j}.

Supplying the dummy equation \hat{l}_{i,j} = \hat{l}_{i,j}, we can rewrite this in matrix form:

\begin{pmatrix} \hat{l}_{i,j} \\ \hat{l}_{i,j+1} \end{pmatrix} = A \begin{pmatrix} \hat{l}_{i,j-1} \\ \hat{l}_{i,j} \end{pmatrix} = A^j \begin{pmatrix} \hat{l}_{i,0} \\ \hat{l}_{i,1} \end{pmatrix},

where \hat{l}_{i,j} stands for the discrete value of l_i at time j, and the matrix A is given by

A = \begin{pmatrix} 0 & 1 \\ -\dfrac{1-\xi_j}{1+\xi_j} & \dfrac{2(1-\mu_j\lambda_{i,j})}{1+\xi_j} \end{pmatrix}.

This system can be viewed as a linear dynamic system that is stable if the eigenvalues of A are less than one in magnitude during each sampling period. The convergence of \hat{l}_{i,j} is determined by the time-varying eigenvalues of A:

\lambda_{i,1,2} = \frac{1-\mu_j\lambda_{i,j}}{1+\xi_j} \pm \sqrt{\left(\frac{1-\mu_j\lambda_{i,j}}{1+\xi_j}\right)^2 - \frac{1-\xi_j}{1+\xi_j}}.    (A.1)

Next, we determine the eigenvalues of the matrix A. Let

A = \frac{1-\mu_j\lambda_{i,j}}{1+\xi_j}, \qquad B = \frac{1-\xi_j}{1+\xi_j},    (A.2)

so that the eigenvalues are A \pm \sqrt{A^2 - B}, and we proceed in two cases.

Case I: A^2 - B \ge 0

The eigenvalues of A should satisfy


A^2 - B \ge 0 \quad (a), \qquad \left| A \pm \sqrt{A^2 - B} \right| < 1 \quad (b).    (A.3)

Substituting (A.2) into (a) of (A.3), we obtain

(1-\mu_j\lambda_{i,j})^2 + \xi_j^2 \ge 1.    (A.4)

Substituting (A.2) into (b) of (A.3), we obtain

-1 < A \pm \sqrt{A^2 - B} < 1.    (A.5)

Equation (A.5) can be rewritten as

(A-1) \pm \sqrt{A^2 - B} < 0 \quad (a), \qquad (A+1) \pm \sqrt{A^2 - B} > 0 \quad (b).    (A.6)

In (a) of (A.6), if A - 1 > 0, then (A-1) + \sqrt{A^2 - B} < 0 cannot be satisfied, so we must have

A - 1 < 0 \quad \text{and} \quad A - 1 + \sqrt{A^2 - B} < 0.    (A.7)

In (b) of (A.6), if A + 1 < 0, then (A+1) - \sqrt{A^2 - B} > 0 cannot be satisfied, so we must have

A + 1 > 0 \quad \text{and} \quad A + 1 - \sqrt{A^2 - B} > 0.    (A.8)

Using (A.7) and (A.8), we can rewrite the stability condition for this case as

-1 < A < 1 \quad (a), \qquad A + 1 - \sqrt{A^2 - B} > 0 \quad (b), \qquad A - 1 + \sqrt{A^2 - B} < 0 \quad (c).    (A.9)

From (b) of (A.9), we obtain

B > -(2A + 1).    (A.10)

Substituting (A.2) into (A.10), we obtain

\mu_j\lambda_{i,j} < 2.    (A.11)

From (c) of (A.9), we obtain

B > 2A - 1.    (A.12)

Substituting (A.2) into (A.12), we obtain

\mu_j\lambda_{i,j} > 0.    (A.13)

From (a) of (A.9), noting that ξ_j > 0, we obtain

\xi_j > 0.    (A.14)

From (A.11), (A.13) and (A.14), together with (A.4), we obtain the Case I stability condition:

0 < \mu_j\lambda_{i,j} < 2, \qquad \xi_j > 0, \qquad (1-\mu_j\lambda_{i,j})^2 + \xi_j^2 \ge 1.    (A.15)

Case II: A^2 - B < 0

The eigenvalues of A should satisfy

A^2 - B < 0 \quad (a), \qquad \left| A \pm i\sqrt{B - A^2} \right| < 1 \quad (b),    (A.16)

where i^2 = -1. Substituting (A.2) into (b) of (A.16), we obtain

0 < \xi_j < 1.    (A.17)

Substituting (A.2) into (a) of (A.16), we obtain

(1-\mu_j\lambda_{i,j})^2 + \xi_j^2 < 1.    (A.18)

From (A.17) and (A.18), we obtain the Case II stability condition:

(1-\mu_j\lambda_{i,j})^2 + \xi_j^2 < 1, \qquad 0 < \xi_j < 1.    (A.19)

Combining (A.15) and (A.19), the stability condition for the ith mode is

\left\{ 0 < \mu_j\lambda_{i,j} < 2,\;\; \xi_j > 0,\;\; (1-\mu_j\lambda_{i,j})^2 + \xi_j^2 \ge 1 \right\} \;\cup\; \left\{ (1-\mu_j\lambda_{i,j})^2 + \xi_j^2 < 1,\;\; 0 < \xi_j < 1 \right\}.    (A.20)

Fig. A1 shows the region defined by (A.20); the yellow area stands for Case I and the green area for Case II.

Fig. A1. The ith mode stability condition of MDFBP algorithm

Stability condition of the MDFBP algorithm

Condition (A.20) must hold for all eigenvalues λ_{i,j} of R_j, so the stability condition of the MDFBP algorithm is

\left\{ 0 < \mu_j\lambda_{\max,j} < 2,\;\; \xi_j > 0,\;\; (1-\mu_j\lambda_{\max,j})^2 + \xi_j^2 \ge 1 \right\} \;\cup\; \left\{ (1-\mu_j\lambda_{\max,j})^2 + \xi_j^2 < 1,\;\; 0 < \xi_j < 1 \right\}.    (A.21)

Equation (A.21) corresponds to the common area in Fig. A2.


Fig. A2. The stability condition of the MDFBP algorithm


更多相关标签: