当前位置:首页 >> 机械/仪表 >>

一种基于动力学模拟的BP神经网络快速算法


http://www.paper.edu.cn

A Fast Neural Networks Back Propagation Algorithm Based on Dynamics Analogy1
Wang Teng?[1,3] Yan Guirong[1] Han Yuhang[2]
Key Laboratory of Mechan

ical Structural Strength And Vibration, Xi’an Jiaotong University, 710049,China; 2 China Academy of Engineering Physics, Mianyang ,62900,China 3 Taiyuan Branch of the China Coal Ming Research Institute ,Taiyuan, 030006,China
1State

Abstract. A new fast neural network Back Propagation Algorithm, MultiDegree-of-Freedom Backpropagation (MDFBP) was proposed to improve the convergence rate of Back propagation Momentum Algorithm. The underlying idea is to consider the process for weight coefficients of neural networks as a particle movement on the MSE supersurface in the gravitational field; by using the Tailor’s approximation, the differential equations can be simplified as a linear Multi-Degree-of-Freedom system at each time; the stability analysis of the MDFBP algorithm is presented and the parameters are determined according the linear system theory based on the stability condition. The simulation results show that the initial convergence rate of MDFBP is much faster than that of the Back propagation Momentum algorithm. The significance of the paper is that it point out that a multi-layer neural network is equivalent to a time-varying linear quadratic function..

Keywords: Backpropagation; learning algorithm; Multi-Degree-of-Freedom Back Propagation algorithm; Multi-layer neural networks

1 Introduction
Multilayer feed forward neural network is the most popular connectionist model that has been playing a central role in applications of neural network. The standard
The Natural Science Fund of China, Grant No. 10276032, supports this research. Wang Teng was with the State key laboratory of mechanical structural Strength And Vibration, Xi’an Jiao Tong University Xi’an between 2002-2006, and now is with the Taiyuan Branch of the China Coal Ming Research Institute , China , 030006 (phone: +86 0351 7685007; fax: +86 0351 7685564; e-mail: wangteng40@ gmail.com). Yan Guirong, Prof, is with the State key laboratory of mechanical structural Strength And Vibration, Xi’an Jiaotong University Xi’an, China ? Corresponding author E-mail address: wangteng40@gmail.com (Wang Teng)

1

http://www.paper.edu.cn

back propagation algorithm (SBP) suffers from a number of problems [1-7]. The main problems are: it is extremely slow, training performance is sensitive to the initial conditions, it may be trapped in local minima before converging to a solution, oscillations may occur during learning (this usually happens when the learning rate is high), and, if the error function is shallow, the gradient is very small leading to small weight changes. For these reasons in the past few years, a number of improvements to SBP have been proposed in the literature. The inclusion of a momentum term has been found to increase the rate of convergence dramatically [8]. That is, the modification of the weight vector at the current time step depends on both the current gradient and the weight change of the previous step. Intuitively, the rationale for the use of the momentum term is that the steepest descent is particularly slow when there is a long and narrow valley in the error function surface. In this situation, the direction of the gradient is almost perpendicular to the long axis of the valley. The system thus oscillates back and forth in the direction of the short axis, and only moves very slowly along the long axis of the valley. The momentum term helps average out the oscillation along the short axis while at the same time adds up contributions along the long axis [8]. This BPM (Back propagation Momentum) algorithm is popular and used for many applications. Unfortunately, its convergence rate is relatively slow, especially for networks with more than one hidden layer. The reason for this is the saturation behavior of the activation function used for the network layers. Since the output of a unit exists in the saturation area, the corresponding descent gradient takes a very small value, even if the output error is large, leading to very little progress in the weight adjustment [9] The problem of improving the efficiency and convergence rate of the BPM algorithm has been investigated by a number of researchers. In general, there are two underlying idea. The first is to vary the LR (learning rate) and MF (momentum factor) by a fixed step based on the observation of the error signals. For these algorithms, a large number of trial runs are required before arriving at the right learning parameter values. Jacobs [10] proposed dynamically varying the LR and MF by a fixed step based on the observation of the error signals. This approach has been proven to work well for many cases, but the fixed step may lead the system to jump to undesirable areas of weight space, resulting in dramatic divergence. The individual LR for each component of the weights vector need to be determined separately, and like the SBP algorithm, convergence rates are very slow. In addition, a large number of trial runs are required before arriving at the right learning parameter values. The second is to use the optimization theory. Charalambous proposed conjugate gradient (CG) algorithm [11], the CG method has been shown to be superior to the steepest decent in most application [12]. However, the conjugate gradient method requires more storage of intermediate results than the momentum method, and is nonlocal in the sense that the information needed to update a weight is not all contained in the pre- and post-synaptic units of the weight. This makes the algorithm less biologically plausible and harder to implement on hardware. In addition the conjugate gradient method is less robust than the momentum method when the error surface is

2

http://www.paper.edu.cn

relatively flat, and when it is very different from a quadratic form in most parts of parameter space [14], perhaps for these reasons, the momentum method appears to be dominant in the connectionist learning literature. Ooyen and Neinhuis [13] proposed a different error cost function, and the use of a second-order Newton’s method to optimize the terms. This approach uses the Pseudo inverse of the Hessian matrix, however the storage requirements increase quadratically with the number of weights, making this method impractical for large problems. Another disadvantage of most acceleration techniques is that often they must be tuned to fit a particular application. About the rigorous studies of BPM mechanisms, [14] point out that in the limit of continuous time, the momentum parameter is analogous to the mass of Newtonian particles that move through a viscous medium in a conservative force field on the surface. The behavior of the system near a local minimum is equivalent to a set of coupled and damped harmonic oscillators. The momentum term improves the speed of convergence by bringing some eigen components of the system closer to critical damping. By utilizing a discrete approximation to this continuous system, Qian also derives the conditions for stability of the algorithm. [14] analyzes the effect of momentum on steepest descent training for quadratic performance functions and demonstrates that there always exists a momentum coefficient that will stabilize the steepest descent algorithm, regardless of the value of the learning rate and how the value of the momentum coefficient changes the convergence properties of the algorithm. The paper [15] pointed out that the so called momentum method, much used in the neural network literature as an acceleration of the backpropagation method, is a stationary version of the conjugate gradient method. Connections with the continuous optimization method known as heavy ball with friction are also made. In fact, early papers on ‘continuous iterative methods’ and ‘analog computing’ [1618] proposed the use of dynamical systems to compute the solution of various optimization problems. Specifically, [16] investigates the idea of using a dynamical system that represents a Heavy Ball with Friction (HBF), moving under Newtonian dynamics in a conservative force field. Later, a more detailed analysis of this system was carried out by Attouch et al. [19-21]. Even though papers [13-15] related the BPM algorithm to HBF, the continuous version of the BPM algorithm actually is a first order system (SBP algorithm) combining with a damping force, not a true second order system (HBF). Even that, their analyses showed the possibility to develop a second order BP algorithm by the dynamical theory. Hence the main aim of this paper is to develop a new algorithm by the dynamical theory to reduce the training time by increasing the convergence rate and decreasing learning stalls whilst maintaining the simplicity and efficiency of the SBP algorithm. The underlying idea is to consider the process for the weight coefficients as a particle movement on the MSE (Mean Square Error) supersurface in gravitation field; by using the Tailor’s approximation, the particle movement differential equations can be simplified as a linear Multi-Degree-of-Freedom system at each time; and the parameters are determined according the linear vibration system theory. A new fast neural network Back Propagation Algorithm, MDFBP (Multi-Degree-of-Freedom Back Propagation) is proposed, the parameters are determined according the linear system

3

http://www.paper.edu.cn

theory. The simulation results show that the initial convergence rate of MDFBP is faster than the Back propagation Momentum Algorithm. The paper is organized as follows. In next section, the MDFBP algorithm is proposed by the particle dynamics analogy. In section 3, the stability analysis of the MDFBP algorithm is presented. The parameters are determined by vibration theory and the stability condition. The simulation results are presented in section 4. Conclusions are drawn in section 5.

2 MDFBP algorithm derivation

2.1 A novel idea A Multilayer feed forward neural network can be described as a nonlinear function
y = f ( w, x ) ,

where y ∈ R , w ∈ R N , x ∈ R N , N is the total number of weight coefficients, N is the length of the input vector. The corresponding desired output (target) and network output of the kth input vector x k is d k and yk , respectively. The least-square objective function in the weight space of the networks is
MSE = E ( ek2 ) = E ( d k ? yk ) .
2

The MSE function is a supersurface, the BP algorithm for multilayer networks is a gradient descent procedure used to minimize a least-squares objective function. We can imagine that, the supersurface is located in the gravitational field, and a particle on the surface is moving under the influence of gravity. Apparently with the particle mechanical energy exhausted, the particle will move to and stay on the lowest point of the supersurface. Therefore, the movement of the particle is a track from the initial point to the optimal point. A new algorithm can be developed from the particle differential equation. 2.2 Derivation of the particle differential equation The iteration process for optimal vector can be regarded as a time series. The function f can be linearized on w j ?1 at time j-1 by Tailor formula. The corresponding network output of the kth input vector x k at time j is
' yk,j = f ( w j , x k ) ≈ f ( w j ?1 , x k ) + f w ( w j ?1 , x k )

T

(w

j

? w j ?1 ) + o w j ? w j ?1 .

(1)

Where

4

http://www.paper.edu.cn

' f w ( w j ?1 , x k ) =

?f ?w

,
w = w j ?1

and “ o w j ? w j ?1 ”is the remainder term of Taylor's formula. A vector w*j can always been found from the equation (1):
' d k ? f ( w j ?1 , x k ) ? o w j ? w j ?1 = f w ( w j ?1 , x k ) T

(w

* j

? w j ?1 ) .

Then, we have
' d k ? yk , j = d k ? f ( w j ?1 , x k ) ? f w ( w j ?1 , x k ) T

(w

j

? w j ?1 ) ? o w j ? w j ?1
T

' = d k ? f ( w j ?1 , x k ) ? o w j ? w j ?1 ? f w ( w j ?1 , x k ) ' = f w ( w j ?1 , x k ) T

(

)

(w

j

? w j ?1 )

( w* j ? w j ?1 ) ? f w' ( w j ?1 , xk ) (w
2

T

( w j ? w j ?1 )

.

' = f w ( w j ?1 , x k )

T

* j

?wj)

Then, in the neighborhood of w j , the function MSE can be described as:
' ' y j = MSE = E ( ek2,j ) = E ( d k ? yk,j ) = ( w *j ? w j ) E f w ( w j ?1 , x k ) f w ( w j ?1 , x k ) T

(

T

)(w ? w ) ,
* j j

' ' let v = w j ? w *j , R j = E f w ( w j ?1 , x k ) f w ( w j ?1 , x k )

(

T

) , we have
( 2)

y j = vT R j v

Equation (2) shows that the network at time j can be regarded as a quadratic function. Apparently, w *j is a transient optimum point of the quadratic function at time j (see Fig.1).

5

http://www.paper.edu.cn

Fig.1 The motion of a particle on the hypersurface in gravitational field

Now we imagine that the supersurface is located in the gravitational field (see Fig.1). A particle with the mass of m on the surface is moving under the influence of gravity. At time j, the particle’s kinetic energy T, potential energy U and Lagrange function L [23] are

T = m ( y 2 + vT v ) 2 , U = mgy j and L = T ? U , j respectively, where g is a universal gravitational constant. Substitute these expressions for T, U and L in Lagrange equation [23], we have dy j dy j dy j = 2R j v , y j = 2 vT R j v , = 2R j v , = 2R j v , dv dv dv
dy j ?L = mv + my j , = mv + 2my j R j v ?v dv dy j dy j ?L = my j ? mg , = 2my j R j v ? 2mgR j v ?v dv dv v + (g + yj ) dy j dv =0

Because of (2), equation (3) can be rewritten as
v + 2( g + yj ) R j v = 0

(3)

6

http://www.paper.edu.cn

2.3 Derivation of new algorithm During time j and j+1, when w *j and R j being frozen, equation (3) can be considered as a linear system. After introducing a viscous force into (3) and let μ j = g + y j , we have
v + 2ξ j v + 2 μ j R j v = 0

(4)

where ξ j is a damping parameter. Because w j , w*j and R j be frozen, we have:
v = w j +1 ? 2w j + w j ?1 , dy j dv
' = 2R j v = ?2 E ek,j f w ( w j , x k ) .

(

)

Then equation (4) can be changed as 1? ξ j 2 2 ' w j +1 = wj ? w j ?1 + μ j E ek , j f w ( w j , x k ) , 1+ ξ j 1+ ξ j 1+ ξ j

(

)

(5)

which is called as MDFBP algorithm. Because of lim w j +1 = lim w j = lim w j ?1 , thus, we have
j →∞ j →∞ j →∞
' lim E ek , j f w ( w j , x k ) = 0 j →∞

(

)

Which shows that equation (5) converges to the local minima of the network.

3 Selection of parameters

3.1 The stability condition of MDFBP algorithm The parameters ξ j and μ j must satisfy certain stability condition in order to achieve a stable convergence for the mean weight vector towards its optimal value. During a sampling period, equation (4) can be considered as a linear system. According to the theory of vibration [23], equation (4) can be rewritten as
l + 2ξ j I + 2μ j Λ j l = 0 Q j T R j Q j = Λ j = diag ( λ1, j , λ2, j , , λN , j ) , and l = [l1 , l2 ,

(6)

by substituting v = Q j l for v and v , where Q j is the matrix of eigenvectors of R j ,
, λi , j ,

, lN ] .
T

Let λmin, j = λ1, j < λ2, j <

< λN , j = λmax, j .

The differential equation for the i-th mode of vibration [23] is

7

http://www.paper.edu.cn

li + 2ξ j li + 2 μ j λi , j li = 0

(7)

where li is the i-th components of the vector l and λi , j is the i-th eigenvalue of the autocorrelation matrix R j . We can consider (7) in the discrete case. According to the central difference method, (7) can be rewritten as
l?i , j +1 = 1? ξ j ? 2μ j 2 ? li , j ? li , j ?1 ? λi , j l?i , j 1+ ξ j 1+ ξ j 1+ ξ j

Supplying the dummy equation l?i , j = l?i , j , we can rewrite the equation in matrix form
? l?i , j ? ? l? ? ? l? ? ? ? = A ? i , j ?1 ? = Ak ? i ,0 ? ? l? ? ? l? ? ? l? ? ? i ,1 ? ? i , j +1 ? ? i, j ?

(8)

where l?i , j stands for discrete value of li at the time j, and the matrix A is given by:
0 ? A=? ?1 ? ? (1 ? ξ j )(1 + ξ j ) ? 2 (1 + ξ j ) ? μ j λi , j (1 + ξ j )
?1

(

1
?1

)

? ? ? ?

The convergence of l?i , j is determined by the eigenvalues of matrix A:

λ

?1 ? i,? ? ?2?

? 1 μ j λi , j =? ? ? 1+ ξ 1+ ξ j j ?

? ?± ? ?

? 1 μ j λi , j ? ? ?1+ ξ 1+ ξ j j ?

? 1?ξ j ? ? ? 1+ ξ j ?

2

(9)

To ensure convergence, we require λi ,1 < 1 and λi ,2 < 1 . Then we obtain the condition of stability of MDFBP (for the limitation of the paper's length, the details are shown in appendix A):
2 2 ? ?0 < μ j λmax, j < 2, ξ j > 0 ? ? ? 1 ? μ λ ? j max, j ) + ξ j < 1 ? ?? ? ∪ ? ?( ? ? 2 ? ?(1 ? μ j λmax, j ) + ξ j 2 ≥ 1? ? ?0 < ξ < 1 ? j ?? ? ?? ?

(10)

3.2 Selection of parameters ξ j and μ j The vibration theory [23] points out that for a damped linear vibration system, only when its relative damp factor equals critical damping (one), its vibration trades off fastest. In general, the eignvalues of a Multi-Degree-of-Freedom system are different each other, it is impossible to make all vibrations work under the critical damping

8

http://www.paper.edu.cn

condition with a single damp factor. We can only make all vibrations working as possible as close to critical damping. The damping factor of ith vibration in (7)

ξi , j =
There is the relation:

ξj
2μ j λi , j

;

ξj
2μ j λmin, j

> ξi , j =

ξj
2μ j λi , j

>

ξj
2μ j λmax, j

.

If there is the relation:

ξ j ≥ 2μ j λmax, j ,
all of modals will converge under the condition of over damped. In general, we choose the equation

ξ j = ρ 2 μ j λmax, j , ? ρ ≥ 1
convergence rate. Note that ρ > 1 , from equation (10), we have

(11)

for ξ j to make all modals working about the critical damping and obtain a faster

0 < μj <

2

λmax, j

In general, the calculation of the largest eigenvalue of the correlation matrix is quite expensive. From equation (2), we know the correlation matrix R is positive definite. So, we can use the trace of R as the approximation of λmax, j . To prevent from the overflow in iteration, let
g + yj g + yj
max

μ j = rj

In general, MDFBP algorithm can be concluded as

9

http://www.paper.edu.cn

1? ξ j ? 2 2 ' μ j E ek , j f w ( w j , x k ) wj ? w j ?1 + ?w j +1 = 1+ ξ j 1+ ξ j 1+ ξ j ? ? ? μ = r g + y j , e = d ? y , y = E e2 ? 2e 2 + e 2 ( k , j k , j ?1 k , j ? 2 ) j k, j k k, j j ? j g + yj ? max ? T ' ' ?ξ j = ρ 2 μ j λmax, j , λmax, j = E f w ( w j ?1 , x k ) f w ( w j ?1 , x k ) ,? ρ ≥ 1 ? 2 2 ? 2 ? y j = E ( ek , j ) = E ( d k ? yk , j ) , yk , j = f ( w j , x k ) , 0 < μ j < λ max, j ?

(

)

(12)

(

)

' ' where f w ( w j ?1 , x k ) and f w ( w j , x k ) can be calculated with chain rule, which is used

in SBP algorithm. From equation (12) and (4), we can find that the MDFBP algorithm is very different from the BPM algorithm, because it is a true second order dynamical system in continuous version. The computational complexity of the new algorithm is almost the same as BPM algorithm.

4 simulations

Fig. 2. The neural network to be identified

10

http://www.paper.edu.cn

(a)

(b)

Fig. 3. The weight coefficients (a) and MSE (b) curves of MDFBP and BPM algorithms

11

http://www.paper.edu.cn

The network in Fig.2 is the plant to be identified; it has the weights and offset val1 1 2 2 ues: w1,1 = 10 , w1 = 10 , b11 = ?5 , b2 = 5 , w1,1 = 1 , w1,2 = 1 , b 2 = 1 .We 2,1
1 2 use a model that has the same structure as Figure 1 to identify the plant. w1,1 , w1,1

are the weight coefficients to be identified. The input sample is a sequence of 40 values evenly distributed between –2 and 2. Figure 3 is the simulation results of MDFBP and Momentum BP algorithm. The iteration parameters of MDFBP algorithm are rj = 0.89 λmax, j , g = 0.01 λmax, j . The iteration parameters of BPM algorithm are MF = 0.8 , LR = 479 . Figure 3 shows that convergence rate of MDFBP algorithm is much faster than that of BPM algorithm. The simulation results show that the new algorithm has a great potential to improve the convergence rate of the Backpropagation Momentum algorithm. From Fig.3, we can find that the initial convergence rate of the MDFBP algorithm is much higher than the BPM algorithm, after the initial stage ( j > 600 ), the convergence rate of the former gets slower than the later. The fact shows that the MDFBP algorithm is slow when the particle is in a long and narrow valley in the error function surface. The problem is expected to be overcome by the particle initial velocity.

5 Conclusions
In this paper, a new algorithm is presented to improve the convergence rate of Backpropagation momentum Algorithm. The underlying idea is to consider the process for weight coefficients of neural networks as a particle movement on the MSE supersurface in the gravitation field; by using the Tailor’s approximation, the particle e differential equations can be simplified as a linear Multi-Degree-of-Freedom system at each time; and the parameters are determined according the linear system theory based on the fact. The simulation results show that the initial convergence rate of MDFBP algorithm is much faster than that of BPM algorithm; the later convergence rate of MDFBP algorithm is slower than the later. It means that the new algorithm has a great potential to improve the convergence rate of the Backpropagation momentum algorithm. The next work we must do is to use the initial velocity to improve the later stage convergence rate of the MDFBP algorithm.

References
1. 2.

Martin T. Hagan, Howard B, Demuth, Neural Network Design, PWS Publishing Company, 1996
S. E. Fahlman, an empirical study of learning speed in back propagation networks, tech.rep., CMU-CS-88-162,Caenegie Mellon University,Pittsburgh.,1988.

3.

M. Riedmiller, Advanced supervised learning in multi-layer perceptrons from backpropagation to adaptive learning algorithms, Computer Standards and Interfaces Special Issue on Neural Networks,vol. 16, no. 3 (1994), pp. 265-275,.

12

http://www.paper.edu.cn

4.

5. 6. 7.

8. 9. 10 11
12 13 14 15 16 17 18

M. Riedmiller and H. Braun, A direct method for faster backpropagation learning: The RPROP Algorithm, IEEE International Conference on Neural Networks 1993 (ICNNS3), pp. 586-591. D. Sarkar, Methods to speed up error back propagation learning algorithm, ACM Computzng Surveys, vol. 27, no. 4 (1995),pp. 519-542,. F. F. Silva and L. B. Almeida, Speeding Backpropagation, pp. 151-158. Elsevier Science Publishers B.V. (North-Holland): Advanced Neural Computers, 1990. Amr Radi and Riccardo Poli, Discovery of Backpropagation Learning Rules Using Genetic Programming, IEEE International Conference on Evolutionary Computation, 1998,pp. 371-375. McClelland, J.L., & Rumelhart, D.E. Parallel distributed processing (Vol. 2,1986). Cambridge, MA: MIT Press. Y.H. Zweiri, J.F. Whidborne, L.D. Seneviratne, A three-term backpropagation algorithm, Neurocomputing 50 (2003) 305 – 318 A. Jacobs, Increasing rate of convergence through learning rate adaptation, Neural Networks 1 (4)(1988) 295-307. C. Charalambous, Conjugate gradient algorithm for efficient training of artificial neural networks, IEEE Proceedings, vol.139, no.3 (1992) 301-310.
Press, W.H., Teukolsky, S.A., Vetterling, W.T., & Flannery, B.P. Numerical recipes in C. Cambridge, UK: Cambridge University Press, 1992. A.O. Ooyen, B. Neinhuis, Improving the convergence of the backpropagation algorithm, Neural Networks 5 (1992) 465–471. Ning Qian, On the momentum term in gradient descent learning algorithms, Neural Networks 12 (1999) 145–151 Manabu Torii and Martin T. Hagan, Stability of Steepest Descent With Momentum for Quadratic Functions, IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 13, NO. 3, MAY 2002 Amit Bhaya, Eugenius Kaszkurewicz, Steepest descent with momentum for quadratic functions is a version of the conjugate gradient method. Neural Networks 17 (2004) 65–71 Polyak, B. T. Some methods of speeding up the convergence of iterative methods. USSR Computational Mathematics and Mathematical Physics, 4(5), (1964) 1–17. Rybashov, M. V.. Method of differential equations in the problem of finding the extremum of a function using analog computers. Automation and Remote Control, 30(5), (1969) 181–194. Tsypkin, Y. Z.. Adaptation and learning in automatic systems (Vol. 73). Mathematics in science and engineering, New York: Academic Press, first published in Russian under the title Adaptatsia i obuchenie v avtomaticheskikh sistemakh, 1968, Moscow: Nauka. Attouch, H., Goudou, X., & Redont, P. The heavy ball with frictionmethod, I. The continuous dynamical system: global exploration of the global minima of a real-valued function by asymptotic analysis of a dissipative dynamical system. Communications in Contemporary Mathematics, 2(1), (2000) 1 A. Cabot, Inertial Gradient-Like Dynamical System Controlled by a Stabilizing Term, JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS: Vol. 120, No. 2, (2004) pp. 275–303 . Hedy Attouch and Marc-Olivie rCzarnecki, Asymptotic Control and Stabilization of Nonlinear Oscillators with Non-isolated Equilibria, Journal of Differential Equations 179, (2002) 278–310. W. T. Thomson, Theory of Vibration with Aplications, Prentice-Hall, 1996.

19

20

21

22

23

13

http://www.paper.edu.cn

APPENDIX A The Stability Analysis of The MDFBP Algorithm We can consider each of (6) in the manuscript separately in the discrete case. Because μ j is a constant during a sampling period, (6) in the discrete case can be considered as a linear system. According to the central difference method, the ith of (6) can be rewritten as
l?i , j +1 =
1? ξ j ? 2μ j 2 ? λi , j l?i , j li , j ? li , j ?1 ? 1+ ξ j 1+ ξ j 1+ ξ j

Supplying the dummy equation l?i , j = l?i , j , we can rewrite the equation in matrix form
? l?i , j ? ? l? ? ? l? ? ? ? = A ? i , j ?1 ? = A k ? i ,0 ? ? l? ? ? l? ? ? l? ? ? i ,1 ? ? i , j +1 ? ? i, j ? where l?i , j stands for discrete value of li at the time j, and the matrix A is given by:
? 0 ? A = ? 1?ξ j ? ?1+ ξ j ? ? ? ? 1 ?? μj λ 2? ? ? 1 + ξ 1 + ξ i, j ? ? ? j j ? ?? 1

This system can be looked as a linear dynamic system that will be stable if the eigenvalues of A are less than one in magnitude during each sampling period. The convergence of l?i , j is determined by the time varying eigenvalues of matrix A:

λ

?1 ? i,? ? ?2?

? 1 μ j λi , j =? ? ? 1+ ξ 1+ ξ j j ?

? ?± ? ?

? 1 μ j λi , j ? ? ?1+ ξ 1+ ξ j j ?

? 1?ξ j ? ? ? 1+ ξ j ?

2

(A.1)

Next, we’ll determine the eigenvalues of matrix A. Let
? 1 μ j λi , j A=? ? ?1+ ξ 1+ ξ j j ? ? 1?ξ j ?, B = ? 1+ ξ j ?

(A.2)

and we’ll do this in stages.
Case Ⅰ: A2 ? B ≥ 0

The eigenvalues of A should satisfy

14

http://www.paper.edu.cn

? A2 ? B ≥ 0, ? ? 2 ? A± A ? B <1 ?

(a ) (b)

(A.3)

Substituting (A.2) into(a)of (A.3), we obtain

(1 ? μ λ )
j i, j

2

+ ξ j2 ≥ 1

(A.4)

Substituting (A.2) into(b)of (A.3), we obtain
?1 < A ± A 2 ? B < 1

(A.5)

Equation (A.5) can be rewritten as:
?( A ? 1) ± A2 ? B < 0 ? ? ?( A + 1) ± A2 ? B > 0 ?

(a) (b)

(A.6)

In (a) of (A.6) , if A ? 1 > 0 , ( A ? 1) + A2 ? B < 0 can't be satisfied, we must have
? A ?1 < 0 ? ? 2 ? A ?1 + A ? B < 0 ?

(A.7)

In (b) of (A.6),if A + 1 < 0 , A + 1 ? A2 ? B > 0 can't be satisfied, we must have
?A +1 > 0 ? ? 2 ? A ?1 ? A ? B > 0 ?

(A.8)

Using (A.7) and (A.8) we can rewrite this case stability condition as:
??1 < A < 1 ? ? 2 ? A +1? A ? B > 0 ? 2 ? A ?1 + A ? B < 0 ?

(a) (b) (c)

(A.9)

From (b) of (A.9), we obtain
B > ? ( 2 A + 1)

(A.10)

Substituting (A.2) into (A.10), we obtain

μ j λi , j < 2
From (c) of (A.9), we obtain

(A.11)

15

http://www.paper.edu.cn

B > 2 A ?1

(A.12) (A.13)

Substituting (A.2) into (A.12), we obtain

μ j λi , j > 0
From (a) of (A.9), note that ξ j > 0 , we obtain

ξj > 0

(A.14)

from (A.11) , (A.13) and (A.14), note (A.4), we can obtain the caseⅠ stability condition as:
? ?0 < μ j λi , j < 2 ? ?ξ j > 0 ? 2 ?(1 ? μ j λi , j ) + ξ j 2 > 1 ?

(A.15)

Case Ⅱ: A2 ? B < 0

The eigenvalues of A should satisfy
? A2 ? B < 0, ? ? 2 ? A± i B ? A <1 ?

(a ) (b)

(A.16)

where i 2 = ?1 . Substituting (A.2) into(b)of (A.16), we obtain 0 < ξj <1 Substituting (A.2) into(a)of (A.16), we obtain (A.17)

(1 ? μ λ )
j i, j

2

+ ξ j2 < 1

(A.18)

From (A.17) and (A.18) , we can obtain the caseⅡ stability condition as:
?(1 ? μ λ )2 + ξ 2 < 1 ? j i, j j ? ?0 < ξ j < 1 ?

(A.19)

Using (A.15) and (A.18), we obtain the stability condition is

16

http://www.paper.edu.cn

?? ? 2 ? ?0 < μ j λi , j < 2 ? ?? 2 ? ? ?ξ > 0 ? ∪ ? ?(1 ? μ j λi , j ) + ξ j < 1? ? j ? ? ? ? 0 < ξ <1 ? ? j 2 ? ? ?? ? ?(1 ? μ j λi , j ) + ξ j 2 > 1? ? ? ?? ?

(A.20)

Fig.A1 give the figure of (A.20). In Fig.A1, the yellow area stands for caseⅠ, the green area stands for case Ⅱ.

Fig. A1. The ith mode stability condition of MDFBP algorithm

Stability condition of the MDFBP

The equation (A.20) should be suitable for all of eigenvalues of matrix A, so the stability condition of the MDFBP algorithm is
?? ? 2 ? ?0 < μ j λmax, j < 2 ? ?? 2 ? ? ?ξ > 0 ? ∪ ? ?(1 ? μ j λmax, j ) + ξ j < 1? ?? j ? ? ?0 < ξ < 1 ? j 2 ? ? ?? ? ?? 2 ? ?(1 ? μ j λmax, j ) + ξ j > 1? ?? ?

(A.21)

Equation (A.21) is the common area in Fig.A2.

17

http://www.paper.edu.cn

Fig. A2. The stability condition of the MDFBP algorithm

18


相关文章:
基于BP算法的神经网络技术 完整毕业论文
基于BP算法的神经网络技术 完整毕业论文_计算机软件及应用_IT/计算机_专业资料。...人工神经网络就是模拟人思维的第二种方式。这是一个非线性动力学系统, 其特色...
西南大学1085 《智能控制》作业答案
答:对 23、 RBF 神经网络的学习过程和 BP 神经网络的学习过程没有任何 区别...答: Hebb 学习规则是一种联想式学习算法。 生物学家 D.O.Hebbian 基于对生物...
神经网络模拟
神经网络模拟试题一、名词解释(共 5 题,每题 5 ...典型的过学习是多层前向网络的 BP 算法 4、Hebb ...方程来描述,反馈型神 经网络一个非线性动力学...
神经网络算法及模型
神经网络算法及模型思维学普遍认为,人类大脑的思维分为抽象(逻辑)思维、形象(...人工神经网络就是模拟人思维的第二种方式。这是一个非线性动力学系统,其特 色...
BP神经网络算法
通过模拟大脑神 经网络处理,记忆信息的方式设计一种...算法等几种流行的 BP 神经网络学习算法,详细的介绍...互连组成的大规模并行分布式信息处理和非 线性动力学...
基于GA-BP神经网络的人脸智能检测算法
基于GA-BP 神经网络的人脸智能检测算法摘 要:传统...一个3层的BP网络可以完成任意的n 维到m维 的非...(即人工神经元)组成的一种 非线性自适应动力学系统...
基于BP神经网络的PID控制器设计
本文还介绍了基于 BP 神经网络 PID 控制器的设计步骤、结构框图、控制算法。 ...神经网络的结构和功能的一种技术系统, 它是一种大规 模并行的非线性动力学...
BP神经网络算法
BP神经网络算法_数学_自然科学_专业资料。神经网络...性动力学系统, 它是一种应用类似于大脑神经突触联接...较之 基于传统数学计算机的离散控制方式, 神经网络更...
神经网络期终论文
姓专学名: 业: 号: 神经网络设计 彭洪 基于遗传...1 摘要 论文提出一种基于遗传-BP 神经网络的手写...遗传算法模拟达尔文的遗传选择和自然淘汰的生物进化...
神经网络优化学习算法综述
本文对目前几种常见的神经网络优化学习算法, 感知器...BP,RBF, Widrow-Hoff 一 引言神经网络的研究至今已有...网络的动力学性质,并用 电子线路设计出相应的网络,...
更多相关标签:
bp神经网络算法 | bp神经网络算法原理 | bp神经网络算法实例 | bp神经网络算法代码 | bp神经网络算法步骤 | bp神经网络训练算法 | bp神经网络算法推导 | bp神经网络算法流程图 |