The Importance of the Levenberg-Marquardt Method

Lately I have been thinking about where the LMA comes from. When minimizing the sum of squared errors of a function approximation over a training set, the algorithm takes a step delta in the parameter vector, obtained by solving (J'J + lambda*diag(J'J)) * delta = -g, where J is the Jacobian of the residuals and g is the gradient of the error with respect to the parameters. So delta is not literally along the negative gradient: it is the negative gradient premultiplied by (J'J + lambda*diag(J'J))^(-1), and because that matrix is positive definite the step still points downhill. For large lambda the diagonal term dominates, so the LMA becomes roughly equivalent to gradient descent with a per-parameter learning rate of about 1/(lambda*(J'J)_ii); in that sense the matrix (J'J + lambda*diag(J'J))^(-1) plays the role of the learning rate.
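To make the step concrete, here is a minimal numpy sketch of one LM update on a toy problem. The model f(x) = a*exp(b*x), the synthetic data, and the fixed lambda are my own assumptions for illustration, not anything taken from the linked post; a real implementation would also adapt lambda between iterations.

```python
import numpy as np

# Toy model for illustration (an assumption, not from the original post):
# fit y ≈ f(x; a, b) = a * exp(b * x) by least squares.

def residuals(params, x, y):
    a, b = params
    return y - a * np.exp(b * x)            # r_i = y_i - f(x_i; params)

def jacobian(params, x):
    a, b = params
    # J[i, j] = d r_i / d params_j (minus signs come from r = y - f)
    J = np.empty((x.size, 2))
    J[:, 0] = -np.exp(b * x)                # dr/da
    J[:, 1] = -a * x * np.exp(b * x)        # dr/db
    return J

def lm_step(params, x, y, lam):
    """One Levenberg-Marquardt update: solve (J'J + lam*diag(J'J)) delta = -J'r."""
    r = residuals(params, x, y)
    J = jacobian(params, x)
    JtJ = J.T @ J
    grad = J.T @ r                          # gradient of 0.5*||r||^2 w.r.t. params
    A = JtJ + lam * np.diag(np.diag(JtJ))   # Marquardt's diagonal scaling
    delta = np.linalg.solve(A, -grad)       # descent direction, premultiplied by A^(-1)
    return params + delta

# Synthetic training data (assumed values)
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 20)
y = 2.0 * np.exp(1.5 * x) + 0.01 * rng.standard_normal(x.size)

params = np.array([1.0, 1.0])
for _ in range(20):
    params = lm_step(params, x, y, lam=1e-2)
print(params)                               # should approach [2.0, 1.5]
```

With a large lam, A is dominated by lam*diag(JtJ) and the update degenerates into (scaled) gradient descent; with lam near zero it approaches the Gauss-Newton step.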
A general description of the algorithm can be found in this blog post:
http://kivantium.hateblo.jp/entry/20140408/p1