CurrentModule = Manopt
quasi_Newton
quasi_Newton!
The aim is to minimize a real-valued function on a Riemannian manifold, that is
Riemannian quasi-Newtonian methods are as generalizations of their Euclidean counterparts Riemannian line search methods. These methods determine a search direction η_k ∈ T_{p_k} \mathcal{M} at the current iterate p_k and a suitable stepsize α_k along \gamma(α) = R_{p_k}(α η_k), where R: T \mathcal{M} →\mathcal{M} is a retraction. The next iterate is obtained by
In quasi-Newton methods, the search direction is given by
where \mathcal{H}_k : T_{p_k} \mathcal{M} →T_{p_k} \mathcal{M} is a positive definite self-adjoint operator, which approximates the action of the Hessian \operatorname{Hess} f (p_k)[⋅] and \mathcal{B}_k = {\mathcal{H}_k}^{-1}. The idea of quasi-Newton methods is instead of creating a complete new approximation of the Hessian operator \operatorname{Hess} f(p_{k+1}) or its inverse at every iteration, the previous operator \mathcal{H}_k or \mathcal{B}_k is updated by a convenient formula using the obtained information about the curvature of the objective function during the iteration. The resulting operator \mathcal{H}_{k+1} or \mathcal{B}_{k+1} acts on the tangent space T_{p_{k+1}} \mathcal{M} of the freshly computed iterate p_{k+1}.
In order to get a well-defined method, the following requirements are placed on the new operator \mathcal{H}_{k+1} or \mathcal{B}_{k+1} that is created by an update.
Since the Hessian \operatorname{Hess} f(p_{k+1}) is a self-adjoint operator on the tangent space T_{p_{k+1}} \mathcal{M}, and \mathcal{H}_{k+1} approximates it, one requirement is, that \mathcal{H}_{k+1} or \mathcal{B}_{k+1} is also self-adjoint on T_{p_{k+1}} \mathcal{M}.
In order to achieve a steady descent, the next requirement is that η_k is a descent direction in each iteration.
Hence a further requirement is that \mathcal{H}_{k+1} or \mathcal{B}_{k+1} is a positive definite operator on T_{p_{k+1}} \mathcal{M}.
In order to get information about the curvature of the objective function into the new operator \mathcal{H}_{k+1} or \mathcal{B}_{k+1}, the last requirement is a form of a Riemannian quasi-Newton equation:
or
where T_{p_k \rightarrow p_{k+1}} : T_{p_k} \mathcal{M} →T_{p_{k+1}} \mathcal{M} and
the chosen retraction R is the associated retraction of T.
Note that, of course, not all updates in all situations meet these conditions in every iteration.
For specific quasi-Newton updates, the fulfilment of the Riemannian curvature condition, which requires that
holds, is a requirement for the inheritance of the self-adjointness and positive definiteness of the \mathcal{H}_k or \mathcal{B}_k to the operator \mathcal{H}_{k+1} or \mathcal{B}_{k+1}. Unfortunately, the fulfilment of the Riemannian curvature condition is not given by a step size \alpha_k > 0 that satisfies the generalized Wolfe conditions. However, to create a positive definite operator \mathcal{H}_{k+1} or \mathcal{B}_{k+1} in each iteration, the so-called locking condition was introduced in HuangGallivanAbsil:2015, which requires that the isometric vector transport T^S, which is used in the update formula, and its associate retraction R fulfil
where T^R is the vector transport by differentiated retraction. With the requirement that the isometric vector transport T^S and its associated retraction R satisfies the locking condition and using the tangent vector
where
in the update, it can be shown that choosing a stepsize α_k > 0 that satisfies the Riemannian Wolfe conditions leads to the fulfilment of the Riemannian curvature condition, which in turn implies that the operator generated by the updates is positive definite.
In the following the specific operators are denoted in matrix notation and hence use H_k and B_k, respectively.
In general there are different ways to compute a fixed AbstractQuasiNewtonUpdateRule.
In general these are represented by
AbstractQuasiNewtonDirectionUpdate
QuasiNewtonMatrixDirectionUpdate
QuasiNewtonLimitedMemoryDirectionUpdate
QuasiNewtonCautiousDirectionUpdate
Manopt.initialize_update!
QuasiNewtonPreconditioner
QuasiNewtonLimitedMemoryBoxDirectionUpdate
Using
update_hessian!
the following update formulae for either H_{k+1} or B_{k+1} are available.
AbstractQuasiNewtonUpdateRule
BFGS
DFP
Broyden
SR1
InverseBFGS
InverseDFP
InverseBroyden
InverseSR1
The quasi Newton algorithm is based on a DefaultManoptProblem.
QuasiNewtonState
The quasi_Newton solver requires the following functions of a manifold to be available
- A [
retract!](@extref ManifoldsBase :doc:retractions)(M, q, p, X); it is recommended to set the [default_retraction_method](@extrefManifoldsBase.default_retraction_method-Tuple{AbstractManifold}) to a favourite retraction. If this default is set, aretraction_method=does not have to be specified. - A [
vector_transport_to!](@extref ManifoldsBase :doc:vector_transports)M, Y, p, X, q); it is recommended to set the [default_vector_transport_method](@extrefManifoldsBase.default_vector_transport_method-Tuple{AbstractManifold}) to a favourite retraction. If this default is set, avector_transport_method=orvector_transport_method_dual=(for\mathcal N) does not have to be specified. - By default quasi Newton uses
ArmijoLinesearchwhich requiresmax_stepsize(M)to be set and an implementation of [inner](@extrefManifoldsBase.inner-Tuple{AbstractManifold, Any, Any, Any})(M, p, X). - the [
norm](@extrefLinearAlgebra.norm-Tuple{AbstractManifold, Any, Any}) as well, to stop when the norm of the gradient is small, but if you implementedinner, the norm is provided already. - A [
copyto!](@extrefBase.copyto!-Tuple{AbstractManifold, Any, Any})(M, q, p)and [copy](@extrefBase.copy-Tuple{AbstractManifold, Any})(M,p)for points and similarlycopy(M, p, X)for tangent vectors. - By default the tangent vector storing the gradient is initialized calling [
zero_vector](@extrefManifoldsBase.zero_vector-Tuple{AbstractManifold, Any})(M,p).
Most Hessian approximations further require [get_coordinates](@extref ManifoldsBase.get_coordinates)(M, p, X, b) with respect to the [AbstractBasis](@extref ManifoldsBase.AbstractBasis) b provided, which is [DefaultOrthonormalBasis](@extref ManifoldsBase.DefaultOrthonormalBasis) by default from the basis= keyword.
Pages = ["quasi_Newton.md"]
Canonical=false