@@ -4,6 +4,25 @@ This document records the experiments and findings from investigating iterative
44(matrix-free) alternatives to direct factorization for solving KKT systems in
55QTQP's primal-dual interior point method.
66
7+ ## TL;DR
8+
9+ ** CG on the normal equations is the only iterative method that works.** All
10+ attempts to solve the full (n+m) x (n+m) KKT system iteratively failed --
11+ including MINRES with diagonal scaling (4 variants), PETSc GMRES with block
12+ fieldsplit (6 configurations), full ILU (2 configs), and Schur complement
13+ fieldsplit (5 configs). Every method breaks down at small mu (barrier parameter)
14+ because no tested preconditioner adequately captures the off-diagonal A/A^T
15+ coupling that dominates the KKT system as mu -> 0.
16+
17+ CG on normal equations works by reducing to an n x n SPD system where a cheap
18+ diagonal preconditioner (` diag(P) + mu + colsum(A^2/d) ` ) captures both
19+ diagonal and off-diagonal contributions. This is implemented as
20+ ` LinearSolver.CG ` and passes all 218 tests.
21+
22+ ** If you are revisiting this work** : the barrier is preconditioning, not the
23+ Krylov method. Any future attempt on the full KKT needs a preconditioner that
24+ accounts for A/A^T coupling -- see "Future Directions" at the end.
25+
726## Background
827
928QTQP solves QPs via a primal-dual IPM. Each iteration solves multiple KKT systems
@@ -33,7 +52,21 @@ QTQP adds `mu*I` to both diagonal blocks (not just the (1,1) block). This means:
3352The condition number of the KKT matrix is O(||A||^2 / mu^2), which grows as
3453mu -> 0. This is the central challenge for iterative methods.
3554
36- ## Approaches Tested
55+ ## Summary of Results
56+
57+ | Approach | Passes tests? | Failure mode |
58+ | ----------| :---:| ---|
59+ | CG on normal equations | YES | -- |
60+ | MINRES + diagonal preconditioner (scipy) | no | scipy checks preconditioned residual, not true residual |
61+ | MINRES unpreconditioned | no | kappa = O(1/mu^2), cannot converge at small mu |
62+ | MINRES + explicit diagonal scaling | no | unscaling amplifies residual by O(1/sqrt(mu)) |
63+ | MINRES + scaling + iterative refinement | no | each refinement step has same amplification |
64+ | PETSc GMRES + ICC/Jacobi fieldsplit | no | block PC ignores A/A^T coupling, breaks down at mu ~ 1e-4 |
65+ | PETSc GMRES + full ILU | no | ILU quality degrades at small mu |
66+ | PETSc Schur complement (5 variants) | no | inner solve needs normal-equations-quality preconditioner |
67+ | Direct LU (baseline) | YES | -- (but O(nnz^1.5) fill-in) |
68+
69+ ## Detailed Experiments
3770
3871### 1. CG on Normal Equations (WORKS -- shipped as ` LinearSolver.CG ` )
3972
@@ -241,48 +274,65 @@ Schur complement only uses -(D+mu*I) (the (2,2) block), missing the
241274for the Schur complement would need ` diag(A(P+mu*I)^{-1}A^T) ` -- which is
242275exactly what CgNormalEqSolver already computes.
243276
244- ## Why CG on Normal Equations is the Right Approach
245-
246- The experiments above show a consistent pattern: all iterative methods on the
247- full (n+m) x (n+m) KKT system fail at small mu because:
248-
249- 1 . ** Block preconditioners** ignore the A/A^T coupling that dominates at small mu
250- 2 . ** Full ILU** quality degrades with the condition number
251- 3 . ** Diagonal scaling** reduces kappa but introduces unscaling amplification
252- 4 . ** Schur complement** needs a good inner preconditioner, which is the normal
253- equations diagonal -- exactly what CG on normal equations already uses
254-
255- CG on normal equations succeeds because:
256-
257- 1 . ** SPD structure** : The reduction eliminates the indefinite structure, giving
258- an n x n SPD system amenable to CG
259- 2 . ** Effective preconditioner** : ` diag(N) = diag(P) + mu + colsum(A^2/d) ` captures
260- both the P and A^T D^{-1} A contributions to N's diagonal
261- 3 . ** Matrix-free** : N is never formed explicitly; each CG iteration costs
262- O(nnz(A)) via ` N v = (P+mu*I)v + A^T(D^{-1}(Av)) `
277+ ## Why Every Full-KKT Iterative Method Failed
278+
279+ All tested methods on the full (n+m) x (n+m) KKT system share the same root
280+ cause: ** no tested preconditioner captures the off-diagonal A/A^T coupling** .
281+
282+ As mu -> 0, the KKT diagonal blocks shrink to O(mu) while the off-diagonal
283+ blocks A, A^T remain O(||A||). This means the off-diagonal coupling dominates
284+ the system, and any preconditioner that ignores it (block-diagonal, diagonal
285+ scaling, ILU with limited fill) becomes ineffective.
286+
287+ Specifically:
288+ - ** Block preconditioners** treat A/A^T as zero -- completely wrong at small mu
289+ - ** Diagonal scaling** equalizes the diagonal but cannot touch the off-diagonal
290+ structure, and unscaling amplifies the residual by O(1/sqrt(mu))
291+ - ** Full ILU** tries to capture coupling but its quality degrades as
292+ kappa = O(1/mu^2) grows -- the incomplete factorization becomes a poor
293+ approximation of a very ill-conditioned matrix
294+ - ** Schur complement** properly accounts for coupling in principle, but the
295+ inner solve on the Schur complement S = D + A H^{-1} A^T needs a
296+ preconditioner that captures ` diag(A H^{-1} A^T) ` -- which is exactly
297+ the normal equations preconditioner
298+
299+ ## Why CG on Normal Equations Succeeds
300+
301+ CG on normal equations avoids the preconditioning barrier by ** eliminating the
302+ indefinite structure entirely** :
303+
304+ 1 . ** SPD reduction** : Substituting y out gives N = H + A^T D^{-1} A, an n x n
305+ SPD system. No indefiniteness, no sign-dependent amplification.
306+ 2 . ** The preconditioner captures off-diagonal coupling** : ` diag(N) = diag(P) + mu + colsum(A^2/d) ` includes the ` A^T D^{-1} A ` contribution. This is
307+ the key difference -- every failed method lacked this.
308+ 3 . ** Matrix-free** : N is never formed; each CG iteration costs O(nnz(A)) via
309+ ` N v = (P+mu*I)v + A^T(D^{-1}(Av)) `
2633104 . ** No fill-in** : Unlike direct methods, there is no sparse factorization
264311
265- The Schur complement approach via PETSc is mathematically equivalent but with
266- more complexity and no advantage over the scipy CG implementation.
312+ The PETSc Schur complement approach is mathematically equivalent (the Schur
313+ complement IS the normal equations) but with more complexity and no advantage
314+ over scipy CG with the diagonal preconditioner.
267315
268316## Conditioning Analysis
269317
270- Both the full KKT and normal equations N have condition number O(||A||^2/mu^2):
318+ ** Key fact** : Both the full KKT and normal equations N have condition number
319+ ** O(||A||^2/mu^2)** -- they are equally ill-conditioned.
271320
272321- ** KKT** : eigenvalues in ` [-max(d+mu), -mu] U [mu, max(diag(P)+mu) + ||A||^2/mu] ` ,
273322 giving ` kappa(KKT) = O(||A||^2/mu^2) `
274323- ** Normal equations** : ` N = P + mu*I + A^T D^{-1} A ` has
275324 ` lambda_min >= mu ` and ` lambda_max >= ||A_eq||^2/mu ` ,
276325 giving ` kappa(N) >= ||A_eq||^2/mu^2 `
277326
278- The diagonal preconditioner for N reduces the effective condition number
279- significantly in practice, making CG iteration counts manageable across all
280- mu values encountered in the IPM.
327+ CG on normal equations does NOT succeed because of better conditioning. It
328+ succeeds because its diagonal preconditioner is effective (capturing both
329+ diagonal and off-diagonal contributions to N), while no comparably cheap
330+ preconditioner exists for the full indefinite KKT.
281331
282- An initial claim that QTQP's mu* I regularization gives O(1/mu) conditioning
283- ( vs O(1/mu^2) for standard IPMs) was incorrect . The mu * I terms bound the
284- smallest eigenvalue at O(mu), but the largest eigenvalue is dominated by the
285- off-diagonal coupling ||A||^2/mu, giving O(1/mu^2) regardless .
332+ ** Corrected misconception ** : An initial claim that QTQP's mu* I regularization
333+ gives O(1/mu) conditioning ( vs O(1/mu^2) for standard IPMs) was wrong . The
334+ mu * I terms bound the smallest eigenvalue at O(mu), but the largest eigenvalue
335+ is dominated by the off-diagonal coupling ||A||^2/mu, giving O(1/mu^2).
286336
287337## Code in This Branch
288338
0 commit comments