dc.description.abstract | First-order methods are optimization algorithms that can be described and analyzed using only the values and gradients of the functions to be minimized. These methods have become the main workhorses of modern large-scale optimization and machine learning due to their low per-iteration costs, minimal memory requirements, and dimension-independent convergence guarantees. As the data revolution continues to unfold, discovering faster first-order methods and establishing rigorous convergence analyses for existing ones have become central problems of the big-data era. To that end, this thesis advances the computer-assisted design and analysis of first-order methods and related problems.
The core contribution of this thesis is a set of computer-assisted methodologies for analyzing and designing first-order methods via nonconvex quadratically constrained quadratic programs (QCQPs). The key idea is to pose the analysis or design of a first-order method as a nonconvex but practically tractable QCQP and then solve it to global optimality using a custom spatial branch-and-bound algorithm.
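To make the problem class concrete, a generic nonconvex QCQP can be sketched as follows, where the matrices A_i, vectors b_i, and scalars c_i are illustrative placeholders rather than the thesis's exact notation:
\[
\begin{aligned}
\underset{x \in \mathbb{R}^n}{\text{minimize}} \quad & x^\top A_0 x + b_0^\top x + c_0 \\
\text{subject to} \quad & x^\top A_i x + b_i^\top x + c_i \le 0, \quad i = 1, \dots, m,
\end{aligned}
\]
where the matrices A_i need not be positive semidefinite; this indefiniteness is the source of nonconvexity, and spatial branch-and-bound certifies global optimality by recursively partitioning the variable domain and computing bounds over each subregion.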
In Chapter 2 of this thesis, we present Branch-and-Bound Performance Estimation Programming (BnB-PEP), a unified methodology for constructing optimal first-order methods for convex and nonconvex optimization. BnB-PEP poses the problem of finding the optimal first-order method as a nonconvex but practically tractable QCQP and solves it to certifiable global optimality using a customized branch-and-bound algorithm. By exploiting specific problem structure, our customized branch-and-bound algorithm outperforms the latest off-the-shelf implementations by orders of magnitude, reducing solution times from hours to seconds and from weeks to minutes. We apply BnB-PEP to several practically relevant convex and nonconvex setups and obtain first-order methods with bounds that improve upon prior state-of-the-art results. Furthermore, we use the BnB-PEP methodology to find proofs with potential-function structure, thereby systematically generating analytical convergence proofs.
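Roughly speaking, the design problem underlying BnB-PEP can be viewed as a min-max problem over stepsizes and problem instances; the following schematic is an illustrative sketch in generic notation (the symbols h, \mathcal{F}, \mathcal{E}, and R are placeholders, not the thesis's exact formulation):
\[
\underset{h}{\text{minimize}} \;\; \underset{f \in \mathcal{F},\; x_0}{\text{maximize}} \;\; \mathcal{E}\big(x_N(h, f, x_0)\big) \quad \text{subject to} \quad \|x_0 - x_\star\| \le R,
\]
where h collects the stepsizes of an N-step first-order method, \mathcal{F} is the function class of interest, and \mathcal{E} is a performance measure such as f(x_N) - f(x_\star); reformulating the inner worst-case problem through interpolation conditions is what produces the nonconvex QCQP handled by the branch-and-bound algorithm.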
In Chapter 3, we propose a QCQP-based computer-assisted approach to the worst-case convergence analysis of nonlinear conjugate gradient methods (NCGMs). These methods are known for their generally good empirical performance in large-scale optimization, yet their analyses remain relatively incomplete. Using our computer-assisted approach, we establish novel complexity bounds for the Polak-Ribière-Polyak (PRP) and Fletcher-Reeves (FR) NCGMs for smooth strongly convex minimization. In particular, we construct mathematical proofs that establish the first non-asymptotic convergence bound for FR (historically the first NCGM developed) and a substantially improved non-asymptotic convergence bound for PRP. Additionally, we provide simple adversarial examples on which these methods perform no better than gradient descent with exact line search, leaving very little room for improvement on the same class of problems.
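For reference, the two methods differ only in their choice of momentum coefficient. Writing g_k = \nabla f(x_k) for the gradient and d_k for the search direction, the standard updates are
\[
x_{k+1} = x_k + \gamma_k d_k, \qquad d_{k+1} = -g_{k+1} + \beta_k d_k,
\]
with the step size \gamma_k chosen by line search and
\[
\beta_k^{\mathrm{FR}} = \frac{\|g_{k+1}\|^2}{\|g_k\|^2}, \qquad \beta_k^{\mathrm{PRP}} = \frac{g_{k+1}^\top (g_{k+1} - g_k)}{\|g_k\|^2}.
\]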
In Chapter 4 of this thesis, we develop the nonconvex exterior-point optimization solver (NExOS), a first-order algorithm tailored to sparse and low-rank optimization problems. We consider the problem of minimizing a convex function over a nonconvex constraint set that can be decomposed as the intersection of a compact convex set and a nonconvex set involving sparse or low-rank constraints. Unlike convex relaxation approaches, NExOS exploits the nonconvex geometry and finds a locally optimal point of the original problem by solving a sequence of penalized problems with strictly decreasing penalty parameters. NExOS solves each penalized problem with a first-order algorithm that, under regularity conditions, converges linearly to a local minimum of the corresponding penalized formulation. Furthermore, the local minima of the penalized problems converge to a local minimum of the original problem as the penalty parameter goes to zero. We implement and test NExOS on many instances from a wide variety of sparse and low-rank optimization problems, empirically demonstrating that our algorithm outperforms specialized methods.
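As a schematic illustration only (the quadratic distance penalty and the parameter \mu below are generic placeholders, not the thesis's exact penalized formulation), the exterior-point scheme can be thought of as solving
\[
\underset{x \in \mathcal{C}}{\text{minimize}} \;\; f(x) + \frac{1}{2\mu}\, \mathrm{dist}^2(x, \mathcal{N})
\]
for a strictly decreasing sequence of penalty parameters \mu \downarrow 0, where \mathcal{C} is the compact convex set and \mathcal{N} is the nonconvex sparse or low-rank set; each penalized problem is handled by a first-order algorithm, and the local minima of these penalized problems then converge to a local minimum of the original problem as \mu goes to zero.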