New Theory and Algorithms for Convex Optimization with Non-Standard Structures
Author(s)
Zhao, Renbo
Advisor
Freund, Robert M.
Abstract
Optimization models and algorithms have long played a central and indispensable role in the advancement of science and engineering. In recent years, first-order methods have become prominent in tackling applications arising in machine learning and data science, owing to their simplicity, reasonably fast convergence rates, and low per-iteration computational cost. However, many important applications violate the fundamental assumptions on which existing first-order methods are based: the objective function, despite being convex, is neither Lipschitz nor gradient-Lipschitz on the feasible region. The purpose of this thesis is to propose new optimization models for these “non-standard” problems, to develop new first-order methods for solving these models, and to analyze the convergence rates of these methods.
In the first chapter, we present and analyze a new generalized Frank-Wolfe method for the composite convex optimization problem $\min_{x \in \mathbb{R}^n} f(\mathsf{A}x) + h(x)$, where $f$ is a $\theta$-logarithmically-homogeneous self-concordant barrier, $\mathsf{A}$ is a linear operator, and the function $h$ has a bounded domain but is possibly non-smooth. We show that our generalized Frank-Wolfe method requires $O\big((\delta_0 + \theta + R_h)\ln(\delta_0) + (\theta + R_h)^2/\varepsilon\big)$ iterations to produce an $\varepsilon$-approximate solution, where $\delta_0$ denotes the initial optimality gap and $R_h$ is the variation of $h$ on its domain. This result establishes certain intrinsic connections between $\theta$-logarithmically-homogeneous barriers and the Frank-Wolfe method. When specialized to the $D$-optimal design problem, we essentially recover the complexity obtained by Khachiyan (1996) using the Frank-Wolfe method with exact line-search.
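To make the specialization concrete, the following is a minimal sketch of the Frank-Wolfe method with Khachiyan's exact line-search for the $D$-optimal design problem, i.e. maximizing $\log\det\big(\sum_i x_i a_i a_i^{\top}\big)$ over the unit simplex. This is the classical scheme whose complexity the chapter recovers, not the generalized method itself; the data matrix `A` and the iteration budget are illustrative assumptions.

```python
import numpy as np

def dopt_frank_wolfe(A, num_iters=500):
    """Frank-Wolfe with exact line-search for D-optimal design:
    maximize log det(sum_i x_i a_i a_i^T) over the unit simplex,
    where the a_i are the n columns of the m-by-n matrix A."""
    m, n = A.shape
    x = np.full(n, 1.0 / n)                  # start at the barycenter
    for _ in range(num_iters):
        M = (A * x) @ A.T                    # M(x) = sum_i x_i a_i a_i^T
        Minv = np.linalg.inv(M)
        w = np.einsum('ji,jk,ki->i', A, Minv, A)  # w_i = a_i^T M^{-1} a_i
        j = int(np.argmax(w))                # linear-minimization oracle: vertex e_j
        gamma = (w[j] / m - 1.0) / (w[j] - 1.0)   # Khachiyan's closed-form step
        x *= (1.0 - gamma)                   # convex combination with e_j
        x[j] += gamma
    return x
```

Since $\sum_i x_i w_i = \operatorname{trace}(M^{-1}M) = m$, the selected coordinate satisfies $w_j \ge m$, so $\gamma \in [0,1)$ and the iterates remain in the simplex.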
In the second chapter, we present and analyze a new away-step Frank-Wolfe method for the convex optimization problem $\min_{x \in \mathcal{X}} f(\mathsf{A}x) + \langle c, x\rangle$, where $f$ is a $\theta$-logarithmically-homogeneous self-concordant barrier, $\mathsf{A}$ is a linear operator, $\langle c, \cdot\rangle$ is a linear function, and $\mathcal{X}$ is a nonempty polytope. We establish the global linear convergence rate of our Frank-Wolfe method in terms of both the objective gap and the Frank-Wolfe gap. In particular, this settles the question raised in Ahipasaoglu, Sun and Todd (2008) on the global linear convergence of the away-step Frank-Wolfe method specialized to the $D$-optimal design problem.
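For intuition on the away-step mechanism, here is a minimal generic sketch over a polytope given as the convex hull of its vertices, maintaining the iterate as an explicit convex combination. It uses plain backtracking in place of the barrier-adapted step-size rule analyzed in the chapter, and `obj`, `grad`, and `V` are illustrative assumptions.

```python
import numpy as np

def away_step_fw(obj, grad, V, num_iters=1000):
    """Away-step Frank-Wolfe over conv(V), where the rows of V are the
    polytope's vertices; obj/grad evaluate a smooth convex objective."""
    alpha = np.zeros(V.shape[0])
    alpha[0] = 1.0                                  # start at the first vertex
    x = V[0].astype(float).copy()
    for _ in range(num_iters):
        g = grad(x)
        s = int(np.argmin(V @ g))                   # Frank-Wolfe (toward) vertex
        active = np.flatnonzero(alpha > 1e-12)      # vertices carrying weight
        a = active[int(np.argmax(V[active] @ g))]   # away vertex: worst active one
        if g @ (x - V[s]) >= g @ (V[a] - x):        # compare the two gaps
            d, gamma_max, take_fw = V[s] - x, 1.0, True
        else:
            d = x - V[a]                            # move away from vertex a
            gamma_max = alpha[a] / max(1.0 - alpha[a], 1e-16)
            take_fw = False
        gamma, fx = gamma_max, obj(x)
        while gamma > 1e-12 and obj(x + gamma * d) > fx:
            gamma *= 0.5                            # crude backtracking line-search
        x = x + gamma * d
        if take_fw:                                 # update the vertex weights
            alpha *= (1.0 - gamma); alpha[s] += gamma
        else:
            alpha *= (1.0 + gamma); alpha[a] -= gamma
        alpha = np.maximum(alpha, 0.0)              # guard against round-off
    return x
```

The away step shrinks the weight on the worst active vertex, which is what allows this family of methods to achieve linear convergence where vanilla Frank-Wolfe zig-zags.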
In the third chapter, we propose a generalized multiplicative gradient (MG) method for a class of convex optimization problems that, roughly speaking, involves minimizing a $1$-logarithmically-homogeneous function over a “slice” of a symmetric cone. This problem class captures several important applications, including positron emission tomography, $D$-optimal design, quantum state tomography, and Nesterov's relaxation of boolean quadratic optimization. We show, via the machinery of Euclidean Jordan algebras, that this generalized MG method converges with rate $O(\ln(n)/k)$, where $n$ denotes the rank of the symmetric cone.
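As a concrete vector-case instance (the nonnegative orthant is the simplest symmetric cone), the sketch below applies the multiplicative gradient update to the positron-emission-tomography maximum-likelihood problem; the data `P` and `w` are illustrative assumptions.

```python
import numpy as np

def multiplicative_gradient_pet(P, w, num_iters=2000):
    """Multiplicative gradient (MG) method for the PET likelihood problem
        minimize over the simplex:  f(x) = -sum_j w_j * log((P x)_j),
    with P (m-by-n) nonnegative and the weights w summing to one.  Here f is
    1-logarithmically homogeneous, f(t x) = f(x) - log t, so Euler's identity
    gives <x, -grad f(x)> = 1 and the update below stays on the simplex."""
    n = P.shape[1]
    x = np.full(n, 1.0 / n)          # start at the barycenter
    for _ in range(num_iters):
        r = w / (P @ x)              # per-measurement residuals w_j / (Px)_j
        x = x * (P.T @ r)            # multiplicative update: x_i <- x_i * (-grad f(x))_i
    return x
```

Read through the Euclidean Jordan-algebra lens, the same multiplicative update extends beyond the orthant to other symmetric cones, which is how the chapter reaches applications such as quantum state tomography.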
Date issued
2023-06
Department
Massachusetts Institute of Technology. Operations Research Center
Publisher
Massachusetts Institute of Technology