Learning efficiently with approximate inference via dual losses
Author(s): Meshi, Ofer; Sontag, David Alexander; Jaakkola, Tommi S.; Globerson, Amir
Abstract: Many structured prediction tasks involve complex models where inference is computationally intractable, but where it can be well approximated using a linear programming relaxation. Previous approaches to learning structured predictors (e.g., cutting-plane, subgradient methods, perceptron) repeatedly make predictions for some of the data points. These approaches are computationally demanding because each prediction involves solving a linear program to optimality. We present a scalable algorithm for learning structured predictors. The main idea is to instead solve the dual of the structured prediction loss. We formulate the learning task as a convex minimization over both the weights and the dual variables corresponding to each data point. As a result, we can begin to optimize the weights even before completely solving any of the individual prediction problems. We show how the dual variables can be efficiently optimized using coordinate descent. Our algorithm is competitive with state-of-the-art methods such as stochastic subgradient and cutting-plane.
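The abstract describes interleaving block-coordinate descent on each example's dual variables with (sub)gradient updates on the weights, rather than solving each loss-augmented inference LP to optimality. Below is a minimal, hypothetical sketch of that idea on a toy two-binary-node, one-edge model, where the dual of the loss-augmented max is the standard dual-decomposition bound and the block-coordinate updates have a closed form. All names, the toy data, the step sizes, and the specific update form are invented for illustration; this is not the authors' code.

```python
# Hedged sketch (not the paper's implementation): interleave coordinate descent
# on per-example dual variables with subgradient steps on the weights, instead
# of solving each loss-augmented inference problem (an LP) to optimality.
# Toy model: two binary nodes joined by one edge; all details are illustrative.
import numpy as np

rng = np.random.default_rng(0)
d, n_states = 3, 2

def make_example():
    y = rng.integers(0, n_states, size=2)          # true labels for the 2 nodes
    x = rng.normal(size=(2, d))
    x[np.arange(2), y] += 1.5                      # features correlate with labels
    return x, y

data = [make_example() for _ in range(30)]
w_un = np.zeros((n_states, d))                     # unary weights (one row per state)
w_ed = np.zeros((n_states, n_states))              # edge weights (one scalar per pair)
deltas = [np.zeros((2, n_states)) for _ in data]   # per-example dual variables

def lossaug_potentials(x, y):
    theta = x @ w_un.T                             # theta[node, state]
    theta += (np.arange(n_states)[None, :] != y[:, None])  # Hamming loss term
    return theta

def coord_pass(theta, delta):
    """One closed-form block-coordinate pass on this example's dual variables."""
    g0 = (w_ed - delta[1][None, :]).max(axis=1)    # reparametrized edge max, node 0
    delta[0] = 0.5 * (g0 - theta[0])
    g1 = (w_ed - delta[0][:, None]).max(axis=0)    # node 1, using the fresh delta[0]
    delta[1] = 0.5 * (g1 - theta[1])
    return delta

def dual_bound_and_grad(x, theta, delta):
    """Dual-decomposition upper bound on the loss-augmented max + a subgradient in w."""
    s = [int(np.argmax(theta[n] + delta[n])) for n in range(2)]
    edge = w_ed - delta[0][:, None] - delta[1][None, :]
    t = np.unravel_index(np.argmax(edge), edge.shape)
    bound = (theta[0] + delta[0]).max() + (theta[1] + delta[1]).max() + edge.max()
    g_un, g_ed = np.zeros_like(w_un), np.zeros_like(w_ed)
    for n in range(2):
        g_un[s[n]] += x[n]
    g_ed[t] += 1.0
    return bound, g_un, g_ed

eta, lam = 0.05, 0.01
for epoch in range(50):
    for i, (x, y) in enumerate(data):
        theta = lossaug_potentials(x, y)
        deltas[i] = coord_pass(theta, deltas[i])   # partial dual optimization only
        bound, g_un, g_ed = dual_bound_and_grad(x, theta, deltas[i])
        g_un[y[0]] -= x[0]; g_un[y[1]] -= x[1]     # subtract the true feature map
        g_ed[y[0], y[1]] -= 1.0
        w_un -= eta * (g_un + lam * w_un)
        w_ed -= eta * (g_ed + lam * w_ed)

# Sanity quantities: the dual bound upper-bounds the exact loss-augmented max
# for any delta (weak duality), and the learned weights fit the toy data.
def exact_lossaug_max(theta):
    return max(theta[0, a] + theta[1, b] + w_ed[a, b]
               for a in range(n_states) for b in range(n_states))

x0, y0 = data[0]
theta0 = lossaug_potentials(x0, y0)
bound0, _, _ = dual_bound_and_grad(x0, theta0, deltas[0])
gap = bound0 - exact_lossaug_max(theta0)           # must be >= 0 by duality

def predict(x):
    th = x @ w_un.T                                # no loss augmentation at test time
    return max(((a, b) for a in range(n_states) for b in range(n_states)),
               key=lambda ab: th[0, ab[0]] + th[1, ab[1]] + w_ed[ab])

acc = np.mean([predict(x)[n] == y[n] for x, y in data for n in range(2)])
```

The point illustrated is that `deltas[i]` receives only one coordinate pass per visit and is warm-started across epochs, yet the resulting dual bound is always a valid upper bound on the structured hinge loss, so the weight update can proceed before any example's inference problem is solved to optimality.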
Department: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory; Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
International Conference on Machine Learning (27th, 2010) proceedings
International Machine Learning Society
Citation: Meshi, Ofer et al. "Learning Efficiently with Approximate Inference via Dual Losses." Proceedings of the 27th International Conference on Machine Learning, Haifa, Israel, 2010.
Final published version