Efficient coordinate descent for ranking with domination loss

Stevens, Mark A., M. Eng. Massachusetts Institute of Technology

Other Contributors

Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science.

Advisor

Yoram Singer and Michael Collins.

Terms of use

M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582

Abstract

We define a new batch coordinate-descent ranking algorithm based on a domination loss, which is designed to rank a small number of positive examples above all negatives, with a large penalty on false positives. Its objective is to learn a linear ranking function for a query with labeled training examples in order to rank documents. The derived single-coordinate updates scale linearly with respect to the number of examples. We investigate a number of modifications to the basic algorithm, including regularization, layers of examples, and feature induction. The algorithm is tested on multiple datasets and problem settings, including Microsoft's LETOR dataset, the Corel image dataset, a Google image dataset, and Reuters RCV1. Specific results vary by problem and dataset, but the algorithm generally performed similarly to existing algorithms when rated by average precision and precision at top k. It does not train as quickly as online algorithms, but offers extensions to multiple layers, and perhaps most importantly, can be used to produce extremely sparse weight vectors. When trained with feature induction, it achieves similarly competitive performance but with much more compact models.
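As a rough illustration of the ideas named in the abstract, the sketch below assumes one common form of a domination loss: for each positive example, the log of one plus the sum of exponentiated score margins of the negatives over that positive. This form, the function names, and the finite-difference coordinate step are assumptions made for illustration only. The thesis derives closed-form single-coordinate updates that scale linearly in the number of examples, and those are not reproduced here. The two evaluation helpers compute precision at top k and average precision in their standard definitions.

```python
import math


def scores(w, X):
    """Linear ranking scores w . x for each example x in X."""
    return [sum(wi * xi for wi, xi in zip(w, x)) for x in X]


def domination_loss(w, X, y):
    """Assumed loss form (not taken from the thesis text):
    sum over positives p of log(1 + sum over negatives n of exp(s_n - s_p))."""
    s = scores(w, X)
    pos = [si for si, yi in zip(s, y) if yi == 1]
    neg = [si for si, yi in zip(s, y) if yi == 0]
    return sum(math.log1p(sum(math.exp(sn - sp) for sn in neg)) for sp in pos)


def coordinate_step(w, X, y, k, step=0.1, eps=1e-6):
    """One naive update of coordinate k using a central finite-difference
    gradient. A stand-in for the derived updates, and far more expensive."""
    w_plus = list(w)
    w_plus[k] += eps
    w_minus = list(w)
    w_minus[k] -= eps
    grad_k = (domination_loss(w_plus, X, y) - domination_loss(w_minus, X, y)) / (2 * eps)
    w_new = list(w)
    w_new[k] -= step * grad_k
    return w_new


def precision_at_k(s, y, k):
    """Fraction of the k highest-scored examples that are positive."""
    order = sorted(range(len(s)), key=lambda i: -s[i])
    return sum(y[i] for i in order[:k]) / k


def average_precision(s, y):
    """Mean of precision values taken at the rank of each positive example."""
    order = sorted(range(len(s)), key=lambda i: -s[i])
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y[i] == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0
```

A possible usage, on made-up toy data: start from `w = [0.0, 0.0]`, cycle `coordinate_step` over each coordinate `k` for a few passes, then score the examples with `scores(w, X)` and evaluate with `precision_at_k` and `average_precision`. Updating one coordinate at a time is what allows a sparse weight vector to emerge, since coordinates that are never selected stay at zero.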

Description

Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2010.

Cataloged from PDF version of thesis.

Includes bibliographical references (p. 37-38).

Date issued

2010

URI

http://hdl.handle.net/1721.1/61592

Department

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Publisher

Massachusetts Institute of Technology

Keywords

Electrical Engineering and Computer Science.

Collections

Graduate Theses