dc.contributor.author | Mezentsev, Gleb | |
dc.contributor.author | Gusak, Danil | |
dc.contributor.author | Oseledets, Ivan | |
dc.contributor.author | Frolov, Evgeny | |
dc.date.accessioned | 2025-06-12T20:52:14Z | |
dc.date.available | 2025-06-12T20:52:14Z | |
dc.date.issued | 2024-10-08 | |
dc.identifier.isbn | 979-8-4007-0505-2 | |
dc.identifier.uri | https://hdl.handle.net/1721.1/159401 | |
dc.description | RecSys ’24, October 14–18, 2024, Bari, Italy | en_US |
dc.description.abstract | The scalability issue plays a crucial role in productionizing modern recommender systems. Even lightweight architectures may suffer from high computational overhead due to intermediate calculations, limiting their practicality in real-world applications. Specifically, applying the full Cross-Entropy (CE) loss often yields state-of-the-art performance in terms of recommendation quality, but it suffers from excessive GPU memory utilization when dealing with large item catalogs. This paper introduces a novel Scalable Cross-Entropy (SCE) loss function in the sequential learning setup. It approximates the CE loss for datasets with large item catalogs, improving both time efficiency and memory usage without compromising recommendation quality. Unlike traditional negative sampling methods, our approach utilizes a selective GPU-efficient computation strategy, focusing on the most informative elements of the catalog, particularly those most likely to be false positives. This is achieved by approximating the softmax distribution over a subset of the model outputs through maximum inner product search. Experimental results on multiple datasets demonstrate that SCE reduces peak memory usage by a factor of up to 100 compared to the alternatives while retaining or even exceeding their metric values. The proposed approach also opens new perspectives for large-scale developments in other domains, such as large language models. | en_US |
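The abstract describes the core mechanism only at a high level. Below is a minimal PyTorch sketch of the idea it outlines: cross-entropy computed over a reduced candidate set consisting of the true target plus the catalog items with the highest inner-product scores (the most likely false positives). The function name and the parameter k are hypothetical, and for clarity the hard negatives here are found by exact top-k search over the full catalog; the paper's SCE instead locates them with a GPU-efficient approximate maximum inner product search, which is where the memory savings come from.

    import torch
    import torch.nn.functional as F

    def sce_loss_sketch(hidden, item_emb, target, k=256):
        # hidden:   (batch, dim)    final sequence representations from the model
        # item_emb: (n_items, dim)  catalog item embeddings
        # target:   (batch,)        indices of the ground-truth next items
        with torch.no_grad():
            scores = hidden @ item_emb.T                      # (batch, n_items)
            # Exclude the target itself so it is not duplicated among negatives.
            scores.scatter_(1, target.unsqueeze(1), float('-inf'))
            # Hardest negatives: catalog items most likely to be false positives.
            negatives = scores.topk(k, dim=1).indices         # (batch, k)

        # Candidate set = true target at position 0, followed by k hard negatives.
        candidates = torch.cat([target.unsqueeze(1), negatives], dim=1)    # (batch, k+1)
        logits = torch.einsum('bd,bkd->bk', hidden, item_emb[candidates])  # (batch, k+1)

        # Softmax/CE over the candidate set only; the target is always class 0.
        labels = torch.zeros(target.size(0), dtype=torch.long, device=hidden.device)
        return F.cross_entropy(logits, labels)

Restricting the softmax to k+1 logits per example, rather than one logit per catalog item, is what bounds the size of the intermediate logit matrix when the catalog is large.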
dc.publisher | ACM, 18th ACM Conference on Recommender Systems | en_US |
dc.relation.isversionof | https://doi.org/10.1145/3640457.3688140 | en_US |
dc.rights | Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. | en_US |
dc.source | Association for Computing Machinery | en_US |
dc.title | Scalable Cross-Entropy Loss for Sequential Recommendations with Large Item Catalogs | en_US |
dc.type | Article | en_US |
dc.identifier.citation | Mezentsev, Gleb, Gusak, Danil, Oseledets, Ivan and Frolov, Evgeny. 2024. "Scalable Cross-Entropy Loss for Sequential Recommendations with Large Item Catalogs." In Proceedings of the 18th ACM Conference on Recommender Systems (RecSys '24), Bari, Italy. | |
dc.identifier.mitlicense | PUBLISHER_POLICY | |
dc.eprint.version | Final published version | en_US |
dc.type.uri | http://purl.org/eprint/type/ConferencePaper | en_US |
eprint.status | http://purl.org/eprint/status/NonPeerReviewed | en_US |
dc.date.updated | 2025-06-01T07:47:49Z | |
dc.language.rfc3066 | en | |
dc.rights.holder | The author(s) | |
dspace.date.submission | 2025-06-01T07:47:49Z | |
mit.license | PUBLISHER_POLICY | |
mit.metadata.status | Authority Work and Publication Information Needed | en_US |