The Economics of Large Language Models: Token Allocation, Fine-Tuning, and Optimal Pricing

Bergemann, Dirk; Bonatti, Alessandro; Smolin, Alex

dc.contributor.author	Bergemann, Dirk
dc.contributor.author	Bonatti, Alessandro
dc.contributor.author	Smolin, Alex
dc.date.accessioned	2026-02-10T18:01:01Z
dc.date.available	2026-02-10T18:01:01Z
dc.date.issued	2025-07-02
dc.identifier.isbn	979-8-4007-1943-1
dc.identifier.uri	https://hdl.handle.net/1721.1/164780
dc.description	EC ’25, July 7–10, 2025, Stanford, CA, USA	en_US
dc.description.abstract	We develop an economic framework to analyze the optimal pricing and product design of Large Language Models (LLMs). Our framework captures several key features of LLMs: variable operational costs of processing input and output tokens; the ability to customize models through fine-tuning; and high-dimensional user heterogeneity in terms of task requirements and error sensitivity. In our model, a monopolistic seller offers multiple versions of LLMs through a menu of products. The optimal pricing structure depends on whether token allocation across tasks is contractible and whether users face scale constraints. When it is possible to contract on the entire assignment of tokens to tasks, the seller's problem ("Token Allocations") is an infinite-dimensional screening problem, which is well known to be difficult. We are nonetheless able to make progress in two important classes of environments: a binary environment and two-dimensional value-scale heterogeneity, in which case users with similar aggregate value-scale characteristics choose similar levels of fine-tuning and token consumption. When only the total number of tokens is contractible ("Token Packages"), we leverage the tractability of a constant elasticity of substitution framework to drastically simplify the problem: the buyer's type (a function mapping each task to a value of precision) is summarized by an index. This index for the value of precision allows the seller to solve a one-dimensional screening problem. The optimal mechanism can be implemented through menus of two-part tariffs, with higher markups for more intensive users. Our results rationalize observed industry practices such as tiered pricing based on model customization and usage levels.	en_US
dc.publisher	ACM|The 26th ACM Conference on Economics and Computation	en_US
dc.relation.isversionof	https://doi.org/10.1145/3736252.3742625	en_US
dc.rights	Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use.	en_US
dc.source	Association for Computing Machinery	en_US
dc.title	The Economics of Large Language Models: Token Allocation, Fine-Tuning, and Optimal Pricing	en_US
dc.type	Article	en_US
dc.identifier.citation	Dirk Bergemann, Alessandro Bonatti, and Alex Smolin. 2025. The Economics of Large Language Models: Token Allocation, Fine-Tuning, and Optimal Pricing. In Proceedings of the 26th ACM Conference on Economics and Computation (EC '25). Association for Computing Machinery, New York, NY, USA, 786.	en_US
dc.identifier.mitlicense	PUBLISHER_POLICY
dc.eprint.version	Final published version	en_US
dc.type.uri	http://purl.org/eprint/type/ConferencePaper	en_US
eprint.status	http://purl.org/eprint/status/NonPeerReviewed	en_US
dc.date.updated	2025-08-01T09:02:48Z
dc.language.rfc3066	en
dc.rights.holder	The author(s)
dspace.date.submission	2025-08-01T09:02:48Z
mit.license	PUBLISHER_POLICY
mit.metadata.status	Authority Work and Publication Information Needed	en_US

Files in this item

Name: 3736252.3742625.pdf
Size: 501.8Kb
Format: PDF

This item appears in the following Collection(s)

MIT Open Access Articles