
dc.contributor.author: Bergemann, Dirk
dc.contributor.author: Bonatti, Alessandro
dc.contributor.author: Smolin, Alex
dc.date.accessioned: 2026-02-10T18:01:01Z
dc.date.available: 2026-02-10T18:01:01Z
dc.date.issued: 2025-07-02
dc.identifier.isbn: 979-8-4007-1943-1
dc.identifier.uri: https://hdl.handle.net/1721.1/164780
dc.description: EC ’25, July 7–10, 2025, Stanford, CA, USA [en_US]
dc.description.abstract: We develop an economic framework to analyze the optimal pricing and product design of Large Language Models (LLMs). Our framework captures several key features of LLMs: variable operational costs of processing input and output tokens; the ability to customize models through fine-tuning; and high-dimensional user heterogeneity in terms of task requirements and error sensitivity. In our model, a monopolistic seller offers multiple versions of LLMs through a menu of products. The optimal pricing structure depends on whether token allocation across tasks is contractible and whether users face scale constraints. When it is possible to contract on the entire assignment of tokens to tasks, the seller's problem ("Token Allocations") is an infinite-dimensional screening problem, which is well known to be difficult. We are nonetheless able to make progress in two important classes of environments: binary environments, and environments with two-dimensional value-scale heterogeneity, in which users with similar aggregate value-scale characteristics choose similar levels of fine-tuning and token consumption. When only the total number of tokens is contractible ("Token Packages"), we leverage the tractability of a constant elasticity of substitution framework to drastically simplify the problem: the buyer's type (a function mapping each task to a value of precision) is an index. This index for the value of precision allows the seller to solve a one-dimensional screening problem. The optimal mechanism can be implemented through menus of two-part tariffs, with higher markups for more intensive users. Our results rationalize observed industry practices such as tiered pricing based on model customization and usage levels. [en_US]
dc.publisher: ACM|The 26th ACM Conference on Economics and Computation [en_US]
dc.relation.isversionof: https://doi.org/10.1145/3736252.3742625 [en_US]
dc.rights: Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. [en_US]
dc.source: Association for Computing Machinery [en_US]
dc.title: The Economics of Large Language Models: Token Allocation, Fine-Tuning, and Optimal Pricing [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Dirk Bergemann, Alessandro Bonatti, and Alex Smolin. 2025. The Economics of Large Language Models: Token Allocation, Fine-Tuning, and Optimal Pricing. In Proceedings of the 26th ACM Conference on Economics and Computation (EC '25). Association for Computing Machinery, New York, NY, USA, 786. [en_US]
dc.identifier.mitlicense: PUBLISHER_POLICY
dc.eprint.version: Final published version [en_US]
dc.type.uri: http://purl.org/eprint/type/ConferencePaper [en_US]
eprint.status: http://purl.org/eprint/status/NonPeerReviewed [en_US]
dc.date.updated: 2025-08-01T09:02:48Z
dc.language.rfc3066: en
dc.rights.holder: The author(s)
dspace.date.submission: 2025-08-01T09:02:48Z
mit.license: PUBLISHER_POLICY
mit.metadata.status: Authority Work and Publication Information Needed [en_US]

