| dc.contributor.author | Bergemann, Dirk | |
| dc.contributor.author | Bonatti, Alessandro | |
| dc.contributor.author | Smolin, Alex | |
| dc.date.accessioned | 2026-02-10T18:01:01Z | |
| dc.date.available | 2026-02-10T18:01:01Z | |
| dc.date.issued | 2025-07-02 | |
| dc.identifier.isbn | 979-8-4007-1943-1 | |
| dc.identifier.uri | https://hdl.handle.net/1721.1/164780 | |
| dc.description | EC ’25, July 7–10, 2025, Stanford, CA, USA | en_US |
| dc.description.abstract | We develop an economic framework to analyze the optimal pricing and product design of Large Language Models (LLMs). Our framework captures several key features of LLMs: variable operational costs of processing input and output tokens; the ability to customize models through fine-tuning; and high-dimensional user heterogeneity in terms of task requirements and error sensitivity. In our model, a monopolistic seller offers multiple versions of LLMs through a menu of products. The optimal pricing structure depends on whether token allocation across tasks is contractible and whether users face scale constraints. When it is possible to contract on the entire assignment of tokens to tasks, the seller's problem ("Token Allocations") is an infinite-dimensional screening problem, which is well known to be difficult. We are nonetheless able to make progress in two important classes of environments: binary environments and two-dimensional value-scale heterogeneity, in which case users with similar aggregate value-scale characteristics choose similar levels of fine-tuning and token consumption. When only the total number of tokens is contractible ("Token Packages"), we leverage the tractability of a constant elasticity of substitution framework to drastically simplify the problem: the buyer's type, a function mapping each task to a value of precision, is summarized by an index. This index for the value of precision allows the seller to solve a one-dimensional screening problem. The optimal mechanism can be implemented through menus of two-part tariffs, with higher markups for more intensive users. Our results rationalize observed industry practices such as tiered pricing based on model customization and usage levels. | en_US |
| dc.publisher | ACM\|The 26th ACM Conference on Economics and Computation | en_US |
| dc.relation.isversionof | https://doi.org/10.1145/3736252.3742625 | en_US |
| dc.rights | Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. | en_US |
| dc.source | Association for Computing Machinery | en_US |
| dc.title | The Economics of Large Language Models: Token Allocation, Fine-Tuning, and Optimal Pricing | en_US |
| dc.type | Article | en_US |
| dc.identifier.citation | Dirk Bergemann, Alessandro Bonatti, and Alex Smolin. 2025. The Economics of Large Language Models: Token Allocation, Fine-Tuning, and Optimal Pricing. In Proceedings of the 26th ACM Conference on Economics and Computation (EC '25). Association for Computing Machinery, New York, NY, USA, 786. | en_US |
| dc.identifier.mitlicense | PUBLISHER_POLICY | |
| dc.eprint.version | Final published version | en_US |
| dc.type.uri | http://purl.org/eprint/type/ConferencePaper | en_US |
| eprint.status | http://purl.org/eprint/status/NonPeerReviewed | en_US |
| dc.date.updated | 2025-08-01T09:02:48Z | |
| dc.language.rfc3066 | en | |
| dc.rights.holder | The author(s) | |
| dspace.date.submission | 2025-08-01T09:02:48Z | |
| mit.license | PUBLISHER_POLICY | |
| mit.metadata.status | Authority Work and Publication Information Needed | en_US |