On the Power of Decision Trees in Auto-Regressive Language Modeling

Gan, Yulu; Galanti, Tomer; Poggio, Tomaso; Malach, Eran

dc.contributor.author	Gan, Yulu
dc.contributor.author	Galanti, Tomer
dc.contributor.author	Poggio, Tomaso
dc.contributor.author	Malach, Eran
dc.date.accessioned	2024-09-30T17:39:24Z
dc.date.available	2024-09-30T17:39:24Z
dc.date.issued	2024-09-27
dc.identifier.uri	https://hdl.handle.net/1721.1/157074
dc.description.abstract	Originally proposed for handling time series data, Auto-regressive Decision Trees (ARDTs) have not yet been explored for language modeling. This paper delves into both the theoretical and practical applications of ARDTs in this new context. We theoretically demonstrate that ARDTs can compute complex functions, such as simulating automata, Turing machines, and sparse circuits, by leveraging "chain-of-thought" computations. Our analysis provides bounds on the size, depth, and computational efficiency of ARDTs, highlighting their surprising computational power. Empirically, we train ARDTs on simple language generation tasks, showing that they can learn to generate coherent and grammatically correct text on par with a smaller Transformer model. Additionally, we show that ARDTs can be used on top of transformer representations to solve complex reasoning tasks. This research reveals the unique computational abilities of ARDTs, aiming to broaden the architectural diversity in language model development.	en_US
dc.description.sponsorship	This material is based upon work supported by the Center for Brains, Minds and Machines (CBMM), funded by NSF STC award CCF-1231216.	en_US
dc.publisher	Center for Brains, Minds and Machines (CBMM)	en_US
dc.relation.ispartofseries	CBMM Memo;149
dc.title	On the Power of Decision Trees in Auto-Regressive Language Modeling	en_US
dc.type	Article	en_US
dc.type	Technical Report	en_US
dc.type	Working Paper	en_US

Files in this item

Name: CBMM-Memo-149.pdf
Size: 2.105Mb
Format: PDF

This item appears in the following Collection(s)

CBMM Memo Series