SeerCuts: Explainable Attribute Discretization

Lai, Eugenie; Croitoru, Inbal; Bitton, Noam; Shalem, Ariel; Youngmann, Brit; Galhotra, Sainyam; Rezig, El Kindi; Cafarella, Michael

dc.contributor.author	Lai, Eugenie
dc.contributor.author	Croitoru, Inbal
dc.contributor.author	Bitton, Noam
dc.contributor.author	Shalem, Ariel
dc.contributor.author	Youngmann, Brit
dc.contributor.author	Galhotra, Sainyam
dc.contributor.author	Rezig, El Kindi
dc.contributor.author	Cafarella, Michael
dc.date.accessioned	2026-02-09T22:16:56Z
dc.date.available	2026-02-09T22:16:56Z
dc.date.issued	2025-06-22
dc.identifier.isbn	979-8-4007-1564-8
dc.identifier.uri	https://hdl.handle.net/1721.1/164768
dc.description	SIGMOD-Companion ’25, Berlin, Germany	en_US
dc.description.abstract	This demonstration showcases SeerCuts, a tool that suggests useful and semantically meaningful discretization strategies (partitions) for numerical attributes. SeerCuts is a generic, interactive framework where users specify attributes to discretize and their utility measure for a downstream task of choice. It uses GPT-4o to assess the semantic meaningfulness of candidate partitions and employs an efficient search strategy to explore the vast space of discretization options. With hierarchical clustering to group related partitions and a multi-armed bandit policy to identify useful partitions with only a few samples, SeerCuts quickly finds meaningful and useful partitions. In the demo, we will provide an overview of SeerCuts and allow the audience to explore various datasets and tasks, including data visualization and comprehensive modeling. Users will be able to evaluate how SeerCuts identifies meaningful discretization strategies and compare the tradeoffs between different discretization options.	en_US
dc.publisher	ACM|Companion of the 2025 International Conference on Management of Data	en_US
dc.relation.isversionof	https://doi.org/10.1145/3722212.3725132	en_US
dc.rights	Creative Commons Attribution	en_US
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/	en_US
dc.source	Association for Computing Machinery	en_US
dc.title	SeerCuts: Explainable Attribute Discretization	en_US
dc.type	Article	en_US
dc.identifier.citation	Eugenie Lai, Inbal Croitoru, Noam Bitton, Ariel Shalem, Brit Youngmann, Sainyam Galhotra, El Kindi Rezig, and Michael Cafarella. 2025. SeerCuts: Explainable Attribute Discretization. In Companion of the 2025 International Conference on Management of Data (SIGMOD/PODS '25). Association for Computing Machinery, New York, NY, USA, 143–146.	en_US
dc.contributor.department	Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory	en_US
dc.identifier.mitlicense	PUBLISHER_POLICY
dc.eprint.version	Final published version	en_US
dc.type.uri	http://purl.org/eprint/type/ConferencePaper	en_US
eprint.status	http://purl.org/eprint/status/NonPeerReviewed	en_US
dc.date.updated	2025-08-01T08:54:35Z
dc.language.rfc3066	en
dc.rights.holder	The author(s)
dspace.date.submission	2025-08-01T08:54:35Z
mit.license	PUBLISHER_CC
mit.metadata.status	Authority Work and Publication Information Needed	en_US

Files in this item

Name: 3722212.3725132.pdf
Size: 1.415Mb
Format: PDF

This item appears in the following Collection(s)

MIT Open Access Articles