dc.contributor.author: Markakis, Markos
dc.contributor.author: Chen, An Bo
dc.contributor.author: Youngmann, Brit
dc.contributor.author: Gao, Trinity
dc.contributor.author: Zhang, Ziyu
dc.contributor.author: Shahout, Rana
dc.contributor.author: Chen, Peter Baile
dc.contributor.author: Liu, Chunwei
dc.contributor.author: Sabek, Ibrahim
dc.contributor.author: Cafarella, Michael
dc.date.accessioned: 2024-07-23T20:18:35Z
dc.date.available: 2024-07-23T20:18:35Z
dc.date.issued: 2024-06-09
dc.identifier.isbn: 979-8-4007-0422-2
dc.identifier.uri: https://hdl.handle.net/1721.1/155775
dc.description: SIGMOD-Companion ’24, June 9–15, 2024, Santiago, AA, Chile [en_US]
dc.description.abstract: Causal analysis is an essential lens for understanding complex system dynamics in domains as varied as medicine, economics and law. Computer systems are often similarly complex, but much of the information about them is only available in long, messy, semi-structured log files. This demo presents Sawmill, an open-source system that makes it possible to extract causal conclusions from log files. Sawmill employs methods drawn from the areas of data transformation, cleaning, and extraction in order to transform logs into a representation amenable to causal analysis. It gives log-derived variables human-understandable names and distills the information present in a log file around a user's chosen causal units (e.g. users or machines), generating appropriate aggregated variables for each causal unit. It then leverages original algorithms to efficiently use this representation for the novel process of Exploration-based Causal Discovery - the task of constructing a sufficient causal model of the system from available data. Users can engage with this process via an interactive interface, ultimately making causal inference possible using off-the-shelf tools. SIGMOD'24 participants will be able to use Sawmill to efficiently answer causal questions about logs. We will guide attendees through the process of quantifying the impact of parameter tuning on query latency using real-world PostgreSQL server logs, before letting them test Sawmill on additional logs with known causal effects but varying difficulty. A companion video for this submission is available online. [en_US]
dc.publisher: ACM|Companion of the 2024 International Conference on Management of Data [en_US]
dc.relation.isversionof: 10.1145/3626246.3654731 [en_US]
dc.rights: Creative Commons Attribution [en_US]
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/ [en_US]
dc.source: Association for Computing Machinery [en_US]
dc.title: Sawmill: From Logs to Causal Diagnosis of Large Systems [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Markakis, Markos, Chen, An Bo, Youngmann, Brit, Gao, Trinity, Zhang, Ziyu et al. 2024. "Sawmill: From Logs to Causal Diagnosis of Large Systems."
dc.contributor.department: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
dc.identifier.mitlicense: PUBLISHER_CC
dc.eprint.version: Final published version [en_US]
dc.type.uri: http://purl.org/eprint/type/ConferencePaper [en_US]
eprint.status: http://purl.org/eprint/status/NonPeerReviewed [en_US]
dc.date.updated: 2024-07-01T07:54:45Z
dc.language.rfc3066: en
dc.rights.holder: The author(s)
dspace.date.submission: 2024-07-01T07:54:46Z
mit.license: PUBLISHER_CC
mit.metadata.status: Authority Work and Publication Information Needed [en_US]

