ZStream : a cost-based query processor for composite event detection
Author(s)
Mei, Yuan, Ph. D. Massachusetts Institute of Technology
Alternative title
Searching for optimal plans in sequential composite event detection
Other Contributors
Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science.
Advisor
Samuel Madden.
Abstract
Composite (or complex) event processing (CEP) systems search sequences of incoming primitive events for occurrences of user-specified event patterns. They have recently gained attention in a variety of areas because of their expressive query languages and their performance potential. Sequentiality (temporal ordering) is the primary way in which CEP relates events to each other. Examples include tracing a car's movement in a predefined area (where a car moves through a series of places), detecting anomalies in stock prices (where the rise and fall of the price of some stocks is monitored), detecting intrusion in network monitoring (where a specific sequence of malicious activities is detected) and catching break points in debugging systems (where a sequence of function calls is made). But even searching for a simple sequence pattern involving only equality constraints between its components is an NP-complete problem. Furthermore, simple sequentiality is not enough to express many real-world patterns, which also involve conjunction (e.g., concurrent events), disjunction (e.g., a choice between two options) and negation, making the matching problem even more complex. In this thesis, we present a CEP system called ZStream to efficiently process such sequential patterns. Besides simple sequentiality, ZStream also supports other relations such as conjunction, disjunction, negation and Kleene Closure. ZStream uses a tree-based plan for both the logical and physical representation of query patterns. Using this tree-based infrastructure, ZStream is able to unify the evaluation of sequence, conjunction, disjunction, negation, and Kleene Closure as variants of the join operator. A single pattern may have several equivalent physical tree plans with different evaluation costs. Hence a cost model is proposed to estimate the computation cost of a plan. Experiments show that our cost model accurately captures the real evaluation cost of a query plan. Based on this cost model and using a simple set of statistics about operator selectivity and data rates, ZStream is able to adjust the order in which it detects patterns. In addition, we design a dynamic programming algorithm and propose equivalent transition rules to automatically search for an optimal query plan for a given pattern.
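The abstract does not spell out the cost model or the search algorithm, so the following is only a minimal sketch of how a dynamic program over binary tree plans for a sequence pattern might look. It assumes a deliberately simplified cost (each join pays for the number of input pairs it compares) and predicates only between adjacent event classes. The function name `best_plan` and its parameters are hypothetical and are not taken from the thesis.

```python
from functools import lru_cache


def best_plan(names, rates, sel, window):
    """Pick the cheapest binary tree plan for the sequence names[0]; ...; names[-1].

    Assumptions (illustrative only, not the thesis's cost model):
      rates[i] : arrival rate of event class i (events per time unit)
      sel[i]   : selectivity of the predicate between classes i and i+1
      window   : length of the time window
    The cost of a join is the number of (left, right) pairs it compares.
    """
    n = len(names)

    def card(i, j):
        # Estimated number of partial matches covering classes i..j.
        c = 1.0
        for k in range(i, j + 1):
            c *= rates[k] * window
        for k in range(i, j):
            c *= sel[k]
        return c

    @lru_cache(maxsize=None)
    def solve(i, j):
        # Returns (cost, plan) for the sub-sequence i..j.
        if i == j:
            return 0.0, names[i]  # a leaf buffer costs nothing to "join"
        best = None
        for k in range(i, j):  # try every split point
            left_cost, left_plan = solve(i, k)
            right_cost, right_plan = solve(k + 1, j)
            cost = left_cost + right_cost + card(i, k) * card(k + 1, j)
            if best is None or cost < best[0]:
                best = (cost, (left_plan, right_plan))
        return best

    return solve(0, n - 1)


if __name__ == "__main__":
    cost, plan = best_plan(["A", "B", "C"], [4, 1, 8], [0.25, 1.0], 1.0)
    print(cost, plan)
```

With these made-up statistics the sketch prefers joining A and B first, because the selective predicate between them shrinks the intermediate result before the high-rate class C is joined. Changing the rates or selectivities can flip the preferred tree shape, which is the kind of plan adjustment the abstract describes.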
Description
Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008. Includes bibliographical references (p. 103-104).
Date issued
2008
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology
Keywords
Electrical Engineering and Computer Science.