Distributed Stream Filtering For Database Applications

William M. Shapiro, Kenneth J. Goldman

Abstract

Distributed stream filtering is a mechanism for implementing a new class of real-time applications with distributed processing requirements. These applications require scalable architectures to support the efficient processing and multiplexing of large volumes of continuously generated data.

This paper provides an overview of a stream-oriented model for database query processing and presents a supporting implementation. To facilitate distributed stream filtering, we introduce several new query processing operations, including pipelined filtering, which efficiently joins database streams and eliminates duplicates from them, and a new join method, the progressive join, which joins streams of tuples. Finally, recognizing that the stream-oriented model results in performance tradeoffs that differ significantly from those in traditional databases, we present a new query optimization strategy specifically designed for stream-oriented databases.
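
As a rough illustration of what pipelined processing over tuple streams can look like, the sketch below chains a duplicate-eliminating filter into a streaming join, emitting each result as soon as its inputs have arrived. It is not the report's implementation: the join shown is a generic symmetric hash join standing in for the progressive join, whose details are not given in this abstract, and the names `distinct`, `symmetric_hash_join`, and the `("L" | "R", tuple)` event encoding are assumptions made for this example.

```python
from typing import Dict, Hashable, Iterable, Iterator, List, Tuple

Event = Tuple[str, tuple]  # (side, tuple); side is "L" or "R" (assumed encoding)


def distinct(stream: Iterable[Hashable]) -> Iterator[Hashable]:
    """Pass each item downstream the first time it is seen; drop repeats."""
    seen = set()
    for item in stream:
        if item not in seen:
            seen.add(item)
            yield item


def symmetric_hash_join(events: Iterable[Event], key: int = 0) -> Iterator[tuple]:
    """Join two interleaved streams on position `key`, one arrival at a time.

    Generic symmetric hash join (a stand-in, not the report's progressive join):
    each arriving tuple is stored in its own side's table and probed against
    the other side's table, so matches are emitted without waiting for either
    stream to end.
    """
    left: Dict[Hashable, List[tuple]] = {}
    right: Dict[Hashable, List[tuple]] = {}
    for side, tup in events:
        if side == "L":
            left.setdefault(tup[key], []).append(tup)
            for match in right.get(tup[key], []):
                yield tup + match
        else:
            right.setdefault(tup[key], []).append(tup)
            for match in left.get(tup[key], []):
                yield match + tup


# Interleaved arrivals from two hypothetical streams; the third event is a repeat.
events = [
    ("L", (1, "a")),
    ("R", (1, "x")),
    ("L", (1, "a")),
    ("R", (2, "y")),
    ("L", (2, "b")),
]
results = list(symmetric_hash_join(distinct(events)))
```

Because both stages are generators, no stage materializes its whole input before the next one starts. The cost is that `distinct` and both hash tables grow with the stream, which is the kind of tradeoff a stream-oriented optimizer would have to weigh.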

The technical report version is available as either PostScript (.ps) or compressed PostScript (.ps.Z).


Washington University Department of Computer Science Technical Report WUCS-96-27, October 1996.
Prepared by Bill Shapiro (wms1@cs.wustl.edu)
Washington University Department of Computer Science