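Abstract: Distributed processing is an inevitable trend in the development of data stream management systems. This paper studies join queries over distributed data streams and proposes the DM3Join algorithm, which consists of two parts. First, concurrent join requests are decomposed and identical join predicates are merged to form distributed query operators. Second, the data streams flow through the distributed agents, where partial joins are performed, and the final results are assembled at the query engine. DM3Join executes window joins using a structure similar to a routing table, and because intermediate results can be shared, the algorithm needs to scan the data only once. Analysis and experiments show that the join algorithm is efficient.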
Key words:
data streams,
window join,
continuous queries,
distributed systems
Abstract: Distributed processing is a promising route toward more effective and adaptive data stream processing. This paper studies window joins over data streams, an important class of continuous operators in distributed processing, and proposes a distributed join approach named DM3Join. DM3Join consists of two parts. The first decomposes concurrent join queries and merges identical join predicates to form distributed join operators. The second performs partial joins as the data streams move through the distributed agents and assembles the final results at the query engine. Unlike most other algorithms, DM3Join executes window joins with a structure that works like a routing table and needs only one scan over the data streams, because different join queries share intermediate results. Experimental results show that the algorithm is effective.
Key words:
Data streams,
Window join,
Continuous queries,
Distributed system
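The first part of DM3Join, merging identical join predicates across concurrent queries, can be illustrated with a minimal sketch. The abstract does not specify the operator layout or the routing structure, so everything here is an assumption made for illustration: predicates are modeled as pairs of attribute names, and `normalize` and `build_routing_table` are hypothetical helpers, not the authors' implementation.

```python
from collections import defaultdict


def normalize(pred):
    # Assumption: A.x = B.x and B.x = A.x denote the same equi-join predicate,
    # so the two attribute names are sorted into a canonical order.
    return tuple(sorted(pred))


def build_routing_table(queries):
    """Map each distinct join predicate to the ids of the queries that use it."""
    table = defaultdict(set)
    for query_id, predicates in queries.items():
        for pred in predicates:
            table[normalize(pred)].add(query_id)
    return dict(table)


# Three hypothetical concurrent join queries over streams A, B, C, D.
queries = {
    "Q1": [("A.x", "B.x"), ("B.y", "C.y")],
    "Q2": [("B.x", "A.x")],
    "Q3": [("B.y", "C.y"), ("C.z", "D.z")],
}

table = build_routing_table(queries)
for pred, users in sorted(table.items()):
    print(pred, sorted(users))
```

In this sketch the five predicate occurrences collapse to three distinct predicates, so a partial join on `A.x = B.x` would be computed once and its result routed to both Q1 and Q2. That sharing is the property the abstract credits for the single scan over the data.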
刘学军;钱江波. 分布式数据流连接查询算法[J]. 计算机工程, 2006, 32(21): 41-43.
LIU Xuejun; QIAN Jiangbo. Algorithms for Sliding Window Join over Distributed Data Stream[J]. Computer Engineering, 2006, 32(21): 41-43.