Computer Engineering ›› 2013, Vol. 39 ›› Issue (6): 60-65. doi: 10.3969/j.issn.1000-3428.2013.06.012

• Advanced Computing and Data Processing •

Compilation and Optimization of SQL Query Statements on Column-oriented Database

ZHEN Zhen, CHEN Hu, ZHANG Lin-ya

  1. (School of Software Engineering, South China University of Technology, Guangzhou 510006, China)
  • Received: 2012-06-18 Online: 2013-06-15 Published: 2013-06-14
  • About the authors: ZHEN Zhen (b. 1987), male, master's student, research interests: high-performance computing and parallel algorithms; CHEN Hu, associate professor; ZHANG Lin-ya, master's student
  • Funding:
    Guangdong Province Science and Technology Plan Project (2011A010801008, 2011A090200122, 2011A090200027)

Abstract: Column-oriented databases built on heterogeneous platforms that combine multi-core CPUs and GPUs are suited to massive data sets and complex queries. However, their optimization has focused mainly on the low (physical) level, and the back-end execution sequence can only be produced by manual hard-coding, so it cannot adapt to the variety of SQL query statements. To address this problem, this paper designs and implements a compiler that automatically translates SQL query statements into execution sequences, and studies Common Subexpression Elimination (CSE) across multiple complex expressions together with a method for merging multiple primitive dependency graphs. Comparison with hand-coded execution sequences in GSQL, where no compiler is used, shows that the compiler speeds up the evaluation of arithmetic expressions and shortens the execution time of SQL query statements.

Key words: column-oriented database, primitive, compiler, dependency graph, Common Subexpression Elimination (CSE), query optimization
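
The idea of eliminating common subexpressions across several expressions can be illustrated with a small sketch. This is not the compiler described in the paper. It is a generic value-numbering pass, and the function name `compile_expressions`, the tuple encoding of expressions, and the temporary names `t0`, `t1`, ... are all assumptions made for illustration. Each distinct (operator, operand, operand) triple is emitted as one primitive, and because all expressions share one lookup table, their dependency graphs are in effect merged into a single sequence.

```python
from typing import Dict, List, Tuple, Union

# Hypothetical encoding: a column name, or a tuple (op, left, right).
Expr = Union[str, tuple]
Primitive = Tuple[str, str, str, str]  # (dest, op, left, right)


def compile_expressions(exprs: List[Expr]) -> Tuple[List[Primitive], List[str]]:
    """Lower several expression trees into one primitive sequence,
    computing each repeated subexpression only once."""
    seen: Dict[Tuple[str, str, str], str] = {}  # (op, left, right) -> temp name
    sequence: List[Primitive] = []

    def lower(e: Expr) -> str:
        if isinstance(e, str):
            return e  # plain column reference, nothing to compute
        op, left, right = e
        l, r = lower(left), lower(right)
        key = (op, l, r)
        if key in seen:
            return seen[key]  # reuse the earlier result
        dest = f"t{len(sequence)}"
        sequence.append((dest, op, l, r))
        seen[key] = dest
        return dest

    outputs = [lower(e) for e in exprs]
    return sequence, outputs


# Two expressions that share the subexpression (a + b).
queries = [("*", ("+", "a", "b"), "c"), ("-", ("+", "a", "b"), "d")]
sequence, outputs = compile_expressions(queries)
for step in sequence:
    print(step)
```

Here `a + b` appears in both expressions but is lowered to a single primitive, so the sequence has three steps instead of four. The sketch treats operand order as significant, so it would not recognise `b + a` as the same subexpression; a real pass would likely normalise commutative operators first.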
