
Computer Engineering

• Advanced Computing and Data Processing •

Study on Incremental Data Capturing Method Based on Log Analysis

PENG Yuanhao, PAN Jiuhui

  1. (College of Information Science and Technology, Jinan University, Guangzhou 510632, China)
  • Received: 2014-07-23 Online: 2015-06-15 Published: 2015-06-15
  • About the authors: PENG Yuanhao (born 1989), male, master's student, main research interest: high-performance database technology; PAN Jiuhui (corresponding author), professor.
  • Supported by:

    Open Fund of the State Key Laboratory of Software Engineering, Wuhan University (SKLSE2012-09-37); Technology Research Program of the Ministry of Public Security (2014 JSYJB048).

Abstract:

Changed data can be captured by scanning database log files. However, existing approaches are each tied to one specific type of database management system (DBMS) and do not eliminate redundant information. To address these shortcomings, this paper proposes a general model, based on log analysis, for incremental data detection and net effect processing, and describes the common steps of incremental detection as three modules: log extraction, log analysis, and net effect processing. Experiments analyze net effect processing speed, redundant data compression ratio, network transmission speed, and other factors. The results show that net effect processing effectively reduces the network transmission time and update time of the data and improves operating efficiency.

Key words: information integration, incremental detection, redundant information, log analysis, log extraction, net effect
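
The abstract names net effect processing but does not spell out its rules. As an illustration only, the following minimal Python sketch shows one common way such a step can collapse several logged changes to the same row into a single change before transmission. The record layout `(op, key, row)`, the operation codes `"I"`, `"U"`, `"D"`, the function name `net_effect`, and the merge rules themselves are assumptions made for this example, not the authors' published algorithm.

```python
from typing import Any, Dict, Iterable, List, Optional, Tuple

# Hypothetical change record: (operation, primary key, full row image or None).
Change = Tuple[str, Any, Optional[dict]]


def net_effect(changes: Iterable[Change]) -> List[Change]:
    """Collapse an ordered stream of changes to at most one change per key.

    Assumed merge rules (illustrative, not taken from the paper):
      I then U -> I with the latest row
      I then D -> nothing (the row never reaches the target)
      U then U -> U with the latest row
      U then D -> D
      D then I -> U with the new row
    """
    pending: Dict[Any, Tuple[str, Optional[dict]]] = {}
    for op, key, row in changes:
        if op not in ("I", "U", "D"):
            raise ValueError(f"unknown operation {op!r}")
        if key not in pending:
            # First change seen for this key in the batch: keep it as is.
            pending[key] = (op, row)
            continue
        prev_op, _ = pending[key]
        if prev_op == "I" and op == "U":
            pending[key] = ("I", row)
        elif prev_op == "I" and op == "D":
            del pending[key]  # insert and delete cancel out
        elif prev_op == "U" and op == "U":
            pending[key] = ("U", row)
        elif prev_op == "U" and op == "D":
            pending[key] = ("D", None)
        elif prev_op == "D" and op == "I":
            pending[key] = ("U", row)
        else:
            raise ValueError(f"invalid sequence {prev_op} -> {op} for key {key!r}")
    # Dicts preserve insertion order, so keys appear in first-seen order.
    return [(op, key, row) for key, (op, row) in pending.items()]


log = [
    ("I", 1, {"v": "a"}), ("U", 1, {"v": "b"}),
    ("I", 2, {"v": "x"}), ("D", 2, None),
    ("U", 3, {"v": "c"}), ("D", 3, None),
    ("D", 4, None), ("I", 4, {"v": "z"}),
]
result = net_effect(log)
```

In this made-up batch, eight logged changes reduce to three, which is the kind of redundancy reduction the abstract credits with shortening network transmission and update time.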

CLC number: