Computer Engineering

• Architecture and Software Technology •

A Preloaded Caching Strategy for Graph Data

ZUO Yao¹,², LIANG Ying¹, XU Hongbo¹, HUANG Shuo¹,²

  1. Key Laboratory of Network Data Science and Technology, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
  2. University of Chinese Academy of Sciences, Beijing 100190, China
  • Received: 2015-02-12 Online: 2016-05-15 Published: 2016-05-13
  • About the authors: ZUO Yao (born 1991), male, master's student, main research interest: big data; LIANG Ying, senior engineer; XU Hongbo, associate research fellow; HUANG Shuo, master's degree.
  • Supported by:
    National Basic Research Program of China ("973" Program) (2012CB316303, 2013CB329602); Key Program of the National Natural Science Foundation of China (61232010); General Program of the National Natural Science Foundation of China (61173064); National Key Technology R&D Program of China (2012BAH39B04).

Abstract: Many large-scale, highly connected graphs exist in the real world, and graph caching can effectively improve the access and query efficiency of graph data. This paper proposes a preloaded caching strategy for large-scale graph data. The strategy uses two loading methods, one based on node access logs ("log-based") and one that loads high-degree nodes first ("big degree node first"), both of which exploit the locality of graph data access to cache frequently accessed data. A distributed cache framework for graph data is designed in the GolaxyGDB graph storage system, and the implementation of the caching strategy within it is described. Experimental results show that the proposed strategy effectively improves the hit ratio of complex graph queries and reduces response time, meeting the online access demands of practical applications.

Key words: preloaded caching strategy, graph data, big degree node first, access log, Apache HBase database, distributed cache
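
To make the two loading methods named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes an in-memory adjacency-set graph and a flat list of accessed node ids as the "access log"; the function names (`rank_by_access_log`, `rank_by_degree`, `preload`, `hit_ratio`), the cache capacity, and the sample data are all hypothetical and do not reflect the GolaxyGDB or HBase APIs.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable, List, Set

# Assumption: the graph is a plain dict mapping node id -> set of neighbour ids.
Graph = Dict[Hashable, Set[Hashable]]


def rank_by_access_log(access_log: Iterable[Hashable]) -> List[Hashable]:
    """Log-based ranking: most frequently accessed node ids first."""
    counts = Counter(access_log)
    # Ties are broken by the id's repr so the order is deterministic.
    return sorted(counts, key=lambda n: (-counts[n], repr(n)))


def rank_by_degree(graph: Graph) -> List[Hashable]:
    """Big-degree-first ranking: nodes with the most neighbours first."""
    return sorted(graph, key=lambda n: (-len(graph[n]), repr(n)))


def preload(graph: Graph, ranking: List[Hashable], capacity: int) -> Graph:
    """Fill a cache with the adjacency data of the top-ranked nodes."""
    cache: Graph = {}
    for node in ranking:
        if len(cache) >= capacity:
            break
        if node in graph:  # skip log entries for nodes no longer in the graph
            cache[node] = graph[node]
    return cache


def hit_ratio(cache: Graph, queries: List[Hashable]) -> float:
    """Fraction of node lookups that the preloaded cache can answer."""
    if not queries:
        return 0.0
    return sum(1 for q in queries if q in cache) / len(queries)


# Hypothetical sample data for illustration only.
graph: Graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
    "e": set(),
}
access_log = ["d", "b", "d", "e", "d", "b"]
queries = ["d", "a", "b", "d"]

log_cache = preload(graph, rank_by_access_log(access_log), capacity=2)
degree_cache = preload(graph, rank_by_degree(graph), capacity=2)

print(list(log_cache), hit_ratio(log_cache, queries))        # ['d', 'b'] 0.75
print(list(degree_cache), hit_ratio(degree_cache, queries))  # ['a', 'b'] 0.5
```

In this toy setting the log-based ranking needs prior access history, while the degree-based ranking needs only the graph itself, which suggests why a degree-based method could serve when no log is available yet. Which method performs better on real workloads is a question for the paper's experiments, not for this sketch.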
