Computer Engineering

• Mobile Internet and Communication Technology

Web Proxy Server Cache Optimization Based on Tree Augmented Naive Bayes Classifier

ZHAO Zhongquan, LIU Dan

  1. (Research Institute of Electronic Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China)
  • Received: 2015-12-31 Online: 2017-01-15 Published: 2017-01-13
  • About the authors: ZHAO Zhongquan (born 1991), male, master's student, whose main research interests are network communication and data mining; LIU Dan, associate professor, Ph.D.

Abstract:

A Web proxy server cache can, to a certain extent, reduce network congestion and user access latency and lighten server load. However, the cache hit rate and byte hit rate of Web proxy caches are low, so the cache does little to speed up responses to network requests. To address this, this paper studies supervised learning methods and uses a Tree Augmented Naive Bayes (TANB) classifier to classify Web log data and thereby predict which Web objects are likely to be requested again. Combining this prediction with the Least Recently Used (LRU) algorithm, it proposes a new cache strategy. Experimental results show that the TANB classifier outperforms classifiers such as naive Bayes and the BP neural network in precision and recall. Compared with the plain LRU algorithm, the TANB-optimized cache strategy not only improves cache efficiency but also effectively raises the request hit rate and byte hit rate of the Web proxy cache.

Key words: Web proxy cache, Bayes classifier, Bayes network, circular sliding window, data set
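
The abstract does not say how the classifier's prediction is combined with LRU, so the following Python sketch shows only one assumed scheme, not the authors' method. Objects predicted to be requested again are placed at the most recently used end of the queue, and all others at the least recently used end, where they are evicted first. The class `ClassifierGuidedLRU`, the callback `will_revisit` (a stand-in for a trained TANB classifier), and the example URLs are all hypothetical names introduced for illustration. The sketch also tracks the two metrics named in the abstract: request hit rate and byte hit rate.

```python
from collections import OrderedDict
from typing import Callable


class ClassifierGuidedLRU:
    """Byte-bounded LRU cache whose queue position follows a revisit prediction."""

    def __init__(self, capacity_bytes: int,
                 will_revisit: Callable[[str, int], bool]) -> None:
        self.capacity_bytes = capacity_bytes
        # Assumption: stand-in for the trained classifier's prediction.
        self.will_revisit = will_revisit
        # Maps URL -> object size; the first entry is evicted first.
        self.store: "OrderedDict[str, int]" = OrderedDict()
        self.used_bytes = 0
        self.requests = 0
        self.hits = 0
        self.bytes_requested = 0
        self.bytes_hit = 0

    def request(self, url: str, size: int) -> bool:
        """Serve one request and return True on a cache hit."""
        self.requests += 1
        self.bytes_requested += size
        if url in self.store:
            self.hits += 1
            self.bytes_hit += size
            self._place(url, size)
            return True
        if size <= self.capacity_bytes:
            self._evict_until_fits(size)
            self.store[url] = size
            self.used_bytes += size
            self._place(url, size)
        return False

    def _place(self, url: str, size: int) -> None:
        # Predicted revisit -> most recently used end; otherwise the
        # least recently used end, so the object is evicted first.
        self.store.move_to_end(url, last=self.will_revisit(url, size))

    def _evict_until_fits(self, size: int) -> None:
        while self.used_bytes + size > self.capacity_bytes:
            _, evicted_size = self.store.popitem(last=False)
            self.used_bytes -= evicted_size

    def hit_rate(self) -> float:
        """Fraction of requests served from the cache."""
        return self.hits / self.requests if self.requests else 0.0

    def byte_hit_rate(self) -> float:
        """Fraction of requested bytes served from the cache."""
        if not self.bytes_requested:
            return 0.0
        return self.bytes_hit / self.bytes_requested


if __name__ == "__main__":
    trace = [("/hot/a", 40), ("/cold/b", 40), ("/cold/c", 40), ("/hot/a", 40)]
    guided = ClassifierGuidedLRU(100, lambda url, size: url.startswith("/hot"))
    plain = ClassifierGuidedLRU(100, lambda url, size: True)  # ordinary LRU
    for url, size in trace:
        guided.request(url, size)
        plain.request(url, size)
    print(guided.hit_rate(), plain.hit_rate())
```

Passing a predictor that always returns `True` reduces the class to ordinary LRU, which gives a baseline for comparing the two hit-rate metrics on the same request trace.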

CLC number: