摘要: 为从海量深网中获得有价值的信息,提出一种深网数据源质量估计模型,综合考虑接口查询能力、接口页面质量和服务质量3方面因素,采用SVM和Ranking SVM机器学习方法得到质量估计函数。实验结果表明,该估计函数得到的数据源质量排序序列和人工排序序列的Kendall’s 距离超过0.5,且获得较高的精度。
关键词:
深网,
查询能力,
查询接口,
服务质量
Abstract: To obtain valuable information from the massive Deep Web, this paper proposes a quality estimation model for Deep Web data sources. The model considers three factors: the query capability of the interface, the quality of the interface page, and the Quality of Service (QoS). SVM and Ranking SVM machine learning methods are used to obtain the quality estimation function. Experimental results show that the Kendall’s distance between the data source quality ranking produced by this estimation function and the manual ranking exceeds 0.5, and that the function achieves relatively high accuracy.
Key words:
Deep Web,
query capability,
query interface,
Quality of Service (QoS)
胡鹏昱; 赵朋朋; 方 巍; 崔志明. 深网数据源质量估计模型[J]. 计算机工程, 2009, 35(9): 204-207.
HU Peng-yu; ZHAO Peng-peng; FANG Wei; CUI Zhi-ming. Quality Estimation Model of Deep Web Data Source[J]. Computer Engineering, 2009, 35(9): 204-207.