Web Information Collection and Statistics Services Based on LSA

doi:10.3969/j.issn.1000-3428.2008.15.029

Computer Engineering, 2008, Vol. 34, Issue (15): 83-84,8.

LI Xiao-ting1, ZHANG Lei2, SHEN Jian-jing2

(1. Department of Communication Equipment Management, Xi’an Communication College, Xi’an 710106; 2. Department of Electrical Information Engineering, PLA Information Engineering University, Zhengzhou 450001)

Online: 2008-08-05 Published: 2008-08-05

Abstract

Abstract: In the network information age, traditional statistical and prediction methods are no longer fully applicable to Web information, while the demand for collecting and compiling statistics on information in specific domains is increasingly evident. Effective, targeted collection and statistics of domain-specific information, together with the corresponding predictions, has therefore become an increasingly important research direction. This paper applies Chinese word segmentation, Latent Semantic Analysis (LSA), and semantic matching to construct a user interest model, and uses a Service-Oriented Architecture (SOA) to design the Web information collection and statistics service. Experiments verify the effect of controlling information collection and statistics through analysis of Web information structure and prediction of the relevance of unknown information.

Key words: information collection, Latent Semantic Analysis (LSA), Service-Oriented Architecture (SOA), Web service
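As a rough illustration of how LSA can score the relevance of an unseen page against a user interest model, the following minimal sketch projects a toy term-document matrix into a two-dimensional latent space and compares documents to an interest vector by cosine similarity. It is not the authors' implementation: the toy documents, the function names `lsa_basis`, `project`, and `cosine`, and the use of power iteration in place of a full SVD routine are all assumptions made for illustration, and pre-tokenized English terms stand in for the output of Chinese word segmentation.

```python
import math

def lsa_basis(A, k, iters=200):
    """Approximate the top-k left singular vectors of the term-document
    matrix A (rows = terms, columns = documents) using power iteration
    with deflation on the Gram matrix G = A^T A. Illustrative only; a
    production system would call a proper SVD routine."""
    n_terms, n_docs = len(A), len(A[0])
    G = [[sum(A[t][i] * A[t][j] for t in range(n_terms))
          for j in range(n_docs)] for i in range(n_docs)]
    basis = []
    for c in range(k):
        v = [1.0 / (i + 1 + c) for i in range(n_docs)]  # fixed, non-random start
        for _ in range(iters):
            w = [sum(G[i][j] * v[j] for j in range(n_docs)) for i in range(n_docs)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        Gv = [sum(G[i][j] * v[j] for j in range(n_docs)) for i in range(n_docs)]
        lam = sum(x * y for x, y in zip(v, Gv))  # eigenvalue of G = sigma^2
        sigma = math.sqrt(max(lam, 0.0))
        if sigma < 1e-12:
            break  # no further latent dimensions
        # Left singular vector u = A v / sigma (a direction in term space).
        u = [sum(A[t][j] * v[j] for j in range(n_docs)) / sigma
             for t in range(n_terms)]
        basis.append(u)
        # Deflate so the next pass finds the next singular direction.
        G = [[G[i][j] - lam * v[i] * v[j] for j in range(n_docs)]
             for i in range(n_docs)]
    return basis

def project(term_vector, basis):
    """Map a term-space vector to latent-space coordinates."""
    return [sum(t * u_t for t, u_t in zip(term_vector, u)) for u in basis]

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0.0 and nb > 0.0 else 0.0

# Hypothetical toy corpus (already segmented into terms).
TERMS = ["web", "crawler", "page", "stock", "market"]
DOCS = [["web", "crawler"], ["crawler", "page"], ["web", "page"], ["stock", "market"]]
A = [[float(doc.count(term)) for doc in DOCS] for term in TERMS]

basis = lsa_basis(A, k=2)

# Hypothetical user interest model: a single term, "web".
interest = project([1.0, 0.0, 0.0, 0.0, 0.0], basis)
scores = [cosine(interest, project([row[j] for row in A], basis))
          for j in range(len(DOCS))]
```

In this toy setup the second document ("crawler", "page") scores close to 1 even though it never contains the term "web", because it shares a latent topic with the Web-related documents, while the "stock market" document scores close to 0. A relevance threshold on such scores is one plausible way a collector could decide which pages to fetch and count.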

CLC Number: N945

LI Xiao-ting; ZHANG Lei; SHEN Jian-jing. Web Information Collection and Statistics Services Based on LSA[J]. Computer Engineering, 2008, 34(15): 83-84,8.

https://www.ecice06.com/CN/Y2008/V34/I15/83
