摘要: 大多数的网站体积庞大、结构复杂,因此要考察与网站相关的问题比较有效的方法是进行网站信息可视化。而可视化的一个关键问题就是如何对网站拓扑结构等一些基础数据进行提取和表示。该文提出了一种网站拓扑结构及基本信息的提取方法。其中包括提取过程中一些复杂问题的解决方案、关键技术以及数据的表示和存储结构等。介绍了基于这种方法所开发的一个网站拓扑结构自动提取工具,以及利用该工具所进行的应用试验。
关键词:
网站;拓扑结构;数据提取;十字链表
Abstract: Most websites are large and structurally complex, so an effective way to examine website-related problems is to visualize website information. A key issue in such visualization is how to extract and represent basic data such as the website's topology. This paper presents a method for extracting a website's topology and basic information, covering solutions to several complicated problems that arise during extraction, the key technologies involved, and the structures used to represent and store the data. It also describes an automatic website topology extraction tool developed on the basis of this method, and an application experiment carried out with the tool.
Key words:
Website; Topology structure; Data extraction; Cross-linked list
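The keywords name the cross-linked (orthogonal) list as the storage structure, but the abstract gives no implementation details. As a rough illustration only, the sketch below shows how a website's link graph could be held in such a list, so that both the outgoing and the incoming links of a page can be walked without scanning every link. The names `Arc`, `Page`, `SiteGraph`, `add_link`, `out_links` and `in_links` are hypothetical and are not taken from the paper or its tool.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, Optional


@dataclass
class Arc:
    """One hyperlink (tail -> head); hypothetical name, not from the paper."""
    tail: str
    head: str
    next_out: Optional["Arc"] = None  # next arc leaving the same tail page
    next_in: Optional["Arc"] = None   # next arc entering the same head page


@dataclass
class Page:
    """One page (vertex) with the heads of its two arc chains."""
    url: str
    first_out: Optional[Arc] = None   # chain of outgoing links
    first_in: Optional[Arc] = None    # chain of incoming links


class SiteGraph:
    """Directed link graph stored as a cross-linked (orthogonal) list."""

    def __init__(self) -> None:
        self.pages: Dict[str, Page] = {}

    def add_page(self, url: str) -> Page:
        # Create the page on first sight, otherwise return the existing one.
        return self.pages.setdefault(url, Page(url))

    def add_link(self, src: str, dst: str) -> None:
        s, d = self.add_page(src), self.add_page(dst)
        # Head-insert the new arc into both chains at once.
        arc = Arc(src, dst, next_out=s.first_out, next_in=d.first_in)
        s.first_out = arc
        d.first_in = arc

    def out_links(self, url: str) -> Iterator[str]:
        arc = self.pages[url].first_out
        while arc is not None:
            yield arc.head
            arc = arc.next_out

    def in_links(self, url: str) -> Iterator[str]:
        arc = self.pages[url].first_in
        while arc is not None:
            yield arc.tail
            arc = arc.next_in


# Small illustrative site (the URLs are made up for this example).
site = SiteGraph()
site.add_link("/", "/about")
site.add_link("/", "/products")
site.add_link("/about", "/")
print(sorted(site.out_links("/")))
print(list(site.in_links("/")))
```

Each link is stored once and threaded onto two chains, which is what distinguishes this structure from a plain adjacency list: finding the pages that link to a given page costs only the length of its incoming chain.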
何玉宝,刘正捷,田晓杰. 网站拓扑结构提取技术的研究与应用[J]. 计算机工程, 2006, 32(1): 157-159,179.
HE Yubao, LIU Zhengjie, TIAN Xiaojie. Research and Application of Web Topology Structure Extraction Technology[J]. Computer Engineering, 2006, 32(1): 157-159,179.