Computer Engineering ›› 2006, Vol. 32 ›› Issue (20): 59-61. doi: 10.3969/j.issn.1000-3428.2006.20.022

• Software Technology and Database •

Generation Method of XML Document Based on Chain-link Structure

CHEN Zailiang (陈再良), XU Dezhi (徐德智), CHEN Xuegong (陈学工), SHEN Hailan (沈海澜)

  1. (College of Information Science and Engineering, Central South University, Changsha 410083)
  • Received:1900-01-01 Revised:1900-01-01 Online:2006-10-20 Published:2006-10-20

Abstract: This paper proposes a method for generating XML documents based on a linked (chain) structure. An algorithm is designed that parses HTML documents with the StreamTokenizer class in Java. Nodes created from the parsed element content and text content are inserted at their corresponding positions, so that the DOM parse tree is built in step with parsing. The DOM parse tree is then traversed, and the information obtained is stored as a binary linked list. Finally, a modified preorder traversal of this binary linked list extracts the information needed to build the DTD, which completes the conversion and generation process.

Key words: HTML, XML, document object model (DOM), parsing
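
The pipeline summarized in the abstract can be illustrated with a minimal Java sketch. It is not the authors' implementation. The class and method names (ChainDtdSketch, parse, toBinary, toDtd) are hypothetical; the input is assumed to be well-nested HTML without comments, unclosed void tags, or '>' inside attribute values; a plain preorder traversal stands in for the paper's modified one, whose details the abstract does not give; and the loose "( ... )*" content models are only one guess at what a generated DTD might look like. Only the use of java.io.StreamTokenizer comes from the text.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.*;

class ChainDtdSketch {
    /** Parse-tree node: an element name, or "#text" for character data. */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    /** Binary linked list node: left = first child, right = next sibling. */
    static final class BNode {
        final String name;
        BNode left, right;
        BNode(String name) { this.name = name; }
    }

    /** Tokenizes HTML with StreamTokenizer and grows the tree as tokens arrive. */
    static Node parse(String html) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(html));
        st.resetSyntax();
        st.wordChars(0, 255);   // every character belongs to a word...
        st.ordinaryChar('<');   // ...except the two tag delimiters
        st.ordinaryChar('>');
        Node root = new Node("#document");
        Deque<Node> open = new ArrayDeque<>();  // stack of currently open elements
        open.push(root);
        boolean inTag = false;
        try {
            for (int t = st.nextToken(); t != StreamTokenizer.TT_EOF; t = st.nextToken()) {
                if (t == '<') { inTag = true; continue; }
                if (t == '>') { inTag = false; continue; }
                if (t != StreamTokenizer.TT_WORD) continue;
                String s = st.sval.trim();
                if (s.isEmpty()) continue;                      // whitespace between tags
                if (!inTag) {
                    open.peek().children.add(new Node("#text"));
                } else if (s.startsWith("!")) {
                    continue;                                   // skip declarations such as DOCTYPE
                } else if (s.startsWith("/")) {
                    if (open.size() > 1) open.pop();            // end tag closes the current element
                } else {
                    Node n = new Node(s.split("[\\s/]+")[0].toLowerCase());
                    open.peek().children.add(n);
                    if (!s.endsWith("/")) open.push(n);         // "<br/>" style tags stay closed
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return root;
    }

    /** Converts the parse tree to left-child/right-sibling (binary linked) form. */
    static BNode toBinary(Node n) {
        BNode b = new BNode(n.name);
        BNode prev = null;
        for (Node c : n.children) {
            BNode cb = toBinary(c);
            if (prev == null) b.left = cb; else prev.right = cb;
            prev = cb;
        }
        return b;
    }

    /** Preorder walk (node, left, right) recording the child names of each element. */
    static void collect(BNode b, Map<String, Set<String>> model) {
        if (b == null) return;
        if (!b.name.startsWith("#")) {
            Set<String> kids = model.computeIfAbsent(b.name, k -> new LinkedHashSet<>());
            for (BNode c = b.left; c != null; c = c.right) kids.add(c.name);
        }
        collect(b.left, model);
        collect(b.right, model);
    }

    /** Emits one loose ELEMENT declaration per element name seen in the traversal. */
    static String toDtd(BNode root) {
        Map<String, Set<String>> model = new LinkedHashMap<>();
        collect(root, model);
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, Set<String>> e : model.entrySet()) {
            Set<String> kids = new LinkedHashSet<>(e.getValue());
            boolean text = kids.remove("#text");
            String content;
            if (kids.isEmpty()) content = text ? "(#PCDATA)" : "EMPTY";
            else content = "(" + (text ? "#PCDATA | " : "") + String.join(" | ", kids) + ")*";
            out.append("<!ELEMENT ").append(e.getKey()).append(' ').append(content).append(">\n");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String html = "<html><body><h1>Title</h1><p>One <b>bold</b> word</p><br/></body></html>";
        System.out.print(toDtd(toBinary(parse(html))));
    }
}
```

In this sketch the left pointer of each binary node holds the first child and the right pointer holds the next sibling, so a single preorder walk visits every element in document order and can read off each element's children by following the right chain from its left pointer.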