Abstract:
To annotate data extracted from Deep Web pages as accurately and completely as possible, this paper proposes an automatic semantic annotation method based on wrappers. The method combines several annotators, which addresses the low annotation rate of any single annotator. To address incomplete annotation, it exploits the complementary relationships among multiple data sources. It then generates an efficient annotation wrapper that annotates the extracted results automatically. Experimental results show that the method achieves relatively high accuracy and efficiency.
Key words:
Deep Web,
semantic annotation,
synchronous annotation,
wrapper
Chinese Library Classification (CLC) number:
杨晓琴, 鞠时光, 曹庆皇, 王秀红. 基于包装器的Deep Web自动语义标注[J]. 计算机工程, 2010, 36(12): 52-54.
YANG Xiao-Qin, JU Shi-Guang, CAO Qing-Huang, WANG Xiu-Hong. Deep Web Automatic Semantic Annotation Based on Wrapper[J]. Computer Engineering, 2010, 36(12): 52-54.