
Computer Engineering ›› 2009, Vol. 35 ›› Issue (24): 265-267. doi: 10.3969/j.issn.1000-3428.2009.24.089

• Development Research and Design Technology •

Application of Dynamic Web Page Information Extraction Technology in Seeking-job Search

FANG Hong1, LV Tai-zhi2

  1. Department of Information Engineering, Jiangsu Maritime Institute, Nanjing 211170
  2. College of Computer Science and Technology, Nanjing University of Science and Technology, Nanjing 210094
  • Online: 2009-12-20 Published: 2009-12-20

Abstract: Traditional search engines have difficulty extracting information that is generated by client-side scripts. In the course of developing a job search engine, this paper uses HtmlUnit to interpret JavaScript-driven dynamic Web pages and uses Selenium IDE to obtain the XPath of dynamic elements, so that the search engine can extract page content that is produced dynamically on the client. Experimental results show that the technique is effective.

Key words: dynamic Web page, information extraction, job seeking, search
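The XPath step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the page has already been rendered (the JavaScript execution that the paper delegates to HtmlUnit is not shown), and the markup, the XPath expression, and the names `JobXPathSketch`, `TITLE_XPATH`, and `extractText` are all hypothetical. It uses only the XPath API in the Java standard library.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class JobXPathSketch {
    // Hypothetical XPath of the kind a recorder tool might report for job titles.
    static final String TITLE_XPATH = "//div[@id='jobList']/ul/li/a";

    // Evaluates an XPath against already-rendered, well-formed markup
    // and returns the trimmed text of every matching node.
    static List<String> extractText(String renderedXhtml, String xpath) throws Exception {
        Document dom = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(renderedXhtml)));
        NodeList nodes = (NodeList) XPathFactory.newInstance()
                .newXPath()
                .evaluate(xpath, dom, XPathConstants.NODESET);
        List<String> texts = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            texts.add(nodes.item(i).getTextContent().trim());
        }
        return texts;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the DOM after client-side scripts have run (assumed markup).
        String rendered = "<html><body><div id='jobList'><ul>"
                + "<li><a href='/job/1'>Java Developer</a></li>"
                + "<li><a href='/job/2'>Test Engineer</a></li>"
                + "</ul></div></body></html>";
        System.out.println(extractText(rendered, TITLE_XPATH));
    }
}
```

In a real crawler the input would be the DOM produced after script execution rather than a literal string, and real-world HTML is often not well-formed XML, so a tolerant HTML parser would be needed in place of `DocumentBuilderFactory`.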
