
Computer Engineering ›› 2009, Vol. 35 ›› Issue (24): 265-267. doi: 10.3969/j.issn.1000-3428.2009.24.089

• Developmental Research •

Application of Dynamic Web Page Information Extraction Technology in Seeking-job Search

FANG Hong1, LV Tai-zhi2   

  1. Department of Information Engineering, Jiangsu Maritime Institute, Nanjing 211170; 2. College of Computer Science and Technology, Nanjing University of Science and Technology, Nanjing 210094
  • Received:1900-01-01 Revised:1900-01-01 Online:2009-12-20 Published:2009-12-20


Abstract: Client-side scripts are now widely used in Web pages, and traditional search engines have difficulty extracting the information those scripts generate. To address this problem in the development of a job search engine, this paper uses HtmlUnit to interpret JavaScript-driven dynamic Web pages and Selenium IDE to obtain the XPath of dynamic elements, so that the engine can extract dynamically generated page information. Experimental results show that the technique is effective.

Key words: dynamic Web page, information extraction, job seeking, search
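
The XPath extraction step named in the abstract can be illustrated with a minimal sketch. It assumes the page's scripts have already been executed (the paper uses HtmlUnit for that step) and that the resulting DOM has been serialized as well-formed XHTML. The markup, the class names, and the `extract_jobs` helper are hypothetical, and Python's standard `xml.etree.ElementTree` module, which supports only a subset of XPath, stands in for the paper's Java toolchain.

```python
import xml.etree.ElementTree as ET

# Hypothetical snapshot of a job listing page AFTER client-side scripts
# have run. In the paper this rendering step is done with HtmlUnit; here
# the rendered markup is simply assumed to be available as XHTML.
RENDERED_PAGE = """
<html>
  <body>
    <div id="results">
      <div class="job">
        <span class="title">Software Engineer</span>
        <span class="company">Example Corp</span>
      </div>
      <div class="job">
        <span class="title">Test Engineer</span>
        <span class="company">Sample Ltd</span>
      </div>
    </div>
  </body>
</html>
"""


def extract_jobs(xhtml):
    """Return one dict per job entry located by XPath-style expressions."""
    root = ET.fromstring(xhtml.strip())
    jobs = []
    # Select every job container inside the (assumed) results element.
    for node in root.findall(".//div[@id='results']/div[@class='job']"):
        jobs.append({
            "title": node.findtext("span[@class='title']"),
            "company": node.findtext("span[@class='company']"),
        })
    return jobs


if __name__ == "__main__":
    for job in extract_jobs(RENDERED_PAGE):
        print(job["title"], "-", job["company"])
```

In practice the XPath expressions would be the ones recorded for the dynamic elements of each target site (the abstract mentions Selenium IDE for this), and real pages would usually need an HTML-tolerant parser rather than a strict XML one.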

