[1]蒲梅,周枫,周晶晶,等.基于加权TextRank的新闻关键事件主题句提取[J].计算机工程,2017,34(8):219-224.
[2]ALLAN J,PAPKA R,LAVRENKO V.On-line new event detection and tracking[C]//Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval.New York,USA:ACM Press,1998:37-45.
[3]吴共庆,胡骏,李莉,等.基于标签路径特征融合的在线Web新闻内容抽取[J].软件学报,2016,27(3):714-735.
[4]REIS D C,GOLGHER P B,SILVA A S,et al.Automatic Web news extraction using tree edit distance[C]//Proceedings of the 13th International Conference on World Wide Web.New York,USA:ACM Press,2004:502-511.
[5]FANG Y,XIE X,ZHANG X,et al.STEM:a suffix tree-based method for Web data records extraction[J].Knowledge and Information Systems,2017,55(2):305-331.
[6]GULHANE P,MADAAN A,MEHTA R,et al.Web-scale information extraction with vertex[C]//Proceedings of the 27th International Conference on Data Engineering.Washington D.C.,USA:IEEE Press,2011:1209-1220.
[7]BING L,WONG T L,LAM W.Unsupervised extraction of popular product attributes from E-commerce Web sites by considering customer reviews[J].ACM Transactions on Internet Technology,2016,16(2):12-15.
[8]CHARRON B,HIRATE Y,PURCELL D,et al.Extracting semantic information for E-commerce[C]//Proceedings of International Semantic Web Conference.Berlin,Germany:Springer,2016:273-290.
[9]GALI N,MARIESCU-ISTODOR R,FRNTI P.Using linguistic features to automatically extract Web page title[J].Expert Systems with Applications,2017,79:296-312.
[10]ADELBERG B.NoDoSE——a tool for semi-automatically extracting structured and semistructured data from text documents[J].ACM SIGMOD Record,1998,27(2):283-294.
[11]HAMMER J,GARCIA-MOLINA H,NESTOROV S,et al.Template-based wrappers in the TSIMMIS system[J].ACM SIGMOD Record,1997,26(2):532-535.
[12]李效东,顾毓清.基于DOM的Web信息提取[J].计算机学报,2002,25(5):526-533.
[13]KUSHMERICK N,WELD D S,DOORENBOS R B.Wrapper induction for information extraction[C]//Proceedings of International Joint Conference on Artificial Intelligence.New York,USA:ACM Press,1997:729-737.
[14]CAI D,YU S,WEN J R,et al.VIPS:a vision-based page segmentation algorithm[EB/OL].[2017-12-11].https://link.springer.com/content/pdf/10.1007/978-3-319-04244-2_22.pdf.
[15]SONG R,LIU H,WEN J R,et al.Learning block importance models for Web pages[C]//Proceedings of the 13th International Conference on World Wide Web.New York,USA:ACM Press,2004:203-211.
[16]WENINGER T,HSU W H,HAN J.CETR:content extraction via tag ratios[C]//Proceedings of the 19th International Conference on World Wide Web.New York,USA:ACM Press,2010:971-980.
[17]WU G,LI L,HU X,et al.Web news extraction via path ratios[C]//Proceedings of the 22nd ACM International Conference on Information and Knowledge Management.New York,USA:ACM Press,2013:2059-2068. |