[1] ESKENAZI S,GOMEZ-KRÄMER P,OGIER J M.A comprehensive survey of mostly textual document segmentation algorithms since 2008[J].Pattern Recognition,2016,64:1-14. [2] TAO Xin,TANG Zhi,XU Canhui.Contextual modeling for logical labeling of PDF documents[J].Computers and Electrical Engineering,2014,40(4):1363-1375. [3] TAO Xin,TANG Zhi,XU Canhui,et al.Logical labeling of fixed layout PDF documents using multiple contexts[C]//Proceedings of IAPR International Workshop on Document Analysis Systems.Washington D.C.,USA:IEEE Press,2014:360-364. [4] DONG Yongquan,LI Qingzhong,DING Yanhui,et al.Constrained conditional random fields for semantic annotation of Web data[J].Journal of Computer Research and Development,2012,49(2):361-371.(in Chinese) 董永权,李庆忠,丁艳辉,等.基于约束条件随机场的Web数据语义标注[J].计算机研究与发展,2012,49(2):361-371. [5] RAHMAN M M,FININ T.Understanding the logical and semantic structure of large documents[EB/OL].[2019-01-01].https://arxiv.org/pdf/1709.00770.pdf. [6] OYEDOTUN O K,KHASHMAN A.Document segmentation using textural features summarization and feedforward neural network[M].[S.l.]:Kluwer Academic Publishers,2016. [7] QIN Jiangmin,LIN Ping,WANG Rong,et al.Application of software typesetting in Founder FX 2011[J].Chinese Journal of Science and Technology Research,2012(4):109-111.(in Chinese) 秦江敏,林平,王荣,等.利用方正飞翔2011软件排版的实践[J].中国科技期刊研究,2012(4):109-111. [8] FENG Shaorong,PAN Weiwei,LIN Ziyu.XML documents clustering based on improved k-medoids algorithm[J].Computer Engineering,2015,41(9):56-62.(in Chinese) 冯少荣,潘炜炜,林子雨.基于改进k-medoids算法的XML文档聚类[J].计算机工程,2015,41(9):56-62. [9] CHEN Luyao,ZENG Guosun,WANG Wei.Extraction and logic description for structure trust pattern of information documents[J].Application Research of Computers,2010,27(12):4624-4629.(in Chinese) 陈路瑶,曾国荪,王伟.信息文档结构信任模式的提取及逻辑描述[J].计算机应用研究,2010,27(12):4624-4629. [10] LI Juan.Research on document typesetting format checking method based on template[D].Beijing:Beijing Information Science and Technology University,2012.(in Chinese) 李娟.基于模板的文档排版格式检查方法研究[D].北京:北京信息科技大学,2012. [11] SONG Haosu,LI Ning,ZHANG Wei.Application of VSM model to document structure recognition[J].Journal of Beijing Information Science and Technology University(Natural Science),2011,26(6):66-69,75.(in Chinese) 宋昊苏,李宁,张伟.VSM模型在文档结构识别中的应用[J].北京信息科技大学学报(自然科学版),2011,26(6):66-69,75. [12] PENG Xin.Research on document typesetting format inspection method based on format index and graph[D].Beijing:Beijing Information Science and Technology University,2015.(in Chinese) 彭欣.基于格式索引和图的文档排版格式检查方法研究[D].北京:北京信息科技大学,2015. [13] IORIO A D,PERONI S,POGGI F,et al.Recognising document components in XML-based academic articles[C]//Proceedings of ACM Symposium on Document Engineering.New York,USA:ACM Press,2013:181-184. [14] KIM T,KIM S,CHOI S,et al.A machine-learning based approach for extracting logical structure of a styled document[J].KSII Transactions on Internet and Information Systems,2017(11):1043-1056. [15] LEI Yang,TIAN Ying'ai,LI Ning,et al.Document structure identification method based on conditional random field[C]//Proceedings of International Conference on Mechatronics,Control and Materials.[S.l.]:Atlantis Press,2016:1-6. [16] SUNDERMEYER M,NEY H.From feed forward to recurrent LSTM neural networks for language modeling[J].IEEE/ACM Transactions on Audio Speech and Language Processing,2015,23(3):517-529. [17] SHAO Y,HARDMEIER C,TIEDEMANN J,et al.Character-based joint segmentation and POS tagging for Chinese using bidirectional RNN-CRF[EB/OL].[2018-12-23].https://arxiv.org/pdf/1704.01314.pdf. [18] CHEN Bin,ZHOU Yong,LIU Bing.Event trigger word extraction based on convolutional bidirectional long short term memory network[J].Computer Engineering,2019,45(1):153-158.(in Chinese)陈斌,周勇,刘兵.基于卷积双向长短期记忆网络的事件触发词抽取[J].计算机工程,2019,45(1):153-158. [19] SUTSKEVER I,VINYALS O,LE Q V.Sequence to sequence learning with neural networks[EB/OL].[2018-12-23].https://arxiv.org/pdf/1409.3215.pdf. [20] LE Q V,MIKOLOV T.Distributed representations of sentences and documents[EB/OL].[2018-12-23].https://arxiv.org/pdf/1405.4053.pdf. |