[1] Research group of statistics and analysis on Chinese scientific papers.A brief report of statistics and analysis on Chinese scientific papers in 2016[J].Chinese Journal of Scientific and Technical Periodicals,2018,29(1):59-68.(in Chinese)中国科技论文统计与分析课题组.2016年中国科技论文统计与分析简报[J].中国科技期刊研究,2018,29(1):59-68. [2] HOFMANN T.Probabilistic latent semantic analysis[C]//Proceedings of the 15th Conference on Uncertainty in Artificial Intelligence.San Mateo,USA:Morgan Kaufmann Publishers Inc.,1999:289-296. [3] BLEI D M,NG A Y,JORDAN M I.Latent Dirichlet allocation[J].Journal of Machine Learning Research,2003,3:993-1022. [4] LU Wanhui,TAN Zongying.Measuring novelty of scholarly articles[J].Data Analysis and Knowledge Discovery,2018,2(3):22-29.(in Chinese)逯万辉,谭宗颖.学术成果主题新颖性测度方法研究——基于Doc2Vec和HMM算法[J].数据分析与知识发现,2018,2(3):22-29. [5] WANG Tingting,HAN Man,WANG Yu.Optimizing LDA model with various topic numbers:case study of scientific literature[J].Data Analysis and Knowledge Discovery,2018,2(1):29-39.(in Chinese)王婷婷,韩满,王宇.LDA模型的优化及其主题数量选择研究——以科技文献为例[J].数据分析与知识发现,2018,2(1):29-39. [6] LI Ran,LIN Hong.Academic paper recommendation based community detection citation-collaboration networks[J].Application Research of Computers,2019,36(9):2675-2678.(in Chinese)李冉,林泓.基于频繁主题集偏好的学术论文推荐算法[J].计算机应用研究,2019,36(9):2675-2678. [7] ZHANG Chenyi,SUN Jianling,DING Yiqun.Topic mining for microblog based on MB-LDA model[J].Journal of Computer Research and Development,2011,48(10):1795-1802.(in Chinese)张晨逸,孙建伶,丁轶群.基于MB-LDA模型的微博主题挖掘[J].计算机研究与发展,2011,48(10):1795-1802. [8] WANG Shuyi,LIAO Huatao,WU Chake.Mining news on competitors with sentiment classification[J].Data Analysis and Knowledge Discovery,2018,2(3):70-78.(in Chinese)王树义,廖桦涛,吴查科.基于情感分类的竞争企业新闻文本主题挖掘[J].数据分析与知识发现,2018,2(3):70-78. [9] MAVRIDIS I,KARATZA E.Performance evaluation of cloud-based log file analysis with Apache Hadoop and Apache Spark[J].Journal of Systems and Software,2016,125:131-151. [10] MENG X,BRADLEY J,YAVUZ B,et al.MLlib:machine learning in Apache Spark[J].Journal of Machine Learning Research,2015,17(1):1235-1241. [11] ZHOU Lina,ZHANG Dongsong.NLPIR:a theoretical framework for applying natural language processing to information retrieval[J].Journal of the American Society for Information Science and Technology,2003,54(2):115-123. [12] TAGHVA K,BECKLEY R,SADEH M.A list of farsi stopwords[EB/OL].[2018-09-11].https://www.researchgate.net/publication/228427943_A_list_of_farsi_stopwords. [13] ERK K.A structured vector space model for word meaning in context[C]//Proceedings of Conference on Empirical Methods in Natural Language Processing.Philadelphia,USA:Association for Computational Linguistics,2008:897-906. [14] TERENIN A,SIMPSON D,DRAPER D.Asynchronous Gibbs sampling[J].Statistics,2015,3:760-762. [15] ZHANG W,YOSHIDA T,TANG X.TFIDF,LSI and multi-word in information retrieval and text categorization[C]//Proceedings of IEEE International Conference on Systems,Man and Cybernetics.Washington D.C.,USA:IEEE Press,2009:108-113. [16] ZHANG C.Research on enhancing the effectiveness of the Chinese text automatic categorization based on ICTCLAS segmentation method[C]//Proceedings of IEEE International Conference on Software Engineering and Service Science.Washington D.C.,USA:IEEE Press,2013:267-270. |