1 |
任洁. 自然语言与自然语言理解及其应用. 科教文汇, 2006,(3): 69- 70.
URL
|
|
REN J. Natural language and natural language understanding and its application. Journal of Science and Education, 2006,(3): 69- 70.
URL
|
2 |
BELKIN N J, CROFT W B. Information filtering and information retrieval. Communications of the ACM, 1992, 35(12): 29- 38.
doi: 10.1145/138859.138861
|
3 |
王春柳, 杨永辉, 邓霏, 等. 文本相似度计算方法研究综述. 情报科学, 2019, 37(3): 158- 168.
URL
|
|
WANG C L, YANG Y H, DENG F, et al. A review of text similarity approaches. Information Science, 2019, 37(3): 158- 168.
URL
|
4 |
金博, 史彦军, 滕弘飞. 基于语义理解的文本相似度算法. 大连理工大学学报, 2005, 45(2): 291- 297.
doi: 10.3321/j.issn:1000-8608.2005.02.028
|
|
JIN B, SHI Y J, TENG H F. Text similarity algorithm based on semantic understanding. Journal of Dalian University of Technology, 2005, 45(2): 291- 297.
doi: 10.3321/j.issn:1000-8608.2005.02.028
|
5 |
DING P, LIU D, ZHANG Z, et al. A novel discrimination structure for assessing text semantic similarity. Journal of Internet Technology, 2022, 23(4): 709- 717.
doi: 10.53106/160792642022072304006
|
6 |
LEVENSHTEIN V. Binary codes capable of correcting deletions, insertions, and reversals. Soviet Physics Doklady, 1965, 10, 707- 710.
|
7 |
SAMPATH A, SHANMUGAVEL V. Hybrid Tamil spell checker with combined character splitting. Concurrency and Computation: Practice and Experience, 2023, 35(1): e7440.
doi: 10.1002/cpe.7440
|
8 |
BENT T, HOLT R F, VAN ENGEN K J, et al. How pronunciation distance impacts word recognition in children and adults. The Journal of the Acoustical Society of America, 2021, 150(6): 4103- 4117.
doi: 10.1121/10.0008930
|
9 |
LIKHITHA C P, NINITHA P, KANCHANA V, et al. DNA Bar-coding: a novel approach for identifying an individual using extended Levenshtein distance algorithm and STR analysis. International Journal of Electrical and Computer Engineering, 2016, 6(3): 1133.
doi: 10.11591/ijece.v6i3.10086
|
10 |
ARNOLD M, OHLEBUSCH E. Linear time algorithms for generalizations of the longest common substring problem. Algorithmica, 2011, 60(4): 806- 818.
doi: 10.1007/s00453-009-9369-1
|
11 |
张毅超, 车玫, 马骏. 求最长公共子串问题的算法分析. 计算机仿真, 2007, 24(12): 97-100, 116.
doi: 10.3969/j.issn.1006-9348.2007.12.026
|
|
ZHANG Y C, CHE M, MA J. Analysis of the longest common substring algorithm. Computer Simulation, 2007, 24(12): 97-100, 116.
doi: 10.3969/j.issn.1006-9348.2007.12.026
|
12 |
周荫清. 信息理论基础. 北京: 北京航空航天大学出版社, 1993.
|
|
ZHOU Y Q. Fundamentals of information theory. Beijing: Beijing University of Aeronautics & Astronautics Press, 1993.
|
13 |
JACCARD P. The distribution of the flora in the alpine zone. New Phytologist, 1912, 11(2): 37- 50.
doi: 10.1111/j.1469-8137.1912.tb05611.x
|
14 |
林颖. 文本相似度计算方法的研究及改进[D]. 乌鲁木齐: 新疆大学, 2021.
|
|
LIN Y. Research and improvement of text similarity calculation method[D]. Urumqi: Xinjiang University, 2021. (in Chinese)
|
15 |
田星, 郑瑾, 张祖平. 基于词向量的Jaccard相似度算法. 计算机科学, 2018, 45(7): 186- 189.
URL
|
|
TIAN X, ZHENG J, ZHANG Z P. Jaccard text similarity algorithm based on word embedding. Computer Science, 2018, 45(7): 186- 189.
URL
|
16 |
MIKOLOV T, CHEN K, CORRADO G, et al. Efficient estimation of word representations in vector space[EB/OL]. [2023-06-26]. http://arxiv.org/abs/1301.3781.
|
17 |
SUEN C Y. N-gram statistics for natural language understanding and text processing. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1979, 1(2): 164- 172.
doi: 10.1109/TPAMI.1979.4766902
|
18 |
TURNEY P D, PANTEL P. From frequency to meaning: vector space models of semantics. Journal of Artificial Intelligence Research, 2010, 37, 141- 188.
doi: 10.1613/jair.2934
|
19 |
ROBERTSON S E, WALKER S. Some simple effective approximations to the 2-Poisson model for probabilistic weighted retrieval[M]. London, UK: Springer, 1994: 232-241.
|
20 |
|
21 |
李伊仝, 王红斌, 程良. 融入新闻标题信息的新闻文本与评论的语义相似度计算方法. 吉林大学学报(理学版), 2022, 60(6): 1399- 1406.
URL
|
|
LI Y T, WANG H B, CHENG L. Semantic similarity calculation method of news text and comment integrated with news title information. Journal of Jilin University(Science Edition), 2022, 60(6): 1399- 1406.
URL
|
22 |
BLEI D M, NG A Y, JORDAN M I. Latent Dirichlet allocation. Journal of Machine Learning Research, 2003, 3, 993- 1022.
URL
|
23 |
WANG J, XU W, YAN W, et al. Text similarity calculation method based on hybrid model of LDA and TF-IDF[P]. Computer Science and Artificial Intelligence, 2019.
|
24 |
DEERWESTER S, DUMAIS S T, FURNAS G W, et al. Indexing by latent semantic analysis. Journal of the American Society for Information Science, 1990, 41(6): 391- 407.
doi: 10.1002/(SICI)1097-4571(199009)41:6<391::AID-ASI1>3.0.CO;2-9
|
25 |
KONTOSTATHIS A, POTTENGER W M. A framework for understanding Latent Semantic Indexing (LSI) performance. Information Processing & Management, 2006, 42(1): 56- 73.
doi: 10.1016/j.ipm.2004.11.007
|
26 |
SCHWARZ C. Lsemantica: a command for text similarity based on latent semantic analysis. The Stata Journal, 2019, 19(1): 129- 142.
doi: 10.1177/1536867X19830910
|
27 |
LANDAUER T K, DUMAIS S T. A solution to Plato's problem: the latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychological Review, 1997, 104(2): 211- 240.
doi: 10.1037/0033-295X.104.2.211
|
28 |
LANDAUER T K, FOLTZ P W, LAHAM D. An introduction to latent semantic analysis. Discourse Processes, 1998, 25(2/3): 259- 284.
|
29 |
GROSSMAN D A, FRIEDER O. Information retrieval: algorithms and heuristics[M]. Berlin, Germany: Springer, 2012.
|
30 |
|
31 |
WITSCHARD D, JUSUFI I, MARTINS R M, et al. Interactive optimization of embedding-based text similarity calculations. Information Visualization, 2022, 21(4): 335- 353.
doi: 10.1177/14738716221114372
|
32 |
李舟军, 范宇, 吴贤杰. 面向自然语言处理的预训练技术研究综述. 计算机科学, 2020, 47(3): 162- 173.
URL
|
|
LI Z J, FAN Y, WU X J. Survey of natural language processing pre-training techniques. Computer Science, 2020, 47(3): 162- 173.
URL
|
33 |
LI M T, SHEN X F, SUN Y Y, et al. Using semantic text similarity calculation for question matching in a rheumatoid arthritis question-answering system. Quantitative Imaging in Medicine and Surgery, 2023, 13(4): 2183- 2196.
doi: 10.21037/qims-22-749
|
34 |
VIJI D, REVATHY S. A hybrid approach of weighted fine-tuned BERT extraction with deep Siamese Bi-LSTM model for semantic text similarity identification. Multimedia Tools and Applications, 2022, 81(5): 6131- 6157.
doi: 10.1007/s11042-021-11771-6
|
35 |
QIU S J, NIU Y, LI J, et al. Research on semantic similarity of short text based on BERT and time warping distance. Journal of Web Engineering, 2021, 20(8): 2521- 2544.
URL
|
36 |
NGUYEN H T, DUONG P H, CAMBRIA E. Learning short-text semantic similarity with word embeddings and external knowledge sources. Knowledge-Based Systems, 2019, 182, 104842.
doi: 10.1016/j.knosys.2019.07.013
|
37 |
HOCHREITER S, SCHMIDHUBER J. Long short-term memory. Neural Computation, 1997, 9(8): 1735- 1780.
doi: 10.1162/neco.1997.9.8.1735
|
38 |
GRAVES A, JAITLY N, MOHAMED A R. Hybrid speech recognition with deep bidirectional LSTM[C]//Proceedings of the IEEE Workshop on Automatic Speech Recognition and Understanding. Washington D. C., USA: IEEE Press, 2013: 273-278.
|
39 |
杨飞. 基于LSTM的文本相似度识别方法研究[D]. 长春: 吉林大学, 2018.
|
|
YANG F. Research on text similarity recognition based on LSTM[D]. Changchun: Jilin University, 2018. (in Chinese)
|
40 |
ZHAO W D, LIU X T, JING J, et al. Re-LSTM: a long short-term memory network text similarity algorithm based on weighted word embedding. Connection Science, 2022, 34(1): 2652- 2670.
doi: 10.1080/09540091.2022.2140122
|
41 |
伍树书. 基于BiLSTM和注意力机制的短文本相似度算法研究[D]. 武汉: 武汉科技大学, 2021.
|
|
WU S S. Research on short text similarity algorithm based on BiLSTM and attention mechanism[D]. Wuhan: Wuhan University of Science and Technology, 2021. (in Chinese)
|
42 |
|
43 |
WU X, GAO C C, ZANG L J, et al. ESimCSE: enhanced sample building method for contrastive learning of unsupervised sentence embedding[EB/OL]. [2023-06-26]. http://arxiv.org/abs/2109.04380.
|
44 |
HUANG P, HE X, GAO J, et al. Learning deep structured semantic models for Web search using clickthrough data[C]// Proceedings of the 22nd ACM International Conference on Information & Knowledge Management. New York, USA: ACM Press, 2013: 2333-2338.
|
45 |
SHEN Y, HE X, GAO J, et al. A latent semantic model with convolutional-pooling structure for information retrieval[C]// Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management. New York, USA: ACM Press, 2014: 101-110.
|
46 |
PALANGI H, DENG L, SHEN Y, et al. Semantic modelling with long-short-term memory for information retrieval[EB/OL]. [2023-06-26]. https://arxiv.org/abs/1412.6629.
|
47 |
孟金旭, 单鸿涛, 万俊杰, 等. BSLA: 改进Siamese-LSTM的文本相似模型. 计算机工程与应用, 2022, 58(23): 178- 185.
doi: 10.3778/j.issn.1002-8331.2105-0220
|
|
MENG J X, SHAN H T, WAN J J, et al. BSLA: improved text similarity model for Siamese-LSTM. Computer Engineering and Applications, 2022, 58(23): 178- 185.
doi: 10.3778/j.issn.1002-8331.2105-0220
|
48 |
|
49 |
KIM S, KANG I, KWAK N. Semantic sentence matching with densely-connected recurrent and co-attentive information[C]// Proceedings of the 33rd AAAI Conference on Artificial Intelligence and 31st Innovative Applications of Artificial Intelligence Conference and 9th AAAI Symposium on Educational Advances in Artificial Intelligence. Palo Alto, USA: AAAI Press, 2019: 6586-6593.
|
50 |
李凡. 基于神经网络的文本相似度匹配算法研究[D]. 太原: 太原科技大学, 2020.
|
|
LI F. Research on text similarity matching algorithm based on neural network[D]. Taiyuan: Taiyuan University of Science and Technology, 2020. (in Chinese)
|
51 |
JING H, KEYU M. SRU-based multi-angle enhanced network for semantic text similarity calculation of big data language model. International Journal of Information Technologies and Systems Approach, 2023, 16(2): 1- 20.
URL
|
52 |
WANG Z, HAMZA W, FLORIAN R. Bilateral multi-perspective matching for natural language sentences[C]//Proceedings of the 26th International Joint Conference on Artificial Intelligence. New York, USA: ACM Press, 2017: 4144-4150.
|
53 |
LUO J G, XIONG W P, DU J Q, et al. Traditional Chinese medicine text similarity calculation model based on the bidirectional temporal Siamese network. Evidence-Based Complementary and Alternative Medicine, 2021, 28, 2337924.
doi: 10.1155/2021/2337924
|
54 |
WANG Z G, ZHANG B. Chinese text similarity calculation model based on multi-attention Siamese Bi-LSTM[C]//Proceedings of the 4th International Conference on Computer Science and Software Engineering. New York, USA: ACM Press, 2021: 93-98.
|
55 |
YAO L, PAN Z Y, NING H S. Unlabeled short text similarity with LSTM encoder. IEEE Access, 2018, 7, 3430- 3437.
|
56 |
梅家驹, 竺一鸣, 高蕴琦, 等. 同义词词林. 上海: 上海辞书出版社, 1983.
|
|
MEI J J, ZHU Y M, GAO Y Q, et al. Thesaurus of synonyms. Shanghai: Shanghai Lexicographical Publishing House, 1983.
|
57 |
DONG Z D, DONG Q. HowNet and the computation of meaning[M]. Hackensack, USA: World Scientific, 2006.
|
58 |
陈丹华, 王艳娜, 周子力, 等. 基于Word2Vec的WordNet词语相似度计算研究. 计算机工程与应用, 2022, 58(3): 222- 229.
doi: 10.3778/j.issn.1002-8331.2009-0090
|
|
CHEN D H, WANG Y N, ZHOU Z L, et al. Research on WordNet word similarity calculation based on Word2Vec. Computer Engineering and Applications, 2022, 58(3): 222- 229.
doi: 10.3778/j.issn.1002-8331.2009-0090
|
59 |
CHEN X J, JIA S B, XIANG Y. A review: knowledge reasoning over knowledge graph. Expert Systems with Applications, 2020, 141, 112948.
doi: 10.1016/j.eswa.2019.112948
|
60 |
NIU X Y, ZHENG W G, XIAO Y Y, et al. Short text similarity computation method based on feature expansion and Siamese network[C]//Proceedings of the 4th International Conference on Data Science and Information Technology. New York, USA: ACM Press, 2021: 279-283.
|
61 |
HUANG P S, CHIU P S, CHANG J W, et al. A study of using syntactic cues in short-text similarity measure. Journal of Internet Technology, 2019, 20, 839- 850.
doi: 10.3966/160792642019052003017
|
62 |
HAN M T, ZHANG X, YUAN X, et al. A survey on the techniques, applications, and performance of short text semantic similarity. Concurrency and Computation: Practice and Experience, 2021, 33(5): e5971.
doi: 10.1002/cpe.5971
|
63 |
郭振鹏. 基于中文分词与文本相似度的主观题评分系统研究与实现[D]. 太原: 太原理工大学, 2021.
|
|
GUO Z P. Research and implementation of subjective question scoring system based on Chinese word segmentation and text similarity[D]. Taiyuan: Taiyuan University of Technology, 2021. (in Chinese)
|
64 |
NGUYEN M H, TRAN D Q. Estimation in semantic similarity of texts. Journal of Information Science and Engineering, 2021, 37, 617- 633.
|
65 |
谷重阳, 徐浩煜, 周晗, 等. 基于词汇语义信息的文本相似度计算. 计算机应用研究, 2018, 35(2): 391- 395.
URL
|
|
GU C Y, XU H Y, ZHOU H, et al. Text similarity computing based on lexical semantic information. Application Research of Computers, 2018, 35(2): 391- 395.
URL
|
66 |
INAN E. SimiT: a text similarity method using lexicon and dependency representations. New Generation Computing, 2020, 38(3): 509- 530.
|
67 |
FAROUK M. Measuring text similarity based on structure and word embedding. Cognitive Systems Research, 2020, 63, 1- 10.
|
68 |
LI M Y, BI X H, WANG L M, et al. Text similarity measurement method and application of online medical community based on density peak clustering. Journal of Organizational and End User Computing, 2022, 34(2): 1- 25.
|
69 |
LI J Y, ZHANG X J, ZHOU X B. ALBERT-based self-ensemble model with semisupervised learning and data augmentation for clinical semantic textual similarity calculation: algorithm validation study. JMIR Medical Informatics, 2021, 9(1): e23086.
|
70 |
WANG Y. Similarity detection of English text and teaching evaluation based on improved TCUSS clustering algorithm. Journal of Intelligent & Fuzzy Systems, 2021, 40(4): 7555- 7565.
|
71 |
CHATTERJEE N, YADAV N. Fuzzy rough set-based sentence similarity measure and its application to text summarization. IETE Technical Review, 2019, 36(5): 517- 525.
|
72 |
CER D, DIAB M, AGIRRE E, et al. SemEval-2017 task 1: semantic textual similarity-multilingual and cross-lingual focused evaluation[EB/OL]. [2023-06-26]. https://arxiv.org/abs/1708.00055.
|
73 |
DOLAN B, QUIRK C, BROCKETT C. Unsupervised construction of large paraphrase corpora: exploiting massively parallel news sources[C]//Proceedings of the 20th International Conference on Computational Linguistics. Philadelphia, USA: Association for Computational Linguistics, 2004: 350-362.
|
74 |
ZHANG B W, SUN W W, WAN X J, et al. PKU paraphrase bank: a sentence-level paraphrase corpus for Chinese[C]// Proceedings of CCF International Conference on Natural Language Processing and Chinese Computing. Berlin, Germany: Springer, 2019: 814-826.
|
75 |
LIU X, CHEN Q, DENG C, et al. LCQMC: a large-scale Chinese question matching corpus[C]//Proceedings of International Conference on Computational Linguistics. Philadelphia, USA: Association for Computational Linguistics, 2018: 1952-1962.
|
76 |
REIMERS N, BEYER P, GUREVYCH I. Task-oriented intrinsic evaluation of semantic textual similarity[C]//Proceedings of the 26th International Conference on Computational Linguistics: Technical Papers. Philadelphia, USA: Association for Computational Linguistics, 2016: 87-96.
|
77 |
WEI W, XIA X, WOZNIAK M, et al. Multi-sink distributed power control algorithm for cyber-physical-systems in coal mine tunnels. Computer Networks, 2019, 161, 210- 219.
|
78 |
WEI W, SONG H B, LI W, et al. Gradient-driven parking navigation using a continuous information potential field based on wireless sensor network. Information Sciences, 2017, 408, 100- 114.
|
79 |
FAN X, SONG H B, FAN X F, et al. Imperfect information dynamic Stackelberg game based resource allocation using hidden Markov for cloud computing. IEEE Transactions on Services Computing, 2018, 11(1): 78- 89.
|