YE Junmin, XU Song, LUO Daxiong, WANG Zhifeng, CHEN Shu
Chinese real-word errors in online learning communities make the semantics of Chinese texts difficult to understand, which degrades learning and analysis tasks that rely on online learning community texts. To address this, this paper proposes a real-word error detection and repair method for short texts in online learning communities. First, the confusion word set and a fixed collocation knowledge base for each confusion word are constructed automatically. Then, for each confusion word, an n-gram score, a context score, and a fixed collocation score are calculated from an n-gram statistical language model, a context model, and the fixed collocation knowledge base, respectively. Finally, the weighted sum of these scores serves as the basis for judging whether the original text is wrong, and the confusion word with the highest score is given as the repair suggestion. Experimental results show that the method can effectively detect and repair Chinese real-word errors in the learning community, achieving a recall of 85.6%, a precision of 86.3%, and a correction rate of 92.9%.
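The final scoring step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Scores`, `weighted_score`, and `detect_and_repair`, the weight values, the example scores, and the `margin` parameter are all hypothetical. The abstract does not state the exact decision rule, so the sketch assumes that the original word is flagged as an error when another candidate in its confusion set has a higher weighted sum.

```python
from dataclasses import dataclass


@dataclass
class Scores:
    """Per-candidate scores; in the paper these come from the three models."""
    ngram: float        # from the n-gram statistical model
    context: float      # from the context model
    collocation: float  # from the fixed collocation knowledge base


def weighted_score(s, weights):
    """Weighted sum of the three scores (weights are assumed, not from the paper)."""
    w_ngram, w_context, w_colloc = weights
    return w_ngram * s.ngram + w_context * s.context + w_colloc * s.collocation


def detect_and_repair(original, candidates, weights=(0.4, 0.4, 0.2), margin=0.0):
    """Return (is_error, suggestion) for one word position.

    `candidates` maps every word in the confusion set, including the
    original word itself, to its Scores. Assumed rule: report an error
    only if the best candidate beats the original by more than `margin`.
    """
    totals = {word: weighted_score(s, weights) for word, s in candidates.items()}
    best = max(totals, key=totals.get)
    is_error = totals[best] - totals[original] > margin
    return is_error, (best if is_error else original)


# Hypothetical scores for the confusion pair 在 / 再 at one position.
example = {
    "在": Scores(ngram=0.2, context=0.3, collocation=0.0),
    "再": Scores(ngram=0.6, context=0.5, collocation=1.0),
}
print(detect_and_repair("在", example))
```

A nonzero `margin` would make the detector more conservative, trading recall for precision. Whether the original method uses such a threshold is not stated in the abstract.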