Abstract: To address Chinese short-text classification, this paper proposes, from the perspective of ensemble learning, a written grain identification method based on a Multiple Probabilistic Reasoning Model (MPRM). The initial sample set is divided into equal-granularity subsets that are allowed to overlap, which yields diverse subspaces. In each subspace, a base classifier built on the Probabilistic Reasoning Model (PRM) is trained on the samples, and a probability summation method fuses the outputs of all base classifiers to give the final identification result for the training samples. Experimental results show that the method identifies network written grain effectively, with recall, precision, and F1-measure of 81.6%, 85.9%, and 83.69%, respectively.
Key words:
network written grain,
ensemble learning,
Probabilistic Reasoning Model (PRM),
sample space,
random sampling,
membership degree
CLC number:
LIU San-ya, TIE Lu, LIU Zhi, SUN Jian-wen. Chinese Written Grain Identification Based on Multiple Probabilistic Reasoning Model[J]. Computer Engineering (计算机工程).
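The pipeline summarized in the abstract (overlapping equal-sized subsets, one base classifier per subset, fusion by probability summation) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the PRM, so a small multinomial naive Bayes classifier stands in for it, and the names `make_subsets`, `NaiveBayesBase`, and `fuse_by_probability_sum` are hypothetical.

```python
import math
import random
from collections import Counter, defaultdict


def make_subsets(samples, n_subsets, subset_size, seed=0):
    """Draw equal-sized subsets; one sample may appear in several subsets."""
    rng = random.Random(seed)
    return [rng.sample(samples, subset_size) for _ in range(n_subsets)]


class NaiveBayesBase:
    """Stand-in base classifier (an assumption, NOT the paper's PRM)."""

    def fit(self, samples):
        # samples: list of (token_list, author_label) pairs
        self.class_counts = Counter(label for _, label in samples)
        self.token_counts = defaultdict(Counter)
        for tokens, label in samples:
            self.token_counts[label].update(tokens)
        self.vocab = {t for c in self.token_counts.values() for t in c}
        return self

    def predict_proba(self, tokens):
        """Return a normalized probability per author label."""
        total = sum(self.class_counts.values())
        log_scores = {}
        for label, count in self.class_counts.items():
            denom = sum(self.token_counts[label].values()) + len(self.vocab)
            score = math.log(count / total)
            for t in tokens:
                if t in self.vocab:  # tokens unseen in training are skipped
                    # Laplace smoothing avoids zero probabilities
                    score += math.log((self.token_counts[label][t] + 1) / denom)
            log_scores[label] = score
        top = max(log_scores.values())
        exp = {k: math.exp(v - top) for k, v in log_scores.items()}
        norm = sum(exp.values())
        return {k: v / norm for k, v in exp.items()}


def fuse_by_probability_sum(classifiers, tokens):
    """Sum per-label probabilities over all base classifiers; pick the largest."""
    totals = defaultdict(float)
    for clf in classifiers:
        for label, p in clf.predict_proba(tokens).items():
            totals[label] += p
    return max(totals, key=totals.get)
```

With toy data, each subset trains its own classifier and the fused decision is the label with the largest summed probability:

```python
data = [(["x", "y"], "a")] * 6 + [(["p", "q"], "b")] * 6
classifiers = [NaiveBayesBase().fit(s) for s in make_subsets(data, 5, 8)]
print(fuse_by_probability_sum(classifiers, ["x"]))
```

Summing probabilities rather than counting votes lets a confident base classifier outweigh several uncertain ones, which is the usual motivation for this kind of fusion.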