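Abstract: To address the problem of mining product features from Chinese online customer reviews, this paper proposes an unsupervised mining method based on the Apriori algorithm. The Apriori algorithm is used to mine the candidate feature set, an adjacency rule pruning algorithm and a minimum independent support pruning algorithm are designed, and the adjacency rule distance value and the minimum independent support are determined experimentally. Experimental results show that both pruning algorithms effectively improve the precision and recall of product feature mining.
Key words:
review mining,
association rule,
product feature,
pruning,
unstructured information,
unsupervised learning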
Abstract: This paper addresses product feature mining from Chinese online customer reviews and proposes an unsupervised mining method based on the Apriori algorithm. The method extracts the candidate feature set with the Apriori algorithm and then applies redundancy pruning and compactness pruning algorithms. The adjacent-words distance value and the p-support value are determined experimentally. Results show that the two proposed pruning algorithms effectively improve the precision and recall of the mining method.
Key words:
review mining,
association rule,
product feature,
pruning,
unstructured information,
unsupervised learning
李实, 李秋实. 中文评论中产品特征挖掘的剪枝算法研究[J]. 计算机工程, 2011, 37(23): 43-45.
LI Shi, LI Qiu-Shi. Research on Pruning Algorithm of Product Feature Mining in Chinese Review[J]. Computer Engineering, 2011, 37(23): 43-45.