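Abstract: This paper proposes a song emotion analysis method that combines Term Frequency-Inverse Document Frequency (TF-IDF) rules with multi-label classification. The music content of each song, represented by acoustic features, is classified with a multi-label k-nearest neighbor algorithm that uses the angle between vectors. TF-IDF rules are applied to the lyrics to compute lyric emotion scores, which serve as emotional features. The method then uses the lyric content to correct misclassified category labels. Tests on 396 English songs show that, compared with other methods, the approach raises classification accuracy from 69% to 74%.
Keywords:
multi-label classification,
song emotion classification,
multi-label k-nearest neighbor algorithm,
Term Frequency-Inverse Document Frequency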
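As a rough illustration of how TF-IDF weights could be turned into lyric emotion scores, the following minimal sketch may help. The abstract does not specify the scoring rule, so everything here is an assumption: the helper names `tf_idf` and `emotion_scores`, the toy lyrics, and the hypothetical emotion lexicon are invented for illustration, and the score of an emotion is simply taken to be the sum of the TF-IDF weights of the lyric words listed under it.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            # Relative term frequency times log inverse document frequency.
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def emotion_scores(doc_weights, lexicon):
    """Assumed rule: sum the TF-IDF weights of the words under each emotion."""
    return {
        emotion: sum(doc_weights.get(word, 0.0) for word in words)
        for emotion, words in lexicon.items()
    }


# Hypothetical toy data, not taken from the paper.
lyrics = [
    "love happy sun love".split(),
    "tears alone rain".split(),
    "happy dance sun rain".split(),
]
lexicon = {"happy": {"love", "happy", "dance"}, "sad": {"tears", "alone"}}

weights = tf_idf(lyrics)
scores = [emotion_scores(w, lexicon) for w in weights]
```

In this sketch each song ends up with one score per emotion category, which could then be used as a feature vector for the lyric side of the classifier.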
Abstract: This paper proposes a new method that combines the music content and the lyrics of songs for emotion classification. A multi-label k-Nearest Neighbor (kNN) algorithm based on the angle between two vectors is applied to the emotional classification of music content represented by acoustic features. Term Frequency-Inverse Document Frequency (TF-IDF) rules are applied to the lyrics, and the resulting lyric emotion scores serve as their emotional features. The combined method uses correct labels obtained from the lyrics to fix wrong labels obtained from the music content. In experiments on 396 English songs, the new method raises classification accuracy from 69% to 74%.
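To make the idea of a nearest-neighbor search driven by the angle between vectors more concrete, here is a simplified sketch. It is not the paper's algorithm: the full multi-label kNN procedure is not described in the abstract, so this version assumes a plain per-label majority vote among the k neighbors with the smallest angle to the query. The function names, feature vectors, and labels are all hypothetical.

```python
import math


def angle(u, v):
    """Angle in radians between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    # Clamp to guard against floating-point drift outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def predict_labels(query, train_x, train_y, k=3):
    """Assign every label held by more than half of the k nearest neighbors."""
    # Neighbors are ranked by angle to the query, smallest angle first.
    nearest = sorted(range(len(train_x)),
                     key=lambda i: angle(query, train_x[i]))[:k]
    votes = {}
    for i in nearest:
        for label in train_y[i]:
            votes[label] = votes.get(label, 0) + 1
    return {label for label, count in votes.items() if count > k / 2}


# Hypothetical acoustic feature vectors and label sets, for illustration only.
train_x = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9], [0.7, 0.7]]
train_y = [{"happy"}, {"happy", "excited"}, {"sad"}, {"sad", "calm"}, {"calm"}]

predicted = predict_labels([1.0, 0.15], train_x, train_y, k=3)
```

Ranking by angle ignores vector length, so two songs whose feature vectors point in the same direction count as close even when their magnitudes differ.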
Key words:
multi-label classification,
song emotion classification,
multi-label k-Nearest Neighbor (kNN) algorithm,
Term Frequency-Inverse Document Frequency (TF-IDF)
CLC number:
孙向琨, 邓伟. 结合TF-IDF的歌曲情感多标记分类[J]. 计算机工程, 2011, 37(19): 189-190,197.
SUN Xiang-Kun, DENG Wei. Multi-label Classification for Song Emotion Combined with TF-IDF[J]. Computer Engineering, 2011, 37(19): 189-190,197.