摘要: 聊天室中的聊天数据充斥着大量噪声,极大地降低了话题检测的监控效率。但聊天数据只有对话发出时间这一线索可供直接利用,因此噪声过滤是聊天室监控的一个难题。该文提出一种基于社会网络的聊天数据噪声过滤方法,通过分析聊天数据的时序关系,推断出聊天用户间的社会网络关系,根据社会网络蕴含的用户交流特点判断并过滤出噪声。实验证实了该方法能较准确地过滤出噪声,提高话题识别的准确率。
关键词: 社会网络; 聊天数据; 启发性规则; 噪声
Abstract: Noise in chat-room data greatly reduces the efficiency of topic-detection monitoring. Because the only clue directly available in chat data is the time at which each utterance was sent, noise filtering is a difficult problem in chat-room monitoring. This paper proposes a social-network-based method for filtering noise from chat data: it analyzes the temporal relations among utterances to infer the social network among chat users, and then identifies and filters out noise according to the communication characteristics that the network reveals. Experiments show that the method filters noise fairly accurately and improves the accuracy of topic identification.
Key words: social network; chat data; heuristic rules; noise
中图分类号:
高 鹏;曹先彬. 基于社会网络的聊天数据噪声过滤[J]. 计算机工程, 2008, 34(5): 166-168.
GAO Peng; CAO Xian-bin. Noise Filtering in Chat Data Based on Social Network[J]. Computer Engineering, 2008, 34(5): 166-168.