
Computer Engineering ›› 2012, Vol. 38 ›› Issue (23): 277-280. doi: 10.3969/j.issn.1000-3428.2012.23.069

• Development Research and Design Technology •

Automatic Classification Approach of Express Address Based on Probability Statistical Model

SHAO Yan 1,2,3, LIU Yan-bing 2,3, TAN Jian-long 2,3, GUO Li 2,3

  (1. School of Computer, Beijing University of Posts and Telecommunications, Beijing 100876, China; 2. Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093, China; 3. National Engineering Laboratory for Information Security Technologies, Beijing 100093, China)
  • Received: 2012-03-12 Revised: 2012-05-18 Online: 2012-12-05 Published: 2012-12-03
  • About the authors: SHAO Yan (born 1987), female, master's student, research interests: information security and regular expression matching; LIU Yan-bing, assistant researcher; TAN Jian-long, researcher; GUO Li, professor-level senior engineer
  • Supported by:

    General Program of the National Natural Science Foundation of China (61070026); National "863" Program of China (2011AA010705); National "242" Information Security Program of China ((242)2010A029)

Abstract: When express parcels are sorted at a distribution center for dispatch to delivery terminals, the terminal that each delivery address belongs to is usually determined manually. To improve the automation and speed of sorting, this paper proposes an automatic classification approach for express addresses based on a probability statistical classification model. The model counts the probability distribution between minimum address elements and delivery terminals, and uses it to decide which terminal an address belongs to. Experiments on address classification data provided by an express company show that the approach reaches a classification accuracy above 99%, with a classification time of 0.43 ms per address.

Key words: express address, automatic classification, express sorting, probability statistics, Chinese address segmentation, stop-character filtering
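
The idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes addresses have already been segmented into address elements with stop characters removed, and it assumes a simple decision rule (take the finest element seen in training and pick its most frequent terminal). The function names `train` and `classify`, the sample addresses, and the terminal labels `T1` and `T2` are all hypothetical.

```python
from collections import Counter, defaultdict


def train(samples):
    """Count how often each address element co-occurs with each terminal.

    samples: iterable of (elements, terminal) pairs, where elements is a list
    of address elements ordered from coarsest to finest (an assumption).
    """
    counts = defaultdict(Counter)
    for elements, terminal in samples:
        for element in elements:
            counts[element][terminal] += 1
    return counts


def classify(counts, elements):
    """Return (terminal, estimated probability) for a segmented address.

    Walks from the finest element to the coarsest and uses the first one
    that was seen in training (an assumed rule, not taken from the paper).
    """
    for element in reversed(elements):
        terminals = counts.get(element)
        if terminals:
            terminal, hits = terminals.most_common(1)[0]
            return terminal, hits / sum(terminals.values())
    return None, 0.0  # no known element: leave for manual sorting


# Hypothetical training data: segmented addresses with their terminals.
samples = [
    (["Haidian District", "Xitucheng Road", "No. 10"], "T1"),
    (["Haidian District", "Xitucheng Road", "No. 10"], "T1"),
    (["Haidian District", "Minzhuang Road", "No. 89"], "T2"),
]
model = train(samples)
result = classify(model, ["Haidian District", "Minzhuang Road", "No. 3"])
```

Here the unseen house number falls back to the road-level element, so coarser elements act as a fallback when the finest one carries no statistics. A real system would also need the segmentation and stop-character filtering steps named in the keywords.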

CLC number: