Computer Engineering ›› 2025, Vol. 51 ›› Issue (6): 1-19. doi: 10.19678/j.issn.1000-3428.0069005

• Hot Topics and Reviews •

Soundscape Recognition: Explorations and Frontiers of Acoustic Scene Classification in the Digital Era

PANG Xin1, GE Fengpei2,*, LI Yanling1,3

  1. School of Computer Science and Technology, Inner Mongolia Normal University, Hohhot 010022, Inner Mongolia, China
  2. Library, Beijing University of Posts and Telecommunications, Beijing 100876, China
  3. Key Laboratory of Infinite-dimensional Hamiltonian System and Its Algorithm Application, Ministry of Education, Inner Mongolia Normal University, Hohhot 010022, Inner Mongolia, China
  • Received: 2023-12-12 Published: 2025-06-15 Online: 2024-05-22
  • Contact: GE Fengpei
  • Supported by:
    National Natural Science Foundation of China (12204062, 62266033, 61806103, 61562068); Open Project of the Key Laboratory of Infinite-dimensional Hamiltonian System and Its Algorithm Application, Ministry of Education (2023KFZD03); Natural Science Foundation of Inner Mongolia Autonomous Region (2022LHMS06001); Fundamental Research Funds for Inner Mongolia Normal University (2022JBQN106, 2022JBQN111, 2022JBTD016); Graduate Innovation Fund of Inner Mongolia Normal University (CXJJS23066)

Abstract:

Acoustic scene classification (ASC) aims to enable computers to recognize different acoustic environments by simulating human hearing, and it is one of the challenging tasks in computer audition. With rapid advances in intelligent audio processing and neural network learning algorithms, a series of new algorithms and techniques for ASC have emerged in recent years. To present the technical development and evolution of the field, this review surveys both early work and recent progress and gives a comprehensive introduction to the ASC task. It first describes the application scenarios of ASC and the challenges it faces. It then details the mainstream ASC frameworks, focusing on the deep learning algorithms applied in this field. Next, it systematically summarizes frontier explorations, extension tasks, and publicly available datasets for ASC. Finally, it discusses development trends and future prospects for ASC.

Key words: acoustic scene classification (ASC), deep learning, audio classification, speech recognition, data augmentation (DA)