1 |
VIRTANEN T, PLUMBLEY M D, ELLIS D. Computational analysis of sound scenes and events. Berlin, Germany: Springer, 2018.
|
2 |
VALIN J M, MICHAUD F, HADJOU B, et al. Localization of simultaneous moving sound sources for mobile robot using a frequency-domain steered beamformer approach[C]//Proceedings of the IEEE International Conference on Robotics and Automation. Washington D.C., USA: IEEE Press, 2004: 1033-1038.
|
3 |
ALEXANDRE E, CUADRA L, ROSA M, et al. Feature selection for sound classification in hearing aids through restricted search driven by genetic algorithms. IEEE Transactions on Audio, Speech, and Language Processing, 2007, 15(8): 2249-2256.
doi: 10.1109/TASL.2007.905139
|
4 |
VIVEK V S, VIDHYA S, MADHANMOHAN P. Acoustic scene classification in hearing aid using deep learning[C]//Proceedings of the International Conference on Communication and Signal Processing (ICCSP). Washington D.C., USA: IEEE Press, 2020: 695-699.
|
5 |
STOWELL D, CLAYTON D. Acoustic event detection for multiple overlapping similar sources[C]//Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA). Washington D.C., USA: IEEE Press, 2015: 1-5.
|
6 |
PHAM L, PHAN H, NGUYEN T, et al. Robust acoustic scene classification using a multi-spectrogram encoder-decoder framework. Digital Signal Processing, 2021, 110: 102943.
doi: 10.1016/j.dsp.2020.102943
|
7 |
BARCHIESI D, GIANNOULIS D, STOWELL D, et al. Acoustic scene classification: classifying environments from the sounds they produce. IEEE Signal Processing Magazine, 2015, 32(3): 16-34.
doi: 10.1109/MSP.2014.2326181
|
8 |
SINGH V K, SHARMA K, SUR S N. A survey on preprocessing and classification techniques for acoustic scene. Expert Systems with Applications, 2023, 229: 120520.
doi: 10.1016/j.eswa.2023.120520
|
9 |
CHANDRAKALA S, JAYALAKSHMI S L. Environmental audio scene and sound event recognition for autonomous surveillance: a survey and comparative studies. ACM Computing Surveys, 2019, 52(3): 1-34.
|
10 |
WANG D L, BROWN G J. Computational auditory scene analysis: principles, algorithms, and applications. [S. l.]: Wiley, 2006.
|
11 |
CLARKSON B, SAWHNEY N, PENTLAND A. Auditory context awareness via wearable computing. Energy, 1998, 400, 20.
|
12 |
YE J X, KOBAYASHI T, MURAKAWA M, et al. Acoustic scene classification based on sound textures and events[C]//Proceedings of the 23rd ACM International Conference on Multimedia. New York, USA: ACM Press, 2015: 1291-1294.
|
13 |
SALAMON J, JACOBY C, BELLO J P, et al. A dataset and taxonomy for urban sound research[C]//Proceedings of the 22nd ACM International Conference on Multimedia. New York, USA: ACM Press, 2014: 1041-1044.
|
14 |
ERONEN A J, PELTONEN V T, TUOMI J T, et al. Audio-based context recognition. IEEE Transactions on Audio, Speech, and Language Processing, 2006, 14(1): 321-329.
|
15 |
AUCOUTURIER J J, DEFREVILLE B, PACHET F. The bag-of-frames approach to audio pattern recognition: a sufficient model for urban soundscapes but not for polyphonic music. The Journal of the Acoustical Society of America, 2007, 122(2): 881-891.
doi: 10.1121/1.2750160
|
16 |
PASEDDULA C, GANGASHETTY S V. Late fusion framework for acoustic scene classification using LPCC, SCMC, and log-mel band energies with deep neural networks. Applied Acoustics, 2021, 172: 107568.
doi: 10.1016/j.apacoust.2020.107568
|
17 |
GREEN M C, ADAVANNE S, MURPHY D. Acoustic scene classification using higher-order ambisonic features[C]//Proceedings of 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA). Washington D.C., USA: IEEE Press, 2019: 42-45.
|
18 |
康丽霞, 马建芬, 张朝霞. 基于多特征后期融合的声学场景分类. 计算机工程与设计, 2023, 44(1): 141-147.
|
|
KANG L X, MA J F, ZHANG Z X. Acoustic scene classification based on multi-feature post-fusion. Computer Engineering and Design, 2023, 44(1): 141-147.
|
19 |
WALDEKAR S, SAHA G. Wavelet transform based mel-scaled features for acoustic scene classification[C]//Proceedings of the INTERSPEECH'18. Washington D.C., USA: IEEE Press, 2018: 3323-3327.
|
20 |
|
21 |
ERONEN A, TUOMI J, KLAPURI A, et al. Audio-based context awareness: acoustic modeling and perceptual evaluation[C]//Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing. Washington D.C., USA: IEEE Press, 2003: 520-529.
|
22 |
DORFER M, LEHNER B, EGHBAL-ZADEH H, et al. Acoustic scene classification with fully convolutional neural networks and I-Vectors[EB/OL]. [2023-11-07]. https://arxiv.org/abs/1607.02383.
|
23 |
OO M M. Comparative study of MFCC feature with different machine learning techniques in acoustic scene classification. International Journal of Research and Engineering, 2018, 5(7): 439-444.
|
24 |
SUN J Y, LIU X B, MEI X H, et al. Deep neural decision forest for acoustic scene classification[C]//Proceedings of the 30th European Signal Processing Conference (EUSIPCO). Washington D.C., USA: IEEE Press, 2022: 772-776.
|
25 |
PICZAK K J. Environmental sound classification with convolutional neural networks[C]//Proceedings of the IEEE 25th International Workshop on Machine Learning for Signal Processing (MLSP). Washington D.C., USA: IEEE Press, 2015: 1-6.
|
26 |
YANG L P, TAO L J, CHEN X X, et al. Multi-scale semantic feature fusion and data augmentation for acoustic scene classification. Applied Acoustics, 2020, 163: 107238.
doi: 10.1016/j.apacoust.2020.107238
|
27 |
沈昕昊, 陈嘉烨, 宋晓宁. 基于配对特征融合的声学场景分类方法. 计算机应用研究, 2023, 40(6): 1771-1776.
|
|
SHEN X H, CHEN J Y, SONG X N. Acoustic scene classification method based on paired feature fusion. Application Research of Computers, 2023, 40(6): 1771-1776.
|
28 |
|
29 |
常月, 侯元波, 谭奕舟, 等. 基于自注意力机制的多模态场景分类. 复旦学报(自然科学版), 2023, 62(1): 46-52.
|
|
CHANG Y, HOU Y B, TAN Y Z, et al. Multimodal scene classification based on self-attention mechanism. Journal of Fudan University (Natural Science), 2023, 62(1): 46-52.
|
30 |
ABEßER J. A review of deep learning based methods for acoustic scene classification. Applied Sciences, 2020, 10(6): 2020.
doi: 10.3390/app10062020
|
31 |
ZIELIŃSKI S K. Feature extraction of surround sound recordings for acoustic scene classification. Berlin, Germany: Springer International Publishing, 2018.
|
32 |
KAWAMURA T, KINOSHITA Y, ONO N, et al. Effectiveness of inter- and intra-subarray spatial features for acoustic scene classification[C]//Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Washington D.C., USA: IEEE Press, 2023: 1-5.
|
33 |
SALAMON J, BELLO J P. Deep convolutional neural networks and data augmentation for environmental sound classification. IEEE Signal Processing Letters, 2017, 24(3): 279-283.
|
34 |
AYTAR Y, VONDRICK C, TORRALBA A. SoundNet: learning sound representations from unlabeled video[C]//Proceedings of the 30th International Conference on Neural Information Processing Systems. New York, USA: ACM Press, 2016: 892-900.
|
35 |
|
36 |
|
37 |
YUN S, HAN D, CHUN S, et al. CutMix: regularization strategy to train strong classifiers with localizable features[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). Washington D.C., USA: IEEE Press, 2019: 6022-6031.
|
38 |
KIM G, HAN D K, KO H. SpecMix: a mixed sample data augmentation method for training with time-frequency domain features[EB/OL]. [2023-11-07]. https://arxiv.org/abs/2108.03020v1.
|
39 |
PARK D S, CHAN W, ZHANG Y, et al. SpecAugment: a simple data augmentation method for automatic speech recognition[EB/OL]. [2023-11-07]. https://arxiv.org/abs/1904.08779v3.
|
40 |
WANG H L, ZOU Y X, WANG W W. SpecAugment++: a hidden space data augmentation method for acoustic scene classification[EB/OL]. [2023-11-07]. https://arxiv.org/abs/2103.16858v3.
|
41 |
|
42 |
KIM B, YANG S, KIM J, et al. Domain generalization with relaxed instance frequency-wise normalization for multi-device acoustic scene classification[EB/OL]. [2023-11-07]. https://arxiv.org/abs/2206.12513v1.
|
43 |
MOROCUTTI T, SCHMID F, KOUTINI K, et al. Device-robust acoustic scene classification via impulse response augmentation[EB/OL]. [2023-11-07]. https://arxiv.org/abs/2305.07499v2.
|
44 |
XIE W, HE Q H, YAN H K, et al. Acoustic scene classification using deep CNNs with time-frequency representations[C]//Proceedings of the IEEE 21st International Conference on Communication Technology (ICCT). Washington D.C., USA: IEEE Press, 2021: 1325-1329.
|
45 |
VALENTI M, SQUARTINI S, DIMENT A, et al. A convolutional neural network approach for acoustic scene classification[C]//Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN). Washington D.C., USA: IEEE Press, 2017: 1547-1554.
|
46 |
LIM M, LEE D, PARK H, et al. Convolutional neural network based audio event classification. KSII Transactions on Internet and Information Systems, 2018, 12(6): 2748-2760.
|
47 |
PICZAK K J. The details that matter: frequency resolution of spectrograms in acoustic scene classification[C]//Proceedings of the Workshop on Detection and Classification of Acoustic Scenes and Events. Washington D.C., USA: IEEE Press, 2017: 103-107.
|
48 |
HAN Y, PARK J, LEE K. Convolutional neural networks with binaural representations and background subtraction for acoustic scene classification[C]//Proceedings of Workshop on Detection and Classification of Acoustic Scenes and Events. Washington D.C., USA: IEEE Press, 2018: 46-50.
|
49 |
REN Z, KONG Q Q, HAN J, et al. Attention-based atrous convolutional neural networks: visualisation and understanding perspectives of acoustic scenes[C]//Proceedings of the 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Washington D.C., USA: IEEE Press, 2019: 56-60.
|
50 |
KOUTINI K, EGHBAL-ZADEH H, DORFER M, et al. The receptive field as a regularizer in deep convolutional neural networks for acoustic scene classification[C]//Proceedings of the 27th European Signal Processing Conference (EUSIPCO). Washington D.C., USA: IEEE Press, 2019: 1-5.
|
51 |
|
52 |
|
53 |
BASBUG A M, SERT M. Acoustic scene classification using spatial pyramid pooling with convolutional neural networks[C]//Proceedings of the IEEE 13th International Conference on Semantic Computing (ICSC). Washington D.C., USA: IEEE Press, 2019: 128-131.
|
54 |
PHAYE S S R, BENETOS E, WANG Y. SubSpectralNet: using sub-spectrogram based convolutional neural networks for acoustic scene classification[C]//Proceedings of the 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Washington D.C., USA: IEEE Press, 2019: 825-829.
|
55 |
HU H, YANG C H, XIA X J, et al. Device-robust acoustic scene classification based on two-stage categorization and data augmentation[EB/OL]. [2023-11-07]. https://arxiv.org/abs/2007.08389v2.
|
56 |
KEK X Y, CHIN C S, LI Y. Multi-timescale wavelet scattering with genetic algorithm feature selection for acoustic scene classification. IEEE Access, 2022, 10: 25987-26001.
|
57 |
曹毅, 费鸿博, 李平, 等. 基于多流卷积和数据增强的声场景分类方法. 华中科技大学学报(自然科学版), 2022, 50(4): 40-46.
|
|
CAO Y, FEI H B, LI P, et al. Acoustic scene classification method based on multi-stream convolution and data augmentation. Journal of Huazhong University of Science and Technology (Natural Science Edition), 2022, 50(4): 40-46.
|
58 |
HASAN N W, SAUDI A S, KHALIL M I, et al. A genetic algorithm approach to automate architecture design for acoustic scene classification. IEEE Transactions on Evolutionary Computation, 2023, 27(2): 222-236.
|
59 |
|
60 |
|
61 |
ZHANG Z C, XU S G, ZHANG S Q, et al. Attention based convolutional recurrent neural network for environmental sound classification. Neurocomputing, 2021, 453: 896-903.
|
62 |
WANG C Y, SANTOSO A, WANG J C. Acoustic scene classification using self-determination convolutional neural network[C]//Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC). Washington D.C., USA: IEEE Press, 2017: 19-22.
|
63 |
|
64 |
TRIPATHI A, PAUL K. Temporal self attention-based residual network for environmental sound classification[C]//Proceedings of INTERSPEECH'22. Washington D.C., USA: IEEE Press, 2022: 128.
|
65 |
ZHANG Z C, XU S G, ZHANG S Q, et al. Learning attentive representations for environmental sound classification. IEEE Access, 2019, 7: 130327-130339.
|
66 |
WANG H L, ZOU Y X, CHONG D D, et al. Environmental sound classification with parallel temporal-spectral attention[EB/OL]. [2023-11-07]. https://arxiv.org/abs/1912.06808v3.
|
67 |
WANG Y, FENG C Y, ANDERSON D V. A multi-channel temporal attention convolutional neural network model for environmental sound classification[C]//Proceedings of the 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Washington D.C., USA: IEEE Press, 2021: 930-934.
|
68 |
LI Z T, HOU Y B, XIE X, et al. Multi-level attention model with deep scattering spectrum for acoustic scene classification[C]//Proceedings of the IEEE International Conference on Multimedia & Expo Workshops (ICMEW). Washington D.C., USA: IEEE Press, 2019: 396-401.
|
69 |
|
70 |
SINGH A, RAJAN P, BHAVSAR A. Deep multi-view features from raw audio for acoustic scene classification[C]//Proceedings of DCASE'19. New York, USA: ACM Press, 2019: 229-233.
|
71 |
|
72 |
HUANG J, LU H, MEYER P L, et al. Acoustic scene classification using deep learning-based ensemble averaging[C]//Proceedings of DCASE'19. New York, USA: ACM Press, 2019: 94-98.
|
73 |
KUMAR A, KHADKEVICH M, FÜGEN C. Knowledge transfer from weakly labeled audio using convolutional neural network for sound events and scenes[C]//Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Washington D.C., USA: IEEE Press, 2018: 326-330.
|
74 |
|
75 |
NWE T L, DAT T H, MA B. Convolutional neural network with multi-task learning scheme for acoustic scene classification[C]//Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference. Washington D.C., USA: IEEE Press, 2017: 1347-1350.
|
76 |
|
77 |
|
78 |
|
79 |
|
80 |
|
81 |
|
82 |
|
83 |
|
84 |
|
85 |
|
86 |
|
87 |
|
88 |
ZHANG D, LI S M, ZHANG X, et al. SpeechGPT: empowering large language models with intrinsic cross-modal conversational abilities[EB/OL]. [2023-11-07]. https://arxiv.org/abs/2305.11000v2.
|
89 |
|
90 |
|
91 |
CAI R, LU L, HANJALIC A, et al. A flexible framework for key audio effects detection and auditory context inference. IEEE Transactions on Audio, Speech, and Language Processing, 2006, 14(3): 1026-1039.
|
92 |
GIANNOULIS D, STOWELL D, BENETOS E, et al. A database and challenge for acoustic scene classification and event detection[C]//Proceedings of the 21st European Signal Processing Conference. Washington D.C., USA: IEEE Press, 2013: 1-5.
|
93 |
CHAUDHURI S, RAJ B. Unsupervised hierarchical structure induction for deeper semantic analysis of audio[C]//Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing. Washington D.C., USA: IEEE Press, 2013: 833-837.
|
94 |
GIANNOULIS D, BENETOS E, STOWELL D, et al. Detection and classification of acoustic scenes and events: an IEEE AASP challenge[C]//Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. Washington D.C., USA: IEEE Press, 2013: 1-4.
|
95 |
MESAROS A, HEITTOLA T, VIRTANEN T. TUT database for acoustic scene classification and sound event detection[C]//Proceedings of the 24th European Signal Processing Conference. Washington D.C., USA: IEEE Press, 2016: 1128-1132.
|
96 |
MESAROS A, HEITTOLA T, BENETOS E, et al. Detection and classification of acoustic scenes and events: outcome of the DCASE 2016 challenge. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2018, 26(2): 379-393.
|
97 |
|
98 |
|
99 |
|
100 |
HEITTOLA T, MESAROS A, VIRTANEN T. Acoustic scene classification in DCASE 2020 challenge: generalization across devices and low complexity solutions[EB/OL]. [2023-11-07]. https://arxiv.org/abs/2005.14623v2.
|
101 |
WANG S S, MESAROS A, HEITTOLA T, et al. A curated dataset of urban scenes for audio-visual scene analysis[C]//Proceedings of the 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Washington D.C., USA: IEEE Press, 2021: 626-630.
|
102 |
WANG S S, HEITTOLA T, MESAROS A, et al. Audio-visual scene classification: analysis of DCASE2021 challenge submissions[EB/OL]. [2023-11-07]. https://arxiv.org/abs/2105.13675v2.
|
103 |
MARTÍN-MORATÓ I, PAISSAN F, ANCILOTTO A, et al. Low-complexity acoustic scene classification in DCASE2022 Challenge[EB/OL]. [2023-11-07]. https://arxiv.org/abs/2206.03835v2.
|
104 |
PICZAK K J. ESC: dataset for environmental sound classification[C]//Proceedings of the 23rd ACM International Conference on Multimedia. New York, USA: ACM Press, 2015: 1015-1018.
|
105 |
GEMMEKE J F, ELLIS D P W, FREEDMAN D, et al. AudioSet: an ontology and human-labeled dataset for audio events[C]//Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Washington D.C., USA: IEEE Press, 2017: 776-780.
|
106 |
FONSECA E, FAVORY X, PONS J, et al. FSD50K: an open dataset of human-labeled sound events. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2021, 30: 829-852.
|
107 |
SAKI F, GUO Y Y, HUNG C Y, et al. Open-set evolving acoustic scene classification system[C]//Proceedings of the Workshop on Detection and Classification of Acoustic Scenes and Events. New York, USA: ACM Press, 2019: 219-223.
|
108 |
WILKINGHOFF K, KURTH F. Open-set acoustic scene classification with deep convolutional autoencoders[C]//Proceedings of the Workshop on Detection and Classification of Acoustic Scenes and Events. New York, USA: ACM Press, 2019: 258-262.
|
109 |
|
110 |
|
111 |
SANDLER M, HOWARD A, ZHU M L, et al. MobileNetV2: inverted residuals and linear bottlenecks[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D.C., USA: IEEE Press, 2018: 4510-4520.
|
112 |
|
113 |
CHEN W L, WILSON J T, TYREE S, et al. Compressing neural networks with the hashing trick[C]//Proceedings of the 32nd International Conference on Machine Learning. New York, USA: ACM Press, 2015: 2285-2294.
|
114 |
|
115 |
|
116 |
|
117 |
|
118 |
MEZZA A I, HABETS E A P, MÜLLER M, et al. Unsupervised domain adaptation for acoustic scene classification using band-wise statistics matching[C]//Proceedings of the 28th European Signal Processing Conference (EUSIPCO). Washington D.C., USA: IEEE Press, 2021: 11-15.
|
119 |
|
120 |
KIM B, YANG S, KIM J, et al. QTI submission to DCASE2021: residual normalization for device-imbalanced acoustic scene classification with efficient design[EB/OL]. [2023-11-07]. https://arxiv.org/abs/2206.13909v2.
|
121 |
MEZZA A I, HABETS E A P, MÜLLER M, et al. Unsupervised domain adaptation via principal subspace projection for acoustic scene classification. Journal of Signal Processing Systems, 2022, 94(2): 197-213.
doi: 10.1007/s11265-021-01720-9
|
122 |
GHARIB S, DROSSOS K, ÇAKIR E, et al. Unsupervised adversarial domain adaptation for acoustic scene classification[C]//Proceedings of Workshop on Detection and Classification of Acoustic Scenes and Events. New York, USA: ACM Press, 2018: 138-142.
|
123 |
OLVERA M, VINCENT E, GASSO G. On the impact of normalization strategies in unsupervised adversarial domain adaptation for acoustic scene classification[C]//Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Washington D.C., USA: IEEE Press, 2022: 631-635.
|
124 |
DROSSOS K, MAGRON P, VIRTANEN T. Unsupervised adversarial domain adaptation based on the Wasserstein distance for acoustic scene classification[C]//Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA). Washington D.C., USA: IEEE Press, 2019: 259-263.
|