1 |
SNYDER D, GARCIA-ROMERO D, SELL G, et al. x-vectors: robust DNN embeddings for speaker recognition[C]//Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Washington D. C., USA: IEEE Press, 2018: 5329-5333.
|
2 |
CAO S X, FENG T T, GE F P, et al. Speaker recognition based on scale correlation-bidirectional long short-term memory network model[J]. Computer Engineering, 2023, 49(4): 289-296. (in Chinese)
doi: 10.19678/j.issn.1000-3428.0064388
|
3 |
LIU X X, JI Y, LIU C P. Voiceprint recognition based on LSTM neural network[J]. Computer Science, 2021, 48(S2): 270-274. (in Chinese)
|
4 |
VARIANI E, LEI X, MCDERMOTT E, et al. Deep neural networks for small footprint text-dependent speaker verification[C]//Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Washington D. C., USA: IEEE Press, 2014: 4052-4056.
|
5 |
DESPLANQUES B, THIENPONDT J, DEMUYNCK K. ECAPA-TDNN: emphasized channel attention, propagation and aggregation in TDNN based speaker verification[C]//Proceedings of the Interspeech 2020. [S. l. ]: ISCA, 2020: 3830-3834.
|
6 |
POVEY D, CHENG G F, WANG Y M, et al. Semi-orthogonal low-rank matrix factorization for deep neural networks[C]//Proceedings of the Interspeech 2018. [S. l. ]: ISCA, 2018: 3743-3747.
|
7 |
ZHANG Y J, ZHANG Z. Application of DenseNet in voiceprint recognition[J]. Computer Engineering & Science, 2022, 44(1): 132-137. (in Chinese)
|
8 |
CAI W C, CHEN J K, LI M. Exploring the encoding layer and loss function in end-to-end speaker and language recognition system[C]//Proceedings of Odyssey 2018. [S. l. ]: ISCA, 2018: 1-10.
|
9 |
XIAO R Q, MIAO X X, WANG W C, et al. Adaptive margin circle loss for speaker verification[C]//Proceedings of the Interspeech 2021. [S. l. ]: ISCA, 2021: 1-8.
|
10 |
KINNUNEN T, LI H Z. An overview of text-independent speaker recognition: from features to supervectors[J]. Speech Communication, 2010, 52(1): 12-40.
|
11 |
|
12 |
CHUNG J S, HUH J, MUN S, et al. In defence of metric learning for speaker recognition[C]//Proceedings of the Interspeech 2020. [S. l. ]: ISCA, 2020: 2977-2981.
|
13 |
HANSEN J H L, WANG Z Y. Audio anti-spoofing using simple attention module and joint optimization based on additive angular margin loss and meta-learning[C]//Proceedings of the Interspeech 2022. [S. l. ]: ISCA, 2022: 376-380.
|
14 |
LI R D, FANG S, MA C G, et al. Adaptive rectangle loss for speaker verification[C]//Proceedings of the Interspeech 2022. [S. l. ]: ISCA, 2022: 301-305.
|
15 |
NOVOSELOV S, SHCHEMELININ V, SHULIPA A, et al. Triplet loss based cosine similarity metric learning for text-independent speaker recognition[C]//Proceedings of the Interspeech 2018. [S. l. ]: ISCA, 2018: 2242-2246.
|
16 |
WAN L, WANG Q, PAPIR A, et al. Generalized end-to-end loss for speaker verification[C]//Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Washington D. C., USA: IEEE Press, 2018: 4879-4883.
|
17 |
XIE W D, NAGRANI A, CHUNG J S, et al. Utterance-level aggregation for speaker recognition in the wild[C]//Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Washington D. C., USA: IEEE Press, 2019: 5791-5795.
|
18 |
KNOCHE M, ELKADEEM M, HORMANN S, et al. Octuplet loss: make face recognition robust to image resolution[C]//Proceedings of the IEEE 17th International Conference on Automatic Face and Gesture Recognition. Washington D. C., USA: IEEE Press, 2023: 1-8.
|
19 |
GAO Z F, SONG Y, MCLOUGHLIN I, et al. Improving aggregation and loss function for better embedding learning in end-to-end speaker verification system[C]//Proceedings of the Interspeech 2019. [S. l. ]: ISCA, 2019: 361-365.
|
20 |
|
21 |
NAGRANI A, CHUNG J S, ZISSERMAN A. VoxCeleb: a large-scale speaker identification dataset[C]//Proceedings of the Interspeech 2017. [S. l. ]: ISCA, 2017: 1-10.
|
22 |
FAN Y, KANG J W, LI L T, et al. CN-Celeb: a challenging Chinese speaker recognition dataset[C]//Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Washington D. C., USA: IEEE Press, 2020: 7604-7608.
|
23 |
PARK D S, CHAN W, ZHANG Y, et al. SpecAugment: a simple data augmentation method for automatic speech recognition[C]//Proceedings of the Interspeech 2019. [S. l. ]: ISCA, 2019: 2613-2617.
|
24 |
KINGMA D P, BA J. Adam: a method for stochastic optimization[C]//Proceedings of the 3rd International Conference on Learning Representations (ICLR). San Diego, USA: [s. n. ], 2015: 1-15.
|
25 |
ZHOU D, WANG L B, LEE K A, et al. Dynamic margin softmax loss for speaker verification[C]//Proceedings of the Interspeech 2020. [S. l. ]: ISCA, 2020: 3800-3804.
|
26 |
LI Z, MAK M W. Speaker representation learning via contrastive loss with maximal speaker separability[C]//Proceedings of the 2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC). Washington D. C., USA: IEEE Press, 2022: 962-967.
|
27 |
HAN B, CHEN Z Y, QIAN Y M. Exploring binary classification loss for speaker verification[C]//Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Washington D. C., USA: IEEE Press, 2023: 1-5.
|
28 |
PRZYBOCKI M, MARTIN A, LE A. NIST speaker recognition evaluation chronicles - part 2[C]//Proceedings of Odyssey 2006. Washington D. C., USA: IEEE Press, 2006: 1-10.
|
29 |
DENG J K, GUO J, XUE N N, et al. ArcFace: additive angular margin loss for deep face recognition[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Washington D. C., USA: IEEE Press, 2019: 4690-4699.
|