[1] DA’COSTA A, TEKE J, ORIGBO J E, et al. AI-driven triage in emergency departments: A review of benefits, challenges, and future directions[J]. International Journal of Medical Informatics, 2025, 197: 105838.
[2] AGARWAL S, ARYA K V, MEENA Y K. CNN-O-ELMNet: Optimized Lightweight and Generalized Model for Lung Disease Classification and Severity Assessment[J]. IEEE Transactions on Medical Imaging, 2024, 43(12): 4200-4210.
[3] 刘兆伟,方艳红,郑明宇,等.基于注意力机制与多任务的肺部疾病诊断方法[J].计算机工程,2025,51(01):332-342.
LIU Z W, FANG Y H, ZHENG M Y, et al. Lung disease diagnosis method based on attention mechanism and multi-tasking[J]. Computer Engineering, 2025, 51(01): 332-342. (in Chinese)
[4] NOWAK S, SCHNEIDER H, LAYER Y C, et al. Development of image-based decision support systems utilizing information extracted from radiological free-text report databases with text-based transformers[J]. European Radiology, 2024, 34(5): 2895-2904.
[5] KADHIM M N, AL-SHAMMARY D, SUFI F. A novel voice classification based on Gower distance for Parkinson disease detection[J]. International Journal of Medical Informatics, 2024, 191: 105583.
[6] HANSEN L, ROCCA R, SIMONSEN A, et al. Speech- and text-based classification of neuropsychiatric conditions in a multidiagnostic setting[J]. Nature Mental Health, 2023, 1(12): 971-981.
[7] 杨士臣.面向医学文本的多标签分类方法研究[D]. 武汉: 武汉纺织大学,2024.
YANG S C. Research on multi-label classification methods for medical texts[D]. Wuhan: Wuhan Textile University, 2024. (in Chinese)
[8] 黎超,廖薇.基于医疗知识驱动的中文疾病文本分类模型[J].山东大学学报(理学版),2024,59(07):122-130.
LI C, LIAO W. Chinese disease text classification model driven by medical knowledge[J]. Journal of Shandong University (Natural Science), 2024, 59(07): 122-130. (in Chinese)
[9] 郑恩昱.基于深度学习组合模型的医疗文本分类[D]. 北京: 中央财经大学,2023.
ZHENG E Y. Medical text classification based on deep learning combination models[D]. Beijing: Central University of Finance and Economics, 2023. (in Chinese)
[10] LIU J, NGUYEN A, CAPURRO D, et al. Comparing text-based clinical risk prediction in critical care: a note-specific hierarchical network and large language models[J]. IEEE Journal of Biomedical and Health Informatics, 2025, 29(10): 7657-7667.
[11] PENG X, XU H, LIU J, et al. Voice disorder classification using convolutional neural network based on deep transfer learning[J]. Scientific Reports, 2023, 13(1): 7264.
[12] 梁丽娟.基于语音声学特征的抑郁智能识别模型及其验证研究[D]. 沈阳: 中国医科大学,2023.
LIANG L J. Research on the intelligent recognition model of depression based on speech acoustic features and its verification[D]. Shenyang: China Medical University, 2023. (in Chinese)
[13] ZHANG Z, WANG T, HU Z, et al. Multivariate time series approach integrating cross-temporal and cross-channel attention for dysarthria detection from speech[J]. Neurocomputing, 2025, 647: 130708.
[14] 孙阿朗.面向方言语音的阿尔茨海默病早期筛查系统设计与实现[D]. 上海: 东华大学,2025.
SUN A L. Design and implementation of early screening system for Alzheimer's disease oriented to dialect speech[D]. Shanghai: Donghua University, 2025. (in Chinese)
[15] 陈垒.基于注意力机制的阿尔茨海默病患者语音检测研究[D]. 重庆: 重庆工商大学,2025.
CHEN L. Research on speech detection of Alzheimer's disease patients based on attention mechanism[D]. Chongqing: Chongqing Technology and Business University, 2025. (in Chinese)
[16] LI S, NAIR R, NAQVI S M. Acoustic and text features analysis for adult ADHD screening: A data-driven approach utilizing DIVA interview[J]. IEEE Journal of Translational Engineering in Health and Medicine, 2024, 12: 359-370.
[17] 赵健,崔骞,石佳,等.基于文本和声学特征的双模态融合抑郁倾向识别算法[J].计算机工程,2024,50(11):49-58.
ZHAO J, CUI Q, SHI J, et al. Dual-modal fusion depression tendency recognition algorithm based on text and acoustic features[J]. Computer Engineering, 2024, 50(11): 49-58. (in Chinese)
[18] 宋泓.基于文本—音频—图像信息的多模态情感分析与抑郁症辅助诊断方法[D]. 南京: 南京信息工程大学,2025.
SONG H. Multimodal emotion analysis and auxiliary diagnosis of depression based on text-audio-image information[D]. Nanjing: Nanjing University of Information Science and Technology, 2025. (in Chinese)
[19] DEVLIN J, CHANG M W, LEE K, et al. BERT: pre-training of deep bidirectional transformers for language understanding[C]//Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Minneapolis: Association for Computational Linguistics, 2019: 4171-4186.
[20] GONG Y, CHUNG Y A, GLASS J. AST: audio spectrogram transformer[C]//Proceedings of Interspeech 2021. Brno: International Speech Communication Association, 2021: 571-575.
[21] MOONEY P. Medical speech, transcription, and intent: audio utterances paired with text for common medical symptoms[DB/OL]. Kaggle. https://www.kaggle.com/datasets/paultimothymooney/medical-speech-transcription-and-intent/data.
[22] HE P, LIU X, GAO J, et al. DeBERTa: decoding-enhanced BERT with disentangled attention[C]//Proceedings of the 9th International Conference on Learning Representations. OpenReview.net, 2021: 1-21.
[23] LEWIS M, LIU Y, GOYAL N, et al. BART: denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension[C]//Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA: Association for Computational Linguistics, 2020: 7871-7880.
[24] ZHANG K, LIU X, ZHAO N, et al. Dual channel semantic enhancement-based convolutional neural networks model for text classification[J]. International Journal of Modern Physics C, 2025, 36(10): 2442012.
[25] CHEN L, CHEN J. Deep neural network for automatic classification of pathological voice signals[J]. Journal of Voice, 2022, 36(2): 288.e15-288.e24.
[26] BELABBAS S, ADDOU D, SELOUANI S A. Pathological voice classification system based on CNN-BiLSTM network using speech enhancement and multi-stream approach[J]. International Journal of Speech Technology, 2024, 27(2): 483-502.
[27] CHANG K W, HSU M H, LI S W, et al. Exploring in-context learning of textless speech language model for speech classification tasks[C]//Proceedings of Interspeech 2024. Kos: International Speech Communication Association, 2024: 4139-4143.
[28] LECUN Y, BOTTOU L, BENGIO Y, et al. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 1998, 86(11): 2278-2324.
[29] HOUJEIJ A, HAMIEH L, MEHDI N, et al. A novel approach for emotion classification based on fusion of text and speech[C]//Proceedings of the 2012 19th International Conference on Telecommunications. Piscataway: IEEE, 2012: 1-6.
[30] SHANG Y, FU T. Multimodal fusion: a study on speech-text emotion recognition with the integration of deep learning[J]. Intelligent Systems with Applications, 2024, 24: 200436.
[31] LIU Z, WANG Y, VAIDYA S, et al. KAN: Kolmogorov-Arnold networks[C]//Proceedings of the 2025 International Conference on Learning Representations. OpenReview.net, 2025: 1-47.
[32] GORISHNIY Y, KOTELNIKOV A, BABENKO A. TabM: advancing tabular deep learning with parameter-efficient ensembling[C]//Proceedings of the 2025 International Conference on Learning Representations. OpenReview.net, 2025: 1-37.
[33] PANDEY A, SINGH J, KAUR M. Bridging text and speech for emotion understanding: an explainable multimodal transformer fusion framework with unified audio-text attribution[J]. Journal of Intelligence, 2025, 13(12): 159.
[34] BAEVSKI A, ZHOU Y, MOHAMED A, et al. Wav2vec 2.0: a framework for self-supervised learning of speech representations[C]//Advances in Neural Information Processing Systems 33. Red Hook: Curran Associates, Inc., 2020: 1-12.