
Computer Engineering


Joint Hyperspectral and LiDAR Classification Based on Multiscale Hybrid Convolutional Mamba Networks

  • Published: 2025-09-05

Abstract: The joint classification of hyperspectral imagery (HSI) and light detection and ranging (LiDAR) data can fully exploit their complementary strengths in spectral and spatial-structural information, and has become an important research direction in remote sensing. However, because the two modalities are produced by markedly different imaging mechanisms, HSI and LiDAR are highly heterogeneous in data dimensionality and feature distribution, which poses severe challenges for the semantic representation and efficient fusion of multimodal data. To address these challenges, we propose a Multiscale Hybrid Convolutional Mamba Network (MHCMNet) for joint HSI and LiDAR data classification. The framework first employs a Multi-Scale Feature Extraction Module (MFEM) to extract spectral, spatial, and elevation features from the two modalities; parallel Feature Tokenization Modules (FTM) then convert the features of both modalities into unified feature tokens. To further strengthen the collaborative representation of multimodal features, MHCMNet introduces a Mamba-based Feature Fusion Module (MFFM), whose long-range dependency modeling enables deep intra- and inter-modal feature interaction and efficient fusion. Experimental results show that MHCMNet achieves the highest overall accuracy (OA) among the compared methods, reaching 99.03%, 90.71%, and 91.47% on the Trento, Houston2013, and MUUFL datasets, respectively, while maintaining low model complexity. Ablation studies further confirm that each module contributes to the performance gains, demonstrating the effectiveness of the proposed method for multi-source remote sensing data classification.
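To make the described pipeline concrete, below is a minimal PyTorch sketch of how the MFEM → FTM → MFFM flow could be wired together. Everything in it is an assumption for illustration: kernel sizes, channel widths, and input shapes are placeholders, and a standard Transformer encoder layer stands in for the paper's Mamba fusion block, since the authors' actual implementation is not shown here.

```python
# Minimal sketch of the MFEM -> FTM -> MFFM pipeline from the abstract.
# Module internals, channel sizes, and the Transformer stand-in for the
# Mamba fusion block are illustrative assumptions, not the paper's code.
import torch
import torch.nn as nn

class MFEM(nn.Module):
    """Multi-scale feature extraction: parallel convs with 1/3/5 kernels."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(in_ch, out_ch, k, padding=k // 2) for k in (1, 3, 5)
        )
        self.fuse = nn.Conv2d(3 * out_ch, out_ch, 1)  # merge the branches

    def forward(self, x):
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))

class FTM(nn.Module):
    """Feature tokenization: flatten spatial positions into a token sequence."""
    def __init__(self, ch, dim):
        super().__init__()
        self.proj = nn.Linear(ch, dim)

    def forward(self, x):                      # x: (B, C, H, W)
        tokens = x.flatten(2).transpose(1, 2)  # (B, H*W, C)
        return self.proj(tokens)               # (B, H*W, dim)

class MFFM(nn.Module):
    """Fusion over the concatenated HSI/LiDAR token sequence. The paper uses
    a Mamba (state-space) block here; a Transformer layer stands in below."""
    def __init__(self, dim):
        super().__init__()
        self.mixer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)

    def forward(self, hsi_tok, lidar_tok):
        fused = self.mixer(torch.cat([hsi_tok, lidar_tok], dim=1))
        return fused.mean(dim=1)               # pooled joint representation

class MHCMNet(nn.Module):
    def __init__(self, hsi_bands=144, lidar_bands=1, dim=64, n_classes=15):
        super().__init__()
        self.hsi_mfem, self.lidar_mfem = MFEM(hsi_bands, dim), MFEM(lidar_bands, dim)
        self.hsi_ftm, self.lidar_ftm = FTM(dim, dim), FTM(dim, dim)
        self.mffm = MFFM(dim)
        self.head = nn.Linear(dim, n_classes)

    def forward(self, hsi_patch, lidar_patch):
        h = self.hsi_ftm(self.hsi_mfem(hsi_patch))
        l = self.lidar_ftm(self.lidar_mfem(lidar_patch))
        return self.head(self.mffm(h, l))

# Example with Houston2013-like dimensions: 11x11 patches, 144 HSI bands,
# a single LiDAR elevation channel, and 15 land-cover classes.
logits = MHCMNet()(torch.randn(2, 144, 11, 11), torch.randn(2, 1, 11, 11))
print(logits.shape)  # torch.Size([2, 15])
```

Note that the fusion stage is where the sketch departs most from the paper: a selective state-space (Mamba) block scales linearly with token sequence length, whereas the self-attention stand-in above scales quadratically, which is precisely the efficiency argument for the Mamba-based MFFM.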