
Computer Engineering

   

Multi-level Loss-assisted Siamese Network for Remote Sensing Image Change Detection

  

  • Published: 2026-06-12


Abstract: Remote sensing image change detection localizes land cover changes by comparing the spatiotemporal information contained in bi-temporal imagery, and has become a core task in dynamic land resource monitoring, urban expansion assessment, and disaster emergency response. However, owing to complex terrain interference, varying illumination conditions, seasonal vegetation succession, and sensor imaging noise, change regions often span a wide range of scales, are spatially scattered, and have ambiguous boundaries. Existing change detection models make insufficient use of multi-scale information and extract deep global semantic correlations inadequately, which makes it difficult for them to distinguish genuine land surface changes from pseudo-changes and limits their accuracy in open scenes. To address these limitations, a Multi-level Loss-assisted Siamese Network (MLLA_SiaNet) for remote sensing image change detection is proposed. The model adopts a weight-sharing Siamese architecture to extract multi-dimensional features from each of the two temporal images, and generates hierarchical feature maps through a multi-level differential encoder. To overcome the linear limitations of conventional differencing methods, we couple a multi-angle difference representation strategy with a channel-spatial hybrid attention mechanism and design a Differential Fusion Module (DFM) that produces high-quality difference features, adaptively suppressing background interference and focusing on genuine change characteristics.
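As a rough illustration of the weight-sharing and difference-fusion ideas described above, the following NumPy sketch passes both temporal images through one shared projection and then gates a stack of difference views with channel and spatial attention. Everything concrete here is an assumption made for illustration: the abstract does not specify the encoder, which views the multi-angle difference representation uses, or how the attention is computed, so `shared_encoder`, `fuse_differences`, the three difference views, and the sigmoid gates are hypothetical stand-ins rather than the DFM itself.

```python
import numpy as np

def shared_encoder(image, weights):
    """Project each pixel's channels with one weight matrix (a 1x1 convolution), then ReLU.

    image: (H, W, C_in), weights: (C_in, C_out). Hypothetical stand-in for the real encoder.
    """
    return np.maximum(image @ weights, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fuse_differences(feat_a, feat_b):
    """Illustrative difference fusion (an assumption, not the paper's DFM)."""
    # Several views of the change signal, stacked along the channel axis.
    abs_diff = np.abs(feat_a - feat_b)
    product = feat_a * feat_b
    total = feat_a + feat_b
    diff = np.concatenate([abs_diff, product, total], axis=-1)
    # Channel attention: one gate per channel, computed from its global average.
    channel_gate = sigmoid(diff.mean(axis=(0, 1), keepdims=True))
    diff = diff * channel_gate
    # Spatial attention: one gate per pixel, computed from its channel average.
    spatial_gate = sigmoid(diff.mean(axis=-1, keepdims=True))
    return diff * spatial_gate

rng = np.random.default_rng(0)
weights = rng.normal(size=(3, 8))      # the SAME weights serve both branches
image_t1 = rng.random((16, 16, 3))
image_t2 = image_t1.copy()
image_t2[4:8, 4:8] += 1.0              # synthetic change in a 4x4 patch

feat_t1 = shared_encoder(image_t1, weights)
feat_t2 = shared_encoder(image_t2, weights)
fused = fuse_differences(feat_t1, feat_t2)
print(fused.shape)
```

Because both branches use the same weights, unchanged pixels produce identical features and their absolute-difference channels are exactly zero, which is the property that motivates a Siamese design for change detection.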
To compensate for the deficiency in global semantic representation, we integrate a spatial pooling pyramid with a Gaussian pyramid and propose a Deep Semantic Pyramid (DSP) module that constructs multi-level semantic aggregation features, enlarging the receptive field and strengthening the modeling of long-range contextual dependencies. In the decoding stage, the model combines progressive upsampling with feature fusion to restore spatial details level by level and reconstruct a high-resolution prediction map. Furthermore, we introduce a deeply supervised Multi-level Loss-assisted (MLA) strategy to optimize training: auxiliary constraints are imposed on the output of each decoder layer to keep local edge information consistent with global contextual semantics, yielding an end-to-end feature learning framework. To validate the proposed model, comparative experiments are conducted on two publicly available benchmark datasets, SYSU-CD and LEVIR-CD. On the SYSU-CD dataset, MLLA_SiaNet achieves an F1-score of 82.13%, outperforming the seven other methods compared and surpassing the second-best method, SFEARNet, by 1.3 percentage points; its precision and recall are also the best among the compared methods, at 83.42% and 80.88%, respectively, so precision and recall improve together.
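The deep supervision idea above, in which every decoder level receives its own loss term, can be sketched as a weighted sum of per-level losses. This is a minimal sketch under stated assumptions: the abstract does not name the loss function, the level weights, or how the ground truth is matched to each level, so binary cross-entropy, the weights `[1.0, 0.5, 0.25]`, max-pool downsampling of the mask, and the helper names are all hypothetical choices.

```python
import numpy as np

def binary_cross_entropy(prob, target, eps=1e-7):
    """Mean BCE between predicted change probabilities and a binary mask."""
    prob = np.clip(prob, eps, 1.0 - eps)
    return float(-np.mean(target * np.log(prob) + (1.0 - target) * np.log(1.0 - prob)))

def downsample_mask(mask, factor):
    """Max-pool a binary mask: a coarse cell counts as changed if any pixel in it changed."""
    h, w = mask.shape
    return mask.reshape(h // factor, factor, w // factor, factor).max(axis=(1, 3))

def multi_level_loss(predictions, mask, weights):
    """Weighted sum of per-level BCE terms (assumed form of the auxiliary constraints).

    predictions: square probability maps, one per decoder level, ordered fine to coarse.
    """
    total = 0.0
    for prob, weight in zip(predictions, weights):
        factor = mask.shape[0] // prob.shape[0]
        total += weight * binary_cross_entropy(prob, downsample_mask(mask, factor))
    return total

mask = np.zeros((8, 8))
mask[2:6, 2:6] = 1.0                   # ground-truth change region

# Stand-in decoder outputs at full, 1/2, and 1/4 resolution.
perfect = [downsample_mask(mask, f) for f in (1, 2, 4)]
uninformed = [np.full_like(p, 0.5) for p in perfect]
weights = [1.0, 0.5, 0.25]

print(multi_level_loss(perfect, mask, weights))     # close to zero
print(multi_level_loss(uninformed, mask, weights))  # 1.75 * ln 2
```

An alternative design, equally consistent with the abstract, would upsample each intermediate prediction to full resolution and compare it with the original mask instead of downsampling the mask.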
On the LEVIR-CD dataset, MLLA_SiaNet achieves a precision of 89.48%, demonstrating the effectiveness of the proposed method in suppressing pseudo-change factors such as illumination variations, shadow effects, and seasonal vegetation changes; its F1-score on LEVIR-CD reaches 85.87%, outperforming other state-of-the-art methods including SFEARNet (precision 84.89%), BIT (precision 82.80%), and IFN (precision 82.29%). Both quantitative and qualitative analyses of the experimental results show that the model is robust across different spatial resolutions and complex land cover conditions. Ablation studies further confirm the contributions of the DFM, DSP, and MLA modules to overall model performance, and the effectiveness of each architectural stage is verified through analysis of the visualized response feature maps. In summary, this study mitigates several critical challenges in remote sensing image change detection, namely insufficient multi-scale feature interaction, weak correlation modeling of global semantic information, and difficulty in suppressing pseudo-change interference. Future work will focus on lightweight model deployment, multi-temporal sequence modeling, and self-supervised pre-training, and will extend the systematic evaluation of model robustness to more diverse application scenarios.
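The metrics quoted in the abstract can be cross-checked with the standard definition of the F1-score as the harmonic mean of precision and recall. The short sketch below recomputes the SYSU-CD F1-score from the reported precision and recall. It also inverts the formula to estimate the LEVIR-CD recall, which the abstract does not report; that value is only an approximation derived from rounded figures, and `implied_recall` is a helper name introduced here for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both in the same units, here percent)."""
    return 2.0 * precision * recall / (precision + recall)

def implied_recall(precision, f1):
    """Solve F1 = 2PR / (P + R) for R. Derived value, not a figure from the abstract."""
    return f1 * precision / (2.0 * precision - f1)

# SYSU-CD: precision 83.42%, recall 80.88% as reported in the abstract.
sysu_f1 = f1_score(83.42, 80.88)
print(round(sysu_f1, 2))  # 82.13, matching the reported F1-score

# LEVIR-CD: precision 89.48%, F1 85.87% as reported; recall is inferred, not reported.
levir_recall = implied_recall(89.48, 85.87)
print(levir_recall)
```

The recomputed SYSU-CD value agrees with the reported 82.13%, so the three SYSU-CD figures are mutually consistent.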
