
Computer Engineering ›› 2025, Vol. 51 ›› Issue (2): 213-222. doi: 10.19678/j.issn.1000-3428.0069743

• Architecture and Software Technology •

Hardware Acceleration Design for Edge Intelligence of Transmission Lines

ZHANG Shuhua1,2, WANG Jiye2, ZHAO Chuanqi2, CHEN Hongming3,*, GUO Yongwen3

  1. School of Electrical and Electronic Engineering, North China Electric Power University, Beijing 102206, China
  2. China Electric Power Research Institute, Beijing 100192, China
  3. School of Information Engineering, Zhejiang Ocean University, Zhoushan 316022, Zhejiang, China
  • Received: 2024-04-15 Online: 2025-02-15 Published: 2024-08-05
  • Contact: CHEN Hongming
  • Supported by: Science and Technology Project of State Grid Corporation of China (5700-202255475A-2-0-KJ)

Abstract:

In recent years, with the development of the Internet of Things (IoT) for power transmission, online monitoring of transmission lines has become a key construction project. However, the limited computing capacity and the power consumption of embedded platforms hinder the implementation of transmission line visualization. To address these issues, this paper studies an in-memory computing optimization technology that tightly integrates computing and storage resources. First, a lightweight neural network dedicated to transmission line target recognition is designed, which effectively reduces resource utilization. Second, a Field Programmable Gate Array (FPGA) computing architecture suited to Convolutional Neural Networks (CNNs) is proposed. Built on the ultra-lightweight abnormal target recognition network and combined with optimization strategies such as feature map output reuse and a ping-pong mechanism, it substantially raises the frame rate on the embedded platform and lowers resource usage. Finally, strategies such as layer fusion, multi-channel transmission, and network parameter rearrangement are used to reduce the platform's power consumption and improve energy efficiency. Experimental results show that the FPGA accelerator, operating at a clock frequency of 175 MHz, consumes less than 3.5 W and achieves a recognition rate of 33 frames/s on the transmission line dataset. Compared with other solutions, it shows significant improvements in resource utilization, frame rate, and energy efficiency.

Key words: Artificial Intelligence (AI) acceleration, Field Programmable Gate Array (FPGA), YOLOv3 network, RISC-V hard core, Convolutional Neural Network (CNN)
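
The figures reported in the abstract (175 MHz, under 3.5 W, 33 frames/s) imply a per-frame cycle budget and a bound on energy per frame. The short sketch below works through that arithmetic only. It assumes 3.5 W is an upper bound on power, so the energy values are bounds rather than measurements; the function and constant names are illustrative and do not come from the paper.

```python
# Back-of-the-envelope budget implied by the abstract's reported figures.
# Assumption: 3.5 W is treated as an upper bound on power, so the energy
# results below are bounds, not measured values.

CLOCK_HZ = 175e6      # accelerator clock frequency (175 MHz)
POWER_W_MAX = 3.5     # reported upper bound on power (W)
FRAME_RATE = 33.0     # reported recognition rate (frames/s)


def cycles_per_frame(clock_hz: float, fps: float) -> float:
    """Clock cycles available to process one frame at the given rate."""
    return clock_hz / fps


def max_energy_per_frame(power_w: float, fps: float) -> float:
    """Upper bound on energy spent per frame, in joules."""
    return power_w / fps


def min_frames_per_joule(power_w: float, fps: float) -> float:
    """Lower bound on energy efficiency, in frames per joule (frames/s/W)."""
    return fps / power_w


if __name__ == "__main__":
    print(f"cycles per frame : {cycles_per_frame(CLOCK_HZ, FRAME_RATE):.3e}")
    print(f"energy per frame : < {max_energy_per_frame(POWER_W_MAX, FRAME_RATE):.3f} J")
    print(f"efficiency       : > {min_frames_per_joule(POWER_W_MAX, FRAME_RATE):.2f} frames/J")
```

Under these assumptions, each frame has roughly 5.3 million clock cycles available and costs less than about 0.106 J, which corresponds to more than about 9.4 frames per joule.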