
Computer Engineering ›› 2026, Vol. 52 ›› Issue (5): 404-417. doi: 10.19678/j.issn.1000-3428.0070165

• Next-Generation Networks and Edge Computing •

  • About the authors:

    LIN Hai (CCF member), male, associate professor, Ph.D.; his research interests include vehicular networks, edge computing, reinforcement learning, and data fusion

    WANG Heyu (corresponding author), master's student

    CAO Yue (CCF member), professor, Ph.D.

    WANG Liyuan, professorate senior engineer, master's degree

    WANG Shijie, master's student

  • Funding:
    Key Research and Development Program of Hubei Province (2023BAB022); International Science and Technology Cooperation Project of Hubei Province (2023EHA033); National Key Research and Development Program of China (2022YFB3102100)

LuffyNet: Toward Hardware-Aware Edge Intelligence

LIN Hai1, WANG Heyu1,*(), CAO Yue1, WANG Liyuan2, WANG Shijie1   

  1. School of Cyber Science and Engineering, Wuhan University, Wuhan 430072, Hubei, China
    2. CCCC Second Highway Consultants Co., Ltd., Wuhan 430056, Hubei, China
  • Received:2024-07-23 Revised:2024-09-12 Online:2026-05-15 Published:2024-12-19
  • Contact: WANG Heyu


Abstract:

Edge intelligence faces challenges such as real-time computation, limited resources, and large variations across devices. Most studies compress models into lightweight networks to meet the fast-inference requirements of edge scenarios. However, excessive compression degrades accuracy without necessarily shortening inference latency, harming the performance of edge intelligence. To improve model accuracy under real-time requirements and computational resource constraints, this paper proposes a hardware-aware edge intelligence framework called LuffyNet. The framework estimates model inference performance with a lookup table and treats computational latency and device memory as constraints, making the search aware of the target edge hardware. To obtain high-accuracy networks that satisfy the latency constraint and match the computational resources of edge devices, LuffyNet takes model accuracy, inference latency, and network size as optimization objectives and constructs the target network through gradient descent. To shorten the architecture search, LuffyNet applies Best Optimize and Worst Optimize strategies that prune ineffective computation during the search, reducing its time cost and computational overhead. Experiments on three LuffyNet networks and four advanced models show that LuffyNet-A achieves 66.50% Top-1 accuracy at a latency of 1.69 ms, nearly five times faster than ResNet50, with a size of only 6.58 MB. LuffyNet-B and LuffyNet-C exceed 73% Top-1 accuracy within a 2.65 ms latency, outperforming advanced models such as ResNet18, ResNet50, DenseNet121, and DenseNet169 in both accuracy and inference speed. Ablation experiments further verify that the LuffyNet framework with the Best Optimize and Worst Optimize strategies not only finds networks matching edge devices but also shortens the search time by nearly 25%.
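The abstract does not give implementation details, but the combination it describes (lookup-table latency estimation plus gradient descent over accuracy, latency, and size objectives) follows the common pattern of differentiable hardware-aware architecture search. The sketch below is an illustrative reconstruction of that pattern, not the authors' code: the operator set, the lookup-table values, and the loss weights `lam` and `mu` are all assumptions.

```python
import math

# Hypothetical per-operator latency lookup table (ms), measured once on the
# target edge device. Values are illustrative, not taken from the paper.
LATENCY_LUT = {"conv3x3": 0.42, "conv5x5": 0.77, "skip": 0.01}
# Hypothetical per-operator parameter size (MB), also illustrative.
SIZE_LUT = {"conv3x3": 0.30, "conv5x5": 0.81, "skip": 0.00}

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def expected_cost(arch_logits, lut):
    """Differentiable expected cost of a super-network: each layer holds
    logits over candidate operators, and the expected latency (or size) is
    the softmax-weighted sum of lookup-table entries. This is the standard
    trick that lets gradient-based search 'see' hardware cost."""
    ops = list(lut)
    total = 0.0
    for logits in arch_logits:
        probs = softmax(logits)
        total += sum(p * lut[op] for p, op in zip(probs, ops))
    return total

def nas_loss(task_loss, arch_logits, lat_budget_ms, lam=0.5, mu=0.1):
    """Combined objective: the task (accuracy) loss plus a penalty for
    exceeding the latency budget and a penalty on model size. The weights
    lam and mu and the hinge form of the latency penalty are assumptions."""
    lat = expected_cost(arch_logits, LATENCY_LUT)
    size = expected_cost(arch_logits, SIZE_LUT)
    return task_loss + lam * max(0.0, lat - lat_budget_ms) + mu * size

# Two-layer toy search space: logits favour conv3x3 in layer 1, skip in layer 2.
arch = [[2.0, 0.0, -1.0], [0.0, 0.0, 3.0]]
print(round(expected_cost(arch, LATENCY_LUT), 3))
```

In a real differentiable search, `arch_logits` would be trainable tensors and the softmax-weighted cost would be minimized jointly with the task loss by gradient descent; after convergence, each layer keeps only its highest-probability operator, and the chosen network's true latency can then be read directly from the table.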

Key words: edge intelligence, hardware-aware, network architecture search, inference latency, network size