
Computer Engineering ›› 2026, Vol. 52 ›› Issue (5): 404-417. doi: 10.19678/j.issn.1000-3428.0070165

• Next-Generation Networks and Edge Computing •

  • About the authors:

    LIN Hai (CCF member), male, associate professor, Ph.D.; his research interests include vehicular networks, edge computing, reinforcement learning, and data fusion

    WANG Heyu (corresponding author), master's student

    CAO Yue (CCF member), professor, Ph.D.

    WANG Liyuan, professorate senior engineer, master's degree

    WANG Shijie, master's student

  • Funding:
    Key Research and Development Program of Hubei Province (2023BAB022); International Science and Technology Cooperation Project of Hubei Province (2023EHA033); National Key Research and Development Program of China (2022YFB3102100)

LuffyNet: Toward Hardware-Aware Edge Intelligence

LIN Hai1, WANG Heyu1,*(), CAO Yue1, WANG Liyuan2, WANG Shijie1   

  1. School of Cyber Science and Engineering, Wuhan University, Wuhan 430072, Hubei, China
    2. CCCC Second Highway Consultants Co., Ltd., Wuhan 430056, Hubei, China
  • Received:2024-07-23 Revised:2024-09-12 Online:2026-05-15 Published:2024-12-19
  • Contact: WANG Heyu


Abstract:

Edge intelligence faces challenges such as real-time computation, limited resources, and large variations across devices. Most studies compress models into lightweight networks to meet the fast-inference requirements of edge scenarios. However, excessive compression degrades accuracy without necessarily shortening inference latency, harming the performance of edge intelligence. To improve model accuracy under real-time requirements and computational resource constraints, this paper proposes a hardware-aware edge intelligence framework called LuffyNet. The framework estimates model inference performance with a lookup table and treats computational latency and device memory as constraints, making the search aware of the target edge hardware. To obtain high-accuracy networks that satisfy the latency constraint and match the computational resources of edge devices, LuffyNet takes model accuracy, inference latency, and network size as optimization objectives and constructs the target network through gradient descent. To shorten the architecture search, LuffyNet applies Best Optimize and Worst Optimize strategies that prune ineffective computation during the search, reducing its time cost and computational overhead. Experiments on three LuffyNet networks and four advanced models show that LuffyNet-A achieves 66.50% Top-1 accuracy at a latency of 1.69 ms, nearly five times faster than ResNet50, with a size of only 6.58 MB. LuffyNet-B and LuffyNet-C exceed 73% Top-1 accuracy within a 2.65 ms latency, outperforming advanced models such as ResNet18, ResNet50, DenseNet121, and DenseNet169 in both accuracy and inference speed. Ablation experiments further verify that the LuffyNet framework with the Best Optimize and Worst Optimize strategies not only finds networks matching edge devices but also shortens the search time by nearly 25%.
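The abstract does not give implementation details, but the combination it describes (lookup-table latency estimation plus gradient descent over accuracy, latency, and size objectives) follows the common pattern of differentiable hardware-aware architecture search. The sketch below is an illustrative reconstruction of that pattern, not the authors' code: the operator set, the lookup-table values, and the loss weights `lam` and `mu` are all assumptions.

```python
import math

# Hypothetical per-operator latency lookup table (ms), measured once on the
# target edge device. Values are illustrative, not taken from the paper.
LATENCY_LUT = {"conv3x3": 0.42, "conv5x5": 0.77, "skip": 0.01}
# Hypothetical per-operator parameter size (MB), also illustrative.
SIZE_LUT = {"conv3x3": 0.30, "conv5x5": 0.81, "skip": 0.00}

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def expected_cost(arch_logits, lut):
    """Differentiable expected cost of a super-network: each layer holds
    logits over candidate operators, and the expected latency (or size) is
    the softmax-weighted sum of lookup-table entries. This is the standard
    trick that lets gradient-based search 'see' hardware cost."""
    ops = list(lut)
    total = 0.0
    for logits in arch_logits:
        probs = softmax(logits)
        total += sum(p * lut[op] for p, op in zip(probs, ops))
    return total

def nas_loss(task_loss, arch_logits, lat_budget_ms, lam=0.5, mu=0.1):
    """Combined objective: the task (accuracy) loss plus a penalty for
    exceeding the latency budget and a penalty on model size. The weights
    lam and mu and the hinge form of the latency penalty are assumptions."""
    lat = expected_cost(arch_logits, LATENCY_LUT)
    size = expected_cost(arch_logits, SIZE_LUT)
    return task_loss + lam * max(0.0, lat - lat_budget_ms) + mu * size

# Two-layer toy search space: logits favour conv3x3 in layer 1, skip in layer 2.
arch = [[2.0, 0.0, -1.0], [0.0, 0.0, 3.0]]
print(round(expected_cost(arch, LATENCY_LUT), 3))
```

In a real differentiable search, `arch_logits` would be trainable tensors and the softmax-weighted cost would be minimized jointly with the task loss by gradient descent; after convergence, each layer keeps only its highest-probability operator, and the chosen network's true latency can then be read directly from the table.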

Key words: edge intelligence, hardware-aware, network architecture search, inference latency, network size