[1] LE SCAO T, FAN A, AKIKI C, et al. BLOOM: a
176B-parameter open-access multilingual language model[EB/OL].
(2022-11-09) [2025-06-07]. https://arxiv.org/abs/2211.05100.
[2] GALLIFANT J, FISKE A, LEVITES STREKALOVA Y A, et al. Peer
review of GPT-4 technical report and systems card[J]. PLOS Digital
Health, 2024, 3(1): e0000417.
[3] KARUMBUNATHAN L S. NVIDIA Jetson AGX Orin series[EB/OL].
[2025-06-07]. https://www.nvidia.cn/content/dam/en-zz/Solutions/gtcf21/jetson-orin/nvidia-jetson-agx-orin-technical-brief.pdf.
[4] 杨春,张睿尧,黄泷,等.深度神经网络模型量化方法综述[J].工程科学
学报, 2023, 45(10): 1613-1629.
YANG C, ZHANG R Y, HUANG L, et al. A survey of quantization
methods for deep neural network models[J]. Chinese Journal of
Engineering, 2023, 45(10): 1613-1629.
[5] CHEN M Z, SHAO W Q, XU P, et al. EfficientQAT: efficient
quantization-aware training for large language models[EB/OL].
(2024-07-10)[2025-06-07]. https://arxiv.org/abs/2407.11062.
[6] HASAN J. Optimizing large language models through quantization: a
comparative analysis of PTQ and QAT techniques[EB/OL].
(2024-11-09)[2025-06-07]. https://arxiv.org/abs/2411.06084.
[7] KERKOURI M A, TLIBA M, CHETOUANI A, et al. Quantization
effects on neural networks perception: how would quantization change
the perceptual field of vision models?[C]//Proceedings of the IEEE
Thirteenth International Conference on Image Processing Theory,
Tools and Applications. Washington D. C., USA: IEEE Press, 2024:
1-6.
[8] JACOB B, KLIGYS S, CHEN B, et al. Quantization and training of
neural networks for efficient integer-arithmetic-only inference[C]
//Proceedings of the IEEE Conference on Computer Vision and Pattern
Recognition. Washington D. C., USA: IEEE Press, 2018: 2704-2713.
[9] COURBARIAUX M, HUBARA I, SOUDRY D, et al. Binarized neural
networks: training deep neural networks with weights and activations
constrained to +1 or -1[EB/OL]. (2016-07-09)[2025-09-01].
https://arxiv.org/abs/1602.02830.
[10] 咸聪慧,王天一,李超,等.基于量化的深度神经网络优化研究综述[J].
山东师范大学学报(自然科学版), 2024, 39(1): 21-32.
XIAN C H, WANG T Y, LI C, et al. A survey of optimization research
on deep neural networks based on quantization[J]. Journal of Shandong
Normal University (Natural Science Edition), 2024, 39(1): 21-32.
[11] CHENG Y, WANG D, ZHOU P, et al. A survey of model compression
and acceleration for deep neural networks[EB/OL].
(2020-06-30)[2025-09-01]. https://arxiv.org/abs/1710.09282v1.
[12] KIM S, HOOPER C, WATTANAWONG T, et al. Full stack
optimization of transformer inference: a survey[EB/OL]. (2023-07-27)
[2025-09-01]. https://arxiv.org/abs/2302.14017.
[13] ZHU X, LI J, LIU Y, et al. A survey on model compression for large
language models[J]. Transactions of the Association for Computational
Linguistics, 2024, 12: 1556-1577.
[14] QIN H, GONG R, LIU X, et al. Binary neural networks: a survey[J].
Pattern Recognition, 2020, 105: 107281.
[15] ZHANG Z, GAO Y C, FAN J, et al. SelectQ: calibration data selection
for post-training quantization[J]. Machine Intelligence Research, 2025,
22(3): 499-510.
[16] HUBARA I, NAHSHAN Y, HANANI Y, et al. Improving post training
neural quantization: layer-wise calibration and integer programming
[EB/OL]. (2020-06-14)[2025-06-07]. https://arxiv.org/abs/2006.10518.
[17] GONG R, LIU X L, JIANG S H, et al. Differentiable soft quantization:
bridging full-precision and low-bit neural networks[C]//Proceedings of
the IEEE/CVF International Conference on Computer Vision.
Washington D. C., USA: IEEE Press, 2019: 4852-4861.
[18] CHICCO D, WARRENS M J, JURMAN G. The coefficient of
determination R-squared is more informative than SMAPE, MAE,
MAPE, MSE and RMSE in regression analysis evaluation[J]. PeerJ
Computer Science, 2021, 7: e623.
[19] 郭秋丹,濮约刚,张启军,等.基于舍入误差的神经网络量化方法[J].计
算机工程与设计, 2024, 45(8): 2534-2539.
GUO Q D, PU Y G, ZHANG Q J, et al. Neural network quantization
method based on rounding errors[J]. Computer Engineering and
Design, 2024, 45(8): 2534-2539.
[20] KYURKCHIEV N, MARKOV S. Sigmoid functions: some
approximation and modelling aspects[J]. LAP LAMBERT Academic
Publishing, 2015, 4: 34.
[21] LIU L Y, JIANG H M, HE P C, et al. On the variance of the adaptive
learning rate and beyond[C]//Proceedings of the 8th International
Conference on Learning Representations. Washington D. C., USA:
IEEE Press, 2020: 1-14.
[22] WU D, TANG Q, ZHAO Y, et al. EasyQuant: post-training
quantization via scale optimization[EB/OL]. (2020-06-30)
[2025-06-07]. https://arxiv.org/abs/2006.16669.
[23] DING Y F, FENG W L, CHEN C Y, et al. Reg-PTQ:
regression-specialized post-training quantization for fully quantized
object detector[C]//Proceedings of the IEEE/CVF Conference on
Computer Vision and Pattern Recognition. Washington D. C., USA:
IEEE Press, 2024: 16174-16184.
[24] WANG Z W, WU Z Y, LU J W, et al. BiDet: an efficient binarized
object detector[C]//Proceedings of the IEEE/CVF Conference on
Computer Vision and Pattern Recognition. Washington D. C., USA:
IEEE Press, 2020: 2049-2058.
[25] CHEN P, LIU J, ZHUANG B H, et al. AQD: towards accurate
quantized object detection[C]//Proceedings of the IEEE/CVF
Conference on Computer Vision and Pattern Recognition. Washington
D. C., USA: IEEE Press, 2021: 104-113.
[26] NIU L, LIU J W, YUAN Z H, et al. Improving post-training
quantization on object detection with task loss-guided Lp metric
[EB/OL]. (2023-05-07)[2025-06-07]. https://arxiv.org/abs/2304.09785.
[27] SO J, LEE J, AHN D, et al. Temporal dynamic quantization for
diffusion models[J]. Advances in Neural Information Processing
Systems, 2023, 36: 48686-48698.
[28] WANG C Y, WANG Z W, XU X W, et al. Towards accurate
post-training quantization for diffusion models[C]//Proceedings of the
IEEE/CVF Conference on Computer Vision and Pattern Recognition.
Washington D. C., USA: IEEE Press, 2024: 16026-16035.
[29] LIU X C, YE M, ZHOU D Y, et al. Post-training quantization with
multiple points: mixed precision without mixed precision[C]//
Proceedings of the AAAI Conference on Artificial Intelligence.
Washington D. C., USA: IEEE Press, 2021, 35(10): 8697-8705.
[30] HE Y F, LIU J, WU W J, et al. EfficientDM: efficient
quantization-aware fine-tuning of low-bit diffusion models[C]//
Proceedings of the Twelfth International Conference on Learning
Representations. Washington D. C., USA: IEEE Press, 2024: 1-20.
[31] HUBARA I, NAHSHAN Y, HANANI Y, et al. Accurate post training
quantization with small calibration sets[C]//Proceedings of the
International Conference on Machine Learning. New York, USA: ACM
Press, 2021: 4466-4475.
[32] WEI X Y, ZHANG Y C, ZHANG X G, et al. Outlier suppression:
pushing the limit of low-bit transformer language models[J]. Advances
in Neural Information Processing Systems, 2022, 35: 17402-17414.
[33] WEI X Y, ZHANG Y C, LI Y H, et al. Outlier suppression+: accurate
quantization of large language models by equivalent and optimal
shifting and scaling[C]//Proceedings of the 2023 Conference on
Empirical Methods in Natural Language Processing. Stroudsburg, USA:
ACL Press, 2023: 1648-1665.
[34] KIM N J, LEE J, KIM H. HyQ: hardware-friendly post-training
quantization for CNN-transformer hybrid networks[C]//Proceedings of
the Thirty-Third International Joint Conference on Artificial
Intelligence. Jeju Island, Republic of Korea: [s. n.], 2024: 4291-4299.
[35] SHANG Y Z, LIU G W, KOMPELLA R R, et al. CL-Calib: enhancing
post-training quantization calibration through contrastive learning[C]
//Proceedings of the International Conference on Learning
Representations. Washington D. C., USA: IEEE Press, 2024: 1-11.
[36] LI X Y, LIU Y J, LIAN L, et al. Q-Diffusion: quantizing diffusion
models[C]//Proceedings of the IEEE/CVF International Conference on
Computer Vision. Washington D. C., USA: IEEE Press, 2023:
17535-17545.
[37] SHANG Y Z, YUAN Z H, XIE B, et al. Post-training quantization on
diffusion models[C]//Proceedings of the IEEE/CVF Conference on
Computer Vision and Pattern Recognition. Washington D. C., USA:
IEEE Press, 2023: 1972-1981.
[38] TANG S, WANG X, CHEN H, et al. Post-training quantization with
progressive calibration and activation relaxing for text-to-image
diffusion models[C]//Proceedings of the European Conference on
Computer Vision. Cham: Springer Nature Switzerland, 2024: 404-420.
[39] YVINEC E, DAPOGNY A, BAILLY K. Gradient-based post-training
quantization: challenging the status quo[EB/OL]. [2025-06-07].
https://arxiv.org/abs/2308.07662.
[40] JIANG Y F, SUN N, XIE X, et al. ADFQ-ViT: activation-distribution-
friendly post-training quantization for vision transformers[J]. Neural
Networks, 2025, 186: 107289.
[41] LI Z K, XIAO J R, YANG L, et al. RepQ-ViT: scale reparameterization
for post-training quantization of vision transformers [C]//Proceedings
of the IEEE/CVF International Conference on Computer Vision.
Washington D. C., USA: IEEE Press, 2023: 17227-17236.
[42] OH S, SIM H, KIM J, et al. Non-uniform step size quantization for
accurate post-training quantization[C]//Proceedings of the European
Conference on Computer Vision. Cham: Springer Nature Switzerland,
2022: 658-673.
[43] MOON J, KIM D, CHEON J, et al. Instance-aware group quantization
for vision transformers[C]//Proceedings of the IEEE/CVF Conference
on Computer Vision and Pattern Recognition. Washington D. C., USA:
IEEE Press, 2024: 16132-16141.
[44] RANJAN N, SAVAKIS A. LRP-QViT: mixed-precision vision
transformer quantization via layer-wise relevance propagation[EB/OL].
[2025-06-07]. https://arxiv.org/abs/2401.11243.
[45] LEE J, KWON Y, PARK S, et al. Q-HyViT: post-training quantization
of hybrid vision transformers with bridge block reconstruction for
IoT systems[J]. IEEE Internet of Things Journal, 2024, 11(22): 36384-
36396.
[46] YANG D W, HE N, HU X, et al. Post-training quantization for
re-parameterization via coarse & fine weight splitting[J]. Journal of
Systems Architecture, 2024, 147: 103065.
[47] RYU H, LIM S, SHIM H. Memory-efficient fine-tuning for quantized
diffusion model[C]//Proceedings of the European Conference on
Computer Vision. Cham: Springer Nature Switzerland, 2024: 356-372.
[48] WANG H X, SHANG Y Z, YUAN Z H, et al. QuEST: low-bit
diffusion model quantization via efficient selective finetuning[EB/OL].
[2025-06-07]. https://arxiv.org/abs/2402.03666.
[49] LIU X W, LI Z K, XIAO J R, et al. EDA-DM: enhanced distribution
alignment for post-training quantization of diffusion models [EB/OL].
(2024-01-25)[2025-06-07]. https://arxiv.org/abs/2401.04585.
[50] LI Y H, GONG R H, TAN X, et al. BRECQ: pushing the limit of
post-training quantization by block reconstruction[C]//Proceedings of
the International Conference on Learning Representations. Washington
D. C., USA: IEEE Press, 2021: 1-16.
[51] YAO H Y, LI P, CAO J, et al. RAPQ: rescuing accuracy for
power-of-two low-bit post-training quantization[C]//Proceedings of the
Thirty-First International Joint Conference on Artificial Intelligence.
Vienna, Austria: [s. n.], 2022: 1573-1579.
[52] SHOMRON G, GABBAY F, KURZUM S, et al. Post-training
sparsity-aware quantization[J]. Advances in Neural Information
Processing Systems, 2021, 34: 17737-17748.
[53] WANG C B, ZHENG D D, LIU Y L, et al. Leveraging inter-layer
dependency for post-training quantization[J]. Advances in Neural
Information Processing Systems, 2022, 35: 6666-6679.
[54] JEON Y, LEE C, CHO E, et al. Mr.BiQ: post-training non-uniform
quantization based on minimizing the reconstruction error[C]//
Proceedings of the IEEE/CVF Conference on Computer Vision and
Pattern Recognition. Washington D. C., USA: IEEE Press, 2022:
12329-12338.
[55] BAI S P, CHEN J, SHEN X T, et al. Unified data-free compression:
pruning and quantization without fine-tuning[C]//Proceedings of the
IEEE/CVF International Conference on Computer Vision. Washington
D. C., USA: IEEE Press, 2023: 5876-5885.
[56] LI Y H, PANDA P. TesseraQ: ultra low-bit LLM post-training
quantization with block reconstruction[EB/OL]. (2024-10-24)
[2025-06-07]. https://arxiv.org/abs/2410.19103.
[57] YIN J J, DONG J H, WANG Y H, et al. ModuLoRA: finetuning 2-bit
LLMs on consumer GPUs by integrating with modular quantizers[J].
Transactions on Machine Learning Research, 2024: 1-17.
[58] XU K, LI Z C, WANG S Y, et al. PTMQ: post-training multi-bit
quantization of neural networks[C]//Proceedings of the AAAI
Conference on Artificial Intelligence. Washington D. C., USA: IEEE
Press, 2024, 38(14): 16193-16201.
[59] DETTMERS T, LEWIS M, SHLEIFER S, et al. 8-bit optimizers via
block-wise quantization[C]//Proceedings of the International
Conference on Learning Representations. Washington D. C., USA:
IEEE Press, 2022: 1-19.
[60] FRANTAR E, ALISTARH D. Optimal brain compression: a
framework for accurate post-training quantization and pruning[J].
Advances in Neural Information Processing Systems, 2022, 35: 4475-
4488.
[61] NAGEL M, AMJAD R A, VAN BAALEN M, et al. Up or down? adaptive
rounding for post-training quantization[C]//Proceedings of the
International Conference on Machine Learning. New York, USA: ACM
Press, 2020: 7197-7206.
[62] 田程,李正杰,陈功富,等.深度神经网络低比特量化方法综述[J].现代
信息科技,2025,9(10):23-33+38.
TIAN C, LI Z J, CHEN G F, et al. A survey of low-bit quantization
methods for deep neural networks[J]. Modern Information Technology,
2025, 9(10): 23-33+38.
[63] HE Y F, LIU L P, LIU J, et al. PTQD: accurate post-training
quantization for diffusion models[C]//Proceedings of the 37th
International Conference on Neural Information Processing Systems.
New York, USA: ACM Press, 2023: 13237-13249.
[64] BHALGAT Y, LEE J, NAGEL M, et al. LSQ+: improving low-bit
quantization through learnable offsets and better initialization[C]//
Proceedings of the IEEE/CVF Conference on Computer Vision and
Pattern Recognition Workshops. Washington D. C., USA: IEEE Press,
2020: 696-697.
[65] LEE J H, KIM J, KWON S J, et al. FlexRound: learnable rounding
based on element-wise division for post-training quantization[C]//
Proceedings of the International Conference on Machine Learning.
New York, USA: ACM Press, 2023: 18913-18939.
[66] KIM H B, LEE J H, YOO S, et al. MetaMix: meta-state precision
searcher for mixed-precision activation quantization[C]//Proceedings
of the AAAI Conference on Artificial Intelligence. Washington D. C.,
USA: IEEE Press, 2024, 38(12): 13132-13141.
[67] ZHOU S F, LI L, ZHANG X Y, et al. LiDAR-PTQ: post-training
quantization for point cloud 3D object detection[C]//Proceedings of the
Twelfth International Conference on Learning Representations.
Washington D. C., USA: IEEE Press, 2024: 1-15.
[68] LIN J, TANG J M, TANG H T, et al. AWQ: activation-aware weight
quantization for on-device LLM compression and acceleration[J].
Proceedings of Machine Learning and Systems, 2024, 6: 87-100.
[69] PAN J Y, WANG C C, ZHENG K F, et al. SmoothQuant+: accurate and
efficient 4-bit post-training weight quantization for LLM[EB/OL].
(2023-12-01)[2025-06-08]. https://arxiv.org/abs/2312.03788.
[70] WANG P Q, WANG D S, JI Y, et al. QGAN: quantized generative
adversarial networks[EB/OL]. (2019-01-23)[2025-06-08]. https://arxiv.
org/abs/1901.08263.
[71] MA Y X, LI H X, ZHENG X W, et al. Solving oscillation problem in
post-training quantization through a theoretical perspective[C]//
Proceedings of the IEEE/CVF Conference on Computer Vision and
Pattern Recognition. Washington D. C., USA: IEEE Press, 2023: 7950-
7959.
[72] WEI X Y, GONG R H, LI Y H, et al. QDrop: randomly dropping
quantization for extremely low-bit post-training quantization[C]//
Proceedings of the International Conference on Learning
Representations. Washington D. C., USA: IEEE Press, 2022: 1-19.
[73] LIN Y, ZHANG T Y, SUN P Q, et al. FQ-ViT: post-training
quantization for fully quantized vision transformer[C]//Proceedings of
the Thirty-First International Joint Conference on Artificial Intelligence.
Vienna, Austria: [s. n.], 2022: 1173-1179.
[74] LV C T, CHEN H, GUO J Y, et al. PTQ4SAM: post-training
quantization for segment anything[C]//Proceedings of the IEEE/CVF
Conference on Computer Vision and Pattern Recognition. Washington
D. C., USA: IEEE Press, 2024: 15941-15951.
[75] LIU Y J, YANG H R, DONG Z, et al. NoisyQuant: noisy
bias-enhanced post-training activation quantization for vision
transformers[C]// Proceedings of the IEEE/CVF Conference on
Computer Vision and Pattern Recognition. Washington D. C., USA:
IEEE Press, 2023: 20321-20330.
[76] CHEE J, CAI Y, KULESHOV V, et al. QuIP: 2-bit quantization of large
language models with guarantees[J]. Advances in Neural Information
Processing Systems, 2023, 36: 4396-4429.
[77] ADEPU H, ZENG Z P, ZHANG L, et al. FrameQuant: flexible low-bit
quantization for transformers[C]//Proceedings of the 41st International
Conference on Machine Learning. New York, USA: ACM Press, 2024:
203-227.
[78] YUAN Z H, SHANG Y Z, DONG Z. PB-LLM: partially binarized
large language models[C]//Proceedings of the Twelfth International
Conference on Learning Representations. Washington D. C., USA:
IEEE Press, 2024: 1-14.
[79] LIN C, PENG B, LI Z Y, et al. Bit-Shrinking: limiting instantaneous
sharpness for improving post-training quantization[C]//Proceedings of
the IEEE/CVF Conference on Computer Vision and Pattern
Recognition. Washington D. C., USA: IEEE Press, 2023: 16196-16205.
[80] WANG M Z, SUN H X, SHI J, et al. Q-YOLO: efficient inference for
real-time object detection[C]//Proceedings of the Asian Conference on
Pattern Recognition. Cham: Springer Nature Switzerland, 2023: 307-
321.
[81] PHAM C, HOANG A D, NGUYEN C C, et al. MetaAug: meta-data
augmentation for post-training quantization[C]//Proceedings of the
European Conference on Computer Vision. Cham: Springer Nature
Switzerland, 2024: 236-252.
[82] ZHANG X G, QIN H T, DING Y F, et al. Diversifying sample
generation for accurate data-free quantization[C]//Proceedings of the
IEEE/CVF Conference on Computer Vision and Pattern Recognition.
Washington D. C., USA: IEEE Press, 2021: 15658-15667.
[83] GUO C, QIU Y X, LENG J W, et al. SQuant: on-the-fly data-free
quantization via diagonal hessian approximation[C]//Proceedings of
the International Conference on Learning Representations. Washington
D. C., USA: IEEE Press, 2022: 1-18.
[84] YAO Z W, WU X X, LI C, et al. Exploring post-training quantization
in LLMs from comprehensive study to low rank compensation[C]//
Proceedings of the AAAI Conference on Artificial Intelligence.
Washington D. C., USA: IEEE Press, 2024, 38(17): 19377-19385.
[85] DETTMERS T, SVIRSCHEVSKI R A, EGIAZARIAN V, et al. SPQR:
a sparse-quantized representation for near-lossless LLM weight
compression[C]//Proceedings of the Twelfth International Conference
on Learning Representations. Washington D. C., USA: IEEE Press,
2024: 1-29.
[86] CAI Y H, YAO Z W, DONG Z, et al. ZeroQ: a novel zero shot
quantization framework[C]//Proceedings of the IEEE/CVF Conference
on Computer Vision and Pattern Recognition. Washington D. C., USA:
IEEE Press, 2020: 13169-13178.
[87] FAN C X, WANG Z Q, GUO D, et al. Data-free quantization via
pseudo-label filtering[C]//Proceedings of the IEEE/CVF Conference on
Computer Vision and Pattern Recognition. Washington D. C., USA:
IEEE Press, 2024: 5589-5598.
[88] JEON Y, LEE C, KIM H. GENIE: show me the data for quantization[C]
//Proceedings of the IEEE/CVF Conference on Computer Vision and
Pattern Recognition. Washington D. C., USA: IEEE Press, 2023:
12064-12073.
[89] MASOURIS A, SHARMA M, BOGUSZEWSKI A, et al. Post-training
model quantization using GANs for synthetic data generation[EB/OL].
(2023-05-20)[2025-06-08]. https://arxiv.org/abs/2305.06052.
[90] ANDREEV P, FRITZLER A. Quantization of generative adversarial
networks for efficient inference: a methodological study[C]//
Proceedings of the 26th International Conference on Pattern
Recognition. Washington D. C., USA: IEEE Press, 2022: 2179-2185.
[91] LI Z K, MA L P, CHEN M J, et al. Patch similarity aware data-free
quantization for vision transformers[C]//Proceedings of the European
Conference on Computer Vision. Cham: Springer Nature Switzerland,
2022: 154-170.
[92] RAMACHANDRAN A, KUNDU S, KRISHNA T. CLAMP-ViT:
contrastive data-free learning for adaptive post-training quantization of
ViTs[C]//Proceedings of the European Conference on Computer Vision.
Cham: Springer Nature Switzerland, 2024: 307-325.
[93] LI J W, ZHANG T C, YEN I E H, et al. FP8-BERT: post-training
quantization for transformer[EB/OL]. [2025-06-08]. https://arxiv.org/
abs/2312.05725.
[94] CAO J L, CHOLAKKAL H, ANWER R M, et al. D2Det: towards high
quality object detection and instance segmentation[C]//Proceedings of
the IEEE/CVF Conference on Computer Vision and Pattern
Recognition. Washington D. C., USA: IEEE Press, 2020: 11485-11494.
[95] 徐增敏,陈凯,郭威伟,等.面向轻量级卷积网络的激活函数与压缩模
型[J].计算机工程, 2022, 48(5): 1000-3428.
XU Z M, CHEN K, GUO W W, et al. Activation functions and
compressed models for lightweight convolutional networks[J].
Computer Engineering, 2022, 48(5): 1000-3428.
[96] 史宝岱,张秦,李瑶,等.面向图像目标识别的轻量化卷积神经网络[J].
计算机工程, 2022, 48(6): 1000-3428.
SHI B D, ZHANG Q, LI Y, et al. Lightweight convolutional neural
networks for image object recognition[J]. Computer Engineering, 2022,
48(6): 1000-3428.
[97] MAKHOV D, OSTAPETS R, ZHELAVSKAYA I, et al. Towards
robust full low-bit quantization of super resolution networks[C]//
Proceedings of the European Conference on Computer Vision. Cham:
Springer Nature Switzerland, 2024: 182-198.
[98] TU Z J, HU J, CHEN H T, et al. Toward accurate post-training
quantization for image super resolution[C]//Proceedings of the IEEE/
CVF Conference on Computer Vision and Pattern Recognition.
Washington D. C., USA: IEEE Press, 2023: 5856-5865.
[99] TANG C, MENG Y, JIANG J, et al. Retraining-free model quantization
via one-shot weight-coupling learning[C]//Proceedings of the IEEE/
CVF Conference on Computer Vision and Pattern Recognition.
Washington D. C., USA: IEEE Press, 2024: 15855-15865.
[100] NAGEL M, BAALEN M, BLANKEVOORT T, et al. Data-free
quantization through weight equalization and bias correction[C]//
Proceedings of the IEEE/CVF International Conference on Computer
Vision. Washington D. C., USA: IEEE Press, 2019: 1325-1334.
[101] DUNG H A, PHAM C, LE T, et al. Sharpness-aware data generation
for zero-shot quantization[C]//Proceedings of the International
Conference on Machine Learning. New York, USA: ACM Press, 2024:
12034-12045.
[102] LIU J, GONG R H, WEI X Y, et al. QLLM: accurate and efficient
low-bitwidth quantization for large language models[C]// Proceedings
of the Twelfth International Conference on Learning Representations.
Washington D. C., USA: IEEE Press, 2024: 1-23.
[103] XIAO G X, LIN J, SEZNEC M, et al. SmoothQuant: accurate and
efficient post-training quantization for large language models[C]//
Proceedings of the International Conference on Machine Learning.
New York, USA: ACM Press, 2023: 38087-38099.
[104] YUAN Z H, NIU L, LIU J, et al. RPTQ: reorder-based post-training
quantization for large language models[EB/OL]. [2025-06-08]. https://
arxiv.org/abs/2304.01089.
[105] LEE C, JIN J Y, KIM T, et al. OWQ: outlier-aware weight
quantization for efficient fine-tuning and inference of large language
models[C]// Proceedings of the AAAI Conference on Artificial
Intelligence. Washington D. C., USA: IEEE Press, 2024, 38(12):
13355-13364.
[106] KIM S, HOOPER C R C, GHOLAMI A, et al. SqueezeLLM: dense-
and-sparse quantization[C]//Proceedings of the International
Conference on Machine Learning. New York, USA: ACM Press, 2024:
23901-23923.
[107] KIM Y J, HENRY R, FAHIM R, et al. FineQuant: unlocking
efficiency with fine-grained weight-only quantization for
LLMs[EB/OL]. [2025-06-08]. https://arxiv.org/abs/2308.09723.
[108] YAO Z W, WU X X, LI C, et al. Exploring post-training quantization
in LLMs from comprehensive study to low rank compensation[C]//
Proceedings of the AAAI Conference on Artificial Intelligence.
Washington D. C., USA: IEEE Press, 2024, 38(17): 19377-19385.
[109] DING X, LIU X Y, TU Z J, et al. CBQ: cross-block quantization for
large language models[C]//Proceedings of the International Conference
on Learning Representations. Washington D. C., USA: IEEE Press,
2025: 1-20.
[110] YAO Z W, YAZDANI AMINABADI R, ZHANG M J, et al.
ZeroQuant: efficient and affordable post-training quantization for
large-scale transformers[J]. Advances in Neural Information Processing
Systems, 2022, 35: 27168-27183.
[111] ZAFRIR O, BOUDOUKH G, IZSAK P, et al. Q8BERT: quantized 8bit
BERT[C]//Proceedings of the Fifth Workshop on Energy Efficient
Machine Learning and Cognitive Computing-NeurIPS Edition. New York,
USA: ACM Press, 2019: 36-39.
[112] SHEN S, DONG Z, YE J, et al. Q-BERT: Hessian based ultra low
precision quantization of BERT[C]//Proceedings of the AAAI
Conference on Artificial Intelligence. Washington D. C., USA: IEEE
Press, 2020, 34(5): 8815-8821.
[113] LIU Z, ZHAO C, FEDOROV I, et al. SpinQuant: LLM quantization
with learned rotations[C]//Proceedings of the Thirteenth International
Conference on Learning Representations. Washington D. C., USA:
IEEE Press, 2024: 1-24.
[114] LIN Y, TANG H, YANG S, et al. QServe: W4A8KV4 quantization
and system co-design for efficient LLM serving[C]//Proceedings of
the Eighth Conference on Machine Learning and Systems. Washington
D. C., USA: IEEE Press, 2024: 1-28.
[115] LIU Z H, WANG Y H, HAN K, et al. Post-training quantization for
vision transformer[J]. Advances in Neural Information Processing
Systems, 2021, 34: 28092-28103.
[116] WU Z G, CHEN J X, ZHONG H W, et al. AdaLog: post-training
quantization for vision transformers with adaptive logarithm
quantizer[C]//Proceedings of the European Conference on Computer
Vision. Cham: Springer Nature Switzerland, 2024: 411-427.
[117] YUAN Z H, XUE C H, CHEN Y Q, et al. PTQ4ViT: post-training
quantization for vision transformers with twin uniform quantization[C]
//Proceedings of the European Conference on Computer Vision. Cham:
Springer Nature Switzerland, 2022: 191-207.
[118] LIU X Y, DING X, YU L, et al. PQ-SAM: post-training quantization
for segment anything model[C]//Proceedings of the European
Conference on Computer Vision. Cham: Springer Nature Switzerland,
2024: 420-437.
[119] ZHONG Y S, HU J W, HUANG Y, et al. ERQ: error reduction for
post-training quantization of vision transformers[C]//Proceedings of
the International Conference on Machine Learning. New York, USA:
ACM Press, 2024: 61664-61680.
[120] RANJAN N, SAVAKIS A. Mix-QViT: mixed-precision vision
transformer quantization driven by layer importance and quantization
sensitivity[EB/OL]. (2025-01-15)[2025-06-08]. https://arxiv.org/abs/2501.06357.
[121] YAO Y Z, TIAN F, CHEN J, et al. Timestep-aware correction for
quantized diffusion models[C]//Proceedings of the European
Conference on Computer Vision. Cham: Springer Nature Switzerland,
2024: 215-232.
[122] YANG Y W, DAI X L, WANG J L, et al. Efficient quantization
strategies for latent diffusion models[EB/OL]. [2025-06-08]. https://
arxiv.org/abs/2312.05431.
[123] ZHAO T C, NING X F, FANG T C, et al. MixDQ: memory-efficient
few-step text-to-image diffusion models with metric-decoupled mixed
precision quantization[C]//Proceedings of the European Conference on
Computer Vision. Cham: Springer Nature Switzerland, 2024: 285-302.
[124] PARK G, KIM M, LEE S, et al. LUT-GEMM: quantized matrix
multiplication based on LUTs for efficient inference in large-scale
generative language models[C]//Proceedings of the Twelfth
International Conference on Learning Representations. Washington D.
C., USA: IEEE Press, 2024: 1-18.
[125] DETTMERS T, LEWIS M, BELKADA Y, et al. GPT3.int8(): 8-bit
matrix multiplication for transformers at scale[J]. Advances in Neural
Information Processing Systems, 2022, 35: 30318-30332.
[126] HOOPER C, KIM S, MOHAMMADZADEH H, et al. KVQuant:
towards 10 million context length LLM inference with KV cache
quantization[J]. Advances in Neural Information Processing Systems,
2024, 37: 1270-1303.
[127] YUE Y X, YUAN Z H, DUANMU H, et al. WKVQuant: quantizing
weight and key/value cache for large language models gains
more[EB/OL]. [2025-06-08]. https://arxiv.org/abs/2402.12065.
[128] GUO C, TANG J M, HU W M, et al. OliVe: accelerating large
language models via hardware-friendly outlier-victim pair quantization
[C]//Proceedings of the 50th Annual International Symposium on
Computer Architecture. New York, USA: ACM Press, 2023: 1-15.
[129] GUO Y P, LANG Y L, REN Q Y. GPTQT: quantize large language
models twice to push the efficiency[C]//Proceedings of the IEEE
International Conference on Cybernetics and Intelligent Systems and
IEEE International Conference on Robotics, Automation and
Mechatronics. Washington D. C., USA: IEEE Press, 2024: 368-373.
[130] BAI H L, HOU L, SHANG L F, et al. Towards efficient post-training
quantization of pre-trained language models[J]. Advances in Neural
Information Processing Systems, 2022, 35: 1405-1418.
[131] MA Y X, LI H X, ZHENG X W, et al. Outlier-aware slicing for post-
training quantization in vision transformer[C]//Proceedings of the 41st
International Conference on Machine Learning. New York, USA: ACM
Press, 2024: 33811-33825.
[132] LIU J, NIU L, YUAN Z, et al. PD-Quant: post-training quantization
based on prediction difference metric[C]//Proceedings of the
IEEE/CVF Conference on Computer Vision and Pattern Recognition.
Washington D. C., USA: IEEE Press, 2023: 24427-24437.
[133] SHAO W, CHEN M, ZHANG Z, et al. OmniQuant: omnidirectionally
calibrated quantization for large language models[C]//Proceedings of
the Twelfth International Conference on Learning Representations.
Washington D. C., USA: IEEE Press, 2024: 1-25.
[134] JØRGENSEN T E. Resource-efficient language models: quantization
for fast and accessible inference[EB/OL]. (2025-05-15)[2025-06-08].
https://arxiv.org/abs/2505.08620.
[135] INTEL. Neural compressor[EB/OL]. (2023-10-25)[2025-06-07].
https://github.com/intel/neural-compressor.
[136] CHENG W H, ZHANG W W, SHEN H H, et al. Optimize weight
rounding via signed gradient descent for the quantization of LLMs
[C]//Proceedings of the Conference on Empirical Methods in Natural
Language Processing. Stroudsburg, USA: ACL Press, 2024: 11332-11350.
[137] FRANTAR E, ASHKBOOS S, HOEFLER T, et al. GPTQ: accurate
post-training quantization for generative pre-trained transformers[C]
//Proceedings of the Eleventh International Conference on Learning
Representations. Washington D. C., USA: IEEE Press, 2023: 1-16.
[138] HUGGING FACE. Optimum[EB/OL]. [2025-06-07]. https://github.
com/huggingface/optimum.
[139] SHEN Y L, SONG K T, TAN X, et al. HuggingGPT: solving AI tasks
with ChatGPT and its friends in Hugging Face[J]. Advances in Neural
Information Processing Systems, 2023, 36: 38154-38180.
[140] TARAGHI M, DORCELUS G, FOUNDJEM A, et al. Deep learning
model reuse in the HuggingFace community: challenges, benefit and
trends[C]//Proceedings of the IEEE International Conference on
Software Analysis, Evolution and Reengineering. Washington D. C.,
USA: IEEE Press, 2024: 512-523.
[141] PADDLEPADDLE. PaddleSlim[EB/OL]. (2024-01-03)[2025-06-07].
https://github.com/PaddlePaddle/PaddleSlim.
[142] PADDLEPADDLE. PaddleSlim: Model automatic compression tool
(ACT) user guide[EB/OL]. [2025-06-07]. https://www.paddlepaddle.
org.cn/documentation/docs/zh/guides/infer/paddleslim/paddle_slim_cn.
html.
[143] PYTORCH. PyTorch native architecture optimization: torchao
[EB/OL]. (2024-08-08)[2025-06-07]. https://github.com/pytorch/ao.
[144] PYTORCH. PyTorch quantization support documentation[EB/OL].
[2025-06-07]. https://docs.pytorch.org/docs/stable/quantization-support.html.
[145] HUAWEI. MindSpore official english documentation[EB/OL].
[2025-06-07]. https://www.mindspore.cn/en.
[146] HUAWEI. MindSpore[EB/OL]. [2025-06-07]. https://github.com/m
indspore-ai/mindspore.
[147] TONG Z H, DU N, SONG X B, et al. Study on MindSpore deep
learning framework[C]//Proceedings of the 17th International
Conference on Computational Intelligence and Security. Washington D.
C., USA: IEEE Press, 2021: 183-186.
[148] GOOGLE. QKeras: a quantization deep learning library for
Tensorflow Keras[EB/OL]. (2021-02-19)[2025-06-07]. https://github.
com/google/qkeras.
[149] LORO F, PAU D, TOMASELLI V. A QKeras neural network zoo for
deeply quantized imaging[C]//Proceedings of the IEEE 6th
International Forum on Research and Technology for Society and
Industry. Washington D. C., USA: IEEE Press, 2021: 165-170.
[150] TENSORFLOW. Post-training quantization guide[EB/OL].
(2022-08-03)[2025-06-07]. https://www.tensorflow.org/model_optimiz
ation/guide/quantization/post_training?hl=zh-cn.
[151] PANG B, NIJKAMP E, WU Y N. Deep learning with TensorFlow: a
review[J]. Journal of Educational and Behavioral Statistics, 2020,
45(2): 227-248.
[152] ALIBABA. MNN: a blazing fast, lightweight deep learning
framework[EB/OL]. [2025-06-07]. https://github.com/alibaba/MNN.
[153] JIANG X T, WANG H, CHEN Y L, et al. MNN: a universal and
efficient inference engine[J]. Proceedings of Machine Learning and
Systems, 2020, 2: 1-13.
[154] TENCENT. NCNN: a high-performance neural network inference
framework optimized for mobile platforms[EB/OL]. [2025-06-07].
https://github.com/Tencent/ncnn.
[155] YU Y, YIN Q, ZHANG J, et al. ADMN: agent-driven modular
network for dynamic parameter sharing in cooperative multi-agent
reinforcement learning[C]//Proceedings of the Thirty-Third
International Joint Conference on Artificial Intelligence. Jeju Island,
Republic of Korea: [s. n.], 2024: 302-310.
[156] NVIDIA. NVIDIA TensorRT official documentation[EB/OL].
[2025-06-07]. https://docs.nvidia.com/deeplearning/tensorrt/.
[157] NVIDIA. NVIDIA TensorRT GitHub repository[EB/OL].
[2025-06-07]. https://github.com/NVIDIA/TensorRT.
[158] ZHOU Y, GUO Z, DONG Z, et al. TensorRT implementations of
model quantization on edge SoC[C]//Proceedings of the IEEE 16th
International Symposium on Embedded Multicore/Many-core
Systems-on-Chip. Washington D. C., USA: IEEE Press, 2023:
486-493.
[159] MICROSOFT. Quantize ONNX models[EB/OL]. [2025-06-07].
https://onnxruntime.ai/docs/performance/model-optimizations/quantiza
tion.html.
[160] MICROSOFT. ONNX runtime GitHub repository[EB/OL].
[2025-06-07]. https://github.com/microsoft/onnxruntime.
[161] INTEL. OpenVINO GitHub repository[EB/OL]. [2025-06-07].
https://github.com/openvinotoolkit/openvino.
[162] INTEL. OpenVINO™ documentation[EB/OL]. [2025-06-07]. https://
docs.openvino.ai/.
[163] DEMIDOVSKIJ A, GORBACHEV Y, FEDOROV M, et al.
OpenVINO deep learning workbench: comprehensive analysis and
tuning of neural networks inference[C]//Proceedings of the
International Conference on Computer Vision Workshop. Washington
D. C., USA: IEEE Press, 2019: 783-787.
[164] LIU Z, OGUZ B, ZHAO C, et al. LLM-QAT: data-free quantization
aware training for large language models[C]//Findings of the
Association for Computational Linguistics: ACL 2024. Stroudsburg,
USA: ACL Press, 2024: 467-484.
[165] QU X, APONTE D, BANBURY C, et al. Automatic joint structured
pruning and quantization for efficient neural network training and
compression[C]//Proceedings of the Computer Vision and Pattern
Recognition Conference. Washington D. C., USA: IEEE Press, 2025:
15234-15244.
[166] GAO T, GUO L, ZHAO S, et al. QuantNAS: quantization-aware
neural architecture search for efficient deployment on mobile
device[C]//Proceedings of the IEEE/CVF Conference on Computer
Vision and Pattern Recognition. Washington D. C., USA: IEEE Press,
2024: 1704-1713.
[167] XIA M, GAO T, ZENG Z, et al. Sheared LLaMA: accelerating
language model pre-training via structured pruning[C]//Proceedings
of the Twelfth International Conference on Learning Representations.
Washington D. C., USA: IEEE Press, 2024: 1-25.
[168] JUNG S, SON C, LEE S, et al. Learning to quantize deep networks by
optimizing quantization intervals with task loss[C]//Proceedings of the
IEEE/CVF conference on computer vision and pattern recognition.
Washington D. C., USA: IEEE Press, 2019: 4350-4359.
[169] ZHOU S, LI L, ZHANG X, et al. LiDAR-PTQ: post-training
quantization for point cloud 3D object detection[C]//Proceedings of
the Twelfth International Conference on Learning Representations.
Washington D. C., USA: IEEE Press, 2024: 1-15.
[170] XU J, FAN J, NAN B, et al. ASLog: an area-efficient CNN
accelerator for per-channel logarithmic post-training quantization[J].
IEEE Transactions on Circuits and Systems I: Regular Papers, 2023,
70(12): 5380-5393.