Calligraphy Character Skeleton Extraction Based on Improved Conditional Generative Adversarial Network

doi:10.19678/j.issn.1000-3428.0066030

Abstract

The skeleton of a calligraphy character retains its structure, shape, and stroke details, and is therefore crucial for evaluating stroke structure. Existing skeleton extraction algorithms cannot recover the dynamic information of offline calligraphy images. To address this, a calligraphy character skeleton extraction algorithm based on an improved conditional Generative Adversarial Network (cGAN) is proposed. A residual structure and a hierarchical atrous convolution module are introduced into the cGAN to capture long-range context information, and a criss-cross attention module is integrated to keep the generated skeleton smooth. Spectral normalization and the Leaky ReLU activation function are used to stabilize model training and improve the completeness of the extracted skeletons. A pseudo-calligraphy image dataset is constructed from online Chinese handwriting databases. Experimental results on the test dataset show that the proposed algorithm achieves an F1 score of 0.6782, an Intersection over Union (IoU) of 0.5158, and an Average Minimum Distance (AMD) of 1.4500. Compared with the best results of existing skeleton extraction algorithms, F1 and IoU increase by 8.2% and 8.8%, respectively, and AMD decreases by approximately 0.42. The proposed algorithm effectively captures the dynamic information of offline calligraphy images, makes the skeleton features more representative, and generalizes well to images from real calligraphy copybooks. In addition, ablation experiments verify the effectiveness of the hierarchical atrous convolution module and the criss-cross attention module, which yield more complete and smoother character skeletons.

Key words: pix2pix algorithm, skeleton extraction, hierarchical atrous convolution, criss-cross attention, offline calligraphy image
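The F1 score and IoU reported in the abstract are standard overlap measures between a predicted skeleton and a reference skeleton. The following is a minimal sketch of how they could be computed pixel-wise on binary masks. It assumes exact pixel matching with no distance tolerance, which may differ from the paper's evaluation protocol. The function names `skeleton_pixels` and `f1_and_iou` and the toy masks are hypothetical and chosen only for illustration.

```python
from typing import List, Set, Tuple

Pixel = Tuple[int, int]
Mask = List[List[int]]


def skeleton_pixels(mask: Mask) -> Set[Pixel]:
    """Return the (row, col) coordinates of all nonzero pixels in a mask."""
    return {(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v}


def f1_and_iou(pred: Mask, target: Mask) -> Tuple[float, float]:
    """Pixel-wise F1 and IoU between two binary skeleton masks.

    Assumption: a predicted pixel counts as correct only if it coincides
    exactly with a reference pixel (no tolerance radius).
    """
    p, t = skeleton_pixels(pred), skeleton_pixels(target)
    tp = len(p & t)   # skeleton pixels present in both masks
    fp = len(p - t)   # predicted pixels absent from the reference
    fn = len(t - p)   # reference pixels missed by the prediction

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0

    union = tp + fp + fn
    iou = tp / union if union else 0.0
    return f1, iou


# Toy example: a vertical stroke whose predicted skeleton bends at the bottom.
pred = [
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
target = [
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
]
f1, iou = f1_and_iou(pred, target)
print(f"F1 = {f1:.4f}, IoU = {iou:.4f}")
```

In this toy case three pixels overlap, one predicted pixel is spurious, and one reference pixel is missed, so precision and recall are both 3/4 and the IoU is 3/5. Because skeletons are one pixel wide, exact-match scores of this kind are sensitive to small positional offsets, which is one reason a distance-based measure such as AMD is reported alongside them.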

Zijun ZHANG, Jinsong CHEN, Xiyuan QIAN. Calligraphy Character Skeleton Extraction Based on Improved Conditional Generative Adversarial Network[J]. Computer Engineering, 2023, 49(10): 272-279.

http://www.ecice06.com/CN/Y2023/V49/I10/272

Figures/Tables (15)

Fig.1 Calligraphy character skeleton extraction results based on the pix2pix model

Fig.2 Overall structure of the network

Fig.3 Architecture of the generator

Fig.4 Hierarchical atrous convolution merging module

Fig.5 Criss-cross attention module

Fig.6 Architecture of the discriminator

Fig.7 Online handwritten character and offline calligraphy character

Fig.8 Local comparison of skeleton extraction algorithms

Fig.9 Comparison of L1 loss for different values of $\lambda$

Fig.10 Skeleton extraction results of different algorithms on the synthesized calligraphy character dataset

Fig.11 Local comparison of skeleton extraction algorithms

Fig.12 Skeleton extraction results of different algorithms on real calligraphy characters

