
Computer Engineering, 2023, Vol. 49, Issue (10): 272-279. doi: 10.19678/j.issn.1000-3428.0066030

• Development Research and Engineering Application •

Calligraphy Character Skeleton Extraction Based on Improved Conditional Generative Adversarial Network

Zijun ZHANG1, Jinsong CHEN2, Xiyuan QIAN1,*

  1. School of Mathematics, East China University of Science and Technology, Shanghai 200237, China
  2. Shanghai Hongyiyuan Software Technology Co., Ltd., Shanghai 200233, China
  • Received: 2022-09-19 Published: 2023-10-15 Released online: 2023-01-12
  • Contact: Xiyuan QIAN
  • About the authors:

    Zijun ZHANG (born 1997), female, master's student; main research interests: statistical computing and its applications

    Jinsong CHEN, assistant researcher

  • Funding:
    Shanghai Municipal Fiscal Fund Support Project for Promoting the Development of Cultural and Creative Industries (2020011278_V0)

Abstract:

The skeleton of a calligraphy character preserves its structure, shape, and stroke details, and is therefore crucial for evaluating stroke structure. Existing skeleton extraction algorithms cannot recover the dynamic information of offline calligraphy images. To address this problem, a skeleton extraction algorithm based on an improved conditional Generative Adversarial Network (cGAN) is proposed. A residual structure and a hierarchical atrous convolution module are introduced into the cGAN to capture long-distance context information, and a criss-cross attention module is integrated to keep the generated skeleton smooth. Spectral normalization and the Leaky ReLU activation function are used to stabilize model training and to improve the completeness of the extracted skeleton. A pseudo calligraphy image dataset is constructed from online Chinese handwriting databases. Experimental results show that, on the test dataset, the proposed algorithm achieves an F1 score of 0.678 2, an Intersection over Union (IoU) of 0.515 8, and an Average Minimum Distance (AMD) of 1.45. Compared with the best results of existing skeleton extraction algorithms, F1 and IoU increase by 8.2% and 8.8%, respectively, and AMD decreases by approximately 0.42. The algorithm captures the dynamic information of offline calligraphy images, makes the skeleton features more representative, and generalizes well to real calligraphy copybook images. Ablation experiments further verify the effectiveness of the hierarchical atrous convolution module and the criss-cross attention module, which yield more complete and smoother character skeletons.

Key words: pix2pix algorithm, skeleton extraction, hierarchical atrous convolution, criss-cross attention, offline calligraphy image
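
The three evaluation indicators named in the abstract can be illustrated with a minimal sketch. The abstract does not give their formulas, so the definitions below are assumptions rather than the authors' implementation: F1 and IoU are computed pixel-wise on binary skeleton images, and AMD is taken as the symmetric mean of nearest-neighbour Euclidean distances between predicted and reference skeleton pixels. The function names and the 3×3 toy images are hypothetical.

```python
import math


def skeleton_pixels(img):
    """Return the set of (row, col) coordinates of non-zero (skeleton) pixels."""
    return {(r, c) for r, row in enumerate(img) for c, v in enumerate(row) if v}


def f1_and_iou(pred, target):
    """Pixel-wise F1 score and IoU between two binary images (assumed definitions)."""
    p, t = skeleton_pixels(pred), skeleton_pixels(target)
    tp = len(p & t)                      # pixels marked in both images
    union = len(p | t)
    # F1 = 2TP / (2TP + FP + FN) = 2TP / (|pred| + |target|)
    f1 = 2 * tp / (len(p) + len(t)) if (p or t) else 1.0
    iou = tp / union if union else 1.0
    return f1, iou


def _mean_nearest(src, dst):
    """Mean distance from each point in src to its nearest point in dst."""
    return sum(min(math.dist(a, b) for b in dst) for a in src) / len(src)


def average_minimum_distance(pred, target):
    """Symmetric average minimum distance (assumed definition of AMD)."""
    p, t = skeleton_pixels(pred), skeleton_pixels(target)
    if not p or not t:
        raise ValueError("both images need at least one skeleton pixel")
    return 0.5 * (_mean_nearest(p, t) + _mean_nearest(t, p))


# Toy example: a vertical stroke, and a prediction whose last pixel is off by one column.
target = [[0, 1, 0],
          [0, 1, 0],
          [0, 1, 0]]
pred = [[0, 1, 0],
        [0, 1, 0],
        [0, 0, 1]]

f1, iou = f1_and_iou(pred, target)
amd = average_minimum_distance(pred, target)
print(f1, iou, amd)  # F1 = 2/3, IoU = 0.5, AMD = 1/3
```

Higher F1 and IoU indicate more overlap with the reference skeleton, while a lower AMD indicates that the predicted skeleton lies closer to it. This is why the abstract reports increases in the first two indicators and a decrease in the third. The brute-force nearest-neighbour search is quadratic in the number of skeleton pixels, which is acceptable for thin skeletons but would usually be replaced by a distance transform on full-size images.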