Click-Through Rate Prediction and Optimization Based on Intra-Field Features Similarity

doi:10.19678/j.issn.1000-3428.0064164

Abstract: Numerous deep learning-based Click-Through Rate (CTR) prediction models exist; however, most of them improve prediction accuracy by modeling feature interactions between different fields. Feature embedding vectors have a significant impact on model performance, yet existing CTR models learn the embedding vector of each feature independently. Because feature frequencies follow a long-tail distribution, most low-frequency features fail to learn good embeddings, which seriously degrades model accuracy. Observing that implicit similarity exists between features of the same field, this study proposes two definitions of intra-field feature similarity, based on co-occurrence probability and walk probability, respectively, together with the corresponding similarity graph construction methods. A breadth-first traversal algorithm combined with a pruning strategy is designed to compute similar features efficiently. Based on the intra-field feature similarity graph, an embedding generator is also proposed: for each low-frequency feature, a graph neural network aggregates the information of similar features on the similarity graph to generate a new feature embedding. This data augmentation is applied as a preprocessing step to improve the learning quality of feature embedding vectors. Extensive experiments on the public datasets Criteo and Avazu show that the proposed method improves the prediction accuracy of several classical CTR models. For the representative CTR models xDeepFM and AutoInt, the AUC increases by 0.007 and 0.008, respectively, while the LogLoss decreases by 0.009 and 0.006, respectively, which demonstrates the effectiveness of the embedding generator.
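
The abstract does not give the exact similarity formula, pruning criterion, or graph neural network architecture, so the following Python sketch is only one plausible reading of the pipeline. It assumes that walk probability is the product of transition probabilities along a path, that pruning discards paths whose probability falls below a threshold, and that a single similarity-weighted mean stands in for the graph neural network layer. All names (`similar_features`, `aggregate_embedding`, `min_prob`, `max_hops`, `self_weight`) and the toy data are hypothetical.

```python
from collections import deque


def similar_features(graph, source, min_prob=0.05, max_hops=2):
    """Breadth-first traversal from `source` with probability-based pruning.

    graph: dict mapping a feature to {neighbor: transition probability}.
    Returns {feature: accumulated walk score from `source`}.
    Assumption: a path's probability is the product of its edge
    probabilities, and paths below `min_prob` are pruned.
    """
    scores = {}
    queue = deque([(source, 1.0, 0)])  # (feature, path probability, hops)
    while queue:
        node, prob, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for nbr, p in graph.get(node, {}).items():
            path_prob = prob * p
            if nbr == source or path_prob < min_prob:
                continue  # pruning: skip returns to source and unlikely paths
            scores[nbr] = scores.get(nbr, 0.0) + path_prob
            queue.append((nbr, path_prob, hops + 1))
    return scores


def aggregate_embedding(own, neighbors, embeddings, self_weight=0.5):
    """Blend a feature's embedding with a similarity-weighted neighbor mean.

    Assumption: this single weighted-mean step is a stand-in for the
    graph neural network aggregation, not the paper's actual layer.
    """
    total = sum(neighbors.values())
    if total == 0:
        return list(own)  # no similar features: keep the original embedding
    agg = [0.0] * len(own)
    for feat, weight in neighbors.items():
        for i, value in enumerate(embeddings[feat]):
            agg[i] += (weight / total) * value
    return [self_weight * o + (1.0 - self_weight) * a for o, a in zip(own, agg)]


# Toy similarity graph for one field (hypothetical feature IDs).
graph = {
    "ad_17": {"ad_3": 0.5, "ad_8": 0.5},
    "ad_3": {"ad_8": 0.5, "ad_17": 0.5},
    "ad_8": {"ad_3": 1.0},
}
embeddings = {"ad_3": [1.0, 0.0], "ad_8": [0.0, 1.0]}

scores = similar_features(graph, "ad_17", min_prob=0.1, max_hops=2)
print(scores)  # {'ad_3': 1.0, 'ad_8': 0.75}
print(aggregate_embedding([0.0, 0.0], scores, embeddings))
```

In this sketch, raising `min_prob` prunes more paths and shrinks the set of similar features, trading coverage for speed. Applying `aggregate_embedding` only to low-frequency features mirrors the preprocessing role the abstract describes for the embedding generator.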

Key words: Click-Through Rate (CTR) prediction, sparse feature, feature embedding, feature similarity, graph neural network

CLC Number: TP391

LEI Lixiang, WU Zhihao, LIU Yu, ZHOU Zizhan. Click-Through Rate Prediction and Optimization Based on Intra-Field Features Similarity[J]. Computer Engineering, 2023, 49(2): 238-245.

URL: http://www.ecice06.com/EN/10.19678/j.issn.1000-3428.0064164

http://www.ecice06.com/EN/Y2023/V49/I2/238
