[1] BELLO I, ZOPH B, LE Q, et al. Attention augmented convolutional networks[C]//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Washington D.C., USA: IEEE Press, 2019: 3285-3294.
[2] AZULAY A, WEISS Y. Why do deep convolutional networks generalize so poorly to small image transformations?[EB/OL]. [2021-02-05]. https://arxiv.org/abs/1805.12177.
[3] PEARL J, MACKENZIE D. The book of why: the new science of cause and effect[M]. New York, USA: Basic Books, 2018.
[4] SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[EB/OL]. [2021-02-05]. https://arxiv.org/abs/1409.1556.
[5] HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Washington D.C., USA: IEEE Press, 2016: 770-778.
[6] KOSIOREK A R, SABOUR S, TEH Y W, et al. Stacked capsule autoencoders[C]//Proceedings of the 33rd International Conference on Neural Information Processing Systems. New York, USA: ACM Press, 2019: 15512-15522.
[7] SZEGEDY C, LIU W, JIA Y Q, et al. Going deeper with convolutions[C]//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Washington D.C., USA: IEEE Press, 2015: 1-9.
[8] IOFFE S, SZEGEDY C. Batch normalization: accelerating deep network training by reducing internal covariate shift[C]//Proceedings of International Conference on Machine Learning. Washington D.C., USA: IEEE Press, 2015: 448-456.
[9] SZEGEDY C, VANHOUCKE V, IOFFE S, et al. Rethinking the inception architecture for computer vision[C]//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Washington D.C., USA: IEEE Press, 2016: 2818-2826.
[10] LIN T Y, DOLLÁR P, GIRSHICK R, et al. Feature pyramid networks for object detection[C]//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Washington D.C., USA: IEEE Press, 2017: 936-944.
[11] SABOUR S, FROSST N, HINTON G E. Dynamic routing between capsules[C]//Proceedings of the 31st International Conference on Neural Information Processing Systems. New York, USA: ACM Press, 2017: 3859-3869.
[12] HINTON G E, SABOUR S, FROSST N. Matrix capsules with EM routing[EB/OL]. [2021-02-05]. http://www.cs.toronto.edu/~hinton/absps/EMcapsules.pdf.
[13] ARORA S, BHASKARA A, GE R, et al. Provable bounds for learning some deep representations[EB/OL]. [2021-02-05]. http://export.arxiv.org/pdf/1310.6343.
[14] CIRESAN D C, MEIER U, MASCI J, et al. Flexible, high performance convolutional neural networks for image classification[C]//Proceedings of 2011 International Joint Conference on Artificial Intelligence. Palo Alto, USA: AAAI Press, 2011: 1237-1242.
[15] LECUN Y, BOTTOU L, BENGIO Y, et al. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 1998, 86(11): 2278-2324.
[16] KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet classification with deep convolutional neural networks[J]. Communications of the ACM, 2017, 60(6): 84-90.
[17] HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D.C., USA: IEEE Press, 2018: 7132-7141.
[18] WANG X L, GIRSHICK R, GUPTA A, et al. Non-local neural networks[C]//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D.C., USA: IEEE Press, 2018: 7794-7803.
[19] JEBARA T, WANG J, CHANG S F. Graph construction and b-matching for semi-supervised learning[C]//Proceedings of the 26th Annual International Conference on Machine Learning. New York, USA: ACM Press, 2009: 441-448.
[20] LI S, FU Y. Learning balanced and unbalanced graphs via low-rank coding[J]. IEEE Transactions on Knowledge and Data Engineering, 2015, 27(5): 1274-1287.
[21] WANG F, ZHANG C S. Label propagation through linear neighborhoods[J]. IEEE Transactions on Knowledge and Data Engineering, 2008, 20(1): 55-67.
[22] WANG X, KANG Z. Smooth representation-based semi-supervised classification[J]. Computer Science, 2021, 48(3): 124-129. (in Chinese)
[23] VALLENDER S S. Calculation of the Wasserstein distance between probability distributions on the line[J]. Theory of Probability & Its Applications, 1974, 18(4): 784-786.
[24] KESKIN Z, ASTE T. Information-theoretic measures for nonlinear causality detection: application to social media sentiment and cryptocurrency prices[J]. Royal Society Open Science, 2020, 7(9): 200863.
[25] CAI R C, CHEN W, ZHANG K, et al. A survey on non-temporal series observational data based causal discovery[J]. Chinese Journal of Computers, 2017, 40(6): 1470-1490. (in Chinese)
[26] GRANGER C W J. Investigating causal relations by econometric models and cross-spectral methods[J]. Econometrica, 1969, 37(3): 424-438.
[27] HU Z Y. Research on investment choice and asset pricing mathematical model[D]. Changsha: Hunan University, 2004. (in Chinese)
[28] SHARPE W F. The Sharpe ratio[J]. The Journal of Portfolio Management, 1994, 21(1): 49-58.
[29] BAILEY D, LÓPEZ DE PRADO M. The Sharpe ratio efficient frontier[J]. The Journal of Risk, 2012, 15(2): 3-44.
[30] MELLOR J, TURNER J, STORKEY A, et al. Neural architecture search without training[EB/OL]. [2021-02-05]. https://arxiv.org/abs/2006.04647v1.
[31] QIAN N. On the momentum term in gradient descent learning algorithms[J]. Neural Networks, 1999, 12(1): 145-151.
[32] DUCHI J C, HAZAN E, SINGER Y. Adaptive subgradient methods for online learning and stochastic optimization[J]. Journal of Machine Learning Research, 2011, 12(61): 2121-2159.
[33] KINGMA D P, BA J. Adam: a method for stochastic optimization[EB/OL]. [2021-02-05]. https://arxiv.org/abs/1412.6980.
[34] GUPTA V, KOREN T, SINGER Y. Shampoo: preconditioned stochastic tensor optimization[C]//Proceedings of International Conference on Machine Learning. Washington D.C., USA: IEEE Press, 2018: 1842-1850.