| 1 | GOODFELLOW I, POUGET-ABADIE J, MIRZA M, et al. Generative adversarial networks. Communications of the ACM, 2020, 63(11): 139-144. doi: 10.1145/3422622 | 
																													
																							| 2 |  | 
																													
																							| 3 |  | 
																													
| 4 | GENG P Z, FAN H X, ZHANG Y Y, et al. Deepfake detection method based on tampering artifacts. Computer Engineering, 2021, 47(12): 156-162. (in Chinese) | 
																													
| 5 | LI K, LI S M, JI L X, et al. Method of face forgery detection based on self-attention capsule network. Computer Engineering, 2022, 48(2): 194-200, 206. (in Chinese) | 
																													
																							| 6 | FAN H Q, XIONG B, MANGALAM K, et al. Multiscale Vision Transformers[C]//Proceedings of IEEE/CVF International Conference on Computer Vision. Washington D. C., USA: IEEE Press, 2021: 6824-6835. | 
																													
																							| 7 | HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Washington D. C., USA: IEEE Press, 2016: 770-778. | 
																													
																							| 8 | MONTSERRAT D M, HAO H X, YARLAGADDA S K, et al. Deepfakes detection with automatic face weighting[C]//Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Washington D. C., USA: IEEE Press, 2020: 668-669. | 
																													
																							| 9 | SUN Z, HAN Y, HUA Z, et al. Improving the efficiency and robustness of deepfakes detection through precise geometric features[C]//Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D. C., USA: IEEE Press, 2021: 3609-3618. | 
																													
																							| 10 |  | 
																													
																							| 11 | CHUGH K, GUPTA P, DHALL A, et al. Not made for each other- audio-visual dissonance-based deepfake detection and localization[C]//Proceedings of the 28th ACM International Conference on Multimedia. New York, USA: ACM Press, 2020: 439-447. | 
																													
| 12 | KNAFO G, FRIED O. FakeOut: leveraging out-of-domain self-supervision for multi-modal video deepfake detection[EB/OL]. [2023-05-05]. https://arxiv.org/abs/2212.00773. | 
																													
																							| 13 |  | 
																													
																							| 14 | BONETTINI N, CANNAS E D, MANDELLI S, et al. Video face manipulation detection through ensemble of CNNs[C]//Proceedings of the 25th International Conference on Pattern Recognition. Washington D. C., USA: IEEE Press, 2021: 5012-5019. | 
																													
| 15 | WANG G J, JIANG Q, JIN X, et al. FFR_FD: effective and fast detection of DeepFakes via feature point defects. Information Sciences: an International Journal, 2022, 596(C): 472-488. | 
																													
																							| 16 | ZHAO H Q, WEI T Y, ZHOU W B, et al. Multi-attentional deepfake detection[C]//Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D. C., USA: IEEE Press, 2021: 2185-2194. | 
																													
																							| 17 |  | 
																													
																							| 18 |  | 
																													
| 19 | HEO Y J, CHOI Y J, LEE Y W, et al. Deepfake detection scheme based on Vision Transformer and distillation[EB/OL]. [2023-05-05]. https://arxiv.org/abs/2104.01353. | 
																													
| 20 | ZHANG K P, ZHANG Z P, LI Z F, et al. Joint face detection and alignment using multitask cascaded convolutional networks. IEEE Signal Processing Letters, 2016, 23(10): 1499-1503. doi: 10.1109/LSP.2016.2603342 | 
																													
																							| 21 | LI Y H, WU C Y, FAN H Q, et al. MViTv2: improved multiscale Vision Transformers for classification and detection[C]//Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D. C., USA: IEEE Press, 2022: 4804-4814. | 
																													
																							| 22 |  | 
																													
																							| 23 | CHOLLET F. Xception: deep learning with depthwise separable convolutions[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Washington D. C., USA: IEEE Press, 2017: 1251-1258. | 
																													
																							| 24 |  | 
																													
																							| 25 | CHEN C F R, FAN Q F, PANDA R. CrossViT: cross-attention multi-scale Vision Transformer for image classification[C]//Proceedings of IEEE/CVF International Conference on Computer Vision. Washington D. C., USA: IEEE Press, 2021: 357-366. | 
																													
																							| 26 | ZHAO Z X, BAI H W, ZHANG J S, et al. CDDFuse: correlation-driven dual-branch feature decomposition for multi-modality image fusion[C]//Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D. C., USA: IEEE Press, 2023: 5906-5916. | 
																													
																							| 27 | ZHU L, WANG X J, KE Z H, et al. BiFormer: Vision Transformer with Bi-level routing attention[C]//Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D. C., USA: IEEE Press, 2023: 10323-10333. | 
																													
																							| 28 |  | 
																													
																							| 29 | YANG X, LI Y Z, LÜ S W. Exposing deep fakes using inconsistent head poses[C]//Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing. Washington D. C., USA: IEEE Press, 2019: 8261-8265. | 
																													
																							| 30 |  | 
																													
																							| 31 | ROSSLER A, COZZOLINO D, VERDOLIVA L, et al. FaceForensics++: learning to detect manipulated facial images[C]//Proceedings of IEEE/CVF International Conference on Computer Vision. Washington D. C., USA: IEEE Press, 2019: 1-11. | 
																													
																							| 32 | ZI B J, CHANG M H, CHEN J J, et al. WildDeepfake: a challenging real-world dataset for deepfake detection[C]//Proceedings of the 28th ACM International Conference on Multimedia. New York, USA: ACM Press, 2020: 2382-2390. | 
																													
																							| 33 |  | 
																													
																							| 34 | LI Y Z, YANG X, SUN P, et al. Celeb-DF: a large-scale challenging dataset for DeepFake forensics[C]//Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D. C., USA: IEEE Press, 2020: 3207-3216. | 
																													
| 35 | HU J, LIAO X, WANG W, et al. Detecting compressed deepfake videos in social networks using frame-temporality two-stream convolutional network. IEEE Transactions on Circuits and Systems for Video Technology, 2022, 32(3): 1089-1102. doi: 10.1109/TCSVT.2021.3074259 | 
																													
| 36 | COCCOMINI D A, MESSINA N, GENNARO C, et al. Combining EfficientNet and Vision Transformers for video deepfake detection[EB/OL]. [2023-05-05]. https://arxiv.org/abs/2107.02612. |