Title:

 Research on Cross-Domain Face Forgery Detection Method Based on Representation Learning

Author:

 邱凌瑜    

Student ID:

 SX2216010    

Confidentiality Level:

 Public

Language:

 chi    

Discipline Code:

 081200    

Discipline:

 Engineering - Computer Science and Technology (degrees may be conferred in Engineering or Science) - Computer Science and Technology

Student Type:

 Master's

Degree:

 Master of Engineering

Year of Enrollment:

 2025    

University:

 Nanjing University of Aeronautics and Astronautics

School/Department:

 College of Computer Science and Technology / College of Artificial Intelligence

Major:

 Computer Science and Technology

Research Direction:

 Computer Vision

Supervisor:

 谭晓阳    

Supervisor's Affiliation:

 College of Computer Science and Technology / College of Artificial Intelligence

Completion Date:

 2025-03-17    

Defense Date:

 2025-03-17    

Foreign Title:

 Research on Cross-Domain Face Forgery Detection Method Based on Representation Learning

Keywords:

 Deepfake Detection; Face Forgery Detection; Computer Vision; Machine Learning; Domain Generalization

Foreign Keywords:

 Deepfake Detection; Face Forgery Detection; Computer Vision; Machine Learning; Domain Generalization

Abstract:

      Face forgery detection is an important task in computer vision: it aims to determine whether the faces in an image or video have been tampered with by forgery techniques, and it plays a key role in protecting privacy and strengthening information security. With the development and spread of deep learning, generated forgeries have become increasingly realistic, posing severe challenges to face forgery detection. Moreover, existing detection methods often fail when confronted with complex, previously unseen forgery techniques or with adversarial attacks. To overcome these challenges, this thesis works to improve the generalization of cross-domain face forgery detection and proposes three algorithms that enhance the generalization ability and robustness of deepfake detection models in cross-domain scenarios.

      1. A Cross-Domain Face Forgery Detection Algorithm Based on Multi-level Distributional Discrepancy Enhancement

      Starting from the distributional properties of face forgery detection networks, this work studies how real and forged images differ in the representation space and proposes an algorithm based on Multi-level Distributional Discrepancy Enhancement (MDDE). MDDE discriminates the changes in distribution patterns between real and forged faces across multiple levels of latent representation and incorporates a deformable convolution module that extracts fine-grained features from real face images, improving detection accuracy across different datasets. Extensive experiments on multiple benchmark datasets validate the effectiveness of the method and its superior performance compared with several state-of-the-art techniques.

      2. A Cross-Domain Face Forgery Detection Algorithm Based on Contrastive Desensitization Learning

      Starting from the generalization ability of the detector, this work addresses the problem that existing face forgery detection methods usually suffer from high false alarm rates, which undermines system usability. A Contrastive Desensitization Network (CDN) built on a robust desensitization algorithm is proposed: by learning from domain transformations of real face images, the network captures essential domain characteristics, making the model insensitive to different and possibly unseen forgery methods. The learned face representation is theoretically justified, as it is provably robust to domain changes. Extensive experiments on large-scale benchmark datasets show that the method achieves lower false alarm rates and higher detection accuracy than state-of-the-art methods.

      3. A Cross-Domain Face Forgery Detection Algorithm Based on Robust Domain Gradient Alignment

      Cross-domain generalization in deepfake detection remains a challenge: previous augmentation-based methods rely on strong assumptions that are often hard to satisfy in face forgery detection. This work proposes a novel learning objective, Robust Gradient Alignment (RoGA), which incorporates a generalization gradient update into the ERM gradient update and, by applying perturbations to the model parameters, aligns the ascent points across domains, thereby enhancing robustness to domain shift. The method handles domain-specific characteristics effectively while preserving domain-invariant features. Experimental results show that the proposed robust gradient alignment strategy outperforms current state-of-the-art domain generalization techniques, validating its effectiveness.

      In summary, this thesis studies the generalization problem of face deepfake detection from a representation learning perspective and proposes several methods that improve detection accuracy and generalization, which is of value for both scientific research and practical applications.

Foreign Abstract:

      Face forgery detection is an important task in the field of computer vision, aiming to identify whether faces in images or videos have been manipulated using forgery techniques. It plays a pivotal role in safeguarding privacy and strengthening information security. However, with the rapid advancement and widespread application of deep learning, forged images and videos are becoming increasingly realistic, posing significant challenges to the field of face forgery detection. Existing methods often fail when encountering complex and unseen forgery techniques or under adversarial attacks. To address these challenges, this work focuses on enhancing the generalization ability and robustness of face forgery detection methods in cross-domain scenarios and introduces three key contributions:

      1. Multi-level Distributional Discrepancy Enhancement for Cross-Domain Face Forgery Detection.

      This work investigates the representational differences between real and forged images. A novel approach, Multi-level Distributional Discrepancy Enhancement (MDDE), is proposed to capture distributional variations across multiple levels of latent representations. By integrating a deformable convolution module, MDDE extracts fine-grained features from real images, thereby improving adaptability and performance across diverse datasets. Extensive experiments on benchmark datasets validate the effectiveness of MDDE, demonstrating superior performance compared to state-of-the-art methods.
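
      As a concrete illustration, the sketch below shows one plausible way to realize a multi-level distributional discrepancy measure together with a deformable-convolution branch in PyTorch. It is a minimal sketch under stated assumptions, not the thesis implementation: the module names, the use of channel-wise mean/std as the per-level distribution summary, and the loss form are illustrative choices.

```python
# Hypothetical sketch of multi-level distributional discrepancy with a
# deformable-convolution branch (NOT the thesis code; names and the loss
# form are illustrative assumptions).
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d


class DeformableBranch(nn.Module):
    """Predicts sampling offsets and applies a deformable convolution to
    pick up fine-grained cues from real-face feature maps."""

    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        # one (dy, dx) offset pair per kernel location -> 2 * k * k channels
        self.offset = nn.Conv2d(channels, 2 * kernel_size * kernel_size,
                                kernel_size, padding=kernel_size // 2)
        self.deform = DeformConv2d(channels, channels, kernel_size,
                                   padding=kernel_size // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.deform(x, self.offset(x))


def level_statistics(feat: torch.Tensor):
    """Summarize one feature level by its channel-wise mean and std."""
    return feat.mean(dim=(2, 3)), feat.std(dim=(2, 3))


def distribution_discrepancy(real_feats, fake_feats):
    """Aggregate the statistic gap over several feature levels; maximizing it
    pushes real and forged distributions apart in the representation space."""
    gap = 0.0
    for f_real, f_fake in zip(real_feats, fake_feats):
        mu_r, sd_r = level_statistics(f_real)
        mu_f, sd_f = level_statistics(f_fake)
        gap = gap + (mu_r - mu_f).pow(2).mean() + (sd_r - sd_f).pow(2).mean()
    return gap
```

      In a full detector, a term of this kind would be combined with the usual classification loss so that the multi-level statistics of real and forged features are driven apart.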

      2. Contrastive Desensitization Learning for Cross-Domain Face Forgery Detection.

      Starting from the generalization ability of the detector, this work studies the problem that existing face forgery detection methods usually have a high false alarm rate, which undermines the usability of the system. This thesis proposes a Contrastive Desensitization Network (CDN) based on a robust desensitization algorithm, which captures intrinsic features through desensitization learning on domain transformations of real face images, making the model insensitive to different and unseen forgery techniques. The learned face representation is theoretically justified, as it is provably robust under domain shift. Extensive experiments on large-scale benchmark datasets show that our method achieves lower false alarm rates and higher detection accuracy than state-of-the-art methods.
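
      A minimal sketch of the contrastive desensitization idea is given below: two domain-style transformations of the same real face are treated as a simulated domain shift, and an InfoNCE-style loss pulls their representations together. The encoder, the particular transformations, and the temperature are illustrative assumptions rather than the CDN described above.

```python
# Hypothetical sketch of contrastive desensitization on real faces only (an
# interpretation of the CDN idea, not the thesis code; the encoder, the chosen
# "domain" transformations, and the temperature are assumptions).
import torch
import torch.nn.functional as F
from torch import nn
from torchvision import transforms, models

# Two random photometric transforms of the same real face stand in for a
# domain shift; no forged images are needed for this loss.
domain_transform = transforms.Compose([
    transforms.ColorJitter(0.4, 0.4, 0.4, 0.1),
    transforms.GaussianBlur(kernel_size=3),
])

encoder = models.resnet18(weights=None)
encoder.fc = nn.Identity()  # pooled backbone features as the representation


def desensitization_loss(real_batch: torch.Tensor, temperature: float = 0.1):
    """InfoNCE-style loss: two domain-shifted views of the same real face
    should map to nearby representations, making them domain-insensitive."""
    view1 = encoder(torch.stack([domain_transform(x) for x in real_batch]))
    view2 = encoder(torch.stack([domain_transform(x) for x in real_batch]))
    z1, z2 = F.normalize(view1, dim=1), F.normalize(view2, dim=1)
    logits = z1 @ z2.t() / temperature       # pairwise cosine similarities
    targets = torch.arange(z1.size(0))       # positives lie on the diagonal
    return F.cross_entropy(logits, targets)
```

      Because the loss is computed on real faces alone, a representation trained this way is encouraged to ignore domain-specific variation rather than memorize artifacts of any particular forgery method.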

      3. A Generalizable Deepfake Detection Method through Robust Gradient Alignment.

      Cross-domain generalization in face forgery detection is a persistent challenge. Previous augmentation-based methods rely on strong domain assumptions, which are often impractical for face forgery detection. This work proposes a novel learning objective, Robust Gradient Alignment (RoGA), which aligns generalization gradient updates with ERM gradient updates. By applying perturbations to model parameters, RoGA achieves alignment of ascent points across domains, enhancing robustness to domain shifts. This method effectively retains domain-invariant features while managing domain-specific characteristics. Experimental results show that RoGA outperforms state-of-the-art domain generalization techniques, validating its effectiveness.
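
      The following sketch shows one SAM-style way such a perturbation-based gradient alignment step could look in PyTorch. It is only an interpretation of the description above: the perturbation radius rho, the mixing coefficient alpha, and the rule for combining the ERM and perturbed-point gradients are assumptions, not the thesis implementation.

```python
# Hypothetical, simplified sketch of a perturbation-based gradient alignment
# step in the spirit of sharpness-aware minimization (an interpretation of
# RoGA, not the thesis code; rho, alpha, and the mixing rule are assumptions).
import torch


def roga_step(model, loss_fn, domain_batches, optimizer, rho=0.05, alpha=0.5):
    """domain_batches: list of (inputs, labels), one batch per source domain."""
    params = [p for p in model.parameters() if p.requires_grad]

    # 1) ERM gradient averaged over all source domains
    erm_loss = sum(loss_fn(model(x), y) for x, y in domain_batches) / len(domain_batches)
    erm_grads = torch.autograd.grad(erm_loss, params)

    # 2) Ascend to a single shared perturbed point (aligned across domains)
    grad_norm = torch.sqrt(sum(g.pow(2).sum() for g in erm_grads)) + 1e-12
    eps = [rho * g / grad_norm for g in erm_grads]
    with torch.no_grad():
        for p, e in zip(params, eps):
            p.add_(e)

    # 3) Gradient at the perturbed (ascent) point captures robustness to shift
    pert_loss = sum(loss_fn(model(x), y) for x, y in domain_batches) / len(domain_batches)
    pert_grads = torch.autograd.grad(pert_loss, params)

    # 4) Undo the perturbation and mix ERM and perturbed-point gradients
    with torch.no_grad():
        for p, e in zip(params, eps):
            p.sub_(e)
        for p, g_erm, g_pert in zip(params, erm_grads, pert_grads):
            p.grad = alpha * g_erm + (1.0 - alpha) * g_pert
    optimizer.step()
    optimizer.zero_grad()
    return float(erm_loss.detach())
```

      Mixing the two gradients keeps the ordinary ERM signal while the perturbed-point gradient discourages sharp, domain-specific minima, which is one way to read the alignment objective described above.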

      In summary, this thesis addresses the generalization challenges in face forgery detection from a representation learning perspective. By introducing innovative methods, it significantly improves detection accuracy and generalization performance, achieving impactful results in both scientific research and practical applications.

CLC Number:

 TP391    

Accession Number:

 2025-016-0192    

Open Access Date:

 2025-09-29    
