Thesis information

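Title:	Research on Enhancement Methods for Hateful Meme Detection in Online Public Opinion Governance (网络舆情治理中的仇恨迷因检测增强方法研究)
Author:	Li Sheng (李盛)
Student ID:	SX2209020
Confidentiality level:	Public
Language:	chi
Discipline code:	120100
Discipline:	Management - Management Science and Engineering (may confer degrees in Management or Engineering) - Management Science and Engineering
Student type:	Master's
Degree:	Master of Management
Year of enrollment:	2022
University:	Nanjing University of Aeronautics and Astronautics (南京航空航天大学)
School:	College of Economics and Management
Major:	Management Science and Engineering
Research direction:	Big data analysis
Advisor:	Ma Jing (马静)
Advisor's affiliation:	College of Economics and Management
Completion date:	2025-03-17
Defense date:	2025-03-13
Foreign-language title:	Research on Enhanced Methods for Multimodal Hateful Meme Detection in Online Public Opinion Governance
Keywords:	Hateful meme detection ; Public opinion governance ; Internet meme ; Multimodal ; Feature enhancement
Foreign-language keywords:	Hateful Meme Detection ; Public Opinion Governance ; Internet Meme ; Multimodality ; Feature Enhancement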
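Abstract:	︿With the rapid spread of the internet, online platforms have become an important channel through which people express opinions and emotions. At the same time, the internet meme, a multimodal medium that combines image and text, has become widely popular among netizens because it is humorous, brief, and rich in connotation, and it accounts for a growing share of online public opinion. However, its viral propagation also makes memes that carry hateful content (hateful memes) a serious threat to social harmony and a challenge for online public opinion governance. To reduce the social impact of hateful memes, detecting and identifying them early in their spread is one of the key problems in governing them, yet traditional text-only hate speech detection methods perform poorly on multimodal hateful memes. Automated detection methods designed for hateful memes can identify them more accurately within massive volumes of online public opinion data, which is of great significance for online public opinion governance. This thesis first analyzes the challenges and importance of hateful meme detection and reviews the state of research on hate speech detection, internet memes, and hateful meme detection. It then summarizes the shortcomings of existing hateful meme detection methods and, in response, proposes two feature enhancement methods for hateful meme detection. The main research content is as follows: (1) To address the difficulty detection models have in extracting high-level semantic features of meme images that are relevant to hate detection, the thesis proposes a hateful meme detection enhancement method that fuses domain-specific image entities. Entity discovery and entity refinement are performed on hate speech corpora in a data-driven manner to build a high-quality set of entities of interest relevant to the hate detection task. The zero-shot learning capability of the CLIP model is then used to match, from the set of entities of interest, the domain-specific image entities that appear in a meme image. Domain-specific image entities capture the entity elements in a meme image that are relevant to hate detection, which helps the detection model understand the hate-related high-level semantic information in the image and improves hateful meme detection performance. Comparative experiments with multiple detection models on a public hateful meme dataset show that, relative to the baseline models, the AUC score improves by 0.122 on the text-only RoBERTa-base model and by 0.030 on the vision-language model BridgeTower-base. The results indicate that fusing domain-specific image entities enhances multiple detection models and that the enhancement exceeds that of general image entity features, verifying the stability and effectiveness of the method. (2) Building on the improved high-level semantic feature extraction obtained by fusing domain-specific image entities, and to address detection models' lack of the background information needed to detect hateful memes, the thesis proposes a hateful meme detection enhancement method that fuses entity background relations. The background knowledge required by the task is modeled as background relation information between entities among which hate may exist, and entity background relations are extracted for each meme through prompt learning and retrieval-augmented generation with a large language model. Because the large language model can process only text, the domain-specific image entity extraction module is reused to convert entities in the meme image into text-modality domain-specific image entities, thereby covering the important visual elements of the meme image. Prompt learning first drives the large language model to identify, from the meme text and the domain-specific image entities, the set of entities among which a hateful relation may exist. Retrieval-augmented generation then recalls relevant passages from an entity background knowledge base to generate entity relation information. Comparative experiments are conducted on a public hateful meme dataset under multiple experimental settings, with the vision-language model BridgeTower-base as the baseline. The results show that, after fusing entity background relations, the detection model improves significantly over the baseline; in particular, with entity background relation features fused, training on only 50% of the training data yields AUC scores and accuracy comparable to those of the baseline trained on the full data, demonstrating the effectiveness of the method. Finally, drawing on the research content, the thesis summarizes management implications for online public opinion governance, proposes countermeasures for different management roles in governing hateful memes, and discusses the limitations of the research and directions for future improvement.﹀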
Foreign-language abstract:	︿ With the rapid proliferation of the internet, online platforms have become a significant avenue for people to express their opinions and emotions. Meanwhile, internet memes, a multimodal medium combining visual and textual modalities, have gained widespread popularity among netizens due to their humor, brevity, and rich connotations, playing an increasingly prominent role in online public opinion. However, their viral propagation has also made memes containing hateful content (i.e., hateful memes) a significant threat to social harmony, posing challenges to the governance of online public opinion. To mitigate the societal impact of hateful memes, detecting and identifying them early in their spread is a key issue in combating them. Traditional text-only hate speech detection methods, however, are less effective at identifying multimodal hateful memes. Automated detection methods designed for hateful memes can identify such content more accurately within vast amounts of online information, and therefore play a critical role in the governance of online public opinion. This thesis begins by analyzing the challenges and significance of hateful meme detection and reviewing current research on hate speech detection, internet memes, and hateful meme detection. It summarizes the limitations of existing hateful meme detection methods and proposes two feature enhancement approaches to address these shortcomings. The main research contributions of this thesis are as follows: (1) To address the difficulty of extracting high-level semantic features relevant to hate detection from meme images, this thesis proposes an enhanced hateful meme detection method that integrates domain-specific image entities. A data-driven approach is used to discover and refine entities from hate speech corpora, resulting in a high-quality collection of task-relevant entities of interest. Leveraging the zero-shot learning capability of the CLIP model, the domain-specific image entities present in a meme image are matched from this collection. These extracted entities capture the elements of a meme image that are relevant to hate detection, helping detection models understand the high-level semantic information associated with hateful memes and improving their detection performance. Comparative experiments were conducted with multiple detection models on a publicly available hateful meme dataset. Compared with the baseline models, the text-only RoBERTa-base model achieved an AUC improvement of 0.122, while the BridgeTower-base vision-language model improved by 0.030. The experimental results demonstrate that incorporating domain-specific image entities enhances various detection models, with a greater enhancement effect than general image entity features, and they validate the stability and effectiveness of the proposed approach. (2) Building on the improved high-level semantic feature extraction obtained by incorporating domain-specific image entities, an enhanced hateful meme detection method that integrates entity background relationships is proposed to address the detection model's lack of the background information needed to detect hateful memes. Specifically, the background knowledge required for hateful meme detection is modeled as background relationship information between entities potentially involved in hateful content. Using prompt learning and retrieval-augmented generation with large language models (LLMs), the method extracts entity background relationships for each meme. Since LLMs can process only textual data, the domain-specific image entity extraction module is reused to convert the entities in meme images into textual entities, ensuring that critical visual elements of the memes are included. First, prompt learning is used to guide the LLM in identifying, from the meme text and the domain-specific image entities, the set of entities with potential hate relationships. Subsequently, retrieval-augmented generation is employed to retrieve relevant fragments from an entity background knowledge base and generate entity relationship information. Comparative experiments were conducted under various experimental settings on a publicly available hateful meme dataset, using the vision-language model BridgeTower-base as the baseline. The results indicate that integrating entity background relationships significantly improves the performance of the detection model compared with the baseline. Notably, after incorporating entity background relationship features, the model trained on only 50% of the training data achieves AUC scores and accuracy comparable to those of the baseline model trained on the full dataset. This demonstrates the effectiveness of the entity background relationship integration approach. Finally, based on the research content of this thesis, the management implications for online public opinion governance are summarized. Countermeasures and suggestions for different management roles to mitigate hateful memes are proposed, along with a discussion of the study's limitations and potential directions for future research. ﹀
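The zero-shot entity matching step described in contribution (1) can be pictured with a minimal sketch. It assumes that the image and each candidate entity name have already been embedded into a shared vector space (the thesis uses CLIP for this; the three-dimensional vectors below are toy stand-ins, not real CLIP outputs). The function names, the `top_k` and `min_score` parameters, and the bracketed format used to append entities to the meme text are all hypothetical; the record does not state how the thesis ranks or fuses the matched entities.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def match_entities(image_vec, entity_vecs, top_k=2, min_score=0.2):
    """Rank candidate entities by similarity to the image embedding.

    entity_vecs maps an entity name to its (assumed precomputed) text embedding.
    top_k and min_score are illustrative cut-offs, not values from the thesis.
    """
    scored = [(name, cosine(image_vec, vec)) for name, vec in entity_vecs.items()]
    kept = [pair for pair in scored if pair[1] >= min_score]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]


def augment_text(meme_text, matched):
    """Append matched entity names to the meme text (hypothetical fusion format)."""
    names = ", ".join(name for name, _ in matched)
    return f"{meme_text} [entities: {names}]" if names else meme_text


# Toy embeddings standing in for CLIP text/image features (assumption).
ENTITY_VECS = {
    "flag": [0.9, 0.1, 0.0],
    "crowd": [0.1, 0.9, 0.1],
    "animal": [0.0, 0.1, 0.9],
}
IMAGE_VEC = [0.8, 0.5, 0.05]

matched = match_entities(IMAGE_VEC, ENTITY_VECS)
augmented = augment_text("some caption", matched)
```

In this toy setup the image vector lies closest to "flag" and then "crowd", while "animal" falls below the cut-off, so only the first two names are appended to the caption before it would be passed to a downstream classifier.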
CLC number:	TP391
Holding number:	2025-009-0035
Release date:	2025-09-27
