Title: | Research on Perception, Decision-making, and Intersection Control Strategies for Autonomous Driving in Complex Scenarios |
Author: | |
Student ID: | BX2002302 |
Confidentiality: | Public |
Language: | chi (Chinese) |
Discipline code: | 080204 |
Discipline: | Engineering - Mechanical Engineering - Vehicle Engineering |
Student type: | Doctoral |
Degree: | Doctor of Engineering |
Year of enrollment: | 2020 |
University: | Nanjing University of Aeronautics and Astronautics |
Department: | |
Major: | |
Research direction: | Autonomous driving |
Supervisor: | |
Supervisor affiliation: | |
Completion date: | 2025-04-04 |
Defense date: | 2025-05-24 |
Title (English): |
Research on Perception, Decision-making, and Intersection Control Strategies for Autonomous Driving in Complex Scenarios |
Keywords: | |
Keywords (English): | Autonomous Driving ; Object Detection ; Lane Line Detection ; Behavior Decision-making ; Intersection Control Strategy ; Deep Reinforcement Learning |
Abstract: |
With the accelerating pace of global urbanization and the continued growth of vehicle ownership, problems such as traffic safety and traffic efficiency have become increasingly prominent, and autonomous driving technology has gradually become a core means of solving them. This thesis focuses on visual perception and decision-making and planning technologies for complex occluded structured road scenarios, as well as cooperative control strategies for intersection scenarios with complex mixed traffic flows, and conducts systematic research on key problems including object detection, object ranging, lane line detection, behavior decision-making, and intersection control strategies. The specific research contents are as follows:

(1) To address the inability of existing object detection algorithms to handle targets with overlapping center points in complex occluded scenarios, a series of center-point-based object detection algorithms is designed. By optimizing the network topology, designing feature fusion strategies, and strengthening the overall coherence of the model, these algorithms effectively resolve the above problem while improving detection accuracy and inference speed, reaching an AP of 52.7% and an FPS of 187.3. To address the scale ambiguity problem in monocular visual ranging, a calibration-table-based object ranging algorithm is designed; compared with traditional algorithms, its longitudinal ranging error is reduced by 2.23% to 7.17% and its lateral ranging error by 1.18% to 4.59%.

(2) To address the insufficient performance of traditional lane line detection algorithms in complex occluded scenarios, a de-occlusion lane line detection algorithm based on a vision transformer architecture is proposed. An overlapping image patch partitioning and prior-information embedding mechanism is designed to strengthen the model's understanding of image context and lane line spatial features; a dual-branch decoder architecture sharing backbone network parameters is constructed to jointly optimize de-occluded image reconstruction and lane line detection; and a pixel-position-sensitive lane line segmentation loss function is proposed to improve lane line localization accuracy. Multi-scenario benchmark tests and hardware-in-the-loop simulations show that the algorithm achieves F1 scores of 97.72%, 79.50%, and 86.26% on the Tusimple, CULane, and CurveLanes public benchmark datasets, respectively, with an FPS of 83.5 in all cases, providing reliable lateral localization support for autonomous driving systems.

(3) To address the limitations of traditional behavior decision-making algorithms in complex occluded structured road scenarios, a behavior decision-making algorithm based on deep reinforcement learning is proposed. A multi-scale grid map representation is constructed to uniformly encode multi-source information such as occluded objects, traffic rules, and navigation routes; a multi-modal backbone network based on a dual attention mechanism is designed so that heterogeneous state information complements each other; and a hybrid multi-step learning strategy and a rule-based risk assessment module are proposed to improve training efficiency and simulation success rate, respectively. Simulation results show that, compared with advanced algorithms of the same type, the proposed algorithm improves the simulation success rate by 38.0% while maintaining the same traffic efficiency, effectively reducing the driving safety hazards present in complex occluded scenarios.

(4) To address the cooperative optimization challenge in intersection scenarios with complex mixed traffic flows, a pedestrian-friendly hierarchical intersection control strategy is proposed. By integrating existing connected roadside infrastructure such as surveillance cameras and traffic signals, a dynamic global pedestrian management framework is built to accurately anticipate pedestrian crossing demand. A hierarchical control hub is designed in which the upper layer uses a double delayed deep Q-network to dynamically adjust traffic signal phases in response to changes in complex traffic flows, while the lower layer cooperatively controls connected automated vehicles through a spatiotemporal reservation mechanism to ensure efficient and safe passage. Simulation results show that, in intersection scenarios, the strategy effectively shortens pedestrian waiting times and improves the traffic efficiency of the overall traffic system.

The research results of this thesis provide theoretical support and practical guidance for visual perception, behavior decision-making, and intersection control tasks for autonomous driving in complex traffic scenarios, and are of significant value. |
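As a concrete illustration of item (1), the following is a minimal sketch of calibration-table-based monocular ranging, assuming the table stores offline-measured longitudinal and lateral ground-plane distances for a grid of image pixels and is queried at the ground contact point of a detected bounding box. The class name `CalibTableRanger`, the table layout, and the bilinear interpolation are illustrative assumptions, not the algorithm as implemented in the thesis.

```python
import numpy as np

# Minimal sketch (assumptions, not the thesis implementation):
# `table` holds, for every image pixel (v, u), the longitudinal and lateral
# ground-plane distances in meters, measured offline during calibration.

class CalibTableRanger:
    def __init__(self, table: np.ndarray):
        # table shape: (H, W, 2) -> (longitudinal_m, lateral_m) per pixel
        self.table = table
        self.h, self.w = table.shape[:2]

    def range_of(self, u: float, v: float):
        """Bilinearly interpolate the calibration table at sub-pixel (u, v)."""
        u = float(np.clip(u, 0, self.w - 1))
        v = float(np.clip(v, 0, self.h - 1))
        u0, v0 = int(u), int(v)
        u1, v1 = min(u0 + 1, self.w - 1), min(v0 + 1, self.h - 1)
        du, dv = u - u0, v - v0
        top = (1 - du) * self.table[v0, u0] + du * self.table[v0, u1]
        bottom = (1 - du) * self.table[v1, u0] + du * self.table[v1, u1]
        longitudinal, lateral = (1 - dv) * top + dv * bottom
        return float(longitudinal), float(lateral)

# Usage: range the bottom-center pixel of a detected bounding box.
if __name__ == "__main__":
    table = np.random.rand(720, 1280, 2) * 50.0    # stand-in for a real calibration table
    ranger = CalibTableRanger(table)
    x1, y1, x2, y2 = 600, 300, 700, 420            # detection box in pixels
    lon, lat = ranger.range_of((x1 + x2) / 2, y2)  # ground contact point
    print(f"longitudinal {lon:.1f} m, lateral {lat:.1f} m")
```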
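Likewise, the pixel-position-sensitive lane segmentation loss of item (2) can be pictured as ordinary binary cross-entropy re-weighted by pixel position, so that near-field image rows and the sparse lane pixels contribute more. The specific weighting below, including the names `near_weight` and `lane_weight` and the linear row scaling, is an assumed illustration; the thesis does not reduce its loss to this form here.

```python
import torch
import torch.nn.functional as F

# Minimal sketch (assumptions, not the thesis loss): binary cross-entropy with
# a per-pixel weight that grows toward the bottom of the image (near field)
# and on annotated lane pixels (which are rare relative to background).

def position_sensitive_lane_loss(logits, target, near_weight=2.0, lane_weight=5.0):
    """
    logits : (B, 1, H, W) raw network output
    target : (B, 1, H, W) binary lane-line mask
    """
    h = logits.shape[-2]
    # Row-dependent weight: bottom rows matter more than rows near the horizon.
    rows = torch.linspace(1.0, near_weight, h, device=logits.device).view(1, 1, h, 1)
    # Class-dependent weight: up-weight the sparse lane-line pixels.
    pixel_weight = rows * (1.0 + (lane_weight - 1.0) * target)
    return F.binary_cross_entropy_with_logits(logits, target, weight=pixel_weight)

# Usage with dummy tensors
if __name__ == "__main__":
    logits = torch.randn(2, 1, 288, 512)
    target = (torch.rand(2, 1, 288, 512) > 0.97).float()
    print(position_sensitive_lane_loss(logits, target).item())
```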
Abstract (English): |
With the accelerated advancement of global urbanization and the continuous growth of vehicle ownership, issues such as traffic safety and travel efficiency have become increasingly prominent. Autonomous driving technology has gradually emerged as a core solution to address these challenges. This thesis focuses on visual perception and decision-making technologies in complex occluded structured road scenarios, as well as collaborative control strategies in complex mixed traffic flow intersection scenarios. Systematic research was conducted on key technologies including object detection, object ranging, lane line detection, behavior decision-making, and intersection control strategies. The specific research contents are as follows:

(1) To address the limitation of existing object detection algorithms in handling center-overlapping targets in complex occluded scenarios, a series of center-point-based object detection algorithms was designed. By optimizing network topology, designing feature fusion strategies, and enhancing model contextual coherence, the proposed algorithms effectively resolved this issue while improving detection accuracy and inference speed, achieving an AP of 52.7% and an FPS of 187.3. For the scale ambiguity problem in monocular visual distance measurement, a calibration table-based distance estimation algorithm was proposed. Compared with traditional methods, it reduced longitudinal distance errors by 2.23%–7.17% and lateral distance errors by 1.18%–4.59%.

(2) To overcome the performance limitations of traditional lane line detection algorithms in complex occluded scenarios, a de-occlusion lane line detection algorithm based on a vision transformer architecture was developed. By designing overlapping image patch division and prior information embedding mechanisms, the model's understanding of image contextual information and lane spatial features was strengthened. A dual-branch decoder architecture with shared backbone network parameters was constructed to achieve synergistic optimization of occlusion removal and lane line detection. Additionally, a pixel-position-sensitive lane line segmentation loss function was proposed to improve lane line localization precision. Multi-scenario benchmark tests and hardware-in-the-loop simulations demonstrated that the proposed algorithm achieved F1 scores of 97.72%, 79.50%, and 86.26% on the Tusimple, CULane, and CurveLanes datasets, respectively, with an FPS of 83.5, providing reliable lateral positioning support for autonomous driving systems.

(3) To address the shortcomings of traditional behavior decision-making algorithms in complex occluded structured road scenarios, a behavior decision-making algorithm based on deep reinforcement learning was proposed. A multi-scale grid map representation system was constructed to achieve unified encoding of multi-source information including occluded objects, traffic rules, and navigation routes. A dual-attention-based multi-modal backbone network was designed to enable complementary integration of heterogeneous state information. Furthermore, a hybrid multi-step learning strategy and a rule-based risk assessment module were introduced to improve training efficiency and simulation success rates, respectively. Simulation results showed that compared with state-of-the-art algorithms, the proposed method improved the simulation success rate by 38.0% while maintaining comparable travel efficiency, effectively reducing safety risks in complex occluded scenarios.
(4) To tackle the collaborative optimization challenges in complex mixed traffic flow intersection scenarios, a pedestrian-friendly hierarchical intersection control strategy was proposed. By integrating existing connected road infrastructure resources such as surveillance cameras and traffic signals, a dynamic global pedestrian control framework was established to accurately predict pedestrian crossing demands. A hierarchical control architecture was designed, where the upper layer employed a Double Delayed Deep Q-Network (DD-DQN) to dynamically adjust traffic signal phases in response to complex traffic flow variations, while the lower layer implemented a spatiotemporal reservation mechanism to coordinate connected and autonomous vehicles, ensuring efficient and safe passage. Simulation results demonstrated that the proposed strategy effectively reduced pedestrian waiting times and improved overall traffic efficiency at intersections.

The research results of this thesis provide theoretical support and practical guidance for autonomous driving tasks such as visual perception, behavior decision-making, and intersection control strategies in complex traffic scenarios, and are of significant theoretical and practical value. |
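As a rough illustration of the hybrid multi-step learning strategy mentioned in item (3) of the abstract, the sketch below blends a 1-step and an n-step bootstrapped temporal-difference target with a fixed mixing weight. The blending scheme and the names (`hybrid_multistep_target`, `lam`) are assumptions made for illustration; the formulation used in the thesis may differ.

```python
# Minimal sketch (assumption, not the thesis formulation): a TD target that
# mixes a 1-step return with an n-step return, as one way to trade off bias
# and variance when training a value-based deep reinforcement learning agent.

def hybrid_multistep_target(rewards, q_next_1, q_next_n, dones, gamma=0.99, lam=0.5):
    """
    rewards  : the n rewards r_t, ..., r_{t+n-1} along the sampled segment
    q_next_1 : max_a Q(s_{t+1}, a), bootstrap value for the 1-step target
    q_next_n : max_a Q(s_{t+n}, a), bootstrap value for the n-step target
    dones    : episode-termination flags aligned with `rewards`
    """
    # 1-step target: r_t + gamma * max_a Q(s_{t+1}, a)
    y1 = rewards[0] + gamma * q_next_1 * (1.0 - dones[0])
    # n-step target: discounted reward sum, truncated at episode termination
    yn, discount, alive = 0.0, 1.0, 1.0
    for r, d in zip(rewards, dones):
        yn += alive * discount * r
        discount *= gamma
        alive *= 1.0 - d
    yn += alive * discount * q_next_n
    # Fixed-weight blend of the two targets
    return lam * y1 + (1.0 - lam) * yn

# Usage: a 3-step segment with no termination
print(hybrid_multistep_target([1.0, 0.5, 0.0], q_next_1=2.0, q_next_n=1.5, dones=[0.0, 0.0, 0.0]))
```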
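Similarly, the lower-layer spatiotemporal reservation mechanism in item (4) can be pictured as a space-time booking table over discretized intersection cells: a connected automated vehicle may enter only if every (cell, time step) pair along its predicted path is still free. The `ReservationTable` class, the discretization, and the first-come-first-reserved policy below are illustrative assumptions rather than the control strategy described in the thesis.

```python
from typing import Iterable, Dict, Tuple

# Minimal sketch (assumptions, not the thesis strategy): the intersection
# conflict zone is discretized into grid cells and time into fixed steps; a
# vehicle reserves every (cell_x, cell_y, t) triple its trajectory will occupy.

class ReservationTable:
    def __init__(self):
        self.booked: Dict[Tuple[int, int, int], str] = {}  # (cell_x, cell_y, t) -> vehicle id

    def try_reserve(self, veh_id: str, path: Iterable[Tuple[int, int, int]]) -> bool:
        cells = list(path)
        if any(c in self.booked for c in cells):
            return False              # conflicts with an earlier reservation: wait or replan
        for c in cells:
            self.booked[c] = veh_id   # commit the whole space-time path
        return True

# Usage: two vehicles requesting partially overlapping space-time cells
if __name__ == "__main__":
    table = ReservationTable()
    print(table.try_reserve("cav_1", [(3, 4, 10), (3, 5, 11), (4, 5, 12)]))  # True
    print(table.try_reserve("cav_2", [(3, 5, 11), (3, 6, 12)]))              # False: (3, 5) at t=11 is taken
```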
CLC number: | U471.15 |
Accession number: | 2025-002-0350 |
Open access date: | 2025-12-20 |