Auto-Regulated Traffic Signal Control in Multi-Modal Urban Networks Using Graph-Based Deep Reinforcement Learning

ABG-131317 Thesis topic
21/04/2025 Doctoral contract
LICIT laboratory (ENTPE/UGE), Lyon
Lyon - Auvergne-Rhône-Alpes - France
  • Computer Science

Topic description

Traffic Signal Control (TSC) is a cornerstone of urban traffic management, directly impacting traffic efficiency, network stability, and environmental performance [1]. Over the past decade, adaptive and intelligent TSC approaches have become essential tools for mitigating congestion. These methods adjust signal timings based on real-time traffic conditions, helping to reduce delays and improve throughput. Among these approaches, Reinforcement Learning (RL), particularly Deep Reinforcement Learning (DRL), has emerged as a promising paradigm capable of capturing complex traffic dynamics through interaction with the environment [2].

In real-world traffic networks, intersections are inherently interdependent: the conditions at one intersection are influenced by upstream inflows and downstream congestion, forming tightly coupled spatial dependencies. This complexity becomes more pronounced when multiple intersections share major traffic flows or transit routes. As such, isolated signal optimization is often insufficient. Recent work has explored Multi-Agent Reinforcement Learning (MARL) to coordinate control across multiple intersections via distributed agents. These decentralized approaches offer scalability and robustness but require careful coordination strategies to avoid myopic or conflicting decisions [3].

Challenges in Coordination and Perception

Two critical open issues remain: (i) how intersections can effectively exchange and process relevant information, and (ii) to what extent an intersection is interlinked with others [4]. In most practical deployments, controllers use data only from signalized intersections and ignore the impact of non-signalized nodes (e.g., roundabouts or priority-to-the-right junctions), which are common in urban networks. These nodes can significantly affect the dynamics of nearby controlled intersections.
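To make the notion of "interlinked" intersections concrete, the sketch below collects every node within k hops of a signalized intersection, including non-signalized ones. It is only an illustration under stated assumptions: the adjacency dictionary, the node names (`A`, `R1`, `B`, `C`) and the function name `k_hop_neighbors` are hypothetical and do not come from the proposal or from reference [4].

```python
from collections import deque


def k_hop_neighbors(adjacency, start, k):
    """Return {node: hop distance} for nodes within k hops of `start`.

    `adjacency` maps each node to an iterable of directly connected nodes.
    The start node itself is excluded from the result.
    """
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == k:
            continue  # do not expand beyond the k-hop horizon
        for nxt in adjacency.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    del dist[start]
    return dist


# Hypothetical corridor: signalized A and B separated by roundabout R1.
adjacency = {
    "A": ["R1"],
    "R1": ["A", "B"],
    "B": ["R1", "C"],
    "C": ["B"],
}
print(k_hop_neighbors(adjacency, "A", 2))  # {'R1': 1, 'B': 2}
```

In this toy network, a controller at `A` that only looks at signalized neighbors would see `B` but miss the roundabout `R1` that sits between them.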

This issue can be interpreted as a partial observability problem, similar to those encountered when deploying agents in real-world scenarios. Models are therefore needed that capture heterogeneous neighborhood effects, i.e., that identify which nearby nodes influence a given intersection and integrate only the relevant information into the decision-making process [4,5].
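One common way to weight neighbors by relevance is attention-style aggregation. The sketch below is a minimal, non-learned illustration: scores are plain dot products between feature vectors, standing in for the learned scores a graph attention layer would produce. The feature values, node names and the function `aggregate_neighbors` are assumptions made for the example, not the method of [4] or [5].

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_neighbors(own, neighbors):
    """Return (weights per neighbor, weighted context vector).

    `own` is the feature vector of the controlled intersection and
    `neighbors` maps neighbor names to feature vectors of equal length.
    Dot-product scoring is a hypothetical stand-in for learned attention.
    """
    if not neighbors:
        raise ValueError("at least one neighbor is required")
    names = list(neighbors)
    scores = [sum(a * b for a, b in zip(own, neighbors[n])) for n in names]
    weights = softmax(scores)
    context = [0.0] * len(own)
    for w, n in zip(weights, names):
        for i, value in enumerate(neighbors[n]):
            context[i] += w * value
    return dict(zip(names, weights)), context


# Hypothetical features, e.g. [queue on shared approach, queue elsewhere].
weights, context = aggregate_neighbors(
    [1.0, 0.0],
    {"roundabout": [2.0, 0.0], "priority_junction": [0.0, 2.0]},
)
```

Here the roundabout, whose features align with those of the controlled intersection, receives most of the weight, so its state dominates the context vector passed to the agent.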

When such neighborhood information is incorporated into the agent controlling an intersection [4], the process is typically one-directional: the surrounding context enhances the agent's perception, but no truly mutual relationship aimed at cooperation is established. As a result, agents tend to keep making selfish decisions, with little consideration for the impact on their surroundings, even when their perception is augmented by local context.

Need for Dynamic Protection and Proactive Coordination

Furthermore, even with sophisticated multi-agent control, oversaturated conditions (e.g., during rush hours or major public events) can lead to gridlock and systemic collapse due to spillback effects. To address this, the concept of Perimeter Control has been proposed [6], which involves restricting vehicle inflows into high-demand areas to preserve internal flow conditions. However, most existing approaches rely on static boundaries and centralized coordination, limiting scalability, transferability, and adaptability to real-time changes.
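The idea of restricting inflows to protect a zone can be sketched with a simple proportional feedback rule: when the number of vehicles inside the zone exceeds a target, the admitted inflow is reduced. This is a deliberately simplified illustration, not the controller of [6]; the function name `metered_inflow`, the gain and all numerical values are assumptions.

```python
def metered_inflow(accumulation, target, nominal_inflow, gain,
                   min_inflow=0.0, max_inflow=None):
    """Proportional gating of the inflow admitted into a protected zone.

    accumulation   -- vehicles currently inside the zone
    target         -- desired accumulation (vehicles)
    nominal_inflow -- inflow admitted when accumulation equals target (veh/h)
    gain           -- proportional gain (1/h), a tuning assumption
    The result is clamped to [min_inflow, max_inflow].
    """
    inflow = nominal_inflow + gain * (target - accumulation)
    if max_inflow is not None:
        inflow = min(inflow, max_inflow)
    return max(inflow, min_inflow)


# Zone is 200 vehicles above target: admitted inflow drops from 500 to 400.
print(metered_inflow(1200, 1000, 500, 0.5))  # 400.0
```

A static, centralized scheme would apply such a rule to a fixed cordon; the limitation discussed above is that neither the boundary nor the rule adapts to real-time changes.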

There is a pressing need for adaptive, agent-driven perimeter protection, capable of dynamically identifying and regulating protected zones based on local observations and decentralized operations [7]. Achieving this requires developing agents with local perception and control, capable of exchanging information with neighbors to foster cooperative behaviors. This is a key step toward the emergence of self-organized, proactive traffic management strategies, particularly in the context of spatially dynamic protected networks.

Embracing Multi-Modality and Multi-Objective Optimization

Managing the multi-objectivity and multi-modality of urban traffic is also becoming increasingly essential. Urban intersections accommodate a wide variety of users, including private vehicles, freight, bicycles, pedestrians, and public transit. Buses, in particular, are sensitive to signal timing and congestion, requiring headway regularity to avoid bunching and ensure reliable service.
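Headway regularity can be quantified in several ways; one simple option is the coefficient of variation of the headways observed at a stop. The sketch below assumes arrival times in seconds and uses the hypothetical function name `headway_cv`; it is an illustrative metric, not one prescribed by the proposal.

```python
import statistics


def headway_cv(arrival_times):
    """Coefficient of variation of successive bus headways at one stop.

    `arrival_times` is a sorted list of arrival times (seconds).
    A value of 0 means perfectly regular service; larger values
    indicate irregular headways, with bunching as the extreme case.
    """
    if len(arrival_times) < 2:
        raise ValueError("need at least two arrivals to compute a headway")
    headways = [b - a for a, b in zip(arrival_times, arrival_times[1:])]
    return statistics.pstdev(headways) / statistics.mean(headways)


regular = headway_cv([0, 600, 1200, 1800])   # evenly spaced buses
bunched = headway_cv([0, 1100, 1200, 1800])  # two buses arriving close together
```

With evenly spaced arrivals the metric is zero, while the bunched schedule yields a clearly positive value, which a signal controller could penalize in its reward.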

Despite some recent progress [8], most RL-based TSC approaches still fail to model real-world bus dynamics, such as open-loop operations or heterogeneous passenger demand. Beyond multi-modality, multi-objectivity in TSC must account for a broad set of Key Performance Indicators (KPIs), including efficiency, safety, and pollutant emissions. Among these, air pollution is of paramount importance for most urban areas. One promising research avenue is to explore selective access policies based on pollution levels, i.e., dynamically creating protected areas where access is restricted depending on real-time emission data.
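A simple way to fold several KPIs into one RL reward is weighted-sum scalarization. The sketch below is one possible formulation under stated assumptions: the KPI names (`delay_s`, `conflicts`, `nox_g`), the weights and the function `scalarized_reward` are hypothetical, and other multi-objective schemes (e.g., Pareto-based ones) are equally plausible.

```python
def scalarized_reward(kpis, weights):
    """Negative weighted sum of cost-type KPIs (lower cost, higher reward).

    `kpis` maps KPI names to measured values for the last control step;
    `weights` maps the same names to non-negative importance factors.
    Every weighted KPI must be present in `kpis`.
    """
    return -sum(weights[name] * kpis[name] for name in weights)


# Hypothetical step: 30 s mean delay, 2 conflicts, 5 g of NOx.
kpis = {"delay_s": 30.0, "conflicts": 2.0, "nox_g": 5.0}
weights = {"delay_s": 1.0, "conflicts": 10.0, "nox_g": 2.0}
print(scalarized_reward(kpis, weights))  # -60.0
```

The choice of weights encodes the trade-off between efficiency, safety and emissions, so it would need to be justified or tuned rather than fixed arbitrarily as done here.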

Research Gap and Thesis Objective

To date, no existing framework simultaneously addresses all the following aspects in a coherent way:

• Multi-modal traffic control
• Dynamic perimeter protection
• Graph-based coordination
• Robustness to incomplete or heterogeneous infrastructure data

This PhD thesis proposal aims to fill this gap by developing an advanced TSC framework that leverages graph-aware, learning-based multi-agent architectures. The objective is to optimize traffic performance in multi-modal, oversaturated, and heterogeneous urban networks.

The ultimate goal is to propose a holistic and scalable urban traffic control system that improves both overall network efficiency and public transport reliability, thereby advancing current smart mobility systems.

References

[1] Wei Miao, Long Li, and Zhiwen Wang. A survey on deep reinforcement learning for traffic signal control. Proceedings of the 33rd Chinese Control and Decision Conference (CCDC), IEEE, 2021.

[2] Haiyan Zhao, Chengcheng Dong, Jian Cao, and Qingkui Chen. A survey on deep reinforcement learning approaches for traffic signal control. Engineering Applications of Artificial Intelligence, 133:108100, 2024.

[3] Chengdong Ma, Aming Li, Yali Du, Hao Dong, and Yaodong Yang. Efficient and scalable reinforcement learning for large-scale network control. Nature Machine Intelligence, 6(9):1006–1020, 2024.

[4] Jing Shang et al. Graph-based cooperation multi-agent reinforcement learning for intelligent traffic signal control. IEEE Internet of Things Journal, 2025.

[5] Yiming Bie, Yuting Ji, and Dongfang Ma. Multi-agent deep reinforcement learning collaborative traffic signal control method considering intersection heterogeneity. Transportation Research Part C: Emerging Technologies, 164:104663, 2024.

[6] Xinfeng Ru, Weiguo Xia, Xiang Fan, and Tao Sun. Nearly optimal perimeter tracking control for two urban regions with unknown dynamics. IEEE Control Systems Letters, 2024.

[7] Jiajie Yu et al. Perimeter control with heterogeneous metering rates for cordon signals: A physics-regularized multi-agent reinforcement learning approach. Transportation Research Part C: Emerging Technologies, 171:104944, 2025.

[8] Jiajie Yu, Pierre-Antoine Laharotte, Yu Han, and Ludovic Leclercq. Decentralized signal control for multi-modal traffic network: A deep reinforcement learning approach. Transportation Research Part C: Emerging Technologies, 154:104281, 2023.

 

Funding type

Doctoral contract

Funding details

Host institution and laboratory

LICIT laboratory (ENTPE/UGE), Lyon

The Transport and Traffic Engineering Laboratory (LICIT-ECO7) is a joint research unit under the supervision of Université Gustave Eiffel and ENTPE (École Nationale des Travaux Publics de l’État), and is part of the Université de Lyon and the Lyon Urban School. Located on both the ENTPE and Université Gustave Eiffel campuses in Lyon, France, LICIT-ECO7 focuses on the dynamic modeling, monitoring, and control of mobility networks, as well as the usage, aging, and control of new vehicles and components.

Research Themes:

Traffic Modeling, Simulation, and Control: Development of simulation models and tools based on physics and traffic theory to test new mobility services and control strategies.

Data-Driven Mobility Analysis and Prediction: Utilization of multi-source data, including call detail records, mobile network probe data, drone images, and GPS data, to build intelligent transport systems and understand human mobility at various spatio-temporal scales.

Optimal and Resilient Transport Offer Design: Application of complex networks theory and operations research to design optimal multi-modal sustainable transport offers, focusing on both densely and sparsely populated areas.

Energy Management and Design Optimization of Vehicle Powertrains: Development of vehicle models and algorithms to optimize control, component sizing, and architecture of innovative vehicles, considering energy efficiency, pollutant emissions, and aging of critical components. 

International Collaborations:

LICIT-ECO7 is actively involved in numerous international research projects and collaborative programs, including:

Participation in Horizon Europe, H2020, and other EU-funded initiatives.

Joint research with leading academic institutions such as TU Delft, ETH Zurich, KTH Stockholm, and Politecnico di Milano.

Partnerships with international transport agencies, urban mobility startups, and mobility research centers in North America, Asia, and the Middle East.

The laboratory contributes to global conferences like the ITS World Congress, TRB, and IEEE ITSC, aiming to produce impactful research relevant to future smart cities and sustainable transport policies.

For more information, visit the LICIT-ECO7 website.

Candidate profile

We are looking for a motivated PhD candidate with the following background:
• Academic Background: Master’s degree or engineering diploma in Computer Science, Artificial Intelligence, Applied Mathematics, or related fields.
• AI and Modeling Skills: Solid knowledge in deep learning (PyTorch or TensorFlow), with interest or initial experience in reinforcement learning; familiarity with graph neural networks (e.g., GCNs) is appreciated.
• Traffic Simulation Tools: Experience with traffic simulators (e.g., SUMO, CityFlow) is a plus.
• Research Skills: Strong analytical mindset, autonomy, scientific curiosity, and good communication skills in English.
 

06/05/2025