Reinforcement Learning for Ramp Control: An Analysis of Learning Parameters

Chao Lu; Jie Huang; Jianwei Gong

doi:10.7307/ptt.v28i4.1830

Chao Lu, Beijing Institute of Technology
Jie Huang, University of Leeds
Jianwei Gong, Beijing Institute of Technology

Keywords: reinforcement learning, Q-learning, ramp control, agent, macroscopic traffic flow model

Abstract

Reinforcement Learning (RL) has been proposed to deal with ramp control problems under dynamic traffic conditions; however, there has been little research on the behaviour and impacts of different learning parameters. This paper describes a ramp control agent based on the RL mechanism and analyzes the influence of three learning parameters, namely the learning rate, the discount rate and the action selection parameter, on the algorithm performance. Two indices, one for learning speed and one for convergence stability, were used to measure the algorithm performance, and a series of simulation-based experiments built on these indices was designed and conducted using a macroscopic traffic flow model. Simulation results showed that the learning rate and the action selection parameter had a greater impact on the algorithm performance than the discount rate. Based on the analysis, some suggestions are provided on how to select parameter values that achieve superior performance.
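
To make the roles of the three learning parameters concrete, the following is a minimal sketch of tabular Q-learning in Python. It is not the paper's implementation: the function names, the toy table, and the use of epsilon-greedy action selection are assumptions made for illustration (the abstract does not say which action selection rule the agent uses). Here `alpha` is the learning rate, `gamma` is the discount rate, and `epsilon` stands in for the action selection parameter.

```python
import random


def q_update(q, state, action, reward, next_state, alpha, gamma):
    """One tabular Q-learning step.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    alpha: learning rate, gamma: discount rate.
    """
    best_next = max(q[next_state])
    td_error = reward + gamma * best_next - q[state][action]
    q[state][action] += alpha * td_error
    return q[state][action]


def epsilon_greedy(q, state, epsilon, rng):
    """Explore with probability epsilon, otherwise take the greedy action.

    epsilon is an assumed stand-in for the action selection parameter.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q[state]))
    values = q[state]
    return max(range(len(values)), key=values.__getitem__)


# Hypothetical toy table: 2 states x 2 actions (not taken from the paper).
q_table = [[0.0, 0.0], [0.0, 1.0]]
rng = random.Random(0)

new_value = q_update(q_table, state=0, action=1, reward=-1.0,
                     next_state=1, alpha=0.5, gamma=0.8)
greedy_action = epsilon_greedy(q_table, state=1, epsilon=0.0, rng=rng)
```

In this sketch a larger `alpha` moves each estimate further towards the newest sample, a larger `gamma` gives more weight to the estimated value of the next state, and a larger `epsilon` makes the agent explore more often.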

Author Biographies

Chao Lu, Beijing Institute of Technology

School of Mechanical Engineering

Jie Huang, University of Leeds

Institute for Transport Studies

Jianwei Gong, Beijing Institute of Technology

School of Mechanical Engineering
