• Peter Kjær
• Samuel Alexander Vall Andersen
4th semester, Software, Master (Master's programme)
Traffic congestion in urban areas is a problem for both
the environment and the economy. One way to reduce
congestion is to optimize traffic lights. Traffic signal control is
a challenging problem because traffic flow patterns are complex.
Conventional traffic control uses pre-coded cycle plans,
which struggle to adapt to complex flow dynamics.
Reinforcement Learning allows for dynamic control but is unable
to properly capture temporal features due to the Markov property.
To solve this, recent papers propose incorporating prediction
modules into Reinforcement Learning control; however, this
approach suffers from an additional loss term and poor generalization.
To circumvent these issues, we propose Recurrent Light
(ReLight), which treats the environment as a Partially Observable
Markov Decision Process that depends on the history of
previous belief states. We exploit this dependency to capture
spatial-temporal features and use an LSTM in the DQN
network to capture important long- and short-term features through
its hidden states. To properly capture cycle phases, we propose two
sampling strategies and two training strategies. In our experiments, we
demonstrate that ReLight outperforms state-of-the-art models
on single-intersection, multi-intersection, and city-wide datasets.
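
The idea of placing an LSTM inside a Q-network can be illustrated with a minimal, dependency-free sketch. This is not the ReLight implementation: the class name `RecurrentQNetwork`, the layer sizes, the random untrained weights, and the toy observations (for example, queue lengths per approach) are all assumptions made for illustration. It only shows how a hidden state carried across time steps lets the Q-values depend on the observation history rather than on the current observation alone.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Plain matrix-vector product on nested lists.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class RecurrentQNetwork:
    """Hypothetical LSTM-based Q-network (untrained, illustration only)."""

    def __init__(self, obs_dim, hidden_dim, n_actions, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        in_dim = obs_dim + hidden_dim
        # One weight matrix and bias per LSTM gate:
        # i = input, f = forget, o = output, g = candidate cell update.
        self.W = {k: rand_matrix(hidden_dim, in_dim, rng) for k in "ifog"}
        self.b = {k: [0.0] * hidden_dim for k in "ifog"}
        # Linear head mapping the hidden state to one Q-value per action.
        self.W_q = rand_matrix(n_actions, hidden_dim, rng)

    def initial_state(self):
        zeros = [0.0] * self.hidden_dim
        return list(zeros), list(zeros)

    def step(self, obs, state):
        """Consume one observation; return (q_values, new_state)."""
        h, c = state
        x = list(obs) + h  # concatenate observation and previous hidden state
        pre = {
            k: [a + b for a, b in zip(matvec(self.W[k], x), self.b[k])]
            for k in "ifog"
        }
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        o = [sigmoid(v) for v in pre["o"]]
        g = [math.tanh(v) for v in pre["g"]]
        c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
        return matvec(self.W_q, h_new), (h_new, c_new)


def greedy_action(q_values):
    # Index of the largest Q-value (for example, the signal phase to activate).
    return max(range(len(q_values)), key=q_values.__getitem__)


# Toy usage: feed a short observation history, then act greedily.
net = RecurrentQNetwork(obs_dim=4, hidden_dim=8, n_actions=2)
state = net.initial_state()
for obs in [[3, 0, 1, 2], [4, 1, 1, 2], [5, 2, 0, 3]]:
    q, state = net.step(obs, state)
action = greedy_action(q)
```

Because the hidden state summarizes earlier observations, feeding the same final observation with a fresh state generally yields different Q-values than feeding it after the history above. In a trained agent the weights would be learned, for instance from sampled sequences in a replay buffer, which this sketch leaves out.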
Publication date: Jun. 2022
Number of pages: 13
ID: 472566945