• Alexander Christoffer Eilertsen
4. term, Computer Science, Master (Master Programme)
When applying machine learning to cyber-physical systems, one challenge is training the models without incurring excessive cost or causing harm to the system or its surroundings while the method learns.
One approach is to use Priced Timed Markov Decision Processes over a Euclidean state space to define a formal model of such a system and to train on that model.
We attempt to use Neural Networks to find optimal strategies for such models.
We do this by implementing Deep Q-Network (DQN) in Uppaal Stratego, sweeping over possible DQN hyperparameters, selecting three candidate configurations, and testing these against the current state-of-the-art optimization algorithm in Uppaal Stratego.
Our results show that, with the right hyperparameters, DQN can find the optimal strategy for simple models in fewer runs than the current method and can find better strategies on some of the more complex models.
However, we could not find improved strategies for all models within the tested set of hyperparameter configurations.
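The sweep-and-select step described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the hyperparameter names and values in `GRID`, the helper `sweep`, and in particular `placeholder_cost`, which only stands in for training DQN with a given configuration and measuring the expected cost of the resulting strategy. It is not the procedure or the search space used in the thesis.

```python
from itertools import product

# Hypothetical search space; the actual values used in the thesis are not given here.
GRID = {
    "learning_rate": [1e-4, 1e-3, 1e-2],
    "discount": [0.9, 0.99],
    "batch_size": [32, 64],
}


def sweep(grid, evaluate, keep=3):
    """Evaluate every combination in `grid` and return the `keep` lowest-cost ones."""
    names = sorted(grid)
    configs = [
        dict(zip(names, values))
        for values in product(*(grid[name] for name in names))
    ]
    scored = [(evaluate(cfg), cfg) for cfg in configs]
    # Sort on the cost only, so that equal costs never compare the dicts.
    scored.sort(key=lambda pair: pair[0])
    return scored[:keep]


def placeholder_cost(cfg):
    # Stand-in for "train DQN with cfg, then estimate the strategy's expected cost".
    return (
        abs(cfg["learning_rate"] - 1e-3)
        + (1.0 - cfg["discount"])
        + cfg["batch_size"] / 1000.0
    )


# Three candidate configurations, ordered from lowest to highest placeholder cost.
candidates = sweep(GRID, placeholder_cost)
```

In a real experiment, `placeholder_cost` would be replaced by a function that runs training and evaluation on a model, and the three returned candidates would then be compared against the existing optimization method.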
Specialisation: Game Programming
Language: English
Publication date: 10 Jun 2021
Number of pages: 26
ID: 414378297