Implementing a Predictive Autoscaler in Kubernetes Using ML Time Series Forecasting Models
Term
4th term
Education
Publication year
2025
Submitted on
2025-06-13
Pages
20
Abstract
This thesis explores how a predictive autoscaling system for Kubernetes can be implemented using time series forecasting with multiple machine learning models that are continuously trained on incoming data. The default reactive autoscaling solutions often lead to a loss of QoS during peak periods, since deployments begin to scale only once the peak is already apparent. We propose a predictive autoscaling solution that forecasts the future load of multiple individual deployments and begins scaling each deployment accordingly before the load occurs. The solution is fairly easy to deploy, requiring only a few dependencies to be present in the cluster. It continuously monitors the cluster to detect new deployments to which autoscaling can be applied. Experimental results demonstrate that the autoscaling system outperforms the Kubernetes HPA on average response time by between 14% and 20% and lowers the number of requests taking longer than one second by between 93% and 95%, while using only 3% more power and between 2% and 5% more pods. The source code used in this study is available as open source on GitHub; see Section A (p. 19).
