Predicting Phosphorus Concentrations in WWTPs Using Data-Driven Methods

Student thesis: Master thesis (including HD thesis)

  • Laura Debel Hansen
4th semester, Sustainable Energy Engineering, Master (Master Programme)
Modelling wastewater treatment processes is key to improving and optimizing treatment
performance, and the task has been a topic of research for decades. However,
the problem remains a major challenge in both academia and industry, as wastewater
processes are highly nonlinear, coupled and time-varying dynamic systems involving
both physical and biochemical reactions and large time delays. As a result, the
use of data-driven system identification has increased, introducing artificial neural
networks as predictive models for the processes.
This study proposes several data-driven identification methods to predict the phosphorus
concentration at a case plant. The wastewater treatment plant (WWTP) of interest
is located in Agtrup, Denmark, and the plant uses a combination of chemical precipitation
and biological phosphorus removal.
In this study, both linear and nonlinear data-driven methods are investigated to obtain
the best model for phosphorus concentration in wastewater. Dynamic mode decomposition
with control is applied to obtain a linear model; however, the model shows poor
generalizability and is assessed as inadequate for predicting the inherently nonlinear process.
To accurately model the nonlinearities in the system, two neural network structures
are proposed: a NARX neural network and a long short-term memory (LSTM) network. Bayesian
optimization is applied to optimize the model structure, and the results show that an LSTM
structure with Bayesian-optimized hyperparameters has the best prediction performance.
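
As an illustration of the linear baseline mentioned above, the following is a minimal sketch of dynamic mode decomposition with control, assuming the model form x[k+1] ≈ A x[k] + B u[k] and a plain least-squares fit without rank truncation. The function names `fit_dmdc` and `simulate` and the array layout are hypothetical and are not taken from the thesis.

```python
import numpy as np


def fit_dmdc(states, inputs):
    """Fit x[k+1] ~ A x[k] + B u[k] by least squares (hypothetical helper).

    states: array of shape (n_states, n_samples)
    inputs: array of shape (n_inputs, n_samples)
    """
    n_states = states.shape[0]
    x_now = states[:, :-1]    # snapshots x[0] .. x[m-2]
    x_next = states[:, 1:]    # snapshots x[1] .. x[m-1]
    u_now = inputs[:, :-1]    # inputs aligned with x_now
    # Stack states and inputs, then solve x_next = [A B] @ omega.
    omega = np.vstack([x_now, u_now])
    a_b = x_next @ np.linalg.pinv(omega)
    return a_b[:, :n_states], a_b[:, n_states:]


def simulate(a_matrix, b_matrix, x0, inputs):
    """Roll the fitted linear model forward from the initial state x0."""
    x = np.asarray(x0, dtype=float)
    trajectory = [x]
    for k in range(inputs.shape[1] - 1):
        x = a_matrix @ x + b_matrix @ inputs[:, k]
        trajectory.append(x)
    return np.stack(trajectory, axis=1)
```

A fit of this kind recovers a truly linear system exactly, which is also why it can generalize poorly when the underlying process is nonlinear, as reported above.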
The obtained models are compared based on several statistical measures, including temporal
evaluations, ensuring that the model dynamics reflect the dynamics of the actual process.
The best model is concluded to be an LSTM with 25 inputs, 2 hidden LSTM layers with
93 units in each and an output layer with a single unit. When validated on new data, the
best model shows strong performance estimating the phosphorus concentration with a
low MSE of 0.0848 and R² = 0.42.
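
For readers unfamiliar with the two reported measures, the sketch below shows how they are commonly computed, assuming MSE denotes the mean squared error and R² the coefficient of determination in its standard form. The exact definitions used in the thesis are not stated here, and the function names are hypothetical.

```python
import numpy as np


def mean_squared_error(y_true, y_pred):
    """Average of the squared prediction errors."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    return float(1.0 - ss_res / ss_tot)
```

Under these definitions, MSE is expressed in the squared units of the measured concentration, while R² is the fraction of the variance in the measurements that the predictions account for.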
Publication date: 28 May 2021
Number of pages: 77
External collaborator: Krüger A/S
Aviaja Anna Hansen
ID: 413089833