Machine Learning for predicting wastewater pollutants in space and time

Figure: Embedding visualization (Pan et al., 2019).

Objective:

In this MSc research project, you will use machine learning (ML) models to reproduce time series that depend on spatial features. Most ML time-series methods predict future values at locations where historical data are available. Here, you will instead focus on predicting time series from spatial features at locations that have no time-series records.

Figure: Time series captured at the block level in a small town. Adapted from Pan et al., 2019.

Description:

Nowadays, a wide variety of time series are captured by sensors placed at different locations for monitoring purposes. Unfortunately, sensors can be expensive, and fully covering a study area with them is often impossible. ML time-series forecasting typically predicts future values at specific monitored locations, sometimes taking spatial features into account. However, research is still needed on prediction at locations where no previous time-series data are available.

Can we adapt spatiotemporal ML models to predict time series at locations for which spatial features are available but no previous time series exist? We aim to predict these time series by training an ML model on other locations that have both spatial features and similar time series. The approach is innovative because such ML models could be applied to candidate sensor monitoring stations. In support of this, you will study whether the reproduced time series explain the spatiotemporal patterns of the phenomenon under study (see Augustijn & Zurita-Milla, 2013).
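To make the idea of borrowing time series from similar locations concrete, here is a minimal sketch of one possible baseline, not the method prescribed for this project. It predicts the series of an unmonitored location as the inverse-distance-weighted mean of the series at the k training locations with the most similar spatial features. The function name, the two features (households and mean occupancy per block), and all values are hypothetical. In practice the features would need to be scaled before computing distances.

```python
import math

def knn_predict_series(train_features, train_series, query_features, k=2):
    """Predict a time series for a location without records as the
    inverse-distance-weighted mean of the series at the k training
    locations whose spatial features are most similar."""
    # Euclidean distance in spatial-feature space to every training location
    dists = [math.dist(f, query_features) for f in train_features]
    # Indices of the k closest training locations
    nearest = sorted(range(len(dists)), key=dists.__getitem__)[:k]
    # Small constant avoids division by zero for identical feature vectors
    weights = [1.0 / (dists[i] + 1e-9) for i in nearest]
    total = sum(weights)
    n_steps = len(train_series[0])
    return [
        sum(w * train_series[i][t] for w, i in zip(weights, nearest)) / total
        for t in range(n_steps)
    ]

# Hypothetical toy data (assumption): one feature vector per block,
# (households, mean occupancy), and four time steps of a pollutant load.
features = [(10.0, 2.0), (12.0, 2.5), (40.0, 3.0)]
series = [
    [1.0, 3.0, 2.0, 1.0],
    [1.2, 3.4, 2.2, 1.1],
    [4.0, 9.0, 7.0, 4.0],
]

# Block with known spatial features but no time-series records
predicted = knn_predict_series(features, series, (11.0, 2.2), k=2)
```

A baseline like this gives a reference point against which more expressive spatiotemporal models can be compared.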

Machine learning techniques that link spatial features to the time-series datasets of monitoring stations are being studied as a way to model phenomena. Several algorithms exist, including Long Short-Term Memory (LSTM), SARIMAX, random forest (RF), and vector autoregression (VAR). LSTM has been applied to time-series datasets for air quality, traffic, and taxi flow prediction (Pan et al., 2019). Experimenting with different algorithms and assessing current spatiotemporal ML methods requires a better understanding of time-series forecasting capabilities from a spatiotemporal perspective.
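When comparing algorithms for locations without records, one possible evaluation design is to hold out each location in turn, fit on the remaining ones, and score the prediction against the held-out series. The sketch below is an assumption about how this could be set up, not a protocol defined by the project. The helper names and toy values are hypothetical, and the mean-series predictor is only a placeholder for a real model such as an LSTM or RF.

```python
import math

def rmse(observed, predicted):
    """Root mean squared error between two equally long series."""
    return math.sqrt(
        sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    )

def mean_series_baseline(train_features, train_series, query_features):
    """Placeholder model (assumption): ignores the spatial features and
    returns the mean of the training series at every time step."""
    n = len(train_series)
    return [sum(s[t] for s in train_series) / n
            for t in range(len(train_series[0]))]

def leave_one_location_out(features, series, predict):
    """Hold out each location in turn, predict its series from the
    remaining locations, and return one RMSE per held-out location."""
    scores = []
    for held_out in range(len(features)):
        train_f = [f for i, f in enumerate(features) if i != held_out]
        train_s = [s for i, s in enumerate(series) if i != held_out]
        pred = predict(train_f, train_s, features[held_out])
        scores.append(rmse(series[held_out], pred))
    return scores

# Hypothetical toy data (assumption): three blocks, two time steps each
features = [(10.0, 2.0), (12.0, 2.5), (40.0, 3.0)]
series = [[1.0, 3.0], [1.0, 3.0], [4.0, 9.0]]

scores = leave_one_location_out(features, series, mean_series_baseline)
```

Splitting by location rather than by time keeps the test close to the research question, since the model never sees any part of the held-out location's series.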

A synthetic time-series dataset produced with an agent-based model (ABM) will be available for this MSc topic. The dataset describes domestic wastewater in a small locality, which makes your experiments assessing different ML time-series algorithms easy to understand and interpret. The topic offers an opportunity to conduct groundbreaking research while acquiring knowledge that prepares you for the AI market, where more and more data are being captured with sensors.

References:

  • Augustijn, E. W., & Zurita-Milla, R. (2013). Self-organizing maps as an approach to exploring spatiotemporal diffusion patterns. International Journal of Health Geographics, 12(1), 60. https://doi.org/10.1186/1476-072X-12-60

  • Pan, Z., Liang, Y., Zhang, J., Yi, X., Yu, Y., & Zheng, Y. (2019). HyperST-Net: Hypernetworks for Spatio-Temporal Forecasting. www.aaai.org

Domain(s):

Study Program(s):

Researchers working on this field: