Assessment of a measurement-based machine learning technique for air quality and urban climate modelling

Thesis by Abdul Samad
University of Stuttgart, 2024


Air pollution is a matter of grave concern in the modern world. Due to rapid commercial, social, and economic expansion, pollutant concentrations in different parts of the world continue to increase and disrupt human life. Monitoring pollutant levels is therefore of primary importance for keeping concentrations in check, and regular monitoring enables the authorities to take appropriate measures when pollution is high. However, monitoring is not straightforward: it requires installing monitoring stations to collect the relevant pollutant data, which entails high installation, maintenance, and personnel costs to run the stations and evaluate the data. It is therefore essential to examine alternative options and compare them with the established techniques. In this research, continuous monitoring stations, mobile measurements, vertical profile measurements, and low-cost sensor measurements were used to collect air quality data from various sources. These datasets were pre-processed and integrated to generate a comprehensive dataset. The data obtained from the different measurement techniques were then compared with the urban climate model simulation results and with pollutant concentrations predicted using machine learning modelling.

PM2.5, PM10, and NO2 concentrations were predicted at two sites in Stuttgart (Marienplatz and Am Neckartor) using machine learning methods. The pollutants were measured by monitoring stations at these locations. Five machine learning models were adopted for this study: ridge regressor, support vector regressor, random forest, extra trees regressor, and extreme gradient boosting. Meteorological parameters, traffic data, and pollutant concentrations from nearby monitoring stations from January 1, 2018 to March 31, 2022 were used as input data to model the pollutants. The results showed that pollutant information from the nearby stations has a significant effect on the prediction of pollutant concentrations. It was further investigated whether a similar methodology can be applied to estimate pollutant concentrations at other locations. The procedure was tested on data from the monitoring station Karlsruhe-Nordwest in Karlsruhe, another German city approx. 80 km from Stuttgart. The results demonstrated that the method is applicable in other areas as well.
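The model comparison described above can be illustrated with a minimal sketch. It is not the thesis code: the data are synthetic, the feature names (`wind`, `traffic`, `nearby`) and all hyperparameters are assumptions chosen for illustration, and scikit-learn's `GradientBoostingRegressor` is swapped in for extreme gradient boosting so that the example needs only NumPy and scikit-learn.

```python
import numpy as np
from sklearn.ensemble import (
    ExtraTreesRegressor,
    GradientBoostingRegressor,
    RandomForestRegressor,
)
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)
n = 400

# Synthetic stand-ins for the three input groups named in the abstract
# (meteorology, traffic, nearby-station concentration). Not real data.
wind = rng.uniform(0.5, 6.0, n)        # hypothetical wind speed, m/s
traffic = rng.uniform(200, 3000, n)    # hypothetical vehicle count per hour
nearby = rng.uniform(10, 80, n)        # hypothetical nearby-station NO2, ug/m3

# Assumed relationship, invented purely so the models have something to learn.
target = 0.8 * nearby + 0.005 * traffic - 2.0 * wind + rng.normal(0, 3, n)
X = np.column_stack([wind, traffic, nearby])

# A random split keeps the sketch short; real time series would usually
# call for a chronological split instead.
X_train, X_test, y_train, y_test = train_test_split(
    X, target, test_size=0.25, random_state=0
)

models = {
    "ridge": make_pipeline(StandardScaler(), Ridge(alpha=1.0)),
    "svr": make_pipeline(StandardScaler(), SVR(C=100.0)),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
    "extra_trees": ExtraTreesRegressor(n_estimators=100, random_state=0),
    # Stand-in for extreme gradient boosting (XGBoost), see lead-in.
    "gradient_boosting": GradientBoostingRegressor(random_state=0),
}

# Fit each model and record its coefficient of determination on held-out data.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = r2_score(y_test, model.predict(X_test))

for name, score in scores.items():
    print(f"{name}: R2 = {score:.3f}")
```

Dropping the `nearby` column from `X` and rerunning gives a rough feel for why nearby-station concentrations matter as a predictor, although with synthetic data this only reflects the assumed relationship and not the thesis results.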

The urban climate model PALM-4U was applied to obtain the pollutant distribution along the federal highways B14 and B27 in Stuttgart, and the results were compared with the measurement data and the output of the machine learning model. The simulations were carried out in two areas of the city, the city center and Kaltental, with domain sizes of 3.2 km × 2 km and 3.2 km × 1.6 km, respectively, and a grid size of 10 m × 10 m × 10 m for each domain. The validation showed that PALM-4U underestimated pollutant concentrations compared to both the field measurements and the results obtained from the machine learning model.
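To give a sense of scale for the two simulation domains, the short calculation below derives the number of horizontal grid cells from the stated domain sizes and the 10 m grid spacing. The helper `horizontal_cells` is a hypothetical name introduced here; the vertical extent of the domains is not given in the abstract, so only the horizontal cell count is computed.

```python
def horizontal_cells(length_km, width_km, cell_m=10.0):
    """Return (nx, ny, nx * ny) for a rectangular domain on a uniform grid."""
    nx = round(length_km * 1000 / cell_m)  # cells along the longer side
    ny = round(width_km * 1000 / cell_m)   # cells along the shorter side
    return nx, ny, nx * ny


# Domain sizes as stated in the abstract; 10 m horizontal grid spacing.
city_center = horizontal_cells(3.2, 2.0)
kaltental = horizontal_cells(3.2, 1.6)

print("City center:", city_center)  # (320, 200, 64000)
print("Kaltental:", kaltental)      # (320, 160, 51200)
```

Each vertical layer therefore holds 64,000 cells in the city center domain and 51,200 cells in the Kaltental domain.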

In conclusion, the results of this research indicate that machine learning techniques can be effectively applied to predict air pollutant concentrations, utilizing data from various air quality monitoring techniques. The integration of urban climate models and air quality monitoring data can enhance the accuracy and reliability of air quality predictions, providing valuable insights for the management and mitigation of air pollution.
