OpenLUR: Off-the-shelf air pollution modeling with open features and machine learning17.06.2020
In our paper "OpenLUR: Off-the-shelf air pollution modeling with open features and machine learning" that was recently published at atmospheric environment we introduced OpenLUR, an easily and globally applicable land use regression framework for air pollution modelling.
Air pollution is a major concern in modern cities. Therefore, authorities as well as research institutions and citizen movements attempt to measure air pollution. However these measurements can only be recorded at a small number of locations throughout a city, e.g. due to the high costs of measurement equipment, and are only valid at the location they were taken. In order to obtain information about locations where no measurements were taken, researches investigate the relationship between pollutants and land use features like industrial buildings or traffic density in the vicinity. Up until now, these features were obtained from local or national authorities or commercial products, which restricts availability or makes the features expensive.
In our paper "OpenLUR: Off-the-shelf air pollution modeling with open features and machine learning" that was recently published at atmospheric environment we introduced OpenLUR, an easily and globally applicable land use regression framework, building on top of two advancements in land use regression:
Openly and globally available features: We extracted land use regression features solely from the openly as well as globally available datasource OpenStreetMap (OSM). We showed that our OSM features where able to outperform local features significantly.
Easy to use models: Previous land use regression studies mostly relied on simple linear regression or Random Forests to model the dependency of pollutants on land use features. We tested the applicability of state-of-the-art methods like automated machine learning which do not require tedious manual hyper-parameter tuning. While the differences where not significant these modern methods outperformed traditional models consistently.
To show the global applicability of OpenLUR we performed cross-learning on pollution measurements from two different cities: Zürich and London. That means we enhanced the performance of a model for a small dataset by simultaneously training on another, bigger dataset from a different city.
Thus, OpenLUR provides easy to use land use regression using openly and globally available data and state-of-the-art machine learning. Additionally, it opens the area of land use regression for very small dataset through cross-learning.