Analysis of environmental data is key to most computational approaches to sustainability. However, such data exhibit complex spatial patterns at different scales, due to the combination of several spatial phenomena or various influencing factors. Moreover, they are highly dispersed geographically. There is therefore a need for approaches that scale easily with both the high volume of data coming from different sources and the computational demand of their analysis.
Federated learning has been proposed as a solution to these issues. In this approach, data are analyzed at their source (e.g., on IoT devices). This provides (1) high scalability, (2) low network overhead, (3) specialized machine learning models, and (4) protection of privacy. To account for global information, an aggregate model coexists with the local models. However, data-collecting devices are usually battery-powered and have limited storage and computational resources. Since our goal is to provide accurate environmental data analysis while respecting the hardware constraints of measurement devices, we need to explore solutions that enhance the capabilities of these devices by exploiting Cloud/Edge resources.
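The interplay between local models and the aggregate model can be illustrated with a minimal sketch of federated averaging, one common aggregation scheme; the text above does not commit to a specific algorithm, so this choice is an assumption. The function names (`local_update`, `federated_average`, `run_rounds`), the one-dimensional linear model, and the learning-rate and round settings are all hypothetical and chosen only to keep the example small.

```python
from typing import List, Sequence, Tuple

Weights = Tuple[float, float]          # (slope, intercept) of y = w * x + b
Dataset = Sequence[Tuple[float, float]]  # (x, y) pairs held by one device


def local_update(weights: Weights, data: Dataset,
                 lr: float = 0.1, epochs: int = 20) -> Weights:
    """Train on one device only: plain gradient descent on mean squared error.
    Raw data never leaves this function; only the weights are returned."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


def federated_average(local_weights: List[Weights],
                      sample_counts: List[int]) -> Weights:
    """Aggregate model: average of local weights, weighted by local sample count."""
    total = sum(sample_counts)
    w = sum(lw[0] * n for lw, n in zip(local_weights, sample_counts)) / total
    b = sum(lw[1] * n for lw, n in zip(local_weights, sample_counts)) / total
    return (w, b)


def run_rounds(device_data: List[Dataset], rounds: int = 30) -> Weights:
    """Alternate local training and aggregation for a fixed number of rounds."""
    global_weights: Weights = (0.0, 0.0)
    counts = [len(d) for d in device_data]
    for _ in range(rounds):
        # Each device starts from the current aggregate model.
        local = [local_update(global_weights, d) for d in device_data]
        global_weights = federated_average(local, counts)
    return global_weights


if __name__ == "__main__":
    # Two hypothetical sensors observing the same relation y = 2x + 1.
    device_a = [(0.0, 1.0), (0.5, 2.0), (1.0, 3.0)]
    device_b = [(0.25, 1.5), (0.75, 2.5)]
    print(run_rounds([device_a, device_b]))
```

In this sketch only model weights travel between devices and the aggregator, which is where the low network overhead and privacy properties listed above come from. The repeated calls to `local_update` are also the part that costs battery on a constrained device.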
To this end, we plan to (1) apply dynamic data reduction and compression techniques, keeping only the most relevant data at the data sources so that the existing network and storage infrastructure is used efficiently, and (2) exploit dynamic offloading of federated learning processes to the Cloud/Edge, in order to extend the battery lifetime of measurement devices.
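As a rough illustration of these two directions, the sketch below pairs a dead-band filter (one simple form of data reduction) with an energy-based offloading rule. Both are stand-ins chosen for brevity, not the techniques the project will necessarily adopt. The names `deadband_filter`, `DeviceState`, and `should_offload`, as well as the energy estimates and threshold, are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence


def deadband_filter(readings: Sequence[float], threshold: float) -> List[float]:
    """Keep a reading only if it differs from the last kept reading
    by more than `threshold` (assumed relevance criterion)."""
    kept: List[float] = []
    for value in readings:
        if not kept or abs(value - kept[-1]) > threshold:
            kept.append(value)
    return kept


@dataclass
class DeviceState:
    """Hypothetical per-round energy estimates for one measurement device."""
    local_train_energy_j: float  # estimated energy to run the training step locally
    upload_energy_j: float       # estimated energy to send the reduced data to the Edge


def should_offload(state: DeviceState) -> bool:
    """Offload the training step when transmitting costs the device
    less energy than computing locally (assumed decision rule)."""
    return state.upload_energy_j < state.local_train_energy_j


if __name__ == "__main__":
    temperatures = [20.0, 20.1, 20.05, 21.0, 21.2, 19.0]
    reduced = deadband_filter(temperatures, threshold=0.5)
    print(reduced)
    print(should_offload(DeviceState(local_train_energy_j=5.0, upload_energy_j=1.2)))
```

The two pieces interact: the fewer readings survive reduction, the cheaper the upload becomes, which shifts the offloading decision. A realistic rule would also have to weigh network availability and the privacy cost of moving data off the device.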