Abstract

Operators from various industries have been pushing the adoption of wireless sensing nodes for industrial monitoring, and such efforts have produced sizeable condition monitoring datasets that can be used to build diagnosis algorithms capable of warning maintenance engineers of impending failure or identifying the current system health condition. However, a single operator may not have a sufficiently large fleet of systems or component units to collect enough data for developing data-driven algorithms. One potential solution to the challenge of limited representative datasets is to merge datasets from multiple operators with the same type of assets, yet directly sharing data across company borders raises privacy concerns. Federated learning (FL) has emerged as a promising way to leverage datasets from multiple operators to train a decentralized asset fault diagnosis model while maintaining data confidentiality. The performance of traditional FL algorithms, however, degrades when local clients’ datasets are heterogeneous, and such heterogeneity is particularly prevalent in fault diagnosis applications due to the high diversity of operating conditions and system configurations. To address this challenge, this paper proposes a novel clustering-based FL algorithm in which clients are clustered according to their dataset similarity. Dataset similarity between clients is estimated without explicitly sharing data by training probabilistic deep learning models and having each client evaluate the predictive uncertainty of the other clients’ models on its local dataset. Clients are then clustered for FL based on relative prediction accuracy and uncertainty. Experiments on three bearing fault datasets, two publicly available and one newly collected for this work, show that our algorithm significantly outperforms FedAvg and a cosine similarity-based algorithm by 5.1% and 30.7%, respectively, on average over the three datasets. Further, the probabilistic classification model offers the additional advantage of quantifying its predictive uncertainty, which we show it does exceptionally well.
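
To make the cross-evaluation and clustering steps more concrete, below is a minimal sketch of one way they could be implemented. It assumes each client exposes a probabilistic classifier with a `predict_proba` method (for example, obtained by averaging Monte Carlo dropout samples), uses predictive entropy as the uncertainty measure, and groups clients with hierarchical clustering; all function and variable names are illustrative assumptions, not the paper's actual API or algorithm.

```python
# Hypothetical sketch of the cross-evaluation and client-clustering steps.
# Only fitted models are exchanged between clients; raw data never leaves a client.

import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform


def predictive_entropy(probs):
    """Mean entropy of the predictive distribution -- one common uncertainty proxy."""
    return float(np.mean(-np.sum(probs * np.log(probs + 1e-12), axis=1)))


def cross_evaluate(clients):
    """Each client scores every other client's model on its own local data.

    `clients` is a list of dicts with keys "x", "y" (local data) and "model"
    (a fitted probabilistic classifier). Returns n x n matrices of accuracies
    and uncertainties, where row i holds evaluations on client i's data.
    """
    n = len(clients)
    acc = np.zeros((n, n))
    unc = np.zeros((n, n))
    for i, evaluator in enumerate(clients):          # client holding the data
        x, y = evaluator["x"], evaluator["y"]
        for j, owner in enumerate(clients):          # client owning the model
            probs = owner["model"].predict_proba(x)  # probabilistic predictions
            acc[i, j] = float(np.mean(np.argmax(probs, axis=1) == y))
            unc[i, j] = predictive_entropy(probs)
    return acc, unc


def cluster_clients(acc, unc, n_clusters=2, w=0.5):
    """Group clients whose models transfer well to each other's data.

    A symmetric distance combines (1 - accuracy) and normalized uncertainty;
    hierarchical clustering on that distance yields the FL groups, and
    federated averaging is then run separately within each group.
    """
    unc_norm = unc / (unc.max() + 1e-12)
    dist = w * (1.0 - acc) + (1.0 - w) * unc_norm
    dist = 0.5 * (dist + dist.T)                     # symmetrize pairwise distances
    np.fill_diagonal(dist, 0.0)
    labels = fcluster(linkage(squareform(dist), method="average"),
                      t=n_clusters, criterion="maxclust")
    return labels                                    # cluster id per client
```

The design choice illustrated here is that similarity is inferred purely from how well each client's model, together with its stated confidence, transfers to the other clients' local data, which preserves confidentiality because only model parameters and aggregate scores are exchanged.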

Details