Abstract

As modern machine learning models reach ever higher benchmark scores, the resources needed to train them grow drastically, which has driven a shift towards distributed training. However, some participants in distributed training may act as adversaries, whether maliciously or unintentionally, and their behavior poses notable security and performance challenges. This thesis focuses primarily on improving Byzantine robustness in distributed machine learning under varying conditions: heterogeneous data, decentralized communication, and input privacy preservation. We formalize these problems and provide solutions backed by theoretical guarantees.

Beyond Byzantine robustness, we investigate alternative communication schemes for decentralized learning and methods for improving sample complexity in conditional stochastic optimization (CSO). Gossip is the predominant communication technique in decentralized learning, but it is sensitive to data heterogeneity and converges slowly. We introduce a novel relay mechanism, implemented over a spanning tree of the communication graph, which removes the dependence on data heterogeneity.

Lastly, for the CSO problem, we observe that the stochastic gradient is inherently biased because of the nested structure of the objective, and that this bias adds an overhead to the sample complexity. We improve the sample complexity by applying variance reduction and bias correction methods.
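To make the source of the CSO bias concrete, the following is a sketch of the formulation commonly used in the CSO literature. The abstract does not state it, so the notation (the outer function f_ξ, the inner function g_η, and the inner batch size m) is an assumption for illustration, not necessarily the thesis's own.

```latex
% CSO objective: an outer expectation over \xi applied to a (generally nonlinear)
% function of an inner conditional expectation over \eta given \xi.
\[
  \min_{x \in \mathbb{R}^d} \; F(x)
  := \mathbb{E}_{\xi}\Big[ f_{\xi}\big( \mathbb{E}_{\eta \mid \xi}\,[\, g_{\eta}(x, \xi) \,] \big) \Big]
\]

% Plug-in gradient estimator: draw one \xi and m inner samples \eta_1, \dots, \eta_m
% from the conditional distribution of \eta given \xi (m is an assumed batch size).
\[
  \nabla \hat{F}(x)
  = \Big( \frac{1}{m} \sum_{j=1}^{m} \nabla g_{\eta_j}(x, \xi) \Big)^{\!\top}
    \nabla f_{\xi}\Big( \frac{1}{m} \sum_{j=1}^{m} g_{\eta_j}(x, \xi) \Big)
\]

% Because \nabla f_{\xi} is nonlinear in general, the expectation cannot be moved
% inside it, so the estimator is biased for any finite m:
\[
  \mathbb{E}\big[ \nabla \hat{F}(x) \big] \neq \nabla F(x) \quad \text{in general.}
\]
```

Under this formulation, the inner sample average sits inside the nonlinear function, which is why the bias shrinks only as m grows and why reducing it costs additional samples.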
