Robust Crowdsourcing and Federated Learning under Poisoning Attacks

Tahmasebian, Farnaz (Spring 2021)

Permanent URL: https://etd.library.emory.edu/concern/etds/hm50ts935?locale=pt-BR
Published

Abstract

Crowd-based computing distributes tasks among many individuals or organizations, who complete them using their own intelligent or computing devices. Two prominent classes of crowd-based computing are crowdsourcing and federated learning: the first is crowd-based data collection, and the second is crowd-based model learning. Crowdsourcing is a paradigm that provides a cost-effective way to obtain services or data from a large group of users, and it is increasingly used for data collection in domains such as image annotation and real-time traffic reporting. Although crowdsourcing is cost-effective, its openness makes it an easy target for manipulation: adversaries can assemble large numbers of users to artificially boost support for organizations, products, or even opinions. Choosing an aggregation method that withstands such attacks is therefore one of the main challenges in developing an effective crowdsourcing system. Similarly, the original aggregation algorithm in federated learning is susceptible to data poisoning attacks, and the framework's dynamic behavior, in which clients are chosen randomly in each iteration, poses further challenges for implementing robust aggregation. In this dissertation, we devise strategies that improve the system's robustness under data poisoning attacks when workers intentionally or strategically misbehave.
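The abstract's observation that plain federated averaging is vulnerable to poisoned updates can be illustrated with a minimal sketch. This is not the dissertation's FARel method; it contrasts naive mean aggregation with a generic Byzantine-robust baseline (coordinate-wise median) on hypothetical client updates:

```python
import numpy as np

# Three honest client updates and one poisoned update (hypothetical values).
updates = np.array([
    [0.10, -0.20],
    [0.12, -0.18],
    [0.11, -0.22],
    [9.00,  5.00],   # malicious client submits an extreme update
])

# Plain federated averaging: a single poisoned update drags the mean far
# away from the honest clients' consensus.
fedavg = updates.mean(axis=0)

# A simple Byzantine-robust alternative: aggregate each coordinate by its
# median, which a minority of outliers cannot move arbitrarily.
robust = np.median(updates, axis=0)

print("FedAvg:", fedavg)   # [2.3325, 1.10]  -- skewed by the attacker
print("Median:", robust)   # [0.115, -0.19]  -- close to the honest updates
```

The sketch shows only the aggregation step; in a real federated round the server would also sample clients and apply the aggregated update to the global model.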

Table of Contents

1 Introduction
  1.1 Motivation
    1.1.1 Crowd-based data collection: Crowdsourcing
    1.1.2 Crowd-based model learning: Federated Learning
  1.2 Research Contributions
    1.2.1 Evaluation of Truth Inference Methods under Data Poisoning Attack (Chapter 3)
    1.2.2 Enhanced Truth Inference Method under Data Poisoning Attack (Chapter 4)
    1.2.3 Robust Federated Learning under Data Poisoning Attack (Chapter 5)

2 Related Works
  2.1 Crowdsourcing
    2.1.1 Truth Inference Methods
    2.1.2 Data Poisoning Attacks in Crowdsourcing
    2.1.3 Matrix Completion
  2.2 Federated Learning
    2.2.1 Aggregation Methods in Federated Learning
    2.2.2 Adversarial Attacks on Federated Learning
    2.2.3 Byzantine-Robust Federated Learning

3 Evaluation of Truth Inference Methods under Data Poisoning Attack
  3.1 Problem Definition
  3.2 Truth Inference Methods
    3.2.1 Direct Computation
    3.2.2 Optimization Based Methods
    3.2.3 Probabilistic Graphical Models
    3.2.4 Neural Networks and Other Methods
  3.3 Proposed Approach
    3.3.1 Attack Methods
    3.3.2 Selected Truth Inference Methods
    3.3.3 Metrics
  3.4 Evaluation Results
    3.4.1 Heuristic-Based Attacks (HeurAtt): Untargeted
    3.4.2 Heuristic-Based Attacks (HeurAtt): Targeted
    3.4.3 Optimization Based Attacks (OptAtt): Targeted

4 Enhanced Truth Inference Method under Data Poisoning Attack
  4.1 Problem Definition
    4.1.1 Problem Definition
    4.1.2 Attack Setup
  4.2 Defense Methodology
    4.2.1 Boundary Task based Data Augmentation
    4.2.2 Augmentation Phase
    4.2.3 Enhanced Inference Method
  4.3 Experiments and Results
    4.3.1 Experiment Setup
    4.3.2 Experiment Results

5 Robust Federated Learning under Data Poisoning Attack
  5.1 Problem Formulation
    5.1.1 Federated Learning
    5.1.2 Adversarial Model
  5.2 Proposed Robust Model Aggregation
    5.2.1 Truth Inference Method
    5.2.2 Robust Aggregation Method: FARel
    5.2.3 Reduce Effect of Malicious Clients: FARel adapt
    5.2.4 Further Improving the Defense Capability: FARel hist
  5.3 Evaluation
    5.3.1 Experiment Settings
    5.3.2 Experiment Results

6 Conclusions and Future Work
  6.1 Summary
  6.2 Future Work
    6.2.1 Extending robustness in crowdsourcing under data poisoning attack
    6.2.2 Extending robustness in federated learning setting

Bibliography

About this Dissertation

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
Language
  • English
