Data-driven Performance Modeling in Complex Networked Systems

Restricted; Files Only

Zhang, Yazhuo (Spring 2024)

Permanent URL: https://etd.library.emory.edu/concern/etds/8623j0070?locale=de
Published

Abstract

Complex networked systems are ubiquitous in daily life, but they require active maintenance and management. Their reliability and efficiency hinge on operators being able to reason about capacity planning and bottleneck identification, and to make informed decisions about scaling, load balancing, and system optimization. However, performance modeling for complex networked systems is difficult for several reasons: the complexity of the components themselves, the intricate dependencies between them, and the enormous configuration space that must be navigated.

This dissertation focuses on how to conduct data-driven performance modeling in complex networked systems, guided by two main principles. The first principle is to decompose the entire system into key components that have interpretable interactions. The second principle is to leverage empirical system data to inform the modeling process and to guide the decision making.

Using these two key principles, we conduct data-driven performance modeling in three distinct types of general complex systems: microservice-based applications, large-scale web cache systems, and content delivery networks (CDNs). We present LatenSeer, a data-driven modeling framework for estimating end-to-end latency distributions in microservice-based web applications. By leveraging distributed tracing data, LatenSeer models the latency experienced by end users at scale, in an effective, accurate, and robust manner. Next, we discuss Sieve, a cache eviction primitive inspired by real-world web cache workloads and informed by data on cache item access patterns. Sieve is simpler than LRU and provides better efficiency and scalability than state-of-the-art algorithms. Finally, we introduce Theodon, a framework for modeling CDN architectures via modular simulations, enabling fast discovery of efficient architectures and parameter configurations that balance performance and cost.

Table of Contents

1 Introduction

1.1 Contributions

1.2 Thesis Organization

2 Background

2.1 Microservice-based Applications

2.1.1 Complicated Dependencies and Interactions

2.1.2 The Need for Modeling Latency

2.1.3 The Proliferation of Distributed Tracing

2.1.4 Challenges of Using Distributed Traces to Estimate Latency

2.2 Web Caching Systems

2.2.1 Increasing Complexity in Cache Eviction Policies

2.2.2 Open-sourced Cache Workloads

2.2.3 Opportunity: Lazy Promotion and Quick Demotion

2.3 Content Delivery Networks

2.3.1 CDN Architectures

2.3.2 Cost of CDN

2.3.3 The Need for Performance and Cost Modeling

3 LatenSeer: Causal Modeling of End-to-End Latency Distributions by Harnessing Distributed Tracing

3.1 Introduction

3.2 Design and Implementation

3.2.1 Modeling Latency with Invocation Graph

3.2.2 Aggregating Traces with L-tree

3.2.3 End-to-end Latency Estimation

3.2.4 Making LatenSeer Practical for Production Workloads

3.3 Evaluation

3.3.1 Experimental Setup

3.3.2 Estimation Accuracy

3.3.3 Case Studies

3.3.4 Sensitivity Analysis

3.3.5 LatenSeer Performance

3.4 Discussion

3.5 Related Work

3.6 Conclusion

4 SIEVE: an Efficient Turn-Key Eviction Algorithm for Web Caches

4.1 Introduction

4.2 Design and Implementation

4.2.1 SIEVE Design

4.2.2 Implementation

4.3 Evaluation

4.3.1 Experimental Setup

4.3.2 Efficiency Results

4.3.3 Throughput Performance

4.3.4 Simplicity

4.4 Distilling SIEVE’s Effectiveness

4.4.1 Visualizing the Sifting Process

4.4.2 Analyzing the Sifting process

4.4.3 Deeper Study with Synthetic Workloads

4.5 SIEVE as a Turn-key Cache Primitive

4.5.1 Eviction Algorithm Designs

4.5.2 Efficient Cache Primitives

4.5.3 Turn-key Cache Eviction with SIEVE

4.6 Discussion

4.6.1 Byte Miss Ratio

4.6.2 SIEVE is Not Scan-resistant

4.6.3 TTL-friendliness

4.7 Conclusion

5 Theodon: A Modular Framework for CDN Optimization

5.1 Introduction

5.2 Motivation

5.2.1 CDN Examples

5.2.2 Challenges

5.3 Design

5.3.1 Modular Simulation

5.3.2 Searching Engine

5.4 Implementation

5.5 Evaluation

5.5.1 Experimental Setup

5.5.2 Effectiveness

5.5.3 Case Study

5.6 Discussion

5.7 Related Work

5.8 Conclusion

6 Conclusion

6.1 Future Directions

Bibliography

About this Dissertation

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
School
Department
Degree
Submission
Language
  • English
Research Field
Keyword
Committee Chair / Thesis Advisor
Committee Members
Last modified

Preview image embargoed

Primary PDF

Supplemental Files