End-to-End Latency Decomposition

Data Pipelines, Monitoring & Reliability DS practice problem on Onlearn.

Difficulty: medium.

Topics: End-to-End Latency Decomposition, P99 Latency Percentiles, Critical Path Analysis, Context Propagation, Head-of-Line Blocking, Serialization Overhead, Distributed Systems Architecture, Performance Engineering, Observability and Telemetry, Network Infrastructure, Asynchronous Processing, Request Tracing, Queueing Theory, Serialization Protocols, Resource Contention Analysis, Load Balancing Strategies.

Implement a function decompose latency that takes latency measurements from multiple stages of an ML inference pipeline and produces a comprehensive latency decomposition report. In production ML systems, understanding where time is spent across the inference pipeline (preprocessing, model forward pass, postprocessing, etc.) is critical for optimization. Given per request timing data for each stage, your function should compute end to end statistics, per stage statistics, identify the bottleneck, and determine each stage's percentage contribution. Input: stage latencies: A dictionary mapping stage names (strings) to numpy arrays of latency measurements in milliseconds. All arrays have the same length (one measurement per request). percentiles: A list of integers representing which percentiles to compute (e.g., [50, 95, 99]). Output: A dictionary with the following keys: 'e2e mean': Mean end to end latency (sum of all stages per request, then averaged), rounded to 2 decimal places. 'e2e percentiles': A dict mapping each requested percentile to its value computed over per request end to end latencies, rounded to 2 decimal places. 'stage stats': A dict mapping each stage name to a sub dict containing 'mean' (rounded to 2), 'std' (population std, rounded to 2), and 'percentiles' (dict of percentile values, rounded to 2). 'bottleneck': The name of the stage with the highest mean latency. 'stage pct': A dict mapping each stage name to its percentage contribution to the mean end to end latency, rounded to 2 decimal places.