RL Training GPU Utilization Tracker

Advanced & Deep RL DS practice problem on Onlearn.

Difficulty: easy.

Topics: RL Training GPU Utilization Tracker, Experience Replay Buffer, CUDA Kernel Latency, VRAM Allocation Profiling, Multi-Process Vectorized Envs, Compute-to-Memory Ratio, Reinforcement Learning, Distributed Computing, Hardware Architecture, Performance Engineering, Software Telemetry, Deep Q-Learning, GPU Memory Management, Parallel Environment Simulation, Asynchronous Gradient Updates, Resource Scheduling.

During reinforcement learning training, GPU utilization fluctuates significantly across the phases of the training loop. Environment stepping is typically CPU-bound (low GPU utilization), forward passes for action selection use moderate GPU resources, and gradient updates during backpropagation drive high GPU utilization. Tracking and analyzing these patterns helps identify bottlenecks in the training pipeline.

Implement a function track_gpu_utilization that analyzes GPU utilization measurements collected during an RL training run and produces summary statistics per training phase.

The function receives:

timestamps_ms: a list of N floats, sorted in ascending order, giving timestamps in milliseconds that mark interval boundaries.
gpu_util_pct: a list of N-1 floats giving the GPU utilization percentage (0 to 100) during each interval.
phase_labels: a list of N-1 strings giving the training phase active during each interval.

Interval i spans from timestamps_ms[i] to timestamps_ms[i+1], has duration equal to their difference, GPU utilization gpu_util_pct[i], and phase label phase_labels[i].

Return a dictionary with:

total_time_ms: the total time span (rounded to 2 decimal places).
avg_gpu_util_pct: the time-weighted average GPU utilization across all intervals (rounded to 2 decimal places).
phase_stats: a dictionary keyed by phase name (sorted alphabetically), where each value is a dictionary with total_time_ms, avg_util_pct, and time_fraction_pct (all rounded to 2 decimal places).
bottleneck_phase: the phase with the highest total wasted GPU time (duration multiplied by unused utilization). On ties, choose the phase whose first interval appears earliest.
gpu_idle_fraction_pct: the overall percentage of GPU capacity that went unused (rounded to 2 decimal places).
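
One possible solution is sketched below in Python. It is a sketch under stated assumptions, not a reference answer. The statement does not spell out several details, so the following are assumptions: identifiers and dictionary keys are written with underscores; "unused utilization" is taken to be 100 minus the utilization percentage; avg_util_pct within a phase is time-weighted, like the overall average; a total duration of zero yields 0.0 for every average and fraction; and bottleneck_phase is None when there are no intervals.

```python
from typing import Any, Dict, List, Optional


def track_gpu_utilization(
    timestamps_ms: List[float],
    gpu_util_pct: List[float],
    phase_labels: List[str],
) -> Dict[str, Any]:
    n_intervals = max(len(timestamps_ms) - 1, 0)
    if len(gpu_util_pct) != n_intervals or len(phase_labels) != n_intervals:
        raise ValueError("expected N timestamps and N-1 utilization values and labels")

    # Per-phase accumulators. Dicts preserve insertion order, so phases are
    # stored in order of first appearance, which the tie-break relies on.
    time_by_phase: Dict[str, float] = {}
    used_by_phase: Dict[str, float] = {}    # sum of duration * util
    wasted_by_phase: Dict[str, float] = {}  # sum of duration * (100 - util), assumed definition

    for i in range(n_intervals):
        duration = timestamps_ms[i + 1] - timestamps_ms[i]
        phase = phase_labels[i]
        util = gpu_util_pct[i]
        time_by_phase[phase] = time_by_phase.get(phase, 0.0) + duration
        used_by_phase[phase] = used_by_phase.get(phase, 0.0) + duration * util
        wasted_by_phase[phase] = wasted_by_phase.get(phase, 0.0) + duration * (100.0 - util)

    total_time = sum(time_by_phase.values())
    total_used = sum(used_by_phase.values())
    total_wasted = sum(wasted_by_phase.values())

    def ratio(numerator: float, denominator: float) -> float:
        # Assumption: a zero denominator maps to 0.0 rather than raising.
        return numerator / denominator if denominator > 0 else 0.0

    # Build phase_stats with keys in alphabetical order.
    phase_stats: Dict[str, Dict[str, float]] = {}
    for phase in sorted(time_by_phase):
        phase_time = time_by_phase[phase]
        phase_stats[phase] = {
            "total_time_ms": round(phase_time, 2),
            "avg_util_pct": round(ratio(used_by_phase[phase], phase_time), 2),
            "time_fraction_pct": round(100.0 * ratio(phase_time, total_time), 2),
        }

    # Strict ">" keeps the earliest-appearing phase when wasted time ties.
    bottleneck: Optional[str] = None
    most_wasted: Optional[float] = None
    for phase, wasted in wasted_by_phase.items():
        if most_wasted is None or wasted > most_wasted:
            bottleneck, most_wasted = phase, wasted

    return {
        "total_time_ms": round(total_time, 2),
        "avg_gpu_util_pct": round(ratio(total_used, total_time), 2),
        "phase_stats": phase_stats,
        "bottleneck_phase": bottleneck,
        "gpu_idle_fraction_pct": round(ratio(total_wasted, total_time), 2),
    }


# Hypothetical input: three intervals of 10, 20, and 30 ms.
result = track_gpu_utilization(
    [0.0, 10.0, 30.0, 60.0],
    [10.0, 50.0, 90.0],
    ["env_step", "forward", "backward"],
)
print(result["bottleneck_phase"])  # forward
```

In this made-up input, the wasted GPU time per phase is 10 * 90 = 900 for env_step, 20 * 50 = 1000 for forward, and 30 * 10 = 300 for backward, so forward is the bottleneck even though env_step has the lowest utilization. The time-weighted average is (10 * 10 + 20 * 50 + 30 * 90) / 60, which rounds to 63.33, leaving an idle fraction of 36.67.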