GPU Ops:Byte Ratio Calculation from Spec Sheet

Infrastructure, Parallelism & Hardware Efficiency DS practice problem on Onlearn.

Difficulty: medium.

Topics: GPU Ops:Byte Ratio Calculation from Spec Sheet, Floating Point Operations Per Second (FLOPS), High Bandwidth Memory (HBM) Throughput, Operational Intensity, Memory Wall Bottleneck, Compute-to-Memory Ratio, Computer Architecture, Parallel Computing, Memory Hierarchy, Performance Engineering, AI Infrastructure, Arithmetic Intensity, Memory Bandwidth Analysis, Instruction Throughput, Roofline Modeling, Data Movement Optimization.

Task: Compute the Ops:Byte Ratio from GPU Specifications

When optimizing deep learning workloads on GPUs, one of the most fundamental hardware-awareness metrics is the ops:byte ratio (also called the ridge point or machine balance point). This ratio, derived from a GPU's spec sheet, tells you the minimum arithmetic intensity (in FLOPs per byte of memory traffic) an operation must achieve to fully utilize the GPU's compute capability.

Given a dictionary of GPU specifications containing:

compute_tflops: A dictionary mapping precision format names (e.g., "FP32", "FP16", "INT8") to peak compute throughput in teraFLOPS (or tera-operations per second)
memory_bandwidth_gbps: Peak memory bandwidth in GB/s (gigabytes per second)

Write a function gpu_ops_byte_ratio(gpu_specs) that returns a dictionary mapping each precision format to its ops:byte ratio in FLOPs per byte (or OPs per byte).

The ops:byte ratio represents the boundary between the memory-bound and compute-bound regimes for that precision on the given hardware. Operations with arithmetic intensity below this ratio will be memory-bound; those above it will be compute-bound. All ratio values should be rounded to 2 decimal places.
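One possible approach is sketched below; it is not an official reference solution. It assumes the dictionary keys are spelled compute_tflops and memory_bandwidth_gbps, that GB/s means 10^9 bytes per second, and that the bandwidth is non-zero. The spec values in example_specs are hypothetical and do not describe any real GPU.

```python
def gpu_ops_byte_ratio(gpu_specs):
    """Return the ops:byte ratio (operations per byte) for each precision."""
    # Assumption: GB/s is decimal, so 1 GB/s = 1e9 bytes per second.
    bandwidth_bytes_per_s = gpu_specs["memory_bandwidth_gbps"] * 1e9

    ratios = {}
    for precision, tflops in gpu_specs["compute_tflops"].items():
        # 1 teraFLOPS (or TOPS) = 1e12 operations per second.
        ops_per_s = tflops * 1e12
        # Peak compute divided by peak bandwidth gives operations per byte.
        ratios[precision] = round(ops_per_s / bandwidth_bytes_per_s, 2)
    return ratios


# Hypothetical spec values, chosen only to make the arithmetic easy to follow.
example_specs = {
    "compute_tflops": {"FP32": 20.0, "FP16": 80.0, "INT8": 160.0},
    "memory_bandwidth_gbps": 1000.0,
}

print(gpu_ops_byte_ratio(example_specs))
# → {'FP32': 20.0, 'FP16': 80.0, 'INT8': 160.0}
```

With these made-up numbers, an FP16 operation would need more than 80 FLOPs per byte of memory traffic to be compute-bound, while an FP32 operation would need only 20. The conversion factors reduce to tflops * 1000 / bandwidth in GB/s; the sketch keeps them separate so the units stay visible.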