INT8 Quantization
MoE, Compression & Scaling DS practice problem on Onlearn.
Difficulty: hard.
Topics: Understanding INT8 Quantization in Large Language Models, Symmetric Quantization, Affine Quantization Mapping, Quantization Error (Clipping), Dequantization Recovery, Scale Factor Calculation, Numerical Analysis, Deep Learning Optimization, Computer Architecture, Information Theory, Linear Algebra, Model Compression, Weight Pruning, Fixed-point Arithmetic, Dynamic Range Calibration, Floating-point Representation.
Implement a function quantize_tensor(tensor, num_bits=8) that performs symmetric INT8 quantization on a 1D list of floats. The quantization should map the input range [-max_abs, max_abs] to [-127, 127], where max_abs is the largest absolute value in the input. Return the quantized integers and the scale factor used.
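One possible approach, given as a minimal sketch rather than the reference solution: compute the scale as max_abs / 127, divide each value by it, round, and clip to the integer range. Several details here are assumptions that the problem statement does not specify: the result is returned as a (quantized, scale) tuple, an all-zero or empty input falls back to a scale of 1.0 to avoid dividing by zero, and rounding uses Python's built-in round (which rounds exact ties to the nearest even integer). The dequantize_tensor helper is hypothetical and only illustrates how the original values are approximately recovered.

```python
from typing import List, Tuple


def quantize_tensor(tensor: List[float], num_bits: int = 8) -> Tuple[List[int], float]:
    """Symmetric quantization of a 1D list of floats (sketch)."""
    # Largest representable magnitude: 127 when num_bits is 8.
    qmax = 2 ** (num_bits - 1) - 1

    # Largest absolute value in the input; 0.0 for an empty list.
    max_abs = max((abs(x) for x in tensor), default=0.0)

    # Assumption: for all-zero (or empty) input, use scale 1.0
    # so that no division by zero occurs.
    if max_abs == 0.0:
        return [0] * len(tensor), 1.0

    # One integer step corresponds to this many float units.
    scale = max_abs / qmax

    # Round to the nearest integer, then clip to [-qmax, qmax].
    quantized = [max(-qmax, min(qmax, round(x / scale))) for x in tensor]
    return quantized, scale


def dequantize_tensor(quantized: List[int], scale: float) -> List[float]:
    """Hypothetical helper: approximate recovery of the original floats."""
    return [q * scale for q in quantized]
```

For example, quantize_tensor([127.0, -64.0, 10.0, 0.0]) gives a scale of 1.0 and the integers [127, -64, 10, 0]. In general, each dequantized value differs from the original by at most about half the scale, since the only loss comes from rounding to the nearest integer step.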