Quantization Quality Check via Perplexity Delta
MoE, Compression & Scaling DS practice problem on Onlearn.
Difficulty: hard.
Topics: Understanding Quantization Quality via Perplexity Delta, Perplexity Exponentiation, Negative Log-Likelihood Aggregation, Floating Point Precision Errors, Quantization Noise Floor, Calibration Set Representativeness, Information Theory, Deep Learning Optimization, Model Compression, Statistical Analysis, Numerical Precision, Language Modeling, Quantization-Aware Training, Post-Training Quantization, Loss Landscape Analysis, Calibration Datasets.
Implement a function `calculate_perplexity_delta` that computes the increase in perplexity when moving from a high-precision model (FP16) to a quantized model (INT8/INT4). Your function should take the negative log-likelihoods (NLL) of both models over a shared evaluation dataset and return the delta: Perplexity(Quantized) - Perplexity(FP16). Assume the input is a list of total NLLs for each model and the total number of tokens processed.
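One possible reading of the task, sketched in Python: perplexity is taken as exp(sum of NLLs / total tokens), and the delta is the quantized perplexity minus the FP16 perplexity. The parameter names (`fp16_nlls`, `quant_nlls`, `total_tokens`) are assumptions, not part of the problem statement, as are the choices that the NLLs are in nats (natural log), that both models were scored on the same token count, and that `math.fsum` is used to limit floating-point accumulation error.

```python
import math
from typing import Sequence


def calculate_perplexity_delta(
    fp16_nlls: Sequence[float],
    quant_nlls: Sequence[float],
    total_tokens: int,
) -> float:
    """Return Perplexity(Quantized) - Perplexity(FP16).

    Assumptions (not stated in the problem): NLLs are summed values in nats,
    and both lists cover the same `total_tokens` evaluation tokens.
    """
    if total_tokens <= 0:
        raise ValueError("total_tokens must be a positive integer")

    # fsum tracks partial sums exactly, which reduces rounding error when
    # adding many NLL values of different magnitudes.
    fp16_mean_nll = math.fsum(fp16_nlls) / total_tokens
    quant_mean_nll = math.fsum(quant_nlls) / total_tokens

    # Perplexity is the exponential of the mean per-token NLL.
    fp16_ppl = math.exp(fp16_mean_nll)
    quant_ppl = math.exp(quant_mean_nll)

    return quant_ppl - fp16_ppl
```

For example, with 10 tokens, an FP16 total NLL of `10 * math.log(2)` and a quantized total NLL of `10 * math.log(4)`, the perplexities are about 2 and 4, so the delta is about 2. Averaging before exponentiating matters: exponentiating each NLL separately and then averaging would give a different, incorrect quantity.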