Perplexity for Language Models

Text Representation & Classical NLP DS practice problem on Onlearn.

Difficulty: medium.

Topics: Understanding Information-Theoretic Evaluation of Language Models, Negative Log-Likelihood, Exponentiation of Entropy, Geometric Mean of Inverse Probabilities, Numerical Stability in Log-Space, Normalization over Vocabulary, Natural Language Processing, Probability Theory, Information Theory, Statistical Modeling, Evaluation Metrics, Language Modeling, Cross-Entropy, Maximum Likelihood Estimation, Tokenization, Sequence Modeling.

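Implement a function that calculates the perplexity of a language model on a test sequence. The input is a list of log probabilities, one per token, as assigned by the model. Perplexity is the exponential of the average negative log probability per token, which equals the geometric mean of the inverse token probabilities. The base of the exponentiation must match the base of the logarithms in the input.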
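
A minimal sketch of one possible solution follows. It assumes the log probabilities are natural logarithms, that an empty input should raise an error, and that a token with probability zero (log probability of negative infinity) should yield infinite perplexity. None of these choices is specified in the problem statement, and the function name perplexity is only illustrative.

```python
import math
from typing import Sequence


def perplexity(log_probs: Sequence[float]) -> float:
    """Return exp of the mean negative log probability per token.

    Assumes natural-log inputs (an assumption, not stated in the problem).
    """
    n = len(log_probs)
    if n == 0:
        # Assumed behaviour: perplexity of an empty sequence is undefined.
        raise ValueError("log_probs must contain at least one value")

    # Assumed behaviour: a zero-probability token gives infinite perplexity.
    if any(lp == float("-inf") for lp in log_probs):
        return float("inf")

    # Sum in log space instead of multiplying raw probabilities,
    # which would underflow for long sequences.
    avg_neg_log_likelihood = -math.fsum(log_probs) / n

    try:
        return math.exp(avg_neg_log_likelihood)
    except OverflowError:
        # The average is too large to exponentiate as a float.
        return float("inf")


if __name__ == "__main__":
    # A model that assigns probability 0.25 to every token behaves like a
    # uniform choice among 4 options, so this prints a value close to 4.0.
    print(perplexity([math.log(0.25)] * 5))
```

Working with the sum of log probabilities rather than the product of probabilities keeps the computation stable: the product of many small probabilities quickly rounds to zero, while their logs sum without trouble.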