Pass@k and Majority Voting Evaluation Metrics

Text Generation & NLP Evaluation DS practice problem on Onlearn.

Difficulty: medium.

Topics: Understanding Pass@k and Majority Voting Evaluation Metrics, Pass@k Unbiased Estimator, Binomial Coefficients, Majority Voting Selection, Sampling Without Replacement, Frequency Distribution Analysis, Natural Language Processing, Large Language Models, Model Evaluation, Statistical Metrics, Probability and Combinatorics, Text Generation Evaluation, Code Generation Metrics, Stochastic Sampling, Self-Consistency Decoding, Unbiased Estimation.

Evaluating generative models such as LLMs requires metrics that account for the stochasticity of sampling.

1. Implement calculate_pass_at_k(n, c, k): This calculates the probability that at least one of $k$ samples, drawn at random without replacement, is correct, given $n$ total generated samples of which $c$ are correct. Use the unbiased estimator: $1 - \frac{\binom{n-c}{k}}{\binom{n}{k}}$. If $n - c < k$, there are fewer than $k$ incorrect samples, so every selection of $k$ samples contains at least one correct one and the probability is 1.0.

2. Implement majority_voting(answers): This takes a list of generated answers (strings or numbers) and returns the most frequent answer. In case of a tie, return the tied candidate that appeared first in the list.
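The two functions described above could be sketched as follows. This is one possible solution, not a reference implementation: it assumes Python 3.8+ (for `math.comb`), assumes the inputs satisfy $0 \le c \le n$ and $1 \le k \le n$, and returns `None` for an empty answer list, a case the problem statement does not cover.

```python
from collections import Counter
from math import comb


def calculate_pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate: 1 - C(n - c, k) / C(n, k)."""
    # Fewer than k incorrect samples: any k-subset must include a correct one.
    if n - c < k:
        return 1.0
    # C(n - c, k) counts the k-subsets made up only of incorrect samples.
    return 1.0 - comb(n - c, k) / comb(n, k)


def majority_voting(answers):
    """Return the most frequent answer; ties go to the earliest occurrence."""
    if not answers:
        # Assumption: the problem statement does not define this case.
        return None
    counts = Counter(answers)
    best = max(counts.values())
    # Scan in the original order so the first tied candidate wins.
    for answer in answers:
        if counts[answer] == best:
            return answer


if __name__ == "__main__":
    print(calculate_pass_at_k(5, 2, 1))   # 1 - 3/5, approximately 0.4
    print(calculate_pass_at_k(5, 4, 2))   # 1.0, since n - c < k
    print(majority_voting(["a", "b", "b", "a", "c"]))  # a
```

Scanning the original list to break ties keeps the "appeared first" rule explicit instead of relying on the insertion order of the counter.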