Historical Context Compression Ratio

Text Representation & Classical NLP DS practice problem on Onlearn.

Difficulty: medium.

Topics: Understanding Information Theory in Text Representation, Logarithmic bit allocation, Character frequency mapping, Floating point precision, Data redundancy, Fixed-length vs Variable-length encoding, Information Theory, Data Compression, Probability Theory, Computational Linguistics, Algorithm Complexity, Shannon Entropy, Source Coding Theorem, Frequency Analysis, Bit-depth Calculation, Statistical Modeling.

Implement a function that calculates the 'Compression Ratio' of a given string. The compression ratio is defined as (original size in bits) / (compressed size in bits). Assume the original size is 8 bits per character. For the compressed size, calculate the sum of (frequency of char × bit length of its Huffman code) over each unique character. Use a simplified Huffman-style calculation where the bit length of a character is ceiling(log2(total chars / char frequency)).
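
One possible reading of the task, as a minimal Python sketch rather than a reference solution. The names `code_length` and `compression_ratio` are hypothetical. The problem statement does not say what to return for an empty string or for a string with a single distinct character (where every bit length is 0 and the compressed size is 0), so the return values chosen for those cases are assumptions.

```python
from collections import Counter


def code_length(total: int, freq: int) -> int:
    """Return ceiling(log2(total / freq)) using integer arithmetic only.

    This is the smallest b such that freq * 2**b >= total.
    """
    bits = 0
    while (freq << bits) < total:
        bits += 1
    return bits


def compression_ratio(text: str) -> float:
    total = len(text)
    if total == 0:
        return 0.0  # assumption: not specified by the problem statement

    counts = Counter(text)
    original_bits = 8 * total
    compressed_bits = sum(f * code_length(total, f) for f in counts.values())

    if compressed_bits == 0:
        # assumption: a single distinct character gives 0-bit codes,
        # so the ratio is treated as unbounded
        return float("inf")
    return original_bits / compressed_bits


print(compression_ratio("aabbbbcc"))  # 64 original bits / 12 compressed bits
```

For "aabbbbcc" the total is 8 characters: 'a' and 'c' each occur twice and get ceiling(log2(8/2)) = 2 bits, while 'b' occurs four times and gets ceiling(log2(8/4)) = 1 bit, so the compressed size is 2×2 + 4×1 + 2×2 = 12 bits against 64 original bits. The bit-shift loop is a design choice that sidesteps floating-point rounding in log2; `math.ceil(math.log2(total / freq))` would be the more direct translation of the formula.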