Autoregressive Video Chunk FPS Calculator
Detection, Video & Advanced Vision DS practice problem on Onlearn.
Difficulty: medium.
Topics: Autoregressive Video Chunk FPS Calculator, Frames Per Second (FPS), Latency Estimation, Chunking Strategy, Recursive State Update, Throughput Bottleneck Analysis, Computer Vision, Time Series Analysis, Software Engineering, Performance Optimization, Digital Signal Processing, Video Frame Processing, Autoregressive Modeling, Computational Complexity, Buffer Management, Temporal Data Synchronization.
Modern real-time video generation models such as Helios operate autoregressively: they produce video one chunk at a time, where each chunk is a fixed number of frames. Within each chunk, the model runs a diffusion denoising loop for a fixed number of steps. Understanding the relationship between chunk size, denoising steps, per-step latency, and context-encoding overhead is essential for predicting whether a model can meet a real-time FPS target.

Write a function compute_video_generation_fps that calculates the end-to-end throughput of an autoregressive video generation pipeline.

The function should accept the following parameters:
  num_chunks (int): Total number of video chunks to generate
  chunk_frames (int): Number of frames produced per chunk
  denoising_steps (int): Number of diffusion denoising steps run per chunk
  time_per_step_ms (float): Wall-clock time in milliseconds for a single denoising step
  context_encoding_ms (float, default 0.0): Additional overhead in milliseconds per chunk for encoding historical context (compressed history tokens, VAE encoding, etc.)
  realtime_fps_threshold (float, default 24.0): The FPS value at or above which generation is considered real time

The function should return a dictionary with the following keys:
  total_frames (int): Total number of frames generated across all chunks
  total_time_ms (float): Total wall-clock generation time in milliseconds
  total_time_s (float): Total time in seconds, rounded to 4 decimal places
  fps (float): Frames per second, rounded to 2 decimal places
  time_per_chunk_ms (float): Time spent generating a single chunk in milliseconds
  is_realtime (bool): True if fps >= realtime_fps_threshold
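One way the specification above could be satisfied is sketched below. It is a minimal sketch rather than a reference solution, and it rests on several assumptions the problem statement does not spell out: the identifiers are written with underscores, the per-chunk time is taken to be denoising_steps * time_per_step_ms + context_encoding_ms, the context-encoding overhead is charged to every chunk including the first, is_realtime is evaluated on the rounded fps value, and fps is reported as 0.0 when the total time is zero.

```python
from typing import Dict, Union

Number = Union[int, float, bool]


def compute_video_generation_fps(
    num_chunks: int,
    chunk_frames: int,
    denoising_steps: int,
    time_per_step_ms: float,
    context_encoding_ms: float = 0.0,
    realtime_fps_threshold: float = 24.0,
) -> Dict[str, Number]:
    """Estimate end-to-end throughput of a chunked autoregressive pipeline."""
    # Assumption: each chunk costs its denoising loop plus one context encoding.
    time_per_chunk_ms = denoising_steps * time_per_step_ms + context_encoding_ms

    # Chunks are generated sequentially, so totals scale linearly with num_chunks.
    total_frames = num_chunks * chunk_frames
    total_time_ms = num_chunks * time_per_chunk_ms
    total_time_s = total_time_ms / 1000.0

    # Assumption: report 0.0 FPS when no time elapses, to avoid dividing by zero.
    fps = round(total_frames / total_time_s, 2) if total_time_s > 0 else 0.0

    return {
        "total_frames": total_frames,
        "total_time_ms": float(total_time_ms),
        "total_time_s": round(total_time_s, 4),
        "fps": fps,
        "time_per_chunk_ms": float(time_per_chunk_ms),
        # Assumption: the threshold is compared against the rounded FPS value.
        "is_realtime": fps >= realtime_fps_threshold,
    }


if __name__ == "__main__":
    # 10 chunks of 33 frames, 4 steps at 50 ms each, 20 ms context overhead:
    # 220 ms per chunk, 2200 ms total, 330 frames.
    result = compute_video_generation_fps(10, 33, 4, 50.0, context_encoding_ms=20.0)
    print(result)
```

Under these assumptions the number of chunks cancels out of the FPS value: throughput depends only on chunk_frames divided by the per-chunk time, so num_chunks affects the totals but not whether the pipeline is real time.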