Unified History Injection for Autoregressive Video Diffusion
Detection, Video & Advanced Vision DS practice problem on Onlearn.
Difficulty: hard.
Topics: Understanding Unified History Injection in Autoregressive Video Diffusion Models, History Buffer Management, Latent Concatenation, Temporal Windowing, Autoregressive Sampling, Feature Map Alignment, Computer Vision, Deep Learning, Generative Modeling, Sequence Modeling, Tensor Operations, Video Diffusion Models, Autoregressive Transformers, Temporal Consistency, Latent Space Representation, Attention Mechanisms.
Implement a class 'HistoryInjector' that simulates the logic of unified history injection. It should accept a history buffer (a list of latents) and a current frame, and return a single concatenated tensor suitable as input to an autoregressive transformer. Assume each latent has shape (C, H, W). If the history holds more frames than the maximum window size, keep only the most recent frames.
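One possible starting point is sketched below in NumPy. The problem statement does not fix several details, so the sketch makes assumptions that may differ from the intended solution: the history list is ordered oldest first, the window size counts history frames only (the current frame is always kept), and "concatenation" means stacking along a new leading time axis to give shape (T, C, H, W). The method name `inject` and the parameter name `max_window` are hypothetical.

```python
import numpy as np


class HistoryInjector:
    """Sketch: combine recent history latents with the current frame.

    Assumptions (not given in the problem statement):
    - history is ordered oldest first;
    - max_window limits history frames only, not the current frame;
    - frames are stacked on a new leading time axis -> (T, C, H, W).
    """

    def __init__(self, max_window=4):
        if max_window < 0:
            raise ValueError("max_window must be non-negative")
        self.max_window = max_window

    def inject(self, history, current):
        current = np.asarray(current)
        if current.ndim != 3:
            raise ValueError("current frame must have shape (C, H, W)")

        # Keep only the most recent frames. The explicit zero check is
        # needed because a slice of [-0:] would return the whole list.
        if self.max_window > 0:
            recent = list(history)[-self.max_window:]
        else:
            recent = []

        frames = [np.asarray(f) for f in recent]
        for f in frames:
            if f.shape != current.shape:
                raise ValueError("all latents must share the shape (C, H, W)")

        # Stack history first, current frame last, along the time axis.
        return np.stack(frames + [current], axis=0)
```

For example, with `max_window=2`, five history latents of shape (2, 3, 3), and one current frame, `inject` returns an array of shape (3, 2, 3, 3): the two newest history frames followed by the current frame. If the target model expects frames joined along the channel axis instead, `np.concatenate(frames + [current], axis=0)` would give shape (T*C, H, W); which layout is wanted depends on the grader.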