LiveEdit: Towards Real-Time Diffusion-Based Streaming Video Editing

Episode 1987 Published 1 day, 9 hours ago
Description

🤗 Upvotes: 72 | cs.CV

Authors:
Xinyu Wang, Chongbo Zhao, Fangneng Zhan, Yue Ma

Title:
LiveEdit: Towards Real-Time Diffusion-Based Streaming Video Editing

Arxiv:
http://arxiv.org/abs/2606.26740v1

Abstract:
Streaming video editing has made rapid progress, yet practical deployment is still limited by two core issues: maintaining stable backgrounds and non-edited regions over time, and achieving the low latency required for real-time interactive scenarios. Meanwhile, recent streaming video generation methods are mostly developed for synthesis and cannot be directly applied to editing due to the strict preservation requirement and region-specific control. In this work, we present a novel streaming video editing framework that performs causal, frame-by-frame editing with strong content preservation and real-time responsiveness. Our key design is a three-stage distillation pipeline that progressively transfers editing capability from a powerful bidirectional foundation model to an efficient unidirectional streaming editor, enabling stable long-horizon edits without sacrificing visual fidelity. To further support real-time deployment, we introduce an AR-oriented mask cache that reuses region-related computation across frames, substantially reducing redundant processing and accelerating inference. Finally, we establish a dedicated benchmark for streaming video editing. Extensive evaluations demonstrate that our method achieves state-of-the-art visual quality among streaming baselines while drastically boosting inference speed to 12.66 FPS, making it suitable for interactive and augmented reality applications.
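
As a rough illustration of the two ideas the abstract names (causal, frame-by-frame editing and a cache that reuses region-related work across frames), here is a toy sketch. It is not the paper's implementation: `MaskCache`, `edit_stream`, `edit_pixel`, and `frame_budget_ms` are hypothetical names, frames are plain 1-D lists of integers, and the "region-related computation" is assumed to be nothing more than the list of masked pixel indices. The only figure taken from the abstract is the reported 12.66 FPS.

```python
from typing import Callable, Iterable, Iterator, List, Optional, Tuple

Frame = List[int]        # toy 1-D "frame" of pixel intensities (assumption)
Mask = Tuple[bool, ...]  # True marks a pixel inside the edit region


class MaskCache:
    """Hypothetical cache: recompute region indices only when the mask changes."""

    def __init__(self) -> None:
        self._mask: Optional[Mask] = None
        self._indices: List[int] = []
        self.recomputations = 0  # how many times the region was rebuilt

    def region_indices(self, mask: Mask) -> List[int]:
        if mask != self._mask:
            self._mask = mask
            self._indices = [i for i, inside in enumerate(mask) if inside]
            self.recomputations += 1
        return self._indices


def edit_stream(
    frames: Iterable[Frame],
    masks: Iterable[Mask],
    edit_pixel: Callable[[int], int],
    cache: MaskCache,
) -> Iterator[Frame]:
    """Causal loop: each output is produced from the current frame only,
    without looking at future frames."""
    for frame, mask in zip(frames, masks):
        out = list(frame)  # pixels outside the mask pass through unchanged
        for i in cache.region_indices(mask):
            out[i] = edit_pixel(frame[i])
        yield out


def frame_budget_ms(fps: float) -> float:
    """Average time available per frame, in milliseconds, at a given FPS."""
    return 1000.0 / fps


# Toy usage: three frames share one mask, so the region is built once.
frames = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
mask = (False, True, True, False)
cache = MaskCache()
edited = list(edit_stream(frames, [mask] * 3, lambda p: p * 10, cache))
```

In this toy run the unmasked pixels are copied through untouched and the mask is processed only once for all three frames. At the reported 12.66 FPS, `frame_budget_ms(12.66)` works out to roughly 79 ms per frame on average.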
