Meituan's LongCat team launches LongCat-Video-Avatar for long-duration talking-head video generation and multi-character scenes



Meituan's LongCat team announced LongCat-Video-Avatar as part of an update to the LongCat-Video codebase, and released the project page and the model weights on Hugging Face at the same time. Built on the LongCat-Video architecture, the model supports Audio-Text-to-Video (AT2V), Audio-Text-Image-to-Video (ATI2V), and audio-conditioned video continuation, covering single-person, multi-character, and long-duration generation.

According to public materials, LongCat-Video-Avatar focuses on long-sequence stability and more natural motion. Cross-Chunk Latent Stitching reduces quality degradation and visible seams in long video generation, and Reference Skip Attention reduces "hard copy" artifacts while preserving identity consistency. The team also proposes a decoupled guidance strategy that reduces over-reliance on the speech signal and makes silent segments less stiff. The model card cites EvalTalker as the benchmark for human evaluation and shows comparisons of naturalness and realism. However, details such as external leaderboard rankings and the number of participants are not fully disclosed on the public page, so these conclusions should be checked against the evaluation paper and reproducible experiments.
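The public materials summarized here do not give the formula behind the decoupled guidance strategy. As a rough illustration only, the sketch below shows one common way to give text and audio their own guidance scales in classifier-free guidance, so that the audio signal can be weighted independently of the text prompt. It is a generic technique, not LongCat's confirmed implementation; the function name `decoupled_guidance`, its parameters, and the flat-list representation of model outputs are all hypothetical.

```python
from typing import List, Sequence


def decoupled_guidance(
    eps_uncond: Sequence[float],      # prediction with no conditions
    eps_text: Sequence[float],        # prediction with text only
    eps_text_audio: Sequence[float],  # prediction with text and audio
    text_scale: float,
    audio_scale: float,
) -> List[float]:
    """Combine three denoiser predictions using separate guidance scales.

    Hypothetical illustration: each condition gets its own scale, so the
    audio contribution can be lowered without weakening the text prompt.
    """
    if not (len(eps_uncond) == len(eps_text) == len(eps_text_audio)):
        raise ValueError("all predictions must have the same length")
    return [
        u + text_scale * (t - u) + audio_scale * (ta - t)
        for u, t, ta in zip(eps_uncond, eps_text, eps_text_audio)
    ]


# Toy values standing in for model outputs (not real data).
uncond = [0.0, 1.0]
text_only = [1.0, 1.0]
text_audio = [3.0, 2.0]
guided = decoupled_guidance(uncond, text_only, text_audio,
                            text_scale=2.0, audio_scale=0.5)
```

With both scales set to 1.0 the function returns the fully conditioned prediction unchanged, and with equal scales it reduces to ordinary joint classifier-free guidance. Lowering `audio_scale` alone weakens only the audio term.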

FAQs

Q: What kind of model is LongCat-Video-Avatar?

A: LongCat-Video-Avatar is an audio-driven video generation model for character performance, emphasizing long-duration stability, lip sync, and identity consistency.

Q: What generation modes does LongCat-Video-Avatar, released by Meituan's LongCat team, support?

A: LongCat-Video-Avatar supports AT2V and ATI2V, as well as audio-conditioned video continuation and long-video extension.

Q: What is the difference between LongCat-Video-Avatar and InfiniteTalk?

A: According to its introduction, LongCat-Video-Avatar emphasizes more natural dynamics and more stable long-sequence performance, and uses Reference Skip Attention to reduce the "copy-paste" artifacts caused by reference image injection.

Q: What risks should developers be aware of when using LongCat-Video-Avatar?

A: Developers should pay attention to portrait and audio licensing, compliance, and content safety, and should avoid generating content that misuses a person's likeness without permission.

