Innovate Futures @ Benji

MultiTalk With Wan 2.1 - A New High-Level Talking Avatar! Still Need A Subscription Plan?

Added 2025-06-23 12:58:00 +0000 UTC

Discover MultiTalk, the next-gen talking avatar AI built on Wan 2.1 for generating conversational lip-sync videos locally! This tutorial dives into how MultiTalk leverages diffusion transformers to create smooth, natural-looking avatars, with support for single or multiple speakers, musicians, and even animated characters. Learn how to set it up in ComfyUI using the WanVideoWrapper custom node, speed up generation with the LightX2V LoRA so that only a few sampling steps are needed, and generate long videos (1-2 minutes or more) with precise lip-syncing. Perfect for content creators, animators, and AI enthusiasts looking to replace older tools like LivePortrait with a more advanced AI that runs locally.

Who is this content suitable for?

AI video creators, animators, YouTubers, podcasters, developers, and anyone interested in cutting-edge AI lip-syncing and talking avatar generation.

Why it matters:

MultiTalk avoids the robotic, masked-mouth animations of older tools (e.g., LivePortrait) by using transformer-based frame-by-frame generation, which delivers far more lifelike results. Hardware demands are high (e.g., an RTX 4090, or a GPU with 16GB+ of VRAM), but in return it enables long-form video synthesis with minimal distortion, which makes it ideal for podcasts, music videos, and dynamic storytelling.
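
If you want to check your own GPU against the 16GB VRAM figure mentioned above before installing anything, here is a minimal sketch. It assumes PyTorch with CUDA support is installed; the function names and the default threshold are illustrative and are not part of MultiTalk or the ComfyUI custom node.

```python
GIB = 1024 ** 3  # bytes per GiB


def has_enough_vram(total_bytes: int, required_gib: float = 16.0) -> bool:
    """Return True if the given VRAM size meets the assumed 16 GiB threshold."""
    return total_bytes >= required_gib * GIB


def local_gpu_vram_bytes():
    """Return total VRAM of GPU 0 in bytes, or None if it cannot be determined."""
    try:
        import torch  # assumption: PyTorch is installed
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_properties(0).total_memory


if __name__ == "__main__":
    vram = local_gpu_vram_bytes()
    if vram is None:
        print("No CUDA GPU detected (or PyTorch is not installed).")
    else:
        verdict = "OK" if has_enough_vram(vram) else "below the suggested 16 GiB"
        print(f"GPU 0 has {vram / GIB:.1f} GiB of VRAM: {verdict}")
```

Treat the result as a rough guide only: actual memory use depends on resolution, video length, and the model variants you load.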

MultiTalk

https://github.com/MeiGen-AI/MultiTalk

Custom Node

https://github.com/kijai/ComfyUI-WanVideoWrapper/tree/main

Fork Project for MultiTalk

https://github.com/MaTeZZ/ComfyUI-WanVideoWrapper-MultiTalk

AI Model

https://huggingface.co/MeiGen-AI/MeiGen-MultiTalk

A basic workflow is attached below, in case you didn't know there is one in the example folder.

Have Fun!