About InfiniteTalk
InfiniteTalk is an audio-driven video generation framework focused on dubbing and portrait animation. It aligns lip shapes, head motion, and expressions with speech while preserving identity over long durations. InfiniteTalk supports both image-to-video and video-to-video, enabling you to start from a single image or adapt an existing clip.
What is InfiniteTalk?
InfiniteTalk is a sparse-frame video dubbing method. Rather than conditioning motion only at the mouth, it uses the audio signal to influence head turns, posture, and facial changes. The method emphasizes temporal stability, so the output stays coherent over minutes of footage rather than only a short clip.
Key Features
- Audio-driven control of lip shapes and upper-body motion.
- Long-duration generation with segment overlap for continuity (see the scheduling sketch after this list).
- Works with a single image or a source video.
- Tunable guidance scales and sampling steps for balancing lip-sync timing and stability.
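
The segment overlap mentioned above can be thought of as a sliding window: each new segment reuses the tail frames of the previous one as motion context, which is what keeps identity and motion continuous across boundaries. Below is a minimal Python sketch of such segment scheduling; the segment length, overlap size, and function name are illustrative assumptions, not values or APIs taken from InfiniteTalk.

```python
# Illustrative only: segment scheduling with overlap for long-duration generation.
# The frame counts and overlap below are assumptions, not InfiniteTalk defaults.

def plan_segments(total_frames: int, seg_len: int = 81, overlap: int = 16):
    """Split a long clip into overlapping segments.

    Each segment reuses the last `overlap` frames of the previous one as
    motion context, so identity and motion stay continuous across
    segment boundaries.
    """
    segments = []
    start = 0
    while start < total_frames:
        end = min(start + seg_len, total_frames)
        segments.append((start, end))
        if end == total_frames:
            break
        start = end - overlap  # step back so adjacent segments share context frames
    return segments


if __name__ == "__main__":
    # ~30 s of video at 25 fps; printed boundaries show the shared context windows.
    for seg in plan_segments(total_frames=750):
        print(seg)
```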
How to Use InfiniteTalk
- Pick your mode: image-to-video for avatars, or video-to-video for dubbing.
- Start at 480p with a moderate number of sampling steps and preview a short segment first.
- Adjust the audio/text guidance if lip motion lags the audio; then render the full piece with overlap between segments, as sketched below.
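
The preview-then-render workflow above can be captured in a small settings sketch. The `generate` entry point and every parameter name here (`resolution`, `sample_steps`, `audio_cfg`, `text_cfg`, `seg_overlap`) are hypothetical placeholders for illustration; check the official InfiniteTalk documentation for the real flags and defaults.

```python
# Hypothetical settings sketch for the preview-then-render workflow above.
# `generate` and every parameter name here are assumptions for illustration,
# not InfiniteTalk's actual API.
from dataclasses import dataclass, replace


@dataclass
class RunConfig:
    resolution: str = "480p"   # preview at 480p, re-render at higher quality later
    sample_steps: int = 20     # moderate steps for fast iteration
    audio_cfg: float = 4.0     # raise if lip motion lags the audio
    text_cfg: float = 5.0      # prompt adherence / overall stability
    seg_overlap: int = 16      # context frames shared between segments


def generate(audio_path: str, cfg: RunConfig, preview_seconds: float | None = None):
    """Placeholder for the actual generation call."""
    ...


preview_cfg = RunConfig()
generate("speech.wav", preview_cfg, preview_seconds=5.0)  # short preview first

# If lip-sync trails the audio in the preview, nudge audio guidance up,
# then render the full clip with segment overlap still enabled.
final_cfg = replace(preview_cfg, audio_cfg=5.0, sample_steps=40)
generate("speech.wav", final_cfg)
```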
Note: This is an unofficial about page for InfiniteTalk, created for educational purposes.