For AI artists who demand precision and modularity, ComfyUI represents the pinnacle of control over the Stable Diffusion ecosystem. Its node-based, visual workflow allows for complex, repeatable, and easily shareable generation processes. When it comes to creating video, combining the AnimateDiff framework with ComfyUI is a match made in heaven. This AnimateDiff ComfyUI workflow gives you granular control over every aspect of your animation, from the motion module to advanced techniques like prompt travel and ControlNet integration.
This comprehensive tutorial will guide you through every step required to get AnimateDiff running flawlessly in ComfyUI. We'll start with the installation, build a complete text-to-video workflow from scratch, and then explore advanced options that will elevate your animations. If you're serious about creating high-quality AI video, mastering the AnimateDiff ComfyUI workflow is an essential skill. This is how you move from simple GIFs to cinematic sequences.
Phase 1: Installation and Setup
Getting your ComfyUI environment ready for AnimateDiff is a two-step process: installing the necessary custom nodes and downloading the motion models that power the animation. Let's get this foundational work done.
Install the AnimateDiff-Evolved Node
The ComfyUI community has made this step incredibly simple. The recommended package is ComfyUI-AnimateDiff-Evolved by Kosinkadink, which is the most feature-rich and well-maintained option.
- First, ensure you have the ComfyUI Manager installed. It's an essential tool for managing custom nodes.
- In your ComfyUI interface, open the Manager and click on “Install Custom Nodes.”
- Search for AnimateDiff. You will see a few options; select ComfyUI-AnimateDiff-Evolved and click “Install.”
- After the installation is complete, you must fully restart ComfyUI for the new nodes to be recognized.
Download and Place Motion Modules
The AnimateDiff workflow requires a separate motion module: a pre-trained model that supplies the motion between frames. You need to download at least one and place it in the correct folder.
- You can find official motion modules on Hugging Face or Civitai. A great one to start with is mm_sd_v15_v2.ckpt. Another popular option is v3_sd15_mm.ckpt.
- Download the motion module file.
- Place the downloaded .ckpt or .safetensors file into the following directory: ComfyUI/custom_nodes/ComfyUI-AnimateDiff-Evolved/models/.
- If you download other related models, like Motion LoRAs, they go into the motion_lora folder inside the ComfyUI-AnimateDiff-Evolved directory, next to the models folder rather than inside it.
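Before restarting ComfyUI, it can help to confirm that the motion module actually landed in the folder named above. The following is a minimal sketch, assuming ComfyUI is installed in a folder called ComfyUI relative to where you run the script; the function name find_motion_modules is made up for this example and is not part of ComfyUI or AnimateDiff-Evolved.

```python
from pathlib import Path

# File extensions the tutorial mentions for motion modules.
MOTION_EXTS = {".ckpt", ".safetensors"}


def find_motion_modules(comfy_root):
    """Return the sorted names of motion module files in the node's models folder.

    comfy_root is the path to your ComfyUI install (an assumption; adjust it).
    """
    models_dir = (
        Path(comfy_root) / "custom_nodes" / "ComfyUI-AnimateDiff-Evolved" / "models"
    )
    if not models_dir.is_dir():
        return []  # folder missing: the custom node is probably not installed
    return sorted(
        p.name
        for p in models_dir.iterdir()
        if p.is_file() and p.suffix.lower() in MOTION_EXTS
    )


if __name__ == "__main__":
    found = find_motion_modules("ComfyUI")
    print(found if found else "No motion modules found; check the folder path.")
```

If the list comes back empty, the file is most likely in the wrong directory or has an unexpected extension.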
Once the custom node is installed and your motion module is in place, you're ready to build your first AnimateDiff ComfyUI workflow.
Phase 2: Building a Core txt2video Workflow
Now for the fun part. We will build a complete AnimateDiff ComfyUI workflow node by node. This structure will become the foundation for all your future AI video projects.
Step 1: Load Checkpoint
Every ComfyUI workflow starts with a base model. Add a Load Checkpoint node and select your favorite Stable Diffusion 1.5-based model; the motion modules named above are trained for SD 1.5, so the checkpoint needs to match. This checkpoint defines the artistic style of your video.
Step 2: CLIP Text Encode (Prompts)
Next, we need to tell the model what to draw. Add two CLIP Text Encode nodes. One will be for your positive prompt (what you want to see) and the other for your negative prompt (what you want to avoid). Connect the CLIP output from the Load Checkpoint node to the CLIP input of both text encoders.
Positive Prompt Example: "cinematic shot of a majestic wolf howling at the moon, beautiful northern lights, hyperdetailed, artstation"
Negative Prompt Example: "blurry, low quality, cartoon, watermark, text"
Step 3: AnimateDiff Loader
Here's where the magic begins. Add an AnimateDiff Loader node. This is the central control panel for the motion aspect of the AnimateDiff ComfyUI workflow.
- In the model_name field, select the motion module you downloaded (e.g., mm_sd_v15_v2.ckpt).
- The context_options input lets you control how many frames are processed at once to maintain consistency, which is crucial for longer videos. For now, you can leave it at its default.
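To make the idea of processing a limited number of frames at once more concrete, here is a simplified sliding-window sketch. It is an illustration only and not the scheduling algorithm the AnimateDiff Loader actually uses; the function name and the context_length and overlap parameters are assumptions chosen for this example.

```python
def context_windows(num_frames, context_length=16, overlap=4):
    """Split frame indices into overlapping windows of at most context_length frames.

    Illustrative only: a plain sliding window, not the node's real schedule.
    """
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    if not 0 <= overlap < context_length:
        raise ValueError("overlap must be >= 0 and smaller than context_length")
    if num_frames <= context_length:
        return [list(range(num_frames))]  # short clip: one window covers everything

    stride = context_length - overlap
    windows = []
    start = 0
    while True:
        end = min(start + context_length, num_frames)
        # Shift the final window back so it still holds a full context.
        start = max(0, end - context_length)
        windows.append(list(range(start, end)))
        if end == num_frames:
            break
        start += stride
    return windows


if __name__ == "__main__":
    for w in context_windows(32, context_length=16, overlap=4):
        print(w[0], "to", w[-1])
```

Neighbouring windows share frames, which is what lets a longer clip stay coherent even though the motion module only sees a limited number of frames at a time.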
Step 4: Empty Latent Image
The diffusion process works on a latent image, not pixels directly. Add an Empty Latent Image node. This sets the dimensions of your video. A good starting resolution is 512x512. The batch_size here corresponds to the number of frames in your video. Set it to 16 for a basic animation.
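The numbers in this step can be worked through in a few lines. This sketch assumes the usual SD 1.5 latent layout (4 channels, each side downscaled by 8) and uses 8 FPS purely as an example playback rate; both function names are made up for illustration.

```python
def latent_shape(width, height, num_frames, channels=4, downscale=8):
    """Shape of the latent batch for an SD 1.5-style model: (frames, C, H/8, W/8)."""
    if width % downscale or height % downscale:
        raise ValueError("width and height should be multiples of the downscale factor")
    return (num_frames, channels, height // downscale, width // downscale)


def clip_seconds(num_frames, fps):
    """Playback length in seconds for a given frame count and frame rate."""
    return num_frames / fps


if __name__ == "__main__":
    print(latent_shape(512, 512, 16))  # (16, 4, 64, 64)
    print(clip_seconds(16, 8))         # 2.0
```

So a batch_size of 16 at 512x512 means the sampler works on sixteen 64x64 latents, and the result plays for two seconds at 8 FPS.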
Step 5: Apply AnimateDiff Model & KSampler
This is the most critical part of the ComfyUI workflow. You need a sampler node, like the standard KSampler. This node orchestrates the diffusion process.
- Connect the positive and negative prompt outputs to the positive and negative inputs on the KSampler.
- Connect the latent image output to the latent_image input on the KSampler.
- Crucially: Connect the MODEL output from the Load Checkpoint to the model input of the AnimateDiff Loader node. Then, take the MODEL output from the AnimateDiff Loader and connect it to the model input of the KSampler. This “chains” the motion module into the main model.
Step 6: VAE Decode and Video Combine
The sampler outputs a latent image. We need to decode it into pixels. Connect the LATENT output of the KSampler to a VAE Decode node. Use the VAE from your original Load Checkpoint node for this.
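Finally, connect the IMAGE output of the VAE Decode to a Video Combine node. This node comes from the ComfyUI-VideoHelperSuite custom node pack rather than from AnimateDiff-Evolved, so install that pack through the Manager if the node does not appear in your node list. It takes the sequence of generated images and compiles them into a video file. You can set the frame rate (FPS) and choose to save it as a GIF, MP4, or other formats.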
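The six steps above can also be written down as data, in the node-id to class_type/inputs layout that ComfyUI uses when you export a workflow in API format. Treat this as a sketch of the wiring, not a graph guaranteed to run: the class_type strings for the two custom nodes (ADE_AnimateDiffLoaderWithContext and VHS_VideoCombine) and their input names are assumptions, the real nodes may require additional inputs, and the sampler settings are example values. Export your own graph to see the exact names in your install.

```python
import json


def build_txt2video_graph(ckpt_name, motion_module, positive, negative,
                          frames=16, width=512, height=512, fps=8, seed=0):
    """Return the tutorial's workflow as a dict of node id -> class_type and inputs.

    A link is [source_node_id, output_index]. The checkpoint loader exposes
    MODEL, CLIP and VAE as outputs 0, 1 and 2.
    """
    return {
        "1": {"class_type": "CheckpointLoaderSimple",
              "inputs": {"ckpt_name": ckpt_name}},
        "2": {"class_type": "CLIPTextEncode",
              "inputs": {"text": positive, "clip": ["1", 1]}},
        "3": {"class_type": "CLIPTextEncode",
              "inputs": {"text": negative, "clip": ["1", 1]}},
        # ASSUMED class name and inputs for the AnimateDiff Loader node.
        "4": {"class_type": "ADE_AnimateDiffLoaderWithContext",
              "inputs": {"model": ["1", 0], "model_name": motion_module}},
        "5": {"class_type": "EmptyLatentImage",
              "inputs": {"width": width, "height": height, "batch_size": frames}},
        # The sampler takes its model from the AnimateDiff Loader, not the checkpoint.
        "6": {"class_type": "KSampler",
              "inputs": {"model": ["4", 0], "positive": ["2", 0],
                         "negative": ["3", 0], "latent_image": ["5", 0],
                         "seed": seed, "steps": 20, "cfg": 7.0,
                         "sampler_name": "euler", "scheduler": "normal",
                         "denoise": 1.0}},
        "7": {"class_type": "VAEDecode",
              "inputs": {"samples": ["6", 0], "vae": ["1", 2]}},
        # ASSUMED class name and inputs for the Video Combine node.
        "8": {"class_type": "VHS_VideoCombine",
              "inputs": {"images": ["7", 0], "frame_rate": fps}},
    }


def model_chain(graph, node_id):
    """Follow the 'model' input backwards and list the node types it passes through."""
    chain = []
    while node_id is not None:
        node = graph[node_id]
        chain.append(node["class_type"])
        link = node["inputs"].get("model")
        node_id = link[0] if isinstance(link, list) else None
    return chain


if __name__ == "__main__":
    graph = build_txt2video_graph("your_sd15_model.safetensors", "mm_sd_v15_v2.ckpt",
                                  "cinematic shot of a majestic wolf", "blurry, low quality")
    print(" <- ".join(model_chain(graph, "6")))
    print(len(json.dumps(graph)), "bytes of JSON")
```

Walking the model link backwards from the sampler is a quick way to confirm the motion module really sits between the checkpoint and the KSampler.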
Queue your prompt, and watch as your first custom AnimateDiff ComfyUI workflow comes to life! The ability to see the data flow through this graph is what makes ComfyUI so powerful for understanding and debugging the entire process.
Advanced AnimateDiff ComfyUI Techniques
Once you've mastered the basic workflow, you can explore the more advanced features that make the AnimateDiff ComfyUI combination so powerful.
- Context Options: For videos longer than 16-24 frames, the default context can lead to flickering or loss of coherence. The AnimateDiff Loader's context options (like StandardUniform) help create smooth, longer animations by processing frames in overlapping batches. This is a key setting to experiment with for longer-form video.
- Vid2Vid with ControlNet: The true power of this workflow is its modularity. You can insert ControlNet nodes into your graph to guide the animation. For example, you can feed in a video of a person dancing to a ControlNet OpenPose node to guide the AnimateDiff generation, effectively performing a style transfer onto a source motion. This opens up incredible possibilities for video-to-video tasks.
- Prompt Travel: AnimateDiff excels at morphing between concepts. Instead of a single static prompt, you can use specialized nodes to schedule prompt changes. For example, you can have a prompt for “a lion” for the first 8 frames and “a tiger” for the next 8 frames. The AnimateDiff motion module will generate a smooth transition between the two. Dive deeper in our prompting guide.
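The prompt travel idea boils down to a schedule that maps frame numbers to prompts. Here is a minimal sketch of that bookkeeping, using the lion and tiger example from the list above. The function name and the dict-based schedule format are assumptions made for illustration; actual prompt scheduling nodes define their own syntax and may blend between keyframes instead of switching abruptly.

```python
def expand_schedule(schedule, num_frames):
    """Turn {start_frame: prompt} keyframes into one prompt per frame.

    Each prompt stays active until the next keyframe begins.
    """
    if 0 not in schedule:
        raise ValueError("the schedule needs a prompt for frame 0")
    prompts = []
    current = None
    for frame in range(num_frames):
        if frame in schedule:
            current = schedule[frame]  # a new keyframe takes over from here
        prompts.append(current)
    return prompts


if __name__ == "__main__":
    per_frame = expand_schedule({0: "a lion", 8: "a tiger"}, 16)
    for i, prompt in enumerate(per_frame):
        print(i, prompt)
```

With 16 frames, frames 0 through 7 get "a lion" and frames 8 through 15 get "a tiger".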
Mastering this ComfyUI workflow transforms you from a user of AnimateDiff to an architect of AI animation. You have the keys to build complex, layered, and highly specific video generation pipelines limited only by your imagination and GPU VRAM.