Innovate Futures @ Benji

Omnigen2 in ComfyUI Installation tutorial - Is This AI Image Model Good?

Added 2025-07-02 14:00:16 +0000 UTC

In this in-depth video, we explore the new OmniGen 2 image editing model and how it compares to other popular AI tools like Flux Kontext. The video covers a hands-on test of OmniGen 2 in ComfyUI, including text-to-image generation, instruction-guided image editing, and workflows for advanced users. It walks through the model’s architecture improvements, such as its dual-path transformer design, and shows how to set it up locally step by step. It also shares real-world testing results, performance benchmarks, and direct comparisons between OmniGen 2 and Flux Kontext in tasks like object replacement, color transformation, and style consistency. Whether you're an AI enthusiast, digital artist, or developer looking to experiment with cutting-edge models, this video provides valuable insights into what OmniGen 2 can (and cannot) do.

Who is This Content Suitable For?

This content is ideal for:

AI developers and researchers working on image generation or multimodal models.

Digital artists and designers interested in using AI tools like ComfyUI, Flux Kontext, and OmniGen 2.

Tech-savvy creators who want to understand the differences between modern in-context image editing models.

Anyone curious about the current state of AI-generated imagery and practical implementation in local environments.

Why Does This Matter?

Understanding the capabilities and limitations of emerging AI models like OmniGen 2 helps creators and developers make informed decisions when choosing tools for image editing, artistic creation, or integration into larger AI pipelines. This video highlights real-world usage scenarios, setup instructions, and performance insights that are not typically covered in official documentation, making it a valuable resource for anyone experimenting with AI-based creative workflows.

OmniGen2 ComfyUI Docs

https://docs.comfy.org/tutorials/image/omnigen/omnigen2

Steps (via ComfyUI Org):

1 - Load Main Model: Ensure the Load Diffusion Model node loads omnigen2_fp16.safetensors

2 - Load Text Encoder: Ensure the Load CLIP node loads qwen_2.5_vl_fp16.safetensors

3 - Load VAE: Ensure the Load VAE node loads ae.safetensors

4 - Set Image Dimensions: Set the generated image dimensions in the EmptySD3LatentImage node (recommended 1024x1024)

Input Prompts:

5 - Input positive prompts in the first CLIPTextEncode node (content you want to appear in the image)

6 - Input negative prompts in the second CLIPTextEncode node (content you don’t want to appear in the image)

7 - Start Generation: Click the Queue Prompt button, or use the shortcut Ctrl + Enter (Cmd + Enter on macOS) to run text-to-image generation

8 - View Results: After generation is complete, the corresponding images will be automatically saved to the ComfyUI/output/ directory, and you can also preview them in the SaveImage node
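
Before opening the workflow, it can help to confirm that the three model files named in steps 1 to 3 are actually in place, since a missing file is a common reason the loader nodes show an empty dropdown. The sketch below is only an illustration: the filenames come from the steps above, but the subfolder names (diffusion_models, text_encoders, vae) are an assumption based on the usual ComfyUI models/ layout, and the helper name missing_model_files is made up for this example.

```python
from pathlib import Path

# Filenames are taken from steps 1-3 above. The subfolder names are an
# ASSUMPTION based on the usual ComfyUI models/ layout; adjust them if
# your install uses different folders.
EXPECTED_MODELS = {
    "diffusion_models": "omnigen2_fp16.safetensors",
    "text_encoders": "qwen_2.5_vl_fp16.safetensors",
    "vae": "ae.safetensors",
}


def missing_model_files(comfy_root):
    """Return relative paths of expected model files that are not on disk.

    comfy_root is the path to the ComfyUI installation folder
    (hypothetical helper, not part of ComfyUI itself).
    """
    models_dir = Path(comfy_root) / "models"
    missing = []
    for subfolder, filename in EXPECTED_MODELS.items():
        # is_file() is False for both absent paths and directories.
        if not (models_dir / subfolder / filename).is_file():
            missing.append(f"models/{subfolder}/{filename}")
    return missing
```

Calling missing_model_files("ComfyUI") returns an empty list when everything is in place; otherwise it lists the paths that still need to be downloaded.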

The image editing workflow is attached below, in case you don't know how to get the file from the ComfyUI docs.