Creative Workflow Roundup: Veo 2 vs. Sora, Lipsync Models, and the Latest AI Tools
In This Week’s Roundup: Both Google’s Veo 2 and KLING 1.6 push the boundaries of video generation, competing with OpenAI’s Sora. Meanwhile, Google’s Whisk introduces a playful, image-driven approach to creativity, and Sync Labs raises the bar for lipsync animation workflows, now available on Fal. Instagram teases AI-powered video editing tools that could transform creator workflows. Plus, updates from Runway, ByteDance, and more!
Google’s Veo 2 vs OpenAI’s Sora: Next-Gen AI Video Creation Takes Shape
The News: Google has officially unveiled Veo 2, the next iteration of its AI video generation model, and it’s already making waves despite its limited early access. Veo 2 introduces cutting-edge features like 4K resolution, improved human movement and expressions, better temporal coherence, and advanced physics modeling. These updates give its generated videos a more polished, lifelike quality compared to earlier models.
One of Veo 2’s standout features is its ability to interpret prompts in a way that aligns with how producers and filmmakers naturally describe scenes. For example, you can prompt it to generate "a classic car chase," and it translates that into a cinematic result with highly detailed motion and textures. Other tools like Luma's Dream Machine have recently introduced similar natural-language capabilities to simplify complex prompting.
The Veo 2 model is currently accessible only via a waitlist through Google Labs VideoFX, but it’s already outperforming competitors like OpenAI’s Sora. While Imagen 3, Google’s updated image-generation model, also launched this week with improvements in composition and style versatility, Veo 2 is clearly stealing the spotlight.
Lab Notes: While Veo 2 delivers some of the best video results seen to date, it is still not broadly available. In the meantime, OpenAI’s recently launched Sora remains the more accessible option, despite the cost of watermark-free outputs and other limitations. Even so, the early buzz is clear: Veo 2 is being hailed as the better, more consistent tool.
Sora’s launch attracted plenty of hype throughout the year, but Veo 2 has now effectively overshadowed it. Many with early access to both models report that Veo 2 produces slightly better outputs, which is likely to shift momentum in Google’s favor as the rollout widens. High-profile voices like Marques Brownlee report that Veo 2 consistently produces more accurate, realistic results than Sora.
For now, I’m keeping a close eye on how Veo 2’s rollout unfolds and how it measures up in real-world production scenarios. While the tech is exciting, broader access and usability will ultimately determine whether it lives up to the hype. I’m curious to see how Google navigates this balance and how other competitors like Runway and Luma Labs will respond.
KLING AI 1.6: Another Strong Contender in AI Video Generation
The News: KLING AI has introduced KLING AI 1.6, a major update to its video generation platform, boasting improved video quality, prompt adherence, and motion realism. Early reports suggest that KLING AI 1.6 is capable of producing video outputs on par with leading models like Google’s Veo 2, offering smooth, lifelike motion even in complex actions like dancing and turning.
In addition to the 1.6 release, the KLING AI 1.5 model has also been updated to support Standard Mode and 720p resolution, broadening its usability.
Lab Notes: The improvements in KLING AI 1.6 are substantial, especially in areas like motion realism and handling dynamic actions. Smooth, believable motion—whether it’s a dancer twirling or fast-paced turning—is a challenging benchmark for AI video tools, and KLING 1.6 seems to perform well here.
With so many new models launching and competing for attention, it’s becoming increasingly difficult to keep track of what each one offers. At this point, producers and creatives might find that the only way to navigate this crowded space is to test these tools individually and determine which model fits their specific workflows and creative needs.
Whisk: A Visual Approach to AI-Driven Creativity
The News: Google Labs has unveiled Whisk, a tool designed to rethink how users engage with AI for image generation. Instead of relying on detailed text prompts, Whisk lets users drag and drop images to define elements like subject, scene, and style. Powered by Google’s Gemini and Imagen 3 models, the tool analyzes the essence of these inputs and uses them to create novel outputs.
Whisk is positioned as a tool for rapid exploration, not precise editing, giving users the freedom to experiment and refine as they go. The tool is currently in an experimental phase and available to users in the US.
Lab Notes: Whisk’s “Prompt Less, Play More” branding taps into the growing fatigue with complex prompting systems and offers a more intuitive way to explore ideas. By letting users rely on images to define subject, scene, and style, it simplifies the creative process and makes generating unique results feel playful and approachable. This approach reflects a broader trend in AI tools moving toward more natural and flexible workflows, which could resonate strongly with hobbyists or casual creators.
That said, it’s hard to see Whisk fitting into a professional workflow. Google hasn’t made it clear who the target audience for Whisk is—whether it’s professional artists and producers, casual “creators”, or just anyone curious about AI.
As tools like Whisk emerge, the anti-prompting trend could open new doors for how we approach creativity in 2025. For now, Whisk feels like a unique way to experiment, but I’m not sure about its long-term role in the AI creative landscape.
Fal Partners with Sync Labs for Realistic Lipsync Model
The News: Fal has announced a partnership with Sync Labs to offer API access and pay-as-you-go pricing for what’s being touted as the world’s best lipsyncing model. The model, available now on Fal’s platform, generates highly realistic lipsync animations directly from audio.
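For readers curious what the pay-as-you-go API side of this looks like, here is a minimal sketch using Fal’s Python client. The model ID (`fal-ai/sync-lipsync`) and the argument names (`video_url`, `audio_url`) are assumptions for illustration; check the model’s page on Fal for the exact endpoint and input schema.

```python
# Sketch: submitting a lipsync job through Fal's Python client (pip install fal-client).
# Model ID and argument names are assumptions -- verify against Fal's model page.
import os

LIPSYNC_MODEL = "fal-ai/sync-lipsync"  # assumed endpoint name

def build_request(video_url: str, audio_url: str) -> dict:
    """Assemble the arguments payload: a source video plus the audio to sync to."""
    return {"video_url": video_url, "audio_url": audio_url}

def run_lipsync(video_url: str, audio_url: str):
    """Submit the job and block until the synced video is ready.

    Requires a FAL_KEY environment variable with your Fal API credentials.
    """
    import fal_client  # imported lazily so the payload helper works without the SDK
    return fal_client.subscribe(
        LIPSYNC_MODEL,
        arguments=build_request(video_url, audio_url),
    )

if __name__ == "__main__":
    # Only attempt a real (billed) call when credentials are configured.
    if os.environ.get("FAL_KEY"):
        result = run_lipsync(
            "https://example.com/talking_head.mp4",
            "https://example.com/voiceover.wav",
        )
        print(result)
```

The pay-as-you-go model means each call like this is billed per generation rather than through a subscription, which suits occasional production use.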