Google Gemini 2.0 Flash Adds Conversational Image Editing, Topaz AI Upscaler Gets Pay-As-You-Go, Wan 2.1 Pro Fine-Tuned on FAL, and OpenAI Teases New Writing Model
Creative Workflow Roundup: No fluff, no sponsors, no affiliate links, just this week's key AI + creative tech news and my unfiltered lab notes.
In This Week’s Roundup: Google’s Gemini 2.0 Flash brings interactive editing to creative workflows, and Topaz AI’s Upscaler, once a high-cost standalone product, is now available pay-as-you-go across multiple platforms. FAL’s Wan 2.1 Pro pushes video quality further, OpenAI’s new writing model shows promising literary capabilities, and image-to-video options are expanding with Veo 2 on Krea and Freepik. And much more!
Google Gemini 2.0 Flash: Interactive Image Editing with Conversational Control
The News: Google has released Gemini 2.0 Flash Experimental with image generation through Google AI Studio, offering a new way to create and edit images using natural language prompts. The model supports conversational control, allowing users to make adjustments on the fly without needing complex UI controls.
Gemini 2.0 Flash can handle a variety of creative tasks, from generating consistent characters to editing poses and facial expressions. Some users are even using it to create game assets and storyboard scenes, taking advantage of its contextual editing capabilities.
Lab Notes: This feels like Imagen 3 with a chat interface, but it is clear that Google is leaning into making creative workflows more conversational. Being able to edit and refine images in a more natural way is a big advantage. This could set a trend for how creative tools evolve: less focus on rigid interfaces and more on fluid human-AI dialogue. I think we might see more models adopt this kind of interactive editing approach, making it easier to iterate without losing the creative flow.
Baidu’s ERNIE 4.5 and ERNIE X1: Lower-Cost AI for Creative Writing
The News: Baidu launched two new AI models (ERNIE 4.5 and ERNIE X1) focused on writing, ideation, and reasoning. ERNIE 4.5 is a multimodal model with improved text generation and better hallucination control, potentially making it useful for long-form creative work. ERNIE X1 is designed for deep reasoning but still works well for structured writing tasks.
The biggest shift here is pricing. These models are significantly cheaper than competitors like DeepSeek R1 and OpenAI’s GPT models. However, access is limited for now: a Chinese phone number is currently required for sign-up, making it difficult for global users to try them. Baidu plans to open-source ERNIE 4.5 by June 30, 2025, which could help with availability.
Lab Notes: AI for creatives is getting cheaper. While there's been a lot of talk about AI costs going up, especially for video models, text generation is going in the opposite direction. DeepSeek R1 was already a huge price drop, and now ERNIE X1 is even cheaper. It’s not widely available outside China yet, but it shows where things are headed. More affordable, high-quality AI models are on the way.
Topaz AI Upscaler Now Pay-As-You-Go on Multiple Platforms
The News: Topaz AI Video Upscaler, long considered one of the best video enhancement tools, is now available on a pay-as-you-go basis across multiple platforms, including FAL, Freepik AI Suite, and Channel 42. Previously, users had to buy Topaz as a standalone product with a high upfront cost. Now, it’s integrated into different services, making high-quality upscaling more accessible.
Users can now upscale low-resolution or AI-generated footage to HD, 4K, and even 16K, improving sharpness, frame clarity, and overall video quality.
Lab Notes: Getting access to Topaz upscaling without a big upfront cost was not an option before, so this makes it way more flexible to use. Other upscalers exist, but Topaz has been the industry standard for years, and now it’s available on more platforms. Inputs still matter a lot: better starting footage gives better results. But the models keep improving.
Wan 2.1 Pro: A Fine-Tuned Upgrade on FAL
The News: FAL (pay-as-you-go media gen platform) introduced Wan 2.1 Pro, a fine-tuned version of Wan 2.1 that generates 6-second, 1080p, 30FPS videos. Unlike the standard Wan 2.1, which is a relatively affordable model, the Pro version offers higher-quality outputs but comes at a higher price. This version is exclusive to FAL and is not expected to be available on other platforms.
Lab Notes: Wan 2.1 Pro is a step up in quality, but some outputs still have an animated look. That could just be a matter of refining the prompts. Since FAL put in the effort to fine-tune this version, it makes sense that it is staying exclusive. For those looking for a budget-friendly option, the standard Wan 2.1 is still a solid, affordable model.
On a related note, Replicate (another pay-as-you-go media gen platform) has made the standard Wan 2.1 faster and cut prices by 40 to 50 percent. While this does not include the fine-tuned improvements of the Pro version, the speed boost and price drop make the base model more accessible. Both options have their place, depending on the budget and quality needs.
New AI Model Updates on FAL: Easel AI, Wan Effects, and Pika 2.2
The News: FAL has rolled out several new models and features that expand creative possibilities across multiple workflows. The most notable additions include:
Easel AI: Advanced face-swapping with impressive precision and creativity. It supports single-player and multiplayer modes while preserving aesthetic integrity.
Wan Effects: A set of creative transformation effects like squish, muscle, inflate, crush, rotate, gun-shooting, deflate, and cakeify.
Pika 2.2: An image-to-video and text-to-video model from Pika AI with high-resolution 1080p generations.
Lab Notes: There is a lot to experiment with here, and it is exciting to see so many new options in one place. This just shows how fast the creative tool landscape is evolving. It feels like every week there is something new to play with, and it is a good time to experiment with all these different models to find the right fit for each workflow.
OpenAI Teases New Creative Writing Model
The News: Sam Altman, CEO of OpenAI, hinted at a new creative writing model on X, saying it is the first time he has been genuinely struck by something written by AI. He described it as capturing the vibe of metafiction in a way that felt surprisingly authentic. No details yet on the model’s release or accessibility, but the fact that Altman is highlighting it suggests they are aiming to push boundaries in literary AI.
Altman even shared a short story generated by the model that leaned into themes of AI and grief, using a metafictional style to blur the line between storytelling and self-awareness.
Lab Notes: This is interesting because it feels like OpenAI is trying to make a point about AI’s potential in literary writing. GPT-4.5 was already a big upgrade in terms of creative output, and now they are hinting at something even more advanced. It sounds like they are moving towards models that specialize in specific creative tasks rather than just being general-purpose.
Right now, it is just a teaser, and we do not have access to it yet. For practical use, Claude 3.7 Sonnet or GPT-4.5 is still the best option available, especially now that GPT-4.5 is available for ChatGPT Plus users at $20 a month. It will be interesting to see how this new model compares when it eventually rolls out.
Veo 2 on Krea and Freepik: Image-to-Video Expands
The News: Google’s powerful video generation model Veo 2 now supports image-to-video, available on Krea, FAL, and Freepik, making it more accessible for creative professionals. Veo 2 image-to-video allows producers to take a single image and generate high-quality video outputs.
The model is priced at $0.50 per second, which compares favorably with Veo 2’s initial text-to-video launch pricing. Despite these updates, broader availability through Google’s own services is still pending.
Lab Notes: The fact that Veo 2 image-to-video is rolling out on Krea and Freepik before Google’s own platforms is a bit surprising. Still, it is good to see it becoming more widely available. The pricing seems to have settled at a more reasonable rate, so that helps with planning for larger projects. I was expecting a bigger rollout directly through Google AI Studio or Google Labs VideoFX, but we will see if that happens later. For now, it is just nice to have more access to Veo 2 image-to-video.
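For project planning, the per-second rate can be turned into a quick budget estimate. The sketch below is only a rough calculator, assuming the $0.50 per second rate applies uniformly to every generated second and that each take is billed in full. The function name, the takes_per_clip parameter, and the example shot counts are hypothetical and do not come from any platform's API or pricing page.

```python
from decimal import Decimal

# Rate quoted in this roundup for Veo 2 image-to-video.
# Assumption: billed per generated second; actual platform billing may differ.
VEO2_RATE_PER_SECOND = Decimal("0.50")


def estimate_cost(clip_seconds, num_clips, takes_per_clip=1,
                  rate=VEO2_RATE_PER_SECOND):
    """Estimate total spend in dollars for a batch of generated clips.

    clip_seconds   -- length of each clip in seconds
    num_clips      -- number of distinct shots needed
    takes_per_clip -- generations per shot (hypothetical allowance for retries)
    """
    if clip_seconds < 0 or num_clips < 0 or takes_per_clip < 1:
        raise ValueError("durations and counts must be non-negative, takes >= 1")
    total_seconds = Decimal(str(clip_seconds)) * num_clips * takes_per_clip
    return total_seconds * rate


if __name__ == "__main__":
    # Example: 12 shots of 5 seconds each, 3 takes per shot.
    print(f"Estimated spend: ${estimate_cost(5, 12, takes_per_clip=3)}")
```

Retakes are the multiplier that matters most here: generating three takes per shot triples the spend, so it is worth budgeting for them up front.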