Creative Workflow Roundup: GPT-4.5 Creativity Boost? Claude 3.7’s Hybrid Reasoning, Alibaba Wan 2.1 AI Video Upgrades, and More
In This Week’s Roundup: OpenAI drops GPT-4.5 with improved creativity and mixed reviews (API access is expensive, though the model will be rolling out to the $20/m Plus tier soon), Claude 3.7 introduces deeper reasoning, and Pika 2.2 improves video generation. Meanwhile, BRIA AI aims to tackle copyright concerns, and Wan 2.1 proves that high-quality video models don’t have to be expensive (sorry, Google Veo 2). Add in fine-tuned video tools, 4D Gaussian splats, research tools, new diffusion LLMs, enhanced speech-to-text models, and more: there’s a lot to cover this week.
Alibaba’s Wan 2.1: High-Quality Open-Source Video Model Now on Replicate and Fal
The News: Alibaba's Wan 2.1, a state-of-the-art open-source video model, is now available on both Fal and Replicate. These platforms function as pay-as-you-go model aggregators, making it easier for users to access and run advanced AI models without complex setup. Wan 2.1 is Alibaba's highest-quality model yet, featuring 14 billion parameters and offering high-speed generations optimized by WavespeedAI. Pricing is notably low compared to other leading models, with a 5-second video costing just $0.40. In terms of performance, a 5-second video takes 39 seconds to generate at 480p and 150 seconds at 720p.
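Since Fal and Replicate both bill per generation, those figures make it easy to budget a project up front. Here is a minimal back-of-the-envelope sketch using the numbers quoted above ($0.40 and 39 seconds per 5-second 480p clip); the assumption that clips are generated sequentially is mine:

```python
# Quick cost/throughput estimate for Wan 2.1 on a pay-as-you-go host,
# using the pricing and speed figures quoted above.
PRICE_PER_CLIP = 0.40   # USD per 5-second generation
CLIP_SECONDS = 5        # length of each generated clip
GEN_TIME_480P = 39      # seconds of wall time per clip at 480p

def batch_estimate(n_clips):
    """Return (total_cost_usd, total_wall_time_s) for n sequential 480p clips."""
    return n_clips * PRICE_PER_CLIP, n_clips * GEN_TIME_480P

# Example: a 150-second storyboard broken into 5-second shots.
cost, wall = batch_estimate(150 // CLIP_SECONDS)
```

Thirty clips come out to about $12 and roughly 20 minutes of generation time at 480p, which is the kind of math that made this release stand out.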
Lab Notes: This one caught my eye because 1) the price is so low and 2) the quality is great even when compared to something like Google’s Veo 2, which isn't cheap. Fal and Replicate seem to be the best spots right now for getting quick access to the latest tools.
OpenAI Releases GPT-4.5: Improved Creativity, Higher Costs
The News: OpenAI has launched GPT-4.5, its latest (non-reasoning) large language model, available now to ChatGPT Pro subscribers at $200 per month. The model improves on GPT-4 with better pattern recognition, deeper creative insight, and stronger emotional intelligence, and OpenAI claims it offers the most natural and nuanced responses yet. Creative professionals may see noticeable improvements in writing, storytelling, and design tasks, though the cost may be a barrier for some. For now, access is limited to the Pro tier and the (expensive) API; ChatGPT Plus subscribers at $20 per month (and Team users) are expected to get access within a week.
Lab Notes: The reaction to this has been mixed. Some say it is just an iterative and expensive update, while others point out that it is one of the biggest models ever released and shows strong advancements in creative and collaborative tasks. Later this year, OpenAI wants to merge these advancements into a hybrid model, likely called GPT-5, similar to how Claude 3.7 is built for hybrid reasoning (more on that later). From what I have seen so far, GPT-4.5’s real leap is in creativity, thoughtfulness, and storytelling. OpenAI even developed a new internal "vibes" benchmark to measure these improvements, which says a lot.

The price will likely drop over time, but even at this cost, the model could be valuable for those who know how to use it. Most people still have not integrated AI into their workflows, and adoption takes years, even if AGI showed up tomorrow. In the meantime, there are solid, cheaper alternatives like GPT-4o (yes, 4o is still a great model). OpenAI has quietly updated 4o’s creative capabilities (better writing, etc.) with only a small announcement via X, and it delivers strong results when prompted thoughtfully. I do not have Pro, but once Plus users get access sometime next week, I will be testing GPT-4.5 to see if it actually changes my workflow. I am especially curious about improvements in image/video prompt generation and adaptation, and in general coding tasks for custom creative tools. I’ll report back ASAP!
Claude 3.7 Sonnet: Better Thinking, Better Coding, and New Claude Code Tool
The News: Anthropic has released Claude 3.7 Sonnet, a major upgrade focused on hybrid reasoning. The model can switch between near-instant conversational creative responses and extended step-by-step logical thinking, making it more adaptable for complex tasks. It is designed to excel in instruction-following, math, physics, and coding, with API users gaining precise control over how long the model can "think" before responding.
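For API users, that "thinking" control is exposed as a token budget on the Messages API. A minimal sketch of the request shape follows; the model ID, budget, and prompt here are illustrative assumptions, so check Anthropic's documentation for current values:

```python
# Sketch of an extended-thinking request for Claude 3.7 Sonnet.
# The "thinking" block with a budget_tokens cap is how the Messages API
# bounds how long the model reasons before it answers; all values below
# are examples, not recommendations.
request = {
    "model": "claude-3-7-sonnet-20250219",  # assumed model ID; verify in docs
    "max_tokens": 4096,                      # must exceed the thinking budget
    "thinking": {"type": "enabled", "budget_tokens": 2048},
    "messages": [
        {"role": "user", "content": "Plan a color script for a 6-shot sequence."}
    ],
}
# With the official client, this would be sent as:
#   anthropic.Anthropic().messages.create(**request)
```

Setting the budget low keeps responses near-instant and conversational; raising it buys the step-by-step reasoning described above.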
Anthropic has also introduced Claude Code, an agentic coding tool that allows users to delegate substantial programming tasks directly from the terminal. In early testing, it has significantly reduced manual coding time, completing some tasks in a single pass that would normally take 45 minutes or more. Claude Code is launching in a limited research preview.
Lab Notes: Claude has already been one of the best models for both coding and creative writing, and this update pushes it even further. Even before this release, the previous Claude 3.5 Sonnet was a strong choice for coding-heavy workflows. I would not compare this release directly to GPT-4.5, since Claude 3.7 is a hybrid reasoning model; in general, Anthropic’s models are tuned differently and, in my testing, excel at different use cases. I often switch between LLMs on a task-by-task basis. I recommend learning Cursor and using its Claude 3.7 integration to start experimenting with building custom creative tools. I’ll write more on this soon!
Pika 2.2 Brings 1080p Video, Longer Generations, and Keyframe Control
The News: Pika, a powerful AI-powered video generation platform, has released its Pika 2.2 video model, introducing 10-second video generations, 1080p resolution, and Pikaframes, a new keyframe feature for output control that allows for smooth transitions anywhere from 1 to 10 seconds. This update gives users more control over how scenes evolve, making animations and edits more seamless. While Pikaffects and Pikadditions are not yet compatible with this version, Pikascenes still works, allowing for consistent characters, objects, and locations across shots.
Lab Notes: This is a good update on top of so many solid recent Pika releases! Pika has always been about more than just model quality; it is the creative toolset that makes it stand out. With the 2.2 video model now available, the improvements are real, but they feel more like a refinement than a complete overhaul. The ability to generate higher-resolution, longer clips and add control frames makes a big difference. The lack of Pikadditions (a feature that makes it easy to add anything into a scene) is a limitation for now.
BRIA AI Brings Copyright-Safe Image Generation and Editing to Fal
The News: BRIA AI, a platform focused on commercial AI image generation and editing, is now available on Fal. Unlike many other AI models, BRIA AI is trained exclusively on licensed data, aiming to provide a safer option for businesses concerned about copyright and compliance. It offers large-scale image generation, real-time editing, and tools for maintaining brand consistency across visuals. A key feature is its attribution engine, which compensates data contributors based on their content’s impact on AI-generated outputs.
Lab Notes: This is another example of AI companies adapting to copyright concerns, but whether training exclusively on licensed data actually makes a model "safer" is still unclear. Companies like BRIA and Adobe are betting that controlled datasets reduce legal risk, but there has not been a definitive ruling on whether it makes a difference in practice. For now, it is mostly about perception: clients who are cautious about AI-generated content may feel more comfortable with models that claim to use only licensed material.
At the same time, I personally will try almost all models as long as I am creating original work and using them responsibly. The real takeaway here is that AI companies are now offering different levels of risk management, which means more options depending on what a project needs.
Separately, the U.S. Copyright Office recently ruled that AI-generated work can be copyrighted if it is sufficiently modified and integrated in a unique way. That adds another layer to the discussion, but it does not directly address whether training with unlicensed “publicly available data” impacts the legal status of companies releasing AI models. That issue is still being litigated between major companies, although not against individual users.
I think more copyright and legal clarity and access to licensed models are all wins for creatives who use AI responsibly in their workflows.
This commentary reflects my perspective and research. It is not intended as legal advice. If you have specific questions about copyright or AI usage, consult a legal professional.
Additional Findings: Quick Updates on Important News and Workflows
Deep Research for ChatGPT Plus Users – OpenAI is expanding deep research access to Plus users, offering PhD-level research results and papers. Great for studying creative workflows or niche styles.
Veo 2 Image-to-Video on Fal – Fal now supports Google’s expensive Veo 2 video model with image-to-video capability and offers a guide on how to master Veo 2 for better video generation.
4D Gaussian Splatting for Video – A new breakthrough in capturing realistic moving scenes with 4D Gaussian splatting, pushing the boundaries of film, VFX, and XR.