The Macro: AI Video Is Everywhere and Most of It Is Terrible
I have spent the last year watching AI video tools multiply like bacteria in a petri dish. Runway Gen-3 raised the bar on raw quality. Pika added motion controls. Kling came out of nowhere with impressive results. Luma Dream Machine, Haiper, Stability’s Stable Video, MiniMax. The list is long and getting longer every quarter.
And yet. Almost nothing produced by these tools looks like a real film.
The outputs are technically impressive in isolation. A single shot of a woman walking through a field, a car driving through a neon city, a dragon flying over mountains. Beautiful. Coherent for four seconds. And completely disconnected from every other shot. You cannot tell a story with disconnected four-second clips any more than you can write a novel with disconnected sentences.
The problem is not the models. The models are getting better at an absurd rate. The problem is the workflow. Every AI video tool right now is basically a text box. Type a prompt. Get a clip. Type another prompt. Get another clip. There is no concept of scenes, characters, visual continuity, narrative arc, or shot composition. There is no way to say “this character should look the same across all twelve shots.” There is no way to plan a sequence before generating it.
Filmmaking is not about individual shots. It is about the relationship between shots. It is about pacing, framing, emotional progression, visual motifs. The current crop of AI video tools treats filmmaking like image generation with a time axis, and that fundamental misunderstanding is why the output feels hollow.
Some tools are starting to address pieces of this. Runway’s Multi Motion Brush gives you spatial control. Pika’s scene-to-scene feature attempts continuity. But none of them have built a full creative environment that treats filmmaking as a structured creative process rather than serial prompt engineering.
The Micro: An Instagram AI Engineer and a Festival-Winning Filmmaker Walk Into a Startup
Flick describes itself as “Figma plus Cursor for AI filmmaking.” That is a loaded comparison but I think it is the right one. Figma gave designers spatial reasoning in a collaborative canvas. Cursor gave developers AI-native code editing. Flick is trying to give filmmakers spatial, narrative, and AI-native tools in a single workspace.
Ray Wang and Zoey Zhang are the founders. Ray was a founding engineer on the AI team at a major social platform. Zoey is an award-winning filmmaker and product designer. That combination is not something I see often. Usually AI creative tools are built by engineers who do not understand creative workflows, or by creatives who cannot build the technology. Flick has both in the founding team.
They are a two-person company out of Sunnyvale, backed by Y Combinator’s Fall 2025 batch. The product is live and the results speak for themselves. Four short films made entirely with Flick have won over 20 international film festival awards. That is not a marketing claim you can fake.
The infinite canvas is the core interface. Instead of working in a linear timeline, you lay out your film spatially. Scenes, shots, characters, references, all on an infinite canvas with an integrated chat for directing the AI. It borrows from Figma’s spatial model and adds filmmaking-specific features like reusable character templates, studio collaborations, and built-in editing tools.
Multi-model integration is key. Flick is not locked to a single video generation model. It orchestrates across models and focuses the user experience on scripts, characters, and scenes rather than raw prompts. This is the right approach because no single model is best at everything, and the model landscape changes every three months.
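To make the orchestration idea concrete, here is a minimal sketch of per-shot model routing. Everything in it is an assumption for illustration: the backend names, capability scores, and prices are invented, not taken from Flick or any vendor. The point is the shape of the problem, matching each shot's needs (character consistency, camera motion, budget) against the strengths of whichever models exist this quarter.

```python
from dataclasses import dataclass

# Hypothetical capability scores and prices per backend.
# All names and numbers are illustrative, not real benchmarks.
MODEL_STRENGTHS = {
    "model_a": {"character_consistency": 0.9, "camera_motion": 0.6, "cost_per_sec": 0.30},
    "model_b": {"character_consistency": 0.5, "camera_motion": 0.9, "cost_per_sec": 0.45},
    "model_c": {"character_consistency": 0.7, "camera_motion": 0.7, "cost_per_sec": 0.10},
}

@dataclass
class Shot:
    description: str
    needs: dict           # weight per capability, e.g. {"character_consistency": 1.0}
    budget_per_sec: float

def route_shot(shot: Shot) -> str:
    """Pick the backend whose strengths best match this shot's needs,
    skipping any model that exceeds the shot's per-second budget."""
    affordable = [m for m, caps in MODEL_STRENGTHS.items()
                  if caps["cost_per_sec"] <= shot.budget_per_sec]
    if not affordable:
        raise ValueError("no backend fits the budget")
    def score(name: str) -> float:
        caps = MODEL_STRENGTHS[name]
        return sum(w * caps[c] for c, w in shot.needs.items())
    return max(affordable, key=score)

hero_closeup = Shot("close-up on the lead, must match earlier shots",
                    {"character_consistency": 1.0}, budget_per_sec=0.50)
print(route_shot(hero_closeup))  # → model_a (strongest on consistency)
```

Notice what this buys you: when a new model ships next quarter, you add one dictionary entry instead of rebuilding the product, which is exactly why abstracting the user experience to scripts, characters, and scenes is the durable layer.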
There is also a Perplexity integration for pulling classic film shot references, which tells me they are building for people who care about cinema, not just people who want cool clips for social media. That distinction matters. The social media clip market is a race to the bottom. The creative filmmaking market is a premium positioning play.
The Verdict
Flick is the first AI video tool I have used where I thought “this is actually designed for making films.” Not clips. Not shorts for TikTok. Films with narrative structure and visual consistency and intentional composition.
At 30 days, I want to see the creator community growing. Not just users, but people making actual short films and sharing them. The film festival wins are great validation, but the question is whether the tool is accessible enough for creators beyond the founders.

At 60 days, I want to understand the model economics. Orchestrating multiple video generation models is expensive, and filmmaking requires iteration. If making a five-minute short costs $500 in compute, the market is limited. If it costs $50, every film student in the world is a potential customer.

At 90 days, the question is whether Flick becomes the default workspace for AI filmmaking or whether Runway and Pika absorb these ideas into their own products. Speed matters. The incumbents have distribution advantages and they are watching this space closely. But Flick has a creative philosophy baked into the product that is hard to copy, and that might be the moat that matters most.
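The $50-versus-$500 range above is not arbitrary; it falls out of a simple back-of-envelope model. The sketch below assumes hypothetical numbers (takes per shot, per-second generation price) purely for illustration. The structural point is that iteration is a multiplier: you pay for every take you generate, not just the seconds that make the final cut.

```python
# Back-of-envelope compute cost for a five-minute short.
# Every number here is an assumption for illustration, not Flick's pricing.

def short_film_cost(minutes: float = 5, takes_per_shot: int = 6,
                    cost_per_generated_sec: float = 0.25) -> float:
    """Final runtime, multiplied by retakes, at a per-second generation price."""
    final_seconds = minutes * 60
    generated_seconds = final_seconds * takes_per_shot  # iteration multiplies cost
    return generated_seconds * cost_per_generated_sec

print(f"${short_film_cost():,.0f}")                             # → $450 at $0.25/s
print(f"${short_film_cost(cost_per_generated_sec=0.03):,.0f}")  # → $54 at $0.03/s
```

An order-of-magnitude drop in per-second generation cost is what moves a short from the expensive end of the range to the accessible end, which is why the model economics question at 60 days matters so much.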