The landscape of generative AI is shifting from the creation of isolated clips to the construction of cohesive narratives. AI startup Manus has entered the fray with the launch of a new text-to-video tool designed to transform simple user prompts into structured, animated stories.
While industry heavyweights like OpenAI and Google have focused heavily on the visual fidelity and physics of short-form video, Manus is positioning its tool as a storytelling engine. Rather than generating a single standalone shot, the platform aims to plan scenes, craft visuals, and animate a sequenced vision from a single prompt, offering a more structured approach to AI cinematography.
This strategic pivot toward narrative coherence places Manus in direct competition with OpenAI’s Sora and Google’s Veo, as well as established players like Runway and Synthesia. The move signals a broader industry trend where the goal is no longer just “realistic video,” but the ability to maintain visual and logical consistency across a series of scenes.
Beyond the Single Clip: The Focus on Narrative Structure
Most current text-to-video models operate on a “one-shot” basis, where a prompt generates a brief, high-quality clip. Manus is attempting to solve the “coherence problem” by automating the storyboarding process. According to the company, the tool does not just animate a prompt; it plans the sequence of events, ensuring that the transition from one scene to the next feels logical and visually consistent.
In a social media announcement on June 3, 2025, the company detailed the tool’s capabilities:
“Introducing Manus video generation. Manus transforms your prompts into complete stories—structured, sequenced, and ready to watch. With a single prompt, Manus plans each scene, crafts the visuals, and animates your vision. From storyboard creation to concept visualization—your…” ManusAI (@ManusAI_HQ)
By focusing on the storyboard phase, the tool aims to bridge the gap between a conceptual idea and a finished product. This approach is particularly relevant for creators who need to produce explainer videos, promotional content for social media, or conceptual pitches without the need for extensive manual editing or a full production team.
The Competitive Landscape: Global Giants and Open Source
The race for AI video dominance is currently split between two primary philosophies: proprietary “walled gardens” and open-source accessibility. Sora and Veo rely on closed, proprietary models that offer high control and quality but limited access. Conversely, Chinese tech giants such as Alibaba (with its Wan model) and Tencent (with Hunyuan) have leaned toward open-source approaches to accelerate developer experimentation.
Manus, which is owned by the Chinese startup Butterfly Effect and incorporated in the Cayman Islands, occupies a unique space. While it competes with the scale of the American giants, it brings a focus on "agentic" behavior—an AI's ability to handle multi-step tasks independently. This is a natural extension of the company's previous work on general-purpose AI agents capable of performing complex research and analysis.
The company’s growth is backed by significant capital. According to reporting from Bloomberg on April 24, the startup received $75 million in funding led by Benchmark Capital.
Comparison of AI Video Approaches
| Provider | Primary Focus | Access Model | Key Strength |
|---|---|---|---|
| Manus | Structured Storytelling | Early Access / Tiered | Automated storyboarding and sequencing |
| OpenAI (Sora) | Visual Fidelity/Physics | Paid Subscribers | High-realism and complex scene physics |
| Google (Veo) | Cinematic Control | Limited Preview | Integration with Google’s creative ecosystem |
| Alibaba/Tencent | Developer Access | Open-Source/Hybrid | Rapid iteration via community access |
Availability and User Access
The text-to-video feature is not yet available to the general public. It is currently restricted to early access users subscribed to the Basic, Plus, or Pro tiers. However, the company has stated in its official communications that a general release for the wider public is planned for the near future.

This tiered rollout allows the company to refine the “narrative flow” of its AI—a notoriously difficult technical challenge—before scaling to millions of users. The ability to maintain a character’s appearance and the setting’s consistency across multiple scenes is the “holy grail” of AI video, and Manus is betting that a storyboard-first approach is the way to achieve it.
Key Takeaways for Creators
- Shift to Storytelling: The tool focuses on sequenced scenes rather than isolated clips.
- Automated Pipeline: It handles everything from the initial storyboard to the final animation from a single prompt.
- Strategic Funding: Backed by a $75 million investment led by Benchmark Capital.
- Tiered Access: Currently available to early access users on Basic, Plus, and Pro plans, with a wider general release planned for the near future.
As the industry moves toward 2026, the focus is shifting from what the AI can draw to how the AI can tell a story. Whether Manus can successfully challenge the compute power of Google and OpenAI remains to be seen, but its focus on the creative process—rather than just the output—marks a significant evolution in generative media.
The company is expected to provide further updates on the general release date of the tool in the coming months. We will continue to monitor the rollout and provide performance benchmarks as the tool becomes widely available.
Do you think AI-driven storyboarding will replace traditional pre-production, or is it just a tool for rapid prototyping? Share your thoughts in the comments below.