A row of identical AI video outputs — every tool producing the same generic style — Generated with Nano Banana

We Tested Every AI Short Maker

Tool comparison · May 2026 · 12 min read · Liminalshort.org

The AI video tools market has quietly fractured into two categories that have almost nothing in common. One builds workflows: give it a topic, it assembles a video from script to voiceover to footage. The other builds visuals: give it a text or image prompt, it renders what you describe. If you pick the wrong category for what you actually want to make, no amount of good prompting will help you.

This piece is a category-first comparison. Knowing which type of tool you need settles most of the "which AI video tool should I use" question before you ever look at a pricing page.

The fundamental split: automation vs generation

Automation tools — Storyshort, Revid, and their competitors — solve a content operations problem. You want to produce AI-narrated videos on a topic at scale, without recording yourself. The output is largely determined by the pipeline, not by visual direction. They're good at what they do, and what they do is not creative visual control.

Generation tools — Runway, Kling, Pika, Hailuo — solve a creative problem. You have a specific visual in your head and you want to render it. The output is as controllable as your prompt. They require more iteration, more prompt knowledge, and more patience. The upside is that the output can be genuinely distinctive.

The key distinction

Automation tools make things that look like other things made with the same tool. Generation tools make things that look like what you described — which can look like nothing else, if you describe the right thing.

Where each tool sits on the spectrum

Figure: Creative control vs output automation (tool spectrum)

Positioning reflects prompt adherence, model access breadth, and output controllability. Not a comment on output quality or value for money — each tool excels in its lane.

The automation tools: Storyshort and Revid

Storyshort is built around a specific workflow: you give it a topic, it writes a short script, generates voiceover, and assembles a video from stock clips and AI-generated segments. The output is fast and consistent. It is designed for faceless content creators who want to publish on schedule without recording anything. The limitation is that visual output isn't prompt-controllable in any meaningful sense — you're selecting from a pipeline, not directing a renderer.

Revid takes a similar approach with more emphasis on AI voiceover quality and template-based video construction. It is genuinely strong at producing clean, watchable faceless content at volume. Like Storyshort, it trades visual specificity for speed and operational ease. For both tools this is a deliberate design decision for a different use case, not a defect.

"Storyshort and Revid are content factories, not creative tools. If your goal is posting 30 shorts a month on a topic, they're excellent. If you want to make something that looks like you made it, they'll make something that looks like everyone else using the same template."
r/AIVideo · 5.2k upvotes

The generation tools: Runway, Pika, Kling, Hailuo

Runway Gen-3 Alpha is the most prompt-responsive video generation model in this comparison. It handles camera motion descriptors, lighting changes, and complex atmospheric effects better than most, and its prompt-to-output fidelity on spatial and temporal descriptions is the strongest of the tools covered here. The interface is clean and export quality is high. The $15/month entry plan makes access easy, though generation credits run out quickly in iterative workflows.

Pika 2.0 is faster to iterate with and cheaper per generation. Prompt adherence on realistic environments is inconsistent, but it handles stylised and illustrated content well and is forgiving for quick tests. Good for exploring whether a visual direction works before committing to longer generation runs on a more expensive tool.

Kling 1.5/2.0 by Kuaishou has become one of the strongest models for realistic human motion: gesture, walking, face animation. For any content involving people or characters moving naturally, it outperforms most Western models given the same prompt. The interface is slower and less intuitive, a trade-off worth accepting if motion quality is your priority.

Hailuo (MiniMax) produces high-quality motion with strong temporal consistency: frames drift less from a set style than they do with some other models. It is better suited to longer clips that need to hold one visual register throughout, particularly stylised or animated aesthetics.

Full comparison

Tool	Prompt control	Model choice	Edit after generation	Best for	Starting price
Storyshort	Low — template pipeline	1 (internal)	No	Faceless content at volume	~$19/mo
Revid	Low — script-to-video	1–2 (internal)	No	Faceless content, voiceover focus	~$29/mo
Runway Gen-3	High — strong adherence	1 (Gen-3 Alpha)	Limited	Cinematic video, camera control	$15/mo
Pika 2.0	Medium — stylised OK, realism varies	1 (Pika Labs)	No	Fast iterations, stylised shorts	$8/mo
Kling 2.0	Medium — great motion, weaker scene	1 (Kling)	No	Realistic human motion, characters	Credit-based
Hailuo / MiniMax	Medium — consistent style, less flexible	1 (MiniMax)	No	Longer clips, temporal consistency	Credit-based
Liminalshort	High — Prompt Wizard + manual control	8+ switchable models	Yes — image edit loop	Niche aesthetics, model comparison, iterative edit workflow	See site
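
The table can also be treated as data and filtered by requirement. The following is a minimal sketch, assuming the rows are transcribed by hand from the table above; the `Tool` class, the `shortlist` helper, and the low/medium/high ordering are illustrative names introduced here, not part of any tool's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    prompt_control: str  # "low", "medium", or "high", as listed in the table
    edit_after_gen: str  # "no", "limited", or "yes", as listed in the table
    best_for: str


# Rows transcribed from the comparison table; prices and model counts omitted.
TOOLS = [
    Tool("Storyshort", "low", "no", "Faceless content at volume"),
    Tool("Revid", "low", "no", "Faceless content, voiceover focus"),
    Tool("Runway Gen-3", "high", "limited", "Cinematic video, camera control"),
    Tool("Pika 2.0", "medium", "no", "Fast iterations, stylised shorts"),
    Tool("Kling 2.0", "medium", "no", "Realistic human motion, characters"),
    Tool("Hailuo / MiniMax", "medium", "no", "Longer clips, temporal consistency"),
    Tool("Liminalshort", "high", "yes",
         "Niche aesthetics, model comparison, iterative edit workflow"),
]

# Assumed ordering of the table's prompt-control labels.
LEVELS = {"low": 0, "medium": 1, "high": 2}


def shortlist(min_prompt_control: str, needs_editing: bool = False) -> list[str]:
    """Return names of tools at or above a prompt-control level.

    If needs_editing is True, drop tools whose table entry for
    editing after generation is "no".
    """
    floor = LEVELS[min_prompt_control]
    return [
        tool.name
        for tool in TOOLS
        if LEVELS[tool.prompt_control] >= floor
        and (not needs_editing or tool.edit_after_gen != "no")
    ]


if __name__ == "__main__":
    print(shortlist("medium"))
    print(shortlist("high", needs_editing=True))
```

Filtering on category first and price last mirrors the order of questions this comparison argues for.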

Why a single model is a handicap for niche aesthetics

This trade-off doesn't appear in most comparisons. For generic content — product explainer, talking head, motivational short — any single well-trained model works fine. You learn its defaults and work within them.

But for specific, unusual aesthetics (empty architecture, analog horror, liminal spaces, dreamcore, surreal geometry), a single model is a lottery. Those aesthetics tend to be underrepresented in training sets, so the model defaults away from what you want. Running the same prompt across several models in one scouting pass shows which model has the right prior for a specific look before you commit the full generation budget.

"I was using one model for everything and getting mediocre results on half my concepts. Switching to a multi-model setup and doing a 'model scouting' pass per concept — just seeing which model resonates with the aesthetic — doubled my usable output per session."
r/AIArt · 4.7k upvotes
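
A scouting pass of this kind can be sketched in a few lines. This is a sketch under stated assumptions, not any tool's real interface: each model is represented by a hypothetical callable that takes a prompt and returns a path to a cheap preview clip, and `scouting_pass`, `pick_model`, and the `score` function are names invented for illustration. In practice the scoring step is usually a person looking at the previews.

```python
from typing import Callable

# Hypothetical signature: prompt in, path (or URL) of a preview clip out.
# Real generation tools expose different APIs; wrap them to fit this shape.
Generator = Callable[[str], str]


def scouting_pass(prompt: str, models: dict[str, Generator]) -> dict[str, str]:
    """Run one prompt through every model and collect the preview outputs."""
    previews: dict[str, str] = {}
    for name, generate in models.items():
        try:
            previews[name] = generate(prompt)
        except Exception as err:
            # One failing model should not end the whole pass.
            previews[name] = f"failed: {err}"
    return previews


def pick_model(previews: dict[str, str], score: Callable[[str], float]) -> str:
    """Return the name of the model whose preview scores highest."""
    usable = {
        name: preview
        for name, preview in previews.items()
        if not preview.startswith("failed:")
    }
    if not usable:
        raise ValueError("no model produced a usable preview")
    return max(usable, key=lambda name: score(usable[name]))


if __name__ == "__main__":
    # Stub generators standing in for real model calls.
    stubs = {
        "model_a": lambda prompt: "previews/model_a.mp4",
        "model_b": lambda prompt: "previews/model_b.mp4",
    }
    print(scouting_pass("empty shopping mall, fluorescent light, 3am", stubs))
```

The point of keeping the pass separate from the pick is that the expensive full-length generation only happens once, on the model that already showed the right look.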

The edit loop gap

Almost every tool in this comparison generates and stops. If the output is close but a specific detail is wrong, your only option is to regenerate, which is slow and expensive. The alternative is to generate, adjust a specific visual element, then generate again from the adjusted frame. That approach converges on a result instead of hunting for one.

This workflow is standard in professional video production: you don't reshoot the entire scene because one prop is off. You adjust the prop. Translating that principle into AI video generation means you need tools that support image-based editing as part of the generation loop, not just as an afterthought.
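
The shape of that loop can be sketched as follows. It is a minimal illustration under assumptions: `render`, `is_acceptable`, and `adjust` are hypothetical callables standing in for a generation call, a human or automated review, and an image edit, and frames and clips are represented as plain strings.

```python
from typing import Callable


def edit_loop(
    frame: str,
    render: Callable[[str], str],
    is_acceptable: Callable[[str], bool],
    adjust: Callable[[str], str],
    max_rounds: int = 5,
) -> tuple[str, int]:
    """Converge on a clip by editing the source frame between generations.

    Returns the last clip and the number of adjustments applied.
    """
    clip = render(frame)
    for rounds in range(max_rounds):
        if is_acceptable(clip):
            return clip, rounds
        frame = adjust(frame)  # fix one detail, keep everything else
        clip = render(frame)   # generate again from the adjusted frame
    return clip, max_rounds


if __name__ == "__main__":
    clip, rounds = edit_loop(
        "hallway, prop=wrong",
        render=lambda f: f"clip({f})",
        is_acceptable=lambda c: "wrong" not in c,
        adjust=lambda f: f.replace("wrong", "right"),
    )
    print(clip, rounds)
```

Each round keeps everything that was already right and changes one thing, which is why the loop is capped by `max_rounds` rather than left to run until a lucky generation appears.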