Artificial Analysis Global Leaderboard - Champion

A new global video model king has been born!

SkyReels-V4 from Kunlun Tech has taken the top spot on the Artificial Analysis text-to-video (with audio) global leaderboard, surpassing Veo 3.1 and Sora 2. Just a month ago, its Preview version ranked #2 globally.

SkyReels V4 achieved the #1 ranking on the Artificial Analysis text-to-video (with audio) global leaderboard, surpassing internationally renowned models such as OpenAI's Sora 2 and Google's Veo 3.1 and establishing itself in the global top tier. It marks a move from "generating clips" to controllable, continuous, complete video production.

Global Ranking - #1
Max Resolution - 1080p
Frame Rate - 32 FPS
Video Duration - 15s

SkyReels V4 Latest Upgrade Highlights

From global #2 to global #1 in just one month, SkyReels V4's capabilities have reached new heights. This upgrade is not just minor tweaks, but a comprehensive capability leap.

Full-Modal Reinforcement Learning System Upgrade

The fully upgraded full-modal reinforcement learning system makes videos "make sense." The model no longer mechanically "assembles frames" according to prompts, but begins to understand the logic of the entire process.

Full-Modal Semantic Reward Model - Gives the model a "global evaluation standard" that checks not only whether individual frames look good, but whether the entire video is reasonable.
Progressive Curriculum Reinforcement Learning Path - Raises difficulty step by step along three dimensions (resolution and duration, task complexity, and data difficulty) so the model masters complex capabilities progressively.

New Keyframe Reference Capability

Provide multiple keyframes to the AI and keep the key transitions entirely under your control, while the model automatically generates the intermediate frames. The feature emphasizes control over narrative rhythm and action continuity.
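To make the keyframe idea concrete, here is a minimal sketch of the frame arithmetic, assuming a 15-second clip at 32 FPS (the figures quoted on this page) and keyframes spread evenly over the clip. The function names and the even-spacing rule are illustrative assumptions, not SkyReels' actual behavior; in practice you would likely place keyframes wherever the story needs them.

```python
def keyframe_positions(num_keyframes: int, duration_s: float = 15.0, fps: int = 32) -> list[int]:
    """Spread keyframes evenly over the clip and return their frame indices."""
    if num_keyframes < 2:
        raise ValueError("need at least two keyframes to define a transition")
    total_frames = round(duration_s * fps)  # 15 s * 32 FPS = 480 frames
    last = total_frames - 1
    # Integer arithmetic pins the first and last keyframes to the clip ends.
    return [(i * last) // (num_keyframes - 1) for i in range(num_keyframes)]


def frames_to_generate(positions: list[int]) -> list[int]:
    """Count the in-between frames a model would have to fill per segment."""
    return [b - a - 1 for a, b in zip(positions, positions[1:])]


positions = keyframe_positions(4)
gaps = frames_to_generate(positions)
print(positions)  # [0, 159, 319, 479]
print(gaps)       # [158, 159, 159]
```

With four keyframes, only 4 of the 480 frames are supplied by the user; the remaining 476 are the intermediate frames left to the model.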

New Grid Reference Capability

A feature specifically designed for short dramas (including AI comic short dramas). Users can upload up to 9 plot keyframes at once, and the model will stably extract and preserve character features and scene styles, generating narrative videos with complete logic and consistent characters and scenes throughout. It is mainly used to lock character consistency and visual style.

SkyReels V4 Core Features

SkyReels V4 adopts a video/audio dual-stream MMDiT architecture, making it the world's first unified audio-video AI model. It achieves tight integration of audio and video from the ground up, supporting cinematic audio-video synchronized generation.

🎬

Text-to-Video

Generate high-quality videos through text descriptions, supporting complex scene descriptions and storylines. The AI automatically understands and creates video content that matches the description.

🖼

Image-to-Video

Upload reference images, and the AI transforms static images into dynamic videos, maintaining the image's style and details while adding natural motion effects.

🎭

Multi-Modal Reference Generation

Supports various generation methods including reference image-to-video, video extension, audio-driven virtual avatars, and more to meet different creative needs.

🔊

Audio-Visual Joint Generation

The world's first audio-video synchronized generation technology, automatically generating matching audio while generating video, achieving true audio-visual integration.

✂

Video Editing & Inpainting

Supports mask-based editing and video inpainting, allowing targeted modification or filling of specific video regions without regenerating the entire video.
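As a rough illustration of what "mask-based" means, the sketch below blends an edited frame into an original frame using a per-pixel mask. This is the generic compositing idea only, shown on a tiny grayscale frame; it is an assumption for illustration and says nothing about how SkyReels V4 performs inpainting internally.

```python
Frame = list[list[float]]  # toy grayscale frame: rows of pixel values in [0, 1]


def composite(original: Frame, edited: Frame, mask: Frame) -> Frame:
    """Blend per pixel: mask 1.0 takes the edited pixel, 0.0 keeps the original."""
    return [
        [m * e + (1.0 - m) * o for o, e, m in zip(o_row, e_row, m_row)]
        for o_row, e_row, m_row in zip(original, edited, mask)
    ]


original = [[0.2, 0.2], [0.2, 0.2]]
edited = [[0.9, 0.9], [0.9, 0.9]]
mask = [[1.0, 0.0], [0.0, 0.0]]  # only the top-left pixel is marked for editing

result = composite(original, edited, mask)
print(result)
```

Only the masked pixel changes; everything outside the mask is carried over from the original frame, which is why the rest of the video does not need to be regenerated.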

🎥

Cinematic Quality

Generated videos can match real-person footage, with significantly reduced "AI feel," achieving professional cinema-level visual quality.

🖼

Keyframe Reference

Provide multiple keyframe images to control narrative rhythm and action continuity, with intermediate frames automatically generated.

📱

Grid Reference (Short Drama Tool)

Upload up to 9 plot keyframes to lock character consistency and visual style, generating coherent narrative videos.

SkyReels V4 Technical Innovation

Symmetric Dual-Stream MMDiT Architecture

SkyReels V4 adopts an innovative video/audio dual-stream MMDiT (Multimodal Diffusion Transformer) architecture that welds audio and video together from the ground up. This architecture allows the model to "see, hear, and create" within a unified framework, addressing the problem of separate audio and video processing in traditional methods.
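For readers unfamiliar with dual-stream designs, the following is a toy sketch of the general joint-attention idea: each modality keeps its own projection weights, but attention runs over the concatenated video and audio tokens so each stream can attend to the other. Everything here (token counts, dimensions, random weights, function names) is an assumption for illustration; it is not SkyReels V4's implementation, and a real model would add multiple heads, normalization, MLP blocks, and timestep conditioning.

```python
import math
import random

Matrix = list[list[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (n x k) and b (k x m)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row: list[float]) -> list[float]:
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(value - peak) for value in row]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def joint_attention(video: Matrix, audio: Matrix, w_video: dict, w_audio: dict):
    """One joint-attention step over video and audio tokens."""
    # Per-stream projections, concatenated along the token axis (list "+").
    q = matmul(video, w_video["q"]) + matmul(audio, w_audio["q"])
    k = matmul(video, w_video["k"]) + matmul(audio, w_audio["k"])
    v = matmul(video, w_video["v"]) + matmul(audio, w_audio["v"])

    scale = math.sqrt(len(q[0]))
    # Every token scores every other token, regardless of modality.
    weights = [
        softmax([sum(a * b for a, b in zip(q_row, k_row)) / scale for k_row in k])
        for q_row in q
    ]
    mixed = matmul(weights, v)

    # Split the result back into the two streams.
    return mixed[: len(video)], mixed[len(video):], weights


rng = random.Random(0)
dim = 4
video_tokens = random_matrix(3, dim, rng)  # 3 toy video tokens
audio_tokens = random_matrix(2, dim, rng)  # 2 toy audio tokens
w_video = {name: random_matrix(dim, dim, rng) for name in "qkv"}
w_audio = {name: random_matrix(dim, dim, rng) for name in "qkv"}

video_out, audio_out, weights = joint_attention(video_tokens, audio_tokens, w_video, w_audio)
print(len(video_out), len(audio_out), len(weights[0]))
```

The point of the sketch is the shape of the computation: each attention row spans all five tokens, so a video token's output already mixes in audio information and vice versa, rather than the two being aligned after the fact.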

Full-Modal Reference Unified Framework

Unifies all input forms, including text, images, video, audio, and masks, under a single framework. Tasks that previously required several models working in relay plus manual alignment can now be completed in one generation pass, significantly reducing engineering complexity.

Based on Tiangong Language Model

SkyReels V4 is developed based on Kunlun Tech's self-developed Tiangong language model, inheriting powerful semantic understanding capabilities to accurately understand user creative intent and transform it into high-quality video content.

API Service & Published Research

SkyReels V4 is available through the official website platform and API services. Research papers have been published on arXiv, providing valuable learning and research resources for global developers and researchers. Note: V2 and V3 versions are open-source, while V4 is available through the API platform.

SkyReels Version Comparison - V3 vs V4

From SkyReels V3 to V4, Kunlun Tech AI has achieved major technological breakthroughs. Here's a detailed comparison of both versions:

Feature	SkyReels V3	SkyReels V4
Release Date	January 2025	February 27, 2026
Model Type	Open Source	API Service / Unified Audio-Video Model
Audio Generation	Not Supported	Audio-Visual Joint Generation
Max Resolution	720p	1080p
Frame Rate	24 FPS	32 FPS
Video Duration	10 seconds	15 seconds
Editing Features	Basic Editing	Advanced Inpainting
Keyframe Reference	Not Supported	Supported
Grid Reference	Not Supported	Supported (9 images)
Global Ranking	-	Artificial Analysis #1

SkyReels Development History

August 2024

SkyReels A3 Released

Built on a DiT video diffusion model, frame-interpolation-based extension, and reinforcement learning for action optimization, it established the technical foundation for the SkyReels series.

January 2025

SkyReels V3 Open Source Release

Open source version released, supporting reference image-to-video, video extension, audio-driven virtual avatars, and more.

April 2025

SkyReels V2 Released

The world's first infinite-duration movie generation model, built on a diffusion forcing framework to break through video duration limits.

January 2026

SkyReels V4 Preview Version

Preview version released, ranking #2 on the Artificial Analysis global leaderboard, surpassing Sora 2 and Veo 3.1.

February 2026

SkyReels V4 Officially Tops Global #1

The upgraded SkyReels-V4 reached the global top spot, becoming the new king of video models! It moves from "generating clips" to controllable, continuous, complete video production, opening a new era of audio-video joint generation.

About Kunlun Tech AI

Kunlun Tech Group

Kunlun Tech Group (Stock Code: SZ300418) is a leading Chinese internet platform company with multiple well-known AI products including Tiangong AI. The SkyReels series is developed by the Skywork AI team under Kunlun Tech Group.

Tiangong AI

Tiangong AI is a large language model independently developed by Kunlun Tech, with powerful semantic understanding and generation capabilities. SkyReels V4 is developed based on the Tiangong language model, inheriting its excellent language understanding abilities.

DramaWave - AI Version of Netflix

Tiangong AI applies SkyReels-V4 to its own short drama platform DramaWave. As an overseas paid short drama platform launched in October 2024, DramaWave has exceeded 80 million monthly active users, achieving a complete closed loop from technology to product to commercialization.

Application Scenarios

SkyReels V4 is widely used in:

Short Drama Production - Rapid generation of high-quality short drama content, AI comic short dramas in one pass
Advertising & Marketing - Creating eye-catching advertising videos
Cinematic Video Content Creation - Professional-level video production
Game Cutscenes - Standardized, reusable workflows that scale
Music Videos - Combined with Mureka platform for one-stop music needs

Frequently Asked Questions

Is SkyReels V4 free?

SkyReels V4 is available through the official website platform and API services. You can experience the online version at skyreels.ai or integrate it into your applications through the API platform. SkyReels V2 and V3 are open-source versions available on GitHub.

How does SkyReels V4 compare to Seedance 2.0?

SkyReels V4 and Seedance 2.0 are both top Chinese AI video generation models. SkyReels V4 currently tops the Artificial Analysis global leaderboard at #1, with its unique advantage being audio-video joint generation capability, making it the world's first unified model to achieve this functionality.

How do I use SkyReels V4?

You can visit skyreels.ai to use the online version directly, or integrate it into your applications through the API platform. If you want an open-source version, SkyReels V2 and V3 are available on GitHub.

What input formats does SkyReels V4 support?

SkyReels V4 supports text input (text-to-video), image input (image-to-video), keyframe reference, grid reference (up to 9 images), and other multi-modal reference inputs to meet various creative scenario needs.
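If you plan to build on these inputs, a small client-side check can catch out-of-range requests before they are sent. The sketch below is hypothetical: the `GenerationRequest` fields and the `validate` helper are invented for illustration and do not reflect the real SkyReels API schema. The limits come from figures quoted on this page (up to 9 grid images, 1080p, 15 seconds), and treating 15 seconds as an upper bound is an assumption.

```python
from dataclasses import dataclass, field

MAX_GRID_IMAGES = 9   # "up to 9 plot keyframes"
MAX_DURATION_S = 15   # assumption: the 15 s figure is an upper bound
MAX_HEIGHT_PX = 1080  # 1080p maximum resolution


@dataclass
class GenerationRequest:
    """Hypothetical request shape; not the real SkyReels API schema."""
    prompt: str
    duration_s: float = 5.0
    height_px: int = 1080
    grid_images: list[str] = field(default_factory=list)


def validate(request: GenerationRequest) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if not request.prompt.strip():
        problems.append("prompt is empty")
    if not 0 < request.duration_s <= MAX_DURATION_S:
        problems.append(f"duration must be in (0, {MAX_DURATION_S}] seconds")
    if request.height_px > MAX_HEIGHT_PX:
        problems.append(f"height exceeds {MAX_HEIGHT_PX}p")
    if len(request.grid_images) > MAX_GRID_IMAGES:
        problems.append(f"at most {MAX_GRID_IMAGES} grid images are allowed")
    return problems


ok = GenerationRequest(prompt="A fox runs through snow at dawn")
too_many = GenerationRequest(
    prompt="Episode 1 recap",
    grid_images=[f"frame_{i}.png" for i in range(10)],
)
print(validate(ok))
print(validate(too_many))
```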

What are keyframe reference and grid reference?

Keyframe reference lets you provide multiple keyframe images to control narrative rhythm and action continuity. Grid reference is designed for short dramas: you upload up to 9 plot keyframes to lock character consistency and visual style, and the model generates a coherent narrative video.

Start Using SkyReels V4

Experience the world's #1 AI video generation technology today and start your creative journey. From "generation" to "production," the era of video industrialization has arrived.
