Text-to-Video
Generate high-quality videos through text descriptions, supporting complex scene descriptions and storylines. The AI automatically understands and creates video content that matches the description.
Kunlun Tech AI - The New King of Video Models
Topped the global leaderboard! Surpassing Veo 3.1 and Sora 2, the world's first unified audio-video AI model supporting 1080p cinematic video generation.
SkyReels-V4 from Kunlun Tech has topped the Artificial Analysis text-to-video (with audio) global leaderboard, surpassing Veo 3.1 and Sora 2. Just a month earlier, its Preview version ranked #2 globally.
SkyReels V4 ranked #1 on the Artificial Analysis text-to-video (with audio) global leaderboard, ahead of internationally renowned models such as OpenAI's Sora 2 and Google's Veo 3.1, establishing itself in the global top tier. The release marks a move from "generating clips" to controllable, continuous, complete video production.
SkyReels V4 moved from global #2 to global #1 in just one month. The upgrade is not a set of minor tweaks but a comprehensive leap in capability.
The fully upgraded full-modal reinforcement learning system makes videos "make sense." The model no longer mechanically "assembles frames" according to prompts, but begins to understand the logic of the entire process.
Keyframe reference: provide multiple keyframes to the AI so that the key transitions stay entirely under your control, while the model automatically generates the intermediate frames. This mode emphasizes control over narrative rhythm and action continuity.
Grid reference: a feature designed specifically for short dramas (including AI comic short dramas). Users can upload up to 9 plot keyframes at once, and the model stably extracts and preserves character features and scene styles, generating logically complete narrative videos with consistent characters and scenes throughout. It is mainly used to lock character consistency and visual style.
SkyReels V4 adopts a video/audio dual-stream MMDiT architecture, making it the world's first unified audio-video AI model. It achieves tight integration of audio and video from the ground up, supporting cinematic audio-video synchronized generation.
Upload reference images, and the AI transforms static images into dynamic videos, maintaining the image's style and details while adding natural motion effects.
Supports various generation methods including reference image-to-video, video extension, audio-driven virtual avatars, and more to meet different creative needs.
The world's first audio-video synchronized generation technology, automatically generating matching audio while generating video, achieving true audio-visual integration.
Supports mask-based editing and video inpainting, allowing targeted modification or filling of specific video regions without regenerating the entire video.
Generated videos can match real-person footage, with significantly reduced "AI feel," achieving professional cinema-level visual quality.
Provide multiple keyframe images to control narrative rhythm and action continuity, with intermediate frames automatically generated.
Upload up to 9 plot keyframes to lock character consistency and visual style, generating coherent narrative videos.
SkyReels V4 adopts a video/audio dual-stream MMDiT (Multimodal Diffusion Transformer) architecture that joins audio and video from the ground up. This architecture allows the model to "see, hear, and create" within a unified framework, addressing the separate audio and video processing found in traditional methods.
Unifies all input forms, including text, images, video, audio, and masks, under a single framework. Tasks that previously required relaying work across multiple models and aligning the results by hand can now be completed in a single generation pass, significantly reducing engineering complexity.
SkyReels V4 is built on Kunlun Tech's in-house Tiangong language model, inheriting its semantic understanding capabilities to interpret the user's creative intent accurately and turn it into high-quality video content.
SkyReels V4 is available through the official website platform and API services. Research papers have been published on arXiv, providing valuable learning and research resources for global developers and researchers. Note: V2 and V3 versions are open-source, while V4 is available through the API platform.
From SkyReels V3 to V4, Kunlun Tech AI has achieved major technological breakthroughs. Here's a detailed comparison of both versions:
| Feature | SkyReels V3 | SkyReels V4 |
|---|---|---|
| Release Date | January 2025 | February 27, 2026 |
| Model Type | Open Source | API Service / Unified Audio-Video Model |
| Audio Generation | Not Supported | Audio-Visual Joint Generation |
| Max Resolution | 720p | 1080p |
| Frame Rate | 24 FPS | 32 FPS |
| Video Duration | 10 seconds | 15 seconds |
| Editing Features | Basic Editing | Advanced Inpainting |
| Keyframe Reference | Not Supported | Supported |
| Grid Reference | Not Supported | Supported (9 images) |
| Global Ranking | - | Artificial Analysis #1 |
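As a rough illustration of what the duration and frame-rate rows imply, the sketch below computes the number of frames in one maximum-length clip for each version. It assumes the maximum duration and the listed frame rate can be used together, which the table does not state explicitly; the names `SPECS` and `max_frames` are illustrative only.

```python
# Values copied from the comparison table; combining them is an assumption.
SPECS = {
    "SkyReels V3": {"fps": 24, "max_seconds": 10},
    "SkyReels V4": {"fps": 32, "max_seconds": 15},
}


def max_frames(fps: int, max_seconds: int) -> int:
    """Frames in one clip at the listed frame rate and maximum duration."""
    return fps * max_seconds


frame_counts = {name: max_frames(**spec) for name, spec in SPECS.items()}

for name, frames in frame_counts.items():
    print(f"{name}: {frames} frames per maximum-length clip")
```

Under that assumption, a maximum-length V4 clip contains twice as many frames as a maximum-length V3 clip (480 versus 240), in addition to the higher resolution.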
Built on a DiT video diffusion model with frame interpolation extension and reinforcement learning action optimization, establishing the technical foundation for the SkyReels series.
Open source version released, supporting reference image-to-video, video extension, audio-driven virtual avatars, and more.
The world's first infinite-duration movie generation model, built on a diffusion forcing framework and breaking through video duration limits.
Preview version released, ranking #2 on Artificial Analysis global leaderboard, surpassing Sora 2 and Veo 3.1.
The upgraded SkyReels-V4 reached the global top spot, becoming the new king of video models. It moves from "generating clips" to controllable, continuous, complete video production, opening a new era of audio-video joint generation.
Kunlun Tech Group (Stock Code: SZ300418) is a leading Chinese internet platform company with multiple well-known AI products including Tiangong AI. The SkyReels series is developed by the Skywork AI team under Kunlun Tech Group.
Tiangong AI is a large language model independently developed by Kunlun Tech, with powerful semantic understanding and generation capabilities. SkyReels V4 is developed based on the Tiangong language model, inheriting its excellent language understanding abilities.
Tiangong AI applies SkyReels-V4 to its own short drama platform DramaWave. As an overseas paid short drama platform launched in October 2024, DramaWave has exceeded 80 million monthly active users, achieving a complete closed loop from technology to product to commercialization.
SkyReels V4 is widely used in:
SkyReels V4 is available through the official website platform and API services. You can experience the online version at skyreels.ai or integrate it into your applications through the API platform. SkyReels V2 and V3 are open-source versions available on GitHub.
SkyReels V4 and Seedance 2.0 are both top Chinese AI video generation models. SkyReels V4 currently ranks #1 on the Artificial Analysis global leaderboard. Its distinguishing advantage is audio-video joint generation, which makes it the world's first unified model to offer this capability.
You can visit skyreels.ai to use the online version directly, or integrate it into your applications through the API platform. The open-source versions, SkyReels V2 and V3, are available on GitHub.
SkyReels V4 supports text input (text-to-video), image input (image-to-video), keyframe reference, grid reference (up to 9 images), and other multi-modal reference inputs to meet various creative scenario needs.
Keyframe reference lets you provide multiple keyframe images to control narrative rhythm and action continuity. Grid reference is designed for short dramas: you upload up to 9 plot keyframes to lock character consistency and visual style, and the model generates a coherent narrative video.
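The difference between the two reference modes can be sketched as a small client-side check. This is a hypothetical illustration, not the SkyReels API: the class `ReferenceRequest`, its field names, and the mode strings are invented for this example. Only the 9-image limit for grid reference and the "multiple keyframes" requirement come from the text above, and reading "multiple" as "at least two" is an assumption.

```python
from dataclasses import dataclass, field
from typing import List

MAX_GRID_IMAGES = 9      # limit stated in the text for grid reference
MIN_KEYFRAME_IMAGES = 2  # assumption: "multiple keyframes" read as two or more


@dataclass
class ReferenceRequest:
    """Hypothetical request container; not the real SkyReels API schema."""
    prompt: str
    mode: str  # "keyframe" or "grid" (illustrative names)
    image_paths: List[str] = field(default_factory=list)

    def validate(self) -> None:
        """Raise ValueError if the request breaks the limits described above."""
        count = len(self.image_paths)
        if self.mode == "keyframe":
            if count < MIN_KEYFRAME_IMAGES:
                raise ValueError("keyframe reference needs multiple keyframes")
        elif self.mode == "grid":
            if not 1 <= count <= MAX_GRID_IMAGES:
                raise ValueError(
                    f"grid reference accepts 1 to {MAX_GRID_IMAGES} images, got {count}"
                )
        else:
            raise ValueError(f"unknown mode: {self.mode!r}")


# Usage: a nine-panel grid request passes the check.
request = ReferenceRequest(
    prompt="A detective follows a suspect through a night market",
    mode="grid",
    image_paths=[f"panel_{i}.png" for i in range(1, 10)],
)
request.validate()
```

Checking the limits before upload keeps an oversized grid from being rejected later; how the real service reports such errors is not described in the text.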
Experience the world's #1 AI video generation technology today and start your creative journey. From "generation" to "production," the era of video industrialization has arrived.