Video-Bench: Human-Aligned Video Generation Benchmark

by Hui Han1, Siyuan Li1, Jiaqi Chen2,3,4, Yiwen Yuan5, Yuling Wu6, Chak Tou Leong5, Hanwen Du7, Junchen Fu8, Youhua Li9, Jie Zhang4, Chi Zhang10, Li-jia Li11, Yongxin Ni12

1. Shanghai Jiao Tong University, 2. Stanford University, 3. Fellou AI, 4. Fudan University, 5. Carnegie Mellon University, 6. Hong Kong Polytechnic University, 7. Soochow University, 8. University of Glasgow, 9. City University of Hong Kong, 10. Westlake University, 11. LiveX AI, 12. National University of Singapore

Project Overview

Assessing generated video is essential for ensuring that generative models produce visually realistic, high-quality videos that match human expectations. Existing video generation benchmarks align poorly with human judgment. To address this, we introduce Video-Bench, a comprehensive benchmark that uses large language models (LLMs) to evaluate video generation quality. The framework relies on automated multimodal LLM evaluation, which improves alignment with human preferences. Experimental results show that Video-Bench agrees with human preferences significantly better than previous methods and gives more objective and accurate insight into generated video quality.

Main Results

Video-Bench Leaderboard

Higher scores indicate better performance. The best score in each dimension is highlighted in bold. "Avg Rank" is the average rank across dimensions; lower is better.
Model          Video Quality  Video-Condition Alignment  Overall Rank
Gen3           4.66           4.38                       1
CogVideoX      3.84           4.62                       2
VideoCrafter2  4.08           4.18                       3
Kling          4.26           4.07                       4
Show-1         3.30           4.21                       5
LaVie          3.00           3.71                       6
PiKa-Beta      3.76           2.60                       7
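
To make the "average rank across dimensions" idea concrete, here is a minimal Python sketch that ranks each model within each dimension and then averages those ranks. It is an illustration only, not the authors' scoring code: the helper names `rank_descending` and `average_rank` are hypothetical, and the input is just the two aggregate columns shown above. Because the published Overall Rank may be computed over finer-grained dimensions that are not listed here, this two-column sketch need not reproduce it exactly (for example, it gives PiKa-Beta 6.0 and LaVie 6.5).

```python
from typing import Dict, Tuple

# Scores copied from the leaderboard above:
# (Video Quality, Video-Condition Alignment).
scores: Dict[str, Tuple[float, float]] = {
    "Gen3": (4.66, 4.38),
    "CogVideoX": (3.84, 4.62),
    "VideoCrafter2": (4.08, 4.18),
    "Kling": (4.26, 4.07),
    "Show-1": (3.30, 4.21),
    "LaVie": (3.00, 3.71),
    "PiKa-Beta": (3.76, 2.60),
}


def rank_descending(values: Dict[str, float]) -> Dict[str, int]:
    """Rank 1 = highest score. Ties are not handled (none occur in this data)."""
    ordered = sorted(values, key=values.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


def average_rank(table: Dict[str, Tuple[float, ...]]) -> Dict[str, float]:
    """Average each model's per-dimension rank (hypothetical helper)."""
    n_dims = len(next(iter(table.values())))
    per_dim = [
        rank_descending({model: row[d] for model, row in table.items()})
        for d in range(n_dims)
    ]
    return {model: sum(r[model] for r in per_dim) / n_dims for model in table}


avg = average_rank(scores)
# Print models from best (lowest) to worst (highest) average rank.
for model, value in sorted(avg.items(), key=lambda item: item[1]):
    print(f"{model:14s} {value:.1f}")
```

Averaging ranks rather than raw scores keeps one dimension with a wider score range from dominating the ordering, which is one plausible reason to report a rank-based summary.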

Human Preference Alignment Scores

This score is measured by Spearman's rank correlation coefficient; higher scores indicate better alignment. The best score in each dimension is highlighted in bold. ComBench* [sun2024t2v] denotes a version reproduced on our benchmark metrics.
Entities  Video Quality  Video-Condition Alignment  Average Score
HU - HU   0.63           0.47                       0.52
HU - GPT  0.51           0.47                       0.41
HU - HA   0.61           0.50                       0.50
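
For readers unfamiliar with the metric, the sketch below computes Spearman's rank correlation in plain Python: both score lists are converted to ranks (tied values share the mean of their positions), and the Pearson correlation of those ranks is returned. The function names and the example ratings are hypothetical and chosen only for illustration; they are not data from Video-Bench, and this is not the authors' evaluation code.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical ratings for five videos (illustrative values only).
human_scores = [4, 2, 5, 3, 1]
model_scores = [3.5, 2.0, 4.8, 4.0, 1.2]
rho = spearman_rho(human_scores, model_scores)
print(f"Spearman rho = {rho:.2f}")
```

Because only the ordering of the scores matters, an evaluator can use a different numeric scale from the human annotators and still reach a high correlation, as long as it orders the videos in a similar way.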

Project Video Demonstration

GitHub Repository

For more details, visit the official repository: Video-Bench GitHub Repository