Survey on Video SR & VFI & deblur

main note README

  • Q: the diffusion-based VEnhancer alters the generated details, while non-diffusion methods deform human bodies more severely than diffusion ones

Rough directions to consider

  • Reference-based? Emphasize consistency
  • Across clips

data & ckpt & metrics

The RVRT repository has organized many test sets that can be downloaded directly

https://github.com/JingyunLiang/VRT/releases

Following RVRT:

For video SR, we consider two settings: bicubic (BI) and blur-downsampling (BD) degradation.

So the BI-degradation test sets are:

  • REDS4 RGB
  • Vimeo-90K-T
  • Vid4 (available in both BI and BD versions; the difference is small)

BD degradation

  • UDM10
  • Vimeo-90K-T
  • Vid4
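
To make the two settings concrete, here is a minimal pure-Python sketch of BD degradation. It assumes the commonly used recipe of a Gaussian blur with σ = 1.6 followed by keeping every 4th pixel; the σ value, kernel radius, border handling, and all function names (`bd_degrade`, `blur_rows`, ...) are assumptions for illustration and are not taken from the RVRT code. BI degradation would replace the blur-and-subsample step with bicubic resizing.

```python
import math

def gaussian_kernel_1d(sigma, radius):
    """Normalized 1D Gaussian kernel of length 2 * radius + 1."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_rows(img, kernel):
    """Convolve every row with `kernel`, replicating border pixels."""
    radius = len(kernel) // 2
    width = len(img[0])
    out = []
    for row in img:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                xx = min(max(x + k - radius, 0), width - 1)
                acc += w * row[xx]
            new_row.append(acc)
        out.append(new_row)
    return out

def transpose(img):
    return [list(col) for col in zip(*img)]

def bd_degrade(img, scale=4, sigma=1.6, radius=6):
    """Separable Gaussian blur, then keep every `scale`-th pixel (assumed recipe)."""
    kernel = gaussian_kernel_1d(sigma, radius)
    blurred = transpose(blur_rows(transpose(blur_rows(img, kernel)), kernel))
    return [row[::scale] for row in blurred[::scale]]

# Toy 16x16 grayscale "HR frame": a checkerboard of 4x4 blocks.
hr = [[float((x // 4 + y // 4) % 2) for x in range(16)] for y in range(16)]
lr = bd_degrade(hr)
print(len(lr), len(lr[0]))  # 4 4
```

Real test sets apply the degradation per color channel; a single channel is used here to keep the sketch short.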

metrics

survey_IQA.md

REDS4 VSR+DB (val_blur_bicubic)

Params(M) time(s/f) PSNR(RGB) PSNR(Y) SSIM↑ LPIPS ↓ FID DISTS↓ toF↓ WE↓ BRISQUE
BasicVSR++(CVPR2022)
RealBasicVSR(CVPR2022)
VRT(Arxiv2022 -> TIP2024)
RVRT(NIPS2022)
PSRT(NIPS2022)
MGLD-VSR(ECCV2024)
Upscale-A-Video(CVPR2024)
MIA-VSR (CVPR2024) 16.596719 24.920246 0.731834
FMA-Net(CVPR2024) (⚠️ first and last frames are not generated) 9.766503 28.87683 30.2705 0.83172 0.2466 1.9148 45.0958

REDS4 VSR (val_sharp_bicubic)

Params(M) time(s/f) PSNR(RGB) PSNR(Y) SSIM↑ LPIPS ↓ FID DISTS↓ toF↓ WE↓ QualiCLIP MUSIQ DOVER BRISQUE
BasicVSR++(CVPR2022)
RealBasicVSR(CVPR2022)
RVRT(NIPS2022)
PSRT(NIPS2022)
MGLD-VSR(ECCV2024)
Upscale-A-Video(CVPR2024)
MIA-VSR (CVPR2024) 16.596719 32.790506 0.911523
FMA-Net(CVPR2024) (⚠️ first and last frames are not generated) 9.766503 26.377029 27.73774 0.800012 0.2722 2.19973 46.3111
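
The tables distinguish PSNR(RGB) from PSNR(Y). A minimal sketch of the difference, assuming 8-bit images stored as flat lists of `(r, g, b)` tuples and the BT.601 luma conversion that is common in SR evaluation code. The function names are hypothetical and the exact conversion used for the numbers above is not stated in these notes.

```python
import math

def rgb_to_y(pixel):
    """BT.601 luma (range 16..235) for an (r, g, b) pixel in 0..255."""
    r, g, b = pixel
    return 16.0 + (65.481 * r + 128.553 * g + 24.966 * b) / 255.0

def psnr(a, b, peak=255.0):
    """PSNR in dB between two equally long sequences of values."""
    mse = sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    if mse == 0:
        return float("inf")
    return 10.0 * math.log10(peak * peak / mse)

def psnr_rgb(img_a, img_b):
    """PSNR over all three color channels."""
    flat_a = [c for px in img_a for c in px]
    flat_b = [c for px in img_b for c in px]
    return psnr(flat_a, flat_b)

def psnr_y(img_a, img_b):
    """PSNR over the luma channel only."""
    return psnr([rgb_to_y(p) for p in img_a], [rgb_to_y(p) for p in img_b])

# Toy example: the error sits only in the blue channel, which has a small luma weight.
gt = [(100, 150, 200), (50, 60, 70)]
pred = [(100, 150, 210), (50, 60, 80)]
print(psnr_rgb(gt, pred), psnr_y(gt, pred))
```

Because luma down-weights chroma errors, PSNR(Y) is usually higher than PSNR(RGB) for the same output, so the two columns should not be compared against each other.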

Vid4 BI-Degradation VSR

Vid4 BD-Degradation VSR

Vimeo-90K-T BI-Degradation VSR

UDM10 VSR

ckpts

all-in-focus

  • "Foreground-background separation and deblurring super-resolution method"

  • "BokehMe: When Neural Rendering Meets Classical Rendering" CVPR-oral, 2022 Jun 25 paper code pdf note Authors: Juewen Peng, Zhiguo Cao, Xianrui Luo, Hao Lu, Ke Xian, Jianming Zhang

Renders the bokeh (aperture blur) effect; requires a disparity map (similar to a depth map) as input

  • "BokehMe++: Harmonious Fusion of Classical and Neural Rendering for Versatile Bokeh Creation"

Video SR

  • "EDVR: Video Restoration with Enhanced Deformable Convolutional Networks" CVPR NTIRE 1st, 2019 May paper code pdf note Authors: Xintao Wang, Kelvin C. K. Chan, Ke Yu, Chao Dong, Chen Change Loy

  • "Progressive fusion video super-resolution network via exploiting non-local spatio-temporal correlations" ICCV, 2019, PFNL code

  • "BasicVSR++: Improving video super-resolution with enhanced propagation and alignment" CVPR, 2021 Apr 🗿 paper code note

  1. Flow-guided deformable alignment
  2. Adds second-order residual information

  • "Investigating Tradeoffs in Real-World Video Super-Resolution" CVPR, 2021 Nov, RealBasicVSR paper code note

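Blind video SR, improved on the basis of two findings: long-term propagation actually hurts performance when noise is not handled specially; iteration L=10 is too few and causes color artifacts, while 20->30 is better. Builds on BasicVSR by adding a dynamic pre-processing module, and improves the training-data strategy to reduce computation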

  • "VRT: A Video Restoration Transformer" TIP, 2022 Jan 28 paper code pdf note Authors: Jingyun Liang, Jiezhang Cao, Yuchen Fan, Kai Zhang, Rakesh Ranjan, Yawei Li, Radu Timofte, Luc Van Gool (ETH + Meta)

Provides well-organized test-set data 👍

  • "Recurrent Video Restoration Transformer with Guided Deformable Attention" NeurIPS, 2022 June, RVRT 🗽 paper code pdf note

(Figure: RVRT framework)

  • "Rethinking Alignment in Video Super-Resolution Transformers" NIPS, 2022 Jul 18, PSRT paper code pdf note Authors: Shuwei Shi, Jinjin Gu, Liangbin Xie, Xintao Wang, Yujiu Yang, Chao Dong

Finds that optical-flow warping is ill-suited to the VSR task because the flow contains a lot of noise, and switches to attention for alignment

  • "STDAN: Deformable Attention Network for Space-Time Video Super-Resolution" TNNLS, 2023 Feb paper code note

Applies a deformable Transformer to video, performing the deformable attention frame by frame

  • "Expanding Synthetic Real-World Degradations for Blind Video Super Resolution" CVPR, 2023 May paper

  • "Learning Spatial-Temporal Implicit Neural Representations for Event-Guided Video Super-Resolution" CVPR-2023, 2023 Mar 24 paper

  • "Enhancing Video Super-Resolution via Implicit Resampling-based Alignment" CVPR, 2023 Apr 29 paper code pdf note Authors: Kai Xu, Ziwei Yu, Xin Wang, Michael Bi Mi, Angela Yao

Finds that the bilinear interpolation used in optical-flow-based sampling is flawed, and proposes implicit resampling to replace the bilinear sampling step
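
One way to see the weakness of bilinear sampling is its low-pass behavior at sub-pixel offsets. The sketch below is a toy illustration only, not the paper's implicit resampling; `bilinear_sample` and `warp` are hypothetical helpers, and the uniform flow and border clamping are assumptions.

```python
import math

def bilinear_sample(img, x, y):
    """Sample `img` (list of rows) at fractional (x, y) with border clamping."""
    h, w = len(img), len(img[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = x - x0, y - y0
    top = (1 - ax) * img[y0][x0] + ax * img[y0][x1]
    bottom = (1 - ax) * img[y1][x0] + ax * img[y1][x1]
    return (1 - ay) * top + ay * bottom

def warp(img, flow_x, flow_y):
    """Backward warp with a uniform flow: out[y][x] = img(x + flow_x, y + flow_y)."""
    return [[bilinear_sample(img, x + flow_x, y + flow_y)
             for x in range(len(img[0]))]
            for y in range(len(img))]

# Highest-frequency texture: alternating 0/1 columns.
stripes = [[float(x % 2) for x in range(8)] for _ in range(4)]
shifted_int = warp(stripes, 1.0, 0.0)   # integer shift keeps full contrast
shifted_half = warp(stripes, 0.5, 0.0)  # half-pixel shift averages neighbors
print(shifted_int[0][:7])   # [1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
print(shifted_half[0][:7])  # [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5]
```

The half-pixel warp wipes out the texture entirely, which is exactly the kind of high-frequency detail VSR is supposed to recover.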

  • "Motion-Guided Latent Diffusion for Temporally Consistent Real-world Video Super-resolution" ECCV, 2023 Dec, MGLD-VSR paper code note pdf Authors: Xi Yang, Chenhang He, Jianqi Ma, Lei Zhang

  • "Upscale-A-Video: Temporal-Consistent Diffusion Model for Real-World Video Super-Resolution" CVPR, 2023 Dec, Upscale-A-Video paper code website pdf note Authors: Shangchen Zhou, Peiqing Yang, Jianyi Wang, Yihang Luo, Chen Change Loy

  • "TMP: Temporal Motion Propagation for Online Video Super-Resolution" TIP, 2023 Dec 15 paper code pdf note Authors: Zhengqiang Zhang, Ruihuang Li, Shi Guo, Yang Cao, Lei Zhang

  • "FMA-Net: Flow-Guided Dynamic Filtering and Iterative Feature Refinement with Multi-Attention for Joint Video Super-Resolution and Deblurring" CVPR-oral, 2024 Jan 8 ⭐ paper code pdf note Authors: Geunhyuk Youk, Jihyong Oh, Munchurl Kim

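  • Jointly performs deblurring and x4 SR.

  • Experiments show that the conventional conv2d form of dynamic filtering cannot handle large motion. Warping the input features with optical flow first makes it handle large motion.

    • Very similar to DCN, with slightly better results
  • Training data: REDS, excluding the 4 REDS4 videos.

  • Comparing against 4 methods was enough

  • The test code is clean and a pleasure to read 🐻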
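
The large-motion point above can be sketched as follows. This is a toy illustration under strong assumptions (integer flow, hand-set per-pixel kernels, border clamping), not FMA-Net's implementation; `dynamic_filter` and its arguments are hypothetical names. Each output pixel applies its own k×k kernel to a window centered at its flow-shifted position, so setting the flow to zero gives plain dynamic filtering.

```python
def dynamic_filter(feat, kernels, flow=None, ksize=3):
    """Apply a per-pixel kernel to a window centered at (x, y) + flow[y][x].

    feat:    H x W list of floats
    kernels: H x W list of flat kernels, each with ksize * ksize weights
    flow:    optional H x W list of integer (dx, dy) offsets
    """
    h, w = len(feat), len(feat[0])
    r = ksize // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = flow[y][x] if flow is not None else (0, 0)
            cx, cy = x + dx, y + dy
            acc = 0.0
            idx = 0
            for j in range(-r, r + 1):
                for i in range(-r, r + 1):
                    yy = min(max(cy + j, 0), h - 1)
                    xx = min(max(cx + i, 0), w - 1)
                    acc += kernels[y][x][idx] * feat[yy][xx]
                    idx += 1
            out[y][x] = acc
    return out

h, w = 4, 8
feat = [[0.0] * w for _ in range(h)]
feat[2][6] = 1.0                       # the relevant content sits 5 px to the right
identity = [0, 0, 0, 0, 1, 0, 0, 0, 0]  # center-only 3x3 kernel
kernels = [[identity for _ in range(w)] for _ in range(h)]
flow = [[(5, 0) for _ in range(w)] for _ in range(h)]

no_flow = dynamic_filter(feat, kernels)
with_flow = dynamic_filter(feat, kernels, flow)
print(no_flow[2][1], with_flow[2][1])  # 0.0 1.0
```

A 3×3 window around pixel (1, 2) can never reach the content at (6, 2) without the flow offset; with it, the same small kernel does.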

  • "Video Super-Resolution Transformer with Masked Inter&Intra-Frame Attention" CVPR-2024, 2024 Jan 12, MIA-VSR paper code pdf note Authors: Xingyu Zhou, Leheng Zhang, Xiaorui Zhao, Keze Wang, Leida Li, Shuhang Gu

  • "CoCoCo: Improving Text-Guided Video Inpainting for Better Consistency, Controllability and Compatibility" Arxiv, 2024 Mar 18 paper code pdf note Authors: Bojia Zi, Shihao Zhao, Xianbiao Qi, Jianan Wang, Yukai Shi, Qianyu Chen, Bin Liang, Kam-Fai Wong, Lei Zhang

  • "Learning Spatial Adaptation and Temporal Coherence in Diffusion Models for Video Super-Resolution" CVPR-2024, 2024 Mar 25 paper code pdf note Authors: Zhikai Chen, Fuchen Long, Zhaofan Qiu, Ting Yao, Wengang Zhou, Jiebo Luo, Tao Mei

  • "VideoGigaGAN: Towards Detail-rich Video Super-Resolution" ECCV, 2024 Apr 18 paper code ⚠️ web pdf note Authors: Yiran Xu, Taesung Park, Richard Zhang, Yang Zhou, Eli Shechtman, Feng Liu, Jia-Bin Huang, Difan Liu(Adobe)

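Adapts Image GigaGAN (not open-sourced) to video by adding temporal attention and optical flow; replaces the downsample block with pooling to reduce artifacts; only compares PSNR (worse than BasicVSR++), LPIPS (somewhat better), and FVD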
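
Why pooling-style downsampling can reduce artifacts is easiest to see in 1D. The sketch below assumes the pooling is an anti-aliased one (blur, then subsample), which is one plausible reading of the note and not a confirmed detail of VideoGigaGAN; the `[1, 2, 1] / 4` filter and the function names are illustrative choices.

```python
def subsample(signal, stride=2):
    """Naive strided downsampling: keep every `stride`-th sample."""
    return signal[::stride]

def blur_pool(signal, stride=2):
    """Blur with [1, 2, 1] / 4 (replicated borders), then subsample."""
    n = len(signal)
    blurred = []
    for i in range(n):
        left = signal[max(i - 1, 0)]
        right = signal[min(i + 1, n - 1)]
        blurred.append(0.25 * left + 0.5 * signal[i] + 0.25 * right)
    return blurred[::stride]

def max_diff(a, b):
    """Largest per-sample difference, skipping the border sample."""
    return max(abs(x - y) for x, y in zip(a[1:], b[1:]))

# Two "frames" of the same fine texture, shifted by one pixel.
frame_a = [float(i % 2) for i in range(16)]
frame_b = [float((i + 1) % 2) for i in range(16)]

naive_flicker = max_diff(subsample(frame_a), subsample(frame_b))
pooled_flicker = max_diff(blur_pool(frame_a), blur_pool(frame_b))
print(naive_flicker, pooled_flicker)  # 1.0 0.0
```

With naive striding, a one-pixel shift flips the downsampled texture from all 0 to all 1, which shows up as temporal flicker in video; blurring first removes that aliasing.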

  • "DaBiT: Depth and Blur informed Transformer for Joint Refocusing and Super-Resolution" Arxiv, 2024 Jul 1 paper code pdf note ⚠️ Authors: Crispian Morris, Nantheera Anantrasirichai, Fan Zhang, David Bull

  • "Arbitrary-Scale Video Super-Resolution with Structural and Textural Priors" ECCV, 2024 Jul 13, ST-AVSR paper code pdf note Authors: Wei Shang, Dongwei Ren, Wanying Zhang, Yuming Fang, Wangmeng Zuo, Kede Ma

  • "RealViformer: Investigating Attention for Real-World Video Super-Resolution" ECCV, 2024 Jul 19 paper code pdf note Authors: Yuehan Zhang, Angela Yao

  • "SeeClear: Semantic Distillation Enhances Pixel Condensation for Video Super-Resolution" NIPS, 2024 Oct 8 paper code pdf note ⚠️ Authors: Qi Tang, Yao Zhao, Meiqin Liu, Chao Yao

  • "Inflation with Diffusion: Efficient Temporal Adaptation for Text-to-Video Super-Resolution" WACV paper

diffusion

  • "Motion-Guided Latent Diffusion for Temporally Consistent Real-world Video Super-resolution" ECCV, 2023 Dec, MGLD-VSR paper code note pdf Authors: Xi Yang, Chenhang He, Jianqi Ma, Lei Zhang

  • "Upscale-A-Video: Temporal-Consistent Diffusion Model for Real-World Video Super-Resolution" CVPR, 2023 Dec, Upscale-A-Video paper code website pdf note Authors: Shangchen Zhou, Peiqing Yang, Jianyi Wang, Yihang Luo, Chen Change Loy

cartoon

  • "AnimeSR: Learning Real-World Super-Resolution Models for Animation Videos" NIPS, 2022 Jul ⭐ paper code

Compression

  • "Compression-Aware Video Super-Resolution" CVPR-2023 paper

  • "Towards High-Quality and Efficient Video Super-Resolution via Spatial-Temporal Data Overfitting" CVPR_highlight-2024, 2023 Mar paper code note

Event Camera

  • "EvTexture: Event-driven Texture Enhancement for Video Super-Resolution" Arxiv, 2024 Jun 19, EvTexture paper code pdf note Authors: Dachun Kai, Jiayao Lu, Yueyi Zhang, Xiaoyan Sun

3D SR 🐻

  • "SuperGaussian: Repurposing Video Models for 3D Super Resolution" ECCV, 2024 Jun 2 paper code pdf note Authors: Yuan Shen, Duygu Ceylan, Paul Guerrero, Zexiang Xu, Niloy J. Mitra, Shenlong Wang, Anna Frühstück

  • "Sequence Matters: Harnessing Video Models in 3D Super-Resolution" AAAI, 2024 Dec 16 paper code pdf note Authors: Hyun-kyu Ko, Dongheok Park, Youngin Park, Byeonghyeon Lee, Juhee Han, Eunbyung Park

Video Deblur

  • "Spatio-Temporal Filter Adaptive Network for Video Deblurring" ICCV, 2019 code

A Dynamic Filter Network predicts degradation features, which are fused into the deblur/SR pipeline

  • "High-resolution optical flow and frame-recurrent network for video super-resolution and deblurring" NeuralComputing, 2022 Jun 7, HOFFR paper code pdf note Authors: Ning Fang, Zongqian Zhan

Two branches, one for SR and one for deblurring, merged at the end with channel attention; an optical flow is predicted at the start

  • "Rethinking Video Deblurring with Wavelet-Aware Dynamic Transformer and Diffusion Model" ECCV, 2024 Aug 24 paper code pdf note Authors: Chen Rao, Guangyuan Li, Zehua Lan, Jiakai Sun, Junsheng Luan, Wei Xing, Lei Zhao, Huaizhong Lin, Jianfeng Dong, Dalong Zhang

Dynamic Filter Network

Can be understood as dynamically predicting convolution-like kernels from the input degradation; works somewhat better than DCN

  • "Spatio-Temporal Filter Adaptive Network for Video Deblurring" ICCV, 2019 code

  • "FMA-Net: Flow-Guided Dynamic Filtering and Iterative Feature Refinement with Multi-Attention for Joint Video Super-Resolution and Deblurring" CVPR-oral, 2024 Jan 8 paper code pdf note Authors: Geunhyuk Youk, Jihyong Oh, Munchurl Kim

Reference-VSR 📖

  • "TDAN: Temporally Deformable Alignment Network for Video Super-Resolution" CVPR, 2018 Dec paper

  • "EFENet: Reference-based Video Super-Resolution with Enhanced Flow Estimation" CICAI, 2021 Oct paper

  • "Reference-based Video Super-Resolution Using Multi-Camera Video Triplets" CVPR2022, 2021 Oct paper code

  • "NeuriCam: Key-Frame Video Super-Resolution and Colorization for IoT Cameras" MobiCom-2023, paper

  • "Reference-Based Image and Video Super-Resolution via C2-Matching" TPAMI, 2023 Dec 21 paper

  • "Efficient Reference-based Video Super-Resolution (ERVSR): Single Reference Image Is All You Need" WACV, 2023 paper code

  • "Reference-based Restoration of Digitized Analog Videotapes" WACV, 2023 Oct, TAPE paper code note Authors: Lorenzo Agnolucci, Leonardo Galteri, Marco Bertini, Alberto Del Bimbo

  • "Self-Supervised Learning for Real-World Super-Resolution from Dual Zoomed Observations" ECCV, 2022 Mar, paper

  • "Self-Supervised Learning for Real-World Super-Resolution from Dual and Multiple Zoomed Observations" TPAMI, 2024 May, ⭐ paper code

  • "Reference-based Burst Super-resolution" ACM-MM paper code

  • "Personalized Representation from Personalized Generation" paper

  • "Reference-based Video Super-Resolution Using Multi-Camera Video Triplets" CVPR, 2022 Mar, RefVSR 🗽 paper website code pdf

  • "RefVSR++: Exploiting Reference Inputs for Reference-based Video Super-resolution" Arxiv, 2023 Jul RefVSR++ paper

  • "Toward Real-World Super Resolution With Adaptive Self-Similarity Mining"

  • "Reference-based Burst Super-resolution" ACM-MM, 2024 Oct 28 paper code pdf note Authors: Seonggwan Ko, Yeong Jun Koh, Donghyeon Cho

Image

  • "Robust Reference-based Super-Resolution via C2-Matching" CVPR, 2021 Jun, C2-Matching 🗽 paper code

  • "Reference-Based Image and Video Super-Resolution via C2-Matching" TPAMI, 2023 July 01 paper

  • "Dual-Camera Super-Resolution with Aligned Attention Modules" ICCV oral, 2021 Sep, DCSR paper code note

  • "Reference-based Image Super-Resolution with Deformable Attention Transformer" ECCV, 2022 Jul, DATSR 🗽 paper code note

  • "DARTS: Double Attention Reference-based Transformer for Super-resolution" Arxiv, 2023 Jul paper code

  • "LMR: A Large-Scale Multi-Reference Dataset for Reference-based Super-Resolution" ICCV, 2023 paper code

  • "Refine-by-Align: Reference-Guided Artifacts Refinement through Semantic Alignment" paper

dense correspondence 🕸️

  • "Learning Video Representations from Correspondence Proposals" CVPR, 2019 May 20 paper

  • "Rethinking Self-supervised Correspondence Learning: A Video Frame-level Similarity Perspective" WACV, 2021 Mar 31 paper

  • "Comparing Correspondences: Video Prediction with Correspondence-wise Losses" CVPR, 2021 Apr paper

  • "PDC-Net+: Enhanced Probabilistic Dense Correspondence Network" TPAMI, 2021 Sep 28, paper

  • "Modelling Neighbor Relation in Joint Space-Time Graph for Video Correspondence Learning" ICCV-2021, 2021 Sep 28 paper

  • "Warp Consistency for Unsupervised Learning of Dense Correspondences" ICCV-2021-oral, 2021 Apr 7 paper

  • "Locality-Aware Inter-and Intra-Video Reconstruction for Self-Supervised Correspondence Learning" CVPR-2022, 2022 Mar paper

  • "Probabilistic Warp Consistency for Weakly-Supervised Semantic Correspondences" CVPR-2022, 2022 Mar paper code

  • "Correspondence Matters for Video Referring Expression Comprehension" ACM-MM-2022, 2022 Jul paper

  • "Neural Matching Fields: Implicit Representation of Matching Fields for Visual Correspondence" NeurIPS, 2022 Oct 6 paper code website note

    Uses an implicit neural representation (INR) for feature-point matching; SOTA, but inference takes 8-9 s per image

  • "DiffMatch: Diffusion Model for Dense Matching" Arxiv, 2023 May ⚠️ paper website

    From the same group as Neural Matching Fields

  • "Emergent Correspondence from Image Diffusion" NIPS, 2023 Jun 6 paper code pdf note Authors: Luming Tang, Menglin Jia, Qianqian Wang, Cheng Perng Phoo, Bharath Hariharan

  • "Match me if you can: Semi-Supervised Semantic Correspondence Learning with Unpaired Images" ACCV, 2023 Nov 30 paper

  • "Unifying Feature and Cost Aggregation with Transformers for Semantic and Visual Correspondence" ICLR, 2024 Mar 17 paper

  • "DreamMatcher: Appearance Matching Self-Attention for Semantically-Consistent Text-to-Image Personalization" 2024 Feb paper

A new KV memory: the memory keys/values are matched against the new ones

  • "CONDA: Condensed Deep Association Learning for Co-Salient Object Detection" Arxiv, 2024 Oct 10 paper

  • "Cross-View Completion Models are Zero-shot Correspondence Estimators" 2024 Dec paper

video grounding

  • "Knowing Where to Focus: Event-aware Transformer for Video Grounding" ICCV-2023, 2023 Aug 14 paper

  • "OmniViD: A Generative Framework for Universal Video Understanding" CVPR-2024 paper

Space-Time VSR

VSR+VFI

  • "How Video Super-Resolution and Frame Interpolation Mutually Benefit" ACM-MM paper

  • "Enhancing Space-time Video Super-resolution via Spatial-temporal Feature Interaction" Arxiv, 2022 Jul

https://github.com/yuezijie/STINet-Space-time-Video-Super-resolution?tab=readme-ov-file

  • "RSTT: Real-time Spatial Temporal Transformer for Space-Time Video Super-Resolution" CVPR 2022 paper

  • "MoTIF: Learning Motion Trajectories with Local Implicit Neural Functions for Continuous Space-Time Video Super-Resolution" ICCV-2023, 2023 Jul 16 paper code

  • "Scale-adaptive feature aggregation for efficient space-time video super-resolution" WACV 2024

  • "Complementary Dual-Branch Network for Space-Time Video Super-Resolution" ICPR, 2024 Dec 05

    https://link.springer.com/chapter/10.1007/978-3-031-78125-4_13

  • "A Resource-Constrained Spatio-Temporal Super Resolution Model" 2024

  • "Global Spatial-Temporal Information-based Residual ConvLSTM for Video Space-Time Super-Resolution" 2024 Jul paper

  • "VEnhancer: Generative Space-Time Enhancement for Video Generation" Arxiv, 2024 Jul 10 paper code pdf note Authors: Jingwen He, Tianfan Xue, Dongyang Liu, Xinqi Lin, Peng Gao, Dahua Lin, Yu Qiao, Wanli Ouyang, Ziwei Liu

  • "3DAttGAN: A 3D Attention-based Generative Adversarial Network for Joint Space-Time Video Super-Resolution" paper

VFI

  • "LDMVFI: Video Frame Interpolation with Latent Diffusion Models" Arxiv, 2023 Mar, LDMVFI paper code note

  • "Disentangled Motion Modeling for Video Frame Interpolation" Arxiv, 2024 Jun 25 paper code pdf note Authors: Jaihyun Lew, Jooyoung Choi, Chaehun Shin, Dahuin Jung, Sungroh Yoon

We use Vimeo90k for training, and use SNU-FILM, Xiph, Middlebury-others for validation.

  • "VFIMamba: Video Frame Interpolation with State Space Models" NIPS, 2024 Oct

    https://github.com/MCG-NJU/VFIMamba

  • "Perception-Oriented Video Frame Interpolation via Asymmetric Blending" CVPR, 2024 Apr 10, PerVFI paper code pdf note Authors: Guangyang Wu, Xin Tao, Changlin Li, Wenyi Wang, Xiaohong Liu, Qingqing Zheng

Image

  • "DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors" ECCV, 2023 Oct 18 paper code pdf note Authors: Jinbo Xing, Menghan Xia, Yong Zhang, Haoxin Chen, Wangbo Yu, Hanyuan Liu, Xintao Wang, Tien-Tsin Wong, Ying Shan

  • "ToonCrafter: Generative Cartoon Interpolation" Arxiv, 2024 May 28 paper code pdf note Authors: Jinbo Xing, Hanyuan Liu, Menghan Xia, Yong Zhang, Xintao Wang, Ying Shan, Tien-Tsin Wong

  • "FILM: Frame Interpolation for Large Motion"

cartoon

  • "Thin-Plate Spline-based Interpolation for Animation Line Inbetweening" paper

AnyScale

  • "Lumina-T2X: Transforming Text into Any Modality, Resolution, and Duration via Flow-based Large Diffusion Transformers" Arxiv, 2024 May 9 paper code pdf note Authors: Peng Gao, Le Zhuo, Dongyang Liu, Ruoyi Du, Xu Luo, Longtian Qiu, Yuhang Zhang, Chen Lin, Rongjie Huang, Shijie Geng, Renrui Zhang, Junlin Xi, Wenqi Shao, Zhengkai Jiang, Tianshuo Yang, Weicai Ye, He Tong, Jingwen He, Yu Qiao, Hongsheng Li

VAE optimization

  • "CV-VAE: A Compatible Video VAE for Latent Generative Video Models" NIPS, 2024 May 30 paper code pdf note Authors: Sijie Zhao, Yong Zhang, Xiaodong Cun, Shaoshu Yang, Muyao Niu, Xiaoyu Li, Wenbo Hu, Ying Shan

Optimizes the 3D VAE for SVD: 25 frames -> 96 frames

Text/FrameConsistency

  • "VideoRepair: Improving Text-to-Video Generation via Misalignment Evaluation and Localized Refinement" Arxiv, 2024 Nov 22 paper code pdf note Authors: Daeun Lee, Jaehong Yoon, Jaemin Cho, Mohit Bansal

  • "Video-Infinity: Distributed Long Video Generation"