- Official website: http://cvpr2021.thecvf.com
- Dates: June 19-25, 2021
- Paper acceptance announced: February 28, 2021
- Official list of accepted CVPR 2021 papers: http://cvpr2021.thecvf.com/sites/default/files/2021-03/accepted_paper_ids.txt
- 1. Super-Resolution
- 2. Image Deraining
- 3. Image Dehazing
- 4. Deblurring
- 5. Denoising
- 6. Image Restoration
- 7. Image Enhancement
- 8. Image Demoireing
- 9. Image Shadow Removal
- 10. Image Translation
- 11. Frame Interpolation
- 12. Video Compression
- 13. Image Editing
- Image Object Detection
- Video Object Detection
- 3D Object Detection
- Activity Detection
- Anomaly Detection
- Panoptic Segmentation
- Semantic Segmentation
- Instance Segmentation
- Superpixel
- Video Object Segmentation
- Matting
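- Paper: https://arxiv.org/abs/2011.14631
- Homepage: http://www.liuyebin.com/crossMPI/crossMPI.html
- Analysis: CVPR 2021. Cross-MPI is an end-to-end network that uses the underlying scene structure as a cue and achieves high-fidelity super-resolution even across a large (8x) resolution gap.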
CT Film Recovery via Disentangling Geometric Deformation and Illumination Variation: Simulated Datasets and Deep Models
- Paper: https://arxiv.org/abs/2103.01255
- Code: https://github.com/tsingqguo/exposure-fusion-shadow-removal
- Paper: https://arxiv.org/abs/2012.08512
- Code: https://tarun005.github.io/FLAVR/Code
- Homepage: https://tarun005.github.io/FLAVR/
[1] Semantic Relation Reasoning for Shot-Stable Few-Shot Object Detection
[2] UP-DETR: Unsupervised Pre-training for Object Detection with Transformers
- Analysis: unsupervised pre-training for detectors
[3] Positive-Unlabeled Data Purification in the Wild for Object Detection
[4] General Instance Distillation for Object Detection
[5] Instance Localization for Self-supervised Detection Pretraining
[6] Multiple Instance Active Learning for Object Detection
[7] Towards Open World Object Detection
[8] You Only Look One-level Feature
[9] End-to-End Object Detection with Fully Convolutional Network
[10] FSCE: Few-Shot Object Detection via Contrastive Proposal Encoding
[11] Generalized Focal Loss V2: Learning Reliable Localization Quality Estimation for Dense Object Detection
[12] MeGA-CDA: Memory Guided Attention for Category-Aware Unsupervised Domain Adaptive Object Detection
[13] OPANAS: One-Shot Path Aggregation Network Architecture Search for Object Detection
[1] Depth from Camera Motion and Object Detection
[2] There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge
[3] Dogfight: Detecting Drones from Drone Videos
[1] 3DIoUMatch: Leveraging IoU Prediction for Semi-Supervised 3D Object Detection
[2] Categorical Depth Distribution Network for Monocular 3D Object Detection
[3] ST3D: Self-training for Unsupervised Domain Adaptation on 3D Object Detection
[4] Center-based 3D Object Detection and Tracking
[1] Coarse-Fine Networks for Temporal Activity Detection in Videos
[2] Detecting Human-Object Interaction via Fabricated Compositional Learning
[3] Reformulating HOI Detection as Adaptive Set Prediction
[4] QPIC: Query-Based Pairwise Human-Object Interaction Detection with Image-Wide Contextual Information
[5] End-to-End Human Object Interaction Detection with HOI Transformer
[1] Multiresolution Knowledge Distillation for Anomaly Detection
[2] ReDet: A Rotation-equivariant Detector for Aerial Object Detection
[3] Dense Label Encoding for Boundary Discontinuity Free Rotation Detection
[4] Skeleton Merger: an Unsupervised Aligned Keypoint Detector
[1] Few-Shot Segmentation Without Meta-Learning: A Good Transductive Inference Is All You Need?
[2] PointFlow: Flowing Semantics Through Points for Aerial Image Segmentation
[3] FedDG: Federated Domain Generalization on Medical Image Segmentation via Episodic Learning in Continuous Frequency Space
[1] Cross-View Regularization for Domain Adaptive Panoptic Segmentation
[2] 4D Panoptic LiDAR Segmentation
[1] Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges
[2] PLOP: Learning without Forgetting for Continual Semantic Segmentation
[3] Cross-Dataset Collaborative Learning for Semantic Segmentation
[4] BBAM: Bounding Box Attribution Map for Weakly Supervised Semantic and Instance Segmentation
[5] Continual Semantic Segmentation via Repulsion-Attraction of Sparse and Disentangled Latent Representations
[6] Semantic Segmentation for Real Point Cloud Scenes via Bilateral Augmentation and Adaptive Fusion
[7] Capturing Omni-Range Context for Omnidirectional Segmentation
[8] MetaCorrection: Domain-aware Meta Loss Correction for Unsupervised Domain Adaptation in Semantic Segmentation
[9] Learning Statistical Texture for Semantic Segmentation
[10] Semi-supervised Domain Adaptation based on Dual-level Domain Mixing for Semantic Segmentation
[11] Multi-Source Domain Adaptation with Collaborative Learning for Semantic Segmentation
### Instance Segmentation
[1] End-to-End Video Instance Segmentation with Transformers
[2] BBAM: Bounding Box Attribution Map for Weakly Supervised Semantic and Instance Segmentation
## Superpixel
[1] Learning the Superpixel in a Non-iterative and Lifelong Manner
[1] Learning to Recommend Frame for Interactive Video Object Segmentation in the Wild
[2] Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware Fusion
[1] Real-Time High Resolution Background Matting
[1] CanonPose: Self-supervised Monocular 3D Human Pose Estimation in the Wild
[2] PCLs: Geometry-aware Neural Reconstruction of 3D Pose with Perspective Crop Layers
[3] DCPose: Deep Dual Consecutive Network for Human Pose Estimation
[4] Differentiable Multi-Granularity Human Representation Learning for Instance-Aware Human Semantic Parsing
[1] Camera-Space Hand Mesh Recovery via Semantic Aggregation and Adaptive 2D-1D Registration
[2] Skeleton Based Sign Language Recognition Using Whole-body Keypoints
[1] GDR-Net: Geometry-Guided Direct Regression Network for Monocular 6D Object Pose Estimation
[2] Robust Neural Routing Through Space Partitions for Camera Relocalization in Dynamic Indoor Environments
[3] MultiBodySync: Multi-Body Segmentation and Motion Estimation via 3D Scan Synchronization
[1] Cross Modal Focal Loss for RGBD Face Anti-Spoofing
[2] When Age-Invariant Face Recognition Meets Face Age Synthesis: A Multi-Task Learning Framework
[3] Multi-attentional Deepfake Detection
[4] Image-to-image Translation via Hierarchical Style Disentanglement
[5] A 3D GAN for Improved Large-pose Facial Recognition
[1] HPS: localizing and tracking people in large 3D scenes from wearable sensors
[2] Track to Detect and Segment: An Online Multi-Object Tracker
[3] Probabilistic Tracklet Scoring and Inpainting for Multiple Object Tracking
[4] Rotation Equivariant Siamese Networks for Tracking
[5] Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking
[6] Learning a Proposal Classifier for Multiple Object Tracking
[7] Center-based 3D Object Detection and Tracking
[1] Frequency-aware Discriminative Feature Learning Supervised by Single-Center Loss for Face Forgery Detection
[2] 3DCaricShop: A Dataset and A Baseline Method for Single-view 3D Caricature Face Reconstruction
[3] ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis
[4] Image-to-image Translation via Hierarchical Style Disentanglement
[5] When Age-Invariant Face Recognition Meets Face Age Synthesis: A Multi-Task Learning Framework
[6] PISE: Person Image Synthesis and Editing with Decoupled GAN
[7] Soft-IntroVAE: Analyzing and Improving Introspective Variational Autoencoders
[1] Cross Modal Focal Loss for RGBD Face Anti-Spoofing
[2] Multi-attentional Deepfake Detection
## Re-Identification
[1] Meta Batch-Instance Normalization for Generalizable Person Re-Identification
[2] On Semantic Similarity in Video Retrieval
paper | code
[1] QAIR: Practical Query-efficient Black-Box Attacks for Image Retrieval
paper
[1] Temporally-Weighted Hierarchical Clustering for Unsupervised Action Segmentation
[2] Coarse-Fine Networks for Temporal Activity Detection in Videos
[3] Learning Discriminative Prototypes with Dynamic Time Warping
[4] Temporal Action Segmentation from Timestamp Supervision
[5] ACTION-Net: Multipath Excitation for Action Recognition
[6] BASAR: Black-box Attack on Skeletal Action Recognition
[7] Understanding the Robustness of Skeleton-based Action Recognition under Adversarial Attack
[8] Temporal Difference Networks for Efficient Action Recognition
[9] Behavior-Driven Synthesis of Human Dynamics
[1] DeepTag: An Unsupervised Deep Learning Method for Motion Tracking on Cardiac Tagging Magnetic Resonance Images
[2] Multi-institutional Collaborations for Improving Deep Learning-based Magnetic Resonance Image Reconstruction Using Federated Learning
[3] 3D Graph Anatomy Geometry-Integrated Network for Pancreatic Mass Segmentation, Diagnosis, and Quantitative Patient Management
[4] Deep Lesion Tracker: Monitoring Lesions in 4D Longitudinal Imaging Studies
[5] Automatic Vertebra Localization and Identification in CT by Spine Rectification and Anatomically-constrained Optimization
[6] Brain Image Synthesis with Unsupervised Multivariate Canonical CSCℓ4Net
[7] XProtoNet: Diagnosis in Chest Radiography with Global and Local Explanations
[8] FedDG: Federated Domain Generalization on Medical Image Segmentation via Episodic Learning in Continuous Frequency Space
[9] Multiple Instance Captioning: Learning Representations from Histopathology Textbooks and Articles
[10] Discovering Hidden Physics Behind Transport Dynamics
[1] AttentiveNAS: Improving Neural Architecture Search via Attentive Sampling
[2] ReNAS: Relativistic Evaluation of Neural Architecture Search (the importance of ranking loss in NAS predictors)
[3] HourNAS: Extremely Fast Neural Architecture Search Through an Hourglass Lens (reducing the cost of NAS)
[4] Prioritized Architecture Sampling with Monto-Carlo Tree Search
[5] Searching by Generating: Flexible and Efficient One-Shot NAS with Architecture Generator
[6] Contrastive Neural Architecture Search with Neural Architecture Comparators
[7] OPANAS: One-Shot Path Aggregation Network Architecture Search for Object Detection
[1] Anycost GANs for Interactive Image Synthesis and Editing
[2] Efficient Conditional GAN Transfer with Knowledge Propagation across Classes
[3] Exploiting Spatial Dimensions of Latent in GAN for Real-time Image Editing
[4] Hijack-GAN: Unintended-Use of Pretrained, Black-Box GANs
[5] Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation
[6] A 3D GAN for Improved Large-pose Facial Recognition
[7] DivCo: Diverse Conditional Image Synthesis via Contrastive Generative Adversarial Network
[8] Diverse Semantic Image Synthesis via Probability Distribution Modeling
[9] HumanGAN: A Generative Model of Humans Images
[10] MetaSimulator: Simulating Unknown Target Models for Query-Efficient Black-box Attacks
[11] Soft-IntroVAE: Analyzing and Improving Introspective Variational Autoencoders
[12] LOHO: Latent Optimization of Hairstyles via Orthogonalization
[13] PISE: Person Image Synthesis and Editing with Decoupled GAN
[14] Closed-Form Factorization of Latent Semantics in GANs
[15] PD-GAN: Probabilistic Diverse GAN for Image Inpainting
[2] A Deep Emulator for Secondary Motion of 3D Characters
[1] 3D CNNs with Adaptive Temporal Feature Resolutions
[14] Skeleton Merger: an Unsupervised Aligned Keypoint Detector
paper | code
[13] Cycle4Completion: Unpaired Point Cloud Completion using Cycle Transformation with Missing Region Coding
paper
[12] Semantic Segmentation for Real Point Cloud Scenes via Bilateral Augmentation and Adaptive Fusion
paper
[11] How Privacy-Preserving are Line Clouds? Recovering Scene Details from 3D Lines
paper | code
[10] PointDSC: Robust Point Cloud Registration using Deep Spatial Consistency
paper | code
[9] Robust Point Cloud Registration Framework Based on Deep Graph Matching
paper | code
[8] TPCN: Temporal Point Cloud Networks for Motion Forecasting
[7] PointGuard: Provably Robust 3D Point Cloud Classification
[6] Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges
[5] SpinNet: Learning a General Surface Descriptor for 3D Point Cloud Registration
[4] MultiBodySync: Multi-Body Segmentation and Motion Estimation via 3D Scan Synchronization
[3] Diffusion Probabilistic Models for 3D Point Cloud Generation
[2] Style-based Point Generator with Adversarial Rendering for Point Cloud Completion
[1] PREDATOR: Registration of 3D Point Clouds with Low Overlap
[1] PCLs: Geometry-aware Neural Reconstruction of 3D Pose with Perspective Crop Layers
[1] Manifold Regularized Dynamic Network Pruning (accounts for sample-complexity and network-complexity constraints during dynamic pruning)
[2] Learning Student Networks in the Wild (a model compression and acceleration technique that does not require the original training data)
[1] Refine Myself by Teaching Myself: Feature Refinement via Self-Knowledge Distillation
[2] Knowledge Evolution in Neural Networks
[3] Semantic-aware Knowledge Distillation for Few-Shot Class-Incremental Learning
[4] Teachers Do More Than Teach: Compressing Image-to-Image Models (https://arxiv.org/abs/2103.03467)
[5] General Instance Distillation for Object Detection
[6] Multiresolution Knowledge Distillation for Anomaly Detection
[7] Distilling Object Detectors via Decoupled Features (distillation that separates foreground from background)
[1] Coordinate Attention for Efficient Mobile Network Design
[2] Inception Convolution with Efficient Dilation Search
- Paper: https://arxiv.org/abs/2012.13587
- Code: None
[3] Rethinking Channel Dimensions for Efficient Model Design
[4] Inverting the Inherence of Convolution for Visual Recognition
[5] RepVGG: Making VGG-style ConvNets Great Again
[6] Fast and Accurate Model Scaling
[7] Involution: Inverting the Inherence of Convolution for Visual Recognition
[1] Transformer Interpretability Beyond Attention Visualization
[2] UP-DETR: Unsupervised Pre-training for Object Detection with Transformers
[3] Pre-Trained Image Processing Transformer (a pre-trained model for low-level vision)
[2] Quantifying Explainers of Graph Neural Networks in Computational Pathology
[1] Sequential Graph Convolutional Network for Active Learning
[1] KeepAugment: A Simple Information-Preserving Data Augmentation
[3] Adaptive Consistency Regularization for Semi-Supervised Transfer Learning
[2] Meta Batch-Instance Normalization for Generalizable Person Re-Identification
[1] Representative Batch Normalization with Feature Calibration
[2] Improving Unsupervised Image Clustering With Robust Learning
[1] Reconsidering Representation Alignment for Multi-view Clustering
[1] Are Labels Necessary for Classifier Accuracy Evaluation? (can a model be evaluated on a test set that has no labels?)
[2] Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges
[1] Re-labeling ImageNet: from Single to Multi-Labels, from Global to Localized Labels
[3] Vab-AL: Incorporating Class Imbalance and Difficulty with Variational Bayes for Active Learning
[2] Multiple Instance Active Learning for Object Detection
[1] Sequential Graph Convolutional Network for Active Learning
[6] Goal-Oriented Gaze Estimation for Zero-Shot Learning
[5] Few-Shot Segmentation Without Meta-Learning: A Good Transductive Inference Is All You Need?
[4] Counterfactual Zero-Shot and Open-Set Visual Recognition
[3] Semantic Relation Reasoning for Shot-Stable Few-Shot Object Detection
[2] Few-shot Open-set Recognition by Transformation Consistency
[1] Exploring Complementary Strengths of Invariant and Equivariant Representations for Few-Shot Learning
[2] Rainbow Memory: Continual Learning with a Memory of Diverse Samples
[1] Learning the Superpixel in a Non-iterative and Lifelong Manner
[1] Transformation Driven Visual Reasoning
[4] Continual Adaptation of Visual Representations via Domain Randomization and Meta-learning
[3] Domain Generalization via Inference-time Label-Preserving Target Projections
[2] MetaSCI: Scalable and Adaptive Reconstruction for Video Compressive Sensing
[1] FSDR: Frequency Space Domain Randomization for Domain Generalization
[1] Fine-grained Angular Contrastive Learning with Coarse Labels
[1] QAIR: Practical Query-efficient Black-Box Attacks for Image Retrieval
Learning Asynchronous and Sparse Human-Object Interaction in Videos
Self-supervised Geometric Perception
Quantifying Explainers of Graph Neural Networks in Computational Pathology
Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts
Data-Free Model Extraction
Patch-NetVLAD: Multi-Scale Fusion of Locally-Global Descriptors for Place Recognition
Right for the Right Concept: Revising Neuro-Symbolic Concepts by Interacting with their Explanations
Multi-Objective Interpolation Training for Robustness to Label Noise
VX2TEXT: End-to-End Learning of Video-Based Text Generation From Multimodal Inputs
Scan2Cap: Context-aware Dense Captioning in RGB-D Scans (image captioning)
Hierarchical and Partially Observable Goal-driven Policy Learning with Goals Relational Graph
ID-Unet: Iterative Soft and Hard Deformation for View Synthesis
PML: Progressive Margin Loss for Long-tailed Age Classification (long-tailed distribution, image classification)
Diversifying Sample Generation for Data-Free Quantization (image generation)
Domain Generalization via Inference-time Label-Preserving Target Projections
DeRF: Decomposed Radiance Fields
Densely connected multidilated convolutional networks for dense prediction tasks
VirTex: Learning Visual Representations from Textual Annotations
Weakly-supervised Grounded Visual Question Answering using Capsules
FLAVR: Flow-Agnostic Video Representations for Fast Frame Interpolation
Probabilistic Embeddings for Cross-Modal Retrieval
Self-supervised Simultaneous Multi-Step Prediction of Road Dynamics and Cost Map
IIRC: Incremental Implicitly-Refined Classification
Fair Attribute Classification through Latent Space De-biasing
Information-Theoretic Segmentation by Inpainting Error Maximization
UC2: Universal Cross-lingual Cross-modal Vision-and-Language Pretraining (video-language learning)
Less is More: CLIPBERT for Video-and-Language Learning via Sparse Sampling
D-NeRF: Neural Radiance Fields for Dynamic Scenes
Weakly Supervised Learning of Rigid 3D Scene Flow
[23] Self-supervised Geometric Perception
[22] DeepTag: An Unsupervised Deep Learning Method for Motion Tracking on Cardiac Tagging Magnetic Resonance Images
[21] Modeling Multi-Label Action Dependencies for Temporal Action Localization
[20] HPS: localizing and tracking people in large 3D scenes from wearable sensors
[19] Real-Time High Resolution Background Matting
[18] Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts
[17] Robust Neural Routing Through Space Partitions for Camera Relocalization in Dynamic Indoor Environments
[16] MultiBodySync: Multi-Body Segmentation and Motion Estimation via 3D Scan Synchronization
[15] Categorical Depth Distribution Network for Monocular 3D Object Detection
[14] PatchmatchNet: Learned Multi-View Patchmatch Stereo
[13] Continual Adaptation of Visual Representations via Domain Randomization and Meta-learning
[12] Single-Stage Instance Shadow Detection with Bidirectional Relation Learning
[11] Neural Geometric Level of Detail: Real-time Rendering with Implicit 3D Surfaces
[9] PREDATOR: Registration of 3D Point Clouds with Low Overlap
[8] Domain Generalization via Inference-time Label-Preserving Target Projections
[7] Neural Deformation Graphs for Globally-consistent Non-rigid Reconstruction
[6] Fine-grained Angular Contrastive Learning with Coarse Labels
[5] Less is More: CLIPBERT for Video-and-Language Learning via Sparse Sampling
[4] Cross-View Regularization for Domain Adaptive Panoptic Segmentation
[3] Image-to-image Translation via Hierarchical Style Disentanglement
[2] Towards Open World Object Detection
- paper
- code
[1] End-to-End Video Instance Segmentation with Transformers
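- CVPR 2021 | GFLV2: a solid object detection technique that improves accuracy at no extra cost
- CVPR 2021 | SJTU and UCAS propose DCL: a new method for rotated object detection
- CVPR 2021 | IC-Conv: Inception convolution with efficient dilation search, improving results across the board
- CVPR 2021 Oral | Hierarchical style disentanglement: multi-attribute face manipulation is finally controllable
- CVPR 2021 | Transformers move into low-level vision: Peking University, Huawei, and others propose the pre-trained model IPT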