Table of Contents
Publish Date | Title | Authors | Code | |
---|---|---|---|---|
2024-08-29 | SODAWideNet++: Combining Attention and Convolutions for Salient Object Detection | Rohit Venkata Sai Dulam et.al. | 2408.16645 | null |
2024-08-29 | Android Malware Detection Based on RGB Images and Multi-feature Fusion | Zhiqiang Wang et.al. | 2408.16555 | null |
2024-08-29 | SAU: A Dual-Branch Network to Enhance Long-Tailed Recognition via Generative Models | Guangxi Li et.al. | 2408.16273 | link |
2024-08-29 | Improving Diffusion-based Data Augmentation with Inversion Spherical Interpolation | Yanghao Wang et.al. | 2408.16266 | null |
2024-08-29 | Low Saturation Confidence Distribution-based Test-Time Adaptation for Cross-Domain Remote Sensing Image Classification | Yu Liang et.al. | 2408.16265 | null |
2024-08-28 | EMP: Enhance Memory in Data Pruning | Jinying Xiao et.al. | 2408.16031 | null |
2024-08-28 | Local Descriptors Weighted Adaptive Threshold Filtering For Few-Shot Learning | Bingchen Yan et.al. | 2408.15924 | null |
2024-08-28 | ModalityMirror: Improving Audio Classification in Modality Heterogeneity Federated Learning with Multimodal Distillation | Tiantian Feng et.al. | 2408.15803 | null |
2024-08-28 | Visual Prompt Engineering for Medical Vision Language Models in Radiology | Stefan Denner et.al. | 2408.15802 | null |
2024-08-28 | Harnessing the Intrinsic Knowledge of Pretrained Language Models for Challenging Text Classification Settings | Lingyu Gao et.al. | 2408.15650 | null |
2024-08-27 | DCT-CryptoNets: Scaling Private Inference in the Frequency Domain | Arjun Roy et.al. | 2408.15231 | null |
2024-08-27 | A Review of Transformer-Based Models for Computer Vision Tasks: Capturing Global Context and Spatial Relationships | Gracile Astlin Pereira et.al. | 2408.15178 | null |
2024-08-28 | AnomalousPatchCore: Exploring the Use of Anomalous Samples in Industrial Anomaly Detection | Mykhailo Koshil et.al. | 2408.15113 | null |
2024-08-27 | Data downlink prioritization using image classification on-board a 6U CubeSat | Keenan A. A. Chatar et.al. | 2408.14865 | null |
2024-08-27 | Leveraging Self-supervised Audio Representations for Data-Efficient Acoustic Scene Classification | Yiqiang Cai et.al. | 2408.14862 | null |
2024-08-27 | Text-guided Foundation Model Adaptation for Long-Tailed Medical Image Classification | Sirui Li et.al. | 2408.14770 | null |
2024-08-26 | On-Chip Learning with Memristor-Based Neural Networks: Assessing Accuracy and Efficiency Under Device Variations, Conductance Errors, and Input Noise | M. Reza Eslami et.al. | 2408.14680 | null |
2024-08-26 | Attend-Fusion: Efficient Audio-Visual Fusion for Video Classification | Mahrukh Awan et.al. | 2408.14441 | null |
2024-08-26 | Uncertainties of Latent Representations in Computer Vision | Michael Kirchhof et.al. | 2408.14281 | null |
2024-08-26 | MSFMamba: Multi-Scale Feature Fusion State Space Model for Multi-Source Remote Sensing Image Classification | Feng Gao et.al. | 2408.14255 | null |
2024-08-26 | Feature Aligning Few shot Learning Method Using Local Descriptors Weighted Rules | Bingchen Yan et.al. | 2408.14192 | null |
2024-08-26 | GenFormer -- Generated Images are All You Need to Improve Robustness of Transformers on Small Datasets | Sven Oehri et.al. | 2408.14131 | null |
2024-08-25 | Few-Shot Histopathology Image Classification: Evaluating State-of-the-Art Methods and Unveiling Performance Insights | Ardhendu Sekhar et.al. | 2408.13816 | null |
2024-08-25 | On the Robustness of Kolmogorov-Arnold Networks: An Adversarial Perspective | Tal Alter et.al. | 2408.13809 | null |
2024-08-25 | Enhancing Adaptive Deep Networks for Image Classification via Uncertainty-aware Decision Fusion | Xu Zhang et.al. | 2408.13744 | link |
2024-08-25 | 3D-RCNet: Learning from Transformer to Build a 3D Relational ConvNet for Hyperspectral Image Classification | Haizhao Jing et.al. | 2408.13728 | null |
2024-08-24 | Enhanced Astronomical Source Classification with Integration of Attention Mechanisms and Vision Transformers | Srinadh Reddy Bhavanam et.al. | 2408.13634 | null |
2024-08-23 | Domain-specific long text classification from sparse relevant information | Célia D'Cruz et.al. | 2408.13253 | null |
2024-08-23 | EAViT: External Attention Vision Transformer for Audio Classification | Aquib Iqbal et.al. | 2408.13201 | null |
2024-08-23 | A gradient system based on anisotropic monochrome image processing with orientation auto-adjustment | Harbir Antil et.al. | 2408.12847 | null |
2024-08-23 | Underwater SONAR Image Classification and Analysis using LIME-based Explainable Artificial Intelligence | Purushothaman Natarajan et.al. | 2408.12837 | null |
2024-08-23 | VALE: A Multimodal Visual and Language Explanation Framework for Image Classifiers using eXplainable AI and Language Models | Purushothaman Natarajan et.al. | 2408.12808 | null |
2024-08-23 | BackdoorLLM: A Comprehensive Benchmark for Backdoor Attacks on Large Language Models | Yige Li et.al. | 2408.12798 | null |
2024-08-23 | Semi-Supervised Variational Adversarial Active Learning via Learning to Rank and Agreement-Based Pseudo Labeling | Zongyao Lyu et.al. | 2408.12774 | null |
2024-08-23 | Symmetric masking strategy enhances the performance of Masked Image Modeling | Khanh-Binh Nguyen et.al. | 2408.12772 | null |
2024-08-22 | ssProp: Energy-Efficient Training for Convolutional Neural Networks with Scheduled Sparse Back Propagation | Lujia Zhong et.al. | 2408.12561 | link |
2024-08-22 | The Russian-focused embedders' exploration: ruMTEB benchmark and Russian embedding model design | Artem Snegirev et.al. | 2408.12503 | null |
2024-08-22 | Enhanced Infield Agriculture with Interpretable Machine Learning Approaches for Crop Classification | Sudi Murindanyi et.al. | 2408.12426 | null |
2024-08-22 | AT-SNN: Adaptive Tokens for Vision Transformer on Spiking Neural Network | Donghwa Kang et.al. | 2408.12293 | null |
2024-08-22 | Whole Slide Image Classification of Salivary Gland Tumours | John Charlton et.al. | 2408.12275 | null |
2024-08-22 | Query-Efficient Video Adversarial Attack with Stylized Logo | Duoxun Tang et.al. | 2408.12099 | null |
2024-08-21 | Approaching Deep Learning through the Spectral Dynamics of Weights | David Yunis et.al. | 2408.11804 | link |
2024-08-21 | SBDet: A Symmetry-Breaking Object Detector via Relaxed Rotation-Equivariance | Zhiqiang Wu et.al. | 2408.11760 | null |
2024-08-21 | Improving Calibration by Relating Focal Loss, Temperature Scaling, and Properness | Viacheslav Komisarenko et.al. | 2408.11598 | link |
2024-08-21 | MSCPT: Few-shot Whole Slide Image Classification with Multi-scale and Context-focused Prompt Tuning | Minghao Han et.al. | 2408.11505 | null |
2024-08-21 | Enabling Small Models for Zero-Shot Classification through Model Label Learning | Jia Zhang et.al. | 2408.11449 | null |
2024-08-21 | Automatic Dataset Construction (ADC): Sample Collection, Data Curation, and Beyond | Minghao Liu et.al. | 2408.11338 | null |
2024-08-21 | Towards Evaluating Large Language Models on Sarcasm Understanding | Yazhou Zhang et.al. | 2408.11319 | null |
2024-08-20 | Privacy-preserving Universal Adversarial Defense for Black-box Models | Qiao Li et.al. | 2408.10647 | null |
2024-08-20 | A Tutorial on Explainable Image Classification for Dementia Stages Using Convolutional Neural Network and Gradient-weighted Class Activation Mapping | Kevin Kam Fung Yuen et.al. | 2408.10572 | null |
2024-08-20 | NoMatterXAI: Generating "No Matter What" Alterfactual Examples for Explaining Black-Box Text Classification Models | Tuc Nguyen et.al. | 2408.10528 | null |
2024-08-20 | Cervical Cancer Detection Using Multi-Branch Deep Learning Model | Tatsuhiro Baba et.al. | 2408.10498 | null |
2024-08-19 | HaSPeR: An Image Repository for Hand Shadow Puppet Recognition | Syed Rifat Raiyan et.al. | 2408.10360 | link |
2024-08-19 | Leveraging Superfluous Information in Contrastive Representation Learning | Xuechu Yu et.al. | 2408.10292 | null |
2024-08-19 | SMILE: Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models | Anke Tang et.al. | 2408.10174 | link |
2024-08-19 | Towards Robust Federated Image Classification: An Empirical Study of Weight Selection Strategies in Manufacturing | Vinit Hegiste et.al. | 2408.10024 | null |
2024-08-19 | Detecting Adversarial Attacks in Semantic Segmentation via Uncertainty Estimation: A Deep Analysis | Kira Maag et.al. | 2408.10021 | null |
2024-08-19 | Active Learning for Identifying Disaster-Related Tweets: A Comparison with Keyword Filtering and Generic Fine-Tuning | David Hanny et.al. | 2408.09914 | null |
2024-08-19 | Ranking Generated Answers: On the Agreement of Retrieval Models with Humans on Consumer Health Questions | Sebastian Heineking et.al. | 2408.09831 | null |
2024-08-19 | AutoML-guided Fusion of Entity and LLM-based representations | Boshko Koloski et.al. | 2408.09794 | null |
2024-08-19 | Dataset Distillation for Histopathology Image Classification | Cong Cong et.al. | 2408.09709 | null |
2024-08-19 | A Strategy to Combine 1stGen Transformers and Open LLMs for Automatic Text Classification | Claudio M. V. de Andrade et.al. | 2408.09629 | null |
2024-08-18 | Attention Is Not What You Need: Revisiting Multi-Instance Learning for Whole Slide Image Classification | Xin Liu et.al. | 2408.09449 | null |
2024-08-17 | Narrowing the Focus: Learned Optimizers for Pretrained Models | Gus Kristiansen et.al. | 2408.09310 | null |
2024-08-16 | DPA: Dual Prototypes Alignment for Unsupervised Adaptation of Vision-Language Models | Eman Ali et.al. | 2408.08855 | null |
2024-08-16 | LEVIS: Large Exact Verifiable Input Spaces for Neural Networks | Mohamad Fares El Hajj Chehade et.al. | 2408.08824 | null |
2024-08-16 | Leveraging FourierKAN Classification Head for Pre-Trained Transformer-based Text Classification | Abdullah Al Imran et.al. | 2408.08803 | null |
2024-08-16 | Xpikeformer: Hybrid Analog-Digital Hardware Acceleration for Spiking Transformers | Zihang Song et.al. | 2408.08794 | null |
2024-08-16 | Quantum convolutional neural networks for jet images classification | Hala Elhag et.al. | 2408.08701 | null |
2024-08-16 | MM-UNet: A Mixed MLP Architecture for Improved Ophthalmic Image Segmentation | Zunjie Xiao et.al. | 2408.08600 | null |
2024-08-16 | Tell Codec What Worth Compressing: Semantically Disentangled Image Coding for Machine with LMMs | Jinming Liu et.al. | 2408.08575 | null |
2024-08-16 | Efficient Image-to-Image Diffusion Classifier for Adversarial Robustness | Hefei Mei et.al. | 2408.08502 | link |
2024-08-15 | Beyond Uniform Query Distribution: Key-Driven Grouped Query Attention | Zohaib Khan et.al. | 2408.08454 | null |
2024-08-15 | Predictive uncertainty estimation in deep learning for lung carcinoma classification in digital pathology under real dataset shifts | Abdur R. Fayjie et.al. | 2408.08432 | null |
2024-08-15 | SLCA++: Unleash the Power of Sequential Fine-tuning for Continual Learning with Pre-training | Gengwei Zhang et.al. | 2408.08295 | link |
2024-08-15 | Moving Healthcare AI-Support Systems for Visually Detectable Diseases onto Constrained Devices | Tess Watt et.al. | 2408.08215 | null |
2024-08-15 | Towards flexible perception with visual memory | Robert Geirhos et.al. | 2408.08172 | null |
2024-08-15 | Category-Prompt Refined Feature Learning for Long-Tailed Multi-Label Image Classification | Jiexuan Yan et.al. | 2408.08125 | link |
2024-08-15 | HAIR: Hypernetworks-based All-in-One Image Restoration | Jin Cao et.al. | 2408.08091 | link |
2024-08-14 | Large Language Models Prompting With Episodic Memory | Dai Do et.al. | 2408.07465 | null |
2024-08-14 | Leveraging Perceptual Scores for Dataset Pruning in Computer Vision Tasks | Raghavendra Singh et.al. | 2408.07243 | null |
2024-08-13 | Efficient Search for Customized Activation Functions with Gradient Descent | Lukas Strack et.al. | 2408.06820 | link |
2024-08-13 | Do Vision-Language Foundational models show Robust Visual Perception? | Shivam Chandhok et.al. | 2408.06781 | link |
2024-08-13 | Towards Cross-Domain Single Blood Cell Image Classification via Large-Scale LoRA-based Segment Anything Model | Yongcheng Li et.al. | 2408.06716 | link |
2024-08-13 | Coherence Awareness in Diffractive Neural Networks | Matan Kleiner et.al. | 2408.06681 | null |
2024-08-12 | Is it a work or leisure travel? Applying text classification to identify work-related travel on social networks | Lucas Félix et.al. | 2408.06341 | null |
2024-08-12 | Audio Enhancement for Computer Audition -- An Iterative Training Paradigm Using Sample Importance | Manuel Milling et.al. | 2408.06264 | null |
2024-08-12 | Deep Learning System Boundary Testing through Latent Space Style Mixing | Amr Abdellatif et.al. | 2408.06258 | null |
2024-08-12 | Global-to-Local Support Spectrums for Language Model Explainability | Lucas Agussurja et.al. | 2408.05976 | null |
2024-08-12 | A Simple Task-aware Contrastive Local Descriptor Selection Strategy for Few-shot Learning between inter class and intra class | Qian Qiao et.al. | 2408.05953 | null |
2024-08-12 | Classifier Guidance Enhances Diffusion-based Adversarial Purification by Preserving Predictive Information | Mingkun Zhang et.al. | 2408.05900 | null |
2024-08-11 | HiLight: A Hierarchy-aware Light Global Model with Hierarchical Local ConTrastive Learning | Zhijian Chen et.al. | 2408.05786 | null |
2024-08-11 | PRECISe : Prototype-Reservation for Explainable Classification under Imbalanced and Scarce-Data Settings | Vaibhav Ganatra et.al. | 2408.05754 | null |
2024-08-11 | Disposable-key-based image encryption for collaborative learning of Vision Transformer | Rei Aso et.al. | 2408.05737 | null |
2024-08-11 | A Novel Momentum-Based Deep Learning Techniques for Medical Image Classification and Segmentation | Koushik Biswas et.al. | 2408.05692 | null |
2024-08-09 | A conformalized learning of a prediction set with applications to medical imaging classification | Roy Hirsch et.al. | 2408.05037 | null |
2024-08-09 | Generalisation First, Memorisation Second? Memorisation Localisation for Natural Language Classification Tasks | Verna Dankers et.al. | 2408.04965 | null |
2024-08-09 | LiD-FL: Towards List-Decodable Federated Learning | Hong Liu et.al. | 2408.04963 | null |
2024-08-09 | In Defense of Lazy Visual Grounding for Open-Vocabulary Semantic Segmentation | Dahyun Kang et.al. | 2408.04961 | link |
2024-08-08 | Enhanced Prototypical Part Network (EPPNet) For Explainable Image Classification Via Prototypes | Bhushan Atote et.al. | 2408.04606 | null |
2024-08-08 | SCENE: Evaluating Explainable AI Techniques Using Soft Counterfactuals | Haoran Zheng et.al. | 2408.04575 | null |
2024-08-08 | An experimental comparative study of backpropagation and alternatives for training binary neural networks for image classification | Ben Crulis et.al. | 2408.04460 | null |
2024-08-08 | Dual-branch PolSAR Image Classification Based on GraphMAE and Local Feature Extraction | Yuchen Wang et.al. | 2408.04294 | null |
2024-08-07 | FMiFood: Multi-modal Contrastive Learning for Food Image Classification | Xinyue Pan et.al. | 2408.03922 | null |
2024-08-07 | Leveraging Variation Theory in Counterfactual Data Augmentation for Optimized Active Learning | Simret Araya Gebreegziabher et.al. | 2408.03819 | null |
2024-08-07 | Intuitionistic Fuzzy Cognitive Maps for Interpretable Image Classification | Georgia Sovatzidi et.al. | 2408.03745 | null |
2024-08-07 | CAS-ViT: Convolutional Additive Self-attention Vision Transformers for Efficient Mobile Applications | Tianfang Zhang et.al. | 2408.03703 | link |
2024-08-07 | Designing Extremely Memory-Efficient CNNs for On-device Vision Tasks | Jaewook Lee et.al. | 2408.03663 | null |
2024-08-07 | Making Robust Generalizers Less Rigid with Soft Ascent-Descent | Matthew J. Holland et.al. | 2408.03619 | null |
2024-08-06 | AI Foundation Models in Remote Sensing: A Survey | Siqi Lu et.al. | 2408.03464 | null |
2024-08-06 | Compress and Compare: Interactively Evaluating Efficiency and Behavior Across ML Model Compression Experiments | Angie Boggust et.al. | 2408.03274 | null |
2024-08-06 | A Debiased Nearest Neighbors Framework for Multi-Label Text Classification | Zifeng Cheng et.al. | 2408.03202 | null |
2024-08-06 | Leveraging Parameter Efficient Training Methods for Low Resource Text Classification: A Case Study in Marathi | Pranita Deshmukh et.al. | 2408.03172 | null |
2024-08-06 | Comb, Prune, Distill: Towards Unified Pruning for Vision Model Compression | Jonas Schmitt et.al. | 2408.03046 | null |
2024-08-06 | L3iTC at the FinLLM Challenge Task: Quantization for Financial Text Classification & Summarization | Elvys Linhares Pontes et.al. | 2408.03033 | null |
2024-08-06 | Adversarial Robustness of Open-source Text Classification Models and Fine-Tuning Chains | Hao Qin et.al. | 2408.02963 | null |
2024-08-06 | Dual-View Pyramid Pooling in Deep Neural Networks for Improved Medical Image Classification and Confidence Calibration | Xiaoqing Zhang et.al. | 2408.02906 | null |
2024-08-05 | Interpretation of the Intent Detection Problem as Dynamics in a Low-dimensional Space | Eduardo Sanchez-Karhunen et.al. | 2408.02838 | null |
2024-08-05 | Pre-trained Encoder Inference: Revealing Upstream Encoders In Downstream Machine Learning Services | Shaopeng Fu et.al. | 2408.02814 | null |
2024-08-05 | FPT+: A Parameter and Memory Efficient Transfer Learning Method for High-resolution Medical Image Classification | Yijin Huang et.al. | 2408.02426 | null |
2024-08-05 | On the Robustness of Malware Detectors to Adversarial Samples | Muhammad Salman et.al. | 2408.02310 | null |
2024-08-05 | Low-Cost Self-Ensembles Based on Multi-Branch Transformation and Grouped Convolution | Hojung Lee et.al. | 2408.02307 | null |
2024-08-05 | Network Fission Ensembles for Low-Cost Self-Ensembles | Hojung Lee et.al. | 2408.02301 | null |
2024-08-04 | VidModEx: Interpretable and Efficient Black Box Model Extraction for High-Dimensional Spaces | Somnath Sendhil Kumar et.al. | 2408.02140 | null |
2024-08-04 | DeMansia: Mamba Never Forgets Any Tokens | Ricky Fang et.al. | 2408.01986 | null |
2024-08-06 | A Survey and Evaluation of Adversarial Attacks for Object Detection | Khoi Nguyen Tiet Nguyen et.al. | 2408.01934 | null |
2024-08-03 | Safe Semi-Supervised Contrastive Learning Using In-Distribution Data as Positive Examples | Min Gu Kwak et.al. | 2408.01872 | null |
2024-08-03 | LAM3D: Leveraging Attention for Monocular 3D Object Detection | Diana-Alexandra Sas et.al. | 2408.01739 | null |
2024-08-02 | Counterfactual Explanations for Medical Image Classification and Regression using Diffusion Autoencoder | Matan Atad et.al. | 2408.01571 | null |
2024-08-02 | Spatial-Spectral Morphological Mamba for Hyperspectral Image Classification | Muhammad Ahmad et.al. | 2408.01372 | null |
2024-08-02 | WaveMamba: Spatial-Spectral Wavelet Mamba for Hyperspectral Image Classification | Muhammad Ahmad et.al. | 2408.01231 | null |
2024-08-02 | Multi-head Spatial-Spectral Mamba for Hyperspectral Image Classification | Muhammad Ahmad et.al. | 2408.01224 | null |
2024-08-02 | Rethinking Pre-trained Feature Extractor Selection in Multiple Instance Learning for Whole Slide Image Classification | Bryan Wong et.al. | 2408.01167 | null |
2024-08-01 | CERT-ED: Certifiably Robust Text Classification for Edit Distance | Zhuoqun Huang et.al. | 2408.00728 | null |
2024-08-01 | Deep Learning in Medical Image Classification from MRI-based Brain Tumor Images | Xiaoyi Liu et.al. | 2408.00636 | null |
2024-08-01 | DECIDER: Leveraging Foundation Model Priors for Improved Model Failure Detection and Explanation | Rakshith Subramanyam et.al. | 2408.00331 | null |
2024-07-31 | Vera Verto: Multimodal Hijacking Attack | Minxing Zhang et.al. | 2408.00129 | null |
2024-07-31 | Learning Video Context as Interleaved Multimodal Sequences | Kevin Qinghong Lin et.al. | 2407.21757 | null |
2024-07-30 | Contrasting Deep Learning Models for Direct Respiratory Insufficiency Detection Versus Blood Oxygen Saturation Estimation | Marcelo Matheus Gauy et.al. | 2407.20989 | null |
2024-07-30 | Faithful and Plausible Natural Language Explanations for Image Classification: A Pipeline Approach | Adam Wojciechowski et.al. | 2407.20899 | null |
2024-08-01 | DFE-IANet: A Method for Polyp Image Classification Based on Dual-domain Feature Extraction and Interaction Attention | Wei Wang et.al. | 2407.20843 | null |
2024-08-01 | The Susceptibility of Example-Based Explainability Methods to Class Outliers | Ikhtiyor Nematov et.al. | 2407.20678 | null |
2024-07-30 | Knowledge Fused Recognition: Fusing Hierarchical Knowledge for Image Recognition through Quantitative Relativity Modeling and Deep Metric Learning | Yunfeng Zhao et.al. | 2407.20600 | null |
2024-07-30 | Exploring Liquid Neural Networks on Loihi-2 | Wiktoria Agata Pawlak et.al. | 2407.20590 | null |
2024-07-29 | Graphite: A Graph-based Extreme Multi-Label Short Text Classifier for Keyphrase Recommendation | Ashirbad Mishra et.al. | 2407.20462 | null |
2024-07-29 | Diffusion Feedback Helps CLIP See Better | Wenxuan Wang et.al. | 2407.20171 | null |
2024-07-29 | Distilling High Diagnostic Value Patches for Whole Slide Image Classification Using Attention Mechanism | Tianhang Nan et.al. | 2407.19821 | null |
2024-07-28 | Competition-based Adaptive ReLU for Deep Neural Networks | Junjia Chen et.al. | 2407.19441 | null |
2024-07-28 | Depth-Wise Convolutions in Vision Transformers for Efficient Training on Small Datasets | Tianxiao Zhang et.al. | 2407.19394 | link |
2024-07-27 | Inference-Time Selective Debiasing | Gleb Kuzmin et.al. | 2407.19345 | null |
2024-07-27 | Stellar Blend Image Classification Using Computationally Efficient Gaussian Processes | Chinedu Eleh et.al. | 2407.19297 | null |
2024-07-27 | Towards Robust Few-shot Class Incremental Learning in Audio Classification using Contrastive Representation | Riyansha Singh et.al. | 2407.19265 | null |
2024-07-27 | A Survey of Malware Detection Using Deep Learning | Ahmed Bensaoud et.al. | 2407.19153 | null |
2024-07-26 | UniForensics: Face Forgery Detection via General Facial Representation | Ziyuan Fang et.al. | 2407.19079 | null |
2024-07-26 | A Scalable Quantum Non-local Neural Network for Image Classification | Sparsh Gupta et.al. | 2407.18906 | link |
2024-07-26 | Unifying Visual and Semantic Feature Spaces with Diffusion Models for Enhanced Cross-Modal Alignment | Yuze Zheng et.al. | 2407.18854 | null |
2024-07-26 | Local Binary Pattern(LBP) Optimization for Feature Extraction | Zeinab Sedaghatjoo et.al. | 2407.18665 | null |
2024-07-26 | Topology Optimization of Random Memristors for Input-Aware Dynamic SNN | Bo Wang et.al. | 2407.18625 | null |
2024-07-26 | Content-driven Magnitude-Derivative Spectrum Complementary Learning for Hyperspectral Image Classification | Huiyan Bai et.al. | 2407.18593 | null |
2024-07-26 | VSSD: Vision Mamba with Non-Casual State Space Duality | Yuheng Shi et.al. | 2407.18559 | link |
2024-07-25 | Self-supervised pre-training with diffusion model for few-shot landmark detection in x-ray images | Roberto Di Via et.al. | 2407.18125 | null |
2024-07-25 | Mew: Multiplexed Immunofluorescence Image Analysis through an Efficient Multiplex Network | Sukwon Yun et.al. | 2407.17857 | link |
2024-07-25 | SAM-MIL: A Spatial Contextual Aware Multiple Instance Learning Approach for Whole Slide Image Classification | Heng Fang et.al. | 2407.17689 | link |
2024-07-26 | Unsqueeze [CLS] Bottleneck to Learn Rich Representations | Qing Su et.al. | 2407.17671 | link |
2024-07-24 | Explaining the Model, Protecting Your Data: Revealing and Mitigating the Data Privacy Risks of Post-Hoc Model Explanations via Membership Inference | Catherine Huang et.al. | 2407.17663 | null |
2024-07-23 | S-E Pipeline: A Vision Transformer (ViT) based Resilient Classification Pipeline for Medical Imaging Against Adversarial Attacks | Neha A S et.al. | 2407.17587 | null |
2024-07-24 | A Novel Two-Step Fine-Tuning Pipeline for Cold-Start Active Learning in Text Classification Tasks | Fabiano Belém et.al. | 2407.17284 | null |
2024-07-24 | Graph Neural Networks: A suitable Alternative to MLPs in Latent 3D Medical Image Classification? | Johannes Kiechle et.al. | 2407.17219 | link |
2024-07-24 | Quanv4EO: Empowering Earth Observation by means of Quanvolutional Neural Networks | Alessandro Sebastianelli et.al. | 2407.17108 | null |
2024-07-24 | An Adaptive Gradient Regularization Method | Huixiu Jiang et.al. | 2407.16944 | null |
2024-07-23 | Lawma: The Power of Specialization for Legal Tasks | Ricardo Dominguez-Olmedo et.al. | 2407.16615 | null |
2024-07-23 | Deep Bayesian segmentation for colon polyps: Well-calibrated predictions in medical imaging | Daniela L. Ramos et.al. | 2407.16608 | null |
2024-07-23 | Designing robust diffractive neural networks with improved transverse shift tolerance | Daniil V. Soshnikov et.al. | 2407.16456 | null |
2024-07-23 | Image Classification using Fuzzy Pooling in Convolutional Kolmogorov-Arnold Networks | Ayan Igali et.al. | 2407.16268 | null |
2024-07-23 | HSVLT: Hierarchical Scale-Aware Vision-Language Transformer for Multi-Label Image Classification | Shuyi Ouyang et.al. | 2407.16244 | null |
2024-07-23 | Improved Few-Shot Image Classification Through Multiple-Choice Questions | Dipika Khullar et.al. | 2407.16145 | null |
2024-07-22 | Pavement Fatigue Crack Detection and Severity Classification Based on Convolutional Neural Network | Zhen Wang et.al. | 2407.16021 | null |
2024-07-22 | AIDE: Antithetical, Intent-based, and Diverse Example-Based Explanations | Ikhtiyor Nematov et.al. | 2407.16010 | null |
2024-07-22 | Comprehensive Study on Performance Evaluation and Optimization of Model Compression: Bridging Traditional Deep Learning and Large Language Models | Aayush Saxena et.al. | 2407.15904 | null |
2024-07-22 | Beyond Size and Class Balance: Alpha as a New Dataset Quality Metric for Deep Learning | Josiah Couch et.al. | 2407.15724 | null |
2024-07-22 | Retinomorphic Feature Detection and Machine Vision in a Network Laser | Wai Kit Ng et.al. | 2407.15558 | null |
2024-07-22 | Learning deep illumination-robust features from multispectral filter array images | Anis Amziane et.al. | 2407.15472 | null |
2024-07-22 | Is user feedback always informative? Retrieval Latent Defending for Semi-Supervised Domain Adaptation without Source Data | Junha Song et.al. | 2407.15383 | null |
2024-07-22 | FMDNN: A Fuzzy-guided Multi-granular Deep Neural Network for Histopathological Image Classification | Weiping Ding et.al. | 2407.15312 | null |
2024-07-21 | Assessing Sample Quality via the Latent Space of Generative Models | Jingyi Xu et.al. | 2407.15171 | null |
2024-07-21 | A multi-level multi-label text classification dataset of 19th century Ottoman and Russian literary and critical texts | Gokcen Gokceoglu et.al. | 2407.15136 | null |
2024-07-20 | Toward Efficient Convolutional Neural Networks With Structured Ternary Patterns | Christos Kyrkou et.al. | 2407.14831 | link |
2024-07-20 | Subgraph Clustering and Atom Learning for Improved Image Classification | Aryan Singh et.al. | 2407.14772 | null |
2024-07-20 | A Comprehensive Review of Few-shot Action Recognition | Yuyang Wanyan et.al. | 2407.14744 | null |
2024-07-19 | DEPICT: Diffusion-Enabled Permutation Importance for Image Classification Tasks | Sarah Jabbour et.al. | 2407.14509 | null |
2024-07-19 | Enhancing Zero-shot Audio Classification using Sound Attribute Knowledge from Large Language Models | Xuenan Xu et.al. | 2407.14355 | null |
2024-07-19 | EmoCAM: Toward Understanding What Drives CNN-based Emotion Recognition | Youssef Doulfoukar et.al. | 2407.14314 | null |
2024-07-18 | CoAPT: Context Attribute words for Prompt Tuning | Gun Lee et.al. | 2407.13808 | null |
2024-07-18 | GroupMamba: Parameter-Efficient and Accurate Group Visual State Space Model | Abdelrahman Shaker et.al. | 2407.13772 | link |
2024-07-18 | Addressing Imbalance for Class Incremental Learning in Medical Image Classification | Xuze Hao et.al. | 2407.13768 | null |
2024-07-18 | Differential Privacy Mechanisms in Neural Tangent Kernel Regression | Jiuxiang Gu et.al. | 2407.13621 | null |
2024-07-18 | CycleMix: Mixing Source Domains for Domain Generalization in Style-Dependent Data | Aristotelis Ballas et.al. | 2407.13421 | link |
2024-07-17 | LookupViT: Compressing visual information to a limited number of tokens | Rajat Koner et.al. | 2407.12753 | null |
2024-07-17 | Toward INT4 Fixed-Point Training via Exploring Quantization Error for Gradients | Dohyung Kim et.al. | 2407.12637 | null |
2024-07-17 | Domain-specific or Uncertainty-aware models: Does it really make a difference for biomedical text classification? | Aman Sinha et.al. | 2407.12626 | null |
2024-07-18 | Benchmarking Robust Self-Supervised Learning Across Diverse Downstream Tasks | Antoni Kowalczuk et.al. | 2407.12588 | link |
2024-07-17 | Non-parametric regularization for class imbalance federated medical image classification | Jeffry Wicaksana et.al. | 2407.12446 | link |
2024-07-17 | FETCH: A Memory-Efficient Replay Approach for Continual Learning in Image Classification | Markus Weißflog et.al. | 2407.12375 | null |
2024-07-17 | Adaptive Cascading Network for Continual Test-Time Adaptation | Kien X. Nguyen et.al. | 2407.12240 | null |
2024-07-16 | Generalized Coverage for More Robust Low-Budget Active Learning | Wonho Bae et.al. | 2407.12212 | null |
2024-07-18 | A Closer Look at Benchmarking Self-Supervised Pre-training with Image Classification | Markus Marks et.al. | 2407.12210 | null |
2024-07-16 | Novel Artistic Scene-Centric Datasets for Effective Transfer Learning in Fragrant Spaces | Shumei Liu et.al. | 2407.11701 | null |
2024-07-16 | Probing the Efficacy of Federated Parameter-Efficient Fine-Tuning of Vision Transformers for Medical Image Classification | Naif Alkhunaizi et.al. | 2407.11573 | null |
2024-07-16 | TCFormer: Visual Recognition via Token Clustering Transformer | Wang Zeng et.al. | 2407.11321 | link |
2024-07-16 | PADRe: A Unifying Polynomial Attention Drop-in Replacement for Efficient Vision Transformer | Pierre-David Letourneau et.al. | 2407.11306 | null |
2024-07-15 | Unconstrained Open Vocabulary Image Classification: Zero-Shot Transfer from Text to Image via CLIP Inversion | Philipp Allgeuer et.al. | 2407.11211 | null |
2024-07-16 | DataDream: Few-shot Guided Dataset Generation | Jae Myung Kim et.al. | 2407.10910 | link |
2024-07-15 | Pathology-knowledge Enhanced Multi-instance Prompt Learning for Few-shot Whole Slide Image Classification | Linhao Qu et.al. | 2407.10814 | null |
2024-07-15 | Employing Sentence Space Embedding for Classification of Data Stream from Fake News Domain | Paweł Zyblewski et.al. | 2407.10807 | null |
2024-07-15 | Anticipating Future Object Compositions without Forgetting | Youssef Zahran et.al. | 2407.10723 | null |
2024-07-15 | GeoMix: Towards Geometry-Aware Data Augmentation | Wentao Zhao et.al. | 2407.10681 | link |
2024-07-15 | Learning Natural Consistency Representation for Face Forgery Video Detection | Daichi Zhang et.al. | 2407.10550 | null |
2024-07-15 | Improving Hyperbolic Representations via Gromov-Wasserstein Regularization | Yifei Yang et.al. | 2407.10495 | null |
2024-07-15 | Backdoor Attacks against Image-to-Image Networks | Wenbo Jiang et.al. | 2407.10445 | null |
2024-07-14 | Deep Learning Algorithms for Early Diagnosis of Acute Lymphoblastic Leukemia | Dimitris Papaioannou et.al. | 2407.10251 | null |
2024-07-14 | Advancing Continual Learning for Robust Deepfake Audio Classification | Feiyi Dong et.al. | 2407.10108 | null |
2024-07-12 | Evaluating the Adversarial Robustness of Semantic Segmentation: Trying Harder Pays Off | Levente Halmosi et.al. | 2407.09150 | link |
2024-07-12 | Open Vocabulary Multi-Label Video Classification | Rohit Gupta et.al. | 2407.09073 | null |
2024-07-12 | GPC: Generative and General Pathology Image Classifier | Anh Tien Nguyen et.al. | 2407.09035 | null |
2024-07-12 | CAMP: Continuous and Adaptive Learning Model in Pathology | Anh Tien Nguyen et.al. | 2407.09030 | null |
2024-07-12 | SlideGCD: Slide-based Graph Collaborative Training with Knowledge Distillation for Whole Slide Image Classification | Tong Shu et.al. | 2407.08968 | null |
2024-07-12 | Domain-Hierarchy Adaptation via Chain of Iterative Reasoning for Few-shot Hierarchical Text Classification | Ke Ji et.al. | 2407.08959 | null |
2024-07-11 | Local Clustering for Lung Cancer Image Classification via Sparse Solution Technique | Jackson Hamel et.al. | 2407.08800 | null |
2024-07-11 | Data Adaptive Traceback for Vision-Language Foundation Models in Image Classification | Wenshuo Peng et.al. | 2407.08787 | null |
2024-07-11 | ElasticAST: An Audio Spectrogram Transformer for All Length and Resolutions | Jiu Feng et.al. | 2407.08691 | link |
2024-07-11 | Histopathological Image Classification with Cell Morphology Aware Deep Neural Networks | Andrey Ignatov et.al. | 2407.08625 | link |
2024-07-11 | BiasPruner: Debiased Continual Learning for Medical Image Classification | Nourhan Bayasi et.al. | 2407.08609 | link |
2024-07-11 | GraphMamba: An Efficient Graph Structure Learning Vision Mamba for Hyperspectral Image Classification | Aitao Yang et.al. | 2407.08255 | link |
2024-07-11 | Beyond Text: Leveraging Multi-Task Learning and Cognitive Appraisal Theory for Post-Purchase Intention Analysis | Gerard Christopher Yeo et.al. | 2407.08182 | null |
2024-07-11 | Enrich the content of the image Using Context-Aware Copy Paste | Qiushi Guo et.al. | 2407.08151 | null |
2024-07-10 | MambaVision: A Hybrid Mamba-Transformer Vision Backbone | Ali Hatamizadeh et.al. | 2407.08083 | link |
2024-07-10 | The Misclassification Likelihood Matrix: Some Classes Are More Likely To Be Misclassified Than Others | Daniel Sikar et.al. | 2407.07818 | null |
2024-07-11 | Trainable Highly-expressive Activation Functions | Irit Chelly et.al. | 2407.07564 | null |
2024-07-10 | HDKD: Hybrid Data-Efficient Knowledge Distillation Network for Medical Image Classification | Omar S. EL-Assiouti et.al. | 2407.07516 | null |
2024-07-10 | Towards a text-based quantitative and explainable histopathology image analysis | Anh Tien Nguyen et.al. | 2407.07360 | null |
2024-07-11 | FALFormer: Feature-aware Landmarks self-attention for Whole-slide Image Classification | Doanh C. Bui et.al. | 2407.07340 | link |
2024-07-10 | Dual-stage Hyperspectral Image Classification Model with Spectral Supertoken | Peifu Liu et.al. | 2407.07307 | link |
2024-07-09 | Exploring Camera Encoder Designs for Autonomous Driving Perception | Barath Lakshmanan et.al. | 2407.07276 | null |
2024-07-09 | CTRL-F: Pairing Convolution with Transformer for Image Classification via Multi-Level Feature Cross-Attention and Representation Learning Fusion | Hosam S. EL-Assiouti et.al. | 2407.06673 | null |
2024-07-09 | NoisyAG-News: A Benchmark for Addressing Instance-Dependent Noise in Text Classification | Hongfei Huang et.al. | 2407.06579 | null |
2024-07-08 | Hybrid Classical-Quantum architecture for vectorised image classification of hand-written sketches | Y. Cordero et.al. | 2407.06416 | null |
2024-07-08 | GeoWATCH for Detecting Heavy Construction in Heterogeneous Time Series of Satellite Images | Jon Crall et.al. | 2407.06337 | null |
2024-07-08 | Multi-Label Plant Species Classification with Self-Supervised Vision Transformers | Murilo Gustineli et.al. | 2407.06298 | link |
2024-07-08 | Active Label Refinement for Robust Training of Imbalanced Medical Image Classification Tasks in the Presence of High Label Noise | Bidur Khanal et.al. | 2407.05973 | null |
2024-07-08 | Wavelet Convolutions for Large Receptive Fields | Shahaf E. Finder et.al. | 2407.05848 | link |
2024-07-08 | Evaluating the Fairness of Neural Collapse in Medical Image Classification | Kaouther Mouheb et.al. | 2407.05843 | null |
2024-07-08 | Learning to Adapt Category Consistent Meta-Feature of CLIP for Few-Shot Classification | Jiaying Shi et.al. | 2407.05647 | null |
2024-07-08 | New Directions in Text Classification Research: Maximizing The Performance of Sentiment Classification from Limited Data | Surya Agustian et.al. | 2407.05627 | null |
2024-07-08 | Momentum Auxiliary Network for Supervised Local Learning | Junhao Su et.al. | 2407.05623 | link |
2024-07-08 | Open-world Multi-label Text Classification with Extremely Weak Supervision | Xintong Li et.al. | 2407.05609 | link |
2024-07-08 | FALIP: Visual Prompt as Foveal Attention Boosts CLIP Zero-Shot Performance | Jiedong Zhuang et.al. | 2407.05578 | null |
2024-07-08 | An accurate detection is not all you need to combat label noise in web-noisy datasets | Paul Albert et.al. | 2407.05528 | null |
2024-07-07 | Leveraging Topological Guidance for Improved Knowledge Distillation | Eun Som Jeon et.al. | 2407.05316 | link |
2024-07-05 | AWT: Transferring Vision-Language Models via Augmentation, Weighting, and Transportation | Yuhan Zhu et.al. | 2407.04603 | null |
2024-07-05 | AMD: Automatic Multi-step Distillation of Large-scale Vision Models | Cheng Han et.al. | 2407.04208 | null |
2024-07-04 | LeDNet: Localization-enabled Deep Neural Network for Multi-Label Radiography Image Classification | Lalit Pant et.al. | 2407.03931 | null |
2024-07-04 | DocXplain: A Novel Model-Agnostic Explainability Method for Document Image Classification | Saifullah Saifullah et.al. | 2407.03830 | null |
2024-07-04 | reBEN: Refined BigEarthNet Dataset for Remote Sensing Image Analysis | Kai Norman Clasen et.al. | 2407.03653 | link |
2024-07-04 | Resampled Datasets Are Not Enough: Mitigating Societal Bias Beyond Single Attributes | Yusuke Hirota et.al. | 2407.03623 | null |
2024-07-04 | Self Adaptive Threshold Pseudo-labeling and Unreliable Sample Contrastive Loss for Semi-supervised Image Classification | Xuerong Zhang et.al. | 2407.03596 | null |
2024-07-04 | DGR-MIL: Exploring Diverse Global Representation in Multiple Instance Learning for Whole Slide Image Classification | Wenhui Zhu et.al. | 2407.03575 | link |
2024-07-03 | A multicategory jet image classification framework using deep neural network | Jairo Orozco Sandoval et.al. | 2407.03524 | null |
2024-07-03 | Model Guidance via Explanations Turns Image Classifiers into Segmentation Models | Xiaoyan Yu et.al. | 2407.03009 | null |
2024-07-03 | ShiftAddAug: Augment Multiplication-Free Tiny Neural Network with Hybrid Computation | Yipin Guo et.al. | 2407.02881 | null |
2024-07-03 | Fine-Grained Scene Image Classification with Modality-Agnostic Adapter | Yiqun Wang et.al. | 2407.02769 | link |
2024-07-03 | ADFQ-ViT: Activation-Distribution-Friendly Post-Training Quantization for Vision Transformers | Yanfeng Jiang et.al. | 2407.02763 | null |
2024-07-02 | Spectral Graph Reasoning Network for Hyperspectral Image Classification | Huiling Wang et.al. | 2407.02647 | null |
2024-07-01 | CGRclust: Chaos Game Representation for Twin Contrastive Clustering of Unlabelled DNA Sequences | Fatemeh Alipour et.al. | 2407.02538 | link |
2024-07-02 | Exploring the Role of Transliteration in In-Context Learning for Low-resource Languages Written in Non-Latin Scripts | Chunlan Ma et.al. | 2407.02320 | null |
2024-07-03 | Federated Distillation for Medical Image Classification: Towards Trustworthy Computer-Aided Diagnosis | Sufen Ren et.al. | 2407.02261 | null |
2024-07-02 | Hybrid Feature Collaborative Reconstruction Network for Few-Shot Fine-Grained Image Classification | Shulei Qiu et.al. | 2407.02123 | null |
2024-07-01 | Optimized Learning for X-Ray Image Classification for Multi-Class Disease Diagnoses with Accelerated Computing Strategies | Sebastian A. Cruz Romero et.al. | 2407.01705 | null |
2024-07-02 | xLSTM-UNet can be an Effective 2D & 3D Medical Image Segmentation Backbone with Vision-LSTM (ViL) better than its Mamba Counterpart | Tianrun Chen et.al. | 2407.01530 | link |
2024-07-01 | Scarecrow monitoring system:employing mobilenet ssd for enhanced animal supervision | Balaji VS et.al. | 2407.01435 | null |
2024-07-01 | Semantic Compositions Enhance Vision-Language Contrastive Learning | Maxwell Aladago et.al. | 2407.01408 | null |
2024-07-01 | GalLoP: Learning Global and Local Prompts for Vision-Language Models | Marc Lafon et.al. | 2407.01400 | null |
2024-07-01 | Protecting Privacy in Classifiers by Token Manipulation | Re'em Harel et.al. | 2407.01334 | null |
2024-07-01 | Gradient-based Class Weighting for Unsupervised Domain Adaptation in Dense Prediction Visual Tasks | Roberto Alcover-Couso et.al. | 2407.01327 | null |
2024-06-28 | Extract More from Less: Efficient Fine-Grained Visual Recognition in Low-Data Regimes | Dmitry Demidov et.al. | 2406.19814 | link |
2024-06-27 | Fibottention: Inceptive Visual Representation Learning with Diverse Attention Across Heads | Ali Khaleghi Rahimian et.al. | 2406.19391 | link |
2024-06-27 | Learning Visual Conditioning Tokens to Correct Domain Shift for Fully Test-time Adaptation | Yushun Tang et.al. | 2406.19341 | null |
2024-06-27 | Spiking Convolutional Neural Networks for Text Classification | Changze Lv et.al. | 2406.19230 | link |
2024-06-27 | Adaptive Stochastic Weight Averaging | Caglar Demir et.al. | 2406.19092 | link |
2024-06-27 | FedMLP: Federated Multi-Label Medical Image Classification under Task Heterogeneity | Zhaobin Sun et.al. | 2406.18995 | link |
2024-06-26 | Detecting Machine-Generated Texts: Not Just "AI vs Humans" and Explainability is Complicated | Jiazhou Ji et.al. | 2406.18259 | null |
2024-06-26 | ViT-1.58b: Mobile Vision Transformers in the 1-bit Era | Zhengqing Yuan et.al. | 2406.18051 | null |
2024-06-25 | Benchmarking Deep Learning Models on NVIDIA Jetson Nano for Real-Time Systems: An Empirical Investigation | Tushar Prasanna Swaminathan et.al. | 2406.17749 | link |
2024-06-25 | Structured Unrestricted-Rank Matrices for Parameter Efficient Fine-tuning | Arijit Sehanobish et.al. | 2406.17740 | null |
2024-06-25 | BayTTA: Uncertainty-aware medical image classification with optimized test-time augmentation using Bayesian model averaging | Zeinab Sherkatghanad et.al. | 2406.17640 | link |
2024-06-26 | Mitigate the Gap: Investigating Approaches for Improving Cross-Modal Alignment in CLIP | Sedigheh Eslami et.al. | 2406.17639 | null |
2024-06-25 | Knowledge Distillation in Automated Annotation: Supervised Text Classification with LLM-Generated Training Labels | Nicholas Pangakis et.al. | 2406.17633 | null |
2024-06-25 | Retrieval-style In-Context Learning for Few-shot Hierarchical Text Classification | Huiyao Chen et.al. | 2406.17534 | link |
2024-06-25 | TSynD: Targeted Synthetic Data Generation for Enhanced Medical Image Classification | Joshua Niemeijer et.al. | 2406.17473 | null |
2024-06-25 | Dynamic Scheduling for Vehicle-to-Vehicle Communications Enhanced Federated Learning | Jintao Yan et.al. | 2406.17470 | null |
2024-06-25 | Implicit-Zoo: A Large-Scale Dataset of Neural Implicit Functions for 2D Images and 3D Scenes | Qi Ma et.al. | 2406.17438 | null |
2024-06-25 | Robustly Optimized Deep Feature Decoupling Network for Fatty Liver Diseases Detection | Peng Huang et.al. | 2406.17338 | null |
2024-06-24 | Evaluation of Language Models in the Medical Context Under Resource-Constrained Settings | Andrea Posada et.al. | 2406.16611 | link |
2024-06-24 | Improving robustness to corruptions with multiplicative weight perturbations | Trung Trinh et.al. | 2406.16540 | null |
2024-06-24 | UNICAD: A Unified Approach for Attack Detection, Noise Reduction and Novel Class Identification | Alvaro Lopez Pellicer et.al. | 2406.16501 | null |
2024-06-24 | Improving Quaternion Neural Networks with Quaternionic Activation Functions | Johannes Pöppelbaum et.al. | 2406.16481 | null |
2024-06-24 | Learning in Wilson-Cowan model for metapopulation | Raffaele Marino et.al. | 2406.16453 | link |
2024-06-24 | Context-augmented Retrieval: A Novel Framework for Fast Information Retrieval based Response Generation using Large Language Model | Sai Ganesh et.al. | 2406.16383 | null |
2024-06-24 | Combining Supervised Learning and Reinforcement Learning for Multi-Label Classification Tasks with Partial Labels | Zixia Jia et.al. | 2406.16293 | null |
2024-06-23 | Jacobian Descent for Multi-Objective Optimization | Pierre Quinton et.al. | 2406.16232 | null |
2024-06-23 | Learning with Noisy Ground Truth: From 2D Classification to 3D Reconstruction | Yangdi Lu et.al. | 2406.15982 | null |
2024-06-22 | PUDD: Towards Robust Multi-modal Prototype-based Deepfake Detection | Alvaro Lopez Pellcier et.al. | 2406.15921 | null |
2024-06-21 | Retrieval Augmented Zero-Shot Text Classification | Tassallah Abdullahi et.al. | 2406.15241 | null |
2024-06-21 | DiffExplainer: Unveiling Black Box Models Via Counterfactual Generation | Yingying Fang et.al. | 2406.15182 | null |
2024-06-21 | This actually looks like that: Proto-BagNets for local and global interpretability-by-design | Kerol Djoumessi et.al. | 2406.15168 | link |
2024-06-21 | Hierarchical thematic classification of major conference proceedings | Arsentii Kuzmin et.al. | 2406.14983 | null |
2024-06-21 | Demonstrating the Efficacy of Kolmogorov-Arnold Networks in Vision Tasks | Minjong Cheon et.al. | 2406.14916 | link |
2024-06-21 | MU-Bench: A Multitask Multimodal Benchmark for Machine Unlearning | Jiali Cheng et.al. | 2406.14796 | null |
2024-06-20 | Depth |
Parker Seegmiller et.al. | 2406.14695 | null |
2024-06-20 | Automatic Labels are as Effective as Manual Labels in Biomedical Images Classification with Deep Learning | Niccolò Marini et.al. | 2406.14351 | null |
2024-06-20 | Self-supervised Interpretable Concept-based Models for Text Classification | Francesco De Santis et.al. | 2406.14335 | null |
2024-06-20 | Adaptive Adversarial Cross-Entropy Loss for Sharpness-Aware Minimization | Tanapat Ratchatorn et.al. | 2406.14329 | null |
2024-06-20 | Boosting Hyperspectral Image Classification with Gate-Shift-Fuse Mechanisms in a Novel CNN-Transformer Approach | Mohamed Fadhlallah Guerri et.al. | 2406.14120 | null |
2024-06-20 | Seg-LSTM: Performance of xLSTM for Semantic Segmentation of Remotely Sensed Images | Qinfeng Zhu et.al. | 2406.14086 | link |
2024-06-21 | CMTNet: Convolutional Meets Transformer Network for Hyperspectral Images Classification | Faxu Guo et.al. | 2406.14080 | null |
2024-06-20 | Communication-Efficient Adaptive Batch Size Strategies for Distributed Local Gradient Methods | Tim Tsz-Kit Lau et.al. | 2406.13936 | null |
2024-06-19 | WATT: Weight Average Test-Time Adaption of CLIP | David Osowiechi et.al. | 2406.13875 | link |
2024-06-19 | CNN Based Flank Predictor for Quadruped Animal Species | Vanessa Suessle et.al. | 2406.13588 | null |
2024-06-19 | Online Domain-Incremental Learning Approach to Classify Acoustic Scenes in All Locations | Manjunath Mulimani et.al. | 2406.13386 | null |
2024-06-18 | LayerMerge: Neural Network Depth Compression through Layer Pruning and Merging | Jinuk Kim et.al. | 2406.12837 | link |
2024-06-18 | Privacy Preserving Federated Learning in Medical Imaging with Uncertainty Estimation | Nikolas Koutsoubis et.al. | 2406.12815 | link |
2024-06-18 | Online Anchor-based Training for Image Classification Tasks | Maria Tzelepi et.al. | 2406.12662 | null |
2024-06-18 | Fighting Randomness with Randomness: Mitigating Optimisation Instability of Fine-Tuning using Delayed Ensemble and Noisy Interpolation | Branislav Pecher et.al. | 2406.12471 | null |
2024-06-18 | GW-MoE: Resolving Uncertainty in MoE Router with Global Workspace Theory | Haoze Wu et.al. | 2406.12375 | null |
2024-06-18 | What Did I Do Wrong? Quantifying LLMs' Sensitivity and Consistency to Prompt Engineering | Federico Errica et.al. | 2406.12334 | null |
2024-06-18 | Unleashing the Potential of Open-set Noisy Samples Against Label Noise for Medical Image Classification | Zehui Liao et.al. | 2406.12293 | null |
2024-06-18 | Advancing Cross-Domain Generalizability in Face Anti-Spoofing: Insights, Design, and Metrics | Hyojin Kim et.al. | 2406.12258 | null |
2024-06-19 | MiSuRe is all you need to explain your image segmentation | Syed Nouman Hasany et.al. | 2406.12173 | null |
2024-06-17 | Enhancing Text Classification through LLM-Driven Active Learning and Human Annotation | Hamidreza Rouzegar et.al. | 2406.12114 | link |
2024-06-17 | Scaling the Codebook Size of VQGAN to 100,000 with a Utilization Rate of 99% | Lei Zhu et.al. | 2406.11837 | link |
2024-06-17 | PrAViC: Probabilistic Adaptation Framework for Real-Time Video Classification | Magdalena Trędowicz et.al. | 2406.11443 | null |
2024-06-17 | Cross-domain Open-world Discovery | Shuo Wen et.al. | 2406.11422 | link |
2024-06-17 | BaFTA: Backprop-Free Test-Time Adaptation For Zero-Shot Vision-Language Models | Xuefeng Hu et.al. | 2406.11309 | null |
2024-06-17 | An Empirical Investigation of Matrix Factorization Methods for Pre-trained Transformers | Ashim Gupta et.al. | 2406.11307 | null |
2024-06-17 | Text Grafting: Near-Distribution Weak Supervision for Minority Classes in Text Classification | Letian Peng et.al. | 2406.11115 | null |
2024-06-16 | Fine-grained Classes and How to Find Them | Matej Grcić et.al. | 2406.11070 | link |
2024-06-16 | Leveraging Foundation Models for Multi-modal Federated Learning with Incomplete Modality | Liwei Che et.al. | 2406.11048 | null |
2024-06-16 | Curating Stopwords in Marathi: A TF-IDF Approach for Improved Text Analysis and Information Retrieval | Rohan Chavan et.al. | 2406.11029 | link |
2024-06-16 | Universal Cross-Lingual Text Classification | Riya Savant et.al. | 2406.11028 | null |
2024-06-14 | UniAudio 1.5: Large Language Model-driven Audio Codec is A Few-shot Audio Task Learner | Dongchao Yang et.al. | 2406.10056 | null |
2024-06-14 | Comparison of fine-tuning strategies for transfer learning in medical image classification | Ana Davila et.al. | 2406.10050 | null |
2024-06-14 | Forgetting Order of Continual Learning: Examples That are Learned First are Forgotten Last | Guy Hacohen et.al. | 2406.09935 | null |
2024-06-13 | MirrorCheck: Efficient Adversarial Defense for Vision-Language Models | Samar Fares et.al. | 2406.09250 | null |
2024-06-13 | Self-Training for Sample-Efficient Active Learning for Text Classification with Pre-Trained Language Models | Christopher Schröder et.al. | 2406.09206 | null |
2024-06-13 | Large-Scale Evaluation of Open-Set Image Classification Techniques | Halil Bisgin et.al. | 2406.09112 | link |
2024-06-13 | LaCoOT: Layer Collapse through Optimal Transport | Victor Quétu et.al. | 2406.08933 | null |
2024-06-13 | The Penalized Inverse Probability Measure for Conformal Classification | Paul Melki et.al. | 2406.08884 | null |
2024-06-13 | Conceptual Learning via Embedding Approximations for Reinforcing Interpretability and Transparency | Maor Dikter et.al. | 2406.08840 | link |
2024-06-13 | DenoiseReID: Denoising Model for Representation Learning of Person Re-Identification | Zhengrui Xu et.al. | 2406.08773 | null |
2024-06-12 | Fine-Tuned 'Small' LLMs (Still) Significantly Outperform Zero-Shot Generative AI Models in Text Classification | Martin Juan José Bucher et.al. | 2406.08660 | null |
2024-06-12 | Intelligent Multi-View Test Time Augmentation | Efe Ozturk et.al. | 2406.08593 | null |
2024-06-12 | Transformation-Dependent Adversarial Attacks | Yaoteng Tan et.al. | 2406.08443 | null |
2024-06-12 | AdaNCA: Neural Cellular Automata As Adaptors For More Robust Vision Transformer | Yitao Xu et.al. | 2406.08298 | null |
2024-06-12 | DistilDoc: Knowledge Distillation for Visually-Rich Document Applications | Jordy Van Landeghem et.al. | 2406.08226 | null |
2024-06-12 | Fully Few-shot Class-incremental Audio Classification Using Expandable Dual-embedding Extractor | Yongjie Si et.al. | 2406.08122 | null |
2024-06-12 | Low-Complexity Acoustic Scene Classification Using Parallel Attention-Convolution Network | Yanxiong Li et.al. | 2406.08119 | null |
2024-06-12 | A |
Lixian Zhang et.al. | 2406.08079 | null |
2024-06-12 | Adversarial Evasion Attack Efficiency against Large Language Models | João Vitorino et.al. | 2406.08050 | null |
2024-06-12 | Accurate Explanation Model for Image Classifiers using Class Association Embedding | Ruitao Xie et.al. | 2406.07961 | link |
2024-06-12 | Multi-Teacher Multi-Objective Meta-Learning for Zero-Shot Hyperspectral Band Selection | Jie Feng et.al. | 2406.07949 | null |
2024-06-12 | Small Scale Data-Free Knowledge Distillation | He Liu et.al. | 2406.07876 | link |
2024-06-11 | fKAN: Fractional Kolmogorov-Arnold Networks with trainable Jacobi basis functions | Alireza Afzal Aghaei et.al. | 2406.07456 | link |
2024-06-11 | Minimizing Energy Costs in Deep Learning Model Training: The Gaussian Sampling Approach | Challapalli Phanindra Revanth et.al. | 2406.07332 | null |
2024-06-11 | Noise-Robust Voice Conversion by Conditional Denoising Training Using Latent Variables of Recording Quality and Environment | Takuto Igarashi et.al. | 2406.07280 | null |
2024-06-11 | EEG-ImageNet: An Electroencephalogram Dataset and Benchmarks with Image Visual Stimuli of Multi-Granularity Labels | Shuqi Zhu et.al. | 2406.07151 | link |
2024-06-11 | RS-Agent: Automating Remote Sensing Tasks through Intelligent Agents | Wenjia Xu et.al. | 2406.07089 | null |
2024-06-11 | DualMamba: A Lightweight Spectral-Spatial Mamba-Convolution Network for Hyperspectral Image Classification | Jiamu Sheng et.al. | 2406.07050 | null |
2024-06-11 | Fairness-Aware Meta-Learning via Nash Bargaining | Yi Zeng et.al. | 2406.07029 | null |
2024-06-11 | Mitigating Boundary Ambiguity and Inherent Bias for Text Classification in the Era of Large Language Models | Zhenyi Lu et.al. | 2406.07001 | link |
2024-06-11 | Scaling up masked audio encoder learning for general audio classification | Heinrich Dinkel et.al. | 2406.06992 | null |
2024-06-10 | Multi-Objective Neural Architecture Search for In-Memory Computing | Md Hasibul Amin et.al. | 2406.06746 | null |
2024-06-10 | Robust Latent Representation Tuning for Image-text Classification | Hao Sun et.al. | 2406.06048 | null |
2024-06-09 | Contrastive Learning from Synthetic Audio Doppelgangers | Manuel Cherep et.al. | 2406.05923 | null |
2024-06-09 | Scaling Graph Convolutions for Mobile Vision | William Avery et.al. | 2406.05850 | link |
2024-06-09 | Evolution-aware VAriance (EVA) Coreset Selection for Medical Image Classification | Yuxin Hong et.al. | 2406.05677 | null |
2024-06-09 | Which Backbone to Use: A Resource-efficient Domain Specific Comparison for Computer Vision | Pranav Jeevan et.al. | 2406.05612 | link |
2024-06-08 | Aligning Human Knowledge with Visual Concepts Towards Explainable Medical Image Classification | Yunhe Gao et.al. | 2406.05596 | null |
2024-06-07 | The Unmet Promise of Synthetic Training Images: Using Retrieved Real Images Performs Better | Scott Geng et.al. | 2406.05184 | link |
2024-06-07 | A Novel Time Series-to-Image Encoding Approach for Weather Phenomena Classification | Christian Giannetti et.al. | 2406.05096 | null |
2024-06-07 | Classification Metrics for Image Explanations: Towards Building Reliable XAI-Evaluations | Benjamin Fresz et.al. | 2406.05068 | link |
2024-06-07 | REP: Resource-Efficient Prompting for On-device Continual Learning | Sungho Jeon et.al. | 2406.04772 | null |
2024-06-07 | AICoderEval: Improving AI Domain Code Generation of Large Language Models | Yinghui Xia et.al. | 2406.04712 | null |
2024-06-07 | Cooperative Meta-Learning with Gradient Augmentation | Jongyun Shin et.al. | 2406.04639 | link |
2024-06-06 | OCCAM: Towards Cost-Efficient and Accuracy-Aware Image Classification Inference | Dujian Ding et.al. | 2406.04508 | null |
2024-06-06 | Can Language Models Use Forecasting Strategies? | Sarah Pratt et.al. | 2406.04446 | null |
2024-06-06 | Parameter-Inverted Image Pyramid Networks | Xizhou Zhu et.al. | 2406.04330 | link |
2024-06-07 | BEADs: Bias Evaluation Across Domains | Shaina Raza et.al. | 2406.04220 | null |
2024-06-06 | What Do Language Models Learn in Context? The Structured Task Hypothesis | Jiaoda Li et.al. | 2406.04216 | null |
2024-06-06 | Pointer-Guided Pre-Training: Infusing Large Language Models with Paragraph-Level Contextual Awareness | Lars Hillebrand et.al. | 2406.04156 | link |
2024-06-07 | ReDistill: Residual Encoded Distillation for Peak Memory Reduction | Fang Chen et.al. | 2406.03744 | null |
2024-06-06 | LLMEmbed: Rethinking Lightweight LLM's Genuine Function in Text Classification | Chun Liu et.al. | 2406.03725 | link |
2024-06-05 | Convolutional Neural Networks and Vision Transformers for Fashion MNIST Classification: A Literature Review | Sonia Bbouzidi et.al. | 2406.03478 | null |
2024-06-05 | IrokoBench: A New Benchmark for African Languages in the Age of Large Language Models | David Ifeoluwa Adelani et.al. | 2406.03368 | null |
2024-06-05 | Audio Mamba: Bidirectional State Space Model for Audio Representation Learning | Mehmet Hamza Erol et.al. | 2406.03344 | link |
2024-06-05 | FusionBench: A Comprehensive Benchmark of Deep Model Fusion | Anke Tang et.al. | 2406.03280 | null |
2024-06-05 | VWise: A novel benchmark for evaluating scene classification for vehicular applications | Pedro Azevedo et.al. | 2406.03273 | null |
2024-06-05 | Tiny models from tiny data: Textual and null-text inversion for few-shot distillation | Erik Landolsi et.al. | 2406.03146 | link |
2024-06-05 | Exploiting LMM-based knowledge for image classification tasks | Maria Tzelepi et.al. | 2406.03071 | null |
2024-06-04 | Randomized Geometric Algebra Methods for Convex Neural Networks | Yifei Wang et.al. | 2406.02806 | null |
2024-06-04 | DL-KDD: Dual-Light Knowledge Distillation for Action Recognition in the Dark | Chi-Jui Chang et.al. | 2406.02468 | null |
2024-06-04 | GrootVL: Tree Topology is All You Need in State Space Model | Yicheng Xiao et.al. | 2406.02395 | link |
2024-06-04 | Hybrid Quantum-Classical Neural Network for LAB Color Space Image Classification | Kwokho Ng et.al. | 2406.02229 | null |
2024-06-03 | Few-Shot Classification of Interactive Activities of Daily Living (InteractADL) | Zane Durante et.al. | 2406.01662 | link |
2024-06-03 | CoLa-DCE -- Concept-guided Latent Diffusion Counterfactual Explanations | Franz Motzkus et.al. | 2406.01649 | null |
2024-06-03 | Asynchronous Multi-Server Federated Learning for Geo-Distributed Clients | Yuncong Zuo et.al. | 2406.01439 | null |
2024-06-03 | Compute-Efficient Medical Image Classification with Softmax-Free Transformers and Sequence Normalization | Firas Khader et.al. | 2406.01314 | null |
2024-06-03 | Continuous Geometry-Aware Graph Diffusion via Hyperbolic Neural PDE | Jiaxu Liu et.al. | 2406.01282 | null |
2024-06-04 | MultiMax: Sparse and Multi-Modal Attention Learning | Yuxuan Zhou et.al. | 2406.01189 | link |
2024-06-03 | Synergizing Unsupervised and Supervised Learning: A Hybrid Approach for Accurate Natural Language Task Modeling | Wrick Talukdar et.al. | 2406.01096 | null |
2024-05-31 | You Only Scan Once: Efficient Multi-dimension Sequential Modeling with LightNet | Zhen Qin et.al. | 2405.21022 | null |
2024-05-31 | Investigating Calibration and Corruption Robustness of Post-hoc Pruned Perception CNNs: An Image Classification Benchmark Study | Pallavi Mitra et.al. | 2405.20876 | null |
2024-05-31 | Improving Generalization and Convergence by Enhancing Implicit Regularization | Mingze Wang et.al. | 2405.20763 | null |
2024-05-31 | Robust Stable Spiking Neural Networks | Jianhao Ding et.al. | 2405.20694 | null |
2024-05-31 | Enhancing Counterfactual Image Generation Using Mahalanobis Distance with Distribution Preferences in Feature Space | Yukai Zhang et.al. | 2405.20685 | null |
2024-05-31 | GenMix: Combining Generative and Mixture Data Augmentation for Medical Image Classification | Hansang Lee et.al. | 2405.20650 | null |
2024-05-31 | ToxVidLLM: A Multimodal LLM-based Framework for Toxicity Detection in Code-Mixed Videos | Krishanu Maity et.al. | 2405.20628 | null |
2024-05-30 | Mitigating the Impact of Labeling Errors on Training via Rockafellian Relaxation | Louis L. Chen et.al. | 2405.20531 | null |
2024-05-30 | DeMamba: AI-Generated Video Detection on Million-Scale GenVideo Benchmark | Haoxing Chen et.al. | 2405.19707 | link |
2024-05-30 | A Novel Approach for Automated Design Information Mining from Issue Logs | Jiuang Zhao et.al. | 2405.19623 | null |
2024-05-29 | I Bet You Did Not Mean That: Testing Semantic Importance via Betting | Jacopo Teneggi et.al. | 2405.19146 | link |
2024-05-29 | Verifiably Robust Conformal Prediction | Linus Jeary et.al. | 2405.18942 | null |
2024-05-29 | Leveraging Many-To-Many Relationships for Defending Against Visual-Language Adversarial Attacks | Futa Waseda et.al. | 2405.18770 | null |
2024-05-29 | GIST: Greedy Independent Set Thresholding for Diverse Data Summarization | Matthew Fahrbach et.al. | 2405.18754 | null |
2024-05-29 | LLM-based Hierarchical Concept Decomposition for Interpretable Fine-Grained Image Classification | Renyi Qu et.al. | 2405.18672 | null |
2024-05-28 | Its Not a Modality Gap: Characterizing and Addressing the Contrastive Gap | Abrar Fahim et.al. | 2405.18570 | null |
2024-05-28 | Why are Visually-Grounded Language Models Bad at Image Classification? | Yuhui Zhang et.al. | 2405.18415 | link |
2024-05-28 | MSPE: Multi-Scale Patch Embedding Prompts Vision Transformers to Any Resolution | Wenzhuo Liu et.al. | 2405.18240 | null |
2024-05-28 | Confidence-aware multi-modality learning for eye disease screening | Ke Zou et.al. | 2405.18167 | link |
2024-05-28 | 4-bit Shampoo for Memory-Efficient Network Training | Sike Wang et.al. | 2405.18144 | null |
2024-05-28 | DMT-JEPA: Discriminative Masked Targets for Joint-Embedding Predictive Architecture | Shentong Mo et.al. | 2405.17995 | null |
2024-05-27 | WASH: Train your Ensemble with Communication-Efficient Weight Shuffling, then Average | Louis Fournier et.al. | 2405.17517 | null |
2024-05-27 | Model-Agnostic Zeroth-Order Policy Optimization for Meta-Learning of Ergodic Linear Quadratic Regulators | Yunian Pan et.al. | 2405.17370 | null |
2024-05-27 | On the Noise Robustness of In-Context Learning for Text Generation | Hongfu Gao et.al. | 2405.17264 | null |
2024-05-27 | Superpixelwise Low-rank Approximation based Partial Label Learning for Hyperspectral Image Classification | Shujun Yang et.al. | 2405.17110 | link |
2024-05-26 | Demystify Mamba in Vision: A Linear Attention Perspective | Dongchen Han et.al. | 2405.16605 | null |
2024-05-26 | AdaFisher: Adaptive Second Order Optimization via Fisher Information | Damien Martins Gomes et.al. | 2405.16397 | null |
2024-05-25 | ModelLock: Locking Your Model With a Spell | Yifeng Gao et.al. | 2405.16285 | null |
2024-05-25 | Accelerating Transformers with Spectrum-Preserving Token Merging | Hoai-Chau Tran et.al. | 2405.16148 | null |
2024-05-25 | Breaking the False Sense of Security in Backdoor Defense through Re-Activation Attack | Mingli Zhu et.al. | 2405.16134 | null |
2024-05-24 | Grounding Stylistic Domain Generalization with Quantitative Domain Shift Measures and Synthetic Scene Images | Yiran Luo et.al. | 2405.15961 | null |
2024-05-24 | A Neurosymbolic Framework for Bias Correction in CNNs | Parth Padalkar et.al. | 2405.15886 | null |
2024-05-24 | What Do You See? Enhancing Zero-Shot Image Classification with Multimodal Large Language Models | Abdelrahman Abdelhamed et.al. | 2405.15668 | null |
2024-05-24 | Class Machine Unlearning for Complex Data via Concepts Inference and Data Poisoning | Wenhan Chang et.al. | 2405.15662 | null |
2024-05-24 | Exposing Image Classifier Shortcuts with Counterfactual Frequency (CoF) Tables | James Hinns et.al. | 2405.15661 | null |
2024-05-24 | Harnessing Increased Client Participation with Cohort-Parallel Federated Learning | Akash Dhasade et.al. | 2405.15644 | null |
2024-05-24 | Transformer-based Federated Learning for Multi-Label Remote Sensing Image Classification | Barış Büyüktaş et.al. | 2405.15405 | null |
2024-05-24 | CLIP model is an Efficient Online Lifelong Learner | Leyuan Wang et.al. | 2405.15155 | null |
2024-05-24 | OptLLM: Optimal Assignment of Queries to Large Language Models | Yueyue Liu et.al. | 2405.15130 | null |
2024-05-23 | A Lost Opportunity for Vision-Language Models: A Comparative Study of Online Test-time Adaptation for Vision-Language Models | Mario Döbler et.al. | 2405.14977 | link |
2024-05-23 | Domain Wall Magnetic Tunnel Junction Reliable Integrate and Fire Neuron | Can Cui1 et.al. | 2405.14851 | null |
2024-05-23 | Explaining Black-box Model Predictions via Two-level Nested Feature Attributions with Consistency Property | Yuya Yoshikawa et.al. | 2405.14522 | null |
2024-05-23 | SIAVC: Semi-Supervised Framework for Industrial Accident Video Classification | Zuoyong Li et.al. | 2405.14506 | null |
2024-05-23 | Scalable Visual State Space Model with Fractal Scanning | Lv Tang et.al. | 2405.14480 | null |
2024-05-23 | Segformer++: Efficient Token-Merging Strategies for High-Resolution Semantic Segmentation | Daniel Kienzle et.al. | 2405.14467 | null |
2024-05-23 | Boosting Robustness by Clipping Gradients in Distributed Learning | Youssef Allouah et.al. | 2405.14432 | null |
2024-05-23 | Advancing Spiking Neural Networks for Sequential Modeling with Central Pattern Generators | Changze Lv et.al. | 2405.14362 | null |
2024-05-23 | Simple Hamiltonian dynamics is a powerful quantum processing resource | Akitada Sakurai et.al. | 2405.14245 | null |
2024-05-23 | ChronosLex: Time-aware Incremental Training for Temporal Generalization of Legal Classification Tasks | T. Y. S. S Santosh et.al. | 2405.14211 | null |
2024-05-22 | Just rotate it! Uncertainty estimation in closed-source models via multiple queries | Konstantinos Pitas et.al. | 2405.13864 | null |
2024-05-21 | Decentralized Federated Learning Over Imperfect Communication Channels | Weicai Li et.al. | 2405.12894 | null |
2024-05-21 | Multimodal Adaptive Inference for Document Image Classification with Anytime Early Exiting | Omar Hamed et.al. | 2405.12705 | null |
2024-05-21 | Exploration of Masked and Causal Language Modelling for Text Generation | Nicolo Micheletti et.al. | 2405.12630 | null |
2024-05-21 | 3DSS-Mamba: 3D-Spectral-Spatial Mamba for Hyperspectral Image Classification | Yan He et.al. | 2405.12487 | null |
2024-05-20 | Alzheimer's Magnetic Resonance Imaging Classification Using Deep and Meta-Learning Models | Nida Nasir et.al. | 2405.12126 | null |
2024-05-20 | Mamba-in-Mamba: Centralized Mamba-Cross-Scan in Tokenized Mamba Model for Hyperspectral Image Classification | Weilian Zhou et.al. | 2405.12003 | link |
2024-05-20 | A Constraint-Enforcing Reward for Adversarial Attacks on Text Classifiers | Tom Roth et.al. | 2405.11904 | null |
2024-05-21 | A Novel Cartography-Based Curriculum Learning Method Applied on RoNLI: The First Romanian Natural Language Inference Corpus | Eduard Poesina et.al. | 2405.11877 | link |
2024-05-20 | SSAMBA: Self-Supervised Audio Representation Learning with Mamba State Space Model | Siavash Shams et.al. | 2405.11831 | link |
2024-05-20 | Exploring Ordinality in Text Classification: A Comparative Study of Explicit and Implicit Techniques | Siva Rajesh Kasa et.al. | 2405.11775 | null |
2024-05-19 | SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-parameterized Batch Normalization | Jialong Guo et.al. | 2405.11582 | link |
2024-05-19 | Reproducibility Study of CDUL: CLIP-Driven Unsupervised Learning for Multi-Label Image Classification | Manan Shah et.al. | 2405.11574 | link |
2024-05-19 | An Invisible Backdoor Attack Based On Semantic Feature | Yangming Chen et.al. | 2405.11551 | null |
2024-05-19 | Verification technology for finger vein biometric | George Kumi Kyeremeh et.al. | 2405.11540 | null |
2024-05-17 | Reduced storage direct tensor ring decomposition for convolutional neural networks compression | Mateusz Gabor et.al. | 2405.10802 | link |
2024-05-17 | Benchmarking Large Language Models on CFLUE -- A Chinese Financial Language Understanding Evaluation Dataset | Jie Zhu et.al. | 2405.10542 | link |
2024-05-17 | Smart Expert System: Large Language Models as Text Classifiers | Zhiqiang Wang et.al. | 2405.10523 | link |
2024-05-16 | Data-Efficient Low-Complexity Acoustic Scene Classification in the DCASE 2024 Challenge | Florian Schmid et.al. | 2405.10018 | null |
2024-05-16 | ROCOv2: Radiology Objects in COntext Version 2, an Updated Multimodal Image Dataset | Johannes Rückert et.al. | 2405.10004 | link |
2024-05-15 | Improving Label Error Detection and Elimination with Uncertainty Quantification | Johannes Jakubik et.al. | 2405.09602 | null |
2024-05-15 | Tackling Distribution Shifts in Task-Oriented Communication with Information Bottleneck | Hongru Li et.al. | 2405.09514 | null |
2024-05-15 | Feature-based Federated Transfer Learning: Communication Efficiency, Robustness and Privacy | Feng Wang et.al. | 2405.09014 | link |
2024-05-14 | The Pitfalls and Promise of Conformal Inference Under Adversarial Attacks | Ziquan Liu et.al. | 2405.08886 | link |
2024-05-14 | Harnessing the power of longitudinal medical imaging for eye disease prognosis using Transformer-based sequence modeling | Gregory Holste et.al. | 2405.08780 | null |
2024-05-14 | FolkTalent: Enhancing Classification and Tagging of Indian Folk Paintings | Nancy Hada et.al. | 2405.08776 | null |
2024-05-14 | The impact of Compositionality in Zero-shot Multi-label action recognition for Object-based tasks | Carmela Calabrese et.al. | 2405.08695 | null |
2024-05-14 | Achieving Fairness Through Channel Pruning for Dermatological Disease Diagnosis | Qingpeng Kong et.al. | 2405.08681 | link |
2024-05-14 | Investigating Design Choices in Joint-Embedding Predictive Architectures for General Audio Representation Learning | Alain Riou et.al. | 2405.08679 | null |
2024-05-14 | Dual-Branch Network for Portrait Image Quality Assessment | Wei Sun et.al. | 2405.08555 | null |
2024-05-13 | Who's in and who's out? A case study of multimodal CLIP-filtering in DataComp | Rachel Hong et.al. | 2405.08209 | link |
2024-05-14 | MambaOut: Do We Really Need Mamba for Vision? | Weihao Yu et.al. | 2405.07992 | link |
2024-05-13 | Constrained Exploration via Reflected Replica Exchange Stochastic Gradient Langevin Dynamics | Haoyang Zheng et.al. | 2405.07839 | link |
2024-05-13 | Analysis of the rate of convergence of an over-parametrized convolutional neural network image classifier learned by gradient descent | Michael Kohler et.al. | 2405.07619 | null |
2024-05-13 | On-device Online Learning and Semantic Management of TinyML Systems | Haoyu Ren et.al. | 2405.07601 | null |
2024-05-13 | GLiRA: Black-Box Membership Inference Attack via Knowledge Distillation | Andrey V. Galichin et.al. | 2405.07562 | null |
2024-05-13 | Fine-tuning the SwissBERT Encoder Model for Embedding Sentences and Documents | Juri Grosjean et.al. | 2405.07513 | null |
2024-05-13 | MoVL:Exploring Fusion Strategies for the Domain-Adaptive Application of Pretrained Models in Medical Imaging Tasks | Haijiang Tian et.al. | 2405.07411 | null |
2024-05-12 | Explainable Convolutional Neural Networks for Retinal Fundus Classification and Cutting-Edge Segmentation Models for Retinal Blood Vessels from Fundus Images | Fatema Tuj Johora Faria et.al. | 2405.07338 | null |
2024-05-12 | Differentiable Model Scaling using Differentiable Topk | Kai Liu et.al. | 2405.07194 | null |
2024-05-11 | A framework of text-dependent speaker verification for chinese numerical string corpus | Litong Zheng et.al. | 2405.07029 | null |
2024-05-10 | Pseudo-Prompt Generating in Pre-trained Vision-Language Models for Multi-Label Medical Image Classification | Yaoqin Ye et.al. | 2405.06468 | null |
2024-05-10 | Multi-level Personalized Federated Learning on Heterogeneous and Long-Tailed Data | Rongyu Zhang et.al. | 2405.06413 | null |
2024-05-10 | SaudiBERT: A Large Language Model Pretrained on Saudi Dialect Corpora | Faisal Qarah et.al. | 2405.06239 | null |
2024-05-09 | Deep Multi-Task Learning for Malware Image Classification | Ahmed Bensaoud et.al. | 2405.05906 | null |
2024-05-09 | Enhancing Suicide Risk Detection on Social Media through Semi-Supervised Deep Label Smoothing | Matthew Squires et.al. | 2405.05795 | null |
2024-05-09 | CSA-Net: Channel-wise Spatially Autocorrelated Attention Networks | Nick et.al. | 2405.05755 | null |
2024-05-09 | How Quality Affects Deep Neural Networks in Fine-Grained Image Classification | Joseph Smith et.al. | 2405.05742 | null |
2024-05-09 | End-to-End Generative Semantic Communication Powered by Shared Semantic Knowledge Base | Shuling Li et.al. | 2405.05738 | null |
2024-05-09 | Using Machine Translation to Augment Multilingual Classification | Adam King et.al. | 2405.05478 | null |
2024-05-08 | AFEN: Respiratory Disease Classification using Ensemble Learning | Rahul Nadkarni et.al. | 2405.05467 | null |
2024-05-08 | XAMPLER: Learning to Retrieve Cross-Lingual In-Context Examples | Peiqin Lin et.al. | 2405.05116 | link |
2024-05-08 | Explanation as a Watermark: Towards Harmless and Multi-bit Model Ownership Verification via Watermarking Feature Attribution | Shuo Shao et.al. | 2405.04825 | null |
2024-05-07 | Exploring Explainable AI Techniques for Improved Interpretability in Lung and Colon Cancer Classification | Mukaffi Bin Moin et.al. | 2405.04610 | link |
2024-05-07 | Pragmatist Intelligence: Where the Principle of Usefulness Can Take ANNs | Antonio Bikić et.al. | 2405.04386 | null |
2024-05-07 | Semi-Supervised Disease Classification based on Limited Medical Image Data | Yan Zhang et.al. | 2405.04295 | null |
2024-05-07 | DCNN: Dual Cross-current Neural Networks Realized Using An Interactive Deep Learning Discriminator for Fine-grained Objects | Da Fu et.al. | 2405.04093 | null |
2024-05-07 | Feature Map Convergence Evaluation for Functional Module | Ludan Zhang et.al. | 2405.04041 | null |
2024-05-07 | VMambaCC: A Visual State Space Model for Crowd Counting | Hao-Yuan Ma et.al. | 2405.03978 | null |
2024-05-06 | On Adversarial Examples for Text Classification by Perturbing Latent Representations | Korn Sooksatra et.al. | 2405.03789 | null |
2024-05-06 | CICA: Content-Injected Contrastive Alignment for Zero-Shot Document Image Classification | Sankalp Sinha et.al. | 2405.03660 | null |
2024-05-06 | Deep Space Separable Distillation for Lightweight Acoustic Scene Classification | ShuQi Ye et.al. | 2405.03567 | null |
2024-05-06 | Liberating Seen Classes: Boosting Few-Shot and Zero-Shot Text Classification via Anchor Generation and Classification Reframing | Han Liu et.al. | 2405.03565 | null |
2024-05-06 | A Lightweight Neural Architecture Search Model for Medical Image Classification | Lunchen Xie et.al. | 2405.03462 | null |
2024-05-06 | Interpretable Network Visualizations: A Human-in-the-Loop Approach for Post-hoc Explainability of CNN-based Image Classification | Matteo Bianchi et.al. | 2405.03301 | null |
2024-05-06 | TED: Accelerate Model Training by Internal Generalization | Jinying Xiao et.al. | 2405.03228 | null |
2024-05-06 | Advancing Multimodal Medical Capabilities of Gemini | Lin Yang et.al. | 2405.03162 | null |
2024-05-05 | A scoping review of using Large Language Models (LLMs) to investigate Electronic Health Records (EHRs) | Lingyao Li et.al. | 2405.03066 | null |
2024-05-05 | Parameter-Efficient Fine-Tuning with Discrete Fourier Transform | Ziqi Gao et.al. | 2405.03003 | null |
2024-05-04 | MMEarth: Exploring Multi-Modal Pretext Tasks For Geospatial Representation Learning | Vishal Nedungadi et.al. | 2405.02771 | null |
2024-05-03 | Multi-method Integration with Confidence-based Weighting for Zero-shot Image Classification | Siqi Yin et.al. | 2405.02155 | null |
2024-05-03 | The Trade-off between Performance, Efficiency, and Fairness in Adapter Modules for Text Classification | Minh Duc Bui et.al. | 2405.02010 | null |
2024-05-03 | Which Identities Are Mobilized: Towards an automated detection of social group appeals in political texts | Felicia Riethmüller et.al. | 2405.01904 | null |
2024-05-02 | PVF (Parameter Vulnerability Factor): A Quantitative Metric Measuring AI Vulnerability and Resilience Against Parameter Corruptions | Xun Jiao et.al. | 2405.01741 | null |
2024-05-02 | Development of Skip Connection in Deep Neural Networks for Computer Vision and Medical Image Analysis: A Survey | Guoping Xu et.al. | 2405.01725 | link |
2024-05-02 | SOAR: Advancements in Small Body Object Detection for Aerial Imagery Using State Space Models and Programmable Gradients | Tushar Verma et.al. | 2405.01699 | null |
2024-05-02 | Explainable AI (XAI) in Image Segmentation in Medicine, Industry, and Beyond: A Survey | Rokas Gipiškis et.al. | 2405.01636 | null |
2024-05-02 | Improving Intervention Efficacy via Concept Realignment in Concept Bottleneck Models | Nishad Singhi et.al. | 2405.01531 | null |
2024-05-03 | Decoupling Feature Extraction and Classification Layers for Calibrated Neural Networks | Mikkel Jordahn et.al. | 2405.01196 | null |
2024-05-02 | Uncertainty-aware self-training with expectation maximization basis transformation | Zijia Wang et.al. | 2405.01175 | null |
2024-05-02 | Transformers Fusion across Disjoint Samples for Hyperspectral Image Classification | Muhammad Ahmad et.al. | 2405.01095 | null |
2024-05-02 | Efficient and Flexible Method for Reducing Moderate-size Deep Neural Networks with Condensation | Tianyi Chen et.al. | 2405.01041 | null |
2024-05-02 | Benchmarking Representations for Speech, Music, and Acoustic Events | Moreno La Quatra et.al. | 2405.00934 | link |
2024-05-01 | Digital-analog quantum convolutional neural networks for image classification | Anton Simen et.al. | 2405.00548 | null |
2024-05-03 | BiomedRAG: A Retrieval Augmented Large Language Model for Biomedicine | Mingchen Li et.al. | 2405.00465 | null |
2024-05-01 | Visual and audio scene classification for detecting discrepancies in video: a baseline method and experimental protocol | Konstantinos Apostolidis et.al. | 2405.00384 | null |
2024-05-01 | Data Augmentation Policy Search for Long-Term Forecasting | Liran Nochumsohn et.al. | 2405.00319 | null |
2024-04-30 | Let's Focus: Focused Backdoor Attack against Federated Transfer Learning | Marco Arazzi et.al. | 2404.19420 | null |
2024-04-30 | Large Language Model Informed Patent Image Retrieval | Hao-Cheng Lo et.al. | 2404.19360 | null |
2024-04-30 | Enhancing Intrinsic Features for Debiasing via Investigating Class-Discerning Common Attributes in Bias-Contrastive Pair | Jeonghoon Park et.al. | 2404.19250 | null |
2024-04-29 | Spectral-Spatial Mamba for Hyperspectral Image Classification | Lingbo Huang et.al. | 2404.18401 | null |
2024-04-28 | TextGram: Towards a better domain-adaptive pretraining | Sharayu Hiwarkhedkar et.al. | 2404.18228 | null |
2024-04-28 | L3Cube-MahaNews: News-based Short Text and Long Document Classification Datasets in Marathi | Saloni Mittal et.al. | 2404.18216 | link |
2024-04-28 | S |
Guanchun Wang et.al. | 2404.18213 | null |
2024-04-27 | Implicit Generative Prior for Bayesian Neural Networks | Yijia Liu et.al. | 2404.18008 | link |
2024-04-27 | Towards Privacy-Preserving Audio Classification Systems | Bhawana Chhaglani et.al. | 2404.18002 | null |
2024-04-27 | A Method of Moments Embedding Constraint and its Application to Semi-Supervised Learning | Michael Majurski et.al. | 2404.17978 | null |
2024-04-27 | Spatial, Temporal, and Geometric Fusion for Remote Sensing Images | Hessah Albanwan et.al. | 2404.17851 | null |
2024-04-27 | Leveraging Cross-Modal Neighbor Representation for Improved CLIP Classification | Chao Yi et.al. | 2404.17753 | link |
2024-04-26 | SPLICE -- Streamlining Digital Pathology Image Processing | Areej Alsaafin et.al. | 2404.17704 | null |
2024-04-26 | SDFD: Building a Versatile Synthetic Face Image Dataset with Diverse Attributes | Georgia Baltsou et.al. | 2404.17255 | null |
2024-04-25 | Incorporating Lexical and Syntactic Knowledge for Unsupervised Cross-Lingual Transfer | Jianyu Zheng et.al. | 2404.16627 | link |
2024-04-25 | IMWA: Iterative Model Weight Averaging Benefits Class-Imbalanced Learning Tasks | Zitong Huang et.al. | 2404.16331 | null |
2024-04-25 | Lacunarity Pooling Layers for Plant Image Classification using Texture Analysis | Akshatha Mohan et.al. | 2404.16268 | link |
2024-04-24 | MiMICRI: Towards Domain-centered Counterfactual Explanations of Cardiovascular Image Classification Models | Grace Guo et.al. | 2404.16174 | null |
2024-04-24 | MoDE: CLIP Data Experts via Clustering | Jiawei Ma et.al. | 2404.16030 | link |
2024-04-26 | A Survey on Visual Mamba | Hanwei Zhang et.al. | 2404.15956 | null |
2024-04-24 | Vision Transformer-based Adversarial Domain Adaptation | Yahan Li et.al. | 2404.15817 | link |
2024-04-24 | Rethinking Model Prototyping through the MedMNIST+ Dataset Collection | Sebastian Doerrich et.al. | 2404.15786 | null |
2024-04-24 | Efficient Multi-Model Fusion with Adversarial Complementary Representation Learning | Zuheng Kang et.al. | 2404.15704 | null |
2024-04-24 | Brain Storm Optimization Based Swarm Learning for Diabetic Retinopathy Image Classification | Liang Qu et.al. | 2404.15585 | null |
2024-04-23 | An MRP Formulation for Supervised Learning: Generalized Temporal Difference Learning Models | Yangchen Pan et.al. | 2404.15518 | null |
2024-04-23 | Deep multi-prototype capsule networks | Saeid Abbassi et.al. | 2404.15445 | null |
2024-04-23 | A review of deep learning-based information fusion techniques for multimodal medical image classification | Yihao Li et.al. | 2404.15022 | null |
2024-04-23 | Social Media and Artificial Intelligence for Sustainable Cities and Societies: A Water Quality Analysis Use-case | Muhammad Asif Auyb et.al. | 2404.14977 | null |
2024-04-23 | Traditional to Transformers: A Survey on Current Trends and Future Prospects for Hyperspectral Image Classification | Muhammad Ahmad et.al. | 2404.14955 | link |
2024-04-23 | Pyramid Hierarchical Transformer for Hyperspectral Image Classification | Muhammad Ahmad et.al. | 2404.14945 | link |
2024-04-23 | Importance of Disjoint Sampling in Conventional and Transformer Models for Hyperspectral Image Classification | Muhammad Ahmad et.al. | 2404.14944 | link |
2024-04-23 | CoProNN: Concept-based Prototypical Nearest Neighbors for Explaining Vision Models | Teodor Chiaburu et.al. | 2404.14830 | link |
2024-04-22 | WangLab at MEDIQA-M3G 2024: Multimodal Medical Answer Generation using Large Language Models | Ronald Xie et.al. | 2404.14567 | null |
2024-04-22 | CKD: Contrastive Knowledge Distillation from A Sample-wise Perspective | Wencheng Zhu et.al. | 2404.14109 | null |
2024-04-21 | EncodeNet: A Framework for Boosting DNN Accuracy with Entropy-driven Generalized Converting Autoencoder | Hasanul Mahmud et.al. | 2404.13770 | null |
2024-04-21 | PEACH: Pretrained-embedding Explanation Across Contextual and Hierarchical Structure | Feiqi Cao et.al. | 2404.13645 | link |
2024-04-21 | I2CANSAY:Inter-Class Analogical Augmentation and Intra-Class Significance Analysis for Non-Exemplar Online Task-Free Continual Learning | Songlin Dong et.al. | 2404.13576 | null |
2024-04-21 | IMO: Greedy Layer-Wise Sparse Representation Learning for Out-of-Distribution Text Classification with Pre-trained Models | Tao Feng et.al. | 2404.13504 | null |
2024-04-20 | Nested-TNT: Hierarchical Vision Transformers with Multi-Scale Feature Processing | Yuang Liu et.al. | 2404.13434 | null |
2024-04-20 | Evaluating Subword Tokenization: Alien Subword Composition and OOV Generalization Challenge | Khuyagbaatar Batsuren et.al. | 2404.13292 | link |
2024-04-20 | 3D-Convolution Guided Spectral-Spatial Transformer for Hyperspectral Image Classification | Shyam Varahagiri et.al. | 2404.13252 | link |
2024-04-19 | On-board classification of underwater images using hybrid classical-quantum CNN based method | Sreeraj Rajan Warrier et.al. | 2404.13130 | null |
2024-04-19 | Next Generation Loss Function for Image Classification | Shakhnaz Akhmedova et.al. | 2404.12948 | null |
2024-04-19 | A Hybrid Generative and Discriminative PointNet on Unordered Point Sets | Yang Ye et.al. | 2404.12925 | null |
2024-04-19 | Transformer-Based Classification Outcome Prediction for Multimodal Stroke Treatment | Danqing Ma et.al. | 2404.12634 | null |
2024-04-18 | When LLMs are Unfit Use FastFit: Fast and Effective Text Classification with Many Classes | Asaf Yehudai et.al. | 2404.12365 | null |
2024-04-18 | Observation, Analysis, and Solution: Exploring Strong Lightweight Vision Transformers via Masked Image Modeling Pre-Training | Jin Gao et.al. | 2404.12210 | link |
2024-04-18 | Concept Induction using LLMs: a user experiment for assessment | Adrita Barua et.al. | 2404.11875 | null |
2024-04-17 | Pretraining Billion-scale Geospatial Foundational Models on Frontier | Aristeidis Tsaris et.al. | 2404.11706 | null |
2024-04-17 | AI-Enhanced Cognitive Behavioral Therapy: Deep Learning and Large Language Models for Extracting Cognitive Pathways from Social Media Texts | Meng Jiang et.al. | 2404.11449 | null |
2024-04-17 | Achieving Rotation Invariance in Convolution Operations: Shifting from Data-Driven to Mechanism-Assured | Hanlin Mo et.al. | 2404.11309 | null |
2024-04-17 | A Progressive Framework of Vision-language Knowledge Distillation and Alignment for Multilingual Scene | Wenbo Zhang et.al. | 2404.11249 | null |
2024-04-17 | A Novel ICD Coding Framework Based on Associated and Hierarchical Code Description Distillation | Bin Zhang et.al. | 2404.11132 | null |
2024-04-17 | Small Language Models are Good Too: An Empirical Study of Zero-Shot Classification | Pierre Lepagnol et.al. | 2404.11122 | null |
2024-04-18 | Supervised Contrastive Vision Transformer for Breast Histopathological Image Classification | Mohammad Shiri et.al. | 2404.11052 | null |
2024-04-17 | InfoMatch: Entropy Neural Estimation for Semi-Supervised Image Classification | Qi Han et.al. | 2404.11003 | link |
2024-04-16 | Incubating Text Classifiers Following User Instruction with Nothing but LLM | Letian Peng et.al. | 2404.10877 | null |
2024-04-16 | Vocabulary-free Image Classification and Semantic Segmentation | Alessandro Conti et.al. | 2404.10864 | link |
2024-04-16 | Assessing The Impact of CNN Auto Encoder-Based Image Denoising on Image Classification Tasks | Mohsen Hami et.al. | 2404.10664 | null |
2024-04-16 | Tree Bandits for Generative Bayes | Sean O'Hagan et.al. | 2404.10436 | null |
2024-04-16 | AudioProtoPNet: An interpretable deep learning model for bird sound classification | René Heinrich et.al. | 2404.10420 | null |
2024-04-16 | Lighter, Better, Faster Multi-Source Domain Adaptation with Gaussian Mixture Models and Optimal Transport | Eduardo Fernandes Montesuma et.al. | 2404.10261 | null |
2024-04-15 | Distributed Federated Learning-Based Deep Learning Model for Privacy MRI Brain Tumor Detection | Lisang Zhou et.al. | 2404.10026 | null |
2024-04-15 | Interaction as Explanation: A User Interaction-based Method for Explaining Image Classification Models | Hyeonggeun Yun et.al. | 2404.09828 | null |
2024-04-15 | Quantization of Large Language Models with an Overdetermined Basis | Daniil Merkulov et.al. | 2404.09737 | null |
2024-04-15 | Pseudo-label Learning with Calibrated Confidence Using an Energy-based Model | Masahito Toba et.al. | 2404.09585 | null |
2024-04-14 | Breast Cancer Image Classification Method Based on Deep Transfer Learning | Weimin Wang et.al. | 2404.09226 | null |
2024-04-14 | Coreset Selection for Object Detection | Hojun Lee et.al. | 2404.09161 | null |
2024-04-13 | Exploring Explainability in Video Action Recognition | Avinab Saha et.al. | 2404.09067 | null |
2024-04-13 | Fast Fishing: Approximating BAIT for Efficient and Scalable Deep Active Image Classification | Denis Huseljic et.al. | 2404.08981 | link |
2024-04-13 | PM2: A New Prompting Multi-modal Model Paradigm for Few-shot Medical Image Classification | Zhenwei Wang et.al. | 2404.08915 | null |
2024-04-12 | VertAttack: Taking advantage of Text Classifiers' horizontal vision | Jonathan Rusert et.al. | 2404.08538 | null |
2024-04-12 | SpectralMamba: Efficient Mamba for Hyperspectral Image Classification | Jing Yao et.al. | 2404.08489 | null |
2024-04-12 | OTTER: Improving Zero-Shot Classification via Optimal Transport | Changho Shin et.al. | 2404.08461 | null |
2024-04-12 | A Survey of Neural Network Robustness Assessment in Image Recognition | Jie Wang et.al. | 2404.08285 | null |
2024-04-12 | Convolutional neural network classification of cancer cytopathology images: taking breast cancer as an example | MingXuan Xiao et.al. | 2404.08279 | null |
2024-04-11 | HGRN2: Gated Linear RNNs with State Expansion | Zhen Qin et.al. | 2404.07904 | link |
2024-04-11 | Exploiting Object-based and Segmentation-based Semantic Features for Deep Learning-based Indoor Scene Classification | Ricardo Pereira et.al. | 2404.07739 | null |
2024-04-11 | Contrastive-Based Deep Embeddings for Label Noise-Resilient Histopathology Image Classification | Lucas Dedieu et.al. | 2404.07605 | link |
2024-04-11 | Learning to Classify New Foods Incrementally Via Compressed Exemplars | Justin Yang et.al. | 2404.07507 | null |
2024-04-11 | Interactive Prompt Debugging with Sequence Salience | Ian Tenney et.al. | 2404.07498 | null |
2024-04-11 | Privacy preserving layer partitioning for Deep Neural Network models | Kishore Rajasekar et.al. | 2404.07437 | null |
2024-04-11 | CopilotCAD: Empowering Radiologists with Report Completion Models and Quantitative Evidence from Medical Image Foundation Models | Sheng Wang et.al. | 2404.07424 | null |
2024-04-11 | Improving Shift Invariance in Convolutional Neural Networks with Translation Invariant Polyphase Sampling | Sourajit Saha et.al. | 2404.07410 | null |
2024-04-10 | Lost in Translation: Modern Neural Networks Still Struggle With Small Realistic Image Transformations | Ofir Shifman et.al. | 2404.07153 | null |
2024-04-10 | Learning of deep convolutional network image classifiers via stochastic gradient descent and over-parametrization | Michael Kohler et.al. | 2404.07128 | null |
2024-04-10 | Accelerating Cardiac MRI Reconstruction with CMRatt: An Attention-Driven Approach | Anam Hashmi et.al. | 2404.06941 | null |
2024-04-10 | Multi-Label Continual Learning for the Medical Domain: A Novel Benchmark | Marina Ceccon et.al. | 2404.06859 | null |
2024-04-10 | Neural Optimizer Equation, Decay Function, and Learning Rate Schedule Joint Evolution | Brandon Morgan et.al. | 2404.06679 | null |
2024-04-09 | Variational Stochastic Gradient Descent for Deep Neural Networks | Haotian Chen et.al. | 2404.06549 | link |
2024-04-09 | On adversarial training and the 1 Nearest Neighbor classifier | Amir Hagai et.al. | 2404.06313 | link |
2024-04-09 | Audio-Visual Generalized Zero-Shot Learning using Pre-Trained Large Multi-Modal Models | David Kurzendörfer et.al. | 2404.06309 | link |
2024-04-09 | Counterfactual Reasoning for Multi-Label Image Classification via Patching-Based Training | Ming-Kun Xie et.al. | 2404.06287 | null |
2024-04-09 | Quantum Circuit |
Yuka Hashimoto et.al. | 2404.06218 | null |
2024-04-09 | VI-OOD: A Unified Representation Learning Framework for Textual Out-of-distribution Detection | Li-Ming Zhan et.al. | 2404.06217 | link |
2024-04-09 | Symmetry-guided gradient descent for quantum neural networks | Kaiming Bian et.al. | 2404.06108 | null |
2024-04-10 | Using Few-Shot Learning to Classify Primary Lung Cancer and Other Malignancy with Lung Metastasis in Cytological Imaging via Endobronchial Ultrasound Procedures | Ching-Kai Lin et.al. | 2404.06080 | null |
2024-04-08 | Neural Cellular Automata for Lightweight, Robust and Explainable Classification of White Blood Cell Images | Michael Deutges et.al. | 2404.05584 | null |
2024-04-08 | On the Convergence of Continual Learning with Adaptive Methods | Seungyub Han et.al. | 2404.05555 | null |
2024-04-08 | Multi-Task Learning for Features Extraction in Financial Annual Reports | Syrielle Montariol et.al. | 2404.05281 | link |
2024-04-08 | Allowing humans to interactively guide machines where to look does not always improve a human-AI team's classification accuracy | Giang Nguyen et.al. | 2404.05238 | null |
2024-04-08 | iVPT: Improving Task-relevant Information Sharing in Visual Prompt Tuning by Cross-layer Dynamic Connection | Nan Zhou et.al. | 2404.05207 | null |
2024-04-08 | Semantic Stealth: Adversarial Text Attacks on NLP Using Several Methods | Roopkatha Dey et.al. | 2404.05159 | null |
2024-04-07 | PairAug: What Can Augmented Image-Text Pairs Do for Radiology? | Yutong Xie et.al. | 2404.04960 | link |
2024-04-07 | GvT: A Graph-based Vision Transformer with Talking-Heads Utilizing Sparsity, Trained from Scratch on Small Datasets | Dongjing Shan et.al. | 2404.04924 | null |
2024-04-06 | Focused Active Learning for Histopathological Image Classification | Arne Schmidt et.al. | 2404.04663 | null |
2024-04-06 | Trustless Audits without Revealing Data or Models | Suppakit Waiwitlikhit et.al. | 2404.04500 | null |
2024-04-05 | Evaluating Adversarial Robustness: A Comparison Of FGSM, Carlini-Wagner Attacks, And The Role of Distillation as Defense Mechanism | Trilokesh Ranjan Sarkar et.al. | 2404.04245 | null |
2024-04-05 | Noisy Label Processing for Classification: A Survey | Mengting Li et.al. | 2404.04159 | null |
2024-04-05 | Learning Correlation Structures for Vision Transformers | Manjin Kim et.al. | 2404.03924 | null |
2024-04-05 | LiDAR-Guided Cross-Attention Fusion for Hyperspectral Band Selection and Image Classification | Judy X Yang et.al. | 2404.03883 | null |
2024-04-04 | Dendrites endow artificial neural networks with accurate, robust and parameter-efficient learning | Spyridon Chavlis et.al. | 2404.03708 | null |
2024-04-05 | A Methodology to Study the Impact of Spiking Neural Network Parameters considering Event-Based Automotive Data | Iqra Bano et.al. | 2404.03493 | null |
2024-04-04 | Meta Invariance Defense Towards Generalizable Robustness to Unknown Adversarial Attacks | Lei Zhang et.al. | 2404.03340 | null |
2024-04-04 | Sparse Concept Bottleneck Models: Gumbel Tricks in Contrastive Learning | Andrei Semenov et.al. | 2404.03323 | link |
2024-04-04 | FACTUAL: A Novel Framework for Contrastive Learning Based Robust SAR Image Classification | Xu Wang et.al. | 2404.03225 | null |
2024-04-03 | Exploring the Trade-off Between Model Performance and Explanation Plausibility of Text Classifiers Using Human Rationales | Lucas E. Resck et.al. | 2404.03098 | link |
2024-04-03 | Guarantees of confidentiality via Hammersley-Chapman-Robbins bounds | Kamalika Chaudhuri et.al. | 2404.02866 | link |
2024-04-03 | FPT: Feature Prompt Tuning for Few-shot Readability Assessment | Ziyang Wang et.al. | 2404.02772 | link |
2024-04-03 | Adversarial Attacks and Dimensionality in Text Classifiers | Nandish Chattopadhyay et.al. | 2404.02660 | null |
2024-04-04 | Non-negative Subspace Feature Representation for Few-shot Learning in Medical Imaging | Keqiang Fan et.al. | 2404.02656 | null |
2024-04-03 | Adaptive Cross-lingual Text Classification through In-Context One-Shot Demonstrations | Emilio Villa-Cueva et.al. | 2404.02452 | link |
2024-04-03 | A Novel Approach to Breast Cancer Histopathological Image Classification Using Cross-Colour Space Feature Fusion and Quantum-Classical Stack Ensemble Method | Sambit Mallick et.al. | 2404.02447 | null |
2024-04-03 | Enhancing Low-Resource LLMs Classification with PEFT and Synthetic Data | Parth Patwa et.al. | 2404.02422 | null |
2024-04-02 | Smooth Deep Saliency | Rudolf Herdt et.al. | 2404.02282 | null |
2024-04-02 | Visual Concept Connectome (VCC): Open World Concept Discovery and their Interlayer Connections in Deep Models | Matthew Kowal et.al. | 2404.02233 | null |
2024-04-02 | ImageNot: A contrast with ImageNet preserves model rankings | Olawale Salaudeen et.al. | 2404.02112 | null |
2024-04-02 | Explainability in JupyterLab and Beyond: Interactive XAI Systems for Integrated and Collaborative Workflows | Grace Guo et.al. | 2404.02081 | null |
2024-04-02 | Ukrainian Texts Classification: Exploration of Cross-lingual Knowledge Transfer Approaches | Daryna Dementieva et.al. | 2404.02043 | null |
2024-04-02 | CAM-Based Methods Can See through Walls | Magamed Taimeskhanov et.al. | 2404.01964 | link |
2024-04-02 | Beyond Image Super-Resolution for Image Recognition with Task-Driven Perceptual Loss | Jaeha Kim et.al. | 2404.01692 | null |
2024-04-02 | A Universal Knowledge Embedded Contrastive Learning Framework for Hyperspectral Image Classification | Quanwei Liu et.al. | 2404.01673 | null |
2024-04-01 | Can Biases in ImageNet Models Explain Generalization? | Paul Gavrikov et.al. | 2404.01509 | link |
2024-04-01 | Parallel Proportional Fusion of Spiking Quantum Neural Network for Optimizing Image Classification | Zuyu Xu et.al. | 2404.01359 | null |
2024-04-01 | Bridging Remote Sensors with Multisensor Geospatial Foundation Models | Boran Han et.al. | 2404.01260 | link |
2024-04-01 | Diagnosis of Skin Cancer Using VGG16 and VGG19 Based Transfer Learning Models | Amir Faghihi et.al. | 2404.01160 | null |
2024-03-29 | Learn "No" to Say "Yes" Better: Improving Vision-Language Models via Negations | Jaisidh Singh et.al. | 2403.20312 | link |
2024-03-29 | MCNet: A crowd denstity estimation network based on integrating multiscale attention module | Qiang Guo et.al. | 2403.20173 | null |
2024-03-29 | Segmentation, Classification and Interpretation of Breast Cancer Medical Images using Human-in-the-Loop Machine Learning | David Vázquez-Lema et.al. | 2403.20112 | null |
2024-03-29 | Adverb Is the Key: Simple Text Data Augmentation with Adverb Deletion | Juhwan Choi et.al. | 2403.20015 | null |
2024-03-29 | Diverse Feature Learning by Self-distillation and Reset | Sejik Park et.al. | 2403.19941 | null |
2024-03-29 | Heterogeneous Network Based Contrastive Learning Method for PolSAR Land Cover Classification | Jianfeng Cai et.al. | 2403.19902 | link |
2024-03-28 | X-MIC: Cross-Modal Instance Conditioning for Egocentric Action Generalization | Anna Kukleva et.al. | 2403.19811 | link |
2024-03-28 | RSMamba: Remote Sensing Image Classification with State Space Model | Keyan Chen et.al. | 2403.19654 | link |
2024-03-28 | Enhance Image Classification via Inter-Class Image Mixup with Diffusion Model | Zhicai Wang et.al. | 2403.19600 | link |
2024-03-28 | The Bad Batches: Enhancing Self-Supervised Learning in Image Classification Through Representative Batch Curation | Ozgu Goksu et.al. | 2403.19579 | null |
2024-03-28 | Low-Rank Rescaled Vision Transformer Fine-Tuning: A Residual Design Approach | Wei Dong et.al. | 2403.19067 | link |
2024-03-27 | Evaluating Large Language Models for Health-Related Text Classification Tasks with Public Social Media Data | Yuting Guo et.al. | 2403.19031 | null |
2024-03-27 | Robustness and Visual Explanation for Black Box Image, Video, and ECG Signal Classification with Reinforcement Learning | Soumyendu Sarkar et.al. | 2403.18985 | null |
2024-03-27 | The Impact of Uniform Inputs on Activation Sparsity and Energy-Latency Attacks in Computer Vision | Andreas Müller et.al. | 2403.18587 | link |
2024-03-27 | Uncertainty-Aware SAR ATR: Defending Against Adversarial Attacks via Bayesian Neural Networks | Tian Ye et.al. | 2403.18318 | null |
2024-03-27 | Multi-scale Unified Network for Image Classification | Wenzhuo Liu et.al. | 2403.18294 | null |
2024-03-26 | The Need for Speed: Pruning Transformers with One Recipe | Samir Khaki et.al. | 2403.17921 | link |
2024-03-26 | Compressed Multi-task embeddings for Data-Efficient Downstream training and inference in Earth Observation | Carlos Gomes et.al. | 2403.17886 | null |
2024-03-26 | PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition | Chenhongyi Yang et.al. | 2403.17695 | link |
2024-03-26 | Language Models for Text Classification: Is In-Context Learning Enough? | Aleksandra Edwards et.al. | 2403.17661 | null |
2024-03-26 | Boosting Few-Shot Learning with Disentangled Self-Supervised Learning and Meta-Learning for Medical Image Classification | Eva Pachetti et.al. | 2403.17530 | null |
2024-03-26 | HILL: Hierarchy-aware Information Lossless Contrastive Learning for Hierarchical Text Classification | He Zhu et.al. | 2403.17307 | link |
2024-03-25 | Histogram Layers for Neural Engineered Features | Joshua Peeples et.al. | 2403.17176 | link |
2024-03-25 | Task2Box: Box Embeddings for Modeling Asymmetric Task Relationships | Rangel Daroya et.al. | 2403.17173 | link |
2024-03-25 | CipherFormer: Efficient Transformer Private Inference with Low Round Complexity | Weize Wang et.al. | 2403.16860 | null |
2024-03-25 | Assessing the Performance of Deep Learning for Automated Gleason Grading in Prostate Cancer | Dominik Müller et.al. | 2403.16695 | null |
2024-03-25 | DeepGleason: a System for Automated Gleason Grading of Prostate Cancer using Deep Neural Networks | Dominik Müller et.al. | 2403.16678 | link |
2024-03-25 | LARA: Linguistic-Adaptive Retrieval-Augmented LLMs for Multi-Turn Intent Classification | Liu Junhua et.al. | 2403.16504 | null |
2024-03-24 | On machine learning analysis of atomic force microscopy images for image classification, sample surface recognition | Igor Sokolov et.al. | 2403.16230 | null |
2024-03-24 | Leveraging Deep Learning and Xception Architecture for High-Accuracy MRI Classification in Alzheimer Diagnosis | Shaojie Li et.al. | 2403.16212 | null |
2024-03-24 | Multi-Task Learning with Multi-Task Optimization | Lu Bai et.al. | 2403.16162 | null |
2024-03-24 | CBGT-Net: A Neuromimetic Architecture for Robust Classification of Streaming Data | Shreya Sharma et.al. | 2403.15974 | link |
2024-03-23 | A Deep Learning Architectures for Kidney Disease Classification | Muhammad Shoaib Farooq et.al. | 2403.15895 | null |
2024-03-23 | VLUE: A New Benchmark and Multi-task Knowledge Transfer Learning for Vietnamese Natural Language Understanding | Phong Nguyen-Thuan Do et.al. | 2403.15882 | null |
2024-03-23 | VLM-CPL: Consensus Pseudo Labels from Vision-Language Models for Human Annotation-Free Pathological Image Classification | Lanfeng Zhong et.al. | 2403.15836 | null |
2024-03-22 | Your Image is My Video: Reshaping the Receptive Field via Image-To-Video Differentiable AutoAugmentation and Fusion | Sofia Casarin et.al. | 2403.15194 | null |
2024-03-22 | Image Classification with Rotation-Invariant Variational Quantum Circuits | Paul San Sebastian et.al. | 2403.15031 | null |
2024-03-22 | Extracting Human Attention through Crowdsourced Patch Labeling | Minsuk Chang et.al. | 2403.15013 | null |
2024-03-22 | Clean-image Backdoor Attacks | Dazhong Rong et.al. | 2403.15010 | null |
2024-03-22 | ParFormer: Vision Transformer Baseline with Parallel Local Global Token Mixer and Convolution Attention Patch Embedding | Novendra Setyawan et.al. | 2403.15004 | null |
2024-03-22 | MasonTigers at SemEval-2024 Task 8: Performance Analysis of Transformer-based Models on Machine-Generated Text Detection | Sadiya Sayara Chowdhury Puspo et.al. | 2403.14989 | null |
2024-03-21 | Learning with SASQuaTCh: a Novel Variational Quantum Transformer Architecture with Kernel-Based Self-Attention | Ethan N. Evans et.al. | 2403.14753 | null |
2024-03-21 | Estimating Physical Information Consistency of Channel Data Augmentation for Remote Sensing Images | Tom Burgert et.al. | 2403.14547 | null |
2024-03-21 | Multi-Level Explanations for Generative Language Models | Lucas Monteiro Paes et.al. | 2403.14459 | null |
2024-03-21 | Tensor network compressibility of convolutional models | Sukhbinder Singh et.al. | 2403.14379 | null |
2024-03-21 | LayoutLLM: Large Language Model Instruction Tuning for Visually Rich Document Understanding | Masato Fujitake et.al. | 2403.14252 | null |
2024-03-21 | Safeguarding Medical Image Segmentation Datasets against Unauthorized Training via Contour- and Texture-Aware Perturbations | Xun Lin et.al. | 2403.14250 | null |
2024-03-21 | Improving Image Classification Accuracy through Complementary Intra-Class and Inter-Class Mixup | Ye Xu et.al. | 2403.14137 | link |
2024-03-20 | Bridge the Modality and Capacity Gaps in Vision-Language Model Selection | Chao Yi et.al. | 2403.13797 | null |
2024-03-20 | Leveraging feature communication in federated learning for remote sensing image classification | Anh-Kiet Duong et.al. | 2403.13575 | null |
2024-03-20 | MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining | Di Wang et.al. | 2403.13430 | link |
2024-03-20 | Building Optimal Neural Architectures using Interpretable Knowledge | Keith G. Mills et.al. | 2403.13293 | link |
2024-03-19 | LUWA Dataset: Learning Lithic Use-Wear Analysis on Microscopic Images | Jing Zhang et.al. | 2403.13171 | null |
2024-03-19 | Improved EATFormer: A Vision Transformer for Medical Image Classification | Yulong Shisu et.al. | 2403.13167 | null |
2024-03-19 | SIFT-DBT: Self-supervised Initialization and Fine-Tuning for Imbalanced Digital Breast Tomosynthesis Image Classification | Yuexi Du et.al. | 2403.13148 | link |
2024-03-19 | Using evolutionary computation to optimize task performance of unclocked, recurrent Boolean circuits in FPGAs | Raphael Norman-Tenazas et.al. | 2403.13105 | null |
2024-03-19 | Investigating Text Shortening Strategy in BERT: Truncation vs Summarization | Mirza Alim Mutasodirin et.al. | 2403.12799 | link |
2024-03-18 | Posterior Uncertainty Quantification in Neural Networks using Data Augmentation | Luhuan Wu et.al. | 2403.12729 | null |
2024-03-19 | SEVEN: Pruning Transformer Model by Reserving Sentinels | Jinying Xiao et.al. | 2403.12688 | link |
2024-03-19 | Simple Hack for Transformers against Heavy Long-Text Classification on a Time- and Memory-Limited GPU Service | Mirza Alim Mutasodirin et.al. | 2403.12563 | null |
2024-03-19 | Prompt-Guided Adaptive Model Transformation for Whole Slide Image Classification | Yi Lin et.al. | 2403.12537 | null |
2024-03-19 | CrossTune: Black-Box Few-Shot Classification with Label Enhancement | Danqing Luo et.al. | 2403.12468 | null |
2024-03-18 | Generalizing deep learning models for medical image classification | Matta Sarah et.al. | 2403.12167 | null |
2024-03-19 | Leveraging Spatial and Semantic Feature Extraction for Skin Cancer Diagnosis with Capsule Networks and Graph Neural Networks | K. P. Santoso et.al. | 2403.12009 | null |
2024-03-18 | High-energy physics image classification: A Survey of Jet Applications | Hamza Kheddar et.al. | 2403.11934 | null |
2024-03-18 | Better (pseudo-)labels for semi-supervised instance segmentation | François Porcher et.al. | 2403.11675 | null |
2024-03-18 | Continual Forgetting for Pre-trained Vision Models | Hongbo Zhao et.al. | 2403.11530 | link |
2024-03-18 | Uncertainty-Calibrated Test-Time Model Adaptation without Forgetting | Mingkui Tan et.al. | 2403.11491 | null |
2024-03-17 | Potential of Domain Adaptation in Machine Learning in Ecology and Hydrology to Improve Model Extrapolability | Haiyang Shi et.al. | 2403.11331 | null |
2024-03-17 | A Modified Word Saliency-Based Adversarial Attack on Text Classification Models | Hetvi Waghela et.al. | 2403.11297 | null |
2024-03-17 | Forging the Forger: An Attempt to Improve Authorship Verification via Data Augmentation | Silvia Corbara et.al. | 2403.11265 | null |
2024-03-17 | Multiple Teachers-Meticulous Student: A Domain Adaptive Meta-Knowledge Distillation Model for Medical Image Classification | Shahabedin Nabavi et.al. | 2403.11226 | null |
2024-03-16 | Forward Learning of Graph Neural Networks | Namyong Park et.al. | 2403.11004 | null |
2024-03-16 | Understanding Robustness of Visual State Space Models for Image Classification | Chengbin Du et.al. | 2403.10935 | null |
2024-03-16 | Automatic location detection based on deep learning | Anjali Karangiya et.al. | 2403.10912 | null |
2024-03-14 | Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models | Akhil Kedia et.al. | 2403.09635 | link |
2024-03-14 | XCoOp: Explainable Prompt Learning for Computer-Aided Diagnosis via Concept-guided Context Optimization | Yequan Bie et.al. | 2403.09410 | null |
2024-03-14 | ConDiSR: Contrastive Disentanglement and Style Regularization for Single Domain Generalization | Aleksandr Matsun et.al. | 2403.09400 | null |
2024-03-14 | A Hierarchical Fused Quantum Fuzzy Neural Network for Image Classification | Sheng-Yao Wu et.al. | 2403.09318 | null |
2024-03-14 | CLIP-EBC: CLIP Can Count Accurately through Enhanced Blockwise Classification | Yiming Ma et.al. | 2403.09281 | null |
2024-03-14 | Are Vision Language Models Texture or Shape Biased and Can We Steer Them? | Paul Gavrikov et.al. | 2403.09193 | null |
2024-03-14 | Randomized Principal Component Analysis for Hyperspectral Image Classification | Mustafa Ustuner et.al. | 2403.09117 | null |
2024-03-14 | CardioCaps: Attention-based Capsule Network for Class-Imbalanced Echocardiogram Classification | Hyunkyung Han et.al. | 2403.09108 | link |
2024-03-14 | The First to Know: How Token Distributions Reveal Hidden Knowledge in Large Vision-Language Models? | Qinyu Zhao et.al. | 2403.09037 | link |
2024-03-13 | PathM3: A Multimodal Multi-Task Multiple Instance Learning Framework for Whole Slide Image Classification and Captioning | Qifeng Zhou et.al. | 2403.08967 | null |
2024-03-13 | DAM: Dynamic Adapter Merging for Continual Video QA Learning | Feng Cheng et.al. | 2403.08755 | link |
2024-03-13 | Leveraging Compressed Frame Sizes For Ultra-Fast Video Classification | Yuxing Han et.al. | 2403.08580 | null |
2024-03-13 | HOLMES: HOLonym-MEronym based Semantic inspection for Convolutional Image Classifiers | Francesco Dibitonto et.al. | 2403.08536 | link |
2024-03-13 | Pig aggression classification using CNN, Transformers and Recurrent Networks | Junior Silva Souza et.al. | 2403.08528 | null |
2024-03-13 | Reduced Jeffries-Matusita distance: A Novel Loss Function to Improve Generalization Performance of Deep Classification Models | Mohammad Lashkari et.al. | 2403.08408 | null |
2024-03-13 | Iterative Online Image Synthesis via Diffusion Model for Imbalanced Classification | Shuhan Li et.al. | 2403.08407 | null |
2024-03-13 | Advancing Security in AI Systems: A Novel Approach to Detecting Backdoors in Deep Neural Networks | Khondoker Murad Hossain et.al. | 2403.08208 | null |
2024-03-13 | Multiscale Low-Frequency Memory Network for Improved Feature Extraction in Convolutional Neural Networks | Fuzhi Wu et.al. | 2403.08157 | link |
2024-03-12 | Harnessing Artificial Intelligence to Combat Online Hate: Exploring the Challenges and Opportunities of Large Language Models in Hate Speech Detection | Tharindu Kumarage et.al. | 2403.08035 | null |
2024-03-13 | Visual Decoding and Reconstruction via EEG Embeddings with Guided Diffusion | Dongyang Li et.al. | 2403.07721 | link |
2024-03-12 | FPT: Fine-grained Prompt Tuning for Parameter and Memory Efficient Fine Tuning in High-resolution Medical Image Classification | Yijin Huang et.al. | 2403.07576 | null |
2024-03-12 | Backdoor Attack with Mode Mixture Latent Modification | Hongwei Zhang et.al. | 2403.07463 | null |
2024-03-12 | In-context learning enables multimodal large language models to classify cancer pathology images | Dyke Ferber et.al. | 2403.07407 | null |
2024-03-12 | Premonition: Using Generative Models to Preempt Future Data Changes in Continual Learning | Mark D. McDonnell et.al. | 2403.07356 | null |
2024-03-12 | How does promoting the minority fraction affect generalization? A theoretical study of the one-hidden-layer neural network on group imbalance | Hongkang Li et.al. | 2403.07310 | null |
2024-03-12 | A Bayesian Approach to OOD Robustness in Image Classification | Prakhar Kaushik et.al. | 2403.07277 | null |
2024-03-11 | LeOCLR: Leveraging Original Images for Contrastive Learning of Visual Representations | Mohammad Alkhalefi et.al. | 2403.06813 | null |
2024-03-11 | Dynamic Perturbation-Adaptive Adversarial Training on Medical Image Classification | Shuai Li et.al. | 2403.06798 | null |
2024-03-11 | Leveraging Internal Representations of Model for Magnetic Image Classification | Adarsh N L et.al. | 2403.06797 | null |
2024-03-11 | Shortcut Learning in Medical Image Segmentation | Manxi Lin et.al. | 2403.06748 | null |
2024-03-11 | Active Generation for Image Classification | Tao Huang et.al. | 2403.06517 | null |
2024-03-11 | Evolving Knowledge Distillation with Large Language Models and Active Learning | Chengyuan Liu et.al. | 2403.06414 | null |
2024-03-11 | 'One size doesn't fit all': Learning how many Examples to use for In-Context Learning for Improved Text Classification | Manish Chandra et.al. | 2403.06402 | null |
2024-03-10 | Probing Image Compression For Class-Incremental Learning | Justin Yang et.al. | 2403.06288 | null |
2024-03-10 | Bayesian Random Semantic Data Augmentation for Medical Image Classification | Yaoyao Zhu et.al. | 2403.06138 | link |
2024-03-10 | Universal Debiased Editing for Fair Medical Image Classification | Ruinan Jin et.al. | 2403.06104 | null |
2024-03-08 | Tune without Validation: Searching for Learning Rate and Weight Decay on Training Sets | Lorenzo Brigato et.al. | 2403.05532 | null |
2024-03-08 | Generalized Correspondence Matching via Flexible Hierarchical Refinement and Patch Descriptor Distillation | Yu Han et.al. | 2403.05388 | null |
2024-03-08 | The Impact of Quantization on the Robustness of Transformer-based Text Classifiers | Seyed Parsa Neshaei et.al. | 2403.05365 | null |
2024-03-08 | Multiple Instance Learning with random sampling for Whole Slide Image Classification | H. Keshvarikhojasteh et.al. | 2403.05351 | null |
2024-03-08 | Learning Expressive And Generalizable Motion Features For Face Forgery Detection | Jingyi Zhang et.al. | 2403.05172 | null |
2024-03-08 | Defending Against Unforeseen Failure Modes with Latent Adversarial Training | Stephen Casper et.al. | 2403.05030 | link |
2024-03-07 | Fooling Neural Networks for Motion Forecasting via Adversarial Attacks | Edgar Medina et.al. | 2403.04954 | null |
2024-03-07 | T-TAME: Trainable Attention Mechanism for Explaining Convolutional Networks and Vision Transformers | Mariano V. Ntrougkas et.al. | 2403.04523 | null |
2024-03-07 | Source Matters: Source Dataset Impact on Model Robustness in Medical Imaging | Dovile Juodelyte et.al. | 2403.04484 | link |
2024-03-07 | Advancing Biomedical Text Mining with Community Challenges | Hui Zong et.al. | 2403.04261 | null |
2024-03-07 | Scalable On-Chip Optical Linear Processing Unit Using a Single Thin-Film Lithium Niobate Ring Modulator | Zhaoang Deng et.al. | 2403.04216 | null |
2024-03-07 | Scalable and Robust Transformer Decoders for Interpretable Image Classification with Foundation Models | Evelyn Mannix et.al. | 2403.04125 | null |
2024-03-07 | Privacy-preserving Fine-tuning of Large Language Models through Flatness | Tiejin Chen et.al. | 2403.04124 | null |
2024-03-06 | MedMamba: Vision Mamba for Medical Image Classification | Yubiao Yue et.al. | 2403.03849 | link |
2024-03-06 | On the Effectiveness of Distillation in Mitigating Backdoors in Pre-trained Encoder | Tingxu Han et.al. | 2403.03846 | link |
2024-03-06 | RADIA -- Radio Advertisement Detection with Intelligent Analytics | Jorge Álvarez et.al. | 2403.03538 | null |
2024-03-06 | Inverse-Free Fast Natural Gradient Descent Method for Deep Learning | Xinwei Ou et.al. | 2403.03473 | null |
2024-03-06 | Sparse Spiking Neural Network: Exploiting Heterogeneity in Timescales for Pruning Recurrent SNN | Biswadeep Chakraborty et.al. | 2403.03409 | null |
2024-03-05 | RulePrompt: Weakly Supervised Text Classification with Prompting PLMs and Self-Iterative Logical Rules | Miaomiao Li et.al. | 2403.02932 | link |
2024-03-05 | Demonstrating Mutual Reinforcement Effect through Information Flow | Chengguang Gan et.al. | 2403.02902 | null |
2024-03-05 | Quantum Mixed-State Self-Attention Network | Fu Chen et.al. | 2403.02871 | null |
2024-03-05 | SOFIM: Stochastic Optimization Using Regularized Fisher Information Matrix | Gayathri C et.al. | 2403.02833 | null |
2024-03-05 | SGD with Partial Hessian for Deep Neural Networks Optimization | Ying Sun et.al. | 2403.02681 | link |
2024-03-05 | G-EvoNAS: Evolutionary Neural Architecture Search Based on Network Growth | Juan Zou et.al. | 2403.02667 | null |
2024-03-05 | Remove that Square Root: A New Efficient Scale-Invariant Version of AdaGrad | Sayantan Choudhury et.al. | 2403.02648 | link |
2024-03-05 | Modeling Collaborator: Enabling Subjective Vision Classification With Minimal Human Effort via LLM Tool-Use | Imad Eddine Toubal et.al. | 2403.02626 | null |
2024-03-04 | When do Convolutional Neural Networks Stop Learning? | Sahan Ahmad et.al. | 2403.02473 | link |
2024-03-04 | NiNformer: A Network in Network Transformer with Token Mixing Generated Gating Function | Abdullah Nazhat Abdullah et.al. | 2403.02411 | link |
2024-03-02 | Can a Confident Prior Replace a Cold Posterior? | Martin Marek et.al. | 2403.01272 | link |
2024-03-02 | Leveraging Self-Supervised Learning for Scene Recognition in Child Sexual Abuse Imagery | Pedro H. V. Valois et.al. | 2403.01183 | null |
2024-03-02 | Auxiliary Tasks Enhanced Dual-affinity Learning for Weakly Supervised Semantic Segmentation | Lian Xu et.al. | 2403.01156 | null |
2024-03-02 | ELA: Efficient Local Attention for Deep Convolutional Neural Networks | Wei Xu et.al. | 2403.01123 | null |
2024-03-01 | Margin Discrepancy-based Adversarial Training for Multi-Domain Text Classification | Yuan Wu et.al. | 2403.00888 | null |
2024-03-01 | Text classification of column headers with a controlled vocabulary: leveraging LLMs for metadata enrichment | Margherita Martorana et.al. | 2403.00884 | null |
2024-03-01 | SURE: SUrvey REcipes for building reliable and robust deep networks | Yuting Li et.al. | 2403.00543 | link |
2024-03-01 | Invariant Test-Time Adaptation for Vision-Language Model Generalization | Huan Ma et.al. | 2403.00376 | null |
2024-02-29 | TELEClass: Taxonomy Enrichment and LLM-Enhanced Hierarchical Text Classification with Minimal Supervision | Yunyi Zhang et.al. | 2403.00165 | null |
2024-02-29 | Assessing Visually-Continuous Corruption Robustness of Neural Networks Relative to Human Performance | Huakun Shen et.al. | 2402.19401 | null |
2024-02-29 | Stitching Gaps: Fusing Situated Perceptual Knowledge with Vision Transformers for High-Level Image Classification | Delfina Sol Martinez Pandiani et.al. | 2402.19339 | null |
2024-02-29 | Generalizable Whole Slide Image Classification with Fine-Grained Visual-Semantic Interaction | Hao Li et.al. | 2402.19326 | null |
2024-02-29 | Decompose-and-Compose: A Compositional Approach to Mitigating Spurious Correlation | Fahimeh Hosseini Noohdani et.al. | 2402.18919 | null |
2024-02-29 | Utilizing Local Hierarchy with Adversarial Training for Hierarchical Text Classification | Zihan Wang et.al. | 2402.18825 | link |
2024-02-28 | Comparing Importance Sampling Based Methods for Mitigating the Effect of Class Imbalance | Indu Panigrahi et.al. | 2402.18742 | link |
2024-02-28 | Deep Neural Network Models Trained With A Fixed Random Classifier Transfer Better Across Domains | Hafiz Tiomoko Ali et.al. | 2402.18614 | null |
2024-02-28 | Orchid: Flexible and Data-Dependent Convolution for Sequence Modeling | Mahdi Karami et.al. | 2402.18508 | null |
2024-02-28 | Prompt-Driven Dynamic Object-Centric Learning for Single Domain Generalization | Deng Li et.al. | 2402.18447 | null |
2024-02-29 | A Modular System for Enhanced Robustness of Multimedia Understanding Networks via Deep Parametric Estimation | Francesco Barbato et.al. | 2402.18402 | null |
2024-02-28 | A Multimodal Handover Failure Detection Dataset and Baselines | Santosh Thoduka et.al. | 2402.18319 | null |
2024-02-28 | Classes Are Not Equal: An Empirical Study on Image Recognition Fairness | Jiequan Cui et.al. | 2402.18133 | null |
2024-02-27 | Understanding Neural Network Binarization with Forward and Backward Proximal Quantizers | Yiwei Lu et.al. | 2402.17710 | null |
2024-02-27 | SDF2Net: Shallow to Deep Feature Fusion Network for PolSAR Image Classification | Mohammed Q. Alkhatib et.al. | 2402.17672 | link |
2024-02-27 | Predict the Next Word: | Evgenia Ilia et.al. | 2402.17527 | null |
2024-02-27 | Scaling Supervised Local Learning with Augmented Auxiliary Networks | Chenxiang Ma et.al. | 2402.17318 | link |
2024-02-26 | Offline Writer Identification Using Convolutional Neural Network Activation Features | Vincent Christlein et.al. | 2402.17029 | null |
Publish Date | Title | Authors | Code | |
---|---|---|---|---|
2024-08-29 | Space3D-Bench: Spatial 3D Question Answering Benchmark | Emilia Szymanska et.al. | 2408.16662 | null |
2024-08-29 | SODAWideNet++: Combining Attention and Convolutions for Salient Object Detection | Rohit Venkata Sai Dulam et.al. | 2408.16645 | null |
2024-08-29 | UAV-Based Human Body Detector Selection and Fusion for Geolocated Saliency Map Generation | Piotr Rudol et.al. | 2408.16501 | null |
2024-08-29 | Weakly Supervised Object Detection for Automatic Tooth-marked Tongue Recognition | Yongcun Zhang et.al. | 2408.16451 | link |
2024-08-29 | Enhancing Sound Source Localization via False Negative Elimination | Zengjie Song et.al. | 2408.16448 | link |
2024-08-29 | High-yield large-scale suspended graphene membranes over closed cavities for sensor applications | Sebastian Lukas et.al. | 2408.16408 | null |
2024-08-29 | FA-YOLO: Research On Efficient Feature Selection YOLO Improved Algorithm Based On FMDS and AGMF Modules | Yukang Huo et.al. | 2408.16313 | null |
2024-08-29 | Anno-incomplete Multi-dataset Detection | Yiran Xu et.al. | 2408.16247 | null |
2024-08-29 | PolarBEVDet: Exploring Polar Representation for Multi-View 3D Object Detection in Bird's-Eye-View | Zichen Yu et.al. | 2408.16200 | null |
2024-08-28 | ChartEye: A Deep Learning Framework for Chart Information Extraction | Osama Mustafa et.al. | 2408.16123 | null |
2024-08-28 | microYOLO: Towards Single-Shot Object Detection on Microcontrollers | Mark Deutel et.al. | 2408.15865 | null |
2024-08-28 | What is YOLOv8: An In-Depth Exploration of the Internal Features of the Next-Generation Object Detector | Muhammad Yaseen et.al. | 2408.15857 | null |
2024-08-28 | Network transferability of adversarial patches in real-time object detection | Jens Bayer et.al. | 2408.15833 | link |
2024-08-28 | Object Detection for Vehicle Dashcams using Transformers | Osama Mustafa et.al. | 2408.15809 | null |
2024-08-29 | RIDE: Boosting 3D Object Detection for LiDAR Point Clouds via Rotation-Invariant Analysis | Zhaoxuan Wang et.al. | 2408.15643 | null |
2024-08-28 | MMDRFuse: Distilled Mini-Model with Dynamic Refresh for Multi-Modality Image Fusion | Yanglin Deng et.al. | 2408.15641 | link |
2024-08-28 | Semantic and goal-oriented edge computing for satellite Earth Observation | Beatriz Soret et.al. | 2408.15639 | null |
2024-08-28 | Transfer Learning from Simulated to Real Scenes for Monocular 3D Object Detection | Sondos Mohamed et.al. | 2408.15637 | null |
2024-08-28 | Can Visual Language Models Replace OCR-Based Visual Question Answering Pipelines in Production? A Case Study in Retail | Bianca Lamm et.al. | 2408.15626 | null |
2024-08-28 | RoboSense: Large-scale Dataset and Benchmark for Multi-sensor Low-speed Autonomous Driving | Haisheng Su et.al. | 2408.15503 | null |
2024-08-27 | A Review of Transformer-Based Models for Computer Vision Tasks: Capturing Global Context and Spatial Relationships | Gracile Astlin Pereira et.al. | 2408.15178 | null |
2024-08-27 | Adapting Segment Anything Model to Multi-modal Salient Object Detection with Semantic Feature Fusion Guidance | Kunpeng Wang et.al. | 2408.15063 | null |
2024-08-27 | Hierarchical Graph Interaction Transformer with Dynamic Token Clustering for Camouflaged Object Detection | Siyuan Yao et.al. | 2408.15020 | link |
2024-08-27 | Knowledge Discovery in Optical Music Recognition: Enhancing Information Retrieval with Instance Segmentation | Elona Shatri et.al. | 2408.15002 | null |
2024-08-27 | BOX3D: Lightweight Camera-LiDAR Fusion for 3D Object Detection and Localization | Mario A. V. Saucedo et.al. | 2408.14941 | null |
2024-08-26 | PVAFN: Point-Voxel Attention Fusion Network with Multi-Pooling Enhancing for 3D Object Detection | Yidi Li et.al. | 2408.14600 | null |
2024-08-26 | A Survey of Camouflaged Object Detection and Beyond | Fengyang Xiao et.al. | 2408.14562 | null |
2024-08-26 | Beyond Few-shot Object Detection: A Detailed Survey | Vishal Chudasama et.al. | 2408.14249 | null |
2024-08-26 | TC-PDM: Temporally Consistent Patch Diffusion Models for Infrared-to-Visible Video Translation | Anh-Dzung Doan et.al. | 2408.14227 | null |
2024-08-26 | EMDFNet: Efficient Multi-scale and Diverse Feature Network for Traffic Sign Detection | Pengyu Li et.al. | 2408.14189 | null |
2024-08-26 | More Pictures Say More: Visual Intersection Network for Open Set Object Detection | Bingcheng Dong et.al. | 2408.14032 | null |
2024-08-25 | Bridging the Gap between Real-world and Synthetic Images for Testing Autonomous Driving Systems | Mohammad Hossein Amini et.al. | 2408.13950 | null |
2024-08-25 | OpenNav: Efficient Open Vocabulary 3D Object Detection for Smart Wheelchair Navigation | Muhammad Rameez ur Rahman et.al. | 2408.13936 | link |
2024-08-25 | Infrared Domain Adaptation with Zero-Shot Quantization | Burak Sevsay et.al. | 2408.13925 | null |
2024-08-25 | TraIL-Det: Transformation-Invariant Local Feature Networks for 3D LiDAR Object Detection with Unsupervised Pre-Training | Li Li et.al. | 2408.13902 | null |
2024-08-25 | Selectively Dilated Convolution for Accuracy-Preserving Sparse Pillar-based Embedded 3D Object Detection | Seongmin Park et.al. | 2408.13798 | null |
2024-08-24 | Mean Height Aided Post-Processing for Pedestrian Detection | Jing Yuan et.al. | 2408.13646 | null |
2024-08-23 | MCTR: Multi Camera Tracking Transformer | Alexandru Niculescu-Mizil et.al. | 2408.13243 | null |
2024-08-23 | DeTPP: Leveraging Object Detection for Robust Long-Horizon Event Prediction | Ivan Karpukhin et.al. | 2408.13131 | null |
2024-08-23 | VFM-Det: Towards High-Performance Vehicle Detection via Large Foundation Models | Wentao Wu et.al. | 2408.13031 | link |
2024-08-23 | Can AI Assistance Aid in the Grading of Handwritten Answer Sheets? | Pritam Sil et.al. | 2408.12870 | null |
2024-08-23 | Symmetric masking strategy enhances the performance of Masked Image Modeling | Khanh-Binh Nguyen et.al. | 2408.12772 | null |
2024-08-22 | CatFree3D: Category-agnostic 3D Object Detection with Diffusion | Wenjing Bian et.al. | 2408.12747 | null |
2024-08-22 | Revisiting Cross-Domain Problem for LiDAR-based 3D Object Detection | Ruixiao Zhang et.al. | 2408.12708 | null |
2024-08-22 | xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations | Can Qin et.al. | 2408.12590 | null |
2024-08-22 | Enhanced Parking Perception by Multi-Task Fisheye Cross-view Transformers | Antonyo Musabini et.al. | 2408.12575 | null |
2024-08-22 | Comparing YOLOv5 Variants for Vehicle Detection: A Performance Analysis | Athulya Sundaresan Geetha et.al. | 2408.12550 | null |
2024-08-22 | UMAD: University of Macau Anomaly Detection Benchmark Dataset | Dong Li et.al. | 2408.12527 | link |
2024-08-22 | Class-balanced Open-set Semi-supervised Object Detection for Medical Images | Zhanyun Lu et.al. | 2408.12355 | null |
2024-08-22 | OVA-DETR: Open Vocabulary Aerial Object Detection Using Image-Text Alignment and Fusion | Guoting Wei et.al. | 2408.12246 | null |
2024-08-22 | On the Credibility of Backdoor Attacks Against Object Detectors in the Physical World | Bao Gia Doan et.al. | 2408.12122 | null |
2024-08-21 | CARLA Drone: Monocular 3D Object Detection from a Different Perspective | Johannes Meier et.al. | 2408.11958 | null |
2024-08-21 | SBDet: A Symmetry-Breaking Object Detector via Relaxed Rotation-Equivariance | Zhiqiang Wu et.al. | 2408.11760 | null |
2024-08-21 | Video-to-Text Pedestrian Monitoring (VTPM): Leveraging Computer Vision and Large Language Models for Privacy-Preserve Pedestrian Activity Monitoring at Intersections | Ahmed S. Abdelrahman et.al. | 2408.11649 | null |
2024-08-21 | Domain-invariant Progressive Knowledge Distillation for UAV-based Object Detection | Liang Yao et.al. | 2408.11407 | null |
2024-08-20 | On the Potential of Open-Vocabulary Models for Object Detection in Unusual Street Scenes | Sadia Ilyas et.al. | 2408.11221 | null |
2024-08-20 | Quantum Inverse Contextual Vision Transformers (Q-ICVT): A New Frontier in 3D Object Detection for AVs | Sanjay Bhargav Dharavath et.al. | 2408.11207 | link |
2024-08-20 | A Closer Look at Data Augmentation Strategies for Finetuning-Based Low/Few-Shot Object Detection | Vladislav Li et.al. | 2408.10940 | null |
2024-08-20 | Aligning Object Detector Bounding Boxes with Human Preference | Ombretta Strafforello et.al. | 2408.10844 | null |
2024-08-20 | LightMDETR: A Lightweight Approach for Low-Cost Open-Vocabulary Object Detection Training | Binta Sow et.al. | 2408.10787 | null |
2024-08-20 | Just a Hint: Point-Supervised Camouflaged Object Detection | Huafeng Chen et.al. | 2408.10777 | null |
2024-08-21 | Generative AI in Industrial Machine Vision -- A Review | Hans Aoyang Zhou et.al. | 2408.10775 | null |
2024-08-20 | Detection of Intracranial Hemorrhage for Trauma Patients | Antoine P. Sanner et.al. | 2408.10768 | null |
2024-08-20 | SAM-COD: SAM-guided Unified Framework for Weakly-Supervised Camouflaged Object Detection | Huafeng Chen et.al. | 2408.10760 | null |
2024-08-20 | Leveraging Temporal Contexts to Enhance Vehicle-Infrastructure Cooperative Perception | Jiaru Zhong et.al. | 2408.10531 | null |
2024-08-19 | Leveraging Superfluous Information in Contrastive Representation Learning | Xuechu Yu et.al. | 2408.10292 | null |
2024-08-19 | SHARP: Segmentation of Hands and Arms by Range using Pseudo-Depth for Enhanced Egocentric 3D Hand Pose Estimation and Action Recognition | Wiktor Mucha et.al. | 2408.10037 | null |
2024-08-19 | Segment-Anything Models Achieve Zero-shot Robustness in Autonomous Driving | Jun Yan et.al. | 2408.09839 | link |
2024-08-19 | Latent Diffusion for Guided Document Table Generation | Syed Jawwad Haider Hamdani et.al. | 2408.09800 | null |
2024-08-18 | Adversarial Attacked Teacher for Unsupervised Domain Adaptive Object Detection | Kaiwen Wang et.al. | 2408.09431 | null |
2024-08-18 | Boundary-Recovering Network for Temporal Action Detection | Jihwan Kim et.al. | 2408.09354 | null |
2024-08-18 | YOLOv1 to YOLOv10: The fastest and most accurate real-time object detection systems | Chien-Yao Wang et.al. | 2408.09332 | null |
2024-08-17 | GSLAMOT: A Tracklet and Query Graph-based Simultaneous Locating, Mapping, and Multiple Object Tracking System | Shuo Wang et.al. | 2408.09191 | null |
2024-08-17 | PADetBench: Towards Benchmarking Physical Attacks against Object Detection | Jiawei Lian et.al. | 2408.09181 | link |
2024-08-17 | MaskBEV: Towards A Unified Framework for BEV Detection and Map Segmentation | Xiao Zhao et.al. | 2408.09122 | null |
2024-08-17 | Locate Anything on Earth: Advancing Open-Vocabulary Object Detection for Remote Sensing Community | Jiancheng Pan et.al. | 2408.09110 | null |
2024-08-16 | SAM2-UNet: Segment Anything 2 Makes Strong Encoder for Natural and Medical Image Segmentation | Xinyu Xiong et.al. | 2408.08870 | link |
2024-08-16 | Multimodal Relational Triple Extraction with Query-based Entity Object Transformer | Lei Hei et.al. | 2408.08709 | null |
2024-08-16 | Tell Codec What Worth Compressing: Semantically Disentangled Image Coding for Machine with LMMs | Jinming Liu et.al. | 2408.08575 | null |
2024-08-15 | 5%>100%: Breaking Performance Shackles of Full Fine-Tuning on Visual Recognition Tasks | Dongshuo Yin et.al. | 2408.08345 | link |
2024-08-15 | Learned Multimodal Compression for Autonomous Driving | Hadi Hadizadeh et.al. | 2408.08211 | null |
2024-08-16 | OC3D: Weakly Supervised Outdoor 3D Object Detection with Only Coarse Click Annotation | Qiming Xia et.al. | 2408.08092 | null |
2024-08-15 | CamoTeacher: Dual-Rotation Consistency Learning for Semi-Supervised Camouflaged Object Detection | Xunfa Lai et.al. | 2408.08050 | null |
2024-08-15 | Co-Fix3D: Enhancing 3D Object Detection with Collaborative Refinement | Wenxuan Li et.al. | 2408.07999 | null |
2024-08-15 | GOReloc: Graph-based Object-Level Relocalization for Visual SLAM | Yutong Wang et.al. | 2408.07917 | link |
2024-08-14 | See It All: Contextualized Late Aggregation for 3D Dense Captioning | Minjung Kim et.al. | 2408.07648 | null |
2024-08-14 | Panacea+: Panoramic and Controllable Video Generation for Autonomous Driving | Yuqing Wen et.al. | 2408.07605 | null |
2024-08-14 | Infra-YOLO: Efficient Neural Network Structure with Model Compression for Real-Time Infrared Small Object Detection | Zhonglin Chen et.al. | 2408.07455 | null |
2024-08-14 | Sign language recognition based on deep learning and low-cost handcrafted descriptors | Alvaro Leandro Cavalcante Carneiro et.al. | 2408.07244 | link |
2024-08-13 | Vision Language Model for Interpretable and Fine-grained Detection of Safety Compliance in Diverse Workplaces | Zhiling Chen et.al. | 2408.07146 | null |
2024-08-13 | Divide and Conquer: Improving Multi-Camera 3D Perception with 2D Semantic-Depth Priors and Input-Dependent Queries | Qi Song et.al. | 2408.06901 | null |
2024-08-13 | Integrating Saliency Ranking and Reinforcement Learning for Enhanced Object Detection | Matthias Bartolo et.al. | 2408.06803 | link |
2024-08-13 | Exploring Domain Shift on Radar-Based 3D Object Detection Amidst Diverse Environmental Conditions | Miao Zhang et.al. | 2408.06772 | null |
2024-08-13 | Unified-IoU: For High-Quality Object Detection | Xiangjie Luo et.al. | 2408.06636 | link |
2024-08-13 | A lightweight YOLOv5-FFM model for occlusion pedestrian detection | Xiangjie Luo et.al. | 2408.06633 | null |
2024-08-13 | MV-DETR: Multi-modality indoor object detection by Multi-View DEtecton TRansformers | Zichao Dong et.al. | 2408.06604 | null |
2024-08-12 | Latent Disentanglement for Low Light Image Enhancement | Zhihao Zheng et.al. | 2408.06245 | null |
2024-08-12 | MR3D-Net: Dynamic Multi-Resolution 3D Sparse Voxel Grid Fusion for LiDAR-Based Collective Perception | Sven Teufel et.al. | 2408.06137 | link |
2024-08-12 | DPDETR: Decoupled Position Detection Transformer for Infrared-Visible Object Detection | Junjie Guo et.al. | 2408.06123 | null |
2024-08-12 | Optimizing Vision Transformers with Data-Free Knowledge Transfer | Gousia Habib et.al. | 2408.05952 | null |
2024-08-12 | MV2DFusion: Leveraging Modality-Specific Object Semantics for Multi-Modal 3D Detection | Zitian Wang et.al. | 2408.05945 | null |
2024-08-12 | Multi-scale Contrastive Adaptor Learning for Segmenting Anything in Underperformed Scenes | Ke Zhou et.al. | 2408.05936 | null |
2024-08-12 | Weakly Supervised Video Anomaly Detection and Localization with Spatio-Temporal Prompts | Peng Wu et.al. | 2408.05905 | null |
2024-08-12 | Toward Pedestrian Head Tracking: A Benchmark Dataset and an Information Fusion Network | Kailai Sun et.al. | 2408.05877 | null |
2024-08-11 | U-DECN: End-to-End Underwater Object Detection ConvNet with Improved DeNoising Training | Zhuoyan Liu et.al. | 2408.05780 | link |
2024-08-11 | FADE: A Dataset for Detecting Falling Objects around Buildings in Video | Zhigang Tu et.al. | 2408.05750 | null |
2024-08-09 | DeepInteraction++: Multi-Modality Interaction for Autonomous Driving | Zeyu Yang et.al. | 2408.05075 | link |
2024-08-09 | RadarPillars: Efficient Object Detection from 4D Radar Point Clouds | Alexander Musiat et.al. | 2408.05020 | null |
2024-08-09 | Hyper-YOLO: When Visual Object Detection Meets Hypergraph Computation | Yifan Feng et.al. | 2408.04804 | link |
2024-08-08 | SOD-YOLOv8 -- Enhancing YOLOv8 for Small Object Detection in Traffic Scenes | Boshra Khalili et.al. | 2408.04786 | null |
2024-08-08 | Data-Driven Pixel Control: Challenges and Prospects | Saurabh Farkya et.al. | 2408.04767 | null |
2024-08-10 | SAM2-Adapter: Evaluating & Adapting Segment Anything 2 in Downstream Tasks: Camouflage, Shadow, Medical Image Segmentation, and More | Tianrun Chen et.al. | 2408.04579 | null |
2024-08-07 | Impact Analysis of Data Drift Towards The Development of Safety-Critical Automotive System | Md Shahi Amran Hossain et.al. | 2408.04476 | null |
2024-08-08 | Detecting Car Speed using Object Detection and Depth Estimation: A Deep Learning Framework | Subhasis Dasgupta et.al. | 2408.04360 | null |
2024-08-08 | Multi-Scale and Detail-Enhanced Segment Anything Model for Salient Object Detection | Shixuan Gao et.al. | 2408.04326 | null |
2024-08-08 | LLM-DetectAIve: a Tool for Fine-Grained Machine-Generated Text Detection | Mervat Abassy et.al. | 2408.04284 | null |
2024-08-08 | Learning to Rewrite: Generalized LLM-Generated Text Detection | Wei Hao et.al. | 2408.04237 | null |
2024-08-07 | PaveCap: The First Multimodal Framework for Comprehensive Pavement Condition Assessment with Dense Captioning and PCI Estimation | Blessing Agyei Kyem et.al. | 2408.04110 | link |
2024-08-07 | Vision-Language Guidance for LiDAR-based Unsupervised 3D Object Detection | Christian Fruhwirth-Reisinger et.al. | 2408.03790 | null |
2024-08-07 | Data Generation Scheme for Thermal Modality with Edge-Guided Adversarial Conditional Diffusion Model | Guoqing Zhu et.al. | 2408.03748 | link |
2024-08-07 | CAS-ViT: Convolutional Additive Self-attention Vision Transformers for Efficient Mobile Applications | Tianfang Zhang et.al. | 2408.03703 | link |
2024-08-07 | L4DR: LiDAR-4DRadar Fusion for Weather-Robust 3D Object Detection | Xun Huang et.al. | 2408.03677 | null |
2024-08-07 | Designing Extremely Memory-Efficient CNNs for On-device Vision Tasks | Jaewook Lee et.al. | 2408.03663 | null |
2024-08-07 | Leveraging LLMs for Enhanced Open-Vocabulary 3D Scene Understanding in Autonomous Driving | Amirhosein Chahe et.al. | 2408.03516 | null |
2024-08-07 | GUI Element Detection Using SOTA YOLO Deep Learning Models | Seyed Shayan Daneshvar et.al. | 2408.03507 | null |
2024-08-06 | AI Foundation Models in Remote Sensing: A Survey | Siqi Lu et.al. | 2408.03464 | null |
2024-08-06 | Biomedical Image Segmentation: A Systematic Literature Review of Deep Learning Based Object Detection Methods | Fazli Wahid et.al. | 2408.03393 | null |
2024-08-06 | Nighttime Pedestrian Detection Based on Fore-Background Contrast Learning | He Yao et.al. | 2408.03030 | null |
2024-08-06 | Diverse Generation while Maintaining Semantic Coordination: A Diffusion-Based Data Augmentation Method for Object Detection | Sen Nie et.al. | 2408.02891 | null |
2024-08-05 | HQOD: Harmonious Quantization for Object Detection | Long Huang et.al. | 2408.02561 | null |
2024-08-05 | Tensorial template matching for fast cross-correlation with rotations and its application for tomography | Antonio Martinez-Sanchez et.al. | 2408.02398 | null |
2024-08-05 | Mixture-of-Noises Enhanced Forgery-Aware Predictor for Multi-Face Manipulation Detection and Localization | Changtao Miao et.al. | 2408.02306 | null |
2024-08-05 | AssemAI: Interpretable Image-Based Anomaly Detection for Manufacturing Pipelines | Renjith Prasad et.al. | 2408.02181 | null |
2024-08-04 | KAN-RCBEVDepth: A multi-modal fusion algorithm in object detection for autonomous driving | Zhihao Lai et.al. | 2408.02088 | null |
2024-08-06 | A Survey and Evaluation of Adversarial Attacks for Object Detection | Khoi Nguyen Tiet Nguyen et.al. | 2408.01934 | null |
2024-08-04 | CAF-YOLO: A Robust Framework for Multi-Scale Lesion Detection in Biomedical Imagery | Zilin Chen et.al. | 2408.01897 | null |
2024-08-03 | Supervised Image Translation from Visible to Infrared Domain for Object Detection | Prahlad Anand et.al. | 2408.01843 | null |
2024-08-03 | Domain penalisation for improved Out-of-Distribution Generalisation | Shuvam Jena et.al. | 2408.01746 | null |
2024-08-03 | LAM3D: Leveraging Attention for Monocular 3D Object Detection | Diana-Alexandra Sas et.al. | 2408.01739 | null |
2024-08-02 | A Robotics-Inspired Scanpath Model Reveals the Importance of Uncertainty and Semantic Object Cues for Gaze Guidance in Dynamic Scenes | Vito Mengers et.al. | 2408.01322 | null |
2024-08-02 | Underwater Object Detection Enhancement via Channel Stabilization | Muhammad Ali et.al. | 2408.01293 | null |
2024-08-02 | PGNeXt: High-Resolution Salient Object Detection via Pyramid Grafting Network | Changqun Xia et.al. | 2408.01137 | null |
2024-08-02 | Effect of Fog Particle Size Distribution on 3D Object Detection Under Adverse Weather Conditions | Ajinkya Shinde et.al. | 2408.01085 | null |
2024-08-02 | Boosting Gaze Object Prediction via Pixel-level Supervision from Vision Foundation Model | Yang Jin et.al. | 2408.01044 | null |
2024-08-02 | MambaST: A Plug-and-Play Cross-Spectral Spatial-Temporal Fuser for Efficient Pedestrian Detection | Xiangbo Gao et.al. | 2408.01037 | null |
2024-08-02 | Visible-Thermal Multiple Object Tracking: Large-scale Video Dataset and Progressive Fusion Approach | Yabin Zhu et.al. | 2408.00969 | null |
2024-08-01 | Joint Neural Networks for One-shot Object Recognition and Detection | Camilo J. Vargas et.al. | 2408.00701 | null |
2024-08-01 | Harnessing Uncertainty-aware Bounding Boxes for Unsupervised 3D Object Detection | Ruiyang Zhang et.al. | 2408.00619 | null |
2024-08-01 | U2UData: A Large-scale Cooperative Perception Dataset for Swarm UAVs Autonomous Flight | Tongtong Feng et.al. | 2408.00606 | null |
2024-08-01 | MUFASA: Multi-View Fusion and Adaptation Network with Spatial Awareness for Radar Object Detection | Xiangyuan Peng et.al. | 2408.00565 | null |
2024-08-01 | Focus, Distinguish, and Prompt: Unleashing CLIP for Efficient and Flexible Scene Text Retrieval | Gangyan Zeng et.al. | 2408.00441 | null |
2024-08-01 | MonoMM: A Multi-scale Mamba-Enhanced Network for Real-time Monocular 3D Object Detection | Youjia Fu et.al. | 2408.00438 | null |
2024-08-01 | DNTextSpotter: Arbitrary-Shaped Scene Text Spotting via Improved Denoising Training | Yu Xie et.al. | 2408.00355 | null |
2024-08-01 | A Simple Background Augmentation Method for Object Detection with Diffusion Model | Yuhang Li et.al. | 2408.00350 | null |
2024-08-01 | Diff3DETR:Agent-based Diffusion Model for Semi-supervised 3D Object Detection | Jiacheng Deng et.al. | 2408.00286 | null |
2024-08-01 | RoCo:Robust Collaborative Perception By Iterative Object Matching and Pose Adjustment | Zhe Huang et.al. | 2408.00257 | null |
2024-07-31 | Dynamic Object Queries for Transformer-based Incremental Object Detection | Jichuan Zhang et.al. | 2407.21687 | null |
2024-07-31 | Spatial Transformer Network YOLO Model for Agricultural Object Detection | Yash Zambre et.al. | 2407.21652 | null |
2024-07-31 | Evaluating SAM2's Role in Camouflaged Object Detection: From SAM to SAM2 | Lv Tang et.al. | 2407.21596 | null |
2024-07-31 | InScope: A New Real-world 3D Infrastructure-side Collaborative Perception Dataset for Open Traffic Scenarios | Xiaofei Zhang et.al. | 2407.21581 | null |
2024-07-31 | Voxel Scene Graph for Intracranial Hemorrhage | Antoine P. Sanner et.al. | 2407.21580 | null |
2024-07-31 | MarvelOVD: Marrying Object Recognition and Vision-Language Models for Robust Open-Vocabulary Object Detection | Kuo Wang et.al. | 2407.21465 | null |
2024-07-31 | Generalized Tampered Scene Text Detection in the era of Generative AI | Chenfan Qu et.al. | 2407.21422 | null |
2024-07-30 | Candidate Distant Trans-Neptunian Objects Detected by the New Horizons Subaru TNO Survey | Wesley C. Fraser et.al. | 2407.21142 | null |
2024-07-30 | What is YOLOv5: A deep look into the internal features of the popular object detector | Rahima Khanam et.al. | 2407.20892 | null |
2024-07-30 | WARM-3D: A Weakly-Supervised Sim2Real Domain Adaptation Framework for Roadside Monocular 3D Object Detection | Xingcheng Zhou et.al. | 2407.20818 | null |
2024-07-31 | Integer-Valued Training and Spike-Driven Inference Spiking Neural Network for High-performance and Energy-efficient Object Detection | Xinhao Luo et.al. | 2407.20708 | link |
2024-07-29 | Uncertainty-Rectified YOLO-SAM for Weakly Supervised ICH Segmentation | Pascal Spiegler et.al. | 2407.20461 | null |
2024-07-29 | MEVDT: Multi-Modal Event-Based Vehicle Detection and Tracking Dataset | Zaid A. El Shair et.al. | 2407.20446 | null |
2024-07-30 | AxiomVision: Accuracy-Guaranteed Adaptive Visual Model Selection for Perspective-Aware Video Analytics | Xiangxiang Dai et.al. | 2407.20124 | link |
2024-07-29 | Octave-YOLO: Cross frequency detection network with octave convolution | Sangjune Shin et.al. | 2407.19746 | null |
2024-07-29 | Cross-Layer Feature Pyramid Transformer for Small Object Detection in Aerial Images | Zewen Du et.al. | 2407.19696 | null |
2024-07-29 | Practical Video Object Detection via Feature Selection and Aggregation | Yuheng Shi et.al. | 2407.19650 | link |
2024-07-28 | Solving Short-Term Relocalization Problems In Monocular Keyframe Visual SLAM Using Spatial And Semantic Data | Azmyin Md. Kamal et.al. | 2407.19518 | link |
2024-07-28 | Depth-Wise Convolutions in Vision Transformers for Efficient Training on Small Datasets | Tianxiao Zhang et.al. | 2407.19394 | link |
2024-07-27 | Sewer Image Super-Resolution with Depth Priors and Its Lightweight Network | Gang Pan et.al. | 2407.19271 | null |
2024-07-27 | Enhancing Tree Type Detection in Forest Fire Risk Assessment: Multi-Stage Approach and Color Encoding with Forest Fire Risk Evaluation Framework for UAV Imagery | Jinda Zhang et.al. | 2407.19184 | null |
2024-07-27 | Reducing Spurious Correlation for Federated Domain Generalization | Shuran Ma et.al. | 2407.19174 | null |
2024-07-27 | Robust Multimodal 3D Object Detection via Modality-Agnostic Decoding and Proximity-based Modality Ensemble | Juhan Cha et.al. | 2407.19156 | link |
2024-07-26 | Local Binary Pattern(LBP) Optimization for Feature Extraction | Zeinab Sedaghatjoo et.al. | 2407.18665 | null |
2024-07-25 | LION: Linear Group RNN for 3D Object Detection in Point Clouds | Zhe Liu et.al. | 2407.18232 | link |
2024-07-25 | XS-VID: An Extremely Small Video Object Detection Dataset | Jiahao Guo et.al. | 2407.18137 | null |
2024-07-25 | SaccadeDet: A Novel Dual-Stage Architecture for Rapid and Accurate Detection in Gigapixel Images | Wenxi Li et.al. | 2407.17956 | null |
2024-07-25 | A Novel Perception Entropy Metric for Optimizing Vehicle Perception with LiDAR Deployment | Yongjiang He et.al. | 2407.17942 | null |
2024-07-25 | Hierarchical Object Detection and Recognition Framework for Practical Plant Disease Diagnosis | Kohei Iwano et.al. | 2407.17906 | null |
2024-07-25 | Advancing 3D Point Cloud Understanding through Deep Transfer Learning: A Comprehensive Survey | Shahab Saquib Sohail et.al. | 2407.17877 | null |
2024-07-25 | Enhancing Fine-grained Object Detection in Aerial Images via Orthogonal Mapping | Haoran Zhu et.al. | 2407.17738 | link |
2024-07-26 | Unsqueeze [CLS] Bottleneck to Learn Rich Representations | Qing Su et.al. | 2407.17671 | link |
2024-07-24 | SDLNet: Statistical Deep Learning Network for Co-Occurring Object Detection and Identification | Binay Kumar Singh et.al. | 2407.17664 | null |
2024-07-24 | PEEKABOO: Hiding parts of an image for unsupervised object localization | Hasib Zunair et.al. | 2407.17628 | link |
2024-07-24 | ALPI: Auto-Labeller with Proxy Injection for 3D Object Detection using 2D Labels Only | Saad Lahlali et.al. | 2407.17197 | null |
2024-07-24 | DVPE: Divided View Position Embedding for Multi-View 3D Object Detection | Jiasen Wang et.al. | 2407.16955 | link |
2024-07-23 | What Matters in Range View 3D Object Detection | Benjamin Wilson et.al. | 2407.16789 | link |
2024-07-23 | A Framework for Pupil Tracking with Event Cameras | Khadija Iddrisu et.al. | 2407.16665 | null |
2024-07-24 | Velocity Driven Vision: Asynchronous Sensor Fusion Birds Eye View Models for Autonomous Vehicles | Seamie Hayes et.al. | 2407.16636 | null |
2024-07-23 | COALA: A Practical and Vision-Centric Federated Learning Platform | Weiming Zhuang et.al. | 2407.16560 | link |
2024-07-23 | Dynamic Retraining-Updating Mean Teacher for Source-Free Object Detection | Trinh Le Ba Khanh et.al. | 2407.16497 | link |
2024-07-23 | MonoWAD: Weather-Adaptive Diffusion Model for Robust Monocular 3D Object Detection | Youngmin Oh et.al. | 2407.16448 | link |
2024-07-23 | ESOD: Efficient Small Object Detection on High-Resolution Images | Kai Liu et.al. | 2407.16424 | null |
2024-07-23 | Understanding Impacts of Electromagnetic Signal Injection Attacks on Object Detection | Youqian Zhang et.al. | 2407.16327 | null |
2024-07-23 | DeepClean: Integrated Distortion Identification and Algorithm Selection for Rectifying Image Corruptions | Aditya Kapoor et.al. | 2407.16302 | null |
2024-07-23 | FoRA: Low-Rank Adaptation Model beyond Multimodal Siamese Network | Weiying Xie et.al. | 2407.16129 | link |
2024-07-22 | PLayerTV: Advanced Player Tracking and Identification for Automatic Soccer Highlight Clips | Håkon Maric Solberg et.al. | 2407.16076 | null |
2024-07-22 | Disentangling spatio-temporal knowledge for weakly supervised object detection and segmentation in surgical video | Guiqiu Liao et.al. | 2407.15794 | null |
2024-07-22 | Towards Open-World Object-based Anomaly Detection via Self-Supervised Outlier Synthesis | Brian K. S. Isaac-Medina et.al. | 2407.15763 | null |
2024-07-22 | Counter Turing Test ( |
Ishan Kavathekar et.al. | 2407.15694 | null |
2024-07-22 | YOLOv10 for Automated Fracture Detection in Pediatric Wrist Trauma X-rays | Ammar Ahmed et.al. | 2407.15689 | link |
2024-07-22 | SS-SFR: Synthetic Scenes Spatial Frequency Response on Virtual KITTI and Degraded Automotive Simulations for Object Detection | Daniel Jakab et.al. | 2407.15646 | null |
2024-07-22 | YOLO-pdd: A Novel Multi-scale PCB Defect Detection Method Using Deep Representations with Sequential Images | Bowen Liu et.al. | 2407.15427 | null |
2024-07-22 | Learning High-resolution Vector Representation from Multi-Camera Images for 3D Object Detection | Zhili Chen et.al. | 2407.15354 | null |
2024-07-22 | Explore the LiDAR-Camera Dynamic Adjustment Fusion for 3D Object Detection | Yiran Yang et.al. | 2407.15334 | null |
2024-07-21 | Weak-to-Strong Compositional Learning from Generative Models for Language-based Object Detection | Kwanyong Park et.al. | 2407.15296 | null |
2024-07-21 | Multiple Object Detection and Tracking in Panoramic Videos for Cycling Safety Analysis | Jingwei Guo et.al. | 2407.15199 | null |
2024-07-19 | Enhancing Layout Hotspot Detection Efficiency with YOLOv8 and PCA-Guided Augmentation | Dongyang Wu et.al. | 2407.14498 | null |
2024-07-19 | MLMT-CNN for Object Detection and Segmentation in Multi-layer and Multi-spectral Images | Majedaldein Almahasneh et.al. | 2407.14473 | null |
2024-07-19 | EmoCAM: Toward Understanding What Drives CNN-based Emotion Recognition | Youssef Doulfoukar et.al. | 2407.14314 | null |
2024-07-19 | Bucketed Ranking-based Losses for Efficient Training of Object Detectors | Feyza Yavuz et.al. | 2407.14204 | link |
2024-07-19 | Visual Text Generation in the Wild | Yuanzhi Zhu et.al. | 2407.14138 | link |
2024-07-18 | GroupMamba: Parameter-Efficient and Accurate Group Visual State Space Model | Abdelrahman Shaker et.al. | 2407.13772 | link |
2024-07-18 | General Geometry-aware Weakly Supervised 3D Object Detection | Guowen Zhang et.al. | 2407.13748 | link |
2024-07-18 | Enhancing Source-Free Domain Adaptive Object Detection with Low-confidence Pseudo Label Distillation | Ilhoon Yoon et.al. | 2407.13524 | link |
2024-07-18 | The use of the symmetric finite difference in the local binary pattern (symmetric LBP) | Zeinab Sedaghatjoo et.al. | 2407.13178 | null |
2024-07-18 | Learning Camouflaged Object Detection from Noisy Pseudo Label | Jin Zhang et.al. | 2407.13157 | null |
2024-07-18 | DFMSD: Dual Feature Masking Stage-wise Knowledge Distillation for Object Detection | Zhourui Zhang et.al. | 2407.13147 | null |
2024-07-18 | FocusDiffuser: Perceiving Local Disparities for Camouflaged Object Detection | Jianwei Zhao et.al. | 2407.13133 | null |
2024-07-17 | AdaLog: Post-Training Quantization for Vision Transformers with Adaptive Logarithm Quantizer | Zhuguanyu Wu et.al. | 2407.12951 | link |
2024-07-17 | Toward INT4 Fixed-Point Training via Exploring Quantization Error for Gradients | Dohyung Kim et.al. | 2407.12637 | null |
2024-07-17 | CerberusDet: Unified Multi-Task Object Detection | Irina Tolstykh et.al. | 2407.12632 | link |
2024-07-17 | Weighting Pseudo-Labels via High-Activation Feature Index Similarity and Object Detection for Semi-Supervised Segmentation | Prantik Howlader et.al. | 2407.12630 | link |
2024-07-17 | Enhancing Wrist Abnormality Detection with YOLO: Analysis of State-of-the-art Single-stage Detection Models | Ammar Ahmed et.al. | 2407.12597 | link |
2024-07-17 | Embracing Events and Frames with Hierarchical Feature Refinement Network for Object Detection | Hu Cao et.al. | 2407.12582 | null |
2024-07-17 | Close the Sim2real Gap via Physically-based Structured Light Synthetic Data Simulation | Kaixin Bai et.al. | 2407.12449 | null |
2024-07-17 | GLARE: Low Light Image Enhancement via Generative Latent Feature based Codebook Retrieval | Han Zhou et.al. | 2407.12431 | link |
2024-07-17 | Exploring Deeper! Segment Anything Model with Depth Perception for Camouflaged Object Detection | Zhenni Yu et.al. | 2407.12339 | null |
2024-07-16 | AFIDAF: Alternating Fourier and Image Domain Adaptive Filters as an Efficient Alternative to Attention in ViTs | Yunling Zheng et.al. | 2407.12217 | null |
2024-07-16 | The object detection method aids in image reconstruction evaluation and clinical interpretation of meniscal abnormalities | Natalia Konovalova et.al. | 2407.12184 | null |
2024-07-16 | A Case for Application-Aware Space Radiation Tolerance in Orbital Computing | Meiqi Wang et.al. | 2407.11853 | null |
2024-07-16 | Improving Unsupervised Video Object Segmentation via Fake Flow Generation | Suhwan Cho et.al. | 2407.11714 | link |
2024-07-16 | Relation DETR: Exploring Explicit Position Relation Prior for Object Detection | Xiuquan Hou et.al. | 2407.11699 | link |
2024-07-16 | Bridge Past and Future: Overcoming Information Asymmetry in Incremental Object Detection | Qijie Mo et.al. | 2407.11499 | null |
2024-07-16 | Crowd-SAM: SAM as a Smart Annotator for Object Detection in Crowded Scenes | Zhi Cai et.al. | 2407.11464 | link |
2024-07-16 | Generative AI Driven Task-Oriented Adaptive Semantic Communications | Yuzhou Fu et.al. | 2407.11354 | null |
2024-07-16 | LaMI-DETR: Open-Vocabulary Detection with Language Model Instruction | Penghui Du et.al. | 2407.11335 | null |
2024-07-16 | TCFormer: Visual Recognition via Token Clustering Transformer | Wang Zeng et.al. | 2407.11321 | link |
2024-07-16 | PADRe: A Unifying Polynomial Attention Drop-in Replacement for Efficient Vision Transformer | Pierre-David Letourneau et.al. | 2407.11306 | null |
2024-07-15 | OpenPSG: Open-set Panoptic Scene Graph Generation via Large Multimodal Models | Zijian Zhou et.al. | 2407.11213 | null |
2024-07-15 | Interpreting Hand gestures using Object Detection and Digits Classification | Sangeetha K et.al. | 2407.10902 | null |
2024-07-15 | RepVF: A Unified Vector Fields Representation for Multi-task 3D Perception | Chunliang Li et.al. | 2407.10876 | link |
2024-07-15 | OPEN: Object-wise Position Embedding for Multi-view 3D Object Detection | Jinghua Hou et.al. | 2407.10753 | null |
2024-07-15 | Anticipating Future Object Compositions without Forgetting | Youssef Zahran et.al. | 2407.10723 | null |
2024-07-15 | OVLW-DETR: Open-Vocabulary Light-Weighted Detection Transformer | Yu Wang et.al. | 2407.10655 | link |
2024-07-15 | Backdoor Attacks against Image-to-Image Networks | Wenbo Jiang et.al. | 2407.10445 | null |
2024-07-14 | Shape2Scene: 3D Scene Representation Learning Through Pre-training on Shape Data | Tuo Feng et.al. | 2407.10200 | link |
2024-07-14 | LabelDistill: Label-guided Cross-modal Knowledge Distillation for Camera-based 3D Object Detection | Sanmin Kim et.al. | 2407.10164 | link |
2024-07-14 | FSD-BEV: Foreground Self-Distillation for Multi-view 3D Object Detection | Zheng Jiang et.al. | 2407.10135 | null |
2024-07-14 | When Pedestrian Detection Meets Multi-Modal Learning: Generalist Model and Benchmark Dataset | Yi Zhang et.al. | 2407.10125 | null |
2024-07-12 | DART: An Automated End-to-End Object Detection Pipeline with Data Diversification, Open-Vocabulary Bounding Box Annotation, Pseudo-Label Review, and Model Training | Chen Xin et.al. | 2407.09174 | link |
2024-07-12 | Open Vocabulary Multi-Label Video Classification | Rohit Gupta et.al. | 2407.09073 | null |
2024-07-12 | DroneMOT: Drone-based Multi-Object Tracking Considering Detection Difficulties and Simultaneous Moving of Drones and Objects | Peng Wang et.al. | 2407.09051 | null |
2024-07-12 | Task-driven single-image super-resolution reconstruction of document scans | Maciej Zyrek et.al. | 2407.08993 | null |
2024-07-11 | OmniNOCS: A unified NOCS dataset and model for 3D lifting of 2D objects | Akshay Krishnan et.al. | 2407.08711 | null |
2024-07-11 | Approaching Outside: Scaling Unsupervised 3D Object Detection from 2D Scene | Ruiyang Zhang et.al. | 2407.08569 | link |
2024-07-11 | Projecting Points to Axes: Oriented Object Detection via Point-Axis Representation | Zeyang Zhao et.al. | 2407.08489 | null |
2024-07-11 | Semi-Supervised Object Detection: A Survey on Progress from CNN to Transformer | Tahira Shehzadi et.al. | 2407.08460 | null |
2024-07-11 | PowerYOLO: Mixed Precision Model for Hardware Efficient Object Detection with Event Data | Dominika Przewlocka-Rus et.al. | 2407.08272 | null |
2024-07-11 | Knowledge distillation to effectively attain both region-of-interest and global semantics from an image where multiple objects appear | Seonwhee Jin et.al. | 2407.08257 | link |
2024-07-11 | Enrich the content of the image Using Context-Aware Copy Paste | Qiushi Guo et.al. | 2407.08151 | null |
2024-07-11 | DMM: Disparity-guided Multispectral Mamba for Oriented Object Detection in Remote Sensing | Minghang Zhou et.al. | 2407.08132 | null |
2024-07-10 | MambaVision: A Hybrid Mamba-Transformer Vision Backbone | Ali Hatamizadeh et.al. | 2407.08083 | link |
2024-07-10 | Bayesian Detector Combination for Object Detection with Crowdsourced Annotations | Zhi Qin Tan et.al. | 2407.07958 | link |
2024-07-10 | Cross Domain Object Detection via Multi-Granularity Confidence Alignment based Mean Teacher | Jiangming Chen et.al. | 2407.07780 | null |
2024-07-10 | LSM: A Comprehensive Metric for Assessing the Safety of Lane Detection Systems in Autonomous Driving | Jörg Gamerdinger et.al. | 2407.07740 | null |
2024-07-10 | Few-Shot Domain Adaptive Object Detection for Microscopic Images | Sumayya Inayat et.al. | 2407.07633 | null |
2024-07-10 | Simplifying Source-Free Domain Adaptation for Object Detection: Effective Self-Training Strategies and Performance Insights | Yan Hao et.al. | 2407.07586 | link |
2024-07-09 | Exploring Camera Encoder Designs for Autonomous Driving Perception | Barath Lakshmanan et.al. | 2407.07276 | null |
2024-07-09 | ConvNLP: Image-based AI Text Detection | Suriya Prakash Jambunathan et.al. | 2407.07225 | null |
2024-07-09 | Category-level Object Detection, Pose Estimation and Reconstruction from Stereo Images | Chuanrui Zhang et.al. | 2407.06984 | null |
2024-07-09 | Cue Point Estimation using Object Detection | Giulia Argüello et.al. | 2407.06823 | link |
2024-07-09 | CoLA: Conditional Dropout and Language-driven Robust Dual-modal Salient Object Detection | Shuang Hao et.al. | 2407.06780 | link |
2024-07-09 | Graph-Based Captioning: Enhancing Visual Descriptions by Interconnecting Region Captions | Yu-Guan Hsieh et.al. | 2407.06723 | null |
2024-07-08 | Stochastic Traveling Salesperson Problem with Neighborhoods for Object Detection | Cheng Peng et.al. | 2407.06366 | null |
2024-07-08 | GeoWATCH for Detecting Heavy Construction in Heterogeneous Time Series of Satellite Images | Jon Crall et.al. | 2407.06337 | null |
2024-07-08 | Multi-clue Consistency Learning to Bridge Gaps Between General and Oriented Object in Semi-supervised Detection | Chenxu Wang et.al. | 2407.05909 | link |
2024-07-08 | Boosting 3D Object Detection with Semantic-Aware Multi-Branch Framework | Hao Jing et.al. | 2407.05769 | null |
2024-07-08 | Short-term Object Interaction Anticipation with Disentangled Object Detection @ Ego4D Short Term Object Interaction Anticipation Challenge | Hyunjin Cho et.al. | 2407.05713 | link |
2024-07-08 | Weakly Supervised Test-Time Domain Adaptation for Object Detection | Anh-Dzung Doan et.al. | 2407.05607 | null |
2024-07-08 | Towards Reflected Object Detection: A Benchmark | Zhongtian Wang et.al. | 2407.05575 | null |
2024-07-08 | GMC: A General Framework of Multi-stage Context Learning and Utilization for Visual Detection Tasks | Xuan Wang et.al. | 2407.05566 | null |
2024-07-07 | CLAMP-ViT: Contrastive Data-Free Learning for Adaptive Post-Training Quantization of ViTs | Akshat Ramachandran et.al. | 2407.05266 | link |
2024-07-07 | Unlocking Textual and Visual Wisdom: Open-Vocabulary 3D Object Detection Enhanced by Comprehensive Guidance from Text and Image | Pengkun Jiao et.al. | 2407.05256 | null |
2024-07-06 | SCSA: Exploring the Synergistic Effects Between Spatial and Channel Attention | Yunzhong Si et.al. | 2407.05128 | null |
2024-07-06 | Quantizing YOLOv7: A Comprehensive Study | Mohammadamin Baghbanbashi et.al. | 2407.04943 | null |
2024-07-05 | SH17: A Dataset for Human Safety and Personal Protective Equipment Detection in Manufacturing Industry | Hafiz Mughees Ahmad et.al. | 2407.04590 | link |
2024-07-05 | Optimizing the image correction pipeline for pedestrian detection in the thermal-infrared domain | Christophe Karam et.al. | 2407.04484 | null |
2024-07-05 | Multi-Branch Auxiliary Fusion YOLO with Re-parameterization Heterogeneous Convolutional for accurate object detection | Zhiqiang Yang et.al. | 2407.04381 | link |
2024-07-05 | Towards Stable 3D Object Detection | Jiabao Wang et.al. | 2407.04305 | null |
2024-07-05 | Research, Applications and Prospects of Event-Based Pedestrian Detection: A Survey | Han Wang et.al. | 2407.04277 | null |
2024-07-04 | LiDAR-based Real-Time Object Detection and Tracking in Dynamic Environments | Wenqiang Du et.al. | 2407.04115 | null |
2024-07-04 | FIPGNet:Pyramid grafting network with feature interaction strategies | Ziyi Ding et.al. | 2407.04085 | null |
2024-07-04 | Detect Closer Surfaces that can be Seen: New Modeling and Evaluation in Cross-domain 3D Object Detection | Ruixiao Zhang et.al. | 2407.04061 | null |
2024-07-04 | The Solution for the GAIIC2024 RGB-TIR object detection Challenge | Xiangyu Wu et.al. | 2407.03872 | null |
2024-07-04 | StreamLTS: Query-based Temporal-Spatial LiDAR Fusion for Cooperative Object Detection | Yunshuang Yuan et.al. | 2407.03825 | null |
2024-07-03 | Visual Grounding with Attention-Driven Constraint Balancing | Weitai Kang et.al. | 2407.03243 | null |
2024-07-03 | Category-Aware Dynamic Label Assignment with High-Quality Oriented Proposal | Mingkui Feng et.al. | 2407.03205 | null |
2024-07-03 | SegVG: Transferring Object Bounding Box to Segmentation for Visual Grounding | Weitai Kang et.al. | 2407.03200 | link |
2024-07-03 | Global Context Modeling in YOLOv8 for Pediatric Wrist Fracture Detection | Rui-Yang Ju et.al. | 2407.03163 | link |
2024-07-03 | YOLOv5, YOLOv8 and YOLOv10: The Go-To Detectors for Real-time Vision | Muhammad Hussain et.al. | 2407.02988 | null |
2024-07-03 | Mast Kalandar at SemEval-2024 Task 8: On the Trail of Textual Origins: RoBERTa-BiLSTM Approach to Detect AI-Generated Text | Jainit Sushil Bafna et.al. | 2407.02978 | null |
2024-07-03 | A Pairwise DomMix Attentive Adversarial Network for Unsupervised Domain Adaptive Object Detection | Jie Shao et.al. | 2407.02835 | null |
2024-07-03 | ADFQ-ViT: Activation-Distribution-Friendly Post-Training Quantization for Vision Transformers | Yanfeng Jiang et.al. | 2407.02763 | null |
2024-07-02 | SMILe: Leveraging Submodular Mutual Information For Robust Few-Shot Object Detection | Anay Majee et.al. | 2407.02665 | null |
2024-07-02 | Robust ADAS: Enhancing Robustness of Machine Learning-based Advanced Driver Assistance Systems for Adverse Weather | Muhammad Zaeem Shahzad et.al. | 2407.02581 | null |
2024-07-02 | Similarity Distance-Based Label Assignment for Tiny Object Detection | Shuohao Shi et.al. | 2407.02394 | link |
2024-07-02 | OpenSlot: Mixed Open-set Recognition with Object-centric Learning | Xu Yin et.al. | 2407.02386 | null |
2024-07-02 | DM3D: Distortion-Minimized Weight Pruning for Lossless 3D Object Detection | Kaixin Xu et.al. | 2407.02098 | null |
2024-07-02 | Multi-Grained Contrast for Data-Efficient Unsupervised Representation Learning | Chengchao Shen et.al. | 2407.02014 | link |
2024-07-02 | Adaptive Modality Balanced Online Knowledge Distillation for Brain-Eye-Computer based Dim Object Detection | Zixing Li et.al. | 2407.01894 | link |
2024-07-01 | Scarecrow monitoring system:employing mobilenet ssd for enhanced animal supervision | Balaji VS et.al. | 2407.01435 | null |
2024-07-01 | Formal Verification of Object Detection | Avraham Raviv et.al. | 2407.01295 | null |
2024-07-01 | Cross-Architecture Auxiliary Feature Space Translation for Efficient Few-Shot Personalized Object Detection | Francesco Barbato et.al. | 2407.01193 | null |
2024-07-01 | Eliminating Position Bias of Language Models: A Mechanistic Approach | Ziqi Wang et.al. | 2407.01100 | null |
2024-07-01 | No More Potentially Dynamic Objects: Static Point Cloud Map Generation based on 3D Object Detection and Ground Projection | Soojin Woo et.al. | 2407.01073 | null |
2024-06-28 | Detecting Subtle Differences between Human and Model Languages Using Spectrum of Relative Likelihood | Yang Xu et.al. | 2406.19874 | link |
2024-07-01 | Mobile Robot Oriented Large-Scale Indoor Dataset for Dynamic Scene Understanding | Yifan Tang et.al. | 2406.19791 | null |
2024-06-28 | Basketball-SORT: An Association Method for Complex Multi-object Occlusion Problems in Basketball Multi-object Tracking | Qingrui Hu et.al. | 2406.19655 | null |
2024-06-27 | Robustness Testing of Black-Box Models Against CT Degradation Through Test-Time Augmentation | Jack Highton et.al. | 2406.19557 | null |
2024-06-27 | BOrg: A Brain Organoid-Based Mitosis Dataset for Automatic Analysis of Brain Diseases | Muhammad Awais et.al. | 2406.19556 | link |
2024-06-27 | Weighted Circle Fusion: Ensembling Circle Representation from Different Object Detection Results | Jialin Yue et.al. | 2406.19540 | null |
2024-06-27 | Stereo Vision Based Robot for Remote Monitoring with VR Support | Mohamed Fazil M. S. et.al. | 2406.19498 | null |
2024-06-27 | HUWSOD: Holistic Self-training for Unified Weakly Supervised Object Detection | Liujuan Cao et.al. | 2406.19394 | link |
2024-06-27 | STAL3D: Unsupervised Domain Adaptation for 3D Object Detection via Collaborating Self-Training and Adversarial Learning | Yanan Zhang et.al. | 2406.19362 | null |
2024-06-27 | Towards Reducing Data Acquisition and Labeling for Defect Detection using Simulated Data | Lukas Malte Kemeter et.al. | 2406.19175 | null |
2024-06-27 | FDLite: A Single Stage Lightweight Face Detector Network | Yogesh Aggarwal et.al. | 2406.19107 | null |
2024-06-27 | Segment Anything Model for automated image data annotation: empirical studies using text prompts from Grounding DINO | Fuseini Mumuni et.al. | 2406.19057 | null |
2024-06-27 | BiCo-Fusion: Bidirectional Complementary LiDAR-Camera Fusion for Semantic- and Spatial-Aware 3D Object Detection | Yang Song et.al. | 2406.19048 | null |
2024-06-27 | A Universal Railway Obstacle Detection System based on Semi-supervised Segmentation And Optical Flow | Qiushi Guo et.al. | 2406.18908 | null |
2024-06-26 | SpY: A Context-Based Approach to Spacecraft Component Detection | Trupti Mahendrakar et.al. | 2406.18709 | null |
2024-06-26 | Unveiling the Unknown: Conditional Evidence Decoupling for Unknown Rejection | Zhaowei Wu et.al. | 2406.18443 | link |
2024-06-26 | Detecting Machine-Generated Texts: Not Just "AI vs Humans" and Explainability is Complicated | Jiazhou Ji et.al. | 2406.18259 | null |
2024-06-26 | CTS: Sim-to-Real Unsupervised Domain Adaptation on 3D Detection | Meiying Zhang et.al. | 2406.18129 | null |
2024-06-26 | The Surprising Effectiveness of Multimodal Large Language Models for Video Moment Retrieval | Meinardus Boris et.al. | 2406.18113 | link |
2024-06-25 | Unmasking the Imposters: In-Domain Detection of Human vs. Machine-Generated Tweets | Bryan E. Tuck et.al. | 2406.17967 | null |
2024-06-25 | ET tu, CLIP? Addressing Common Object Errors for Unseen Environments | Ye Won Byun et.al. | 2406.17876 | null |
2024-06-25 | MDHA: Multi-Scale Deformable Transformer with Hybrid Anchors for Multi-View 3D Object Detection | Michelle Adeline et.al. | 2406.17654 | link |
2024-06-25 | Embedded event based object detection with spiking neural network | Jonathan Courtois et.al. | 2406.17617 | null |
2024-06-27 | Towards Open-set Camera 3D Object Detection | Zhuolin He et.al. | 2406.17297 | null |
2024-06-25 | Exploring Test-Time Adaptation for Object Detection in Continually Changing Environments | Shilei Cao et.al. | 2406.16439 | null |
2024-06-24 | Artistic-style text detector and a new Movie-Poster dataset | Aoxiang Ning et.al. | 2406.16307 | null |
2024-06-24 | Investigating the Influence of Prompt-Specific Shortcuts in AI Generated Text Detection | Choonghyun Park et.al. | 2406.16275 | null |
2024-06-23 | Review of Zero-Shot and Few-Shot AI Algorithms in The Medical Domain | Maged Badawi et.al. | 2406.16143 | null |
2024-06-22 | Understanding Student and Academic Staff Perceptions of AI Use in Assessment and Feedback | Jasper Roe et.al. | 2406.15808 | null |
2024-06-22 | Smart Feature is What You Need | Zhaoxin Hu et.al. | 2406.15805 | link |
2024-06-22 | MR-MLLM: Mutual Reinforcement of Multimodal Comprehension and Vision Perception | Guanqun Wang et.al. | 2406.15768 | null |
2024-06-21 | Towards Robust Training Datasets for Machine Learning with Ontologies: A Case Study for Emergency Road Vehicle Detection | Lynn Vonderhaar et.al. | 2406.15268 | null |
2024-06-21 | DiPEx: Dispersing Prompt Expansion for Class-Agnostic Object Detection | Jia Syuen Lim et.al. | 2406.14924 | null |
2024-06-21 | MOS: Model Synergy for Test-Time Adaptation on LiDAR-Based 3D Object Detection | Zhuoxiao Chen et.al. | 2406.14878 | null |
2024-06-20 | Visible-Thermal Tiny Object Detection: A Benchmark Dataset and Baselines | Xinyi Ying et.al. | 2406.14482 | link |
2024-06-20 | Enhanced Bank Check Security: Introducing a Novel Dataset and Transformer-Based Approach for Detection and Verification | Muhammad Saif Ullah Khan et.al. | 2406.14370 | link |
2024-06-20 | HoTPP Benchmark: Are We Good at the Long Horizon Events Forecasting? | Ivan Karpukhin et.al. | 2406.14341 | link |
2024-06-20 | LeYOLO, New Scalable and Efficient CNN Architecture for Object Detection | Lilian Hollard et.al. | 2406.14239 | link |
2024-06-20 | SSAD: Self-supervised Auxiliary Detection Framework for Panoramic X-ray based Dental Disease Diagnosis | Zijian Cai et.al. | 2406.13963 | link |
2024-06-20 | Towards the in-situ Trunk Identification and Length Measurement of Sea Cucumbers via Bézier Curve Modelling | Shuaixin Liu et.al. | 2406.13951 | link |
2024-06-19 | DPO: Dual-Perturbation Optimization for Test-time Adaptation in 3D Object Detection | Zhuoxiao Chen et.al. | 2406.13891 | link |
2024-06-19 | Semantic Enhanced Few-shot Object Detection | Zheng Wang et.al. | 2406.13498 | null |
2024-06-19 | Snowy Scenes,Clear Detections: A Robust Model for Traffic Light Detection in Adverse Weather Conditions | Shivank Garg et.al. | 2406.13473 | link |
2024-06-19 | Strengthening Layer Interaction via Dynamic Layer Attention | Kaishen Wang et.al. | 2406.13392 | link |
2024-06-18 | Privacy Preserving Federated Learning in Medical Imaging with Uncertainty Estimation | Nikolas Koutsoubis et.al. | 2406.12815 | link |
2024-06-18 | Online Anchor-based Training for Image Classification Tasks | Maria Tzelepi et.al. | 2406.12662 | null |
2024-06-18 | Applying Ensemble Methods to Model-Agnostic Machine-Generated Text Detection | Ivan Ong et.al. | 2406.12570 | null |
2024-06-18 | MultiSocial: Multilingual Benchmark of Machine-Generated Text Detection of Social-Media Texts | Dominik Macko et.al. | 2406.12549 | null |
2024-06-18 | ViDSOD-100: A New Dataset and a Baseline Model for RGB-D Video Salient Object Detection | Junhao Lin et.al. | 2406.12536 | link |
2024-06-18 | SDNIA-YOLO: A Robust Object Detection Model for Extreme Weather Conditions | Yuexiong Ding et.al. | 2406.12395 | null |
2024-06-18 | Competitive Learning for Achieving Content-specific Filters in Video Coding for Machines | Honglei Zhang et.al. | 2406.12367 | null |
2024-06-18 | Certified ML Object Detection for Surveillance Missions | Mohammed Belcaid et.al. | 2406.12362 | null |
2024-06-18 | DASSF: Dynamic-Attention Scale-Sequence Fusion for Aerial Object Detection | Haodong Li et.al. | 2406.12285 | null |
2024-06-18 | The Solution for CVPR2024 Foundational Few-Shot Object Detection Challenge | Hongpeng Pan et.al. | 2406.12225 | null |
2024-06-17 | V3Det Challenge 2024 on Vast Vocabulary and Open Vocabulary Object Detection: Methods and Results | Jiaqi Wang et.al. | 2406.11739 | null |
2024-06-17 | YOLO-FEDER FusionNet: A Novel Deep Learning Architecture for Drone Detection | Tamara R. Lenhard et.al. | 2406.11641 | null |
2024-06-17 | Low-power Ship Detection in Satellite Images Using Neuromorphic Hardware | Gregor Lenz et.al. | 2406.11319 | null |
2024-06-17 | Semi-Supervised Domain Adaptation Using Target-Oriented Domain Augmentation for 3D Object Detection | Yecheol Kim et.al. | 2406.11313 | link |
2024-06-17 | Syn-to-Real Unsupervised Domain Adaptation for Indoor 3D Object Detection | Yunsong Wang et.al. | 2406.11311 | null |
2024-06-17 | Enhancing Generalizability of Representation Learning for Data-Efficient 3D Scene Understanding | Yunsong Wang et.al. | 2406.11283 | null |
2024-06-17 | YOLO9tr: A Lightweight Model for Pavement Damage Detection Utilizing a Generalized Efficient Layer Aggregation Network and Attention Mechanism | Sompote Youwai et.al. | 2406.11254 | link |
2024-06-16 | GANmut: Generating and Modifying Facial Expressions | Maria Surani et.al. | 2406.11079 | null |
2024-06-16 | Exploring the Limitations of Detecting Machine-Generated Text | Jad Doughman et.al. | 2406.11073 | null |
2024-06-16 | Open-Vocabulary X-ray Prohibited Item Detection via Fine-tuning CLIP | Shuyang Lin et.al. | 2406.10961 | null |
2024-06-14 | EFM3D: A Benchmark for Measuring Progress Towards 3D Egocentric Foundation Models | Julian Straub et.al. | 2406.10224 | null |
2024-06-14 | YOLOv1 to YOLOv10: A comprehensive review of YOLO variants and their application in the agricultural domain | Mujadded Al Rabbani Alif et.al. | 2406.10139 | null |
2024-06-14 | Shelf-Supervised Multi-Modal Pre-Training for 3D Object Detection | Mehar Khurana et.al. | 2406.10115 | null |
2024-06-14 | Automated GIS-Based Framework for Detecting Crosswalk Changes from Bi-Temporal High-Resolution Aerial Images | Richard Boadu Antwi et.al. | 2406.09731 | null |
2024-06-14 | An alternate approach for estimating grain-growth kinetics | Manoj Prabakar et.al. | 2406.09653 | null |
2024-06-13 | Scene Graph Generation in Large-Size VHR Satellite Imagery: A Large-Scale Dataset and A Context-Aware Approach | Yansheng Li et.al. | 2406.09410 | link |
2024-06-13 | Towards Evaluating the Robustness of Visual State Space Models | Hashmat Shadab Malik et.al. | 2406.09407 | link |
2024-06-13 | Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models | Yushi Hu et.al. | 2406.09403 | null |
2024-06-13 | Enhanced Object Detection: A Study on Vast Vocabulary Object Detection Track for V3Det Challenge 2024 | Peixi Wu et.al. | 2406.09201 | null |
2024-06-13 | Navigating the Shadows: Unveiling Effective Disturbances for Modern AI Content Detectors | Ying Zhou et.al. | 2406.08922 | link |
2024-06-13 | Computer vision-based model for detecting turning lane features on Florida's public roadways | Richard Boadu Antwi et.al. | 2406.08822 | null |
2024-06-13 | BEVSpread: Spread Voxel Pooling for Bird's-Eye-View Representation in Vision-based Roadside 3D Object Detection | Wenjie Wang et.al. | 2406.08785 | null |
2024-06-12 | UnO: Unsupervised Occupancy Fields for Perception and Forecasting | Ben Agro et.al. | 2406.08691 | null |
2024-06-12 | Transformation-Dependent Adversarial Attacks | Yaoteng Tan et.al. | 2406.08443 | null |
2024-06-12 | Dataset Enhancement with Instance-Level Augmentations | Orest Kupyn et.al. | 2406.08249 | link |
2024-06-12 | Chemistry3D: Robotic Interaction Benchmark for Chemistry Experiments | Shoujie Li et.al. | 2406.08160 | null |
2024-06-12 | CT3D++: Improving 3D Object Detection with Keypoint-induced Channel-wise Transformer | Hualian Sheng et.al. | 2406.08152 | null |
2024-06-12 | MWIRSTD: A MWIR Small Target Detection Dataset | Nikhil Kumar et.al. | 2406.08063 | link |
2024-06-12 | Sense Less, Generate More: Pre-training LiDAR Perception with Masked Autoencoders for Ultra-Efficient 3D Sensing | Sina Tayebati et.al. | 2406.07833 | null |
2024-06-11 | A Deep Learning Approach to Detect Complete Safety Equipment For Construction Workers Based On YOLOv7 | Md. Shariful Islam et.al. | 2406.07707 | null |
2024-06-11 | Transforming a rare event search into a not-so-rare event search in real-time with deep learning-based object detection | J. Schueler et.al. | 2406.07538 | null |
2024-06-11 | Understanding Visual Concepts Across Models | Brandon Trabucco et.al. | 2406.07506 | link |
2024-06-11 | Minimizing Energy Costs in Deep Learning Model Training: The Gaussian Sampling Approach | Challapalli Phanindra Revanth et.al. | 2406.07332 | null |
2024-06-11 | Unsupervised Object Detection with Theoretical Guarantees | Marian Longa et.al. | 2406.07284 | null |
2024-06-11 | Advancing Grounded Multimodal Named Entity Recognition via LLM-Based Reformulation and Box-Based Segmentation | Jinyuan Li et.al. | 2406.07268 | null |
2024-06-11 | EFFOcc: A Minimal Baseline for EFficient Fusion-based 3D Occupancy Network | Yining Shi et.al. | 2406.07042 | link |
2024-06-11 | RS-DFM: A Remote Sensing Distributed Foundation Model for Diverse Downstream Tasks | Zhechao Wang et.al. | 2406.07032 | null |
2024-06-12 | LiSD: An Efficient Multi-Task Learning Framework for LiDAR Segmentation and Detection | Jiahua Xu et.al. | 2406.07023 | null |
2024-06-11 | Teaching with Uncertainty: Unleashing the Potential of Knowledge Distillation in Object Detection | Junfei Yi et.al. | 2406.06999 | null |
2024-06-10 | UnSupDLA: Towards Unsupervised Document Layout Analysis | Talha Uddin Sheikh et.al. | 2406.06236 | null |
2024-06-10 | UEMM-Air: A Synthetic Multi-modal Dataset for Unmanned Aerial Vehicle Object Detection | Fan Liu et.al. | 2406.06230 | link |
2024-06-10 | ReCon1M:A Large-scale Benchmark Dataset for Relation Comprehension in Remote Sensing Imagery | Xian Sun et.al. | 2406.06028 | null |
2024-06-10 | Solution for SMART-101 Challenge of CVPR Multi-modal Algorithmic Reasoning Task 2024 | Jinwoo Ahn et.al. | 2406.05963 | null |
2024-06-10 | Open-Vocabulary Part-Based Grasping | Tjeard van Oort et.al. | 2406.05951 | null |
2024-06-09 | Stealthy Targeted Backdoor Attacks against Image Captioning | Wenshu Fan et.al. | 2406.05874 | null |
2024-06-09 | Scaling Graph Convolutions for Mobile Vision | William Avery et.al. | 2406.05850 | link |
2024-06-09 | Mamba YOLO: SSMs-Based YOLO For Object Detection | Zeyu Wang et.al. | 2406.05835 | link |
2024-06-09 | ControlLoc: Physical-World Hijacking Attack on Visual Perception in Autonomous Driving | Chen Ma et.al. | 2406.05810 | null |
2024-06-09 | SAM-PM: Enhancing Video Camouflaged Object Detection using Spatio-Temporal Attention | Muhammad Nawfal Meeran et.al. | 2406.05802 | link |
2024-06-07 | Nacala-Roof-Material: Drone Imagery for Roof Detection, Classification, and Segmentation to Support Mosquito-borne Disease Risk Assessment | Venkanna Babu Guthula et.al. | 2406.04949 | null |
2024-06-07 | EGOR: Efficient Generated Objects Replay for incremental object detection | Zijia An et.al. | 2406.04829 | null |
2024-06-07 | UCDNet: Multi-UAV Collaborative 3D Object Detection Network by Reliable Feature Mapping | Pengju Tian et.al. | 2406.04648 | null |
2024-06-07 | UVCPNet: A UAV-Vehicle Collaborative Perception Network for 3D Object Detection | Yuchao Wang et.al. | 2406.04647 | null |
2024-06-06 | CORU: Comprehensive Post-OCR Parsing and Receipt Understanding Dataset | Abdelrahman Abdallah et.al. | 2406.04493 | link |
2024-06-06 | DeTra: A Unified Model for Object Detection and Trajectory Forecasting | Sergio Casas et.al. | 2406.04426 | null |
2024-06-06 | Parameter-Inverted Image Pyramid Networks | Xizhou Zhu et.al. | 2406.04330 | link |
2024-06-06 | LenslessFace: An End-to-End Optimized Lensless System for Privacy-Preserving Face Verification | Xin Cai et.al. | 2406.04129 | null |
2024-06-06 | Semmeldetector: Application of Machine Learning in Commercial Bakeries | Thomas H. Schmitt et.al. | 2406.04050 | null |
2024-06-06 | Frequency-based Matcher for Long-tailed Semantic Segmentation | Shan Li et.al. | 2406.03917 | link |
2024-06-06 | Instance Segmentation and Teeth Classification in Panoramic X-rays | Devichand Budagam et.al. | 2406.03747 | link |
2024-06-05 | FedPylot: Navigating Federated Learning for Real-Time Object Detection in Internet of Vehicles | Cyprien Quéméneur et.al. | 2406.03611 | link |
2024-06-05 | LW-DETR: A Transformer Replacement to YOLO for Real-Time Detection | Qiang Chen et.al. | 2406.03459 | link |
2024-06-05 | Global Clipper: Enhancing Safety and Reliability of Transformer-based Object Detection Models | Qutub Syed Sha et.al. | 2406.03229 | null |
2024-06-05 | Situation Monitor: Diversity-Driven Zero-Shot Out-of-Distribution Detection using Budding Ensemble Architecture for Object Detection | Qutub Syed et.al. | 2406.03188 | null |
2024-06-05 | Enhanced Automotive Object Detection via RGB-D Fusion in a DiffusionDet Framework | Eliraz Orfaig et.al. | 2406.03129 | null |
2024-06-04 | Open-YOLO 3D: Towards Fast and Accurate Open-Vocabulary 3D Instance Segmentation | Mohamed El Amine Boudjoghra et.al. | 2406.02548 | link |
2024-06-04 | SatSplatYOLO: 3D Gaussian Splatting-based Virtual Object Detection Ensembles for Satellite Feature Recognition | Van Minh Nguyen et.al. | 2406.02533 | null |
2024-06-04 | GrootVL: Tree Topology is All You Need in State Space Model | Yicheng Xiao et.al. | 2406.02395 | link |
2024-06-04 | Low-Rank Adaption on Transformer-based Oriented Object Detector for Satellite Onboard Processing of Remote Sensing Images | Xinyang Pu et.al. | 2406.02385 | link |
2024-06-04 | Radar Spectra-Language Model for Automotive Scene Parsing | Mariia Pushkareva et.al. | 2406.02158 | null |
2024-06-04 | Detecting Endangered Marine Species in Autonomous Underwater Vehicle Imagery Using Point Annotations and Few-Shot Learning | Heather Doig et.al. | 2406.01932 | null |
2024-06-04 | GeminiFusion: Efficient Pixel-wise Multimodal Fusion for Vision Transformer | Ding Jia et.al. | 2406.01210 | link |
2024-06-03 | Learning Adaptive Fusion Bank for Multi-modal Salient Object Detection | Kunpeng Wang et.al. | 2406.01127 | link |
2024-06-03 | Visual Car Brand Classification by Implementing a Synthetic Image Dataset Creation Pipeline | Jan Lippemeier et.al. | 2406.01071 | null |
2024-06-03 | Multi-Object Tracking based on Imaging Radar 3D Object Detection | Patrick Palmer et.al. | 2406.01011 | null |
2024-05-31 | Power of Cooperative Supervision: Multiple Teachers Framework for Enhanced 3D Semi-Supervised Object Detection | Jin-Hee Lee et.al. | 2405.20720 | link |
2024-05-30 | On Calibration of Object Detectors: Pitfalls, Evaluation and Baselines | Selim Kuzucu et.al. | 2405.20459 | null |
2024-05-30 | RTGen: Generating Region-Text Pairs for Open-Vocabulary Object Detection | Fangyi Chen et.al. | 2405.19854 | null |
2024-05-30 | Improving Object Detector Training on Synthetic Data by Starting With a Strong Baseline Methodology | Frank A. Ruis et.al. | 2405.19822 | null |
2024-05-30 | Towards Unified Multi-granularity Text Detection with Interactive Attention | Xingyu Wan et.al. | 2405.19765 | null |
2024-05-30 | Fully Test-Time Adaptation for Monocular 3D Object Detection | Hongbin Lin et.al. | 2405.19682 | null |
2024-05-30 | YotoR-You Only Transform One Representation | José Ignacio Díaz Villa et.al. | 2405.19629 | null |
2024-05-29 | Enabling Visual Recognition at Radio Frequency | Haowen Lai et.al. | 2405.19516 | null |
2024-05-29 | Model Agnostic Defense against Adversarial Patch Attacks on Object Detection in Unmanned Aerial Vehicles | Saurabh Pathak et.al. | 2405.19179 | null |
2024-05-29 | RGB-T Object Detection via Group Shuffled Multi-receptive Attention and Multi-modal Supervision | Jinzhong Wang et.al. | 2405.18955 | null |
2024-05-29 | SSGA-Net: Stepwise Spatial Global-local Aggregation Networks for for Autonomous Driving | Yiming Cui et.al. | 2405.18857 | null |
2024-05-29 | PillarHist: A Quantization-aware Pillar Feature Encoder based on Height-aware Histogram | Sifan Zhou et.al. | 2405.18734 | null |
2024-05-28 | A Review and Implementation of Object Detection Models and Optimizations for Real-time Medical Mask Detection during the COVID-19 Pandemic | Ioanna Gogou et.al. | 2405.18387 | link |
2024-05-28 | Is a 3D-Tokenized LLM the Key to Reliable Autonomous Driving? | Yifan Bai et.al. | 2405.18361 | null |
2024-05-28 | Intent3D: 3D Object Detection in RGB-D Scans Based on Human Intention | Weitai Kang et.al. | 2405.18295 | null |
2024-05-28 | DMT-JEPA: Discriminative Masked Targets for Joint-Embedding Predictive Architecture | Shentong Mo et.al. | 2405.17995 | null |
2024-05-28 | Transformer and Hybrid Deep Learning Based Models for Machine-Generated Text Detection | Teodor-George Marchitan et.al. | 2405.17964 | null |
2024-05-28 | Self-supervised Pre-training for Transferable Multi-modal Perception | Xiaohao Xu et.al. | 2405.17942 | null |
2024-05-28 | Boosting General Trimap-free Matting in the Real-World Image | Leo Shan Wenzhang Zhou Grace Zhao et.al. | 2405.17916 | null |
2024-05-28 | The Binary Quantized Neural Network for Dense Prediction via Specially Designed Upsampling and Attention | Xingyu Ding et.al. | 2405.17776 | null |
2024-05-27 | Understanding differences in applying DETR to natural and medical images | Yanqi Xu et.al. | 2405.17677 | null |
2024-05-27 | Hardness-Aware Scene Synthesis for Semi-Supervised 3D Object Detection | Shuai Zeng et.al. | 2405.17422 | link |
2024-05-27 | Tracking Small Birds by Detection Candidate Region Filtering and Detection History-aware Association | Tingwei Liu et.al. | 2405.17323 | null |
2024-05-27 | Enhanced Automotive Radar Collaborative Sensing By Exploiting Constructive Interference | Lifan Xu et.al. | 2405.17297 | null |
2024-05-27 | SCaRL- A Synthetic Multi-Modal Dataset for Autonomous Driving | Avinash Nittur Ramesh et.al. | 2405.17030 | null |
2024-05-27 | Collective Perception Datasets for Autonomous Driving: A Comprehensive Review | Sven Teufel et.al. | 2405.16973 | null |
2024-05-27 | OED: Towards One-stage End-to-End Dynamic Scene Graph Generation | Guan Wang et.al. | 2405.16925 | link |
2024-05-27 | ContrastAlign: Toward Robust BEV Feature Alignment via Contrastive Learning for Multi-Modal 3D Object Detection | Ziying Song et.al. | 2405.16873 | null |
2024-05-27 | A re-calibration method for object detection with multi-modal alignment bias in autonomous driving | Zhihang Song et.al. | 2405.16848 | null |
2024-05-26 | A Study on Unsupervised Anomaly Detection and Defect Localization using Generative Model in Ultrasonic Non-Destructive Testing | Yusaku Ando et.al. | 2405.16580 | null |
2024-05-26 | AI-Generated Text Detection and Classification Based on BERT Deep Learning Algorithm | Hao Wang et.al. | 2405.16422 | null |
2024-05-24 | UNION: Unsupervised 3D Object Detection using Object Appearance-based Pseudo-Classes | Ted Lentsch et.al. | 2405.15688 | null |
2024-05-24 | Multimodal Object Detection via Probabilistic a priori Information Integration | Hafsa El Hafyani et.al. | 2405.15596 | null |
2024-05-24 | Scale-Invariant Feature Disentanglement via Adversarial Learning for UAV-based Object Detection | Fan Liu et.al. | 2405.15465 | null |
2024-05-24 | Leveraging knowledge distillation for partial multi-task learning from multiple remote sensing datasets | Hoàng-Ân Lê et.al. | 2405.15394 | null |
2024-05-24 | Towards Global Optimal Visual In-Context Learning Prompt Selection | Chengming Xu et.al. | 2405.15279 | null |
2024-05-24 | Unbiased Faster R-CNN for Single-source Domain Generalized Object Detection | Yajing Liu et.al. | 2405.15225 | null |
2024-05-24 | ODGEN: Domain-specific Object Detection Data Generation with Diffusion Models | Jingyuan Zhu et.al. | 2405.15199 | null |
2024-05-24 | MonoDETRNext: Next-generation Accurate and Efficient Monocular 3D Object Detection Method | Pan Liao et.al. | 2405.15176 | null |
2024-05-23 | Learning to Detect and Segment Mobile Objects from Unlabeled Videos | Yihong Sun et.al. | 2405.14841 | null |
2024-05-23 | Designing A Sustainable Marine Debris Clean-up Framework without Human Labels | Raymond Wang et.al. | 2405.14815 | null |
2024-05-23 | Drones Help Drones: A Collaborative Framework for Multi-Drone Object Trajectory Prediction and Beyond | Zhechao Wang et.al. | 2405.14674 | null |
2024-05-23 | Improving Single Domain-Generalized Object Detection: A Focus on Diversification and Alignment | Muhammad Sohail Danish et.al. | 2405.14497 | null |
2024-05-23 | YOLOv10: Real-Time End-to-End Object Detection | Ao Wang et.al. | 2405.14458 | link |
2024-05-23 | Harmony: A Joint Self-Supervised and Weakly-Supervised Framework for Learning General Purpose Visual Representations | Mohammed Baharoon et.al. | 2405.14239 | null |
2024-05-22 | Two Heads are Better Than One: Neural Networks Quantization with 2D Hilbert Curve-based Output Representation | Mykhailo Uss et.al. | 2405.14024 | null |
2024-05-22 | TS40K: a 3D Point Cloud Dataset of Rural Terrain and Electrical Transmission System | Diogo Lavado et.al. | 2405.13989 | null |
2024-05-22 | Class-Conditional self-reward mechanism for improved Text-to-Image models | Safouane El Ghazouali et.al. | 2405.13473 | link |
2024-05-22 | Adaptive Wireless Image Semantic Transmission and Over-The-Air Testing | Jiarun Ding et.al. | 2405.13403 | null |
2024-05-21 | BiomedParse: a biomedical foundation model for image parsing of everything everywhere all at once | Theodore Zhao et.al. | 2405.12971 | null |
2024-05-21 | AMFD: Distillation via Adaptive Multimodal Fusion for Multispectral Pedestrian Detection | Zizhao Chen et.al. | 2405.12944 | link |
2024-05-21 | Predicting the Influence of Adverse Weather on Pedestrian Detection with Automotive Radar and Lidar Sensors | Daniel Weihmayr et.al. | 2405.12736 | null |
2024-05-21 | Spotting AI's Touch: Identifying LLM-Paraphrased Spans in Text | Yafu Li et.al. | 2405.12689 | null |
2024-05-21 | Automating Attendance Management in Human Resources: A Design Science Approach Using Computer Vision and Facial Recognition | Bao-Thien Nguyen-Tat et.al. | 2405.12633 | null |
2024-05-21 | FFAM: Feature Factorization Activation Map for Explanation of 3D Detectors | Shuai Liu et.al. | 2405.12601 | link |
2024-05-21 | Dataset and Benchmark for Urdu Natural Scenes Text Detection, Recognition and Visual Question Answering | Hiba Maryam et.al. | 2405.12533 | null |
2024-05-21 | Active Object Detection with Knowledge Aggregation and Distillation from Large Models | Dejie Yang et.al. | 2405.12509 | null |
2024-05-21 | Mutual Information Analysis in Multimodal Learning Systems | Hadi Hadizadeh et.al. | 2405.12456 | null |
2024-05-20 | Multi-View Attentive Contextualization for Multi-View 3D Object Detection | Xianpeng Liu et.al. | 2405.12200 | null |
2024-05-20 | Bangladeshi Native Vehicle Detection in Wild | Bipin Saha et.al. | 2405.12150 | link |
2024-05-20 | Salience-guided Ground Factor for Robust Localization of Delivery Robots in Complex Urban Environments | Jooyong Park et.al. | 2405.11855 | null |
2024-05-20 | DATR: Unsupervised Domain Adaptive Detection Transformer with Dataset-Level Adaptation and Prototypical Alignment | Jianhong Han et.al. | 2405.11765 | link |
2024-05-20 | Versatile Teacher: A Class-aware Teacher-student Framework for Cross-domain Adaptation | Runou Yang et.al. | 2405.11754 | link |
2024-05-19 | FADet: A Multi-sensor 3D Object Detection Network based on Local Featured Attention | Ziang Guo et.al. | 2405.11682 | link |
2024-05-19 | SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-parameterized Batch Normalization | Jialong Guo et.al. | 2405.11582 | link |
2024-05-19 | The First Swahili Language Scene Text Detection and Recognition Dataset | Fadila Wendigoundi Douamba et.al. | 2405.11437 | link |
2024-05-18 | InfRS: Incremental Few-Shot Object Detection in Remote Sensing Images | Wuzhou Li et.al. | 2405.11293 | null |
2024-05-18 | Visible and Clear: Finding Tiny Objects in Difference Map | Bing Cao et.al. | 2405.11276 | null |
2024-05-17 | A Versatile Framework for Analyzing Galaxy Image Data by Implanting Human-in-the-loop on a Large Vision Model | Mingxiang Fu et.al. | 2405.10890 | null |
2024-05-17 | DeepPavlov at SemEval-2024 Task 8: Leveraging Transfer Learning for Detecting Boundaries of Machine-Generated Texts | Anastasia Voznyuk et.al. | 2405.10629 | link |
2024-05-17 | DuoSpaceNet: Leveraging Both Bird's-Eye-View and Perspective View Representations for 3D Object Detection | Zhe Huang et.al. | 2405.10577 | null |
2024-05-16 | Drone-type-Set: Drone types detection benchmark for drone detection and tracking | Kholoud AlDosari et.al. | 2405.10398 | null |
2024-05-16 | Grounded 3D-LLM with Referent Tokens | Yilun Chen et.al. | 2405.10370 | null |
2024-05-16 | Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection | Tianhe Ren et.al. | 2405.10300 | link |
2024-05-16 | Towards Task-Compatible Compressible Representations | Anderson de Andrade et.al. | 2405.10244 | link |
2024-05-16 | SpecDETR: A Transformer-based Hyperspectral Point Object Detection Network | Zhaoxu Li et.al. | 2405.10148 | null |
2024-05-16 | SHiNe: Semantic Hierarchy Nexus for Open-vocabulary Object Detection | Mingxuan Liu et.al. | 2405.10053 | null |
2024-05-16 | FPDIoU Loss: A Loss Function for Efficient Bounding Box Regression of Rotated Object Detection | Siliang Ma et.al. | 2405.09942 | null |
2024-05-16 | Infrared Adversarial Car Stickers | Xiaopei Zhu et.al. | 2405.09924 | null |
2024-05-16 | PillarNeXt: Improving the 3D detector by introducing Voxel2Pillar feature encoding and extracting multi-scale features | Xusheng Li et.al. | 2405.09828 | null |
2024-05-16 | Size-invariance Matters: Rethinking Metrics and Losses for Imbalanced Multi-object Salient Object Detection | Feiran Li et.al. | 2405.09782 | link |
2024-05-15 | Synth-to-Real Unsupervised Domain Adaptation for Instance Segmentation | Guo Yachan et.al. | 2405.09682 | null |
2024-05-15 | Dynamic Loss Decay based Robust Oriented Object Detection on Remote Sensing Images with Noisy Labels | Guozhang Liu et.al. | 2405.09024 | null |
2024-05-14 | CLIP with Quality Captions: A Strong Pretraining for Vision Tasks | Pavan Kumar Anasosalu Vasu et.al. | 2405.08911 | null |
2024-05-14 | Open-Vocabulary Object Detection via Neighboring Region Attention Alignment | Sunyuan Qiang et.al. | 2405.08593 | null |
2024-05-14 | Semantic Contextualization of Face Forgery: A New Definition, Dataset, and Detection Method | Mian Zou et.al. | 2405.08487 | null |
2024-05-14 | RDPN6D: Residual-based Dense Point-wise Network for 6Dof Object Pose Estimation Based on RGB-D Images | Zong-Wei Hong et.al. | 2405.08483 | link |
2024-05-14 | Multimodal Collaboration Networks for Geospatial Vehicle Detection in Dense, Occluded, and Large-Scale Events | Xin Wu et.al. | 2405.08251 | link |
2024-05-13 | RAID: A Shared Benchmark for Robust Evaluation of Machine-Generated Text Detectors | Liam Dugan et.al. | 2405.07940 | null |
2024-05-13 | oTTC: Object Time-to-Contact for Motion Estimation in Autonomous Driving | Abdul Hannan Khan et.al. | 2405.07698 | null |
2024-05-13 | MonoMAE: Enhancing Monocular 3D Detection through Depth-Aware Masked Autoencoders | Xueying Jiang et.al. | 2405.07696 | null |
2024-05-13 | Quality-aware Selective Fusion Network for V-D-T Salient Object Detection | Liuxin Bao et.al. | 2405.07655 | link |
2024-05-13 | Fast Training Data Acquisition for Object Detection and Segmentation using Black Screen Luminance Keying | Thomas Pöllabauer et.al. | 2405.07653 | null |
2024-05-13 | Integrity Monitoring of 3D Object Detection in Automated Driving Systems using Raw Activation Patterns and Spatial Filtering | Hakan Yekta Yatbaz et.al. | 2405.07600 | null |
2024-05-13 | Environmental Matching Attack Against Unmanned Aerial Vehicles Object Detection | Dehong Kong et.al. | 2405.07595 | null |
2024-05-13 | Text Grouping Adapter: Adapting Pre-trained Text Detector for Layout Analysis | Tianci Bi et.al. | 2405.07481 | null |
2024-05-13 | Enhancing 3D Object Detection by Using Neural Network with Self-adaptive Thresholding | Houze Liu et.al. | 2405.07479 | null |
2024-05-12 | MAML MOT: Multiple Object Tracking based on Meta-Learning | Jiayi Chen et.al. | 2405.07272 | null |
2024-05-10 | How to Augment for Atmospheric Turbulence Effects on Thermal Adapted Object Detection Models? | Engin Uzun et.al. | 2405.06383 | null |
2024-05-10 | Precise Apple Detection and Localization in Orchards using YOLOv5 for Robotic Harvesting Systems | Jiang Ziyue et.al. | 2405.06260 | null |
2024-05-09 | CSA-Net: Channel-wise Spatially Autocorrelated Attention Networks | Nick et.al. | 2405.05755 | null |
2024-05-09 | Depth Awakens: A Depth-perceptual Attention Fusion Network for RGB-D Camouflaged Object Detection | Xinran Liua et.al. | 2405.05614 | null |
2024-05-09 | The object detection model uses combined extraction with KNN and RF classification | Florentina Tatrin Kurniati et.al. | 2405.05551 | null |
2024-05-08 | Reviewing Intelligent Cinematography: AI research for camera-based video production | Adrian Azzarelli et.al. | 2405.05039 | null |
2024-05-07 | A Novel Wide-Area Multiobject Detection System with High-Probability Region Searching | Xianlei Long et.al. | 2405.04589 | null |
2024-05-07 | DriveWorld: 4D Pre-trained Scene Understanding via World Models for Autonomous Driving | Chen Min et.al. | 2405.04390 | null |
2024-05-07 | A New Dataset and Comparative Study for Aphid Cluster Detection and Segmentation in Sorghum Fields | Raiyan Rahman et.al. | 2405.04305 | null |
2024-05-07 | ViewFormer: Exploring Spatiotemporal Modeling for Multi-View 3D Occupancy Perception via View-Guided Transformers | Jinke Li et.al. | 2405.04299 | null |
2024-05-07 | Who Wrote This? The Key to Zero-Shot LLM-Generated Text Detection Is GECScore | Junchao Wu et.al. | 2405.04286 | null |
2024-05-07 | Deep Event-based Object Detection in Autonomous Driving: A Survey | Bingquan Zhou et.al. | 2405.03995 | null |
2024-05-06 | BadFusion: 2D-Oriented Backdoor Attacks against 3D Object Detection | Saket S. Chaturvedi et.al. | 2405.03884 | null |
2024-05-06 | RepVGG-GELAN: Enhanced GELAN with VGG-STYLE ConvNets for Brain Tumour Detection | Thennarasi Balakrishnan et.al. | 2405.03541 | link |
2024-05-06 | Low-light Object Detection | Pengpeng Li et.al. | 2405.03519 | null |
2024-05-06 | Salient Object Detection From Arbitrary Modalities | Nianchang Huang et.al. | 2405.03352 | null |
2024-05-06 | Modality Prompts for Arbitrary Modality Salient Object Detection | Nianchang Huang et.al. | 2405.03351 | null |
2024-05-06 | Vietnamese AI Generated Text Detection | Quang-Dan Tran et.al. | 2405.03206 | null |
2024-05-06 | PTQ4SAM: Post-Training Quantization for Segment Anything | Chengtao Lv et.al. | 2405.03144 | link |
2024-05-05 | Performance Evaluation of Real-Time Object Detection for Electric Scooters | Dong Chen et.al. | 2405.03039 | link |
2024-05-05 | SalFAU-Net: Saliency Fusion Attention U-Net for Salient Object Detection | Kassaw Abraham Mulat et.al. | 2405.02906 | null |
2024-05-07 | Adaptive Guidance Learning for Camouflaged Object Detection | Zhennan Chen et.al. | 2405.02824 | null |
2024-05-05 | PVTransformer: Point-to-Voxel Transformer for Scalable 3D Object Detection | Zhaoqi Leng et.al. | 2405.02811 | null |
2024-05-02 | Segmentation-Free Outcome Prediction in Head and Neck Cancer: Deep Learning-based Feature Extraction from Multi-Angle Maximum Intensity Projections (MA-MIPs) of PET Images | Amirhosein Toosi et.al. | 2405.01756 | null |
2024-05-02 | PointCompress3D -- A Point Cloud Compression Framework for Roadside LiDARs in Intelligent Transportation Systems | Walter Zimmer et.al. | 2405.01750 | null |
2024-05-02 | Development of Skip Connection in Deep Neural Networks for Computer Vision and Medical Image Analysis: A Survey | Guoping Xu et.al. | 2405.01725 | link |
2024-05-02 | SOAR: Advancements in Small Body Object Detection for Aerial Imagery Using State Space Models and Programmable Gradients | Tushar Verma et.al. | 2405.01699 | null |
2024-05-02 | Imagine the Unseen: Occluded Pedestrian Detection via Adversarial Feature Completion | Shanshan Zhang et.al. | 2405.01311 | null |
2024-05-02 | Overcoming LLM Challenges using RAG-Driven Precision in Coffee Leaf Disease Remediation | Dr. Selva Kumar S et.al. | 2405.01310 | null |
2024-05-02 | Towards Consistent Object Detection via LiDAR-Camera Synergy | Kai Luo et.al. | 2405.01258 | link |
2024-05-02 | Federated Learning with Heterogeneous Data Handling for Robust Vehicular Object Detection | Ahmad Khalil et.al. | 2405.01108 | null |
2024-05-01 | Grains of Saliency: Optimizing Saliency-based Training of Biometric Attack Detection Models | Colton R. Crum et.al. | 2405.00650 | null |
2024-05-01 | Object detection under the linear subspace model with application to cryo-EM images | Amitay Eldar et.al. | 2405.00364 | null |
2024-04-30 | Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation | Yunhao Ge et.al. | 2404.19752 | null |
2024-04-30 | Quantifying Nematodes through Images: Datasets, Models, and Baselines of Deep Learning | Zhipeng Yuan et.al. | 2404.19748 | null |
2024-04-30 | Masked Multi-Query Slot Attention for Unsupervised Object Discovery | Rishav Pramanik et.al. | 2404.19654 | link |
2024-04-30 | Physical Backdoor: Towards Temperature-based Backdoor Attacks in the Physical World | Wen Yin et.al. | 2404.19417 | null |
2024-04-30 | UniFS: Universal Few-shot Instance Perception with Point Representations | Sheng Jin et.al. | 2404.19401 | null |
2024-04-30 | Pseudo Label Refinery for Unsupervised Domain Adaptation on Cross-dataset 3D Object Detection | Zhanwei Zhang et.al. | 2404.19384 | null |
2024-04-30 | Robust Pedestrian Detection via Constructing Versatile Pedestrian Knowledge Bank | Sungjune Park et.al. | 2404.19299 | null |
2024-04-29 | MiPa: Mixed Patch Infrared-Visible Modality Agnostic Object Detection | Heitor R. Medeiros et.al. | 2404.18849 | null |
2024-04-29 | Leveraging PointNet and PointNet++ for Lyft Point Cloud Classification Challenge | Rajat K. Doshi et.al. | 2404.18665 | null |
2024-04-29 | CoSense3D: an Agent-based Efficient Learning Framework for Collective Perception | Yunshuang Yuan et.al. | 2404.18617 | null |
2024-04-29 | Assessing Quality Metrics for Neural Reality Gap Input Mitigation in Autonomous Driving Testing | Stefano Carlo Lambertenghi et.al. | 2404.18577 | null |
2024-04-29 | Efficient Meta-Learning Enabled Lightweight Multiscale Few-Shot Object Detection in Remote Sensing Images | Wenbin Guan et.al. | 2404.18426 | null |
2024-04-29 | Multi-modal Perception Dataset of In-water Objects for Autonomous Surface Vehicles | Mingi Jeong et.al. | 2404.18411 | null |
2024-04-28 | FAD-SAR: A Novel Fishing Activity Detection System via Synthetic Aperture Radar Images Based on Deep Learning Method | Yanbing Bai et.al. | 2404.18245 | null |
2024-04-28 | RadSimReal: Bridging the Gap Between Synthetic and Real Data in Radar Object Detection With Simulation | Oded Bialer et.al. | 2404.18150 | null |
2024-04-27 | Reliable Student: Addressing Noise in Semi-Supervised 3D Object Detection | Farzad Nozarian et.al. | 2404.17910 | link |
2024-04-27 | A Hybrid Approach for Document Layout Analysis in Document images | Tahira Shehzadi et.al. | 2404.17888 | null |
2024-04-26 | Inhomogeneous illuminated image enhancement under extremely low visibility condition | Libang Chen et.al. | 2404.17503 | null |
2024-04-26 | Cost-Sensitive Uncertainty-Based Failure Recognition for Object Detection | Moussa Kassem Sbeyti et.al. | 2404.17427 | null |
2024-04-26 | Enhancing mmWave Radar Point Cloud via Visual-inertial Supervision | Cong Fan et.al. | 2404.17229 | null |
2024-04-26 | MorphText: Deep Morphology Regularized Arbitrary-shape Scene Text Detection | Chengpei Xu et.al. | 2404.17151 | null |
2024-04-25 | Generating Minimalist Adversarial Perturbations to Test Object-Detection Models: An Adaptive Multi-Metric Evolutionary Search Approach | Cristopher McIntyre-Garcia et.al. | 2404.17020 | link |
2024-04-25 | Constellation Dataset: Benchmarking High-Altitude Object Detection for an Urban Intersection | Mehmet Kerem Turkcan et.al. | 2404.16944 | link |
2024-04-25 | Self-Balanced R-CNN for Instance Segmentation | Leonardo Rossi et.al. | 2404.16633 | link |
2024-04-25 | Cross-Domain Spatial Matching for Camera and Radar Sensor Data Fusion in Autonomous Vehicle Perception System | Daniel Dworak et.al. | 2404.16548 | null |
2024-04-25 | Commonsense Prototype for Outdoor Unsupervised 3D Object Detection | Hai Wu et.al. | 2404.16493 | link |
2024-04-25 | IMWA: Iterative Model Weight Averaging Benefits Class-Imbalanced Learning Tasks | Zitong Huang et.al. | 2404.16331 | null |
2024-04-25 | CFMW: Cross-modality Fusion Mamba for Multispectral Object Detection under Adverse Weather Conditions | Haoyuan Li et.al. | 2404.16302 | link |
2024-04-24 | AutoGluon-Multimodal (AutoMM): Supercharging Multimodal AutoML with Foundation Models | Zhiqiang Tang et.al. | 2404.16233 | null |
2024-04-24 | Observational parameters of Blue Large-Amplitude Pulsators | P. Pietrukowicz et.al. | 2404.16089 | null |
2024-04-24 | A Survey on Visual Mamba | Hanwei Zhang et.al. | 2404.15956 | null |
2024-04-24 | Steal Now and Attack Later: Evaluating Robustness of Object Detection against Black-box Adversarial Attacks | Erh-Chung Chen et.al. | 2404.15881 | null |
2024-04-24 | Revisiting Out-of-Distribution Detection in LiDAR-based 3D Object Detection | Michael Kösel et.al. | 2404.15879 | link |
2024-04-23 | CFPFormer: Feature-pyramid like Transformer Decoder for Segmentation and Detection | Hongyi Cai et.al. | 2404.15451 | null |
2024-04-23 | ID-Aligner: Enhancing Identity-Preserving Text-to-Image Generation with Reward Feedback Learning | Weifeng Chen et.al. | 2404.15449 | null |
2024-04-23 | Source-free Domain Adaptation for Video Object Detection Under Adverse Image Conditions | Xingguang Zhang et.al. | 2404.15252 | null |
2024-04-23 | Efficient Transformer Encoders for Mask2Former-style models | Manyi Yao et.al. | 2404.15244 | null |
2024-04-23 | Gallbladder Cancer Detection in Ultrasound Images based on YOLO and Faster R-CNN | Sara Dadjouy et.al. | 2404.15129 | null |
2024-04-23 | External Prompt Features Enhanced Parameter-efficient Fine-tuning for Salient Object Detection | Wen Liang et.al. | 2404.15008 | null |
2024-04-23 | ContextualFusion: Context-Based Multi-Sensor Fusion for 3D Object Detection in Adverse Operating Conditions | Shounak Sural et.al. | 2404.14780 | null |
2024-04-23 | Unified Unsupervised Salient Object Detection via Knowledge Transfer | Yao Yuan et.al. | 2404.14759 | link |
2024-04-22 | SemEval-2024 Task 8: Multidomain, Multimodel and Multilingual Machine-Generated Text Detection | Yuxia Wang et.al. | 2404.14183 | null |
2024-04-22 | Text in the Dark: Extremely Low-Light Text Image Enhancement | Che-Tsung Lin et.al. | 2404.14135 | null |
2024-04-22 | CKD: Contrastive Knowledge Distillation from A Sample-wise Perspective | Wencheng Zhu et.al. | 2404.14109 | null |
2024-04-22 | Benchmarking Multi-Modal LLMs for Testing Visual Deep Learning Systems Through the Lens of Image Mutation | Liwen Wang et.al. | 2404.13945 | null |
2024-04-22 | NeRF-DetS: Enhancing Multi-View 3D Object Detection with Sampling-adaptive Network of Continuous NeRF-based Representation | Chi Huang et.al. | 2404.13921 | null |
2024-04-22 | TeamTrack: A Dataset for Multi-Sport Multi-Object Tracking in Full-pitch Videos | Atom Scott et.al. | 2404.13868 | null |
2024-04-22 | Toward Robust LiDAR based 3D Object Detection via Density-Aware Adaptive Thresholding | Eunho Lee et.al. | 2404.13852 | null |
2024-04-21 | A Nasal Cytology Dataset for Object Detection and Deep Learning | Mauro Camporeale et.al. | 2404.13745 | null |
2024-04-23 | Clio: Real-time Task-Driven Open-Set 3D Scene Graphs | Dominic Maggio et.al. | 2404.13696 | null |
2024-04-20 | FisheyeDetNet: Object Detection on Fisheye Surround View Camera Systems for Automated Driving | Ganesh Sistu et.al. | 2404.13443 | null |
2024-04-19 | A comparison between single-stage and two-stage 3D tracking algorithms for greenhouse robotics | David Rapado-Rincon et.al. | 2404.12963 | null |
2024-04-19 | Language-Driven Active Learning for Diverse Open-Set 3D Object Detection | Ross Greer et.al. | 2404.12856 | null |
2024-04-19 | ECOR: Explainable CLIP for Object Recognition | Ali Rasekh et.al. | 2404.12839 | null |
2024-04-19 | A Point-Based Approach to Efficient LiDAR Multi-Task Perception | Christopher Lang et.al. | 2404.12798 | null |
2024-04-19 | ELEV-VISION-SAM: Integrated Vision Language and Foundation Model for Automated Estimation of Building Lowest Floor Elevation | Yu-Hsuan Ho et.al. | 2404.12606 | null |
2024-04-18 | The devil is in the object boundary: towards annotation-free instance segmentation using Foundation Models | Cheng Shi et.al. | 2404.11957 | link |
2024-04-18 | Simultaneous Detection and Interaction Reasoning for Object-Centric Action Recognition | Xunsong Li et.al. | 2404.11903 | null |
2024-04-17 | TempBEV: Improving Learned BEV Encoders with Combined Image and BEV Space Temporal Aggregation | Thomas Monninger et.al. | 2404.11803 | null |
2024-04-17 | Multimodal 3D Object Detection on Unseen Domains | Deepti Hegde et.al. | 2404.11764 | null |
2024-04-17 | Equivariant Spatio-Temporal Self-Supervision for LiDAR Object Detection | Deepti Hegde et.al. | 2404.11737 | null |
2024-04-17 | Multi-resolution Rescored ByteTrack for Video Object Detection on Ultra-low-power Embedded Systems | Luca Bompani et.al. | 2404.11488 | link |
2024-04-17 | EcoMLS: A Self-Adaptation Approach for Architecting Green ML-Enabled Systems | Meghana Tedla et.al. | 2404.11411 | null |
2024-04-17 | Detector Collapse: Backdooring Object Detection to Catastrophic Overload or Blindness | Hangtao Zhang et.al. | 2404.11357 | null |
2024-04-17 | Simple In-place Data Augmentation for Surveillance Object Detection | Munkh-Erdene Otgonbold et.al. | 2404.11226 | null |
2024-04-17 | Feature Corrective Transfer Learning: End-to-End Solutions to Object Detection in Non-Ideal Visual Conditions | Chuheng Wei et.al. | 2404.11214 | null |
2024-04-17 | GhostNetV3: Exploring the Training Strategies for Compact Models | Zhenhua Liu et.al. | 2404.11202 | null |
2024-04-17 | How to deal with glare for improved perception of Autonomous Vehicles | Muhammad Z. Alam et.al. | 2404.10992 | null |
2024-04-17 | Leveraging 3D LiDAR Sensors to Enable Enhanced Urban Safety and Public Health: Pedestrian Monitoring and Abnormal Activity Detection | Nawfal Guefrachi et.al. | 2404.10978 | null |
2024-04-16 | OSR-ViT: A Simple and Modular Framework for Open-Set Object Detection and Discovery | Matthew Inkawhich et.al. | 2404.10865 | null |
2024-04-16 | Learning Feature Inversion for Multi-class Anomaly Detection under General-purpose COCO-AD Benchmark | Jiangning Zhang et.al. | 2404.10760 | null |
2024-04-16 | Watch Your Step: Optimal Retrieval for Continual Learning at Scale | Truman Hickok et.al. | 2404.10758 | null |
2024-04-16 | Efficient optimal dispersed Haar-like filters for face detection | Zeinab Sedaghatjoo et.al. | 2404.10476 | null |
2024-04-16 | Camera clustering for scalable stream-based active distillation | Dani Manjah et.al. | 2404.10411 | null |
2024-04-15 | Low-Light Image Enhancement Framework for Improved Object Detection in Fisheye Lens Datasets | Dai Quoc Tran et.al. | 2404.10078 | link |
2024-04-15 | Explainable Light-Weight Deep Learning Pipeline for Improved Drought Stres | Aswini Kumar Patra et.al. | 2404.10073 | null |
2024-04-15 | VFMM3D: Releasing the Potential of Image by Vision Foundation Model for Monocular 3D Object Detection | Bonan Ding et.al. | 2404.09431 | null |
2024-04-14 | TEXT2TASTE: A Versatile Egocentric Vision System for Intelligent Reading Assistance Using Large Language Model | Wiktor Mucha et.al. | 2404.09254 | null |
2024-04-14 | DetCLIPv3: Towards Versatile Generative Open-vocabulary Object Detection | Lewei Yao et.al. | 2404.09216 | null |
2024-04-14 | Coreset Selection for Object Detection | Hojun Lee et.al. | 2404.09161 | null |
2024-04-14 | Fusion-Mamba for Cross-modality Object Detection | Wenhao Dong et.al. | 2404.09146 | null |
2024-04-13 | The Snake's Beating Heart? A Millisecond Pulsar Binary in the Galactic Center Radio Filament G359.1 |
Marcus E. Lower et.al. | 2404.09098 | null |
2024-04-13 | BG-YOLO: A Bidirectional-Guided Method for Underwater Object Detection | Jian Zhang et.al. | 2404.08979 | null |
2024-04-13 | Shifting Spotlight for Co-supervision: A Simple yet Efficient Single-branch Network to See Through Camouflage | Yang Hu et.al. | 2404.08936 | null |
2024-04-12 | Training-free Boost for Open-Vocabulary Object Detection with Confidence Aggregation | Yanhao Zheng et.al. | 2404.08603 | link |
2024-04-12 | FashionFail: Addressing Failure Cases in Fashion Object Detection and Segmentation | Riza Velioglu et.al. | 2404.08582 | null |
2024-04-12 | Analyzing Decades-Long Environmental Changes in Namibia Using Archival Aerial Photography and Deep Learning | Girmaw Abebe Tadesse et.al. | 2404.08544 | null |
2024-04-12 | MambaDFuse: A Mamba-based Dual-phase Model for Multi-modality Image Fusion | Zhe Li et.al. | 2404.08406 | null |
2024-04-12 | Overcoming Scene Context Constraints for Object Detection in wild using Defilters | Vamshi Krishna Kancharla et.al. | 2404.08293 | null |
2024-04-11 | ConsistencyDet: Robust Object Detector with Denoising Paradigm of Consistency Model | Lifan Jiang et.al. | 2404.07773 | null |
2024-04-11 | Exploiting Object-based and Segmentation-based Semantic Features for Deep Learning-based Indoor Scene Classification | Ricardo Pereira et.al. | 2404.07739 | null |
2024-04-11 | Run-time Monitoring of 3D Object Detection in Automated Driving Systems Using Early Layer Neural Activation Patterns | Hakan Yekta Yatbaz et.al. | 2404.07685 | null |
2024-04-11 | Finding Dino: A plug-and-play framework for unsupervised detection of out-of-distribution objects using prototypes | Poulami Sinhamahapatra et.al. | 2404.07664 | null |
2024-04-11 | Separated Attention: An Improved Cycle GAN Based Under Water Image Enhancement Method | Tashmoy Ghosh et.al. | 2404.07649 | null |
2024-04-11 | GLID: Pre-training a Generalist Encoder-Decoder Vision Model | Jihao Liu et.al. | 2404.07603 | null |
2024-04-11 | SFSORT: Scene Features-based Simple Online Real-Time Tracker | M. M. Morsali et.al. | 2404.07553 | link |
2024-04-11 | The Sydney Radio Star Catalogue: properties of radio stars at megahertz to gigahertz frequencies | Laura N. Driessen et.al. | 2404.07418 | null |
2024-04-11 | Simplifying Two-Stage Detectors for On-Device Inference in Remote Sensing | Jaemin Kang et.al. | 2404.07405 | null |
2024-04-11 | A fine-tuning workflow for automatic first-break picking with deep learning | Amir Mardan et.al. | 2404.07400 | link |
2024-04-10 | Identification of Fine-grained Systematic Errors via Controlled Scene Generation | Valentyn Boreiko et.al. | 2404.07045 | null |
2024-04-10 | Accurate Tennis Court Line Detection on Amateur Recorded Matches | Sameer Agrawal et.al. | 2404.06977 | null |
2024-04-10 | SARA: Smart AI Reading Assistant for Reading Comprehension | Enkeleda Thaqi et.al. | 2404.06906 | null |
2024-04-10 | Sparse Points to Dense Clouds: Enhancing 3D Detection with Limited LiDAR Data | Aakash Kumar et.al. | 2404.06715 | null |
2024-04-10 | Scaling Multi-Camera 3D Object Detection through Weak-to-Strong Eliciting | Hao Lu et.al. | 2404.06700 | link |
2024-04-09 | Learning Embeddings with Centroid Triplet Loss for Object Identification in Robotic Grasping | Anas Gouda et.al. | 2404.06277 | null |
2024-04-09 | Label-Efficient 3D Object Detection For Road-Side Units | Minh-Quan Dao et.al. | 2404.06256 | null |
2024-04-09 | Automatic Defect Detection in Sewer Network Using Deep Learning Based Object Detector | Bach Ha et.al. | 2404.06219 | null |
2024-04-09 | YOLC: You Only Look Clusters for Tiny Object Detection in Aerial Images | Chenguang Liu et.al. | 2404.06180 | null |
2024-04-09 | Enhanced Radar Perception via Multi-Task Learning: Towards Refined Data for Sensor Fusion Applications | Huawei Sun et.al. | 2404.06165 | null |
2024-04-09 | Improving Facial Landmark Detection Accuracy and Efficiency with Knowledge Distillation | Zong-Wei Hong et.al. | 2404.06029 | null |
2024-04-08 | Retrieval-Augmented Open-Vocabulary Object Detection | Jooyeon Kim et.al. | 2404.05687 | link |
2024-04-08 | 3D-COCO: extension of MS-COCO dataset for image detection and 3D reconstruction modules | Maxence Bideaux et.al. | 2404.05641 | null |
2024-04-08 | PetKaz at SemEval-2024 Task 8: Can Linguistics Capture the Specifics of LLM-generated Text? | Kseniia Petukhova et.al. | 2404.05483 | null |
2024-04-08 | Detecting Every Object from Events | Haitian Zhang et.al. | 2404.05285 | link |
2024-04-08 | MOSE: Boosting Vision-based Roadside 3D Object Detection with Scene Cues | Xiahan Chen et.al. | 2404.05280 | null |
2024-04-08 | Rendering-Enhanced Automatic Image-to-Point Cloud Registration for Roadside Scenes | Yu Sheng et.al. | 2404.05164 | null |
2024-04-08 | Better Monocular 3D Detectors with LiDAR from the Past | Yurong You et.al. | 2404.05139 | link |
2024-04-07 | AirShot: Efficient Few-Shot Detection for Autonomous Exploration | Zihan Wang et.al. | 2404.05069 | link |
2024-04-07 | PlateSegFL: A Privacy-Preserving License Plate Detection Using Federated Segmentation Learning | Md. Shahriar Rahman Anuvab et.al. | 2404.05049 | null |
2024-04-07 | PathFinder: Attention-Driven Dynamic Non-Line-of-Sight Tracking with a Mobile Robot | Shenbagaraj Kannapiran et.al. | 2404.05024 | null |
2024-04-05 | SCAResNet: A ResNet Variant Optimized for Tiny Object Detection in Transmission and Distribution Towers | Weile Li et.al. | 2404.04179 | link |
2024-04-05 | Designing Robots to Help Women | Martin Cooney et.al. | 2404.04123 | null |
2024-04-04 | Is CLIP the main roadblock for fine-grained open-world perception? | Lorenzo Bianchi et.al. | 2404.03539 | link |
2024-04-04 | DQ-DETR: DETR with Dynamic Query for Tiny Object Detection | Yi-Xin Huang et.al. | 2404.03507 | null |
2024-04-05 | A Methodology to Study the Impact of Spiking Neural Network Parameters considering Event-Based Automotive Data | Iqra Bano et.al. | 2404.03493 | null |
2024-04-04 | MonoCD: Monocular 3D Object Detection with Complementary Depths | Longfei Yan et.al. | 2404.03181 | link |
2024-04-03 | DPFT: Dual Perspective Fusion Transformer for Camera-Radar-based Object Detection | Felix Fent et.al. | 2404.03015 | null |
2024-04-03 | ALOHa: A New Measure for Hallucination in Captioning Models | Suzanne Petryk et.al. | 2404.02904 | null |
2024-04-03 | FlightScope: A Deep Comprehensive Assessment of Aircraft Detection Algorithms in Satellite Imagery | Safouane El Ghazouali et.al. | 2404.02877 | link |
2024-04-03 | HENet: Hybrid Encoding for End-to-end Multi-task 3D Perception from Multi-view Cameras | Zhongyu Xia et.al. | 2404.02517 | link |
2024-04-04 | TE-TAD: Towards Full End-to-End Temporal Action Detection via Time-Aligned Coordinate Expression | Ho-Joong Kim et.al. | 2404.02405 | null |
2024-04-04 | EGTR: Extracting Graph from Transformer for Scene Graph Generation | Jinbae Im et.al. | 2404.02072 | link |
2024-04-03 | Cooperative Students: Navigating Unsupervised Domain Adaptation in Nighttime Object Detection | Jicheng Yuan et.al. | 2404.01988 | link |
2024-04-02 | Towards Enhanced Analysis of Lung Cancer Lesions in EBUS-TBNA -- A Semi-Supervised Video Object Detection Method | Jyun-An Lin et.al. | 2404.01929 | null |
2024-04-02 | Humanizing Machine-Generated Content: Evading AI-Text Detection through Adversarial Attack | Ying Zhou et.al. | 2404.01907 | link |
2024-04-02 | Scene Adaptive Sparse Transformer for Event-based Object Detection | Yansong Peng et.al. | 2404.01882 | link |
2024-04-02 | Semi-Supervised Domain Adaptation for Wildfire Detection | JooYoung Jang et.al. | 2404.01842 | null |
2024-04-02 | Sparse Semi-DETR: Sparse Learnable Queries for Semi-Supervised Object Detection | Tahira Shehzadi et.al. | 2404.01819 | null |
2024-04-02 | Analyzing the Single Event Upset Vulnerability of Binarized Neural Networks on SRAM FPGAs | Ioanna Souvatzoglou et.al. | 2404.01757 | null |
2024-04-02 | Disentangled Pre-training for Human-Object Interaction Detection | Zhuolong Li et.al. | 2404.01725 | null |
2024-04-02 | Task Integration Distillation for Object Detectors | Hai Su et.al. | 2404.01699 | null |
2024-03-29 | PLoc: A New Evaluation Criterion Based on Physical Location for Autonomous Driving Datasets | Ruining Yang et.al. | 2403.19893 | null |
2024-03-29 | MambaMixer: Efficient Selective State Space Models with Dual Token and Channel Selection | Ali Behrouz et.al. | 2403.19888 | null |
2024-03-28 | DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs | Donghyun Kim et.al. | 2403.19588 | link |
2024-03-28 | OV-Uni3DETR: Towards Unified Open-Vocabulary 3D Object Detection via Cycle-Modality Propagation | Zhenyu Wang et.al. | 2403.19580 | null |
2024-03-28 | AIpom at SemEval-2024 Task 8: Detecting AI-produced Outputs in M4 | Alexander Shirnin et.al. | 2403.19354 | null |
2024-03-28 | Sparse Generation: Making Pseudo Labels Sparse for weakly supervision with points | Tian Ma et.al. | 2403.19306 | null |
2024-03-28 | CAT: Exploiting Inter-Class Dynamics for Domain Adaptive Object Detection | Mikhail Kennerley et.al. | 2403.19278 | link |
2024-03-28 | Algorithmic Ways of Seeing: Using Object Detection to Facilitate Art Exploration | Louie Søs Meyer et.al. | 2403.19174 | null |
2024-03-28 | CRKD: Enhanced Camera-Radar Object Detection with Cross-modality Knowledge Distillation | Lingjun Zhao et.al. | 2403.19104 | null |
2024-03-28 | A Real-Time Framework for Domain-Adaptive Underwater Object Detection with Image Enhancement | Junjie Wen et.al. | 2403.19079 | null |
2024-03-27 | Illicit object detection in X-ray images using Vision Transformers | Jorgen Cani et.al. | 2403.19043 | null |
2024-03-27 | Benchmarking Object Detectors with COCO: A New Path Forward | Shweta Singh et.al. | 2403.18819 | link |
2024-03-27 | PhysicsAssistant: An LLM-Powered Interactive Learning Robot for Physics Lab Investigations | Ehsan Latif et.al. | 2403.18721 | null |
2024-03-27 | CosalPure: Learning Concept from Group Images for Robust Co-Saliency Detection | Jiayi Zhu et.al. | 2403.18554 | null |
2024-03-27 | BAM: Box Abstraction Monitors for Real-time OoD Detection in Object Detection | Changshun Wu et.al. | 2403.18373 | null |
2024-03-27 | Ship in Sight: Diffusion Models for Ship-Image Super Resolution | Luigi Sigillo et.al. | 2403.18370 | link |
2024-03-27 | DODA: Diffusion for Object-detection Domain Adaptation in Agriculture | Shuai Xiang et.al. | 2403.18334 | null |
2024-03-27 | Tracking-Assisted Object Detection with Event Cameras | Ting-Kang Yen et.al. | 2403.18330 | null |
2024-03-27 | SGDM: Static-Guided Dynamic Module Make Stronger Visual Models | Wenjie Xing et.al. | 2403.18282 | null |
2024-03-27 | Road Obstacle Detection based on Unknown Objectness Scores | Chihiro Noguchi et.al. | 2403.18207 | null |
2024-03-26 | State of the art applications of deep learning within tracking and detecting marine debris: A survey | Zoe Moorton et.al. | 2403.18067 | null |
2024-03-26 | The Solution for the CVPR 2023 1st foundation model challenge-Track2 | Haonan Xu et.al. | 2403.17702 | null |
2024-03-26 | PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition | Chenhongyi Yang et.al. | 2403.17695 | link |
2024-03-26 | UADA3D: Unsupervised Adversarial Domain Adaptation for 3D Object Detection with Sparse LiDAR and Large Domain Gaps | Maciej K Wozniak et.al. | 2403.17633 | null |
2024-03-26 | SSF3D: Strict Semi-Supervised 3D Object Detection with Switching Filter | Songbur Wong et.al. | 2403.17390 | null |
2024-03-26 | Decoupled Pseudo-labeling for Semi-Supervised Monocular 3D Object Detection | Jiacheng Zhang et.al. | 2403.17387 | null |
2024-03-26 | AIDE: An Automatic Data Engine for Object Detection in Autonomous Driving | Mingfu Liang et.al. | 2403.17373 | null |
2024-03-26 | Staircase Localization for Autonomous Exploration in Urban Environments | Jinrae Kim et.al. | 2403.17330 | null |
2024-03-25 | Co-Occurring of Object Detection and Identification towards unlabeled object discovery | Binay Kumar Singh et.al. | 2403.17223 | null |
2024-03-25 | Optimizing LiDAR Placements for Robust Driving Perception in Adverse Conditions | Ye Li et.al. | 2403.17009 | link |
2024-03-25 | Isolated Diffusion: Optimizing Multi-Concept Text-to-Image Generation Training-Freely with Isolated Diffusion Guidance | Jingyuan Zhu et.al. | 2403.16954 | null |
2024-03-25 | TrustAI at SemEval-2024 Task 8: A Comprehensive Analysis of Multi-domain Machine Generated Text Detection Techniques | Ashok Urlana et.al. | 2403.16592 | null |
2024-03-25 | RCBEVDet: Radar-camera Fusion in Bird's Eye View for 3D Object Detection | Zhiwei Lin et.al. | 2403.16440 | link |
2024-03-25 | ASDF: Assembly State Detection Utilizing Late Fusion by Integrating 6D Pose Estimation | Hannah Schieber et.al. | 2403.16400 | null |
2024-03-25 | Impact of Video Compression Artifacts on Fisheye Camera Visual Perception Tasks | Madhumitha Sakthi et.al. | 2403.16338 | null |
2024-03-24 | Cross-domain Multi-modal Few-shot Object Detection via Rich Text | Zeyu Shangguan et.al. | 2403.16188 | null |
2024-03-24 | Semantic Is Enough: Only Semantic Information For NeRF Reconstruction | Ruibo Wang et.al. | 2403.16043 | null |
2024-03-23 | Adversarial Defense Teacher for Cross-Domain Object Detection under Poor Visibility Conditions | Kaiwen Wang et.al. | 2403.15786 | null |
2024-03-23 | EAGLE: A Domain Generalization Framework for AI-generated Text Detection | Amrita Bhattacharjee et.al. | 2403.15690 | null |
2024-03-25 | Point-DETR3D: Leveraging Imagery Data with Spatial Point Prior for Weakly Semi-supervised 3D Object Detection | Hongzhi Gao et.al. | 2403.15317 | null |
2024-03-22 | CR3DT: Camera-RADAR Fusion for 3D Detection and Tracking | Nicolas Baumann et.al. | 2403.15313 | null |
2024-03-22 | IS-Fusion: Instance-Scene Collaborative Fusion for Multimodal 3D Object Detection | Junbo Yin et.al. | 2403.15241 | null |
2024-03-22 | MSCoTDet: Language-driven Multi-modal Fusion for Improved Multispectral Pedestrian Detection | Taeheon Kim et.al. | 2403.15209 | null |
2024-03-22 | SFOD: Spiking Fusion Object Detector | Yimeng Fan et.al. | 2403.15192 | link |
2024-03-22 | CRPlace: Camera-Radar Fusion with BEV Representation for Place Recognition | Shaowei Fu et.al. | 2403.15183 | null |
2024-03-22 | An In-Depth Analysis of Data Reduction Methods for Sustainable Deep Learning | Víctor Toscano-Durán et.al. | 2403.15150 | null |
2024-03-22 | Gradient-based Sampling for Class Imbalanced Semi-supervised Object Detection | Jiaming Li et.al. | 2403.15127 | link |
2024-03-22 | VRSO: Visual-Centric Reconstruction for Static Object Annotation | Chenyao Yu et.al. | 2403.15026 | null |
2024-03-22 | Vehicle Detection Performance in Nordic Region | Hamam Mokayed et.al. | 2403.15017 | null |
2024-03-21 | T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy | Qing Jiang et.al. | 2403.14610 | link |
2024-03-21 | UAV-Assisted Maritime Search and Rescue: A Holistic Approach | Martin Messmer et.al. | 2403.14281 | null |
2024-03-21 | Scene-Graph ViT: End-to-End Open-Vocabulary Visual Relationship Detection | Tim Salzmann et.al. | 2403.14270 | null |
2024-03-21 | 3D Object Detection from Point Cloud via Voting Step Diffusion | Haoran Hou et.al. | 2403.14133 | null |
2024-03-20 | EcoSense: Energy-Efficient Intelligent Sensing for In-Shore Ship Detection through Edge-Cloud Collaboration | Wenjun Huang et.al. | 2403.14027 | null |
2024-03-20 | RAR: Retrieving And Ranking Augmented MLLMs for Visual Recognition | Ziyu Liu et.al. | 2403.13805 | link |
2024-03-20 | Bounding Box Stability against Feature Dropout Reflects Detector Generalization across Environments | Yang Yang et.al. | 2403.13803 | link |
2024-03-20 | Fostc3net:A Lightweight YOLOv5 Based On the Network Structure Optimization | Danqing Ma et.al. | 2403.13703 | null |
2024-03-20 | Find n' Propagate: Open-Vocabulary 3D Object Detection in Urban Environments | Djamahl Etchegaray et.al. | 2403.13556 | null |
2024-03-20 | MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining | Di Wang et.al. | 2403.13430 | link |
2024-03-20 | Few-shot Oriented Object Detection with Memorable Contrastive Learning in Remote Sensing Images | Jiawei Zhou et.al. | 2403.13375 | null |
2024-03-20 | Adaptive Ensembles of Fine-Tuned Transformers for LLM-Generated Text Detection | Zhixin Lai et.al. | 2403.13335 | null |
2024-03-20 | DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception | Yibo Wang et.al. | 2403.13304 | null |
2024-03-20 | Facilitating Pornographic Text Detection for Open-Domain Dialogue Systems via Knowledge Distillation of Large Language Models | Huachuan Qiu et.al. | 2403.13250 | null |
2024-03-19 | SceneScript: Reconstructing Scenes With An Autoregressive Structured Language Model | Armen Avetisyan et.al. | 2403.13064 | null |
2024-03-19 | Wildfire danger prediction optimization with transfer learning | Spiros Maggioros et.al. | 2403.12871 | link |
2024-03-19 | As Firm As Their Foundations: Can open-sourced foundation models be used to create adversarial examples for downstream tasks? | Anjun Hu et.al. | 2403.12693 | null |
2024-03-19 | EAS-SNN: End-to-End Adaptive Sampling and Representation for Event-based Detection with Recurrent Spiking Neural Networks | Ziming Wang et.al. | 2403.12574 | null |
2024-03-19 | DetToolChain: A New Prompting Paradigm to Unleash Detection Ability of MLLM | Yixuan Wu et.al. | 2403.12488 | null |
2024-03-19 | TransformMix: Learning Transformation and Mixing Strategies from Data | Tsz-Him Cheung et.al. | 2403.12429 | null |
2024-03-19 | VisionGPT: LLM-Assisted Real-Time Anomaly Detection for Safe Visual Navigation | Hao Wang et.al. | 2403.12415 | null |
2024-03-19 | Entity6K: A Large Open-Domain Evaluation Dataset for Real-World Entity Recognition | Jielin Qiu et.al. | 2403.12339 | null |
2024-03-18 | EffiPerception: an Efficient Framework for Various Perception Tasks | Xinhao Xiang et.al. | 2403.12317 | null |
2024-03-18 | Prototipo de un Contador Bidireccional Automático de Personas basado en sensores de visión 3D | Benjamín Ojeda-Magaña et.al. | 2403.12310 | null |
2024-03-18 | Align and Distill: Unifying and Improving Domain Adaptive Object Detection | Justin Kay et.al. | 2403.12029 | link |
2024-03-18 | TrajectoryNAS: A Neural Architecture Search for Trajectory Prediction | Ali Asghar Sharifi et.al. | 2403.11695 | null |
2024-03-18 | Just Add $100 More: Augmenting NeRF-based Pseudo-LiDAR Point Cloud for Resolving Class-imbalance Problem | Mincheol Chang et.al. | 2403.11573 | null |
2024-03-18 | R2SNet: Scalable Domain Adaptation for Object Detection in Cloud-Based Robots Ecosystems via Proposal Refinement | Michele Antonazzi et.al. | 2403.11567 | null |
2024-03-18 | Continual Forgetting for Pre-trained Vision Models | Hongbo Zhao et.al. | 2403.11530 | link |
2024-03-17 | V2X-DGW: Domain Generalization for Multi-agent Perception under Adverse Weather Conditions | Baolu Li et.al. | 2403.11371 | null |
2024-03-17 | Advanced Knowledge Extraction of Physical Design Drawings, Translation and conversion to CAD formats using Deep Learning | Jesher Joshua M et.al. | 2403.11291 | null |
2024-03-17 | ManipVQA: Injecting Robotic Affordance and Physically Grounded Information into Multi-Modal Large Language Models | Siyuan Huang et.al. | 2403.11289 | null |
2024-03-17 | CPA-Enhancer: Chain-of-Thought Prompted Adaptive Enhancer for Object Detection under Unknown Degradations | Yuwei Zhang et.al. | 2403.11220 | link |
2024-03-17 | GRA: Detecting Oriented Objects through Group-wise Rotating and Attention | Jiangshan Wang et.al. | 2403.11127 | null |
2024-03-17 | Self-supervised co-salient object detection via feature correspondence at multiple scales | Souradeep Chakraborty et.al. | 2403.11107 | link |
2024-03-14 | Open-Vocabulary Object Detection with Meta Prompt Representation and Instance Contrastive Optimization | Zhao Wang et.al. | 2403.09433 | null |
2024-03-14 | D3T: Distinctive Dual-Domain Teacher Zigzagging Across RGB-Thermal Gap for Domain-Adaptive Object Detection | Dinh Phat Do et.al. | 2403.09359 | link |
2024-03-14 | Griffon v2: Advancing Multimodal Perception with High-Resolution Scaling and Visual-Language Co-Referring | Yufei Zhan et.al. | 2403.09333 | link |
2024-03-14 | EfficientMFD: Towards More Efficient Multimodal Synchronous Fusion Detection | Jiaqing Zhang et.al. | 2403.09323 | link |
2024-03-14 | Knowledge Distillation in YOLOX-ViT for Side-Scan Sonar Object Detection | Martin Aubard et.al. | 2403.09313 | link |
2024-03-14 | MOTPose: Multi-object 6D Pose Estimation for Dynamic Video Sequences using Attention-based Temporal Fusion | Arul Selvam Periyasamy et.al. | 2403.09309 | null |
2024-03-14 | CLIP-EBC: CLIP Can Count Accurately through Enhanced Blockwise Classification | Yiming Ma et.al. | 2403.09281 | null |
2024-03-14 | D-YOLO a robust framework for object detection in adverse weather conditions | Zihan Chu et.al. | 2403.09233 | null |
2024-03-14 | Improving Distant 3D Object Detection Using 2D Box Supervision | Zetong Yang et.al. | 2403.09230 | null |
2024-03-14 | PoIFusion: Multi-Modal 3D Object Detection via Fusion at Points of Interest | Jiajun Deng et.al. | 2403.09212 | null |
2024-03-13 | VLOGGER: Multimodal Diffusion for Embodied Avatar Synthesis | Enric Corona et.al. | 2403.08764 | null |
2024-03-13 | MIM4D: Masked Modeling with Multi-View Video for Autonomous Driving Representation Learning | Jialv Zou et.al. | 2403.08760 | link |
2024-03-13 | Data Augmentation in Human-Centric Vision | Wentao Jiang et.al. | 2403.08650 | null |
2024-03-13 | PRAGO: Differentiable Multi-View Pose Optimization From Objectness Detections | Matteo Taiana et.al. | 2403.08586 | null |
2024-03-13 | A Multimodal Fusion Network For Student Emotion Recognition Based on Transformer and Tensor Product | Ao Xiang et.al. | 2403.08511 | null |
2024-03-13 | Improved YOLOv5 Based on Attention Mechanism and FasterNet for Foreign Object Detection on Railway and Airway tracks | Zongqing Qi et.al. | 2403.08499 | null |
2024-03-13 | IAMCV Multi-Scenario Vehicle Interaction Dataset | Novel Certad et.al. | 2403.08455 | null |
2024-03-13 | Advancing Security in AI Systems: A Novel Approach to Detecting Backdoors in Deep Neural Networks | Khondoker Murad Hossain et.al. | 2403.08208 | null |
2024-03-12 | TaskCLIP: Extend Large Vision-Language Model for Task Oriented Object Detection | Hanning Chen et.al. | 2403.08108 | null |
2024-03-12 | Aedes aegypti Egg Counting with Neural Networks for Object Detection | Micheli Nayara de Oliveira Vicente et.al. | 2403.08016 | null |
2024-03-12 | Mondrian: On-Device High-Performance Video Analytics with Compressive Packed Inference | Changmin Jeon et.al. | 2403.07598 | null |
2024-03-12 | PeLK: Parameter-efficient Large Kernel ConvNets with Peripheral Convolution | Honghao Chen et.al. | 2403.07589 | null |
2024-03-12 | A Survey of Vision Transformers in Autonomous Driving: Current Trends and Future Directions | Quoc-Vinh Lai-Dang et.al. | 2403.07542 | null |
2024-03-12 | JSTR: Joint Spatio-Temporal Reasoning for Event-based Moving Object Detection | Hanyu Zhou et.al. | 2403.07436 | null |
2024-03-12 | Eliminating Cross-modal Conflicts in BEV Space for LiDAR-Camera 3D Object Detection | Jiahui Fu et.al. | 2403.07372 | null |
2024-03-12 | GPT-generated Text Detection: Benchmark Dataset and Tensor-based Detection Method | Zubair Qazi et.al. | 2403.07321 | link |
2024-03-12 | MENTOR: Multilingual tExt detectioN TOward leaRning by analogy | Hsin-Ju Lin et.al. | 2403.07286 | null |
2024-03-12 | SparseLIF: High-Performance Sparse LiDAR-Camera Fusion for 3D Object Detection | Hongcheng Zhang et.al. | 2403.07284 | null |
2024-03-12 | Adaptive Bounding Box Uncertainties via Two-Step Conformal Prediction | Alexander Timans et.al. | 2403.07263 | null |
2024-03-11 | Class Imbalance in Object Detection: An Experimental Diagnosis and Study of Mitigation Strategies | Nieves Crasto et.al. | 2403.07113 | link |
2024-03-11 | Real-time Transformer-based Open-Vocabulary Detection with Efficient Fusion Head | Tiancheng Zhao et.al. | 2403.06892 | null |
2024-03-11 | LeOCLR: Leveraging Original Images for Contrastive Learning of Visual Representations | Mohammad Alkhalefi et.al. | 2403.06813 | null |
2024-03-11 | Genetic Learning for Designing Sim-to-Real Data Augmentations | Bram Vanherle et.al. | 2403.06786 | null |
2024-03-11 | Evaluating the Energy Efficiency of Few-Shot Learning for Object Detection in Industrial Settings | Georgios Tsoumplekas et.al. | 2403.06631 | null |
2024-03-11 | Cross-domain and Cross-dimension Learning for Image-to-Graph Transformers | Alexander H. Berger et.al. | 2403.06601 | null |
2024-03-11 | SARDet-100K: Towards Open-Source Benchmark and ToolKit for Large-Scale SAR Object Detection | Yuxuan Li et.al. | 2403.06534 | link |
2024-03-11 | 3D Semantic Segmentation-Driven Representations for 3D Object Detection | Hayeon O et.al. | 2403.06501 | null |
2024-03-11 | Fine-Grained Pillar Feature Encoding Via Spatio-Temporal Virtual Grid for 3D Object Detection | Konyul Park et.al. | 2403.06433 | null |
2024-03-10 | Transformer based Multitask Learning for Image Captioning and Object Detection | Debolena Basak et.al. | 2403.06292 | null |
2024-03-10 | Poly Kernel Inception Network for Remote Sensing Detection | Xinhao Cai et.al. | 2403.06258 | link |
2024-03-08 | EVD4UAV: An Altitude-Sensitive Benchmark to Evade Vehicle Detection in UAV | Huiming Sun et.al. | 2403.05422 | null |
2024-03-08 | SIRST-5K: Exploring Massive Negatives Synthesis with Self-supervised Learning for Robust Infrared Small Target Detection | Yahao Lu et.al. | 2403.05416 | link |
2024-03-08 | Exploring Robust Features for Few-Shot Object Detection in Satellite Imagery | Xavier Bou et.al. | 2403.05381 | null |
2024-03-08 | Frequency-Adaptive Dilated Convolution for Semantic Segmentation | Linwei Chen et.al. | 2403.05369 | link |
2024-03-08 | VLM-PL: Advanced Pseudo Labeling approach Class Incremental Object Detection with Vision-Language Model | Junsu Kim et.al. | 2403.05346 | null |
2024-03-08 | Improving the Successful Robotic Grasp Detection Using Convolutional Neural Networks | Hamed Hosseini et.al. | 2403.05211 | null |
2024-03-08 | LanePtrNet: Revisiting Lane Detection as Point Voting and Grouping on Curves | Jiayan Cao et.al. | 2403.05155 | null |
2024-03-08 | RadarDistill: Boosting Radar-based Object Detection Performance via Knowledge Distillation from LiDAR Features | Geonho Bang et.al. | 2403.05061 | null |
2024-03-08 | ActFormer: Scalable Collaborative Perception via Active Queries | Suozhi Huang et.al. | 2403.04968 | null |
2024-03-07 | FriendNet: Detection-Friendly Dehazing Network | Yihua Fan et.al. | 2403.04443 | null |
2024-03-07 | Effectiveness Assessment of Recent Large Vision-Language Models | Yao Jiang et.al. | 2403.04306 | null |
2024-03-07 | ACC-ViT : Atrous Convolution's Comeback in Vision Transformers | Nabil Ibtehaz et.al. | 2403.04200 | null |
2024-03-07 | CN-RMA: Combined Network with Ray Marching Aggregation for 3D Indoors Object Detection from Multi-view Images | Guanlin Shen et.al. | 2403.04198 | null |
2024-03-07 | Scalable and Robust Transformer Decoders for Interpretable Image Classification with Foundation Models | Evelyn Mannix et.al. | 2403.04125 | null |
2024-03-07 | CMDA: Cross-Modal and Domain Adversarial Adaptation for LiDAR-Based 3D Object Detection | Gyusam Chang et.al. | 2403.03721 | null |
2024-03-06 | Adversarial Infrared Geometry: Using Geometry to Perform Adversarial Attack against Infrared Pedestrian Detectors | Kalibinuer Tiliwalidi et.al. | 2403.03674 | null |
2024-03-06 | Towards Detecting AI-Generated Text within Human-AI Collaborative Hybrid Texts | Zijie Zeng et.al. | 2403.03506 | null |
2024-03-06 | Multi-task Learning for Real-time Autonomous Driving Leveraging Task-adaptive Attention Generator | Wonhyeok Choi et.al. | 2403.03468 | null |
2024-03-06 | FLAME Diffuser: Grounded Wildfire Image Synthesis using Mask Guided Diffusion | Hao Wang et.al. | 2403.03463 | null |
2024-03-06 | Performance Evaluation of Semi-supervised Learning Frameworks for Multi-Class Weed Detection | Jiajia Li et.al. | 2403.03390 | link |
2024-03-05 | Detecting Concrete Visual Tokens for Multimodal Machine Translation | Braeden Bowen et.al. | 2403.03075 | null |
2024-03-05 | Loss Design for Single-carrier Joint Communication and Neural Network-based Sensing | Charlotte Muth et.al. | 2403.02929 | null |
2024-03-05 | Are Dense Labels Always Necessary for 3D Object Detection from Point Cloud? | Chenqiang Gao et.al. | 2403.02818 | null |
2024-03-05 | Bootstrapping Rare Object Detection in High-Resolution Satellite Imagery | Akram Zaytar et.al. | 2403.02736 | null |
2024-03-05 | FastOcc: Accelerating 3D Occupancy Prediction by Fusing the 2D Bird's-Eye View and Perspective View | Jiawei Hou et.al. | 2403.02710 | null |
2024-03-05 | False Positive Sampling-based Data Augmentation for Enhanced 3D Object Detection Accuracy | Jiyong Oh et.al. | 2403.02639 | null |
2024-03-05 | BSDP: Brain-inspired Streaming Dual-level Perturbations for Online Open World Object Detection | Yu Chen et.al. | 2403.02637 | null |
2024-03-04 | NiNformer: A Network in Network Transformer with Token Mixing Generated Gating Function | Abdullah Nazhat Abdullah et.al. | 2403.02411 | link |
2024-03-04 | COMMIT: Certifying Robustness of Multi-Sensor Fusion Systems against Semantic Attacks | Zijian Huang et.al. | 2403.02329 | null |
2024-03-04 | Scalable Vision-Based 3D Object Detection and Monocular Depth Estimation for Autonomous Driving | Yuxuan Liu et.al. | 2403.02037 | link |
2024-03-02 | TUMTraf V2X Cooperative Perception Dataset | Walter Zimmer et.al. | 2403.01316 | null |
2024-03-02 | Causal Mode Multiplexer: A Novel Framework for Unbiased Multispectral Pedestrian Detection | Taeheon Kim et.al. | 2403.01300 | null |
2024-03-02 | Run-time Introspection of 2D Object Detection in Automated Driving Systems Using Learning Representations | Hakan Yekta Yatbaz et.al. | 2403.01172 | null |
2024-03-02 | ELA: Efficient Local Attention for Deep Convolutional Neural Networks | Wei Xu et.al. | 2403.01123 | null |
2024-03-02 | Face Swap via Diffusion Model | Feifei Wang et.al. | 2403.01108 | null |
2024-03-02 | Beyond Night Visibility: Adaptive Multi-Scale Fusion of Infrared and Visible Images | Shufan Pei et.al. | 2403.01083 | null |
2024-03-01 | Learning Causal Features for Incremental Object Detection | Zhenwei He et.al. | 2403.00591 | null |
2024-03-01 | Abductive Ego-View Accident Video Understanding for Safe Driving Perception | Jianwu Fang et.al. | 2403.00436 | null |
2024-03-04 | DAMS-DETR: Dynamic Adaptive Multispectral Detection Transformer with Competitive Query Selection and Adaptive Feature Fusion | Junjie Guo et.al. | 2403.00326 | null |
2024-03-01 | ODM: A Text-Image Further Alignment Pre-training Approach for Scene Text Detection and Spotting | Chen Duan et.al. | 2403.00303 | null |
2024-02-29 | SeMoLi: What Moves Together Belongs Together | Jenny Seidenschwarz et.al. | 2402.19463 | null |
2024-02-29 | Genie: Smart ROS-based Caching for Connected Autonomous Robots | Zexin Li et.al. | 2402.19410 | null |
2024-02-29 | ProtoP-OD: Explainable Object Detection with Prototypical Parts | Pavlos Rath-Manakidis et.al. | 2402.19142 | null |
2024-02-29 | Theoretically Achieving Continuous Representation of Oriented Bounding Boxes | Zikai Xiao et.al. | 2402.18975 | link |
2024-02-29 | Boosting Semi-Supervised Object Detection in Remote Sensing Images With Active Teaching | Boxuan Zhang et.al. | 2402.18958 | null |
2024-02-29 | Edge Computing Enabled Real-Time Video Analysis via Adaptive Spatial-Temporal Semantic Filtering | Xiang Chen et.al. | 2402.18927 | null |
2024-02-29 | A Simple yet Effective Network based on Vision Transformer for Camouflaged Object and Salient Object Detection | Chao Hao et.al. | 2402.18922 | null |
2024-02-29 | Privacy-Preserving Autoencoder for Collaborative Object Detection | Bardia Azizian et.al. | 2402.18864 | null |
2024-02-29 | Debiased Novel Category Discovering and Localization | Juexiao Feng et.al. | 2402.18821 | null |
2024-02-28 | Spatial Coherence Loss for Salient and Camouflaged Object Detection and Beyond | Ziyun Yang et.al. | 2402.18698 | null |
2024-02-28 | UniMODE: Unified Monocular 3D Object Detection | Zhuoling Li et.al. | 2402.18573 | null |
2024-02-28 | Detection of Micromobility Vehicles in Urban Traffic Videos | Khalil Sabri et.al. | 2402.18503 | link |
2024-02-28 | Sunshine to Rainstorm: Cross-Weather Knowledge Distillation for Robust 3D Object Detection | Xun Huang et.al. | 2402.18493 | null |
2024-02-28 | Prompt-Driven Dynamic Object-Centric Learning for Single Domain Generalization | Deng Li et.al. | 2402.18447 | null |
2024-02-28 | Unveiling novel insights into Kirchhoff migration for effective object detection using experimental Fresnel dataset | Won-Kwang Park et.al. | 2402.18322 | null |
2024-02-28 | Zero-Shot Aerial Object Detection with Visual Description Regularization | Zhengqing Zang et.al. | 2402.18233 | null |
2024-02-28 | VulMCI : Code Splicing-based Pixel-row Oversampling for More Continuous Vulnerability Image Generation | Tao Peng et.al. | 2402.18189 | null |
2024-02-27 | SDDGR: Stable Diffusion-based Deep Generative Replay for Class Incremental Object Detection | Junsu Kim et.al. | 2402.17323 | null |
2024-02-27 | A Vanilla Multi-Task Framework for Dense Visual Prediction Solution to 1st VCL Challenge -- Multi-Task Robustness Track | Zehui Chen et.al. | 2402.17319 | null |
2024-02-27 | Probing Multimodal Large Language Models for Global and Local Semantic Representation | Mingxu Tao et.al. | 2402.17304 | null |
Publish Date | Title | Authors | Code | |
---|---|---|---|---|
2024-08-29 | Eigen-Cluster VIS: Improving Weakly-supervised Video Instance Segmentation by Leveraging Spatio-temporal Consistency | Farnoosh Arefi et.al. | 2408.16661 | link |
2024-08-29 | SODAWideNet++: Combining Attention and Convolutions for Salient Object Detection | Rohit Venkata Sai Dulam et.al. | 2408.16645 | null |
2024-08-29 | A Simple and Generalist Approach for Panoptic Segmentation | Nedyalko Prisadnikov et.al. | 2408.16504 | null |
2024-08-29 | MICDrop: Masking Image and Depth Features via Complementary Dropout for Domain-Adaptive Semantic Segmentation | Linyan Yang et.al. | 2408.16478 | null |
2024-08-29 | Multi-source Domain Adaptation for Panoramic Semantic Segmentation | Jing Jiang et.al. | 2408.16469 | null |
2024-08-29 | EvLight++: Low-Light Video Enhancement with an Event Camera: A Large-Scale Real-World Dataset, Novel Method, and More | Kanghao Chen et.al. | 2408.16254 | null |
2024-08-28 | InstanSeg: an embedding-based instance segmentation algorithm optimized for accurate, efficient and portable cell segmentation | Thibaut Goldsborough et.al. | 2408.15954 | link |
2024-08-28 | SpineMamba: Enhancing 3D Spinal Segmentation in Clinical Imaging through Residual Visual Mamba Layers and Shape Priors | Zhiqing Zhang et.al. | 2408.15887 | null |
2024-08-28 | DQFormer: Towards Unified LiDAR Panoptic Segmentation with Decoupled Queries | Yu Yang et.al. | 2408.15813 | null |
2024-08-28 | TeFF: Tracking-enhanced Forgetting-free Few-shot 3D LiDAR Semantic Segmentation | Junbao Zhou et.al. | 2408.15657 | link |
2024-08-27 | Handling Geometric Domain Shifts in Semantic Segmentation of Surgical RGB and Hyperspectral Images | Silvia Seidlitz et.al. | 2408.15373 | link |
2024-08-27 | An Investigation on The Position Encoding in Vision-Based Dynamics Prediction | Jiageng Zhu et.al. | 2408.15201 | null |
2024-08-27 | Knowledge Discovery in Optical Music Recognition: Enhancing Information Retrieval with Instance Segmentation | Elona Shatri et.al. | 2408.15002 | null |
2024-08-27 | Applying ViT in Generalized Few-shot Semantic Segmentation | Liyuan Geng et.al. | 2408.14957 | link |
2024-08-27 | Adversarial Manhole: Challenging Monocular Depth Estimation and Semantic Segmentation Models with Patch Attack | Naufal Suryanto et.al. | 2408.14879 | null |
2024-08-27 | MROVSeg: Breaking the Resolution Curse of Vision-Language Models in Open-Vocabulary Semantic Segmentation | Yuanbing Zhu et.al. | 2408.14776 | null |
2024-08-26 | Physically Feasible Semantic Segmentation | Shamik Basu et.al. | 2408.14672 | link |
2024-08-26 | A Survey of Camouflaged Object Detection and Beyond | Fengyang Xiao et.al. | 2408.14562 | null |
2024-08-26 | Satellite Sunroof: High-res Digital Surface Models and Roof Segmentation for Global Solar Mapping | Vishal Batchu et.al. | 2408.14400 | null |
2024-08-25 | OpenNav: Efficient Open Vocabulary 3D Object Detection for Smart Wheelchair Navigation | Muhammad Rameez ur Rahman et.al. | 2408.13936 | link |
2024-08-25 | Exploring Reliable Matching with Phase Enhancement for Night-time Semantic Segmentation | Yuwen Pan et.al. | 2408.13838 | null |
2024-08-25 | TripleMixer: A 3D Point Cloud Denoising Model for Adverse Weather | Xiongwei Zhao et.al. | 2408.13802 | link |
2024-08-25 | ICFRNet: Image Complexity Prior Guided Feature Refinement for Real-time Semantic Segmentation | Xin Zhang et.al. | 2408.13771 | null |
2024-08-25 | Localization and Expansion: A Decoupled Framework for Point Cloud Few-shot Semantic Segmentation | Zhaoyang Li et.al. | 2408.13752 | null |
2024-08-24 | ESA: Annotation-Efficient Active Learning for Semantic Segmentation | Jinchao Ge et.al. | 2408.13491 | link |
2024-08-23 | Accuracy Improvement of Cell Image Segmentation Using Feedback Former | Hinako Mitsuoka et.al. | 2408.12974 | null |
2024-08-23 | Image Segmentation in Foundation Model Era: A Survey | Tianfei Zhou et.al. | 2408.12957 | null |
2024-08-23 | Symmetric masking strategy enhances the performance of Masked Image Modeling | Khanh-Binh Nguyen et.al. | 2408.12772 | null |
2024-08-22 | Scribbles for All: Benchmarking Scribble Supervised Segmentation Across Datasets | Wolfgang Boettcher et.al. | 2408.12489 | null |
2024-08-22 | The 2nd Solution for LSVOS Challenge RVOS Track: Spatial-temporal Refinement for Consistent Semantic Segmentation | Tuyen Tran et.al. | 2408.12447 | null |
2024-08-22 | ISETHDR: A Physics-based Synthetic Radiance Dataset for High Dynamic Range Driving Scenes | Zhenyi Liu et.al. | 2408.12048 | link |
2024-08-21 | EmbodiedSAM: Online Segment Any 3D Thing in Real Time | Xiuwei Xu et.al. | 2408.11811 | null |
2024-08-21 | NuSegDG: Integration of Heterogeneous Space and Gaussian Kernel for Domain-Generalized Nuclei Segmentation | Zhenye Lou et.al. | 2408.11787 | link |
2024-08-21 | Open-Ended 3D Point Cloud Instance Segmentation | Phuc D. A. Nguyen et.al. | 2408.11747 | null |
2024-08-21 | UNetMamba: Efficient UNet-Like Mamba for Semantic Segmentation of High-Resolution Remote Sensing Images | Enze Zhu et.al. | 2408.11545 | null |
2024-08-22 | SAM-REF: Rethinking Image-Prompt Synergy for Refinement in Segment Anything | Chongkai Yu et.al. | 2408.11535 | null |
2024-08-21 | Exploring Scene Coherence for Semi-Supervised 3D Semantic Segmentation | Chuandong Liu et.al. | 2408.11280 | null |
2024-08-20 | An Interpretable Deep Learning Approach for Morphological Script Type Analysis | Malamatenia Vlachou-Efstathiou et.al. | 2408.11150 | null |
2024-08-20 | NeCo: Improving DINOv2's spatial representations in 19 GPU hours with Patch Neighbor Consistency | Valentinos Pariza et.al. | 2408.11054 | null |
2024-08-20 | CO2Wounds-V2: Extended Chronic Wounds Dataset From Leprosy Patients | Karen Sanchez et.al. | 2408.10827 | null |
2024-08-20 | Vocabulary-Free 3D Instance Segmentation with Vision and Language Assistant | Guofeng Mei et.al. | 2408.10652 | null |
2024-08-20 | Rethinking Video Segmentation with Masked Video Consistency: Did the Model Learn as Intended? | Chen Liang et.al. | 2408.10627 | null |
2024-08-20 | Subspace Prototype Guidance for Mitigating Class Imbalance in Point Cloud Semantic Segmentation | Jiawei Han et.al. | 2408.10537 | link |
2024-08-21 | LSVOS Challenge 3rd Place Report: SAM2 and Cutie based VOS | Xinyu Liu et.al. | 2408.10469 | null |
2024-08-19 | Leveraging Superfluous Information in Contrastive Representation Learning | Xuechu Yu et.al. | 2408.10292 | null |
2024-08-19 | Imbalance-Aware Culvert-Sewer Defect Segmentation Using an Enhanced Feature Pyramid Network | Rasha Alshawi et.al. | 2408.10181 | null |
2024-08-19 | Dynamic Label Injection for Imbalanced Industrial Defect Segmentation | Emanuele Caruso et.al. | 2408.10031 | link |
2024-08-19 | Detecting Adversarial Attacks in Semantic Segmentation via Uncertainty Estimation: A Deep Analysis | Kira Maag et.al. | 2408.10021 | null |
2024-08-19 | DiscoNeRF: Class-Agnostic Object Field for 3D Object Discovery | Corentin Dumery et.al. | 2408.09928 | null |
2024-08-19 | 3D-Aware Instance Segmentation and Tracking in Egocentric Videos | Yash Bhalgat et.al. | 2408.09860 | null |
2024-08-19 | Segment-Anything Models Achieve Zero-shot Robustness in Autonomous Driving | Jun Yan et.al. | 2408.09839 | link |
2024-08-18 | OVOSE: Open-Vocabulary Semantic Segmentation in Event-Based Cameras | Muhammad Rameez Ur Rahman et.al. | 2408.09424 | link |
2024-08-18 | VrdONE: One-stage Video Visual Relation Detection | Xinjie Jiang et.al. | 2408.09408 | link |
2024-08-18 | Elite360M: Efficient 360 Multi-task Learning via Bi-projection Fusion and Cross-task Collaboration | Hao Ai et.al. | 2408.09336 | null |
2024-08-17 | Cross-Species Data Integration for Enhanced Layer Segmentation in Kidney Pathology | Junchao Zhu et.al. | 2408.09278 | link |
2024-08-16 | Zero-Shot Dual-Path Integration Framework for Open-Vocabulary 3D Instance Segmentation | Tri Ton et.al. | 2408.08591 | null |
2024-08-16 | Tuning a SAM-Based Model with Multi-Cognitive Visual Adapter to Remote Sensing Instance Segmentation | Linghao Zheng et.al. | 2408.08576 | null |
2024-08-16 | Tell Codec What Worth Compressing: Semantically Disentangled Image Coding for Machine with LMMs | Jinming Liu et.al. | 2408.08575 | null |
2024-08-15 | 5%>100%: Breaking Performance Shackles of Full Fine-Tuning on Visual Recognition Tasks | Dongshuo Yin et.al. | 2408.08345 | link |
2024-08-14 | MedTsLLM: Leveraging LLMs for Multimodal Medical Time Series Analysis | Nimeesha Chan et.al. | 2408.07773 | link |
2024-08-15 | MetaSeg: MetaFormer-based Global Contexts-aware Network for Efficient Semantic Segmentation | Beoungwoo Kang et.al. | 2408.07576 | link |
2024-08-15 | MagicFace: Training-free Universal-Style Human Image Customized Synthesis | Yibin Wang et.al. | 2408.07433 | null |
2024-08-14 | Segment Using Just One Example | Pratik Vora et.al. | 2408.07393 | null |
2024-08-14 | Ensemble architecture in polyp segmentation | Hao-Yun Hsu et.al. | 2408.07262 | link |
2024-08-14 | Leveraging Perceptual Scores for Dataset Pruning in Computer Vision Tasks | Raghavendra Singh et.al. | 2408.07243 | null |
2024-08-14 | Enhancing Autonomous Vehicle Perception in Adverse Weather through Image Augmentation during Semantic Segmentation Training | Ethan Kou et.al. | 2408.07239 | null |
2024-08-13 | ReCLIP++: Learn to Rectify the Bias of CLIP for Unsupervised Semantic Segmentation | Jingyun Wang et.al. | 2408.06747 | link |
2024-08-10 | Dilated Convolution with Learnable Spacings | Ismail Khalfaoui-Hassani et.al. | 2408.06383 | null |
2024-08-12 | Correlation Weighted Prototype-based Self-Supervised One-Shot Segmentation of Medical Images | Siladittya Manna et.al. | 2408.06235 | null |
2024-08-12 | A-BDD: Leveraging Data Augmentations for Safe Autonomous Driving in Adverse Weather and Lighting | Felix Assion et.al. | 2408.06071 | null |
2024-08-13 | ClickAttention: Click Region Similarity Guided Interactive Segmentation | Long Xu et.al. | 2408.06021 | null |
2024-08-12 | Enhancing 3D Transformer Segmentation Model for Medical Image with Token-level Representation Learning | Xinrong Hu et.al. | 2408.05889 | null |
2024-08-11 | Seg-CycleGAN : SAR-to-optical image translation guided by a downstream task | Hannuo Zhang et.al. | 2408.05777 | null |
2024-08-11 | MacFormer: Semantic Segmentation with Fine Object Boundaries | Guoan Xu et.al. | 2408.05699 | null |
2024-08-13 | Performance Evaluation of YOLOv8 Model Configurations, for Instance Segmentation of Strawberry Fruit Development Stages in an Open Field Environment | Abdul-Razak Alhassan Gamani et.al. | 2408.05661 | null |
2024-08-10 | Multimodal generative semantic communication based on latent diffusion model | Weiqi Fu et.al. | 2408.05455 | null |
2024-08-09 | PRISM Lite: A lightweight model for interactive 3D placenta segmentation in ultrasound | Hao Li et.al. | 2408.05372 | link |
2024-08-09 | In Defense of Lazy Visual Grounding for Open-Vocabulary Semantic Segmentation | Dahyun Kang et.al. | 2408.04961 | link |
2024-08-09 | ProxyCLIP: Proxy Attention Improves CLIP for Open-Vocabulary Segmentation | Mengcheng Lan et.al. | 2408.04883 | link |
2024-08-09 | Extracting Signal Electron Trajectories in the COMET Phase-I Cylindrical Drift Chamber Using Deep Learning | Fumihiro Kaneko et.al. | 2408.04795 | null |
2024-08-08 | Embodied Uncertainty-Aware Object Segmentation | Xiaolin Fang et.al. | 2408.04760 | null |
2024-08-08 | SAM 2 in Robotic Surgery: An Empirical Evaluation for Robustness and Generalization in Surgical Video Segmentation | Jieming Yu et.al. | 2408.04593 | null |
2024-08-08 | Robust Approximate Characterization of Single-Cell Heterogeneity in Microbial Growth | Richard D. Paul et.al. | 2408.04501 | link |
2024-08-08 | SegXAL: Explainable Active Learning for Semantic Segmentation in Driving Scene Scenarios | Sriram Mandalika et.al. | 2408.04482 | null |
2024-08-08 | What could go wrong? Discovering and describing failure modes in computer vision | Gabriela Csurka et.al. | 2408.04471 | null |
2024-08-07 | Performance and Non-adversarial Robustness of the Segment Anything Model 2 in Surgical Video Segmentation | Yiqing Shen et.al. | 2408.04098 | null |
2024-08-07 | CAS-ViT: Convolutional Additive Self-attention Vision Transformers for Efficient Mobile Applications | Tianfang Zhang et.al. | 2408.03703 | link |
2024-08-07 | SAM2-PATH: A better segment anything model for semantic segmentation in digital pathology | Mingya Zhang et.al. | 2408.03651 | link |
2024-08-06 | Post-Mortem Human Iris Segmentation Analysis with Deep Learning | Afzal Hossain et.al. | 2408.03448 | null |
2024-08-06 | Comb, Prune, Distill: Towards Unified Pruning for Vision Model Compression | Jonas Schmitt et.al. | 2408.03046 | link |
2024-08-06 | Evaluation of Segment Anything Model 2: The Role of SAM2 in the Underwater Environment | Shijie Lian et.al. | 2408.02924 | link |
2024-08-05 | Scribble-Based Interactive Segmentation of Medical Hyperspectral Images | Zhonghao Wang et.al. | 2408.02708 | null |
2024-08-05 | Perception Matters: Enhancing Embodied AI with Uncertainty-Aware Semantic Segmentation | Sai Prasanna et.al. | 2408.02297 | null |
2024-08-05 | Cross-Domain Semantic Segmentation on Inconsistent Taxonomy using VLMs | Jeongkee Lim et.al. | 2408.02261 | null |
2024-08-05 | Curriculum learning based pre-training using Multi-Modal Contrastive Masked Autoencoders | Muhammad Abdullah Jamal et.al. | 2408.02245 | null |
2024-08-04 | Pixel-Level Domain Adaptation: A New Perspective for Enhancing Weakly Supervised Semantic Segmentation | Ye Du et.al. | 2408.02039 | null |
2024-08-03 | NuLite -- Lightweight and Fast Model for Nuclei Instance Segmentation and Classification | Cristian Tommasino et.al. | 2408.01797 | null |
2024-08-03 | Bayesian Active Learning for Semantic Segmentation | Sima Didari et.al. | 2408.01694 | null |
2024-08-03 | A Comparative Analysis of CNN-based Deep Learning Models for Landslide Detection | Omkar Oak et.al. | 2408.01692 | null |
2024-08-03 | Leveraging GNSS and Onboard Visual Data from Consumer Vehicles for Robust Road Network Estimation | Balázs Opra et.al. | 2408.01640 | null |
2024-08-02 | Multi-Unit Floor Plan Recognition and Reconstruction Using Improved Semantic Segmentation of Raster-Wise Floor Plans | Lukas Kratochvila et.al. | 2408.01526 | null |
2024-08-02 | Balanced Residual Distillation Learning for 3D Point Cloud Class-Incremental Semantic Segmentation | Yuanzhi Su et.al. | 2408.01356 | null |
2024-08-02 | StitchFusion: Weaving Any Visual Modalities to Enhance Multimodal Semantic Segmentation | Bingyu Li et.al. | 2408.01343 | null |
2024-08-02 | Amodal Segmentation for Laparoscopic Surgery Video Instruments | Ruohua Shi et.al. | 2408.01067 | null |
2024-08-02 | Visible-Thermal Multiple Object Tracking: Large-scale Video Dataset and Progressive Fusion Approach | Yabin Zhu et.al. | 2408.00969 | null |
2024-08-01 | Medical SAM 2: Segment medical images as video via Segment Anything Model 2 | Jiayuan Zhu et.al. | 2408.00874 | null |
2024-08-01 | Leaf Angle Estimation using Mask R-CNN and LETR Vision Transformer | Venkat Margapuri et.al. | 2408.00749 | null |
2024-08-01 | Collaborative Vision-Text Representation Optimizing for Open-Vocabulary Segmentation | Siyu Jiao et.al. | 2408.00744 | null |
2024-08-01 | Synthetic dual image generation for reduction of labeling efforts in semantic segmentation of micrographs with a customized metric function | Matias Oscar Volman Stern et.al. | 2408.00707 | null |
2024-08-01 | AMAES: Augmented Masked Autoencoder Pretraining on Public Brain MRI Data for 3D-Native Segmentation | Asbjørn Munk et.al. | 2408.00640 | null |
2024-08-01 | SegStitch: Multidimensional Transformer for Robust and Efficient Medical Imaging Segmentation | Shengbo Tan et.al. | 2408.00496 | null |
2024-08-01 | A Simple Background Augmentation Method for Object Detection with Diffusion Model | Yuhang Li et.al. | 2408.00350 | null |
2024-07-31 | Con4m: Context-aware Consistency Learning Framework for Segmented Time Series Classification | Junru Chen et.al. | 2408.00041 | null |
2024-07-31 | Open-Vocabulary Audio-Visual Semantic Segmentation | Ruohao Guo et.al. | 2407.21721 | null |
2024-07-31 | MTA-CLIP: Language-Guided Semantic Segmentation with Mask-Text Alignment | Anurag Das et.al. | 2407.21654 | null |
2024-07-31 | MaskUno: Switch-Split Block For Enhancing Instance Segmentation | Jawad Haidar et.al. | 2407.21498 | null |
2024-07-31 | Small Object Few-shot Segmentation for Vision-based Industrial Inspection | Zilong Zhang et.al. | 2407.21351 | null |
2024-07-31 | On-the-fly Point Feature Representation for Point Clouds Analysis | Jiangyi Wang et.al. | 2407.21335 | null |
2024-07-31 | Fine-grained Metrics for Point Cloud Semantic Segmentation | Zhuheng Lu et.al. | 2407.21289 | null |
2024-07-30 | PLANesT-3D: A new annotated dataset for segmentation of 3D plant point clouds | Kerem Mertoğlu et.al. | 2407.21150 | null |
2024-07-30 | Learning Ordinality in Semantic Segmentation | Rafael Cristino et.al. | 2407.20959 | null |
2024-07-29 | Improving 2D Feature Representations by 3D-Aware Fine-Tuning | Yuanwen Yue et.al. | 2407.20229 | null |
2024-07-29 | Background Semantics Matter: Cross-Task Feature Exchange Network for Clustered Infrared Small Target Detection With Sky-Annotated Dataset | Yimian Dai et.al. | 2407.20078 | link |
2024-07-29 | Language-driven Grasp Detection with Mask-guided Attention | Tuan Van Vo et.al. | 2407.19877 | null |
2024-07-29 | Rethinking RGB-D Fusion for Semantic Segmentation in Surgical Datasets | Muhammad Abdullah Jamal et.al. | 2407.19714 | null |
2024-07-29 | ALEN: A Dual-Approach for Uniform and Non-Uniform Low-Light Image Enhancement | Ezequiel Perez-Zarate et.al. | 2407.19708 | link |
2024-07-28 | ASI-Seg: Audio-Driven Surgical Instrument Segmentation with Surgeon Intention Understanding | Zhen Chen et.al. | 2407.19435 | link |
2024-07-28 | Depth-Wise Convolutions in Vision Transformers for Efficient Training on Small Datasets | Tianxiao Zhang et.al. | 2407.19394 | link |
2024-07-27 | Ensembling convolutional neural networks for human skin segmentation | Patryk Kuban et.al. | 2407.19310 | null |
2024-07-27 | Sewer Image Super-Resolution with Depth Priors and Its Lightweight Network | Gang Pan et.al. | 2407.19271 | null |
2024-07-26 | Sparse Refinement for Efficient High-Resolution Semantic Segmentation | Zhijian Liu et.al. | 2407.19014 | null |
2024-07-26 | A Survey on Cell Nuclei Instance Segmentation and Classification: Leveraging Context and Attention | João D. Nunes et.al. | 2407.18673 | null |
2024-07-26 | Learning Spectral-Decomposed Tokens for Domain Generalized Semantic Segmentation | Jingjun Yi et.al. | 2407.18568 | null |
2024-07-25 | Taxonomy-Aware Continual Semantic Segmentation in Hyperbolic Spaces for Open-World Perception | Julia Hindel et.al. | 2407.18145 | null |
2024-07-25 | LKCell: Efficient Cell Nuclei Instance Segmentation with Large Convolution Kernels | Ziwei Cui et.al. | 2407.18054 | link |
2024-07-25 | TiCoSS: Tightening the Coupling between Semantic Segmentation and Stereo Matching within A Joint Learning Framework | Guanfeng Tang et.al. | 2407.18038 | null |
2024-07-25 | Segmentation-guided MRI reconstruction for meaningfully diverse reconstructions | Jan Nikolas Morshuis et.al. | 2407.18026 | link |
2024-07-26 | Quality Assured: Rethinking Annotation Strategies in Imaging AI | Tim Rädsch et.al. | 2407.17596 | null |
2024-07-24 | Embedding-Free Transformer with Inference Spatial Reduction for Efficient Semantic Segmentation | Hyunwoo Yu et.al. | 2407.17261 | link |
2024-07-24 | Trans2Unet: Neural fusion for Nuclei Semantic Segmentation | Dinh-Phu Tran et.al. | 2407.17181 | null |
2024-07-24 | PiPa++: Towards Unification of Domain Adaptive Semantic Segmentation via Self-supervised Learning | Mu Chen et.al. | 2407.17101 | null |
2024-07-25 | Enhancing Environmental Monitoring through Multispectral Imaging: The WasteMS Dataset for Semantic Segmentation of Lakeside Waste | Qinfeng Zhu et.al. | 2407.17028 | link |
2024-07-24 | Progressive Query Refinement Framework for Bird's-Eye-View Semantic Segmentation from Surrounding Images | Dooseop Choi et.al. | 2407.17003 | link |
2024-07-24 | McGAN: Generating Manufacturable Designs by Embedding Manufacturing Rules into Conditional Generative Adversarial Network | Zhichao Wang et.al. | 2407.16943 | null |
2024-07-23 | SAM-CP: Marrying SAM with Composable Prompts for Versatile Segmentation | Pengfei Chen et.al. | 2407.16682 | null |
2024-07-23 | Deformable Convolution Based Road Scene Semantic Segmentation of Fisheye Images in Autonomous Driving | Anam Manzoor et.al. | 2407.16647 | null |
2024-07-23 | Deep Bayesian segmentation for colon polyps: Well-calibrated predictions in medical imaging | Daniela L. Ramos et.al. | 2407.16608 | null |
2024-07-23 | Strike a Balance in Continual Panoptic Segmentation | Jinpeng Chen et.al. | 2407.16354 | link |
2024-07-23 | Augmented Efficiency: Reducing Memory Footprint and Accelerating Inference for 3D Semantic Segmentation through Hybrid Vision | Aditya Krishnan et.al. | 2407.16102 | null |
2024-07-22 | Enhancing Cell Instance Segmentation in Scanning Electron Microscopy Images via a Deep Contour Closing Operator | Florian Robert et.al. | 2407.15817 | null |
2024-07-22 | MILAN: Milli-Annotations for Lidar Semantic Segmentation | Nermin Samet et.al. | 2407.15797 | null |
2024-07-22 | Diffusion for Out-of-Distribution Detection on Road Scenes and Beyond | Silvio Galesso et.al. | 2407.15739 | link |
2024-07-22 | MSSPlace: Multi-Sensor Place Recognition with Visual and Text Semantics | Alexander Melekhin et.al. | 2407.15663 | link |
2024-07-22 | Learning at a Glance: Towards Interpretable Data-limited Continual Semantic Segmentation via Semantic-Invariance Modelling | Bo Yuan et.al. | 2407.15429 | link |
2024-07-22 | Is user feedback always informative? Retrieval Latent Defending for Semi-Supervised Domain Adaptation without Source Data | Junha Song et.al. | 2407.15383 | null |
2024-07-21 | Point Transformer V3 Extreme: 1st Place Solution for 2024 Waymo Open Dataset Challenge in Semantic Segmentation | Xiaoyang Wu et.al. | 2407.15282 | null |
2024-07-20 | Downstream-Pretext Domain Knowledge Traceback for Active Learning | Beichen Zhang et.al. | 2407.14720 | null |
2024-07-19 | Panoptic Segmentation of Mammograms with Text-To-Image Diffusion Model | Kun Zhao et.al. | 2407.14326 | null |
2024-07-19 | Early Preparation Pays Off: New Classifier Pre-tuning for Class Incremental Semantic Segmentation | Zhengyuan Xie et.al. | 2407.14142 | link |
2024-07-19 | MC-PanDA: Mask Confidence for Panoptic Domain Adaptation | Ivan Martinović et.al. | 2407.14110 | link |
2024-07-19 | GaussianBeV: 3D Gaussian Representation meets Perception Models for BeV Segmentation | Florian Chabot et.al. | 2407.14108 | null |
2024-07-19 | Scale Disparity of Instances in Interactive Point Cloud Segmentation | Chenrui Han et.al. | 2407.14009 | null |
2024-07-18 | Many Perception Tasks are Highly Redundant Functions of their Input Data | Rahul Ramesh et.al. | 2407.13841 | null |
2024-07-18 | GroupMamba: Parameter-Efficient and Accurate Group Visual State Space Model | Abdelrahman Shaker et.al. | 2407.13772 | link |
2024-07-18 | SegPoint: Segment Any Point Cloud via Large Language Model | Shuting He et.al. | 2407.13761 | null |
2024-07-18 | MeshSegmenter: Zero-Shot Mesh Semantic Segmentation via Texture Synthesis | Ziming Zhong et.al. | 2407.13675 | link |
2024-07-18 | Open-Vocabulary 3D Semantic Segmentation with Text-to-Image Diffusion Models | Xiaoyu Zhu et.al. | 2407.13642 | null |
2024-07-18 | FADE: A Task-Agnostic Upsampling Operator for Encoder-Decoder Architectures | Hao Lu et.al. | 2407.13500 | null |
2024-07-18 | FREST: Feature RESToration for Semantic Segmentation under Multiple Adverse Conditions | Sohyun Lee et.al. | 2407.13437 | null |
2024-07-18 | Lightweight Uncertainty Quantification with Simplex Semantic Segmentation for Terrain Traversability | Judith Dijk et.al. | 2407.13392 | null |
2024-07-18 | Learning from the Web: Language Drives Weakly-Supervised Incremental Learning for Semantic Segmentation | Chang Liu et.al. | 2407.13363 | null |
2024-07-18 | Make a Strong Teacher with Label Assistance: A Novel Knowledge Distillation Approach for Semantic Segmentation | Shoumeng Qiu et.al. | 2407.13254 | null |
2024-07-18 | OE-BevSeg: An Object Informed and Environment Aware Multimodal Framework for Bird's-eye-view Vehicle Semantic Segmentation | Jian Sun et.al. | 2407.13137 | null |
2024-07-17 | FastSAM-3DSlicer: A 3D-Slicer Extension for 3D Volumetric Segment Anything Model with Uncertainty Quantification | Yiqing Shen et.al. | 2407.12658 | null |
2024-07-17 | Weighting Pseudo-Labels via High-Activation Feature Index Similarity and Object Detection for Semi-Supervised Segmentation | Prantik Howlader et.al. | 2407.12630 | link |
2024-07-17 | Instance-wise Uncertainty for Class Imbalance in Semantic Segmentation | Luís Almeida et.al. | 2407.12609 | null |
2024-07-17 | Benchmarking Robust Self-Supervised Learning Across Diverse Downstream Tasks | Antoni Kowalczuk et.al. | 2407.12588 | link |
2024-07-17 | Dual-level Adaptive Self-Labeling for Novel Class Discovery in Point Cloud Segmentation | Ruijie Xu et.al. | 2407.12489 | link |
2024-07-17 | Progressive Proxy Anchor Propagation for Unsupervised Semantic Segmentation | Hyun Seok Seong et.al. | 2407.12463 | null |
2024-07-17 | Close the Sim2real Gap via Physically-based Structured Light Synthetic Data Simulation | Kaixin Bai et.al. | 2407.12449 | null |
2024-07-17 | ClearCLIP: Decomposing CLIP Representations for Dense Vision-Language Inference | Mengcheng Lan et.al. | 2407.12442 | null |
2024-07-17 | Serialized Point Mamba: A Serialized Point Cloud Mamba Segmentation Model | Tao Wang et.al. | 2407.12319 | null |
2024-07-16 | FoodMem: Near Real-time and Precise Food Video Segmentation | Ahmad AlMughrabi et.al. | 2407.12121 | null |
2024-07-16 | Mitigating Background Shift in Class-Incremental Semantic Segmentation | Gilhan Park et.al. | 2407.11859 | link |
2024-07-16 | Stepping Stones: A Progressive Training Strategy for Audio-Visual Semantic Segmentation | Juncheng Ma et.al. | 2407.11820 | null |
2024-07-16 | Click-Gaussian: Interactive Segmentation to Any 3D Gaussians | Seokhun Choi et.al. | 2407.11793 | null |
2024-07-16 | XEdgeAI: A Human-centered Industrial Inspection Framework with Data-centric Explainable Edge AI Approach | Truong Thanh Hung Nguyen et.al. | 2407.11771 | null |
2024-07-16 | OAM-TCD: A globally diverse dataset of high-resolution tree cover maps | Josh Veitch-Michaelis et.al. | 2407.11743 | null |
2024-07-16 | SFPNet: Sparse Focal Point Network for Semantic Segmentation on General LiDAR Point Clouds | Yanbo Wang et.al. | 2407.11569 | link |
2024-07-16 | SGIFormer: Semantic-guided and Geometric-enhanced Interleaving Transformer for 3D Instance Segmentation | Lei Yao et.al. | 2407.11564 | null |
2024-07-16 | Crowd-SAM: SAM as a Smart Annotator for Object Detection in Crowded Scenes | Zhi Cai et.al. | 2407.11464 | link |
2024-07-16 | Leveraging Segment Anything Model in Identifying Buildings within Refugee Camps (SAM4Refugee) from Satellite Imagery for Humanitarian Operations | Yunya Gao et.al. | 2407.11381 | link |
2024-07-16 | Generative AI Driven Task-Oriented Adaptive Semantic Communications | Yuzhou Fu et.al. | 2407.11354 | null |
2024-07-15 | No Train, all Gain: Self-Supervised Gradients Improve Deep Frozen Representations | Walter Simoncini et.al. | 2407.10964 | link |
2024-07-15 | APC: Adaptive Patch Contrast for Weakly Supervised Semantic Segmentation | Wangyu Wu et.al. | 2407.10649 | null |
2024-07-15 | Automated Label Unification for Multi-Dataset Semantic Segmentation with GNNs | Rong Ma et.al. | 2407.10534 | null |
2024-07-14 | Shape2Scene: 3D Scene Representation Learning Through Pre-training on Shape Data | Tuo Feng et.al. | 2407.10200 | link |
2024-07-14 | RAPiD-Seg: Range-Aware Pointwise Distance Distribution Networks for 3D LiDAR Segmentation | Li Li et.al. | 2407.10159 | link |
2024-07-14 | Part2Object: Hierarchical Unsupervised 3D Instance Segmentation | Cheng Shi et.al. | 2407.10084 | link |
2024-07-14 | HSFusion: A high-level vision task-driven infrared and visible image fusion network via semantic and geometric domain transformation | Chengjie Jiang et.al. | 2407.10047 | null |
2024-07-13 | Background Adaptation with Residual Modeling for Exemplar-Free Class-Incremental Semantic Segmentation | Anqi Zhang et.al. | 2407.09838 | null |
2024-07-13 | Enhancing Semantic Segmentation with Adaptive Focal Loss: A Novel Approach | Md Rakibul Islam et.al. | 2407.09828 | null |
2024-07-13 | 3D Weakly Supervised Semantic Segmentation with 2D Vision-Language Guidance | Xiaoxu Xu et.al. | 2407.09826 | null |
2024-07-12 | FANet: Feature Amplification Network for Semantic Segmentation in Cluttered Background | Muhammad Ali et.al. | 2407.09379 | link |
2024-07-12 | WSESeg: Introducing a Dataset for the Segmentation of Winter Sports Equipment with a Baseline for Interactive Segmentation | Robin Schön et.al. | 2407.09288 | null |
2024-07-12 | A Fair Ranking and New Model for Panoptic Scene Graph Generation | Julian Lorenz et.al. | 2407.09216 | null |
2024-07-12 | Salt & Pepper Heatmaps: Diffusion-informed Landmark Detection Strategy | Julian Wyatt et.al. | 2407.09192 | null |
2024-07-12 | From Easy to Hard: Learning Curricular Shape-aware Features for Robust Panoptic Scene Graph Generation | Hanrong Shi et.al. | 2407.09191 | null |
2024-07-12 | Evaluating the Adversarial Robustness of Semantic Segmentation: Trying Harder Pays Off | Levente Halmosi et.al. | 2407.09150 | link |
2024-07-12 | Cs2K: Class-specific and Class-shared Knowledge Guidance for Incremental Semantic Segmentation | Wei Cong et.al. | 2407.09047 | null |
2024-07-12 | Textual Query-Driven Mask Transformer for Domain Generalized Segmentation | Byeonghyun Pak et.al. | 2407.09033 | null |
2024-07-12 | Global Attention-Guided Dual-Domain Point Cloud Feature Learning for Classification and Segmentation | Zihao Li et.al. | 2407.08994 | null |
2024-07-11 | SLoRD: Structural Low-Rank Descriptors for Shape Consistency in Vertebrae Segmentation | Xin You et.al. | 2407.08555 | null |
2024-07-11 | Explore the Potential of CLIP for Training-Free Open Vocabulary Semantic Segmentation | Tong Shao et.al. | 2407.08268 | null |
2024-07-11 | Enrich the content of the image Using Context-Aware Copy Paste | Qiushi Guo et.al. | 2407.08151 | null |
2024-07-10 | MambaVision: A Hybrid Mamba-Transformer Vision Backbone | Ali Hatamizadeh et.al. | 2407.08083 | link |
2024-07-10 | Interactive Segmentation Model for Placenta Segmentation from 3D Ultrasound images | Hao Li et.al. | 2407.08020 | link |
2024-07-10 | Satellite Image Time Series Semantic Change Detection: Novel Architecture and Analysis of Domain Shift | Elliot Vincent et.al. | 2407.07616 | link |
2024-07-10 | H-FCBFormer Hierarchical Fully Convolutional Branch Transformer for Occlusal Contact Segmentation with Articulating Paper | Ryan Banks et.al. | 2407.07604 | link |
2024-07-11 | Trainable Highly-expressive Activation Functions | Irit Chelly et.al. | 2407.07564 | null |
2024-07-10 | Panoptic Segmentation of Galactic Structures in LSB Images | Felix Richards et.al. | 2407.07494 | null |
2024-07-10 | Deformable-Heatmap-Segmentation for Automobile Visual Perception | Hongyu Jin et.al. | 2407.07493 | null |
2024-07-10 | Exploring the Untouched Sweeps for Conflict-Aware 3D Segmentation Pretraining | Tianfang Sun et.al. | 2407.07465 | null |
2024-07-11 | HAFormer: Unleashing the Power of Hierarchy-Aware Features for Lightweight Semantic Segmentation | Guoan Xu et.al. | 2407.07441 | null |
2024-07-10 | Unified Embedding Alignment for Open-Vocabulary Video Instance Segmentation | Hao Fang et.al. | 2407.07427 | link |
2024-07-09 | ItTakesTwo: Leveraging Peer Representations for Semi-supervised LiDAR Semantic Segmentation | Yuyuan Liu et.al. | 2407.07171 | link |
2024-07-09 | Improved Block Merging for 3D Point Cloud Instance Segmentation | Leon Denis et.al. | 2407.06991 | null |
2024-07-09 | Joint prototype and coefficient prediction for 3D instance segmentation | Remco Royen et.al. | 2407.06958 | null |
2024-07-08 | Training-free CryoET Tomogram Segmentation | Yizhou Zhao et.al. | 2407.06833 | link |
2024-07-09 | CycleSAM: One-Shot Surgical Scene Segmentation using Cycle-Consistent Feature Matching to Prompt SAM | Aditya Murali et.al. | 2407.06795 | null |
2024-07-09 | LuSNAR:A Lunar Segmentation, Navigation and Reconstruction Dataset based on Muti-sensor for Autonomous Exploration | Jiayi Liu et.al. | 2407.06512 | link |
2024-07-08 | Leveraging image captions for selective whole slide image annotation | Jingna Qiu et.al. | 2407.06363 | null |
2024-07-08 | Object-Oriented Material Classification and 3D Clustering for Improved Semantic Perception and Mapping in Mobile Robots | Siva Krishna Ravipati et.al. | 2407.06077 | null |
2024-07-08 | Test-time adaptation for geospatial point cloud semantic segmentation with distinct domain shifts | Puzuo Wang et.al. | 2407.06043 | null |
2024-07-08 | RHRSegNet: Relighting High-Resolution Night-Time Semantic Segmentation | Sarah Elmahdy et.al. | 2407.06016 | link |
2024-07-07 | Semantic Segmentation for Real-World and Synthetic Vehicle's Forward-Facing Camera Images | Tuan T. Nguyen et.al. | 2407.05452 | null |
2024-07-07 | Self-supervised Learning via Cluster Distance Prediction for Operating Room Context Awareness | Idris Hamoud et.al. | 2407.05448 | null |
2024-07-06 | A Study of Test-time Contrastive Concepts for Open-world, Open-vocabulary Semantic Segmentation | Monika Wysoczańska et.al. | 2407.05061 | null |
2024-07-06 | BlessemFlood21: Advancing Flood Analysis with a High-Resolution Georeferenced Dataset for Humanitarian Aid Support | Vladyslav Polushko et.al. | 2407.05007 | null |
2024-07-05 | Explainable Metric Learning for Deflating Data Bias | Emma Andrews et.al. | 2407.04866 | null |
2024-07-05 | Rethinking Visual Prompting for Multimodal Large Language Models with External Knowledge | Yuanze Lin et.al. | 2407.04681 | null |
2024-07-05 | LMSeg: A deep graph message-passing network for efficient and accurate semantic segmentation of large-scale 3D landscape meshes | Zexian Huang et.al. | 2407.04326 | null |
2024-07-04 | Slice-100K: A Multimodal Dataset for Extrusion-based 3D Printing | Anushrut Jignasu et.al. | 2407.04180 | null |
2024-07-04 | Beyond Pixels: Semi-Supervised Semantic Segmentation with a Multi-scale Patch-based Multi-Label Classifier | Prantik Howlader et.al. | 2407.04036 | link |
2024-07-04 | Performance of Medical Image Fusion in High-level Analysis Tasks: A Mutual Enhancement Framework for Unaligned PAT and MRI Image Fusion | Yutian Zhong et.al. | 2407.03992 | link |
2024-07-04 | Relative Difficulty Distillation for Semantic Segmentation | Dong Liang et.al. | 2407.03719 | null |
2024-07-04 | POSTURE: Pose Guided Unsupervised Domain Adaptation for Human Body Part Segmentation | Arindam Dutta et.al. | 2407.03549 | null |
2024-07-03 | A Unified Framework for 3D Scene Understanding | Wei Xu et.al. | 2407.03263 | null |
2024-07-03 | ISWSST: Index-space-wave State Superposition Transformers for Multispectral Remotely Sensed Imagery Semantic Segmentation | Chang Li et.al. | 2407.03033 | null |
2024-07-03 | Context-Aware Video Instance Segmentation | Seunghun Lee et.al. | 2407.03010 | link |
2024-07-03 | ShiftAddAug: Augment Multiplication-Free Tiny Neural Network with Hybrid Computation | Yipin Guo et.al. | 2407.02881 | null |
2024-07-03 | Knowledge Transfer with Simulated Inter-Image Erasing for Weakly Supervised Semantic Segmentation | Tao Chen et.al. | 2407.02768 | null |
2024-07-03 | ADFQ-ViT: Activation-Distribution-Friendly Post-Training Quantization for Vision Transformers | Yanfeng Jiang et.al. | 2407.02763 | null |
2024-07-02 | Open Panoramic Segmentation | Junwei Zheng et.al. | 2407.02685 | null |
2024-07-02 | Holistically-Nested Structure-Aware Graph Neural Network for Road Extraction | Tinghuai Wang et.al. | 2407.02639 | null |
2024-07-02 | Rethinking Data Augmentation for Robust LiDAR Semantic Segmentation in Adverse Weather | Junsung Park et.al. | 2407.02286 | link |
2024-07-02 | MTMamba: Enhancing Multi-Task Dense Scene Understanding by Mamba-Based Decoders | Baijiong Lin et.al. | 2407.02228 | link |
2024-07-02 | Occlusion-Aware Seamless Segmentation | Yihong Cao et.al. | 2407.02182 | link |
2024-07-02 | VRBiom: A New Periocular Dataset for Biometric Applications of HMD | Ketan Kotwal et.al. | 2407.02150 | null |
2024-07-02 | HRSAM: Efficiently Segment Anything in High-Resolution Images | You Huang et.al. | 2407.02109 | null |
2024-07-02 | Label Anything: Multi-Class Few-Shot Semantic Segmentation with Visual Prompts | Pasquale De Marinis et.al. | 2407.02075 | null |
2024-07-02 | LiDAR-based HD Map Localization using Semantic Generalized ICP with Road Marking Detection | Yansong Gong et.al. | 2407.02061 | null |
2024-07-02 | Multi-Grained Contrast for Data-Efficient Unsupervised Representation Learning | Chengchao Shen et.al. | 2407.02014 | link |
2024-07-01 | Label-free Neural Semantic Image Synthesis | Jiayi Wang et.al. | 2407.01790 | null |
2024-07-01 | PanopticRecon: Leverage Open-vocabulary Instance Segmentation for Zero-shot Panoptic Reconstruction | Xuan Yu et.al. | 2407.01349 | null |
2024-06-28 | EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model | Yuxuan Zhang et.al. | 2406.20076 | null |
2024-07-01 | Mobile Robot Oriented Large-Scale Indoor Dataset for Dynamic Scene Understanding | Yifan Tang et.al. | 2406.19791 | null |
2024-06-28 | PM-VIS+: High-Performance Video Instance Segmentation without Video Annotation | Zhangjing Yang et.al. | 2406.19665 | link |
2024-06-28 | Precision matters: Precision-aware ensemble for weakly supervised semantic segmentation | Junsung Park et.al. | 2406.19638 | link |
2024-06-28 | PPTFormer: Pseudo Multi-Perspective Transformer for UAV Segmentation | Deyi Ji et.al. | 2406.19632 | null |
2024-06-27 | Mamba or RWKV: Exploring High-Quality and High-Efficiency Segment Anything Model | Haobo Yuan et.al. | 2406.19369 | null |
2024-06-27 | ProtoGMM: Multi-prototype Gaussian-Mixture-based Domain Adaptation Model for Semantic Segmentation | Nazanin Moradinasab et.al. | 2406.19225 | null |
2024-06-30 | Segment Anything Model for automated image data annotation: empirical studies using text prompts from Grounding DINO | Fuseini Mumuni et.al. | 2406.19057 | null |
2024-06-27 | Divide, Ensemble and Conquer: The Last Mile on Unsupervised Domain Adaptation for On-Board Semantic Segmentation | Tao Lian et.al. | 2406.18809 | null |
2024-07-01 | 3D Feature Distillation with Object-Centric Priors | Georgios Tziafas et.al. | 2406.18742 | null |
2024-06-26 | CAS: Confidence Assessments of classification algorithms for Semantic segmentation of EO data | Nikolaos Dionelis et.al. | 2406.18279 | null |
2024-06-26 | CoDA: Interactive Segmentation and Morphological Analysis of Dendroid Structures Exemplified on Stony Cold-Water Corals | Kira Schmitt et.al. | 2406.18236 | link |
2024-06-26 | The Surprising Effectiveness of Multimodal Large Language Models for Video Moment Retrieval | Meinardus Boris et.al. | 2406.18113 | link |
2024-06-26 | Few-Shot Medical Image Segmentation with High-Fidelity Prototypes | Song Tang et.al. | 2406.18074 | link |
2024-06-25 | Semi-supervised classification of dental conditions in panoramic radiographs using large language model and instance segmentation: A real-world dataset evaluation | Bernardo Silva et.al. | 2406.17915 | null |
2024-06-25 | Local-to-Global Cross-Modal Attention-Aware Fusion for HSI-X Semantic Segmentation | Xuming Zhang et.al. | 2406.17679 | null |
2024-06-25 | DocParseNet: Advanced Semantic Segmentation and OCR Embeddings for Efficient Scanned Document Annotation | Ahmad Mohammadshirazi et.al. | 2406.17591 | link |
2024-06-25 | Principal Component Clustering for Semantic Segmentation in Synthetic Data Generation | Felix Stillger et.al. | 2406.17541 | null |
2024-06-25 | Investigating Self-Supervised Methods for Label-Efficient Learning | Srinivasa Rao Nandam et.al. | 2406.17460 | null |
2024-06-25 | Pseudo Labelling for Enhanced Masked Autoencoders | Srinivasa Rao Nandam et.al. | 2406.17450 | null |
2024-06-25 | Mamba24/8D: Enhancing Global Interaction in Point Clouds via State Space Model | Zhuoyuan Li et.al. | 2406.17442 | null |
2024-06-25 | Implicit-Zoo: A Large-Scale Dataset of Neural Implicit Functions for 2D Images and 3D Scenes | Qi Ma et.al. | 2406.17438 | null |
2024-06-25 | Depth-Guided Semi-Supervised Instance Segmentation | Xin Chen et.al. | 2406.17413 | null |
2024-06-25 | XAMI -- A Benchmark Dataset for Artefact Detection in XMM-Newton Optical Images | Elisabeta-Iulia Dima et.al. | 2406.17323 | link |
2024-06-24 | GMT: Guided Mask Transformer for Leaf Instance Segmentation | Feng Chen et.al. | 2406.17109 | null |
2024-06-24 | Instance Consistency Regularization for Semi-Supervised 3D Instance Segmentation | Yizheng Wu et.al. | 2406.16776 | link |
2024-06-24 | μ-Net: A Deep Learning-Based Architecture for μ-CT Segmentation | Pierangela Bruno et.al. | 2406.16724 | null |
2024-06-24 | GATSBI: An Online GTSP-Based Algorithm for Targeted Surface Bridge Inspection and Defect Detection | Harnaik Dhami et.al. | 2406.16625 | null |
2024-06-24 | LOGCAN++: Local-global class-aware network for semantic segmentation of remote sensing images | Xiaowen Ma et.al. | 2406.16502 | link |
2024-06-24 | Cascade Reward Sampling for Efficient Decoding-Time Alignment | Bolian Li et.al. | 2406.16306 | null |
2024-06-24 | SegNet4D: Effective and Efficient 4D LiDAR Semantic Segmentation in Autonomous Driving Environments | Neng Wang et.al. | 2406.16279 | link |
2024-06-23 | UDHF2-Net: An Uncertainty-diffusion-model-based High-Frequency TransFormer Network for High-accuracy Interpretation of Remotely Sensed Imagery | Pengfei Zhang et.al. | 2406.16129 | null |
2024-06-23 | CholecInstanceSeg: A Tool Instance Segmentation Dataset for Laparoscopic Surgery | Oluwatosin Alabi et.al. | 2406.16039 | null |
2024-06-22 | Fine-grained Background Representation for Weakly Supervised Semantic Segmentation | Xu Yin et.al. | 2406.15755 | null |
2024-06-21 | TraceNet: Segment one thing efficiently | Mingyuan Wu et.al. | 2406.14874 | null |
2024-06-19 | 3D Instance Segmentation Using Deep Learning on RGB-D Indoor Data | Siddiqui Muhammad Yasir et.al. | 2406.14581 | null |
2024-06-20 | Evaluation of Deep Learning Semantic Segmentation for Land Cover Mapping on Multispectral, Hyperspectral and High Spatial Aerial Imagery | Ilham Adi Panuntun et.al. | 2406.14220 | null |
2024-06-20 | Trusting Semantic Segmentation Networks | Samik Some et.al. | 2406.14201 | null |
2024-06-20 | EvSegSNN: Neuromorphic Semantic Segmentation for Event Data | Dalia Hareb et.al. | 2406.14178 | null |
2024-06-20 | Seg-LSTM: Performance of xLSTM for Semantic Segmentation of Remotely Sensed Images | Qinfeng Zhu et.al. | 2406.14086 | link |
2024-06-20 | 2nd Place Solution for MeViS Track in CVPR 2024 PVUW Workshop: Motion Expression guided Video Segmentation | Bin Cao et.al. | 2406.13939 | null |
2024-06-19 | Search-based DNN Testing and Retraining with GAN-enhanced Simulations | Mohammed Oualid Attaoui et.al. | 2406.13359 | null |
2024-06-19 | Deep Learning-Based 3D Instance and Semantic Segmentation: A Review | Siddiqui Muhammad Yasir et.al. | 2406.13308 | null |
2024-06-18 | Reparameterizable Dual-Resolution Network for Real-time Semantic Segmentation | Guoyu Yang et.al. | 2406.12496 | link |
2024-06-18 | Competitive Learning for Achieving Content-specific Filters in Video Coding for Machines | Honglei Zhang et.al. | 2406.12367 | null |
2024-06-18 | Agriculture-Vision Challenge 2024 -- The Runner-Up Solution for Agricultural Pattern Recognition via Class Balancing and Model Ensemble | Wang Liu et.al. | 2406.12271 | null |
2024-06-17 | OoDIS: Anomaly Instance Segmentation Benchmark | Alexey Nekrasov et.al. | 2406.11835 | link |
2024-06-17 | Multimodal Learning To Improve Segmentation With Intraoperative CBCT & Preoperative CT | Maximilian E. Tschuchnig et.al. | 2406.11650 | null |
2024-06-17 | Learning from Exemplars for Interactive Image Segmentation | Kun Li et.al. | 2406.11472 | null |
2024-06-17 | SWCF-Net: Similarity-weighted Convolution and Local-global Fusion for Efficient Large-scale Point Cloud Semantic Segmentation | Zhenchao Lin et.al. | 2406.11441 | link |
2024-06-17 | Enhancing Generalizability of Representation Learning for Data-Efficient 3D Scene Understanding | Yunsong Wang et.al. | 2406.11283 | null |
2024-06-17 | Frozen CLIP: A Strong Backbone for Weakly Supervised Semantic Segmentation | Bingfeng Zhang et.al. | 2406.11189 | null |
2024-06-16 | Sanbao Su et.al. | 2406.11021 | null | |
2024-06-16 | Benchmarking Label Noise in Instance Segmentation: Spatial Noise Matters | Moshe Kimhi et.al. | 2406.10891 | link |
2024-06-16 | PyramidMamba: Rethinking Pyramid Feature Fusion with Selective Space State Model for Semantic Segmentation of Remote Sensing Imagery | Libo Wang et.al. | 2406.10828 | link |
2024-06-15 | GenMM: Geometrically and Temporally Consistent Multimodal Data Generation for Video and LiDAR | Bharat Singh et.al. | 2406.10722 | null |
2024-06-14 | Task-aligned Part-aware Panoptic Segmentation through Joint Object-Part Representations | Daan de Geus et.al. | 2406.10114 | null |
2024-06-14 | ALGM: Adaptive Local-then-Global Token Merging for Efficient Semantic Segmentation with Plain Vision Transformers | Narges Norouzi et.al. | 2406.09936 | null |
2024-06-14 | Label-Efficient Semantic Segmentation of LiDAR Point Clouds in Adverse Weather Conditions | Aldi Piroli et.al. | 2406.09906 | null |
2024-06-14 | Exploring the Benefits of Vision Foundation Models for Unsupervised Domain Adaptation | Brunó B. Englert et.al. | 2406.09896 | link |
2024-06-14 | Open-Vocabulary Semantic Segmentation with Image Embedding Balancing | Xiangheng Shan et.al. | 2406.09829 | link |
2024-06-14 | 4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities | Roman Bachmann et.al. | 2406.09406 | null |
2024-06-13 | Instance-level quantitative saliency in multiple sclerosis lesion segmentation | Federico Spagnolo et.al. | 2406.09335 | null |
2024-06-13 | APSeg: Auto-Prompt Network for Cross-Domain Few-Shot Semantic Segmentation | Weizhao He et.al. | 2406.08372 | null |
2024-06-12 | Dataset Enhancement with Instance-Level Augmentations | Orest Kupyn et.al. | 2406.08249 | link |
2024-06-12 | 2nd Place Solution for MOSE Track in CVPR 2024 PVUW workshop: Complex Video Object Segmentation | Zhensong Xu et.al. | 2406.08192 | null |
2024-06-13 | A |
Lixian Zhang et.al. | 2406.08079 | null |
2024-06-12 | OpenObj: Open-Vocabulary Object-Level Neural Radiance Fields with Fine-Grained Understanding | Yinan Deng et.al. | 2406.08009 | link |
2024-06-12 | SimSAM: Simple Siamese Representations Based Semantic Affinity Matrix for Unsupervised Image Segmentation | Chanda Grover Kamra et.al. | 2406.07986 | link |
2024-06-12 | Small Scale Data-Free Knowledge Distillation | He Liu et.al. | 2406.07876 | link |
2024-06-11 | Beyond Bare Queries: Open-Vocabulary Object Retrieval with 3D Scene Graph | Sergey Linok et.al. | 2406.07113 | null |
2024-06-11 | PanoSSC: Exploring Monocular Panoptic 3D Scene Reconstruction for Autonomous Driving | Yining Shi et.al. | 2406.07037 | null |
2024-06-11 | RS-DFM: A Remote Sensing Distributed Foundation Model for Diverse Downstream Tasks | Zhechao Wang et.al. | 2406.07032 | null |
2024-06-12 | LiSD: An Efficient Multi-Task Learning Framework for LiDAR Segmentation and Detection | Jiahua Xu et.al. | 2406.07023 | null |
2024-06-11 | Dual Thinking and Perceptual Analysis of Deep Learning Models using Human Adversarial Examples | Kailas Dayanandan et.al. | 2406.06967 | link |
2024-06-11 | UVIS: Unsupervised Video Instance Segmentation | Shuaiyi Huang et.al. | 2406.06908 | null |
2024-06-10 | Stable Neighbor Denoising for Source-free Domain Adaptive Segmentation | Dong Zhao et.al. | 2406.06813 | null |
2024-06-10 | Merlin: A Vision Language Foundation Model for 3D Computed Tomography | Louis Blankemeier et.al. | 2406.06512 | null |
2024-06-10 | UMAD: Unsupervised Mask-Level Anomaly Detection for Autonomous Driving | Daniel Bogdoll et.al. | 2406.06370 | null |
2024-06-10 | Diving into Underwater: Segment Anything Model Guided Underwater Salient Instance Segmentation and A Large-scale Dataset | Shijie Lian et.al. | 2406.06039 | link |
2024-06-09 | Scaling Graph Convolutions for Mobile Vision | William Avery et.al. | 2406.05850 | link |
2024-06-09 | Solution for CVPR 2024 UG2+ Challenge Track on All Weather Semantic Segmentation | Jun Yu et.al. | 2406.05837 | null |
2024-06-09 | Convolution and Attention-Free Mamba-based Cardiac Image Segmentation | Abbas Khan et.al. | 2406.05786 | null |
2024-06-09 | Separating the "Chirp" from the "Chat": Self-supervised Visual Grounding of Sound and Language | Mark Hamilton et.al. | 2406.05629 | link |
2024-06-08 | A Two-Stage Adverse Weather Semantic Segmentation Method for WeatherProof Challenge CVPR 2024 Workshop UG2+ | Jianzhao Wang et.al. | 2406.05513 | null |
2024-06-08 | Layered Image Vectorization via Semantic Simplification | Zhenyu Wang et.al. | 2406.05404 | null |
2024-06-08 | 1st Place Winner of the 2024 Pixel-level Video Understanding in the Wild (CVPR'24 PVUW) Challenge in Video Panoptic Segmentation and Best Long Video Consistency of Video Semantic Segmentation | Qingfeng Liu et.al. | 2406.05352 | null |
2024-06-07 | Semantic Segmentation on VSPW Dataset through Masked Video Consistency | Chen Liang et.al. | 2406.04979 | null |
2024-06-07 | Nacala-Roof-Material: Drone Imagery for Roof Detection, Classification, and Segmentation to Support Mosquito-borne Disease Risk Assessment | Venkanna Babu Guthula et.al. | 2406.04949 | null |
2024-06-06 | Characterizing segregation in blast rock piles a deep-learning approach leveraging aerial image analysis | Chengeng Liu et.al. | 2406.04149 | null |
2024-06-07 | 3rd Place Solution for PVUW Challenge 2024: Video Panoptic Segmentation | Ruipu Wu et.al. | 2406.04002 | null |
2024-06-06 | Frequency-based Matcher for Long-tailed Semantic Segmentation | Shan Li et.al. | 2406.03917 | link |
2024-06-07 | Enhanced Semantic Segmentation Pipeline for WeatherProof Dataset Challenge | Nan Zhang et.al. | 2406.03799 | link |
2024-06-06 | Instance Segmentation and Teeth Classification in Panoramic X-rays | Devichand Budagam et.al. | 2406.03747 | link |
2024-06-06 | DSNet: A Novel Way to Use Atrous Convolutions in Semantic Segmentation | Zilu Guo et.al. | 2406.03702 | link |
2024-06-05 | Comparative Benchmarking of Failure Detection Methods in Medical Image Segmentation: Unveiling the Role of Confidence Aggregation | Maximilian Zenk et.al. | 2406.03323 | null |
2024-06-05 | Learning Semantic Traversability with Egocentric Video and Automated Annotation Strategy | Yunho Kim et.al. | 2406.02989 | null |
2024-06-04 | W-RIZZ: A Weakly-Supervised Framework for Relative Traversability Estimation in Mobile Robotics | Andre Schreiber et.al. | 2406.02822 | link |
2024-06-04 | Window to Wall Ratio Detection using SegFormer | Zoe De Simone et.al. | 2406.02706 | link |
2024-06-04 | Open-YOLO 3D: Towards Fast and Accurate Open-Vocabulary 3D Instance Segmentation | Mohamed El Amine Boudjoghra et.al. | 2406.02548 | link |
2024-06-04 | Generative Active Learning for Long-tailed Instance Segmentation | Muzhi Zhu et.al. | 2406.02435 | link |
2024-06-04 | Detecting Endangered Marine Species in Autonomous Underwater Vehicle Imagery Using Point Annotations and Few-Shot Learning | Heather Doig et.al. | 2406.01932 | null |
2024-06-03 | MultiPly: Reconstruction of Multiple People from Monocular Video in the Wild | Zeren Jiang et.al. | 2406.01595 | null |
2024-06-03 | Towards Flexible Interactive Reflection Removal with Human Guidance | Xiao Chen et.al. | 2406.01555 | link |
2024-06-03 | EAGLE: Efficient Adaptive Geometry-based Learning in Cross-view Understanding | Thanh-Dat Truong et.al. | 2406.01429 | null |
2024-06-03 | An expert-driven data generation pipeline for histological images | Roberto Basla et.al. | 2406.01403 | link |
2024-06-03 | TE-NeXt: A LiDAR-Based 3D Sparse Convolutional Network for Traversability Estimation | Antonio Santo et.al. | 2406.01395 | link |
2024-06-03 | MP-PolarMask: A Faster and Finer Instance Segmentation for Concave Images | Ke-Lei Wang et.al. | 2406.01356 | null |
2024-06-03 | ARCH2S: Dataset, Benchmark and Challenges for Learning Exterior Architectural Structures from Point Clouds | Ka Lung Cheung et.al. | 2406.01337 | link |
2024-05-31 | Uncertainty Quantification for Bird's Eye View Semantic Segmentation: Methods and Benchmarks | Linlin Yu et.al. | 2405.20986 | null |
2024-05-31 | Extreme Point Supervised Instance Segmentation | Hyeonjun Lee et.al. | 2405.20729 | null |
2024-05-31 | Revisiting and Maximizing Temporal Knowledge in Semi-supervised Semantic Segmentation | Wooseok Shin et.al. | 2405.20610 | link |
2024-05-30 | P-MSDiff: Parallel Multi-Scale Diffusion for Remote Sensing Image Segmentation | Qi Zhang et.al. | 2405.20443 | null |
2024-05-30 | SemFlow: Binding Semantic Segmentation and Image Synthesis via Rectified Flow | Chaoyang Wang et.al. | 2405.20282 | link |
2024-05-30 | MCDS-VSS: Moving Camera Dynamic Scene Video Semantic Segmentation by Filtering with Self-Supervised Geometry and Motion | Angel Villar-Corrales et.al. | 2405.19921 | link |
2024-05-30 | Open-Set Domain Adaptation for Semantic Segmentation | Seun-An Choe et.al. | 2405.19899 | link |
2024-05-30 | DenseSeg: Joint Learning for Semantic Segmentation and Landmark Detection Using Dense Image-to-Shape Representation | Ron Keuth et.al. | 2405.19746 | link |
2024-05-30 | Twin Deformable Point Convolutions for Point Cloud Semantic Segmentation in Remote Sensing Scenes | Yong-Qiang Mao et.al. | 2405.19735 | null |
2024-05-30 | CRIS: Collaborative Refinement Integrated with Segmentation for Polyp Segmentation | Ankush Gajanan Arudkar et.al. | 2405.19672 | null |
2024-05-29 | Organizing Background to Explore Latent Classes for Incremental Few-shot Semantic Segmentation | Lianlei Shan et.al. | 2405.19568 | null |
2024-05-29 | Enabling Visual Recognition at Radio Frequency | Haowen Lai et.al. | 2405.19516 | null |
2024-05-29 | Reasoning3D -- Grounding and Reasoning in 3D: Fine-Grained Zero-Shot Open-Vocabulary 3D Reasoning Part Segmentation via Large Vision-Language Models | Tianrun Chen et.al. | 2405.19326 | null |
2024-05-29 | A Good Foundation is Worth Many Labels: Label-Efficient Panoptic Segmentation | Niclas Vödisch et.al. | 2405.19035 | link |
2024-05-29 | Parameter-efficient Fine-tuning in Hyperspherical Space for Open-vocabulary Semantic Segmentation | Zelin Peng et.al. | 2405.18840 | null |
2024-05-29 | FocSAM: Delving Deeply into Focused Objects in Segmenting Anything | You Huang et.al. | 2405.18706 | null |
2024-05-28 | Learning to Detour: Shortcut Mitigating Augmentation for Weakly Supervised Semantic Segmentation | JuneHyoung Kwon et.al. | 2405.18148 | null |
2024-05-28 | Edge-guided and Class-balanced Active Learning for Semantic Segmentation of Aerial Images | Lianlei Shan et.al. | 2405.18078 | null |
2024-05-28 | RT-GS2: Real-Time Generalizable Semantic Segmentation for 3D Gaussian Representations of Radiance Fields | Mihnea-Bogdan Jurca et.al. | 2405.18033 | null |
2024-05-28 | DMT-JEPA: Discriminative Masked Targets for Joint-Embedding Predictive Architecture | Shentong Mo et.al. | 2405.17995 | null |
2024-05-28 | Adapting Pre-Trained Vision Models for Novel Instance Detection and Segmentation | Yangxiao Lu et.al. | 2405.17859 | link |
2024-05-28 | The Binary Quantized Neural Network for Dense Prediction via Specially Designed Upsampling and Attention | Xingyu Ding et.al. | 2405.17776 | null |
2024-05-27 | Evaluation of Multi-task Uncertainties in Joint Semantic Segmentation and Monocular Depth Estimation | Steven Landgraf et.al. | 2405.17097 | null |
2024-05-27 | DSU-Net: Dynamic Snake U-Net for 2-D Seismic First Break Picking | Hongtao Wang et.al. | 2405.16980 | null |
2024-05-27 | Collective Perception Datasets for Autonomous Driving: A Comprehensive Review | Sven Teufel et.al. | 2405.16973 | null |
2024-05-27 | Zero-Shot Video Semantic Segmentation based on Pre-Trained Diffusion Models | Qian Wang et.al. | 2405.16947 | null |
2024-05-27 | A re-calibration method for object detection with multi-modal alignment bias in autonomous driving | Zhihang Song et.al. | 2405.16848 | null |
2024-05-26 | Understanding the Effect of using Semantically Meaningful Tokens for Visual Representation Learning | Neha Kalibhat et.al. | 2405.16401 | null |
2024-05-25 | Video Prediction Models as General Visual Encoders | James Maier et.al. | 2405.16382 | null |
2024-05-25 | BOLD: Boolean Logic Deep Learning | Van Minh Nguyen et.al. | 2405.16339 | null |
2024-05-25 | Improving 3D Occupancy Prediction through Class-balancing Loss and Multi-scale Representation | Huizhou Chen et.al. | 2405.16099 | null |
2024-05-25 | Intensity and Texture Correction of Omnidirectional Image Using Camera Images for Indirect Augmented Reality | Hakim Ikebayashi et.al. | 2405.16008 | null |
2024-05-24 | Visualize and Paint GAN Activations | Rudolf Herdt et.al. | 2405.15636 | null |
2024-05-24 | Leveraging knowledge distillation for partial multi-task learning from multiple remote sensing datasets | Hoàng-Ân Lê et.al. | 2405.15394 | null |
2024-05-24 | Autonomous Quilt Spreading for Caregiving Robots | Yuchun Guo et.al. | 2405.15373 | null |
2024-05-24 | U3M: Unbiased Multiscale Modal Fusion Model for Multimodal Semantic Segmentation | Bingyu Li et.al. | 2405.15365 | link |
2024-05-24 | Cross-Domain Few-Shot Semantic Segmentation via Doubly Matching Transformation | Jiayi Chen et.al. | 2405.15265 | null |
2024-05-23 | Mamba-R: Vision Mamba ALSO Needs Registers | Feng Wang et.al. | 2405.14858 | null |
2024-05-23 | Efficient Robot Learning for Perception and Mapping | Niclas Vödisch et.al. | 2405.14688 | null |
2024-05-23 | Segformer++: Efficient Token-Merging Strategies for High-Resolution Semantic Segmentation | Daniel Kienzle et.al. | 2405.14467 | null |
2024-05-23 | MAMBA4D: Efficient Long-Sequence Point Cloud Video Understanding with Disentangled Spatial-Temporal State Space Models | Jiuming Liu et.al. | 2405.14338 | null |
2024-05-23 | Tuning-free Universally-Supervised Semantic Segmentation | Xiaobo Yang et.al. | 2405.14294 | null |
2024-05-23 | SCMix: Stochastic Compound Mixing for Open Compound Domain Adaptation in Semantic Segmentation | Kai Yao et.al. | 2405.14278 | null |
2024-05-23 | Harmony: A Joint Self-Supervised and Weakly-Supervised Framework for Learning General Purpose Visual Representations | Mohammed Baharoon et.al. | 2405.14239 | null |
2024-05-23 | Leveraging Semantic Segmentation Masks with Embeddings for Fine-Grained Form Classification | Taylor Archibald et.al. | 2405.14162 | null |
2024-05-23 | Skip-SCAR: A Modular Approach to ObjectGoal Navigation with Sparsity and Adaptive Skips | Yaotian Liu et.al. | 2405.14154 | null |
2024-05-22 | TS40K: a 3D Point Cloud Dataset of Rural Terrain and Electrical Transmission System | Diogo Lavado et.al. | 2405.13989 | null |
2024-05-21 | Transparency Distortion Robustness for SOTA Image Segmentation Tasks | Volker Knauthe et.al. | 2405.12864 | null |
2024-05-20 | A comprehensive overview of deep learning techniques for 3D point cloud classification and semantic segmentation | Sushmita Sarker et.al. | 2405.11903 | null |
2024-05-20 | Salience-guided Ground Factor for Robust Localization of Delivery Robots in Complex Urban Environments | Jooyong Park et.al. | 2405.11855 | null |
2024-05-20 | Improving the Explain-Any-Concept by Introducing Nonlinearity to the Trainable Surrogate Model | Mounes Zaval et.al. | 2405.11837 | null |
2024-05-20 | Universal Organizer of SAM for Unsupervised Semantic Segmentation | Tingting Li et.al. | 2405.11742 | null |
2024-05-19 | Interpreting a Semantic Segmentation Model for Coastline Detection | Conor O'Sullivan et.al. | 2405.11500 | null |
2024-05-19 | Unifying 3D Vision-Language Understanding via Promptable Queries | Ziyu Zhu et.al. | 2405.11442 | null |
2024-05-18 | PS6D: Point Cloud Based Symmetry-Aware 6D Object Pose Estimation in Robot Bin-Picking | Yifan Yang et.al. | 2405.11257 | null |
2024-05-17 | CM-UNet: Hybrid CNN-Mamba UNet for Remote Sensing Image Semantic Segmentation | Mushui Liu et.al. | 2405.10530 | link |
2024-05-16 | 4D Panoptic Scene Graph Generation | Jingkang Yang et.al. | 2405.10305 | link |
2024-05-16 | Towards Task-Compatible Compressible Representations | Anderson de Andrade et.al. | 2405.10244 | link |
2024-05-16 | DiverGen: Improving Instance Segmentation by Learning Wider Data Distribution with More Diverse Generative Data | Chengxiang Fan et.al. | 2405.10185 | link |
2024-05-16 | An Integrated Framework for Multi-Granular Explanation of Video Summarization | Konstantinos Tsigos et.al. | 2405.10082 | null |
2024-05-16 | A Preprocessing and Postprocessing Voxel-based Method for LiDAR Semantic Segmentation Improvement in Long Distance | Andrea Matteazzi et.al. | 2405.10046 | null |
2024-05-16 | Towards Realistic Incremental Scenario in Class Incremental Semantic Segmentation | Jihwan Kwak et.al. | 2405.09858 | null |
2024-05-15 | Synth-to-Real Unsupervised Domain Adaptation for Instance Segmentation | Guo Yachan et.al. | 2405.09682 | null |
2024-05-14 | CLIP with Quality Captions: A Strong Pretraining for Vision Tasks | Pavan Kumar Anasosalu Vasu et.al. | 2405.08911 | null |
2024-05-14 | Rethinking Scanning Strategies with Vision Mamba in Semantic Segmentation of Remote Sensing Imagery: An Experimental Study | Qinfeng Zhu et.al. | 2405.08493 | null |
2024-05-14 | TEDNet: Twin Encoder Decoder Neural Network for 2D Camera and LiDAR Road Detection | Martín Bayón-Gutiérrez et.al. | 2405.08429 | link |
2024-05-13 | IMAFD: An Interpretable Multi-stage Approach to Flood Detection from time series Multispectral Data | Ziyang Zhang et.al. | 2405.07916 | null |
2024-05-13 | PLUTO: Pathology-Universal Transformer | Dinkar Juyal et.al. | 2405.07905 | null |
2024-05-12 | PotatoGANs: Utilizing Generative Adversarial Networks, Instance Segmentation, and Explainable AI for Enhanced Potato Disease Identification and Classification | Mohammad Shafiul Alam et.al. | 2405.07332 | link |
2024-05-12 | Building a Strong Pre-Training Baseline for Universal 3D Large-Scale Perception | Haoming Chen et.al. | 2405.07201 | null |
2024-05-11 | Global Motion Understanding in Large-Scale Video Object Segmentation | Volodymyr Fedynyak et.al. | 2405.07031 | null |
2024-05-10 | GreedyViG: Dynamic Axial Graph Construction for Efficient Vision GNNs | Mustafa Munir et.al. | 2405.06849 | link |
2024-05-10 | Enhancing Weakly Supervised Semantic Segmentation with Multi-modal Foundation Models: An End-to-End Approach | Elham Ravanbakhsh et.al. | 2405.06586 | null |
2024-05-10 | Semantic and Spatial Adaptive Pixel-level Classifier for Semantic Segmentation | Xiaowen Ma et.al. | 2405.06525 | link |
2024-05-10 | Multi-Target Unsupervised Domain Adaptation for Semantic Segmentation without External Data | Yonghao Xu et.al. | 2405.06502 | null |
2024-05-10 | Multi-level Personalized Federated Learning on Heterogeneous and Long-Tailed Data | Rongyu Zhang et.al. | 2405.06413 | null |
2024-05-10 | Context-Guided Spatial Feature Reconstruction for Efficient Semantic Segmentation | Zhenliang Ni et.al. | 2405.06228 | link |
2024-05-10 | Zero-shot Degree of Ill-posedness Estimation for Active Small Object Change Detection | Koji Takeda et.al. | 2405.06185 | null |
2024-05-10 | Prior-guided Diffusion Model for Cell Segmentation in Quantitative Phase Imaging | Zhuchen Shao et.al. | 2405.06175 | null |
2024-05-09 | Mask-TS Net: Mask Temperature Scaling Uncertainty Calibration for Polyp Segmentation | Yudian Zhang et.al. | 2405.05830 | null |
2024-05-09 | CSA-Net: Channel-wise Spatially Autocorrelated Attention Networks | Nick et.al. | 2405.05755 | null |
2024-05-08 | OpenESS: Event-based Semantic Scene Understanding with Open Vocabularies | Lingdong Kong et.al. | 2405.05259 | link |
2024-05-08 | Multi-Modal Data-Efficient 3D Scene Understanding for Autonomous Driving | Lingdong Kong et.al. | 2405.05258 | link |
2024-05-08 | Weakly-supervised Semantic Segmentation via Dual-stream Contrastive Learning of Cross-image Contextual Information | Qi Lai et.al. | 2405.04913 | null |
2024-05-08 | DeepDamageNet: A two-step deep-learning model for multi-disaster building damage segmentation and classification using satellite imagery | Irene Alisjahbana et.al. | 2405.04800 | null |
2024-05-07 | A Self-Supervised Method for Body Part Segmentation and Keypoint Detection of Rat Images | László Kopácsi et.al. | 2405.04650 | null |
2024-05-07 | FRACTAL: An Ultra-Large-Scale Aerial Lidar Dataset for 3D Semantic Segmentation of Diverse Landscapes | Charles Gaydon et.al. | 2405.04634 | link |
2024-05-07 | AugmenTory: A Fast and Flexible Polygon Augmentation Library | Tanaz Ghahremani et.al. | 2405.04442 | null |
2024-05-07 | A New Dataset and Comparative Study for Aphid Cluster Detection and Segmentation in Sorghum Fields | Raiyan Rahman et.al. | 2405.04305 | null |
2024-05-07 | ELiTe: Efficient Image-to-LiDAR Knowledge Transfer for Semantic Segmentation | Zhibo Zhang et.al. | 2405.04121 | null |
2024-05-07 | Structured Click Control in Transformer-based Interactive Segmentation | Long Xu et.al. | 2405.04009 | link |
2024-05-06 | PTQ4SAM: Post-Training Quantization for Segment Anything | Chengtao Lv et.al. | 2405.03144 | link |
2024-05-04 | MMEarth: Exploring Multi-Modal Pretext Tasks For Geospatial Representation Learning | Vishal Nedungadi et.al. | 2405.02771 | null |
2024-05-04 | Few-Shot Fruit Segmentation via Transfer Learning | Jordan A. James et.al. | 2405.02556 | null |
2024-05-03 | Panoptic-SLAM: Visual SLAM in Dynamic Environments using Panoptic Segmentation | Gabriel Fischer Abati et.al. | 2405.02177 | null |
2024-05-03 | Towards general deep-learning-based tree instance segmentation models | Jonathan Henrich et.al. | 2405.02061 | null |
2024-05-03 | DiffMap: Enhancing Map Segmentation with Map Prior Using Diffusion Model | Peijin Jia et.al. | 2405.02008 | null |
2024-05-02 | Development of Skip Connection in Deep Neural Networks for Computer Vision and Medical Image Analysis: A Survey | Guoping Xu et.al. | 2405.01725 | link |
2024-05-02 | Explainable AI (XAI) in Image Segmentation in Medicine, Industry, and Beyond: A Survey | Rokas Gipiškis et.al. | 2405.01636 | null |
2024-05-02 | CromSS: Cross-modal pre-training with noisy labels for remote sensing image segmentation | Chenying Liu et.al. | 2405.01217 | null |
2024-05-02 | Uncertainty-aware self-training with expectation maximization basis transformation | Zijia Wang et.al. | 2405.01175 | null |
2024-05-01 | GraCo: Granularity-Controllable Interactive Segmentation | Yian Zhao et.al. | 2405.00587 | null |
2024-05-01 | Exploring Self-Supervised Vision Transformers for Deepfake Detection: A Comparative Analysis | Huy H. Nguyen et.al. | 2405.00355 | null |
2024-04-30 | Masked Multi-Query Slot Attention for Unsupervised Object Discovery | Rishav Pramanik et.al. | 2404.19654 | link |
2024-04-30 | UniFS: Universal Few-shot Instance Perception with Point Representations | Sheng Jin et.al. | 2404.19401 | null |
2024-04-30 | DELINE8K: A Synthetic Data Pipeline for the Semantic Segmentation of Historical Documents | Taylor Archibald et.al. | 2404.19259 | null |
2024-04-29 | Swin2-MoSE: A New Single Image Super-Resolution Model for Remote Sensing | Leonardo Rossi et.al. | 2404.18924 | null |
2024-04-29 | IPixMatch: Boost Semi-supervised Semantic Segmentation with Inter-Pixel Relation | Kebin Wu et.al. | 2404.18891 | null |
2024-04-29 | From Density to Geometry: YOLOv8 Instance Segmentation for Reverse Engineering of Optimized Structures | Thomas Rochefort-Beaudoin et.al. | 2404.18763 | null |
2024-04-29 | Towards Long-term Robotics in the Wild | Stephen Hausler et.al. | 2404.18477 | null |
2024-04-29 | Clicks2Line: Using Lines for Interactive Image Segmentation | Chaewon Lee et.al. | 2404.18461 | null |
2024-04-29 | MFP: Making Full Use of Probability Maps for Interactive Image Segmentation | Chaewon Lee et.al. | 2404.18448 | null |
2024-04-28 | Panoptic Segmentation and Labelling of Lumbar Spine Vertebrae using Modified Attention Unet | Rikathi Pal et.al. | 2404.18291 | null |
2024-04-28 | Garbage Segmentation and Attribute Analysis by Robotic Dogs | Nuo Xu et.al. | 2404.18112 | null |
2024-04-27 | Multi-Stream Cellular Test-Time Adaptation of Real-Time Models Evolving in Dynamic Environments | Benoît Gérin et.al. | 2404.17930 | link |
2024-04-27 | GLIMS: Attention-Guided Lightweight Multi-Scale Hybrid Network for Volumetric Semantic Segmentation | Ziya Ata Yazıcı et.al. | 2404.17854 | link |
2024-04-26 | Optimizing Universal Lesion Segmentation: State Space Model-Guided Hierarchical Networks with Feature Importance Adjustment | Kazi Shahriar Sanjid et.al. | 2404.17235 | null |
2024-04-25 | Calculation of Femur Caput Collum Diaphyseal angle for X-Rays images using Semantic Segmentation | Deepak Bhatia et.al. | 2404.17083 | null |
2024-04-25 | Boosting Unsupervised Semantic Segmentation with Principal Mask Proposals | Oliver Hahn et.al. | 2404.16818 | link |
2024-04-25 | Self-Balanced R-CNN for Instance Segmentation | Leonardo Rossi et.al. | 2404.16633 | link |
2024-04-26 | Multi-Scale Representations by Varying Window Attention for Semantic Segmentation | Haotian Yan et.al. | 2404.16573 | link |
2024-04-25 | 360SFUDA++: Towards Source-free UDA for Panoramic Segmentation by Learning Reliable Category Prototypes | Xu Zheng et.al. | 2404.16501 | null |
2024-04-25 | Semantic Segmentation Refiner for Ultrasound Applications with Zero-Shot Foundation Models | Hedda Cohen Indelman et.al. | 2404.16325 | null |
2024-04-25 | Style Adaptation for Domain-adaptive Semantic Segmentation | Ting Li et.al. | 2404.16301 | null |
2024-04-25 | A Multi-objective Optimization Benchmark Test Suite for Real-time Semantic Segmentation | Yifan Zhao et.al. | 2404.16266 | link |
2024-04-24 | Does SAM dream of EIG? Characterizing Interactive Segmenter Performance using Expected Information Gain | Kuan-I Chung et.al. | 2404.16155 | null |
2024-04-24 | 3D Freehand Ultrasound using Visual Inertial and Deep Inertial Odometry for Measuring Patellar Tracking | Russell Buchanan et.al. | 2404.15847 | null |
2024-04-24 | Vision Transformer-based Adversarial Domain Adaptation | Yahan Li et.al. | 2404.15817 | link |
2024-04-23 | PRISM: A Promptable and Robust Interactive Segmentation Model with Visual Prompts | Hao Li et.al. | 2404.15028 | link |
2024-04-23 | Unknown Object Grasping for Assistive Robotics | Elle Miller et.al. | 2404.15001 | null |
2024-04-22 | Surgical-DeSAM: Decoupling SAM for Instrument Segmentation in Robotic Surgery | Yuyang Sheng et.al. | 2404.14040 | link |
2024-04-22 | OccFeat: Self-supervised Occupancy Feature Prediction for Pretraining BEV Segmentation Networks | Sophia Sirko-Galouchenko et.al. | 2404.14027 | null |
2024-04-22 | PM-VIS: High-Performance Box-Supervised Video Instance Segmentation | Zhangjing Yang et.al. | 2404.13863 | null |
2024-04-21 | Semantic-Rearrangement-Based Multi-Level Alignment for Domain Generalized Segmentation | Guanlong Jiao et.al. | 2404.13701 | null |
2024-04-21 | PV-S3: Advancing Automatic Photovoltaic Defect Detection using Semi-Supervised Semantic Segmentation of Electroluminescence Images | Abhishek Jha et.al. | 2404.13693 | null |
2024-04-21 | A Complete System for Automated 3D Semantic-Geometric Mapping of Corrosion in Industrial Environments | Rui Pimentel de Figueiredo et.al. | 2404.13691 | null |
2024-04-21 | LMFNet: An Efficient Multimodal Fusion Approach for Semantic Segmentation in High-Resolution Remote Sensing | Tong Wang et.al. | 2404.13659 | null |
2024-04-21 | Towards Unified Representation of Multi-Modal Pre-training for 3D Understanding via Differentiable Rendering | Ben Fei et.al. | 2404.13619 | null |
2024-04-20 | FisheyeDetNet: Object Detection on Fisheye Surround View Camera Systems for Automated Driving | Ganesh Sistu et.al. | 2404.13443 | null |
2024-04-20 | AMMUNet: Multi-Scale Attention Map Merging for Remote Sensing Image Segmentation | Yang Yang et.al. | 2404.13408 | null |
2024-04-19 | Nuclei Instance Segmentation of Cryosectioned H&E Stained Histological Images using Triple U-Net Architecture | Zarif Ahmed et.al. | 2404.12986 | null |
2024-04-19 | FipTR: A Simple yet Effective Transformer Framework for Future Instance Prediction in Autonomous Driving | Xingtai Gui et.al. | 2404.12867 | null |
2024-04-19 | Foundation Model assisted Weakly Supervised LiDAR Semantic Segmentation | Yilong Chen et.al. | 2404.12861 | null |
2024-04-19 | COIN: Counterfactual inpainting for weakly supervised semantic segmentation for medical images | Dmytro Shvetsov et.al. | 2404.12832 | link |
2024-04-19 | A Point-Based Approach to Efficient LiDAR Multi-Task Perception | Christopher Lang et.al. | 2404.12798 | null |
2024-04-19 | Generalized Few-Shot Meets Remote Sensing: Discovering Novel Classes in Land Cover Mapping via Hybrid Semantic Segmentation Framework | Zhuohong Li et.al. | 2404.12721 | link |
2024-04-19 | Improving Prediction Accuracy of Semantic Segmentation Methods Using Convolutional Autoencoder Based Pre-processing Layers | Hisashi Shimodaira et.al. | 2404.12718 | null |
2024-04-19 | Show and Grasp: Few-shot Semantic Segmentation for Robot Grasping through Zero-shot Foundation Models | Leonardo Barcellona et.al. | 2404.12717 | null |
2024-04-18 | Spot-Compose: A Framework for Open-Vocabulary Object Retrieval and Drawer Manipulation in Point Clouds | Oliver Lemke et.al. | 2404.12440 | null |
2024-04-18 | A Perspective on Deep Vision Performance with Standard Image and Video Codecs | Christoph Reich et.al. | 2404.12330 | null |
2024-04-18 | Performance Evaluation of Segment Anything Model with Variational Prompting for Application to Non-Visible Spectrum Imagery | Yona Falinie A. Gaus et.al. | 2404.12285 | null |
2024-04-18 | Deep Gaussian mixture model for unsupervised image segmentation | Matthias Schwab et.al. | 2404.12252 | null |
2024-04-18 | Observation, Analysis, and Solution: Exploring Strong Lightweight Vision Transformers via Masked Image Modeling Pre-Training | Jin Gao et.al. | 2404.12210 | link |
2024-04-18 | How to Benchmark Vision Foundation Models for Semantic Segmentation? | Tommie Kerssies et.al. | 2404.12172 | null |
2024-04-17 | Mushroom Segmentation and 3D Pose Estimation from Point Clouds using Fully Convolutional Geometric Features and Implicit Pose Encoding | George Retsinas et.al. | 2404.12144 | link |
2024-04-18 | Tendency-driven Mutual Exclusivity for Weakly Supervised Incremental Semantic Segmentation | Chongjie Si et.al. | 2404.11981 | null |
2024-04-18 | The devil is in the object boundary: towards annotation-free instance segmentation using Foundation Models | Cheng Shi et.al. | 2404.11957 | link |
2024-04-18 | Group-On: Boosting One-Shot Segmentation with Supportive Query | Hanjing Zhou et.al. | 2404.11871 | null |
2024-04-17 | Visual Prompting for Generalized Few-shot Segmentation: A Multi-scale Approach | Mir Rayat Imtiaz Hossain et.al. | 2404.11732 | null |
2024-04-17 | A Semantic Segmentation-guided Approach for Ground-to-Aerial Image Matching | Francesco Pro et.al. | 2404.11302 | link |
2024-04-17 | Learning from Unlabelled Data with Transformers: Domain Adaptation for Semantic Segmentation of High Resolution Aerial Images | Nikolaos Dionelis et.al. | 2404.11299 | link |
2024-04-17 | Criteria for Uncertainty-based Corner Cases Detection in Instance Segmentation | Florian Heidecker et.al. | 2404.11266 | null |
2024-04-16 | A Concise Tiling Strategy for Preserving Spatial Context in Earth Observation Imagery | Ellianna Abrahams et.al. | 2404.10927 | link |
2024-04-16 | Vocabulary-free Image Classification and Semantic Segmentation | Alessandro Conti et.al. | 2404.10864 | link |
2024-04-16 | Gasformer: A Transformer-based Architecture for Segmenting Methane Emissions from Livestock in Optical Gas Imaging | Toqi Tahamid Sarker et.al. | 2404.10841 | link |
2024-04-16 | Learning Feature Inversion for Multi-class Anomaly Detection under General-purpose COCO-AD Benchmark | Jiangning Zhang et.al. | 2404.10760 | null |
2024-04-16 | ECLAIR: A High-Fidelity Aerial LiDAR Dataset for Semantic Segmentation | Iaroslav Melekhov et.al. | 2404.10699 | null |
2024-04-16 | Contextrast: Contextual Contrastive Learning for Semantic Segmentation | Changki Sung et.al. | 2404.10633 | null |
2024-04-16 | Label merge-and-split: A graph-colouring approach for memory-efficient brain parcellation | Aaron Kujawa et.al. | 2404.10572 | null |
2024-04-16 | LAECIPS: Large Vision Model Assisted Adaptive Edge-Cloud Collaboration for IoT-based Perception System | Shijing Hu et.al. | 2404.10498 | null |
2024-04-16 | Adversarial Identity Injection for Semantic Face Image Synthesis | Giuseppe Tarollo et.al. | 2404.10408 | null |
2024-04-16 | Domain-Rectifying Adapter for Cross-Domain Few-Shot Segmentation | Jiapeng Su et.al. | 2404.10322 | null |
2024-04-16 | Learnable Prompt for Few-Shot Semantic Segmentation in Remote Sensing Domain | Steve Andreas Immanuel et.al. | 2404.10307 | link |
2024-04-15 | NOISe: Nuclei-Aware Osteoclast Instance Segmentation for Mouse-to-Human Domain Transfer | Sai Kumar Reddy Manne et.al. | 2404.10130 | link |
2024-04-15 | Empowering Embodied Visual Tracking with Visual Foundation Models and Offline RL | Fangwei Zhong et.al. | 2404.09857 | null |
2024-04-15 | In-Context Translation: Towards Unifying Image Recognition, Processing, and Generation | Han Xue et.al. | 2404.09633 | null |
2024-04-15 | The revenge of BiSeNet: Efficient Multi-Task Image Segmentation | Gabriele Rosi et.al. | 2404.09570 | null |
2024-04-15 | kNN-CLIP: Retrieval Enables Training-Free Segmentation on Continually Expanding Large Vocabularies | Zhongrui Gui et.al. | 2404.09447 | null |
2024-04-15 | Human-in-the-Loop Segmentation of Multi-species Coral Imagery | Scarlett Raine et.al. | 2404.09406 | null |
2024-04-14 | Bridging Data Islands: Geographic Heterogeneity-Aware Federated Learning for Collaborative Remote Sensing Semantic Segmentation | Jieyi Tan et.al. | 2404.09292 | null |
2024-04-12 | Structured Model Pruning for Efficient Inference in Computational Pathology | Mohammed Adnan et.al. | 2404.08831 | null |
2024-04-12 | COCONut: Modernizing COCO Segmentation | Xueqing Deng et.al. | 2404.08639 | null |
2024-04-12 | Benchmarking the Cell Image Segmentation Models Robustness under the Microscope Optical Aberrations | Boyuan Peng et.al. | 2404.08549 | null |
2024-04-12 | Analyzing Decades-Long Environmental Changes in Namibia Using Archival Aerial Photography and Deep Learning | Girmaw Abebe Tadesse et.al. | 2404.08544 | null |
2024-04-12 | LaSagnA: Language-based Segmentation Assistant for Complex Queries | Cong Wei et.al. | 2404.08506 | link |
2024-04-12 | Adapting the Segment Anything Model During Usage in Novel Situations | Robin Schön et.al. | 2404.08421 | null |
2024-04-12 | Let It Flow: Simultaneous Optimization of 3D Flow and Object Clustering | Patrik Vacek et.al. | 2404.08363 | null |
2024-04-12 | AdaContour: Adaptive Contour Descriptor with Hierarchical Representation | Tianyu Ding et.al. | 2404.08292 | null |
2024-04-12 | Tackling Ambiguity from Perspective of Uncertainty Inference and Affinity Diversification for Weakly Supervised Semantic Segmentation | Zhiwei Yang et.al. | 2404.08195 | link |
2024-04-12 | Pay Attention to Your Neighbours: Training-Free Open-Vocabulary Semantic Segmentation | Sina Hajimiri et.al. | 2404.08181 | link |
2024-04-11 | Exploiting Object-based and Segmentation-based Semantic Features for Deep Learning-based Indoor Scene Classification | Ricardo Pereira et.al. | 2404.07739 | null |
2024-04-11 | OpenTrench3D: A Photogrammetric 3D Point Cloud Dataset for Semantic Segmentation of Underground Utilities | Lasse H. Hansen et.al. | 2404.07711 | link |
2024-04-11 | ViM-UNet: Vision Mamba for Biomedical Segmentation | Anwai Archit et.al. | 2404.07705 | link |
2024-04-11 | Implicit and Explicit Language Guidance for Diffusion-based Visual Perception | Hefeng Wang et.al. | 2404.07600 | null |
2024-04-11 | Improving Shift Invariance in Convolutional Neural Networks with Translation Invariant Polyphase Sampling | Sourajit Saha et.al. | 2404.07410 | null |
2024-04-10 | AI-Guided Defect Detection Techniques to Model Single Crystal Diamond Growth | Rohan Reddy Mekala et.al. | 2404.07306 | null |
2024-04-10 | RESSCAL3D: Resolution Scalable 3D Semantic Segmentation of Point Clouds | Remco Royen et.al. | 2404.06863 | null |
2024-04-10 | O2V-Mapping: Online Open-Vocabulary Mapping with Neural Implicit Representation | Muer Tie et.al. | 2404.06836 | null |
2024-04-10 | Convolution-based Probability Gradient Loss for Semantic Segmentation | Guohang Shan et.al. | 2404.06704 | null |
2024-04-09 | Training-Free Open-Vocabulary Segmentation with Offline Diffusion-Augmented Prototype Generation | Luca Barsellotti et.al. | 2404.06542 | null |
2024-04-09 | QueSTMaps: Queryable Semantic Topological Maps for 3D Scene Understanding | Yash Mehan et.al. | 2404.06442 | null |
2024-04-09 | DaF-BEVSeg: Distortion-aware Fisheye Camera based Bird's Eye View Segmentation with Occlusion Reasoning | Senthil Yogamani et.al. | 2404.06352 | null |
2024-04-09 | Automated National Urban Map Extraction | Hasan Nasrallah et.al. | 2404.06202 | null |
2024-04-09 | Hierarchical Insights: Exploiting Structural Similarities for Reliable 3D Semantic Segmentation | Mariella Dreissig et.al. | 2404.06124 | null |
2024-04-09 | Improving Facial Landmark Detection Accuracy and Efficiency with Knowledge Distillation | Zong-Wei Hong et.al. | 2404.06029 | null |
2024-04-08 | Evaluating the Efficacy of Cut-and-Paste Data Augmentation in Semantic Segmentation for Satellite Imagery | Ionut M. Motoi et.al. | 2404.05693 | null |
2024-04-08 | AlignZeg: Mitigating Objective Misalignment for Zero-shot Semantic Segmentation | Jiannan Ge et.al. | 2404.05667 | null |
2024-04-08 | Impact of LiDAR visualisations on semantic segmentation of archaeological objects | Raveerat Jaturapitpornchai et.al. | 2404.05512 | null |
2024-04-08 | Rethinking the Spatial Inconsistency in Classifier-Free Diffusion Guidance | Dazhong Shen et.al. | 2404.05384 | link |
2024-04-08 | GPS-free Autonomous Navigation in Cluttered Tree Rows with Deep Semantic Segmentation | Alessandro Navone et.al. | 2404.05338 | null |
2024-04-08 | Human Detection from 4D Radar Data in Low-Visibility Field Conditions | Mikael Skog et.al. | 2404.05307 | null |
2024-04-08 | iVPT: Improving Task-relevant Information Sharing in Visual Prompt Tuning by Cross-layer Dynamic Connection | Nan Zhou et.al. | 2404.05207 | null |
2024-04-08 | UniMix: Towards Domain Adaptive and Generalizable LiDAR Semantic Segmentation in Adverse Weather | Haimei Zhao et.al. | 2404.05145 | null |
2024-04-07 | D2SL: Decouple Defogging and Semantic Learning for Foggy Domain-Adaptive Segmentation | Xuan Sun et.al. | 2404.04807 | null |
2024-04-06 | HawkDrive: A Transformer-driven Visual Perception System for Autonomous Driving in Night Scene | Ziang Guo et.al. | 2404.04653 | link |
2024-04-05 | Sigma: Siamese Mamba Network for Multi-Modal Semantic Segmentation | Zifu Wan et.al. | 2404.04256 | null |
2024-04-05 | Image-Text Co-Decomposition for Text-Supervised Semantic Segmentation | Ji-Jia Wu et.al. | 2404.04231 | null |
2024-04-05 | MarsSeg: Mars Surface Semantic Segmentation with Multi-level Extractor and Connector | Junbo Li et.al. | 2404.04155 | null |
2024-04-04 | Language-Guided Instance-Aware Domain-Adaptive Panoptic Segmentation | Elham Amin Mansour et.al. | 2404.03799 | null |
2024-04-04 | Flattening the Parent Bias: Hierarchical Semantic Segmentation in the Poincaré Ball | Simon Weber et.al. | 2404.03778 | null |
2024-04-04 | OW-VISCap: Open-World Video Instance Segmentation and Captioning | Anwesa Choudhuri et.al. | 2404.03657 | null |
2024-04-04 | Background Noise Reduction of Attention Map for Weakly Supervised Semantic Segmentation | Izumi Fujimori et.al. | 2404.03394 | null |
2024-04-04 | iSeg: Interactive 3D Segmentation via Interactive Attention | Itai Lang et.al. | 2404.03219 | null |
2024-04-04 | CORP: A Multi-Modal Dataset for Campus-Oriented Roadside Perception Tasks | Beibei Wang et.al. | 2404.03191 | null |
2024-04-03 | GPU-Accelerated RSF Level Set Evolution for Large-Scale Microvascular Segmentation | Meher Niger et.al. | 2404.02813 | null |
2024-04-03 | RS-Mamba for Large Remote Sensing Image Dense Prediction | Sijie Zhao et.al. | 2404.02668 | link |
2024-04-03 | A Satellite Band Selection Framework for Amazon Forest Deforestation Detection Task | Eduardo Neto et.al. | 2404.02659 | null |
2024-04-03 | SG-BEV: Satellite-Guided BEV Fusion for Cross-View Semantic Segmentation | Junyan Ye et.al. | 2404.02638 | link |
2024-04-03 | Active learning for efficient annotation in precision agriculture: a use-case on crop-weed semantic segmentation | Bart M. van Marrewijk et.al. | 2404.02580 | null |
2024-04-03 | HENet: Hybrid Encoding for End-to-end Multi-task 3D Perception from Multi-view Cameras | Zhongyu Xia et.al. | 2404.02517 | link |
2024-04-03 | Optimizing traffic signs and lights visibility for the teleoperation of autonomous vehicles through ROI compression | I. Dror et.al. | 2404.02481 | null |
2024-04-03 | RS3Mamba: Visual State Space Model for Remote Sensing Images Semantic Segmentation | Xianping Ma et.al. | 2404.02457 | link |
2024-04-02 | Constrained Robotic Navigation on Preferred Terrains Using LLMs and Speech Instruction: Exploiting the Power of Adverbs | Faraz Lotfi et.al. | 2404.02294 | null |
2024-04-02 | Segment Any 3D Object with Language | Seungjun Lee et.al. | 2404.02157 | null |
2024-04-02 | Multi-Level Label Correction by Distilling Proximate Patterns for Semi-supervised Semantic Segmentation | Hui Xiao et.al. | 2404.02065 | null |
2024-04-01 | What is Point Supervision Worth in Video Instance Segmentation? | Shuaiyi Huang et.al. | 2404.01990 | null |
2024-04-02 | Synthetic Data for Robust Stroke Segmentation | Liam Chalcroft et.al. | 2404.01946 | link |
2024-04-02 | Improving Bird's Eye View Semantic Segmentation by Task Decomposition | Tianhao Zhao et.al. | 2404.01925 | null |
2024-04-02 | Rethinking Annotator Simulation: Realistic Evaluation of Whole-Body PET Lesion Interactive Segmentation Methods | Zdravko Marinov et.al. | 2404.01816 | null |
2024-04-02 | Samba: Semantic Segmentation of Remotely Sensed Images with State Space Model | Qinfeng Zhu et.al. | 2404.01705 | null |
2024-04-02 | Beyond Image Super-Resolution for Image Recognition with Task-Driven Perceptual Loss | Jaeha Kim et.al. | 2404.01692 | null |
2024-04-02 | JRDB-PanoTrack: An Open-world Panoptic Segmentation and Tracking Robotic Dataset in Crowded Human Environments | Duy-Tho Le et.al. | 2404.01686 | null |
2024-04-01 | SUGAR: Pre-training 3D Visual Representations for Robotics | Shizhe Chen et.al. | 2404.01491 | null |
2024-03-29 | ECLIPSE: Efficient Continual Learning in Panoptic Segmentation with Visual Prompt Tuning | Beomyoung Kim et.al. | 2403.20126 | link |
2024-03-29 | Modeling Weather Uncertainty for Multi-weather Co-Presence Estimation | Qi Bi et.al. | 2403.20092 | null |
2024-03-29 | Using Images as Covariates: Measuring Curb Appeal with Deep Learning | Ardyn Nordstrom et.al. | 2403.19915 | null |
2024-03-29 | MambaMixer: Efficient Selective State Space Models with Dual Token and Channel Selection | Ali Behrouz et.al. | 2403.19888 | null |
2024-03-28 | Segmentation Re-thinking Uncertainty Estimation Metrics for Semantic Segmentation | Qitian Ma et.al. | 2403.19826 | null |
2024-04-01 | Efficient 3D Instance Mapping and Localization with Neural Fields | George Tang et.al. | 2403.19797 | null |
2024-03-28 | ENet-21: An Optimized light CNN Structure for Lane Detection | Seyed Rasoul Hosseini et.al. | 2403.19782 | null |
2024-03-29 | Genetic Quantization-Aware Approximation for Non-Linear Operations in Transformers | Pingcheng Dong et.al. | 2403.19591 | link |
2024-03-28 | DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs | Donghyun Kim et.al. | 2403.19588 | link |
2024-03-28 | Learning Multiple Representations with Inconsistency-Guided Detail Regularization for Mask-Guided Matting | Weihao Jiang et.al. | 2403.19213 | null |
2024-03-27 | Lift3D: Zero-Shot Lifting of Any 2D Vision Model to 3D | Mukund Varma T et.al. | 2403.18922 | null |
2024-03-27 | Annolid: Annotate, Segment, and Track Anything You Need | Chen Yang et.al. | 2403.18690 | null |
2024-03-27 | I2CKD : Intra- and Inter-Class Knowledge Distillation for Semantic Segmentation | Ayoub Karine et.al. | 2403.18490 | null |
2024-03-28 | ViTAR: Vision Transformer with Any Resolution | Qihang Fan et.al. | 2403.18361 | null |
2024-03-27 | Generating Diverse Agricultural Data for Vision-Based Farming Applications | Mikolaj Cieslak et.al. | 2403.18351 | null |
2024-03-27 | Road Obstacle Detection based on Unknown Objectness Scores | Chihiro Noguchi et.al. | 2403.18207 | null |
2024-03-26 | Spectral Convolutional Transformer: Harmonizing Real vs. Complex Multi-View Spectral Operators for Vision Transformer | Badri N. Patro et.al. | 2403.18063 | link |
2024-03-26 | The Need for Speed: Pruning Transformers with One Recipe | Samir Khaki et.al. | 2403.17921 | link |
2024-03-26 | Compressed Multi-task embeddings for Data-Efficient Downstream training and inference in Earth Observation | Carlos Gomes et.al. | 2403.17886 | null |
2024-03-26 | PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition | Chenhongyi Yang et.al. | 2403.17695 | link |
2024-03-26 | Integrating Mamba Sequence Model and Hierarchical Upsampling Network for Accurate Semantic Segmentation of Multiple Sclerosis Legion | Kazi Shahriar Sanjid et.al. | 2403.17432 | null |
2024-03-25 | Optimizing LiDAR Placements for Robust Driving Perception in Adverse Conditions | Ye Li et.al. | 2403.17009 | link |
2024-03-25 | DreamLIP: Language-Image Pre-training with Long Captions | Kecheng Zheng et.al. | 2403.17007 | null |
2024-03-25 | TwinLiteNetPlus: A Stronger Model for Real-time Drivable Area and Lane Segmentation | Quang-Huy Che et.al. | 2403.16958 | null |
2024-03-25 | HPL-ESS: Hybrid Pseudo-Labeling for Unsupervised Event-based Semantic Segmentation | Linglin Jing et.al. | 2403.16788 | null |
2024-03-25 | Clustering Propagation for Universal Medical Image Segmentation | Yuhang Ding et.al. | 2403.16646 | null |
2024-03-25 | SatSynth: Augmenting Image-Mask Pairs through Diffusion Models for Aerial Semantic Segmentation | Aysim Toker et.al. | 2403.16605 | null |
2024-03-25 | Self-Supervised Learning for Medical Image Data with Anatomy-Oriented Imaging Planes | Tianwei Zhang et.al. | 2403.16499 | null |
2024-03-25 | GoodSAM: Bridging Domain and Capacity Gaps via Segment Anything Model for Distortion-aware Panoramic Semantic Segmentation | Weiming Zhang et.al. | 2403.16370 | null |
2024-03-24 | AutoInst: Automatic Instance-Based Segmentation of LiDAR 3D Scans | Cedric Perauer et.al. | 2403.16318 | null |
2024-03-24 | Dual-modal Prior Semantic Guided Infrared and Visible Image Fusion for Intelligent Transportation System | Jing Li et.al. | 2403.16227 | null |
2024-03-24 | Segment Anything Model for Road Network Graph Extraction | Congrui Hetang et.al. | 2403.16051 | link |
2024-03-24 | SM2C: Boost the Semi-supervised Segmentation for Medical Image by using Meta Pseudo Labels and Mixed Images | Yifei Wang et.al. | 2403.16009 | null |
2024-03-22 | Semantic Gaussians: Open-Vocabulary Scene Understanding with 3D Gaussian Splatting | Jun Guo et.al. | 2403.15624 | null |
2024-03-22 | A2DMN: Anatomy-Aware Dilated Multiscale Network for Breast Ultrasound Semantic Segmentation | Kyle Lucke et.al. | 2403.15560 | null |
2024-03-22 | InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding | Yi Wang et.al. | 2403.15377 | null |
2024-03-22 | Anytime, Anywhere, Anyone: Investigating the Feasibility of Segment Anything Model for Crowd-Sourcing Medical Image Annotations | Pranav Kulkarni et.al. | 2403.15218 | null |
2024-03-22 | Your Image is My Video: Reshaping the Receptive Field via Image-To-Video Differentiable AutoAugmentation and Fusion | Sofia Casarin et.al. | 2403.15194 | null |
2024-03-22 | IFSENet : Harnessing Sparse Iterations for Interactive Few-shot Segmentation Excellence | Shreyas Chandgothia et.al. | 2403.15089 | null |
2024-03-22 | Towards a Comprehensive, Efficient and Promptable Anatomic Structure Segmentation Model using 3D Whole-body CT Scans | Heng Guo et.al. | 2403.15063 | null |
2024-03-22 | BSNet: Box-Supervised Simulation-assisted Mean Teacher for 3D Instance Segmentation | Jiahao Lu et.al. | 2403.15019 | null |
2024-03-22 | Improve Cross-domain Mixed Sampling with Guidance Training for Adaptive Segmentation | Wenlve Zhou et.al. | 2403.14995 | null |
2024-03-21 | WeatherProof: Leveraging Language Guidance for Semantic Segmentation in Adverse Weather | Blake Gella et.al. | 2403.14874 | null |
2024-03-21 | PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model | Zheng Zhang et.al. | 2403.14598 | link |
2024-03-21 | Learning to Project for Cross-Task Knowledge Distillation | Dylan Auty et.al. | 2403.14494 | null |
2024-03-21 | OA-CNNs: Omni-Adaptive Sparse CNNs for 3D Semantic Segmentation | Bohao Peng et.al. | 2403.14418 | link |
2024-03-21 | Open-Vocabulary Attention Maps with Token Optimization for Semantic Segmentation in Diffusion Models | Pablo Marcos-Manchón et.al. | 2403.14291 | link |
2024-03-21 | OTSeg: Multi-prompt Sinkhorn Attention for Zero-Shot Semantic Segmentation | Kwanyoung Kim et.al. | 2403.14183 | null |
2024-03-21 | Evidential Semantic Mapping in Off-road Environments with Uncertainty-aware Bayesian Kernel Inference | Junyoung Kim et.al. | 2403.14138 | null |
2024-03-21 | Soft Masked Transformer for Point Cloud Processing with Skip Attention-Based Upsampling | Yong He et.al. | 2403.14124 | null |
2024-03-21 | Semantics from Space: Satellite-Guided Thermal Semantic Segmentation Annotation for Aerial Field Robots | Connor Lee et.al. | 2403.14056 | null |
2024-03-20 | When Cars meet Drones: Hyperbolic Federated Learning for Source-Free Domain Adaptation in Adverse Weather | Giulia Rizzoli et.al. | 2403.13762 | null |
2024-03-20 | Next day fire prediction via semantic segmentation | Konstantinos Alexis et.al. | 2403.13545 | null |
2024-03-20 | MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining | Di Wang et.al. | 2403.13430 | link |
2024-03-20 | AMCO: Adaptive Multimodal Coupling of Vision and Proprioception for Quadruped Robot Navigation in Outdoor Environments | Mohamed Elnoor et.al. | 2403.13235 | null |
2024-03-20 | Modeling the Label Distributions for Weakly-Supervised Semantic Segmentation | Linshan Wu et.al. | 2403.13225 | null |
2024-03-19 | Reflectivity Is All You Need!: Advancing LiDAR Semantic Segmentation | Kasi Viswanath et.al. | 2403.13188 | null |
2024-03-19 | As Firm As Their Foundations: Can open-sourced foundation models be used to create adversarial examples for downstream tasks? | Anjun Hu et.al. | 2403.12693 | null |
2024-03-19 | PCT: Perspective Cue Training Framework for Multi-Camera BEV Segmentation | Haruya Ishikawa et.al. | 2403.12530 | null |
2024-03-19 | Semantics, Distortion, and Style Matter: Towards Source-free UDA for Panoramic Segmentation | Xu Zheng et.al. | 2403.12505 | null |
2024-03-19 | CLIP-VIS: Adapting CLIP for Open-Vocabulary Video Instance Segmentation | Wenqi Zhu et.al. | 2403.12455 | link |
2024-03-19 | Multi-Object RANSAC: Efficient Plane Clustering Method in a Clutter | Seunghyeon Lim et.al. | 2403.12449 | null |
2024-03-18 | EffiPerception: an Efficient Framework for Various Perception Tasks | Xinhao Xiang et.al. | 2403.12317 | null |
2024-03-18 | Aerial Lifting: Neural Urban Semantic and Building Instance Lifting from Aerial Imagery | Yuqi Zhang et.al. | 2403.11812 | null |
2024-03-18 | Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation | Wangbo Zhao et.al. | 2403.11808 | null |
2024-03-18 | LSKNet: A Foundation Lightweight Backbone for Remote Sensing | Yuxuan Li et.al. | 2403.11735 | null |
2024-03-18 | TTT-KD: Test-Time Training for 3D Semantic Segmentation through Knowledge Distillation from Foundation Models | Lisa Weijler et.al. | 2403.11691 | null |
2024-03-18 | Better (pseudo-)labels for semi-supervised instance segmentation | François Porcher et.al. | 2403.11675 | null |
2024-03-18 | Synthesizing multi-log grasp poses | Arvid Fälldin et.al. | 2403.11623 | null |
2024-03-18 | OurDB: Ouroboric Domain Bridging for Multi-Target Domain Adaptive Semantic Segmentation | Seungbeom Woo et.al. | 2403.11582 | null |
2024-03-18 | MISS: Memory-efficient Instance Segmentation Framework By Visual Inductive Priors Flow Propagation | Chih-Chung Hsu et.al. | 2403.11576 | null |
2024-03-18 | Augment Before Copy-Paste: Data and Memory Efficiency-Oriented Instance Segmentation Framework for Sport-scenes | Chih-Chung Hsu et.al. | 2403.11572 | null |
2024-03-18 | Circle Representation for Medical Instance Object Segmentation | Juming Xiong et.al. | 2403.11507 | link |
2024-03-18 | MCD: Diverse Large-Scale Multi-Campus Dataset for Robot Perception | Thien-Minh Nguyen et.al. | 2403.11496 | null |
2024-03-18 | Uncertainty-Calibrated Test-Time Model Adaptation without Forgetting | Mingkui Tan et.al. | 2403.11491 | null |
2024-03-18 | ShapeFormer: Shape Prior Visible-to-Amodal Transformer-based Amodal Instance Segmentation | Minh Tran et.al. | 2403.11376 | null |
2024-03-14 | PosSAM: Panoptic Open-vocabulary Segment Anything | Vibashan VS et.al. | 2403.09620 | null |
2024-03-14 | WeakSurg: Weakly supervised surgical instrument segmentation using temporal equivariance and semantic continuity | Qiyuan Wang et.al. | 2403.09551 | null |
2024-03-14 | Annotation Free Semantic Segmentation with Vision Foundation Models | Soroush Seifi et.al. | 2403.09307 | null |
2024-03-14 | StainFuser: Controlling Diffusion for Faster Neural Style Transfer in Multi-Gigapixel Histology Images | Robert Jewsbury et.al. | 2403.09302 | link |
2024-03-14 | Customizing Segmentation Foundation Model via Prompt Learning for Instance Segmentation | Hyung-Il Kim et.al. | 2403.09199 | null |
2024-03-14 | When Semantic Segmentation Meets Frequency Aliasing | Linwei Chen et.al. | 2403.09065 | link |
2024-03-13 | CART: Caltech Aerial RGB-Thermal Dataset in the Wild | Connor Lee et.al. | 2403.08997 | link |
2024-03-13 | SLCF-Net: Sequential LiDAR-Camera Fusion for Semantic Scene Completion using a 3D Recurrent U-Net | Helin Cao et.al. | 2403.08885 | null |
2024-03-13 | Segmentation of Knee Bones for Osteoarthritis Assessment: A Comparative Analysis of Supervised, Few-Shot, and Zero-Shot Learning Approaches | Yun Xin Teoh et.al. | 2403.08761 | null |
2024-03-13 | Real-time 3D semantic occupancy prediction for autonomous vehicles using memory-efficient sparse convolution | Samuel Sze et.al. | 2403.08748 | null |
2024-03-13 | Semantic Segmentation of Solar Radio Spikes at Low Frequencies | Pearse C. Murphy et.al. | 2403.08546 | null |
2024-03-13 | Language-Driven Visual Consensus for Zero-Shot Semantic Segmentation | Zicheng Zhang et.al. | 2403.08426 | null |
2024-03-13 | LIX: Implicitly Infusing Spatial Geometric Prior Knowledge into Visual Semantic Segmentation for Autonomous Driving | Sicen Guo et.al. | 2403.08215 | null |
2024-03-13 | Multiscale Low-Frequency Memory Network for Improved Feature Extraction in Convolutional Neural Networks | Fuzhi Wu et.al. | 2403.08157 | link |
2024-03-12 | Mitigating the Impact of Attribute Editing on Face Recognition | Sudipta Banerjee et.al. | 2403.08092 | null |
2024-03-12 | Hunting Attributes: Context Prototype-Aware Learning for Weakly Supervised Semantic Segmentation | Feilong Tang et.al. | 2403.07630 | link |
2024-03-12 | PeLK: Parameter-efficient Large Kernel ConvNets with Peripheral Convolution | Honghao Chen et.al. | 2403.07589 | null |
2024-03-12 | Open-World Semantic Segmentation Including Class Similarity | Matteo Sodano et.al. | 2403.07532 | null |
2024-03-11 | Average Calibration Error: A Differentiable Loss for Improved Reliability in Image Segmentation | Theodore Barfoot et.al. | 2403.06759 | link |
2024-03-11 | Forest Inspection Dataset for Aerial Semantic Segmentation and Depth Estimation | Bianca-Cerasela-Zelia Blaga et.al. | 2403.06621 | link |
2024-03-11 | OMH: Structured Sparsity via Optimally Matched Hierarchy for Unsupervised Semantic Segmentation | Baran Ozaydin et.al. | 2403.06546 | null |
2024-03-11 | 3D Semantic Segmentation-Driven Representations for 3D Object Detection | Hayeon O et.al. | 2403.06501 | link |
2024-03-11 | Point Mamba: A Novel Point Cloud Backbone Based on State Space Model with Octree-Based Ordering Strategy | Jiuming Liu et.al. | 2403.06467 | link |
2024-03-11 | Towards the Uncharted: Density-Descending Feature Perturbation for Semi-supervised Semantic Segmentation | Xiaoyang Wang et.al. | 2403.06462 | null |
2024-03-11 | Refining Segmentation On-the-Fly: An Interactive Framework for Point Cloud Semantic Segmentation | Peng Zhang et.al. | 2403.06401 | null |
2024-03-10 | Style Blind Domain Generalized Semantic Segmentation via Covariance Alignment and Semantic Consistence Contrastive Learning | Woo-Jin Ahn et.al. | 2403.06122 | link |
2024-03-09 | Mask-Enhanced Segment Anything Model for Tumor Lesion Semantic Segmentation | Hairong Shi et.al. | 2403.05912 | null |
2024-03-09 | Segmentation Guided Sparse Transformer for Under-Display Camera Image Restoration | Jingyun Xue et.al. | 2403.05906 | null |
2024-03-08 | Attention-guided Feature Distillation for Semantic Segmentation | Amir M. Mansourian et.al. | 2403.05451 | link |
2024-03-08 | Generalized Correspondence Matching via Flexible Hierarchical Refinement and Patch Descriptor Distillation | Yu Han et.al. | 2403.05388 | null |
2024-03-08 | Frequency-Adaptive Dilated Convolution for Semantic Segmentation | Linwei Chen et.al. | 2403.05369 | link |
2024-03-08 | Embedded Deployment of Semantic Segmentation in Medicine through Low-Resolution Inputs | Erik Ostrowski et.al. | 2403.05340 | null |
2024-03-08 | LVIC: Multi-modality segmentation by Lifting Visual Info as Cue | Zichao Dong et.al. | 2403.05159 | null |
2024-03-07 | SAM-PD: How Far Can SAM Take Us in Tracking and Segmenting Anything in Videos by Prompt Denoising | Tao Zhou et.al. | 2403.04194 | link |
2024-03-06 | ECAP: Extensive Cut-and-Paste Augmentation for Unsupervised Domain Adaptive Semantic Segmentation | Erik Brorsson et.al. | 2403.03854 | link |
2024-03-06 | Multi-Grained Cross-modal Alignment for Learning Open-vocabulary Semantic Segmentation from Text Supervision | Yajie Liu et.al. | 2403.03707 | null |
2024-03-06 | Causal Prototype-inspired Contrast Adaptation for Unsupervised Domain Adaptive Semantic Segmentation of High-resolution Remote Sensing Imagery | Jingru Zhu et.al. | 2403.03704 | null |
2024-03-06 | GSNeRF: Generalizable Semantic Neural Radiance Fields with Enhanced 3D Scene Understanding | Zi-Ting Chou et.al. | 2403.03608 | null |
2024-03-06 | Multi-task Learning for Real-time Autonomous Driving Leveraging Task-adaptive Attention Generator | Wonhyeok Choi et.al. | 2403.03468 | null |
2024-03-05 | CenterDisks: Real-time instance segmentation with disk covering | Katia Jodogne-Del Litto et.al. | 2403.03296 | link |
2024-03-05 | Improved LiDAR Odometry and Mapping using Deep Semantic Segmentation and Novel Outliers Detection | Mohamed Afifi et.al. | 2403.03111 | null |
2024-03-05 | ActiveAD: Planning-Oriented Active Learning for End-to-End Autonomous Driving | Han Lu et.al. | 2403.02877 | null |
2024-03-05 | DDF: A Novel Dual-Domain Image Fusion Strategy for Remote Sensing Image Semantic Segmentation with Unsupervised Domain Adaptation | Lingyan Ran et.al. | 2403.02784 | null |
2024-03-05 | Learning without Exact Guidance: Updating Large-scale High-resolution Land Cover Maps from Low-resolution Historical Labels | Zhuohong Li et.al. | 2403.02746 | null |
2024-03-05 | FastOcc: Accelerating 3D Occupancy Prediction by Fusing the 2D Bird's-Eye View and Perspective View | Jiawei Hou et.al. | 2403.02710 | null |
2024-03-05 | Deep Common Feature Mining for Efficient Video Semantic Segmentation | Yaoyan Zheng et.al. | 2403.02689 | null |
2024-03-04 | Self-Supervised Facial Representation Learning with Facial Region Awareness | Zheng Gao et.al. | 2403.02138 | null |
2024-03-04 | Semi-Supervised Semantic Segmentation Based on Pseudo-Labels: A Survey | Lingyan Ran et.al. | 2403.01909 | null |
2024-03-04 | Map-aided annotation for pole base detection | Benjamin Missaoui et.al. | 2403.01868 | null |
2024-03-04 | AllSpark: Reborn Labeled Features from Unlabeled in Transformer for Semi-Supervised Semantic Segmentation | Haonan Wang et.al. | 2403.01818 | link |
2024-03-02 | Benchmarking Segmentation Models with Mask-Preserved Attribute Editing | Zijin Yin et.al. | 2403.01231 | link |
2024-03-02 | Boosting Box-supervised Instance Segmentation with Pseudo Depth | Xinyi Yu et.al. | 2403.01214 | null |
2024-03-02 | Auxiliary Tasks Enhanced Dual-affinity Learning for Weakly Supervised Semantic Segmentation | Lian Xu et.al. | 2403.01156 | null |
2024-03-01 | Rethinking Few-shot 3D Point Cloud Semantic Segmentation | Zhaochong An et.al. | 2403.00592 | link |
2024-03-01 | Small, Versatile and Mighty: A Range-View Perception Framework | Qiang Meng et.al. | 2403.00325 | null |
2024-03-01 | YOLO-MED : Multi-Task Interaction Network for Biomedical Images | Suizhi Huang et.al. | 2403.00245 | null |
2024-02-29 | FusionVision: A comprehensive approach of 3D object reconstruction and segmentation from RGB-D cameras using YOLO and fast segment anything | Safouane El Ghazouali et.al. | 2403.00175 | link |
2024-02-29 | Leveraging AI Predicted and Expert Revised Annotations in Interactive Segmentation: Continual Tuning or Full Training? | Tiezheng Zhang et.al. | 2402.19423 | null |
2024-03-01 | PEM: Prototype-based Efficient MaskFormer for Image Segmentation | Niccolò Cavagnero et.al. | 2402.19422 | link |
2024-02-29 | RSAM-Seg: A SAM-based Approach with Prior Knowledge Integration for Remote Sensing Image Semantic Segmentation | Jie Zhang et.al. | 2402.19004 | null |
2024-02-28 | Spatial Coherence Loss for Salient and Camouflaged Object Detection and Beyond | Ziyun Yang et.al. | 2402.18698 | null |
2024-02-29 | Separate and Conquer: Decoupling Co-occurrence via Decomposition and Representation for Weakly Supervised Semantic Segmentation | Zhiwei Yang et.al. | 2402.18467 | link |
2024-02-29 | A Modular System for Enhanced Robustness of Multimedia Understanding Networks via Deep Parametric Estimation | Francesco Barbato et.al. | 2402.18402 | null |
2024-02-28 | Enhancing Roadway Safety: LiDAR-based Tree Clearance Analysis | Miriam Louise Carnot et.al. | 2402.18309 | null |
2024-02-28 | Feature Denoising For Low-Light Instance Segmentation Using Weighted Non-Local Blocks | Joanne Lin et.al. | 2402.18307 | null |
2024-02-28 | Self-Supervised Learning in Electron Microscopy: Towards a Foundation Model for Advanced Image Analysis | Bashir Kazimi et.al. | 2402.18286 | null |
2024-02-28 | PRCL: Probabilistic Representation Contrastive Learning for Semi-Supervised Semantic Segmentation | Haoyu Xie et.al. | 2402.18117 | null |
2024-02-28 | Spannotation: Enhancing Semantic Segmentation for Autonomous Navigation with Efficient Image Annotation | Samuel O. Folorunsho et.al. | 2402.18084 | link |
2024-02-27 | Weakly Supervised Co-training with Swapping Assignments for Semantic Segmentation | Xinyu Yang et.al. | 2402.17891 | link |
2024-02-27 | Mitigating Distributional Shift in Semantic Segmentation via Uncertainty Estimation from Unlabelled Data | David S. W. Williams et.al. | 2402.17653 | null |
2024-02-27 | Masked Gamma-SSL: Learning Uncertainty Estimation via Masked Image Modeling | David S. W. Williams et.al. | 2402.17622 | null |
Publish Date | Title | Authors | Code | |
---|---|---|---|---|
2024-08-29 | Mismatched: Evaluating the Limits of Image Matching Approaches and Benchmarks | Sierra Bonilla et.al. | 2408.16445 | link |
2024-08-29 | Estimating Dynamic Flow Features in Groups of Tracked Objects | Tanner D. Harms et.al. | 2408.16190 | null |
2024-08-28 | ConsistencyTrack: A Robust Multi-Object Tracker with a Generation Strategy of Consistency Model | Lifan Jiang et.al. | 2408.15548 | link |
2024-08-25 | Camouflaged_Object_Tracking__A_Benchmark | Xiaoyu Guo et.al. | 2408.13877 | null |
2024-08-23 | MCTR: Multi Camera Tracking Transformer | Alexandru Niculescu-Mizil et.al. | 2408.13243 | null |
2024-08-23 | BoostTrack++: using tracklet information to detect more objects in multiple object tracking | Vukašin Stanojević et.al. | 2408.13003 | link |
2024-08-22 | BankTweak: Adversarial Attack against Multi-Object Trackers by Manipulating Feature Banks | Woojin Shin et.al. | 2408.12727 | null |
2024-08-22 | BihoT: A Large-Scale Dataset and Benchmark for Hyperspectral Camouflaged Object Tracking | Hanzheng Wang et.al. | 2408.12232 | null |
2024-08-21 | CHOTA: A Higher Order Accuracy Metric for Cell Tracking | Timo Kaiser et.al. | 2408.11571 | link |
2024-08-21 | Low-Light Object Tracking: A Benchmark | Pengzhi Zhong et.al. | 2408.11463 | null |
2024-08-20 | MambaEVT: Event Stream based Visual Object Tracking using State Space Model | Xiao Wang et.al. | 2408.10487 | link |
2024-08-17 | GSLAMOT: A Tracklet and Query Graph-based Simultaneous Locating, Mapping, and Multiple Object Tracking System | Shuo Wang et.al. | 2408.09191 | null |
2024-08-17 | MambaTrack: A Simple Baseline for Multiple Object Tracking with State Space Model | Changcheng Xiao et.al. | 2408.09178 | null |
2024-08-14 | Panacea+: Panoramic and Controllable Video Generation for Autonomous Driving | Yuqing Wen et.al. | 2408.07605 | null |
2024-08-14 | RTAT: A Robust Two-stage Association Tracker for Multi-Object Tracking | Song Guo et.al. | 2408.07344 | null |
2024-08-13 | Object Tracking Incorporating Transfer Learning into Unscented and Cubature Kalman Filters | Omar Alotaibi et.al. | 2408.07157 | null |
2024-08-12 | FruitNeRF: A Unified Neural Radiance Field based Fruit Counting Framework | Lukas Meyer et.al. | 2408.06190 | link |
2024-08-12 | Toward Pedestrian Head Tracking: A Benchmark Dataset and an Information Fusion Network | Kailai Sun et.al. | 2408.05877 | null |
2024-08-09 | Mesh-based Object Tracking for Dynamic Semantic 3D Scene Graphs via Ray Tracing | Lennart Niecksch et.al. | 2408.04979 | null |
2024-08-06 | Quantum Imaging Using Spatially Entangled Photon Pairs from a Nonlinear Metasurface | Jinyong Ma et.al. | 2408.02903 | null |
2024-08-05 | VoxelTrack: Exploring Voxel Representation for 3D Point Cloud Object Tracking | Yuxuan Lu et.al. | 2408.02263 | null |
2024-08-04 | 3D Single-object Tracking in Point Clouds with High Temporal Variation | Qiao Wu et.al. | 2408.02049 | null |
2024-08-03 | SiamMo: Siamese Motion-Centric 3D Object Tracking | Yuxiang Yang et.al. | 2408.01688 | link |
2024-08-02 | Visible-Thermal Multiple Object Tracking: Large-scale Video Dataset and Progressive Fusion Approach | Yabin Zhu et.al. | 2408.00969 | link |
2024-08-05 | U2UData: A Large-scale Cooperative Perception Dataset for Swarm UAVs Autonomous Flight | Tongtong Feng et.al. | 2408.00606 | null |
2024-08-01 | A Batch Update Using Multiplicative Noise Modelling for Extended Object Tracking | Christian Gramsch et.al. | 2408.00417 | null |
2024-07-30 | SharkTrack: an accurate, generalisable software for streamlining shark and ray underwater video analysis | Filippo Varini et.al. | 2407.20623 | null |
2024-07-29 | MEVDT: Multi-Modal Event-Based Vehicle Detection and Tracking Dataset | Zaid A. El Shair et.al. | 2407.20446 | null |
2024-07-28 | Progressive Domain Adaptation for Thermal Infrared Object Tracking | Qiao Li et.al. | 2407.19430 | null |
2024-08-05 | Leveraging Foundation Models via Knowledge Distillation in Multi-Object Tracking: Distilling DINOv2 Features to FairMOT | Niels G. Faber et.al. | 2407.18288 | null |
2024-07-20 | CORT: Class-Oriented Real-time Tracking for Embedded Systems | Edoardo Cittadini et.al. | 2407.17521 | null |
2024-07-23 | 3D-UGCN: A Unified Graph Convolutional Network for Robust 3D Human Pose Estimation from Monocular RGB Images | Jie Zhao et.al. | 2407.16137 | null |
2024-07-21 | Multiple Object Detection and Tracking in Panoramic Videos for Cycling Safety Analysis | Jingwei Guo et.al. | 2407.15199 | link |
2024-07-19 | Temporal Correlation Meets Embedding: Towards a 2nd Generation of JDE-based Real-Time Multi-Object Tracking | Yunfei Zhang et.al. | 2407.14086 | null |
2024-07-19 | OCTrack: Benchmarking the Open-Corpus Multi-Object Tracking | Zekun Qian et.al. | 2407.14047 | null |
2024-07-18 | Boosting Online 3D Multi-Object Tracking through Camera-Radar Cross Check | Sheng-Yao Kuan et.al. | 2407.13937 | null |
2024-07-17 | Strawberry detection and counting based on YOLOv7 pruning and information based tracking algorithm | Shiyu Liu et.al. | 2407.12614 | null |
2024-07-16 | VideoClusterNet: Self-Supervised and Adaptive Clustering For Videos | Devesh Walawalkar et.al. | 2407.12214 | null |
2024-07-15 | Effective Motion Modeling for UAV-platform Multiple Object Tracking with Re-Margin Loss | Mufeng Yao et.al. | 2407.10485 | null |
2024-07-16 | Lost and Found: Overcoming Detector Failures in Online Multi-Object Tracking | Lorenzo Vaquero et.al. | 2407.10151 | link |
2024-07-12 | DroneMOT: Drone-based Multi-Object Tracking Considering Detection Difficulties and Simultaneous Moving of Drones and Objects | Peng Wang et.al. | 2407.09051 | null |
2024-07-11 | Manipulating a Tetris-Inspired 3D Video Representation | Mihir Godbole et.al. | 2407.08885 | null |
2024-07-11 | Visual Multi-Object Tracking with Re-Identification and Occlusion Handling using Labeled Random Finite Sets | Linh Van Ma et.al. | 2407.08872 | null |
2024-07-11 | CommRad: Context-Aware Sensing-Driven Millimeter-Wave Networks | Ish Kumar Jain et.al. | 2407.08817 | null |
2024-07-10 | Deep Learning-Based Robust Multi-Object Tracking via Fusion of mmWave Radar and Camera Sensors | Lei Cheng et.al. | 2407.08049 | null |
2024-07-08 | GeoWATCH for Detecting Heavy Construction in Heterogeneous Time Series of Satellite Images | Jon Crall et.al. | 2407.06337 | null |
2024-07-07 | Addressing single object tracking in satellite imagery through prompt-engineered solutions | Athena Psalta et.al. | 2407.05518 | null |
2024-07-09 | P2P: Part-to-Part Motion Cues Guide a Strong Tracking Framework for LiDAR Point Clouds | Jiahao Nie et.al. | 2407.05238 | link |
2024-07-06 | VIPS-Odom: Visual-Inertial Odometry Tightly-coupled with Parking Slots for Autonomous Parking | Xuefeng Jiang et.al. | 2407.05017 | null |
2024-07-05 | TF-SASM: Training-free Spatial-aware Sparse Memory for Multi-object Tracking | Thuc Nguyen-Quang et.al. | 2407.04327 | null |
2024-07-08 | SSP-GNN: Learning to Track via Bilevel Optimization | Griffin Golias et.al. | 2407.04308 | null |
2024-07-05 | FeatureSORT: Essential Features for Effective Tracking | Hamidreza Hashempoor et.al. | 2407.04249 | null |
2024-07-04 | Attention Normalization Impacts Cardinality Generalization in Slot Attention | Markus Krimmel et.al. | 2407.04170 | null |
2024-07-04 | TrackPGD: A White-box Attack using Binary Masks against Robust Transformer Trackers | Fatemeh Nourilenjan Nokabadi et.al. | 2407.03946 | null |
2024-07-03 | Applying Extended Object Tracking for Self-Localization of Roadside Radar Sensors | Longfei Han et.al. | 2407.03084 | null |
2024-07-02 | FlowTrack: Point-level Flow Network for 3D Single Object Tracking | Shuo Li et.al. | 2407.01959 | null |
2024-07-02 | The Solution for the ICCV 2023 Perception Test Challenge 2023 -- Task 6 -- Grounded videoQA | Hailiang Zhang et.al. | 2407.01907 | null |
2024-06-30 | DroBoost: An Intelligent Score and Model Boosting Method for Drone Detection | Ogulcan Eryuksel et.al. | 2407.00830 | null |
2024-06-30 | Engineering an Efficient Object Tracker for Non-Linear Motion | Momir Adžemović et.al. | 2407.00738 | null |
2024-06-28 | PoliFormer: Scaling On-Policy RL with Transformers Results in Masterful Navigators | Kuo-Hao Zeng et.al. | 2406.20083 | null |
2024-06-28 | eMoE-Tracker: Environmental MoE-based Transformer for Robust Event-guided Object Tracking | Yucheng Chen et.al. | 2406.20024 | null |
2024-06-28 | StreamMOTP: Streaming and Unified Framework for Joint 3D Multi-Object Tracking and Trajectory Prediction | Jiaheng Zhuang et.al. | 2406.19844 | null |
2024-06-28 | Basketball-SORT: An Association Method for Complex Multi-object Occlusion Problems in Basketball Multi-object Tracking | Qingrui Hu et.al. | 2406.19655 | null |
2024-06-26 | BiTrack: Bidirectional Offline 3D Multi-Object Tracking Using Camera-LiDAR Data | Kemiao Huang et.al. | 2406.18414 | link |
2024-06-24 | POPCat: Propagation of particles for complex annotation tasks | Adam Srebrnjak Yang et.al. | 2406.17183 | null |
2024-06-24 | A Certifiable Algorithm for Simultaneous Shape Estimation and Object Tracking | Lorenzo Shaikewitz et.al. | 2406.16837 | link |
2024-06-24 | The Progression of Transformers from Language to Vision to MOT: A Literature Review on Multi-Object Tracking with Transformers | Abhi Kamboj et.al. | 2406.16784 | null |
2024-06-21 | LU2Net: A Lightweight Network for Real-time Underwater Image Enhancement | Haodong Yang et.al. | 2406.14973 | null |
2024-06-22 | Velocity Analysis of Moving Objects in Earth Observation Satellite Images Using Multi-Spectral Push Broom Scanning | Eric Keto et.al. | 2406.13710 | null |
2024-06-19 | Hierarchical IoU Tracking based on Interval | Yunhao Du et.al. | 2406.13271 | null |
2024-06-19 | Towards Robust Evaluation: A Comprehensive Taxonomy of Datasets and Metrics for Open Domain Question Answering in the Era of Large Language Models | Akchay Srivastava et.al. | 2406.13232 | null |
2024-06-17 | Deep HM-SORT: Enhancing Multi-Object Tracking in Sports with Deep Features, Harmonic Mean, and Expansion IOU | Matias Gran-Henriksen et.al. | 2406.12081 | null |
2024-06-17 | VideoVista: A Versatile Benchmark for Video Understanding and Reasoning | Yunxin Li et.al. | 2406.11303 | null |
2024-06-14 | Understanding Pedestrian Movement Using Urban Sensing Technologies: The Promise of Audio-based Sensors | Chaeyeon Han et.al. | 2406.09998 | null |
2024-06-14 | Robust compressive tracking via online weighted multiple instance learning | Sandeep Singh Sengar et.al. | 2406.09914 | null |
2024-06-13 | Introducing HOT3D: An Egocentric Dataset for 3D Hand and Object Tracking | Prithviraj Banerjee et.al. | 2406.09598 | null |
2024-06-12 | LaMOT: Language-Guided Multi-Object Tracking | Yunhao Li et.al. | 2406.08324 | link |
2024-06-12 | Vessel Re-identification and Activity Detection in Thermal Domain for Maritime Surveillance | Yasod Ginige et.al. | 2406.08294 | null |
2024-06-11 | Watching Swarm Dynamics from Above: A Framework for Advanced Object Tracking in Drone Videos | Duc Pham et.al. | 2406.07680 | null |
2024-06-11 | Haptic Repurposing with GenAI | Haoyu Wang et.al. | 2406.07228 | null |
2024-06-11 | UVIS: Unsupervised Video Instance Segmentation | Shuaiyi Huang et.al. | 2406.06908 | null |
2024-06-09 | ControlLoc: Physical-World Hijacking Attack on Visual Perception in Autonomous Driving | Chen Ma et.al. | 2406.05810 | null |
2024-06-09 | SlowPerception: Physical-World Latency Attack against Visual Perception in Autonomous Driving | Chen Ma et.al. | 2406.05800 | null |
2024-06-07 | Bootstrapping Referring Multi-Object Tracking | Yani Zhang et.al. | 2406.05039 | link |
2024-06-07 | Multi-Granularity Language-Guided Multi-Object Tracking | Yuhao Li et.al. | 2406.04844 | link |
2024-06-06 | Matching Anything by Segmenting Anything | Siyuan Li et.al. | 2406.04221 | link |
2024-06-06 | ActionReasoningBench: Reasoning about Actions with and without Ramification Constraints | Divij Handa et.al. | 2406.04046 | null |
2024-06-04 | UA-Track: Uncertainty-Aware End-to-End 3D Multi-Object Tracking | Lijun Zhou et.al. | 2406.02147 | null |
2024-06-03 | Reproducibility Study on Adversarial Attacks Against Robust Transformer Trackers | Fatemeh Nourilenjan Nokabadi et.al. | 2406.01765 | link |
2024-06-03 | Prototypical Transformer as Unified Motion Learners | Cheng Han et.al. | 2406.01559 | null |
2024-06-03 | Convolutional Unscented Kalman Filter for Multi-Object Tracking with Outliers | Shiqi Liu et.al. | 2406.01380 | null |
2024-06-03 | Multi-Object Tracking based on Imaging Radar 3D Object Detection | Patrick Palmer et.al. | 2406.01011 | null |
2024-06-01 | Learning to Approximate Particle Smoothing Trajectories via Diffusion Generative Models | Ella Tamir et.al. | 2406.00561 | null |
2024-06-01 | Towards Generalizable Multi-Object Tracking | Zheng Qin et.al. | 2406.00429 | link |
2024-05-30 | WebUOT-1M: Advancing Deep Underwater Object Tracking with A Million-Scale Benchmark | Chunhui Zhang et.al. | 2405.19818 | link |
2024-05-30 | FaceLift: Semi-supervised 3D Facial Landmark Localization | David Ferman et.al. | 2405.19646 | null |
2024-05-29 | DGD: Dynamic 3D Gaussians Distillation | Isaac Labe et.al. | 2405.19321 | null |
2024-05-28 | Track Initialization and Re-Identification for~3D Multi-View Multi-Object Tracking | Linh Van Ma et.al. | 2405.18606 | link |
2024-05-28 | Reliable Object Tracking by Multimodal Hybrid Feature Extraction and Transformer-Based Fusion | Hongze Sun et.al. | 2405.17903 | null |
2024-05-28 | Towards a Generalist and Blind RGB-X Tracker | Yuedong Tan et.al. | 2405.17773 | null |
2024-06-03 | BaboonLand Dataset: Tracking Primates in the Wild and Automating Behaviour Recognition from Drone Videos | Isla Duporge et.al. | 2405.17698 | null |
2024-05-27 | Tracking Small Birds by Detection Candidate Region Filtering and Detection History-aware Association | Tingwei Liu et.al. | 2405.17323 | null |
2024-05-24 | ETTrack: Enhanced Temporal Motion Predictor for Multi-Object Tracking | Xudong Han et.al. | 2405.15755 | null |
2024-05-24 | Trackastra: Transformer-based cell tracking for live-cell microscopy | Benjamin Gallusser et.al. | 2405.15700 | link |
2024-05-24 | An Approximate Dynamic Programming Framework for Occlusion-Robust Multi-Object Tracking | Pratyusha Musunuru et.al. | 2405.15137 | null |
2024-05-23 | Awesome Multi-modal Object Tracking | Chunhui Zhang et.al. | 2405.14200 | null |
2024-05-23 | Enhanced Object Tracking by Self-Supervised Auxiliary Depth Estimation Learning | Zhenyu Wei et.al. | 2405.14195 | null |
2024-05-23 | PuTR: A Pure Transformer for Decoupled and Online Multi-Object Tracking | Chongwei Liu et.al. | 2405.14119 | null |
2024-05-22 | Multi Player Tracking in Ice Hockey with Homographic Projections | Harish Prakash et.al. | 2405.13397 | null |
2024-05-20 | DTLLM-VLT: Diverse Text Generation for Visual Language Tracking Based on LLM | Xuchen Li et.al. | 2405.12139 | null |
2024-05-19 | Track Anything Rapter(TAR) | Tharun V. Puthanveettil et.al. | 2405.11655 | link |
2024-05-19 | RobMOT: Robust 3D Multi-Object Tracking by Observational Noise and State Estimation Drift Mitigation on LiDAR PointCloud | Mohamed Nagy et.al. | 2405.11536 | null |
2024-05-18 | City-Scale Multi-Camera Vehicle Tracking System with Improved Self-Supervised Camera Link Model | Yuqiang Lin et.al. | 2405.11345 | null |
2024-05-17 | Air Signing and Privacy-Preserving Signature Verification for Digital Documents | P. Sarveswarasarma et.al. | 2405.10868 | null |
2024-05-16 | A Novel Bounding Box Regression Method for Single Object Tracking | Omar Abdelaziz et.al. | 2405.10444 | null |
2024-05-16 | Beyond Traditional Single Object Tracking: A Survey | Omar Abdelaziz et.al. | 2405.10439 | null |
2024-05-16 | Spatial Cognition: a Wave Hypothesis | Robert Worden et.al. | 2405.10112 | null |
2024-05-14 | Learning Correspondence for Deformable Objects | Priya Sundaresan et.al. | 2405.08996 | null |
2024-05-14 | ADA-Track: End-to-End Multi-Camera 3D Multi-Object Tracking with Alternating Detection and Association | Shuxiao Ding et.al. | 2405.08909 | link |
2024-05-12 | MAML MOT: Multiple Object Tracking based on Meta-Learning | Jiayi Chen et.al. | 2405.07272 | null |
2024-05-16 | Common Corruptions for Enhancing and Evaluating Robustness in Air-to-Air Visual Object Detection | Anastasios Arsenos et.al. | 2405.06765 | null |
2024-05-16 | Ensuring UAV Safety: A Vision-only and Real-time Framework for Collision Avoidance Through Object Detection, Tracking, and Distance Estimation | Vasileios Karampinis et.al. | 2405.06749 | null |
2024-05-10 | Multi-Object Tracking in the Dark | Xinzhe Wang et.al. | 2405.06600 | link |
2024-05-09 | Outlier-robust Kalman Filtering through Generalised Bayes | Gerardo Duran-Martin et.al. | 2405.05646 | link |
2024-05-08 | MOTLEE: Collaborative Multi-Object Tracking Using Temporal Consistency for Neighboring Robot Frame Alignment | Mason B. Peterson et.al. | 2405.05210 | link |
2024-05-08 | TENet: Targetness Entanglement Incorporating with Multi-Scale Pooling and Mutually-Guided Fusion for RGB-E Object Tracking | Pengcheng Shao et.al. | 2405.05004 | link |
2024-05-07 | DriveWorld: 4D Pre-trained Scene Understanding via World Models for Autonomous Driving | Chen Min et.al. | 2405.04390 | null |
2024-05-07 | Bayesian Simultaneous Localization and Multi-Lane Tracking Using Onboard Sensors and a SD Map | Yuxuan Xia et.al. | 2405.04290 | null |
2024-05-06 | Collecting Consistently High Quality Object Tracks with Minimal Human Involvement by Using Self-Supervised Learning to Detect Tracker Errors | Samreen Anjum et.al. | 2405.03643 | null |
2024-05-03 | Learning Robot Soccer from Egocentric Vision with Deep Reinforcement Learning | Dhruva Tirumala et.al. | 2405.02425 | null |
2024-05-03 | DreamScene4D: Dynamic Multi-Object Scene Generation from Monocular Videos | Wen-Hsuan Chu et.al. | 2405.02280 | link |
2024-05-02 | Tracking and classifying objects with DAS data along railway | Simon L. B. Fredriksen et.al. | 2405.01140 | null |
2024-04-29 | Innovative Integration of Visual Foundation Model with a Robotic Arm on a Mobile Platform | Shimian Zhang et.al. | 2404.18720 | null |
2024-04-27 | 3D Extended Object Tracking by Fusing Roadside Sparse Radar Point Clouds and Pixel Keypoints | Jiayin Deng et.al. | 2404.17903 | link |
2024-04-22 | 360VOTS: Visual Object Tracking and Segmentation in Omnidirectional Videos | Yinzhe Xu et.al. | 2404.13953 | null |
2024-04-22 | TeamTrack: A Dataset for Multi-Sport Multi-Object Tracking in Full-pitch Videos | Atom Scott et.al. | 2404.13868 | null |
2024-04-19 | A comparison between single-stage and two-stage 3D tracking algorithms for greenhouse robotics | David Rapado-Rincon et.al. | 2404.12963 | null |
2024-04-18 | Inverse Neural Rendering for Explainable Multi-Object Tracking | Julian Ost et.al. | 2404.12359 | null |
2024-04-24 | On Target Detection in the Presence of Clutter in Joint Communication and Sensing Cellular Networks | Julia Vinogradova et.al. | 2404.12133 | null |
2024-04-18 | MLS-Track: Multilevel Semantic Interaction in RMOT | Zeliang Ma et.al. | 2404.12031 | null |
2024-04-18 | KnotResolver: Tracking self-intersecting filaments in microscopy using directed graphs | Dhruv Khatri et.al. | 2404.12029 | link |
2024-04-17 | How to deal with glare for improved perception of Autonomous Vehicles | Muhammad Z. Alam et.al. | 2404.10992 | null |
2024-04-12 | Into the Fog: Evaluating Multiple Object Tracking Robustness | Nadezda Kirillova et.al. | 2404.10534 | link |
2024-04-15 | 3D Face Tracking from 2D Video through Iterative Dense UV to Image Flow | Felix Taubner et.al. | 2404.09819 | null |
2024-04-12 | IDD-X: A Multi-View Dataset for Ego-relative Important Object Localization and Explanation in Dense and Unstructured Traffic | Chirag Parikh et.al. | 2404.08561 | null |
2024-04-11 | Gaga: Group Any Gaussians via 3D-aware Memory Bank | Weijie Lyu et.al. | 2404.07977 | null |
2024-04-11 | SFSORT: Scene Features-based Simple Online Real-Time Tracker | M. M. Morsali et.al. | 2404.07553 | link |
2024-04-11 | PillarTrack: Redesigning Pillar-based Transformer Network for Single Object Tracking on Point Clouds | Weisheng Xu et.al. | 2404.07495 | link |
2024-04-11 | Trashbusters: Deep Learning Approach for Litter Detection and Tracking | Kashish Jain et.al. | 2404.07467 | null |
2024-04-09 | LRR: Language-Driven Resamplable Continuous Representation against Adversarial Tracking Attacks | Jianlang Chen et.al. | 2404.06247 | link |
2024-04-08 | DepthMOT: Depth Cues Lead to a Strong Multi-Object Tracker | Jiapeng Wu et.al. | 2404.05518 | link |
2024-04-08 | Self-Supervised Multi-Object Tracking with Path Consistency | Zijia Lu et.al. | 2404.05136 | link |
2024-04-07 | Spatial Cognition from Egocentric Video: Out of Sight, Not Out of Mind | Chiara Plizzari et.al. | 2404.05072 | null |
2024-04-03 | Ego-Motion Aware Target Prediction Module for Robust Multi-Object Tracking | Navid Mahdian et.al. | 2404.03110 | link |
2024-04-03 | Representation Alignment Contrastive Regularization for Multi-Object Tracking | Shujie Chen et.al. | 2404.02562 | link |
2024-03-29 | Bayesian Nonparametrics: An Alternative to Deep Learning | Bahman Moraffah et.al. | 2404.00085 | null |
2024-03-29 | MTMMC: A Large-Scale Real-World Multi-Modal Camera Tracking Benchmark | Sanghyun Woo et.al. | 2403.20225 | null |
2024-03-29 | SceneTracker: Long-term Scene Flow Estimation Network | Bo Wang et.al. | 2403.19924 | null |
2024-03-27 | Enhancing Multiple Object Tracking Accuracy via Quantum Annealing | Yasuyuki Ihara et.al. | 2403.18908 | null |
2024-03-27 | TAFormer: A Unified Target-Aware Transformer for Video and Motion Joint Prediction in Aerial Scenes | Liangyu Xu et.al. | 2403.18238 | null |
2024-03-27 | Middle Fusion and Multi-Stage, Multi-Form Prompts for Robust RGB-T Tracking | Qiming Wang et.al. | 2403.18193 | null |
2024-03-26 | OmniVid: A Generative Framework for Universal Video Understanding | Junke Wang et.al. | 2403.17935 | link |
2024-03-26 | Exploring Dynamic Transformer for Efficient Object Tracking | Jiawen Zhu et.al. | 2403.17651 | null |
2024-03-25 | Multiple Object Tracking as ID Prediction | Ruopeng Gao et.al. | 2403.16848 | link |
2024-03-25 | From Two Stream to One Stream: Efficient RGB-T Tracking via Mutual Prompt Learning and Knowledge Distillation | Yang Luo et.al. | 2403.16834 | null |
2024-03-29 | Elysium: Exploring Object-level Perception in Videos via MLLM | Han Wang et.al. | 2403.16558 | link |
2024-03-25 | Spike-NeRF: Neural Radiance Field Based On Spike Camera | Yijia Guo et.al. | 2403.16410 | null |
2024-03-28 | SDSTrack: Self-Distillation Symmetric Adapter Learning for Multi-Modal Visual Object Tracking | Xiaojun Hou et.al. | 2403.16002 | link |
2024-03-23 | Spatio-Temporal Bi-directional Cross-frame Memory for Distractor Filtering Point Cloud Single Object Tracking | Shaoyu Sun et.al. | 2403.15831 | null |
2024-03-23 | PNAS-MOT: Multi-Modal Object Tracking with Pareto Neural Architecture Search | Chensheng Peng et.al. | 2403.15712 | link |
2024-03-22 | CR3DT: Camera-RADAR Fusion for 3D Detection and Tracking | Nicolas Baumann et.al. | 2403.15313 | null |
2024-03-22 | Reasoning-Enhanced Object-Centric Learning for Videos | Jian Li et.al. | 2403.15245 | null |
2024-03-20 | Fast-Poly: A Fast Polyhedral Framework For 3D Multi-Object Tracking | Xiaoyu Li et.al. | 2403.13443 | link |
2024-03-19 | Lifting Multi-View Detection and Tracking to the Bird's Eye View | Torben Teepe et.al. | 2403.12573 | link |
2024-03-18 | Pedestrian Tracking with Monocular Camera using Unconstrained 3D Motion Model | Jan Krejčí et.al. | 2403.11978 | null |
2024-03-17 | NetTrack: Tracking Highly Dynamic Objects with a Net | Guangze Zheng et.al. | 2403.11186 | null |
2024-03-16 | View-Centric Multi-Object Tracking with Homographic Matching in Moving UAV | Deyi Ji et.al. | 2403.10830 | null |
2024-03-16 | Exploring Learning-based Motion Models in Multi-Object Tracking | Hsiang-Wei Huang et.al. | 2403.10826 | null |
2024-03-15 | NeuFlow: Real-time, High-accuracy Optical Flow Estimation on Robots Using Edge Devices | Zhiyong Zhang et.al. | 2403.10425 | link |
2024-03-14 | OneTracker: Unifying Visual Object Tracking with Foundation Models and Efficient Tuning | Lingyi Hong et.al. | 2403.09634 | null |
2024-03-13 | Object Permanence Filter for Robust Tracking with Interactive Robots | Shaoting Peng et.al. | 2403.08231 | null |
2024-03-12 | Learning Data Association for Multi-Object Tracking using Only Coordinates | Mehdi Miah et.al. | 2403.08018 | null |
2024-03-12 | A Study on Centralised and Decentralised Swarm Robotics Architecture for Part Delivery System | Angelos Dimakos et.al. | 2403.07635 | null |
2024-03-12 | LiDAR Point Cloud-based Multiple Vehicle Tracking with Probabilistic Measurement-Region Association | Guanhua Ding et.al. | 2403.06423 | null |
2024-03-09 | **SSF-Net: Sp |