New submissions for Wed, 7 Jun 23 #369
Labels
abstract meaning representation
argument mining
citation context analysis
computational social science
contrastive
cross-language information retrieval
cross-lingual information retrieval
data augmentation
extreme multi-label
knowledge discovery
knowledge graph
legal text
legal
mixup
multi-task
paraphrase
passage generation
plagiarism
robustness
scholarly document processing
scholarly
semantic similarity
similarity measure
simplification
summarization
text generation
Keyword: contrastive
Unsupervised Dense Retrieval with Relevance-Aware Contrastive Pre-Training
Authors: Yibin Lei, Liang Ding, Yu Cao, Changtong Zan, Andrew Yates, Dacheng Tao
Arxiv: https://arxiv.org/abs/2306.03166
TLDR: Dense retrievers have achieved impressive performance, but their demand for abundant training data limits their application scenarios. Contrastive pre-training, which constructs pseudo-positive examples from unlabeled data, has shown great potential to solve this problem. However, the pseudo-positive examples crafted by data augmentations can be irrelevant. To this end, we propose relevance-aware contrastive learning. It takes the intermediate-trained model itself as an imperfect oracle to estimate the relevance of positive pairs
Repo: None
CONCORD: Clone-aware Contrastive Learning for Source Code
Authors: Yangruibo Ding, Saikat Chakraborty, Luca Buratti, Saurabh Pujar, Alessandro Morari, Gail Kaiser, Baishakhi Ray
Arxiv: https://arxiv.org/abs/2306.03234
TLDR: Deep Learning (DL) models to analyze source code have shown immense promise during the past few years. More recently, self-supervised pre-training has gained traction for learning generic code representations valuable for many downstream SE tasks, such as clone and bug detection. While previous work successfully learned from different code abstractions (e.g., token, AST, graph), we argue that it is also essential to factor in how developers code day-to-day for general-purpose representation learning
Repo: None
CoSiNES: Contrastive Siamese Network for Entity Standardization
Authors: Jiaqing Yuan, Michele Merler, Mihir Choudhury, Raju Pavuluri, Munindar P. Singh, Maja Vukovic
Arxiv: https://arxiv.org/abs/2306.03316
TLDR: Entity standardization maps noisy mentions from free-form text to standard entities in a knowledge base. The unique challenge of this task relative to other entity-related tasks is the lack of surrounding context and numerous variations in the surface form of the mentions, especially when it comes to generalization across domains where labeled data is scarce. Previous research mostly focuses on developing models either heavily relying on context, or dedicated solely to a specific domain. In contrast, we propose CoSiNES, a generic and
Repo: None
Stabilizing Contrastive RL: Techniques for Offline Goal Reaching
Authors: Chongyi Zheng, Benjamin Eysenbach, Homer Walke, Patrick Yin, Kuan Fang, Ruslan Salakhutdinov, Sergey Levine
Arxiv: https://arxiv.org/abs/2306.03346
TLDR: In the same way that the computer vision (CV) and natural language processing (NLP) communities have developed self-supervised methods, reinforcement learning (RL) can be cast as a self-supervised problem: learning to reach any goal, without requiring human-specified rewards or labels. However, actually building a self-supervised foundation for RL faces some important challenges. Building on prior contrastive approaches to this RL problem, we conduct careful ablation experiments and discover that a
Repo: None
Click: Controllable Text Generation with Sequence Likelihood Contrastive Learning
Authors: Chujie Zheng, Pei Ke, Zheng Zhang, Minlie Huang
Arxiv: https://arxiv.org/abs/2306.03350
TLDR: It has always been an important yet challenging problem to control language models to avoid generating texts with undesirable attributes, such as toxic language and unnatural repetition. We introduce Click for controllable text generation, which needs no modification to the model architecture and facilitates out-of-the-box use of trained models. It employs a contrastive loss on sequence likelihood, which fundamentally decreases the generation probability of negative samples (i.e., generations with undesirable traits). It also adopts a novel likelihood
Repo: None
BatchSampler: Sampling Mini-Batches for Contrastive Learning in Vision, Language, and Graphs
Authors: Zhen Yang, Tinglin Huang, Ming Ding, Yuxiao Dong, Rex Ying, Yukuo Cen, Yangliao Geng, Jie Tang
Arxiv: https://arxiv.org/abs/2306.03355
TLDR: In-batch contrastive learning is a state-of-the-art self-supervised method that brings semantically similar instances close while pushing dissimilar instances apart within a mini-batch. Its key to success is the negative sharing strategy, in which every instance serves as a negative for the others within the mini-batch. Recent studies aim to improve performance by sampling hard negatives within the current mini-batch, whose quality is bounded by the mini
Repo: None
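The negative sharing strategy described in the TLDR above can be illustrated with a minimal sketch. This is a generic in-batch InfoNCE-style loss, not the sampling method proposed in the paper; the function name `in_batch_contrastive_loss`, the cosine similarity, and the `temperature` value are assumptions made for illustration.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a non-zero vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def in_batch_contrastive_loss(anchors, positives, temperature=0.1):
    """Mean InfoNCE-style loss over a mini-batch (illustrative, hypothetical helper).

    positives[i] is the positive for anchors[i]; every other positives[j]
    in the same mini-batch is reused as a negative (negative sharing).
    """
    anchors = [normalize(a) for a in anchors]
    positives = [normalize(p) for p in positives]
    losses = []
    for i, anchor in enumerate(anchors):
        # Similarity of this anchor to every candidate in the batch.
        logits = [dot(anchor, p) / temperature for p in positives]
        # Numerically stable log-sum-exp over the batch.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching index as the target.
        losses.append(log_denom - logits[i])
    return sum(losses) / len(losses)


aligned = in_batch_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                    [[1.0, 0.0], [0.0, 1.0]])
swapped = in_batch_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                    [[0.0, 1.0], [1.0, 0.0]])
```

In this toy batch, `aligned` is close to zero because each anchor matches its own positive, while `swapped` is large because each anchor is most similar to what should be its negative. Since the negatives come only from the current mini-batch, their quality depends on how that batch was formed.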
Identifying Shared Decodable Concepts in the Human Brain Using Image-Language Foundation Models
Authors: Cory Efird, Alex Murphy, Joel Zylberberg, Alona Fyshe
Arxiv: https://arxiv.org/abs/2306.03375
TLDR: We introduce a method that takes advantage of high-quality pretrained multimodal representations to explore fine-grained semantic networks in the human brain. Previous studies have documented evidence of functional localization in the brain, with different anatomical regions preferentially activating for different types of sensory input. Many such localized structures are known, including the fusiform face area and parahippocampal place area. This raises the question of whether additional brain regions (or conjunctions of brain regions)
Repo: None
Subgraph Networks Based Contrastive Learning
Authors: Jinhuan Wang, Jiafei Shao, Zeyu Wang, Shanqing Yu, Qi Xuan, Xiaoniu Yang
Arxiv: https://arxiv.org/abs/2306.03506
TLDR: Graph contrastive learning (GCL), as a self-supervised learning method, can solve the problem of annotated data scarcity. It mines explicit features in unannotated graphs to generate favorable graph representations for downstream tasks. Most existing GCL methods focus on the design of graph augmentation strategies and mutual information estimation operations. Graph augmentation produces augmented views by graph perturbations. These views preserve a locally similar structure and exploit explicit features. However, these methods have not considered the
Repo: None
Semantic Segmentation on VSPW Dataset through Contrastive Loss and Multi-dataset Training Approach
Authors: Min Yan, Qianxiong Ning, Qian Wang
Arxiv: https://arxiv.org/abs/2306.03508
TLDR: Video scene parsing incorporates temporal information, which can enhance the consistency and accuracy of predictions compared to image scene parsing. The added temporal dimension enables a more comprehensive understanding of the scene, leading to more reliable results. This paper presents the winning solution of the CVPR 2023 workshop for video semantic segmentation, focusing on enhancing spatial-temporal correlations with contrastive loss. We also explore the influence of multi-dataset training by utilizing a label-mapping technique. And the
Repo: None
On the Difference of BERT-style and CLIP-style Text Encoders
Authors: Zhihong Chen, Guiming Hardy Chen, Shizhe Diao, Xiang Wan, Benyou Wang
Arxiv: https://arxiv.org/abs/2306.03678
TLDR: Masked language modeling (MLM) has been one of the most popular pretraining recipes in natural language processing, e.g., BERT. Contrastive language-image pretraining (CLIP) has also attracted attention, especially its vision models that achieve excellent performance on a broad range of vision tasks. However, few studies are dedicated to studying the text encoders learned by CLIP. In this paper, we analyze the difference between BERT-style and CLIP-style
Repo: None
YONA: You Only Need One Adjacent Reference-frame for Accurate and Fast Video Polyp Detection
Authors: Yuncheng Jiang, Zixun Zhang, Ruimao Zhang, Guanbin Li, Shuguang Cui, Zhen Li
Arxiv: https://arxiv.org/abs/2306.03686
TLDR: Accurate polyp detection is essential for assisting clinical rectal cancer diagnoses. Colonoscopy videos contain richer information than still images, making them a valuable resource for deep learning methods. Great efforts have been made to conduct video polyp detection through multi-frame temporal/spatial aggregation. However, unlike common fixed-camera video, the camera-moving scene in colonoscopy films can cause rapid video jitters, leading to unstable training for existing video detection models. Additionally, the concealed
Repo: None
Towards Label-free Scene Understanding by Vision Foundation Models
Authors: Runnan Chen, Youquan Liu, Lingdong Kong, Nenglun Chen, Xinge Zhu, Yuexin Ma, Tongliang Liu, Wenping Wang
Arxiv: https://arxiv.org/abs/2306.03899
TLDR: Vision foundation models such as Contrastive Vision-Language Pre-training (CLIP) and Segment Anything (SAM) have demonstrated impressive zero-shot performance on image classification and segmentation tasks. However, the incorporation of CLIP and SAM for label-free scene understanding has yet to be explored. In this paper, we investigate the potential of vision foundation models in enabling networks to comprehend 2D and 3D worlds without labelled data. The primary challenge lies in effectively supervising
Repo: None
Keyword: data augmentation
Synthesizing Affective Neurophysiological Signals Using Generative Models: A Review Paper
Authors: Alireza F. Nia, Vanessa Tang, Gonzalo Maso Talou, Mark Billinghurst
Arxiv: https://arxiv.org/abs/2306.03112
TLDR: The integration of emotional intelligence in machines is an important step in advancing human-computer interaction. This demands the development of reliable end-to-end emotion recognition systems. However, the scarcity of public affective datasets presents a challenge. In this literature review, we emphasize the use of generative models to address this issue in neurophysiological signals, particularly Electroencephalogram (EEG) and Functional Near-Infrared Spectroscopy (fNIRS). We provide a
Repo: None
Unsupervised Dense Retrieval with Relevance-Aware Contrastive Pre-Training
Authors: Yibin Lei, Liang Ding, Yu Cao, Changtong Zan, Andrew Yates, Dacheng Tao
Arxiv: https://arxiv.org/abs/2306.03166
TLDR: Dense retrievers have achieved impressive performance, but their demand for abundant training data limits their application scenarios. Contrastive pre-training, which constructs pseudo-positive examples from unlabeled data, has shown great potential to solve this problem. However, the pseudo-positive examples crafted by data augmentations can be irrelevant. To this end, we propose relevance-aware contrastive learning. It takes the intermediate-trained model itself as an imperfect oracle to estimate the relevance of positive pairs
Repo: None
Stabilizing Contrastive RL: Techniques for Offline Goal Reaching
Authors: Chongyi Zheng, Benjamin Eysenbach, Homer Walke, Patrick Yin, Kuan Fang, Ruslan Salakhutdinov, Sergey Levine
Arxiv: https://arxiv.org/abs/2306.03346
TLDR: In the same way that the computer vision (CV) and natural language processing (NLP) communities have developed self-supervised methods, reinforcement learning (RL) can be cast as a self-supervised problem: learning to reach any goal, without requiring human-specified rewards or labels. However, actually building a self-supervised foundation for RL faces some important challenges. Building on prior contrastive approaches to this RL problem, we conduct careful ablation experiments and discover that a
Repo: None
Efficient Anomaly Detection with Budget Annotation Using Semi-Supervised Residual Transformer
Authors: Hanxi Li, Jingqi Wu, Hao Chen, Mingwen Wang, Chunhua Shen
Arxiv: https://arxiv.org/abs/2306.03492
TLDR: Anomaly Detection is challenging as usually only the normal samples are seen during training and the detector needs to discover anomalies on-the-fly. The recently proposed deep-learning-based approaches could somehow alleviate the problem but there is still a long way to go in obtaining an industrial-class anomaly detector for real-world applications. On the other hand, in some particular AD tasks, a few anomalous samples are labeled manually for achieving higher accuracy. However, this performance gain is at the
Repo: None
Towards Adaptable and Interactive Image Captioning with Data Augmentation and Episodic Memory
Authors: Aliki Anagnostopoulou, Mareike Hartmann, Daniel Sonntag
Arxiv: https://arxiv.org/abs/2306.03500
TLDR: Interactive machine learning (IML) is a beneficial learning paradigm in cases of limited data availability, as human feedback is incrementally integrated into the training process. In this paper, we present an IML pipeline for image captioning which allows us to incrementally adapt a pre-trained image captioning model to a new data distribution based on user input. In order to incorporate user input into the model, we explore the use of a combination of simple data augmentation methods to obtain larger
Repo: None
Rec4Ad: A Free Lunch to Mitigate Sample Selection Bias for Ads CTR Prediction in Taobao
Authors: Jingyue Gao, Shuguang Han, Han Zhu, Siran Yang, Yuning Jiang, Jian Xu, Bo Zheng
Arxiv: https://arxiv.org/abs/2306.03527
TLDR: Click-Through Rate (CTR) prediction serves as a fundamental component in online advertising. A common practice is to train a CTR model on advertisement (ad) impressions with user feedback. Since ad impressions are purposely selected by the model itself, their distribution differs from the inference distribution and thus exhibits sample selection bias (SSB) that affects model performance. Existing studies on SSB mainly employ sample re-weighting techniques which suffer from high variance and poor model calibration. Another line of
Repo: None
Keyword: knowledge graph
Construction d'un système de recommandation basé sur des contraintes via des graphes de connaissances
Authors: Ngoc Luyen Le, Marie-Hélène Abel, Philippe Gouspillou
Arxiv: https://arxiv.org/abs/2306.03247
TLDR: Knowledge graphs in RDF model entities and their relations using ontologies, and have gained popularity for information modeling. In recommender systems, knowledge graphs help represent more links and relationships between users and items. Constraint-based recommender systems leverage deep recommendation knowledge to identify relevant suggestions. When combined with knowledge graphs, they offer benefits in constraint sets. This paper explores a constraint-based recommender system using RDF knowledge graphs for the vehicle purchase/sale domain. Our experiments demonstrate
Repo: None
Logic Diffusion for Knowledge Graph Reasoning
Authors: Xiaoying Xie, Biao Gong, Yiliang Lv, Zhen Han, Guoshuai Zhao, Xueming Qian
Arxiv: https://arxiv.org/abs/2306.03515
TLDR: Most recent works focus on answering first-order logical queries to explore the knowledge graph reasoning via multi-hop logic predictions. However, existing reasoning models are limited by the circumscribed logical paradigms of training samples, which leads to a weak generalization of unseen logic. To address these issues, we propose a plug-in module called Logic Diffusion (LoD) to discover unseen queries from surroundings and achieve dynamical equilibrium between different kinds of patterns. The basic idea of LoD is
Repo: None
BioBLP: A Modular Framework for Learning on Multimodal Biomedical Knowledge Graphs
Authors: Daniel Daza, Dimitrios Alivanistos, Payal Mitra, Thom Pijnenburg, Michael Cochez, Paul Groth
Arxiv: https://arxiv.org/abs/2306.03606
TLDR: Knowledge graphs (KGs) are an important tool for representing complex relationships between entities in the biomedical domain. Several methods have been proposed for learning embeddings that can be used to predict new links in such graphs. Some methods ignore valuable attribute data associated with entities in biomedical KGs, such as protein sequences, or molecular graphs. Other works incorporate such data, but assume that entities can be represented with the same data modality. This is not always the case for biomedical KG
Repo: https://github.com/elsevier-ai-lab/bioblp
Schema First! Learn Versatile Knowledge Graph Embeddings by Capturing Semantics with MASCHInE
Authors: Nicolas Hubert, Heiko Paulheim, Pierre Monnin, Armelle Brun, Davy Monticolo
Arxiv: https://arxiv.org/abs/2306.03659
TLDR: Knowledge graph embedding models (KGEMs) have gained considerable traction in recent years. These models learn a vector representation of knowledge graph entities and relations, a.k.a. knowledge graph embeddings (KGEs). Learning versatile KGEs is desirable as it makes them useful for a broad range of tasks. However, KGEMs are usually trained for a specific task, which makes their embeddings task-dependent. In parallel, the widespread
Repo: https://github.com/nicolas-hbt/versatile-embeddings
Keyword: multi-task
Few Shot Rationale Generation using Self-Training with Dual Teachers
Authors: Aditya Srikanth Veerubhotla, Lahari Poddar, Jun Yin, György Szarvas, Sharanya Eswaran
Arxiv: https://arxiv.org/abs/2306.03315
TLDR: Self-rationalizing models that also generate a free-text explanation for their predicted labels are an important tool to build trustworthy AI applications. Since generating explanations for annotated labels is a laborious and costly task, recent models rely on large pretrained language models (PLMs) as their backbone and few-shot learning. In this work we explore a self-training approach leveraging both labeled and unlabeled data to further improve few-shot models, under the assumption that
Repo: None
TextFormer: A Query-based End-to-End Text Spotter with Mixed Supervision
Authors: Yukun Zhai, Xiaoqiang Zhang, Xiameng Qin, Sanyuan Zhao, Xingping Dong, Jianbing Shen
Arxiv: https://arxiv.org/abs/2306.03377
TLDR: End-to-end text spotting is a vital computer vision task that aims to integrate scene text detection and recognition into a unified framework. Typical methods heavily rely on Region-of-Interest (RoI) operations to extract local features and complex post-processing steps to produce final predictions. To address these limitations, we propose TextFormer, a query-based end-to-end text spotter with Transformer architecture. Specifically, using query embedding per text instance, TextFormer builds
Repo: None
FAMO: Fast Adaptive Multitask Optimization
Authors: Bo Liu, Yihao Feng, Peter Stone, Qiang Liu
Arxiv: https://arxiv.org/abs/2306.03792
TLDR: One of the grand enduring goals of AI is to create generalist agents that can learn multiple different tasks from diverse data via multitask learning (MTL). However, gradient descent (GD) on the average loss across all tasks may yield poor multitask performance due to severe under-optimization of certain tasks. Previous approaches that manipulate task gradients for a more balanced loss decrease require storing and computing all task gradients (O(K) space and time where K is the number of
Repo: https://github.com/cranial-xix/famo
CL-UZH at SemEval-2023 Task 10: Sexism Detection through Incremental Fine-Tuning and Multi-Task Learning with Label Descriptions
Authors: Janis Goldzycher
Arxiv: https://arxiv.org/abs/2306.03907
TLDR: The widespread popularity of social media has led to an increase in hateful, abusive, and sexist language, motivating methods for the automatic detection of such phenomena. The goal of the SemEval shared task Towards Explainable Detection of Online Sexism (EDOS 2023) is to detect sexism in English social media posts (subtask A), and to categorize such posts into four coarse-grained sexism categories (subtask B), and eleven fine-
Repo: None
Keyword: plagiarism
Is AI Changing the Rules of Academic Misconduct? An In-depth Look at Students' Perceptions of 'AI-giarism'
Authors: Cecilia Ka Yuk Chan
Arxiv: https://arxiv.org/abs/2306.03358
TLDR: This pioneering study explores students' perceptions of AI-giarism, an emergent form of academic dishonesty involving AI and plagiarism, within the higher education context. A survey, undertaken by 393 undergraduate and postgraduate students from a variety of disciplines, investigated their perceptions of diverse AI-giarism scenarios. The findings portray a complex landscape of understanding, with clear disapproval for direct AI content generation, yet more ambivalent attitudes towards subtler uses of AI. The study introduces
Repo: None
Transformative Effects of ChatGPT on Modern Education: Emerging Era of AI Chatbots
Authors: Sukhpal Singh Gill, Minxian Xu, Panos Patros, Huaming Wu, Rupinder Kaur, Kamalpreet Kaur, Stephanie Fuller, Manmeet Singh, Priyansh Arora, Ajith Kumar Parlikad, Vlado Stankovski, Ajith Abraham, Soumya K. Ghosh, Hanan Lutfiyya, Salil S. Kanhere, Rami Bahsoon, Omer Rana, Schahram Dustdar, Rizos Sakellariou, Steve Uhlig, Rajkumar Buyya
Arxiv: https://arxiv.org/abs/2306.03823
TLDR: ChatGPT, an AI-based chatbot, was released to provide coherent and useful replies based on analysis of large volumes of data. In this article, leading scientists, researchers and engineers discuss the transformative effects of ChatGPT on modern education. This research seeks to improve our knowledge of ChatGPT capabilities and its use in the education sector, identifying potential concerns and challenges. Our preliminary evaluation concludes that ChatGPT performed differently in each subject area including finance, coding and maths.
Repo: None
Keyword: robustness
Adversarial alignment: Breaking the trade-off between the strength of an attack and its relevance to human perception
Authors: Drew Linsley, Pinyuan Feng, Thibaut Boissin, Alekh Karkada Ashok, Thomas Fel, Stephanie Olaiya, Thomas Serre
Arxiv: https://arxiv.org/abs/2306.03229
TLDR: Deep neural networks (DNNs) are known to have a fundamental sensitivity to adversarial attacks, perturbations of the input that are imperceptible to humans yet powerful enough to change the visual decision of a model. Adversarial attacks have long been considered the "Achilles' heel" of deep learning, which may eventually force a shift in modeling paradigms. Nevertheless, the formidable capabilities of modern large-scale DNNs have somewhat eclipsed these early concerns
Repo: None
Explaining and Adapting Graph Conditional Shift
Authors: Qi Zhu, Yizhu Jiao, Natalia Ponomareva, Jiawei Han, Bryan Perozzi
Arxiv: https://arxiv.org/abs/2306.03256
TLDR: Graph Neural Networks (GNNs) have shown remarkable performance on graph-structured data. However, recent empirical studies suggest that GNNs are very susceptible to distribution shift. There is still significant ambiguity about why graph-based models seem more vulnerable to these shifts. In this work we provide a thorough theoretical analysis on it by quantifying the magnitude of conditional shift between the input features and the output label. Our findings show that both graph heterophily and model architecture exacerbate conditional shifts
Repo: None
Survival Instinct in Offline Reinforcement Learning
Authors: Anqi Li, Dipendra Misra, Andrey Kolobov, Ching-An Cheng
Arxiv: https://arxiv.org/abs/2306.03286
TLDR: We present a novel observation about the behavior of offline reinforcement learning (RL) algorithms: on many benchmark datasets, offline RL can produce well-performing and safe policies even when trained with "wrong" reward labels, such as those that are zero everywhere or are negatives of the true rewards. This phenomenon cannot be easily explained by offline RL's return maximization objective. Moreover, it gives offline RL a degree of robustness that is uncharacteristic of its online RL counterparts, which are known
Repo: None
LIBERO: Benchmarking Knowledge Transfer for Lifelong Robot Learning
Authors: Bo Liu, Yifeng Zhu, Chongkai Gao, Yihao Feng, Qiang Liu, Yuke Zhu, Peter Stone
Arxiv: https://arxiv.org/abs/2306.03310
TLDR: Lifelong learning offers a promising paradigm of building a generalist agent that learns and adapts over its lifespan. Unlike traditional lifelong learning problems in image and text domains, which primarily involve the transfer of declarative knowledge of entities and concepts, lifelong learning in decision-making (LLDM) also necessitates the transfer of procedural knowledge, such as actions and behaviors. To advance research in LLDM, we introduce LIBERO, a novel benchmark of lifelong learning for robot manipulation.
Repo: None
Phase perturbation improves channel robustness for speech spoofing countermeasures
Authors: Yongyi Zang, You Zhang, Zhiyao Duan
Arxiv: https://arxiv.org/abs/2306.03389
TLDR: In this paper, we aim to address the problem of channel robustness in speech countermeasure (CM) systems, which are used to distinguish synthetic speech from human natural speech. On the basis of two hypotheses, we suggest an approach for perturbing phase information during the training of time-domain CM systems. Communication networks often employ lossy compression codecs that encode only magnitude information, therefore heavily altering phase information. Also, state-of-the-art CM systems rely on
Repo: None
SGAT4PASS: Spherical Geometry-Aware Transformer for PAnoramic Semantic Segmentation
Authors: Xuewei Li, Tao Wu, Zhongang Qi, Gaoang Wang, Ying Shan, Xi Li
Arxiv: https://arxiv.org/abs/2306.03403
TLDR: As an important and challenging problem in computer vision, PAnoramic Semantic Segmentation (PASS) gives complete scene perception based on an ultra-wide angle of view. Usually, prevalent PASS methods with 2D panoramic image input focus on solving image distortions but lack consideration of the 3D properties of original
Repo: https://github.com/tencentarc/sgat4pass
Learning to Simulate Tree-Branch Dynamics for Manipulation
Authors: Jayadeep Jacob, Tirthankar Bandyopadhyay, Jason Williams, Paulo Borges, Fabio Ramos
Arxiv: https://arxiv.org/abs/2306.03410
TLDR: We propose to use a simulation driven inverse inference approach to model the joint dynamics of tree branches under manipulation. Learning branch dynamics and gaining the ability to manipulate deformable vegetation can help with occlusion-prone tasks, such as fruit picking in dense foliage, as well as moving overhanging vines and branches for navigation in dense vegetation. The underlying deformable tree geometry is encapsulated as coarse spring abstractions executed on parallel, non-differentiable simulators. The implicit statistical model defined
Repo: None
Revisiting the Trade-off between Accuracy and Robustness via Weight Distribution of Filters
Authors: Xingxing Wei, Shiji Zhao
Arxiv: https://arxiv.org/abs/2306.03430
TLDR: Adversarial attacks have been proven to be potential threats to Deep Neural Networks (DNNs), and many methods are proposed to defend against adversarial attacks. However, while enhancing the robustness, the clean accuracy will decline to a certain extent, implying that a trade-off exists between accuracy and robustness. In this paper, we first empirically find an obvious distinction between standard and robust models in the filters' weight distribution of the same architecture, and then theoretically explain this
Repo: None
Protecting the Intellectual Property of Diffusion Models by the Watermark Diffusion Process
Authors: Sen Peng, Yufei Chen, Cong Wang, Xiaohua Jia
Arxiv: https://arxiv.org/abs/2306.03436
TLDR: Diffusion models have emerged as state-of-the-art deep generative architectures with the increasing demands for generation tasks. Training large diffusion models for good performance requires high resource costs, making them valuable intellectual properties to protect. Most of the existing solutions, including watermarking, mainly focus on discriminative models. This paper proposes WDM, a novel watermarking method for diffusion models, including watermark embedding, extraction, and verification. WDM embeds
Repo: None
Benchmarking Robustness of AI-enabled Multi-sensor Fusion Systems: Challenges and Opportunities
Authors: Xinyu Gao, Zhijie Wang, Yang Feng, Lei Ma, Zhenyu Chen, Baowen Xu
Arxiv: https://arxiv.org/abs/2306.03454
TLDR: Multi-Sensor Fusion (MSF) based perception systems have been the foundation in supporting many industrial applications and domains, such as self-driving cars, robotic arms, and unmanned aerial vehicles. Over the past few years, the fast progress in data-driven artificial intelligence (AI) has brought a fast-increasing trend to empower MSF systems by deep learning techniques to further improve performance, especially on intelligent systems and their perception systems. Although quite a few AI-enabled MSF perception systems
Repo: None
On Pitfalls of Test-Time Adaptation
Authors: Hao Zhao, Yuejiang Liu, Alexandre Alahi, Tao Lin
Arxiv: https://arxiv.org/abs/2306.03536
TLDR: Test-Time Adaptation (TTA) has recently emerged as a promising approach for tackling the robustness challenge under distribution shifts. However, the lack of consistent settings and systematic studies in prior literature hinders thorough assessments of existing methods. To address this issue, we present TTAB, a test-time adaptation benchmark that encompasses ten state-of-the-art algorithms, a diverse array of distribution shifts, and two evaluation protocols. Through extensive experiments, our benchmark reveals three common
Repo: https://github.com/lins-lab/ttab
PQM: A Point Quality Evaluation Metric for Dense Maps
Authors: Yash Turkar, Pranay Meshram, Charuvahan Adhivarahan, Karthik Dantu
Arxiv: https://arxiv.org/abs/2306.03660
TLDR: LiDAR-based mapping/reconstruction is important for various applications, but evaluating the quality of the dense maps it produces is challenging. The current methods have limitations, including the inability to capture completeness, structural information, and local variations in error. In this paper, we propose a novel point quality evaluation metric (PQM) that consists of four sub-metrics to provide a more comprehensive evaluation of point cloud quality. The completeness sub-metric evaluates the proportion
Repo: None
Computation with Sequences in the Brain
Authors: Max Dabagia, Christos H. Papadimitriou, Santosh S. Vempala
Arxiv: https://arxiv.org/abs/2306.03812
TLDR: Even as machine learning exceeds human-level performance on many applications, the generality, robustness, and rapidity of the brain's learning capabilities remain unmatched. How cognition arises from neural activity is a central open question in neuroscience, inextricable from the study of intelligence itself. A simple formal model of neural activity was proposed in Papadimitriou [2020] and has been subsequently shown, through both mathematical proofs and simulations, to be capable of implementing certain simple cognitive operations
Repo: None
Patient Dropout Prediction in Virtual Health: A Multimodal Dynamic Knowledge Graph and Text Mining Approach
Authors: Shuang Geng, Wenli Zhang, Jiaheng Xie, Gemin Liang, Ben Niu
Arxiv: https://arxiv.org/abs/2306.03833
TLDR: Virtual health has been acclaimed as a transformative force in healthcare delivery. Yet its dropout issue is critical, leading to poor health outcomes and increased health, societal, and economic costs. Timely prediction of patient dropout enables stakeholders to take proactive steps to address patients' concerns, potentially improving retention rates. In virtual health, the information asymmetries inherent in its delivery format, between different stakeholders, and across different healthcare delivery systems hinder the performance of existing predictive methods. To resolve those
Repo: None
Keyword: scholarly
SciCap+: A Knowledge Augmented Dataset to Study the Challenges of Scientific Figure Captioning
Authors: Zhishen Yang, Raj Dabre, Hideki Tanaka, Naoaki Okazaki
Arxiv: https://arxiv.org/abs/2306.03491
TLDR: In scholarly documents, figures provide a straightforward way of communicating scientific findings to readers. Automating figure caption generation helps move model understandings of scientific documents beyond text and will help authors write informative captions that facilitate communicating scientific discoveries. Unlike previous studies, we reframe scientific figure captioning as a knowledge-augmented image captioning task that models need to utilize knowledge embedded across modalities for caption generation. To this end, we extended the large-scale SciCap dataset
Repo: https://github.com/zhishenyang/scientific_figure_captioning_dataset
Keyword: semantic similarity
Supervised Knowledge May Hurt Novel Class Discovery Performance
Authors: Ziyun Li, Jona Otholt, Ben Dai, Di Hu, Christoph Meinel, Haojin Yang
Arxiv: https://arxiv.org/abs/2306.03648
TLDR: Novel class discovery (NCD) aims to infer novel categories in an unlabeled dataset by leveraging prior knowledge of a labeled set comprising disjoint but related classes. Given that most existing literature focuses primarily on utilizing supervised knowledge from a labeled set at the methodology level, this paper considers the question: Is supervised knowledge always helpful at different levels of semantic relevance? To proceed, we first establish a novel metric, so-called transfer flow, to measure the semantic similarity between labeled/
Repo: https://github.com/j-l-o/sk-hurt-ncd
Keyword: summarization
shs-nlp at RadSum23: Domain-Adaptive Pre-training of Instruction-tuned LLMs for Radiology Report Impression Generation
Authors: Sanjeev Kumar Karn, Rikhiya Ghosh, Kusuma P, Oladimeji Farri
Arxiv: https://arxiv.org/abs/2306.03264
TLDR: Instruction-tuned generative large language models (LLMs) like ChatGPT and Bloomz possess excellent generalization abilities, but they face limitations in understanding radiology reports, particularly in the task of generating the IMPRESSIONS section from the FINDINGS section. They tend to generate either verbose or incomplete IMPRESSIONS, mainly due to insufficient exposure to medical text data during training. We present a system which leverages large-scale medical text in a zero-shot
Repo: None
Prompt Space Optimizing Few-shot Reasoning Success with Large Language Models
Authors: Fobo Shi, Peijun Qing, Dong Yang, Nan Wang, Youbo Lei, Haonan Lu, Xiaodong Lin
Arxiv: https://arxiv.org/abs/2306.03799
TLDR: Prompt engineering is an essential technique for enhancing the abilities of large language models (LLMs) by providing explicit and specific instructions. It enables LLMs to excel in various tasks, such as arithmetic reasoning, question answering, summarization, relation extraction, machine translation, and sentiment analysis. Researchers have been actively exploring different prompt engineering strategies, such as Chain of Thought (CoT), Zero-CoT, and In-context learning. However, an unresolved problem arises from the fact that
Repo: None
Correction of Errors in Preference Ratings from Automated Metrics for Text Generation
Authors: Jan Deriu, Pius von Däniken, Don Tuggener, Mark Cieliebak
Arxiv: https://arxiv.org/abs/2306.03866
TLDR: A major challenge in the field of text generation is evaluation: human evaluations are cost-intensive, and automated metrics often display considerable disagreement with human judgments. In this paper, we propose a statistical model of text generation evaluation that accounts for the error-proneness of automated metrics when used to generate preference rankings between system outputs. We show that existing automated metrics are generally over-confident in assigning significant differences between systems in this setting. However, our model enables an efficient combination of human and
Repo: None
Keyword: text generation
Click: Controllable Text Generation with Sequence Likelihood Contrastive Learning
Authors: Chujie Zheng, Pei Ke, Zheng Zhang, Minlie Huang
Arxiv: https://arxiv.org/abs/2306.03350
TLDR: It has always been an important yet challenging problem to control language models to avoid generating texts with undesirable attributes, such as toxic language and unnatural repetition. We introduce Click for controllable text generation, which needs no modification to the model architecture and facilitates out-of-the-box use of trained models. It employs a contrastive loss on sequence likelihood, which fundamentally decreases the generation probability of negative samples (i.e., generations with undesirable traits). It also adopts a novel likelihood
Repo: None
Correction of Errors in Preference Ratings from Automated Metrics for Text Generation
Authors: Jan Deriu, Pius von Däniken, Don Tuggener, Mark Cieliebak
Arxiv: https://arxiv.org/abs/2306.03866
TLDR: A major challenge in the field of text generation is evaluation: human evaluations are cost-intensive, and automated metrics often display considerable disagreement with human judgments. In this paper, we propose a statistical model of text generation evaluation that accounts for the error-proneness of automated metrics when used to generate preference rankings between system outputs. We show that existing automated metrics are generally over-confident in assigning significant differences between systems in this setting. However, our model enables an efficient combination of human and
Repo: None