New submissions for Fri, 9 Jun 23 #371

e-tornike · 2023-06-09T03:22:16Z

Keyword: contrastive

Generalizable Low-Resource Activity Recognition with Diverse and Discriminative Representation Learning

Authors: Xin Qin, Jindong Wang, Shuo Ma, Wang Lu, Yongchun Zhu, Xing Xie, Yiqiang Chen
Arxiv: https://arxiv.org/abs/2306.04641
TLDR: Human activity recognition (HAR) is a time series classification task that focuses on identifying the motion patterns from human sensor readings. Adequate data is essential but a major bottleneck for training a generalizable HAR model, which assists customization and optimization of online web applications. However, it is costly in time and economy to collect large-scale labeled data in reality, i.e., the low-resource challenge. Meanwhile, data collected from different persons have distribution shifts due to different living habits
Repo: None
Automatic retrieval of corresponding US views in longitudinal examinations
Authors: Hamideh Kerdegari, Tran Huy Nhat Phung, Van Hao Nguyen, Thi Phuong Thao Truong, Ngoc Minh Thu Le, Thanh Phuong Le, Thi Mai Thao Le, Luigi Pisani, Linda Denehy, Vital Consortium, Reza Razavi, Louise Thwaites, Sophie Yacoub, Andrew P. King, Alberto Gomez
Arxiv: https://arxiv.org/abs/2306.04739
TLDR: Skeletal muscle atrophy is a common occurrence in critically ill patients in the intensive care unit (ICU) who spend long periods in bed. Muscle mass must be recovered through physiotherapy before patient discharge and ultrasound imaging is frequently used to assess the recovery process by measuring the muscle size over time. However, these manual measurements are subject to large variability, particularly since the scans are typically acquired on different days and potentially by different operators. In this paper, we propose a self-super
Repo: None
RefineVIS: Video Instance Segmentation with Temporal Attention Refinement
Authors: Andre Abrantes, Jiang Wang, Peng Chu, Quanzeng You, Zicheng Liu
Arxiv: https://arxiv.org/abs/2306.04774
TLDR: We introduce a novel framework called RefineVIS for Video Instance Segmentation (VIS) that achieves good object association between frames and accurate segmentation masks by iteratively refining the representations using sequence context. RefineVIS learns two separate representations on top of an off-the-shelf frame-level image instance segmentation model: an association representation responsible for associating objects across frames and a segmentation representation that produces accurate segmentation masks. Contrastive learning is utilized to learn tempor
Repo: None
Generative Text-Guided 3D Vision-Language Pretraining for Unified Medical Image Segmentation
Authors: Yinda Chen, Che Liu, Wei Huang, Sibo Cheng, Rossella Arcucci, Zhiwei Xiong
Arxiv: https://arxiv.org/abs/2306.04811
TLDR: Vision-Language Pretraining (VLP) has demonstrated remarkable capabilities in learning visual representations from textual descriptions of images without annotations. Yet, effective VLP demands large-scale image-text pairs, a resource that suffers scarcity in the medical domain. Moreover, conventional VLP is limited to 2D images while medical images encompass diverse modalities, often in 3D, making the learning process more challenging. To address these challenges, we present Generative Text-Guided 3D Vision-
Repo: None
On the Effectiveness of Out-of-Distribution Data in Self-Supervised Long-Tail Learning
Authors: Jianhong Bai, Zuozhu Liu, Hualiang Wang, Jin Hao, Yang Feng, Huanpeng Chu, Haoji Hu
Arxiv: https://arxiv.org/abs/2306.04934
TLDR: Though self-supervised learning (SSL) has been widely studied as a promising technique for representation learning, it doesn't generalize well on long-tailed datasets due to the majority classes dominating the feature space. Recent work shows that the long-tail learning performance could be boosted by sampling extra in-domain (ID) data for self-supervised training; however, large-scale ID data which can rebalance the minority classes are expensive to collect. In this paper,
Repo: None
CoCo: A Coupled Contrastive Framework for Unsupervised Domain Adaptive Graph Classification
Authors: Nan Yin, Li Shen, Mengzhu Wang, Long Lan, Zeyu Ma, Chong Chen, Xian-Sheng Hua, Xiao Luo
Arxiv: https://arxiv.org/abs/2306.04979
TLDR: Although graph neural networks (GNNs) have achieved impressive results in graph classification, they often need abundant task-specific labels, which could be extensively costly to acquire. A credible solution is to explore additional labeled graphs to enhance unsupervised learning on the target domain. However, how to apply GNNs to domain adaptation remains unsolved owing to the insufficient exploration of graph topology and the significant domain discrepancy. In this paper, we propose Coupled
Repo: None
COURIER: Contrastive User Intention Reconstruction for Large-Scale Pre-Train of Image Features
Authors: Jia-Qi Yang, Chenglei Dai, OU Dan, Ju Huang, De-Chuan Zhan, Qingwen Liu, Xiaoyi Zeng, Yang Yang
Arxiv: https://arxiv.org/abs/2306.05001
TLDR: With the development of the multi-media internet, visual characteristics have become an important factor affecting user interests. Thus, incorporating visual features is a promising direction for further performance improvements in click-through rate (CTR) prediction. However, we found that simply injecting the image embeddings trained with established pre-training methods yields only marginal improvements. We attribute the failure to two reasons: First, the pre-training methods are designed for well-defined computer vision tasks concentrating on semantic features
Repo: None
Attention Weighted Mixture of Experts with Contrastive Learning for Personalized Ranking in E-commerce
Authors: Juan Gong, Zhenlin Chen, Chaoyi Ma, Zhuojian Xiao, Haonan Wang, Guoyu Tang, Lin Liu, Sulong Xu, Bo Long, Yunjiang Jiang
Arxiv: https://arxiv.org/abs/2306.05011
TLDR: The ranking model plays an essential role in e-commerce search and recommendation. An effective ranking model should give a personalized ranking list for each user according to the user preference. Existing algorithms usually extract a user representation vector from the user behavior sequence, then feed the vector into a feed-forward network (FFN) together with other features for feature interactions, and finally produce a personalized ranking score. Despite tremendous progress in the past, there is still room for improvement. Firstly, the personalized
Repo: None
Sy-CON: Symmetric Contrastive Loss for Continual Self-Supervised Representation Learning
Authors: Sungmin Cha, Taesup Moon
Arxiv: https://arxiv.org/abs/2306.05101
TLDR: We introduce a novel and general loss function, called Symmetric Contrastive (Sy-CON) loss, for effective continual self-supervised learning (CSSL). We first argue that the conventional loss form of continual learning, which consists of a single task-specific loss (for plasticity) and a regularizer (for stability), may not be ideal for contrastive loss based CSSL that focuses on representation learning. Our reasoning is that, in contrastive learning based methods, the task
Repo: None
Variable Radiance Field for Real-Life Category-Specific Reconstruction from Single Image
Authors: Kun Wang, Zhiqiang Yan, Zhenyu Zhang, Xiang Li, Jun Li, Jian Yang
Arxiv: https://arxiv.org/abs/2306.05145
TLDR: Reconstructing category-specific objects from a single image is a challenging task that requires inferring the geometry and appearance of an object from a limited viewpoint. Existing methods typically rely on local feature retrieval based on re-projection with known camera parameters, which are slow and prone to distortion at viewpoints distant from the input image. In this paper, we present Variable Radiance Field (VRF), a novel framework that can efficiently reconstruct category-specific objects with a single
Repo: None
Devil is in Channels: Contrastive Single Domain Generalization for Medical Image Segmentation
Authors: Shishuai Hu, Zehui Liao, Yong Xia
Arxiv: https://arxiv.org/abs/2306.05254
TLDR: Deep learning-based medical image segmentation models suffer from performance degradation when deployed to a new healthcare center. To address this issue, unsupervised domain adaptation and multi-source domain generalization methods have been proposed, which, however, are less favorable for clinical practice due to the cost of acquiring target-domain data and the privacy concerns associated with redistributing the data from multiple source domains. In this paper, we propose a Channel-level
Repo: None
Factorized Contrastive Learning: Going Beyond Multi-view Redundancy
Authors: Paul Pu Liang, Zihao Deng, Martin Ma, James Zou, Louis-Philippe Morency, Ruslan Salakhutdinov
Arxiv: https://arxiv.org/abs/2306.05268
TLDR: In a wide range of multimodal tasks, contrastive learning has become a particularly appealing approach since it can successfully learn representations from abundant unlabeled data with only pairing information (e.g., image-caption or video-audio pairs). Underpinning these approaches is the assumption of multi-view redundancy - that shared information between modalities is necessary and sufficient for downstream tasks. However, in many real-world settings, task-relevant information is also contained in modality
Repo: https://github.com/pliang279/factorcl
R-MAE: Regions Meet Masked Autoencoders
Authors: Duy-Kien Nguyen, Vaibhav Aggarwal, Yanghao Li, Martin R. Oswald, Alexander Kirillov, Cees G. M. Snoek, Xinlei Chen
Arxiv: https://arxiv.org/abs/2306.05411
TLDR: Vision-specific concepts such as "region" have played a key role in extending general machine learning frameworks to tasks like object detection. Given the success of region-based detectors for supervised learning and the progress of intra-image methods for contrastive learning, we explore the use of regions for reconstructive pre-training. Starting from Masked Autoencoding (MAE) both as a baseline and an inspiration, we propose a parallel pre-text task tailored to address the one-
Repo: None

Keyword: data augmentation

Enhancing Robustness of AI Offensive Code Generators via Data Augmentation

Authors: Cristina Improta, Pietro Liguori, Roberto Natella, Bojan Cukic, Domenico Cotroneo
Arxiv: https://arxiv.org/abs/2306.05079
TLDR: In this work, we present a method to add perturbations to the code descriptions, i.e., new inputs in natural language (NL) from well-intentioned developers, in the context of security-oriented code, and analyze how and to what extent perturbations affect the performance of AI offensive code generators. Our experiments show that the performance and diversity of the AI offensive and non-AI offensive code descriptions are highly affected by perturbations in the NL descriptions. To
Repo: None
Factorized Contrastive Learning: Going Beyond Multi-view Redundancy
Authors: Paul Pu Liang, Zihao Deng, Martin Ma, James Zou, Louis-Philippe Morency, Ruslan Salakhutdinov
Arxiv: https://arxiv.org/abs/2306.05268
TLDR: In a wide range of multimodal tasks, contrastive learning has become a particularly appealing approach since it can successfully learn representations from abundant unlabeled data with only pairing information (e.g., image-caption or video-audio pairs). Underpinning these approaches is the assumption of multi-view redundancy - that shared information between modalities is necessary and sufficient for downstream tasks. However, in many real-world settings, task-relevant information is also contained in modality
Repo: https://github.com/pliang279/factorcl
KIT's Multilingual Speech Translation System for IWSLT 2023
Authors: Danni Liu, Thai Binh Nguyen, Sai Koneru, Enes Yavuz Ugan, Ngoc-Quan Pham, Tuan-Nam Nguyen, Tu Anh Dinh, Carlos Mullov, Alexander Waibel, Jan Niehues
Arxiv: https://arxiv.org/abs/2306.05320
TLDR: Many existing speech translation benchmarks focus on native-English speech in high-quality recording conditions, which often do not match the conditions in real-life use-cases. In this paper, we describe our speech translation system for the multilingual track of IWSLT 2023, which focuses on the translation of scientific conference talks. The test condition features accented input speech and terminology-dense contents. The task requires translation into 10 languages of varying amounts of resources. In absence of
Repo: None

Keyword: knowledge discovery

SKG: A Versatile Information Retrieval and Analysis Framework for Academic Papers with Semantic Knowledge Graphs

Authors: Yamei Tu, Rui Qiu, Han-Wei Shen
Arxiv: https://arxiv.org/abs/2306.04758
TLDR: The number of published research papers has experienced exponential growth in recent years, which makes it crucial to develop new methods for efficient and versatile information extraction and knowledge discovery. To address this need, we propose a Semantic Knowledge Graph (SKG) that integrates semantic concepts from abstracts and other meta-information to represent the corpus. The SKG can support various semantic queries in academic literature thanks to the high diversity and rich information content stored within. To extract knowledge from unstructured text,
Repo: None

Keyword: knowledge graph

SKG: A Versatile Information Retrieval and Analysis Framework for Academic Papers with Semantic Knowledge Graphs

Authors: Yamei Tu, Rui Qiu, Han-Wei Shen
Arxiv: https://arxiv.org/abs/2306.04758
TLDR: The number of published research papers has experienced exponential growth in recent years, which makes it crucial to develop new methods for efficient and versatile information extraction and knowledge discovery. To address this need, we propose a Semantic Knowledge Graph (SKG) that integrates semantic concepts from abstracts and other meta-information to represent the corpus. The SKG can support various semantic queries in academic literature thanks to the high diversity and rich information content stored within. To extract knowledge from unstructured text,
Repo: None
Enabling tabular deep learning when $d \gg n$ with an auxiliary knowledge graph
Authors: Camilo Ruiz, Hongyu Ren, Kexin Huang, Jure Leskovec
Arxiv: https://arxiv.org/abs/2306.04766
TLDR: Machine learning models exhibit strong performance on datasets with abundant labeled samples. However, for tabular datasets with extremely high $d$-dimensional features but limited $n$ samples (i.e. $d \gg n$), machine learning models struggle to achieve strong performance due to the risk of overfitting. Here, our key insight is that there is often abundant, auxiliary domain information describing input features which can be structured as a heterogeneous knowledge graph (KG). We propose PL
Repo: None
A Survey on Knowledge Graphs for Healthcare: Resources, Applications, and Promises
Authors: Hejie Cui, Jiaying Lu, Shiyu Wang, Ran Xu, Wenjing Ma, Shaojun Yu, Yue Yu, Xuan Kan, Chen Ling, Joyce Ho, Fei Wang, Carl Yang
Arxiv: https://arxiv.org/abs/2306.04802
TLDR: Healthcare knowledge graphs (HKGs) have emerged as a promising tool for organizing medical knowledge in a structured and interpretable way, which provides a comprehensive view of medical concepts and their relationships. However, challenges such as data heterogeneity and limited coverage remain, emphasizing the need for further research in the field of HKGs. This survey paper serves as the first comprehensive overview of HKG. We summarize the pipeline and key techniques for HKG construction (i.e., from scratch and through integration
Repo: None
Revisiting Inferential Benchmarks for Knowledge Graph Completion
Authors: Shuwen Liu, Bernardo Cuenca Grau, Ian Horrocks, Egor V. Kostylev
Arxiv: https://arxiv.org/abs/2306.04814
TLDR: Knowledge Graph (KG) completion is the problem of extending an incomplete KG with missing facts. A key feature of Machine Learning approaches for KG completion is their ability to learn inference patterns, so that the predicted facts are the results of applying these patterns to the KG. Standard completion benchmarks, however, are not well-suited for evaluating models' abilities to learn patterns, because the training and test sets of these benchmarks are a random split of a given KG and
Repo: None

Keyword: legal

Improving Vietnamese Legal Question--Answering System based on Automatic Data Enrichment

Authors: Thi-Hai-Yen Vuong, Ha-Thanh Nguyen, Quang-Huy Nguyen, Le-Minh Nguyen, Xuan-Hieu Phan
Arxiv: https://arxiv.org/abs/2306.04841
TLDR: Question answering (QA) in law is a challenging problem because legal documents are much more complicated than normal texts in terms of terminology, structure, and temporal and logical relationships. It is even more difficult to perform legal QA for low-resource languages like Vietnamese where labeled data are rare and pre-trained language models are still limited. In this paper, we try to overcome these limitations by implementing a Vietnamese article-level retrieval-based legal QA system and introduce a novel method to
Repo: None
NOWJ at COLIEE 2023 -- Multi-Task and Ensemble Approaches in Legal Information Processing
Authors: Thi-Hai-Yen Vuong, Hai-Long Nguyen, Tan-Minh Nguyen, Hoang-Trung Nguyen, Thai-Binh Nguyen, Ha-Thanh Nguyen
Arxiv: https://arxiv.org/abs/2306.04903
TLDR: This paper presents the NOWJ team's approach to the COLIEE 2023 Competition, which focuses on advancing legal information processing techniques and applying them to real-world legal scenarios. Our team tackles the four tasks in the competition, which involve legal case retrieval, legal case entailment, statute law retrieval, and legal textual entailment. We employ state-of-the-art machine learning models and innovative approaches, such as BERT, Longformer, BM25-ranking algorithm,
Repo: None
Reconciling Predictive and Statistical Parity: A Causal Approach
Authors: Drago Plecko, Elias Bareinboim
Arxiv: https://arxiv.org/abs/2306.05059
TLDR: Since the rise of fair machine learning as a critical field of inquiry, many different notions on how to quantify and measure discrimination have been proposed in the literature. Some of these notions, however, were shown to be mutually incompatible. Such findings make it appear that numerous different kinds of fairness exist, thereby making a consensus on the appropriate measure of fairness harder to reach, hindering the applications of these tools in practice. In this paper, we investigate one of these key notions, and present a
Repo: None
A Computational Analysis of Oral Argument in the Supreme Court
Authors: Gregory M. Dickinson
Arxiv: https://arxiv.org/abs/2306.05373
TLDR: As the most public component of the Supreme Court's decision-making process, oral argument receives an out-sized share of attention in the popular media. Despite its prominence, however, the basic function and operation of oral argument as an institution remain poorly understood, as political scientists and legal scholars continue to debate even the most fundamental questions about its role. Past study of oral argument has tended to focus on discrete, quantifiable attributes of oral arguments, such as the number of questions asked to each
Repo: None

Keyword: legal text

NOWJ at COLIEE 2023 -- Multi-Task and Ensemble Approaches in Legal Information Processing

Authors: Thi-Hai-Yen Vuong, Hai-Long Nguyen, Tan-Minh Nguyen, Hoang-Trung Nguyen, Thai-Binh Nguyen, Ha-Thanh Nguyen
Arxiv: https://arxiv.org/abs/2306.04903
TLDR: This paper presents the NOWJ team's approach to the COLIEE 2023 Competition, which focuses on advancing legal information processing techniques and applying them to real-world legal scenarios. Our team tackles the four tasks in the competition, which involve legal case retrieval, legal case entailment, statute law retrieval, and legal textual entailment. We employ state-of-the-art machine learning models and innovative approaches, such as BERT, Longformer, BM25-ranking algorithm,
Repo: None

Keyword: mixup

Non-autoregressive Conditional Diffusion Models for Time Series Prediction

Authors: Lifeng Shen, James Kwok
Arxiv: https://arxiv.org/abs/2306.05043
TLDR: Recently, denoising diffusion models have led to significant breakthroughs in the generation of images, audio and text. However, it is still an open question how to adapt their strong modeling ability to model time series. In this paper, we propose TimeDiff, a non-autoregressive diffusion model that achieves high-quality time series prediction with the introduction of two novel conditioning mechanisms: future mixup and autoregressive initialization. Similar to teacher forcing, future mixup allows parts
Repo: None

Keyword: multi-task

Language Adaptive Weight Generation for Multi-task Visual Grounding

Authors: Wei Su, Peihan Miao, Huanzhang Dou, Gaoang Wang, Liang Qiao, Zheyang Li, Xi Li
Arxiv: https://arxiv.org/abs/2306.04652
TLDR: Despite the impressive performance in visual grounding, the prevailing approaches usually exploit the visual backbone in a passive way, i.e., the visual backbone extracts features with fixed weights without expression-related hints. The passive perception may lead to mismatches (e.g., redundant and missing), limiting further performance improvement. Ideally, the visual backbone should actively extract visual features since the expressions already provide the blueprint of desired visual features. The active perception can take expressions as priors to extract relevant visual features
Repo: None
InvPT++: Inverted Pyramid Multi-Task Transformer for Visual Scene Understanding
Authors: Hanrong Ye, Dan Xu
Arxiv: https://arxiv.org/abs/2306.04842
TLDR: Multi-task scene understanding aims to design models that can simultaneously predict several scene understanding tasks with one versatile model. Previous studies typically process multi-task features in a more local way, and thus cannot effectively learn spatially global and cross-task interactions, which hampers the models' ability to fully leverage the consistency of various tasks. To tackle this problem, we propose an Inverted Pyramid multi-Task Transformer, capable of modeling cross-task interaction among spatial features of different tasks
Repo: None
NOWJ at COLIEE 2023 -- Multi-Task and Ensemble Approaches in Legal Information Processing
Authors: Thi-Hai-Yen Vuong, Hai-Long Nguyen, Tan-Minh Nguyen, Hoang-Trung Nguyen, Thai-Binh Nguyen, Ha-Thanh Nguyen
Arxiv: https://arxiv.org/abs/2306.04903
TLDR: This paper presents the NOWJ team's approach to the COLIEE 2023 Competition, which focuses on advancing legal information processing techniques and applying them to real-world legal scenarios. Our team tackles the four tasks in the competition, which involve legal case retrieval, legal case entailment, statute law retrieval, and legal textual entailment. We employ state-of-the-art machine learning models and innovative approaches, such as BERT, Longformer, BM25-ranking algorithm,
Repo: None
Prefer to Classify: Improving Text Classifiers via Auxiliary Preference Learning
Authors: Jaehyung Kim, Jinwoo Shin, Dongyeop Kang
Arxiv: https://arxiv.org/abs/2306.04925
TLDR: The development of largely human-annotated benchmarks has driven the success of deep neural networks in various NLP tasks. To enhance the effectiveness of existing benchmarks, collecting new additional input-output pairs is often too costly and challenging, particularly considering their marginal impact on improving the current model accuracy. Instead, additional or complementary annotations on the existing input texts in the benchmarks can be preferable as an efficient way to pay the additional human cost. In this paper, we investigate task-specific preferences between pairs
Repo: None
A Dynamic Feature Interaction Framework for Multi-task Visual Perception
Authors: Yuling Xi, Hao Chen, Ning Wang, Peng Wang, Yanning Zhang, Chunhua Shen, Yifan Liu
Arxiv: https://arxiv.org/abs/2306.05061
TLDR: Multi-task visual perception has a wide range of applications in scene understanding such as autonomous driving. In this work, we devise an efficient unified framework to solve multiple common perception tasks, including instance segmentation, semantic segmentation, monocular 3D detection, and depth estimation. Simply sharing the same visual feature representations for these tasks impairs the performance of tasks, while independent task-specific feature extractors lead to parameter redundancy and latency. Thus, we design two feature-merge branches
Repo: None
LCT-1 at SemEval-2023 Task 10: Pre-training and Multi-task Learning for Sexism Detection and Classification
Authors: Konstantin Chernyshev, Ekaterina Garanina, Duygu Bayram, Qiankun Zheng, Lukas Edman
Arxiv: https://arxiv.org/abs/2306.05075
TLDR: Misogyny and sexism are growing problems in social media. Advances have been made in online sexism detection but the systems are often uninterpretable. SemEval-2023 Task 10 on Explainable Detection of Online Sexism aims at increasing explainability of the sexism detection, and our team participated in all the proposed subtasks. Our system is based on further domain-adaptive pre-training (Gururangan et al., 2020). Building on the Transformer-based
Repo: https://github.com/lct-rug-2022/edos-2023
Efficient Multi-Task Scene Analysis with RGB-D Transformers
Authors: Söhnke Benedikt Fischedick, Daniel Seichter, Robin Schmidt, Leonard Rabes, Horst-Michael Gross
Arxiv: https://arxiv.org/abs/2306.05242
TLDR: Scene analysis is essential for enabling autonomous systems, such as mobile robots, to operate in real-world environments. However, obtaining a comprehensive understanding of the scene requires solving multiple tasks, such as panoptic segmentation, instance orientation estimation, and scene classification. Solving these tasks given limited computing and battery capabilities on mobile platforms is challenging. To address this challenge, we introduce an efficient multi-task scene analysis approach, called EMSAFormer, that uses an RGB-D Transformer
Repo: https://github.com/tui-nicr/nicr-scene-analysis-datasets

Keyword: robustness

Robust-DefReg: A Robust Deformable Point Cloud Registration Method based on Graph Convolutional Neural Networks

Authors: Sara Monji-Azad, Marvin Kinz, Jürgen Hesser
Arxiv: https://arxiv.org/abs/2306.04701
TLDR: Point cloud registration is a fundamental problem in computer vision that aims to estimate the transformation between corresponding sets of points. Non-rigid registration, in particular, involves addressing challenges including various levels of deformation, noise, outliers, and data incompleteness. This paper introduces Robust-DefReg, a robust non-rigid point cloud registration method based on graph convolutional neural networks (GCNNs). Robust-DefReg is a coarse-to-fine registration approach
Repo: None
Prompter: Zero-shot Adaptive Prefixes for Dialogue State Tracking Domain Adaptation
Authors: Taha Aksu, Min-Yen Kan, Nancy F. Chen
Arxiv: https://arxiv.org/abs/2306.04724
TLDR: A challenge in the Dialogue State Tracking (DST) field is adapting models to new domains without using any supervised data, i.e., zero-shot domain adaptation. Parameter-Efficient Transfer Learning (PETL) has the potential to address this problem due to its robustness. However, it has yet to be applied to zero-shot scenarios, as it is not clear how to apply it without supervision. Our method, Prompter, uses descriptions of target domain slots to generate
Repo: None
WOUAF: Weight Modulation for User Attribution and Fingerprinting in Text-to-Image Diffusion Models
Authors: Changhoon Kim, Kyle Min, Maitreya Patel, Sheng Cheng, Yezhou Yang
Arxiv: https://arxiv.org/abs/2306.04744
TLDR: The rapid advancement of generative models, facilitating the creation of hyper-realistic images from textual descriptions, has concurrently escalated critical societal concerns such as misinformation. Traditional fake detection mechanisms, although providing some mitigation, fall short in attributing responsibility for the malicious use of synthetic images. This paper introduces a novel approach to model fingerprinting that assigns responsibility for generated images, thereby serving as a potential countermeasure to model misuse. Our method modifies generative models based on each user's
Repo: None
Data Augmentation for Improving Tail-traffic Robustness in Skill-routing for Dialogue Systems
Authors: Ting-Wei Wu, Fatemeh Sheikholeslami, Mohammad Kachuee, Jaeyoung Do, Sungjin Lee
Arxiv: https://arxiv.org/abs/2306.04823
TLDR: Large-scale conversational systems typically rely on a skill-routing component to route a user request to an appropriate skill and interpretation to serve the request. In such a system, the agent is responsible for serving thousands of skills and interpretations which create a long-tail distribution due to the natural frequency of requests. For example, the samples related to play music might be a thousand times more frequent than those asking for theatre show times. Moreover, inputs used for ML-based skill routing are often
Repo: None
Expanding Scope: Adapting English Adversarial Attacks to Chinese
Authors: Hanyu Liu, Chengyuan Cai, Yanjun Qi
Arxiv: https://arxiv.org/abs/2306.04874
TLDR: Recent studies have revealed that NLP predictive models are vulnerable to adversarial attacks. Most existing studies focused on designing attacks to evaluate the robustness of NLP models in the English language alone. Literature has seen an increasing need for NLP solutions for other languages. We therefore ask one natural question: do state-of-the-art (SOTA) attack methods generalize to other languages? This paper investigates how to adapt SOTA adversarial attack algorithms in English to the
Repo: None
Combined Left and Right Temporal Robustness for Control under STL Specifications
Authors: Alëna Rodionova, Lars Lindemann, Manfred Morari, George J. Pappas
Arxiv: https://arxiv.org/abs/2306.04936
TLDR: Many modern autonomous systems, particularly multi-agent systems, are time-critical and need to be robust against timing uncertainties. Previous works have studied left and right time robustness of signal temporal logic specifications by considering time shifts in the predicates that are either only to the left or only to the right. We propose a combined notion of temporal robustness which simultaneously considers left and right time shifts. For instance, in a scenario where a robot plans a trajectory around a pedestrian, this combined notion
Repo: None
Layer-level activation mechanism
Authors: Yoon Kihyuk, Lim Chiehyeon
Arxiv: https://arxiv.org/abs/2306.04940
TLDR: In this work, we propose a novel activation mechanism aimed at establishing layer-level activation (LayerAct) functions. These functions are designed to be more noise-robust compared to traditional element-level activation functions by reducing the layer-level fluctuation of the activation outputs due to shift in inputs. Moreover, the LayerAct functions achieve a zero-like mean activation output without restricting the activation output space. We present an analysis and experiments demonstrating that LayerAct functions exhibit superior noise-Rob
Repo: https://github.com/layeract/layeract
Robust Learning with Progressive Data Expansion Against Spurious Correlation
Authors: Yihe Deng, Yu Yang, Baharan Mirzasoleiman, Quanquan Gu
Arxiv: https://arxiv.org/abs/2306.04949
TLDR: While deep learning models have shown remarkable performance in various tasks, they are susceptible to learning non-generalizable spurious features rather than the core features that are genuinely correlated to the true label. In this paper, beyond existing analyses of linear models, we theoretically examine the learning process of a two-layer nonlinear convolutional neural network in the presence of spurious features. Our analysis suggests that imbalanced data groups and easily learnable spurious features can lead to the dominance of spurious features during the
Repo: None
Degraded Polygons Raise Fundamental Questions of Neural Network Perception
Authors: Leonard Tang, Dan Ley
Arxiv: https://arxiv.org/abs/2306.04955
TLDR: It is well-known that modern computer vision systems often exhibit behaviors misaligned with those of humans: from adversarial attacks to image corruptions, deep learning vision models suffer in a variety of settings that humans capably handle. In light of these phenomena, here we introduce another, orthogonal perspective studying the human-machine vision gap. We revisit the task of recovering images under degradation, first introduced over 30 years ago in the Recognition-by-Components theory of human vision
Repo: None
Generalizable Lightweight Proxy for Robust NAS against Diverse Perturbations
Authors: Hyeonjeong Ha, Minseon Kim, Sung Ju Hwang
Arxiv: https://arxiv.org/abs/2306.05031
TLDR: Recent neural architecture search (NAS) frameworks have been successful in finding optimal architectures for given conditions (e.g., performance or latency). However, they search for optimal architectures in terms of their performance on clean images only, while robustness against various types of perturbations or corruptions is crucial in practice. Although there exist several robust NAS frameworks that tackle this issue by integrating adversarial training into one-shot NAS, they are limited in that they only consider robustness
Repo: None
Enhancing Robustness of AI Offensive Code Generators via Data Augmentation
Authors: Cristina Improta, Pietro Liguori, Roberto Natella, Bojan Cukic, Domenico Cotroneo
Arxiv: https://arxiv.org/abs/2306.05079
TLDR: In this work, we present a method to add perturbations to the code descriptions, i.e., new inputs in natural language (NL) from well-intentioned developers, in the context of security-oriented code, and analyze how and to what extent perturbations affect the performance of AI offensive code generators. Our experiments show that the performance and diversity of the AI offensive and non-AI offensive code descriptions are highly affected by perturbations in the NL descriptions. To
Repo: None
On the Robustness of Topics API to a Re-Identification Attack
Authors: Nikhil Jha, Martino Trevisan, Emilio Leonardi, Marco Mellia
Arxiv: https://arxiv.org/abs/2306.05094
TLDR: Web tracking through third-party cookies is considered a threat to users' privacy and is supposed to be abandoned in the near future. Recently, Google proposed the Topics API framework as a privacy-friendly alternative for behavioural advertising. Using this approach, the browser builds a user profile based on navigation history, which advertisers can access. The Topics API has the possibility of becoming the new standard for behavioral advertising, thus it is necessary to fully understand its operation and find possible limitations. This paper evaluates the
Repo: None
Focus for Free in Density-Based Counting
Authors: Zenglin Shi, Pascal Mettes, Cees G.M. Snoek
Arxiv: https://arxiv.org/abs/2306.05129
TLDR: This work considers supervised learning to count from images and their corresponding point annotations. Where density-based counting methods typically use the point annotations only to create Gaussian-density maps, which act as the supervision signal, the starting point of this work is that point annotations have counting potential beyond density map generation. We introduce two methods that repurpose the available point annotations to enhance counting performance. The first is a counting-specific augmentation that leverages point annotations to simulate occluded objects
Repo: None
DFT-Based Channel Estimation for Holographic MIMO
Authors: Antonio Alberto D'Amico, Giacomo Bacci, Luca Sanguinetti
Arxiv: https://arxiv.org/abs/2306.05156
TLDR: Holographic MIMO (hMIMO) systems with a massive number of individually controlled antennas $N$ make minimum mean square error (MMSE) channel estimation particularly challenging, due to its computational complexity that scales as $N^3$. This paper investigates uniform linear arrays and proposes a low-complexity method based on the discrete Fourier transform (DFT) approximation, which follows from replacing the covariance matrix by a suitable circulant matrix. Numerical results
Repo: None
Global Stabilization of Antipodal Points on n-Sphere with Application to Attitude Tracking
Authors: Xin Tong, Shing Shin Cheng
Arxiv: https://arxiv.org/abs/2306.05234
TLDR: Existing approaches to robust global asymptotic stabilization of a pair of antipodal points on unit $n$-sphere $\mathbb{S}^n$ typically involve the non-centrally synergistic hybrid controllers for attitude tracking on unit quaternion space. However, when switching faults occur due to parameter errors or in some cases, it can lead to the unwinding of the desired set. In this work, a hybrid controller is first proposed based on
Repo: None
Trustworthy Sensor Fusion against Inaudible Command Attacks in Advanced Driver-Assistance System
Authors: Jiwei Guan, Lei Pan, Chen Wang, Shui Yu, Longxiang Gao, Xi Zheng
Arxiv: https://arxiv.org/abs/2306.05358
TLDR: There are increasing concerns about malicious attacks on autonomous vehicles. In particular, inaudible voice command attacks pose a significant threat as voice commands become available in autonomous driving systems. How to empirically defend against these inaudible attacks remains an open question. Previous research investigates utilizing deep learning-based multimodal fusion for defense, without considering the model uncertainty in trustworthiness. As deep learning has been applied to increasingly sensitive tasks, uncertainty measurement is crucial in helping improve model robustness, especially
Repo: None

Keyword: simplification

A shape derivative approach to domain simplification

Authors: Jochen Hinz, Ondine Chanon, Alessandra Arrigoni, Annalisa Buffa
Arxiv: https://arxiv.org/abs/2306.05384
TLDR: The objective of this study is to address the difficulty of simplifying the geometric model in which a differential problem is formulated, also called defeaturing, while simultaneously ensuring that the accuracy of the solution is maintained under control. This enables faster and more efficient simulations, without sacrificing accuracy. More precisely, we consider an isogeometric discretisation of an elliptic model problem defined on a two-dimensional hierarchical B-spline computational domain with a complex boundary. Starting with an oversimplification
Repo: None

Keyword: summarization

Absformer: Transformer-based Model for Unsupervised Multi-Document Abstractive Summarization

Authors: Mohamed Trabelsi, Huseyin Uzunalioglu
Arxiv: https://arxiv.org/abs/2306.04787
TLDR: Multi-document summarization (MDS) refers to the task of summarizing the text in multiple documents into a concise summary. The generated summary can save the time of reading many documents by providing the important content in the form of a few sentences. Abstractive MDS aims to generate a coherent and fluent summary for multiple documents using natural language generation techniques. In this paper, we consider the unsupervised abstractive MDS setting where there are only documents with no groundtruth
Repo: None
Reference Matters: Benchmarking Factual Error Correction for Dialogue Summarization with Fine-grained Evaluation Framework
Authors: Mingqi Gao, Xiaojun Wan, Jia Su, Zhefeng Wang, Baoxing Huai
Arxiv: https://arxiv.org/abs/2306.05119
TLDR: Factuality is important to dialogue summarization. Factual error correction (FEC) of model-generated summaries is one way to improve factuality. Current FEC evaluation that relies on factuality metrics is not reliable and detailed enough. To address this problem, we are the first to manually annotate an FEC dataset for dialogue summarization containing 4000 items and propose FERRANTI, a fine-grained evaluation framework based on reference correction that automatically evaluates the performance of FEC models on
Repo: None
Overview of the Problem List Summarization (ProbSum) 2023 Shared Task on Summarizing Patients' Active Diagnoses and Problems from Electronic Health Record Progress Notes
Authors: Yanjun Gao, Dmitriy Dligach, Timothy Miller, Matthew M. Churpek, Majid Afshar
Arxiv: https://arxiv.org/abs/2306.05270
TLDR: The BioNLP Workshop 2023 initiated the launch of a shared task on Problem List Summarization (ProbSum) in January 2023. The aim of this shared task is to attract future research efforts in building NLP models for real-world diagnostic decision support applications, where a system generating relevant and accurate diagnoses will augment the healthcare providers' decision-making process and improve the quality of care for patients. The goal for participants is to develop models that generate a list of diagnoses and
Repo: None
CUED at ProbSum 2023: Hierarchical Ensemble of Summarization Models
Authors: Potsawee Manakul, Yassir Fathullah, Adian Liusie, Vyas Raina, Vatsal Raina, Mark Gales
Arxiv: https://arxiv.org/abs/2306.05317
TLDR: In this paper, we consider the challenge of summarizing patients' medical progress notes in a limited data setting. For the Problem List Summarization (shared task 1A) at the BioNLP Workshop 2023, we demonstrate that Clinical-T5 fine-tuned to 765 medical clinic notes outperforms other extractive, abstractive and zero-shot baselines, yielding reasonable baseline systems for medical note summarization. Further, we introduce Hierarchical Ensemble of Sum
Repo: None

Keyword: text generation

SequenceMatch: Imitation Learning for Autoregressive Sequence Modelling with Backtracking

Authors: Chris Cundy, Stefano Ermon
Arxiv: https://arxiv.org/abs/2306.05426
TLDR: In many domains, autoregressive models can achieve high likelihood on the task of predicting the next observation. However, this maximum-likelihood (MLE) objective does not necessarily match a downstream use-case of autoregressively generating high-quality sequences. The MLE objective weights sequences proportionally to their frequency under the data distribution, with no guidance for the model's behaviour out of distribution (OOD), leading to compounding error. In order to address
Repo: None

e-tornike self-assigned this Jun 9, 2023
