Paper | Authors | Affiliations | Abstract | Translation (Chinese) | Code | Citations |
---|---|---|---|---|---|---|
Enhancing Click-through Rate Prediction in Recommendation Domain with Search Query Representation | Yuening Wang, Man Chen, Yaochen Hu, Wei Guo, Yingxue Zhang, Huifeng Guo, Yong Liu, Mark Coates | Huawei Noah's Ark Lab, Shenzhen, China; Huawei Noah's Ark Lab, Singapore, Singapore; McGill University, Montreal, Canada; Huawei Noah's Ark Lab, Markham, Canada; Huawei Noah's Ark Lab, Montreal, Canada | Many platforms, such as e-commerce websites, offer both search and recommendation services simultaneously to better meet users' diverse needs. Recommendation services suggest items based on user preferences, while search services allow users to search for items before providing recommendations. Since users and items are often shared between the search and recommendation domains, there is a valuable opportunity to enhance the recommendation domain by leveraging user preferences extracted from the search domain. Existing approaches either overlook the shift in user intention between these domains or fail to capture the significant impact of learning from users' search queries on understanding their interests. In this paper, we propose a framework that learns from user search query embeddings within the context of user preferences in the recommendation domain. Specifically, user search query sequences from the search domain are used to predict the items users will click at the next time point in the recommendation domain. Additionally, the relationship between queries and items is explored through contrastive learning. To address issues of data sparsity, the diffusion model is incorporated to infer positive items the user will select after searching with certain queries in a denoising manner, which is particularly effective in preventing false positives. Effectively extracting this information, the queries are integrated into click-through rate prediction in the recommendation domain. Experimental analysis demonstrates that our model outperforms state-of-the-art models in the recommendation domain. | 许多平台,如电子商务网站,同时提供搜索和推荐服务,以更好地满足用户多样化的需求。推荐服务根据用户的偏好推荐商品,而搜索服务则允许用户在提供推荐之前搜索商品。由于用户和商品通常在搜索和推荐领域之间共享,因此有机会通过利用从搜索领域提取的用户偏好来增强推荐领域。现有方法要么忽略了这两个领域之间用户意图的转变,要么未能捕捉到从用户搜索查询中学习对理解用户兴趣的重大影响。本文提出了一种框架,该框架在推荐领域的用户偏好背景下学习用户搜索查询嵌入。具体来说,使用搜索领域的用户搜索查询序列来预测用户在推荐领域中下一次点击的商品。此外,通过对比学习探索查询与商品之间的关系。为了解决数据稀疏性问题,采用了扩散模型以去噪方式推断用户在使用某些查询进行搜索后将选择的正向商品,这在防止误报方面特别有效。有效地提取这些信息后,将查询整合到推荐领域的点击率预测中。实验分析表明,我们的模型在推荐领域的表现优于最先进的模型。 | code | 0 |
Calibration-Disentangled Learning and Relevance-Prioritized Reranking for Calibrated Sequential Recommendation | Hyunsik Jeon, Seeun Yoon, Julian J. McAuley | | Calibrated recommendation, which aims to maintain personalized proportions of categories within recommendations, is crucial in practical scenarios since it enhances user satisfaction by reflecting diverse interests. However, achieving calibration in a sequential setting (i.e., calibrated sequential recommendation) is challenging due to the need to adapt to users' evolving preferences. Previous methods typically leverage reranking algorithms to calibrate recommendations after training a model without considering the effect of calibration and do not effectively tackle the conflict between relevance and calibration during the reranking process. In this work, we propose LeapRec (Calibration-Disentangled Learning and Relevance-Prioritized Reranking), a novel approach for the calibrated sequential recommendation that addresses these challenges. LeapRec consists of two phases, model training phase and reranking phase. In the training phase, a backbone model is trained using our proposed calibration-disentangled learning-to-rank loss, which optimizes personalized rankings while integrating calibration considerations. In the reranking phase, relevant items are prioritized at the top of the list, with items needed for calibration following later to address potential conflicts between relevance and calibration. Through extensive experiments on four real-world datasets, we show that LeapRec consistently outperforms previous methods in the calibrated sequential recommendation. Our code is available at https://github.com/jeon185/LeapRec. | 校准推荐旨在保持推荐中类别的个性化比例,这在实际场景中至关重要,因为它通过反映多样化的兴趣来增强用户满意度。然而,在序列环境中实现校准(即校准序列推荐)具有挑战性,因为需要适应用户不断变化的偏好。先前的方法通常利用重新排序算法在训练模型后进行推荐校准,而没有考虑校准效果,并且在重新排序过程中未能有效解决相关性与校准之间的冲突。在这项工作中,我们提出了LeapRec(校准解耦学习和相关性优先重新排序),这是一种新颖的校准序列推荐方法,旨在解决这些挑战。LeapRec包括两个阶段,模型训练阶段和重新排序阶段。在训练阶段,使用我们提出的校准解耦学习排序损失训练骨干模型,该损失在优化个性化排序的同时整合了校准考虑。在重新排序阶段,相关项目优先置于列表顶部,而需要校准的项目随后放置,以解决相关性与校准之间可能的冲突。通过在四个真实世界数据集上的广泛实验,我们展示了LeapRec在校准序列推荐方面始终优于先前的方法。我们的代码可在https://github.com/jeon185/LeapRec获取。 | code | 0 |
Aligning Query Representation with Rewritten Query and Relevance Judgments in Conversational Search | Fengran Mo, Chen Qu, Kelong Mao, Yihong Wu, Zhan Su, Kaiyu Huang, JianYun Nie | | Conversational search supports multi-turn user-system interactions to solve complex information needs. Different from the traditional single-turn ad-hoc search, conversational search encounters a more challenging problem of context-dependent query understanding with the lengthy and long-tail conversational history context. While conversational query rewriting methods leverage explicit rewritten queries to train a rewriting model to transform the context-dependent query into a stand-alone search query, this is usually done without considering the quality of search results. Conversational dense retrieval methods use fine-tuning to improve a pre-trained ad-hoc query encoder, but they are limited by the conversational search data available for training. In this paper, we leverage both rewritten queries and relevance judgments in the conversational search data to train a better query representation model. The key idea is to align the query representation with those of rewritten queries and relevant documents. The proposed model, Query Representation Alignment Conversational Dense Retriever (QRACDR), is tested on eight datasets, including various settings in conversational search and ad-hoc search. The results demonstrate the strong performance of QRACDR compared with state-of-the-art methods, and confirm the effectiveness of representation alignment. | 对话搜索支持多轮用户-系统交互,以解决复杂的信息需求。与传统的单轮即席搜索不同,对话搜索面临着一个更具挑战性的问题,即在长篇且长尾的对话历史背景下进行依赖上下文的查询理解。虽然对话查询重写方法利用显式的重写查询来训练重写模型,将依赖上下文的查询转换为独立的搜索查询,但这通常不考虑搜索结果的质量。对话密集检索方法通过微调预训练的即席查询编码器来改进,但受限于可用于训练的对话搜索数据。本文中,我们利用对话搜索数据中的重写查询和相关性判断来训练一个更好的查询表示模型。关键思想是将查询表示与重写查询和相关文档的表示对齐。提出的模型——查询表示对齐对话密集检索器(QRACDR),在八个数据集上进行了测试,包括对话搜索和即席搜索的各种设置。结果显示,QRACDR相比最先进的方法表现出强劲的性能,并证实了表示对齐的有效性。 | code | 0 |
Improved Estimation of Ranks for Learning Item Recommenders with Negative Sampling | Anushya Subbiah, Steffen Rendle, Vikram Aggarwal | | In recommendation systems, there has been a growth in the number of recommendable items (# of movies, music, products). When the set of recommendable items is large, training and evaluation of item recommendation models becomes computationally expensive. To lower this cost, it has become common to sample negative items. However, the recommendation quality can suffer from biases introduced by traditional negative sampling mechanisms. In this work, we demonstrate the benefits of correcting the bias introduced by sampling of negatives. We first provide a sampled batch version of the well-studied WARP and LambdaRank methods. Then, we present how these methods can benefit from improved ranking estimates. Finally, we evaluate the recommendation quality as a result of correcting rank estimates and demonstrate that WARP and LambdaRank can be learned efficiently with negative sampling and our proposed correction technique. | 在推荐系统中,可推荐项目的数量(如电影、音乐、产品)有所增加。当可推荐项目的集合规模较大时,训练和评估项目推荐模型的计算成本会变得非常高。为了降低这一成本,通常会采用负样本采样的方法。然而,传统的负样本采样机制可能会引入偏差,从而影响推荐质量。在这项工作中,我们展示了通过纠正负样本采样引入的偏差所带来的好处。我们首先提供了经过深入研究的WARP和LambdaRank方法的采样批次版本。然后,我们展示了这些方法如何从改进的排序估计中受益。最后,我们评估了纠正排序估计后的推荐质量,并证明WARP和LambdaRank可以通过负样本采样和我们的修正技术高效地进行学习。 | code | 0 |
Scalable Dynamic Embedding Size Search for Streaming Recommendation | Yunke Qu, Liang Qu, Tong Chen, Xiangyu Zhao, Quoc Viet Hung Nguyen, Hongzhi Yin | | Recommender systems typically represent users and items by learning their embeddings, which are usually set to uniform dimensions and dominate the model parameters. However, real-world recommender systems often operate in streaming recommendation scenarios, where the number of users and items continues to grow, leading to substantial storage resource consumption for these embeddings. Although a few methods attempt to mitigate this by employing embedding size search strategies to assign different embedding dimensions in streaming recommendations, they assume that the embedding size grows with the frequency of users/items, which eventually still exceeds the predefined memory budget over time. To address this issue, this paper proposes to learn Scalable Lightweight Embeddings for streaming recommendation, called SCALL, which can adaptively adjust the embedding sizes of users/items within a given memory budget over time. Specifically, we propose to sample embedding sizes from a probabilistic distribution, with the guarantee to meet any predefined memory budget. By fixing the memory budget, the proposed embedding size sampling strategy can increase and decrease the embedding sizes in accordance with the frequency of the corresponding users or items. Furthermore, we develop a reinforcement learning-based search paradigm that models each state with mean pooling to keep the length of the state vectors fixed, invariant to the changing number of users and items. As a result, the proposed method can provide embedding sizes to unseen users and items. Comprehensive empirical evaluations on two public datasets affirm the advantageous effectiveness of our proposed method. | 推荐系统通常通过学习嵌入来表示用户和物品,这些嵌入通常设置为统一的维度,并且主导模型参数。然而,现实世界的推荐系统经常在流式推荐场景中运行,其中用户和物品的数量持续增长,导致这些嵌入的存储资源消耗巨大。尽管一些方法试图通过采用嵌入大小搜索策略在流式推荐中分配不同的嵌入维度来缓解这一问题,但它们假设嵌入大小随着用户/物品的频率增长,最终仍然会超过预定义的内存预算。为了解决这个问题,本文提出了一种名为SCALL的流式推荐可扩展轻量级嵌入学习方法,它能够在给定的内存预算内随时间自适应地调整用户/物品的嵌入大小。具体来说,我们提出从概率分布中采样嵌入大小,以确保满足任何预定义的内存预算。通过固定内存预算,所提出的嵌入大小采样策略可以根据相应用户或物品的频率增加或减少嵌入大小。此外,我们开发了一种基于强化学习的搜索范式,该范式通过均值池化来建模每个状态,以保持状态向量的长度固定,不受用户和物品数量变化的影响。因此,所提出的方法可以为未见过的用户和物品提供嵌入大小。在两个公共数据集上的综合实证评估证实了我们提出的方法的优势有效性。 | code | 0 |
Ask or Recommend: An Empirical Study on Conversational Product Search | Heli Ma, Jie Zou, Mohammad Aliannejadi, Evangelos Kanoulas, Yi Bin, Yang Yang | University of Electronic Science and Technology of China, Chengdu, China; University of Amsterdam, Amsterdam, Netherlands; University of Science and Technology of China, Chengdu, China; Tongji University, Shanghai, China; University of Amsterdam, Amsterdam, Netherlands | Conversational Product Search (CPS) provides an engaging way for users to find products through effective natural language conversations. However, understanding the effect of conversational characteristics on user search performance and when to ask clarifying questions or recommend products remains unexplored. To fill the gap, we conduct an empirical study in this paper. Specifically, we developed a conversational system that allows participants to join as customers or shopping assistants, to simulate the conversational product search activity. Data collected from conversations and participant feedback indicate that: (a) CPS systems tend to ask clarifying questions early in the conversation when users express the intent of issuing a new query and chitchat, while they tend to recommend products at a later stage of conversations; asking clarifying questions early and recommending products late can significantly improve search performance and user satisfaction; (b) asking clarifying questions and more fine-grained search keywords positively influence search performance in terms of finding relevant products; (c) although the conversation time has a positive impact on the number of recommended products, the performance gain diminishes with longer conversation time; (d) more clarifying questions, more conversation turns, and longer system response time lead to decreased user satisfaction. | 对话式产品搜索(Conversational Product Search, CPS)为用户提供了一种通过高效自然语言对话寻找产品的互动方式。然而,对话特性对用户搜索表现的影响以及何时提问澄清问题或推荐产品的问题尚未得到深入研究。为了填补这一空白,本文进行了一项实证研究。具体而言,我们开发了一个对话系统,允许参与者扮演顾客或购物助手的角色,模拟对话式产品搜索活动。从对话中收集的数据及参与者反馈表明:(a) CPS系统在用户表达新查询意图和闲聊时,倾向于在对话初期提问澄清问题,而在对话后期则更倾向于推荐产品;尽早提问澄清问题和延迟推荐产品可以显著提升搜索表现和用户满意度;(b)提问澄清问题和更细粒度的搜索关键词对查找相关产品有正面影响,从而提高搜索表现;(c)尽管对话时间对推荐产品数量有正面影响,但随对话时间延长,性能提升逐渐减少;(d)更多的澄清问题、更多的对话轮次和更长的系统响应时间会导致用户满意度下降。 | code | 0 |
Towards Better Seach Query Classification with Distribution-Diverse Multi-Expert Knowledge Distillation in JD Ads Search | KunPeng Ning, Ming Pang, Zheng Fang, Xue Jiang, XiWei Zhao, Changping Peng, Zhangang Lin, Jinghe Hu, Jingping Shao, Li Yuan | Peking University, Shenzhen, China; Business Growth BU, JD.COM, Beijing, China; Peking University, ShenZhen, China | In the dynamic landscape of online advertising, decoding user intent remains a pivotal challenge, particularly in the context of query classification. Swift classification models, exemplified by FastText, cater to the demand for real-time responses but encounter limitations in handling intricate queries. Conversely, accuracy-centric models like BERT introduce challenges associated with increased latency. This paper undertakes a nuanced exploration, navigating the delicate balance between efficiency and accuracy. It unveils FastText's latent potential as an 'online dictionary' for historical queries while harnessing the semantic robustness of BERT for novel and complex scenarios. The proposed Distribution-Diverse Multi-Expert (DDME) framework employs multiple teacher models trained from diverse data distributions. Through meticulous data categorization and enrichment, it elevates the classification performance across the query spectrum. Empirical results within the JD ads search system validate the superiority of our proposed approaches. | 在在线广告的动态环境中,解读用户意图仍然是一个关键挑战,尤其是在查询分类的背景下。以FastText为代表的快速分类模型满足了实时响应的需求,但在处理复杂查询时存在局限性。相反,以准确性为中心的模型如BERT,虽然引入了延迟增加的挑战,但在处理复杂查询时表现出色。本文深入探讨了在效率和准确性之间寻求微妙平衡的问题。研究发现,FastText作为历史查询的“在线词典”具有潜在价值,同时利用BERT的语义丰富性来应对新颖和复杂的场景。提出的分布多样多专家(DDME)框架采用了从不同数据分布中训练的多个教师模型。通过细致的数据分类和丰富化处理,该框架提升了查询分类的整体性能。在京东广告搜索系统中的实证结果验证了我们提出的方法的优越性。 | code | 0 |
Spectral and Geometric Spaces Representation Regularization for Multi-Modal Sequential Recommendation | Zihao Li, Xuekong Xu, Zuoli Tang, Lixin Zou, Qian Wang, Chenliang Li | | Recent works demonstrate the effectiveness of multi-modal information for sequential recommendation. However, the computational cost and representation degeneration have not been specifically studied or adequately addressed in multi-modality recommendation. To this end, we first identify and formalize three properties, i.e., diversity, compactness, and consistency, from the geometric space and spectrum perspective. Building upon this foundation, we devise tailored loss functions to regularize the above three properties for representation optimization. Theoretical underpinnings and experimental results demonstrate the efficacy of an enhanced item representation in ameliorating degeneration. Furthermore, we propose an efficient and expandable image-centered method, named E2 ImgRec, to mitigate the immense cost of computation. Concretely, we substitute the linear projection operations in the self-attention module and feed-forward network layer with two learnable rescaling vectors for efficient recommendation, then leverage cross-attention for multi-modality information fusion. Extensive experiments on three public datasets illustrate that our method outperforms representative ID-based solutions and multi-modal based state-of-the-arts with only up to 39.9% in memory usage and 4.3× acceleration in training time. The code for replication is available at https://github.com/WHUIR/E2ImgRec. | 近期的研究展示了多模态信息在序列推荐中的有效性。然而,多模态推荐中的计算成本和表示退化问题尚未得到充分关注和解决。为此,我们首先从几何空间和频谱的角度识别并形式化了三个特性,即多样性、紧凑性和一致性。在此基础上,我们设计了定制的损失函数来规范上述三个特性,以优化表示。理论基础和实验结果表明,增强的物品表示能够有效改善退化问题。此外,我们提出了一种高效且可扩展的以图像为中心的方法,名为E2 ImgRec,以缓解巨大的计算成本。具体而言,我们用两个可学习的重缩放向量替代了自注意力模块和前馈网络层中的线性投影操作,并利用交叉注意力进行多模态信息融合。在三个公开数据集上的广泛实验表明,我们的方法在内存使用率最高仅为39.9%和训练时间加速4.3倍的情况下,优于基于ID的代表性解决方案和多模态的最新技术。可复现代码已发布在https://github.com/WHUIR/E2ImgRec。 | code | 0 |
Retrieval-Oriented Knowledge for Click-Through Rate Prediction | Huanshuo Liu, Bo Chen, Menghui Zhu, Jianghao Lin, Jiarui Qin, Hao Zhang, Yang Yang, Ruiming Tang | | Click-through rate (CTR) prediction plays an important role in personalized recommendations. Recently, sample-level retrieval-based models (e.g., RIM) have achieved remarkable performance by retrieving and aggregating relevant samples. However, their inefficiency at the inference stage makes them impractical for industrial applications. To overcome this issue, this paper proposes a universal plug-and-play Retrieval-Oriented Knowledge (ROK) framework. Specifically, a knowledge base, consisting of a retrieval-oriented embedding layer and a knowledge encoder, is designed to preserve and imitate the retrieved aggregated representations in a decomposition-reconstruction paradigm. Knowledge distillation and contrastive learning methods are utilized to optimize the knowledge base, and the learned retrieval-enhanced representations can be integrated with arbitrary CTR models in both instance-wise and feature-wise manners. Extensive experiments on three large-scale datasets show that ROK achieves competitive performance with the retrieval-based CTR models while preserving superior inference efficiency and model compatibility. | 点击率(CTR)预测在个性化推荐中扮演着重要角色。近期,基于样本级检索的模型(如RIM)通过检索并聚合相关样本来取得了显著的性能。然而,这些模型在推理阶段的效率低下使其难以应用于工业场景。为解决这一问题,本文提出了一种通用的即插即用型检索导向知识(ROK)框架。具体而言,设计了一个由检索导向嵌入层和知识编码器组成的知识库,该知识库在分解-重构范式中保留并模仿检索到的聚合表示。利用知识蒸馏和对比学习方法来优化知识库,所学到的检索增强表示可以与任意CTR模型在实例级和特征级方式上进行集成。在三个大规模数据集上的广泛实验表明,ROK在保留优越的推理效率和模型兼容性的同时,实现了与基于检索的CTR模型相当的性能。 | code | 0 |
Mitigating Exposure Bias in Online Learning to Rank Recommendation: A Novel Reward Model for Cascading Bandits | Masoud Mansoury, Bamshad Mobasher, Herke van Hoof | | Exposure bias is a well-known issue in recommender systems where items and suppliers are not equally represented in the recommendation results. This bias becomes particularly problematic over time as a few items are repeatedly over-represented in recommendation lists, leading to a feedback loop that further amplifies this bias. Although extensive research has addressed this issue in model-based or neighborhood-based recommendation algorithms, less attention has been paid to online recommendation models, such as those based on top-K contextual bandits, where recommendation models are dynamically updated with ongoing user feedback. In this paper, we study exposure bias in a class of well-known contextual bandit algorithms known as Linear Cascading Bandits. We analyze these algorithms in their ability to handle exposure bias and provide a fair representation of items in the recommendation results. Our analysis reveals that these algorithms fail to mitigate exposure bias in the long run during the course of ongoing user interactions. We propose an Exposure-Aware reward model that updates the model parameters based on two factors: 1) implicit user feedback and 2) the position of the item in the recommendation list. The proposed model mitigates exposure bias by controlling the utility assigned to the items based on their exposure in the recommendation list. Our experiments with two real-world datasets show that our proposed reward model improves the exposure fairness of the linear cascading bandits over time while maintaining the recommendation accuracy. It also outperforms the current baselines. Finally, we prove a high probability upper regret bound for our proposed model, providing theoretical guarantees for its performance. | 曝光偏差是推荐系统中一个众所周知的问题,其中物品和供应商在推荐结果中的表现并不均衡。随着时间的推移,这种偏差变得尤为严重,因为少数物品在推荐列表中被过度重复展示,形成了一个反馈循环,进一步加剧了这种偏差。尽管大量研究已经解决了基于模型或基于邻域的推荐算法中的这一问题,但对于在线推荐模型(如基于top-K上下文强盗的模型)的关注较少,这些模型会根据用户的持续反馈动态更新推荐模型。在本文中,我们研究了一类著名的上下文强盗算法——线性级联强盗算法中的曝光偏差问题。我们分析了这些算法在处理曝光偏差和在推荐结果中公平展示物品方面的能力。我们的分析表明,这些算法在长期用户交互过程中无法有效缓解曝光偏差。我们提出了一种曝光感知奖励模型,该模型根据两个因素更新模型参数:1)隐式用户反馈和2)物品在推荐列表中的位置。所提出的模型通过根据物品在推荐列表中的曝光程度调整分配给它们的效用,来缓解曝光偏差。我们在两个真实世界数据集上的实验表明,所提出的奖励模型随着时间的推移提高了线性级联强盗算法的曝光公平性,同时保持了推荐准确性。此外,它还优于当前的基线模型。最后,我们证明了所提出模型的高概率上界遗憾界限,为其性能提供了理论保证。 | code | 0 |
MemoCRS: Memory-enhanced Sequential Conversational Recommender Systems with Large Language Models | Yunjia Xi, Weiwen Liu, Jianghao Lin, Bo Chen, Ruiming Tang, Weinan Zhang, Yong Yu | | Conversational recommender systems (CRSs) aim to capture user preferences and provide personalized recommendations through multi-round natural language dialogues. However, most existing CRS models mainly focus on dialogue comprehension and preferences mining from the current dialogue session, overlooking user preferences in historical dialogue sessions. The preferences embedded in the user's historical dialogue sessions and the current session exhibit continuity and sequentiality, and we refer to CRSs with this characteristic as sequential CRSs. In this work, we leverage memory-enhanced LLMs to model the preference continuity, primarily focusing on addressing two key issues: (1) redundancy and noise in historical dialogue sessions, and (2) the cold-start users problem. To this end, we propose a Memory-enhanced Conversational Recommender System Framework with Large Language Models (dubbed MemoCRS) consisting of user-specific memory and general memory. User-specific memory is tailored to each user for their personalized interests and implemented by an entity-based memory bank to refine preferences and retrieve relevant memory, thereby reducing the redundancy and noise of historical sessions. The general memory, encapsulating collaborative knowledge and reasoning guidelines, can provide shared knowledge for users, especially cold-start users. With the two kinds of memory, LLMs are empowered to deliver more precise and tailored recommendations for each user. Extensive experiments on both Chinese and English datasets demonstrate the effectiveness of MemoCRS. | 对话推荐系统(CRSs)旨在通过多轮自然语言对话捕捉用户偏好并提供个性化推荐。然而,大多数现有的CRS模型主要关注当前对话会话中的对话理解和偏好挖掘,忽视了历史对话会话中的用户偏好。用户历史对话会话和当前会话中嵌入的偏好具有连续性和顺序性,我们将具备这种特性的CRS称为顺序CRS。在本研究中,我们利用记忆增强型LLMs来建模偏好连续性,主要解决两个关键问题:(1)历史对话会话中的冗余和噪声,(2)冷启动用户问题。为此,我们提出了一种基于大语言模型的记忆增强对话推荐系统框架(称为MemoCRS),该框架包括用户特定记忆和通用记忆。用户特定记忆针对每个用户的个性化兴趣定制,并通过基于实体的记忆库实现,以精炼偏好并检索相关记忆,从而减少历史会话的冗余和噪声。通用记忆封装了协作知识和推理指南,可以为所有用户提供共享知识,特别是冷启动用户。通过这两种记忆,LLMs能够为每个用户提供更精确和定制化的推荐。在中英文数据集上的广泛实验证明了MemoCRS的有效性。 | code | 0 |
Early Exit Strategies for Approximate k-NN Search in Dense Retrieval | Francesco Busolin, Claudio Lucchese, Franco Maria Nardini, Salvatore Orlando, Raffaele Perego, Salvatore Trani | | Learned dense representations are a popular family of techniques for encoding queries and documents using high-dimensional embeddings, which enable retrieval by performing approximate k nearest-neighbors search (A-kNN). A popular technique for making A-kNN search efficient is based on a two-level index, where the embeddings of documents are clustered offline and, at query processing, a fixed number N of clusters closest to the query is visited exhaustively to compute the result set. In this paper, we build upon state-of-the-art for early exit A-kNN and propose an unsupervised method based on the notion of patience, which can reach competitive effectiveness with large efficiency gains. Moreover, we discuss a cascade approach where we first identify queries that find their nearest neighbor within the closest t << N clusters, and then we decide how many more to visit based on our patience approach or other state-of-the-art strategies. Reproducible experiments employing state-of-the-art dense retrieval models and publicly available resources show that our techniques improve the A-kNN efficiency with up to 5x speedups while achieving negligible effectiveness losses. All the code used is available at https://github.com/francescobusolin/faiss_pEE | 学习到的密集表示是一种流行的技术家族,用于使用高维嵌入对查询和文档进行编码,通过执行近似k近邻搜索(A-kNN)来实现检索。使A-kNN搜索高效的一种流行技术是基于两级索引,其中文档的嵌入在离线状态下被聚类,在查询处理时,固定数量的N个最接近查询的聚类被穷尽地访问以计算结果集。在本文中,我们基于最先进的早期退出A-kNN技术,提出了一种基于耐心理念的无监督方法,该方法能够在大幅提高效率的同时达到竞争性的有效性。此外,我们讨论了一种级联方法,首先识别在其最接近的t << N个聚类内找到最近邻的查询,然后根据我们的耐心理念或其他最先进策略决定访问更多聚类的数量。使用最先进的密集检索模型和公开可用资源的可重复实验表明,我们的技术在实现几乎无有效性损失的情况下,将A-kNN效率提高了最多5倍的速度。所有使用的代码均可在https://github.com/francescobusolin/faiss_pEE获取。 | code | 0 |
MODRL-TA: A Multi-Objective Deep Reinforcement Learning Framework for Traffic Allocation in E-Commerce Search | Peng Cheng, Huimu Wang, Jinyuan Zhao, Yihao Wang, Enqiang Xu, Yu Zhao, Zhuojian Xiao, Songlin Wang, Guoyu Tang, Lin Liu, Sulong Xu | | Traffic allocation is a process of redistributing natural traffic to products by adjusting their positions in the post-search phase, aimed at effectively fostering merchant growth, precisely meeting customer demands, and ensuring the maximization of interests across various parties within e-commerce platforms. Existing methods based on learning to rank neglect the long-term value of traffic allocation, whereas approaches of reinforcement learning suffer from balancing multiple objectives and the difficulties of cold starts within real-world data environments. To address the aforementioned issues, this paper proposes a multi-objective deep reinforcement learning framework consisting of multi-objective Q-learning (MOQ), a decision fusion algorithm (DFM) based on the cross-entropy method (CEM), and a progressive data augmentation system (PDA). Specifically, MOQ constructs ensemble RL models, each dedicated to an objective, such as click-through rate, conversion rate, etc. These models individually determine the position of items as actions, aiming to estimate the long-term value of multiple objectives from an individual perspective. Then we employ DFM to dynamically adjust weights among objectives to maximize long-term value, addressing temporal dynamics in objective preferences in e-commerce scenarios. Initially, PDA trained MOQ with simulated data from offline logs. As experiments progressed, it strategically integrated real user interaction data, ultimately replacing the simulated dataset to alleviate distributional shifts and the cold start problem. Experimental results on real-world online e-commerce systems demonstrate the significant improvements of MODRL-TA, and we have successfully deployed MODRL-TA on an e-commerce search platform. | 流量分配是通过调整产品在搜索后阶段的位置来重新分配自然流量,旨在有效促进商家增长、精准满足客户需求,并确保电子商务平台各方的利益最大化。现有的基于学习排序的方法忽视了流量分配的长期价值,而强化学习的方法则在平衡多个目标和处理现实数据环境中的冷启动问题上存在困难。为解决上述问题,本文提出了一种多目标深度强化学习框架,包括多目标Q学习(MOQ)、基于交叉熵方法(CEM)的决策融合算法(DFM)和渐进式数据增强系统(PDA)。具体而言,MOQ构建了专注于不同目标(如点击率、转化率等)的集成强化学习模型,每个模型独立决定商品的位置作为动作,旨在从个体角度估计多个目标的长期价值。随后,我们采用DFM动态调整目标之间的权重以最大化长期价值,解决电子商务场景中目标偏好的时间动态性。最初,PDA使用离线日志中的模拟数据训练MOQ。随着实验的进行,它策略性地整合了真实用户交互数据,最终替代模拟数据集以缓解分布偏移和冷启动问题。在真实在线电子商务系统上的实验结果显示了MODRL-TA的显著改进,并且我们已成功将MODRL-TA部署在电子商务搜索平台上。 | code | 0 |
Enhancing CTR Prediction through Sequential Recommendation Pre-training: Introducing the SRP4CTR framework | Ruidong Han, Qianzhong Li, He Jiang, Rui Li, Yurou Zhao, Xiang Li, Wei Lin | | Understanding user interests is crucial for Click-Through Rate (CTR) prediction tasks. In sequential recommendation, pre-training from user historical behaviors through self-supervised learning can better comprehend user dynamic preferences, presenting the potential for direct integration with CTR tasks. Previous methods have integrated pre-trained models into downstream tasks with the sole purpose of extracting semantic information or well-represented user features, which are then incorporated as new features. However, these approaches tend to ignore the additional inference costs to the downstream tasks, and they do not consider how to transfer the effective information from the pre-trained models for specific estimated items in CTR prediction. In this paper, we propose a Sequential Recommendation Pre-training framework for CTR prediction (SRP4CTR) to tackle the above problems. Initially, we discuss the impact of introducing pre-trained models on inference costs. Subsequently, we introduce a pre-training method to encode sequence side information concurrently. During the fine-tuning process, we incorporate a cross-attention block to establish a bridge between estimated items and the pre-trained model at a low cost. Moreover, we develop a querying transformer technique to facilitate the knowledge transfer from the pre-trained model to industrial CTR models. Offline and online experiments show that our method outperforms previous baseline models. | 理解用户兴趣对于点击率(CTR)预测任务至关重要。在序列推荐中,通过自监督学习从用户历史行为中进行预训练,能更好地理解用户的动态偏好,为直接整合到CTR任务中提供了潜力。以往的方法将预训练模型整合到下游任务中,主要是为了提取语义信息或良好表示的用户特征,并将其作为新特征引入。然而,这些方法往往忽略了增加的推理成本,以及如何将预训练模型中的有效信息传递给CTR预测中特定的估计项。本文提出了一种用于CTR预测的序列推荐预训练框架(SRP4CTR),以解决上述问题。首先,我们讨论了引入预训练模型对推理成本的影响。接着,我们引入了一种预训练方法,以同时编码序列侧信息。在微调过程中,我们通过一个交叉注意力模块,以较低的成本在估计项和预训练模型之间建立桥梁。此外,我们开发了一种查询变换器技术,以促进预训练模型中的知识向工业CTR模型的转移。离线和在线实验表明,我们的方法优于以往的基线模型。 | code | 0 |
MARS: Matching Attribute-aware Representations for Text-based Sequential Recommendation | Hyunsoo Kim, Junyoung Kim, Minjin Choi, Sunkyung Lee, Jongwuk Lee | | Sequential recommendation aims to predict the next item a user is likely to prefer based on their sequential interaction history. Recently, text-based sequential recommendation has emerged as a promising paradigm that uses pre-trained language models to exploit textual item features to enhance performance and facilitate knowledge transfer to unseen datasets. However, existing text-based recommender models still struggle with two key challenges: (i) representing users and items with multiple attributes, and (ii) matching items with complex user interests. To address these challenges, we propose a novel model, Matching Attribute-aware Representations for Text-based Sequential Recommendation (MARS). MARS extracts detailed user and item representations through attribute-aware text encoding, capturing diverse user intents with multiple attribute-aware representations. It then computes user-item scores via attribute-wise interaction matching, effectively capturing attribute-level user preferences. Our extensive experiments demonstrate that MARS significantly outperforms existing sequential models, achieving improvements of up to 24.43% and 29.26% in Recall@10 and NDCG@10 across five benchmark datasets. Code is available at https://github.com/junieberry/MARS | 顺序推荐旨在根据用户的顺序交互历史预测他们可能偏好的下一个项目。近年来,基于文本的顺序推荐作为一种有前景的范式出现,它利用预训练的语言模型来利用文本项目特征,以提升性能并促进知识向未见数据集的转移。然而,现有的基于文本的推荐模型仍面临两个关键挑战:(i)用多个属性表示用户和项目,以及(ii)匹配具有复杂用户兴趣的项目。为解决这些挑战,我们提出了一种新颖的模型,即基于文本的顺序推荐匹配属性感知表示(MARS)。MARS通过属性感知的文本编码提取详细的用户和项目表示,利用多个属性感知表示捕捉多样化的用户意图。然后,它通过属性层面的交互匹配计算用户-项目分数,有效捕捉属性级别的用户偏好。我们的广泛实验表明,MARS显著优于现有的顺序推荐模型,在五个基准数据集上的Recall@10和NDCG@10分别提高了24.43%和29.26%。代码可在https://github.com/junieberry/MARS获取。 | code | 0 |
How to Leverage Personal Textual Knowledge for Personalized Conversational Information Retrieval | Fengran Mo, Longxiang Zhao, Kaiyu Huang, Yue Dong, Degen Huang, JianYun Nie | | Personalized conversational information retrieval (CIR) combines conversational and personalizable elements to satisfy various users' complex information needs through multi-turn interaction based on their backgrounds. The key promise is that the personal textual knowledge base (PTKB) can improve the CIR effectiveness because the retrieval results can be more related to the user's background. However, PTKB is noisy: not every piece of knowledge in PTKB is relevant to the specific query at hand. In this paper, we explore and test several ways to select knowledge from PTKB and use it for query reformulation by using a large language model (LLM). The experimental results show the PTKB might not always improve the search results when used alone, but LLM can help generate a more appropriate personalized query when high-quality guidance is provided. | 个性化对话信息检索(CIR)结合了对话性和可个性化元素,通过基于用户背景的多轮交互来满足不同用户的复杂信息需求。其核心优势在于,个性化文本知识库(PTKB)能够提升CIR的效果,因为检索结果可以更贴近用户的背景。然而,PTKB存在噪声问题:并非PTKB中的每条知识都与当前的具体查询相关。本文探讨并测试了几种从PTKB中选择知识并用于查询重构的方法,这些方法借助大型语言模型(LLM)实现。实验结果表明,单独使用PTKB并不总能提升搜索结果,但在高质量指引下,LLM能够生成更为合适的个性化查询。 | code | 0 |
Enhancing Relevance of Embedding-based Retrieval at Walmart | Juexin Lin, Sachin Yadav, Feng Liu, Nicholas Rossi, Praveen Reddy Suram, Satya Chembolu, Prijith Chandran, Hrushikesh Mohapatra, Tony Lee, Alessandro Magnani, Ciya Liao | Embedding-based neural retrieval (EBR) is an effective search retrieval method in product search for tackling the vocabulary gap between customer search queries and products. The initial launch of our EBR system at Walmart yielded significant gains in relevance and add-to-cart rates [1]. However, despite EBR generally retrieving more relevant products for reranking, we have observed numerous instances of relevance degradation. Enhancing retrieval performance is crucial, as it directly influences product reranking and affects the customer shopping experience. Factors contributing to these degradations include false positives/negatives in the training data and the inability to handle query misspellings. To address these issues, we present several approaches to further strengthen the capabilities of our EBR model in terms of retrieval relevance. We introduce a Relevance Reward Model (RRM) based on human relevance feedback. We utilize RRM to remove noise from the training data and distill it into our EBR model through a multi-objective loss. In addition, we present the techniques to increase the performance of our EBR model, such as typo-aware training, and semi-positive generation. The effectiveness of our EBR is demonstrated through offline relevance evaluation, online AB tests, and successful deployments to live production. [1] Alessandro Magnani, Feng Liu, Suthee Chaidaroon, Sachin Yadav, Praveen Reddy Suram, Ajit Puthenputhussery, Sijie Chen, Min Xie, Anirudh Kashi, Tony Lee, et al. 2022. Semantic retrieval at walmart. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 3495-3503. 
| 基于嵌入的神经检索(EBR)是一种在产品搜索中有效应对客户搜索查询与产品之间词汇差异的搜索检索方法。我们最初在沃尔玛推出的EBR系统显著提升了相关性和加入购物车的比率[1]。然而,尽管EBR通常能检索到更相关的产品以进行重新排序,我们仍观察到许多相关性下降的情况。提升检索性能至关重要,因为它直接影响产品重新排序并影响客户购物体验。导致这些下降的因素包括训练数据中的假阳性/阴性以及无法处理查询拼写错误。为解决这些问题,我们提出了几种方法来进一步增强EBR模型在检索相关性方面的能力。我们引入了一个基于人类相关性反馈的相关性奖励模型(RRM)。我们利用RRM来消除训练数据中的噪声,并通过多目标损失将其提炼到EBR模型中。此外,我们还提出了提升EBR模型性能的技术,如拼写感知训练和半正例生成。通过离线相关性评估、在线AB测试以及成功部署到实际生产中,展示了我们EBR的有效性。[1] Alessandro Magnani, Feng Liu, Suthee Chaidaroon, Sachin Yadav, Praveen Reddy Suram, Ajit Puthenputhussery, Sijie Chen, Min Xie, Anirudh Kashi, Tony Lee, 等. 2022. 沃尔玛的语义检索. 在第28届ACM SIGKDD知识发现与数据挖掘会议论文集. 3495-3503. | code | 0 | |
Relevance Filtering for Embedding-based Retrieval | Nicholas Rossi, Juexin Lin, Feng Liu, Zhen Yang, Tony Lee, Alessandro Magnani, Ciya Liao | In embedding-based retrieval, Approximate Nearest Neighbor (ANN) search enables efficient retrieval of similar items from large-scale datasets. While maximizing recall of relevant items is usually the goal of retrieval systems, a low precision may lead to a poor search experience. Unlike lexical retrieval, which inherently limits the size of the retrieved set through keyword matching, dense retrieval via ANN search has no natural cutoff. Moreover, the cosine similarity scores of embedding vectors are often optimized via contrastive or ranking losses, which make them difficult to interpret. Consequently, relying on top-K or cosine-similarity cutoff is often insufficient to filter out irrelevant results effectively. This issue is prominent in product search, where the number of relevant products is often small. This paper introduces a novel relevance filtering component (called "Cosine Adapter") for embedding-based retrieval to address this challenge. Our approach maps raw cosine similarity scores to interpretable scores using a query-dependent mapping function. We then apply a global threshold on the mapped scores to filter out irrelevant results. We are able to significantly increase the precision of the retrieved set, at the expense of a small loss of recall. The effectiveness of our approach is demonstrated through experiments on both public MS MARCO dataset and internal Walmart product search data. Furthermore, online A/B testing on the Walmart site validates the practical value of our approach in real-world e-commerce settings. 
| 在基于嵌入的检索中,近似最近邻(ANN)搜索能够从大规模数据集中高效地检索相似项目。尽管最大化相关项目的召回率通常是检索系统的目标,但低精度可能会导致糟糕的搜索体验。与通过关键词匹配自然限制检索集大小的词法检索不同,通过ANN搜索的密集检索没有自然的截止点。此外,嵌入向量的余弦相似度分数通常通过对比或排序损失进行优化,这使得它们难以解释。因此,仅依赖于前K个结果或余弦相似度截止点往往不足以有效过滤掉不相关的结果。在产品搜索中,这一问题尤为突出,因为相关产品的数量通常较少。本文为基于嵌入的检索引入了一种新颖的相关性过滤组件(称为“余弦适配器”),以应对这一挑战。我们的方法使用查询依赖的映射函数将原始余弦相似度分数映射为可解释的分数,然后对映射后的分数应用全局阈值以过滤掉不相关的结果。我们能够在召回率小幅损失的情况下显著提高检索集的精度。通过在公共MS MARCO数据集和内部沃尔玛产品搜索数据上的实验,证明了我们方法的有效性。此外,在沃尔玛网站上的在线A/B测试验证了我们的方法在实际电子商务环境中的实用价值。 | code | 0 | |
Advancing Re-Ranking with Multimodal Fusion and Target-Oriented Auxiliary Tasks in E-Commerce Search | Enqiang Xu, Xinhui Li, Zhigong Zhou, Jiahao Ji, Jinyuan Zhao, Dadong Miao, Songlin Wang, Lin Liu, Sulong Xu | In the rapidly evolving field of e-commerce, the effectiveness of search re-ranking models is crucial for enhancing user experience and driving conversion rates. Despite significant advancements in feature representation and model architecture, the integration of multimodal information remains underexplored. This study addresses this gap by investigating the computation and fusion of textual and visual information in the context of re-ranking. We propose Advancing Re-Ranking with Multimodal Fusion and Target-Oriented Auxiliary Tasks (ARMMT), which integrates an attention-based multimodal fusion technique and an auxiliary ranking-aligned task to enhance item representation and improve targeting capabilities. This method not only enriches the understanding of product attributes but also enables more precise and personalized recommendations. Experimental evaluations on JD.com's search platform demonstrate that ARMMT achieves state-of-the-art performance in multimodal information integration, evidenced by a 0.22% increase in the Conversion Rate (CVR), significantly contributing to Gross Merchandise Volume (GMV). This pioneering approach has the potential to revolutionize e-commerce re-ranking, leading to elevated user satisfaction and business growth. | 在电子商务快速发展的领域中,搜索重排序模型的有效性对于提升用户体验和推动转化率至关重要。尽管在特征表示和模型架构方面取得了显著进展,但多模态信息的整合仍未得到充分探索。本研究通过探讨重排序情境下文本和视觉信息的计算与融合,填补了这一空白。我们提出了基于多模态融合与面向目标的辅助任务的进阶重排序模型(ARMMT),该模型整合了基于注意力的多模态融合技术与辅助排序对齐任务,以增强商品表示并提升目标定位能力。这种方法不仅丰富了对产品属性的理解,还实现了更精确和个性化的推荐。在京东搜索平台上的实验评估表明,ARMMT在多模态信息整合方面达到了最先进的性能,体现在转化率(CVR)提高了0.22%,显著促进了商品交易总额(GMV)的增长。这一开创性方法有望革新电子商务重排序,带来用户满意度和业务增长的双重提升。 | code | 0 | |
Missing Interest Modeling with Lifelong User Behavior Data for Retrieval Recommendation | Gaode Chen, Yuezihan Jiang, Rui Huang, Kuo Cai, Yunze Luo, Ruina Sun, Qi Zhang, Han Li, Kun Gai | Kuaishou Technology, Beijing, China | Rich user behavior data has been proven to be of great value for recommendation systems. Modeling lifelong user behavior data in the retrieval stage to explore user long-term preference and obtain comprehensive retrieval results is crucial. Existing lifelong modeling methods cannot be applied to the retrieval stage because they extract target-relevant items through the coupling between the user and the target item. Moreover, the current retrieval methods fail to precisely capture user interests when the length of the user behavior sequence increases further. That leads to a gap in the ability of retrieval models to model lifelong user behavior data. In this paper, we propose the concept of missing interest, leveraging the idea of complementarity, which serves as a supplement to short-term interest based on lifelong behavior data in the retrieval stage. Specifically, we design a missing interest operator and deploy it in a Kafka data stream, without incurring latency or storage costs. This operator derives categories and authors of items that the user was previously interested in but has recently missed, and uses these as triggers to output missing features to the downstream retrieval model. Our retrieval model is a complete dual-tower structure that combines short-term and missing interests on the user side to provide a comprehensive depiction of lifelong behaviors. Since 2023, the presented solution has been deployed in Kuaishou, one of the most popular short-video streaming platforms in China with hundreds of millions of active users. 
| 丰富的用户行为数据已被证明对推荐系统具有巨大价值。在检索阶段对终身用户行为数据进行建模,以探索用户的长期偏好并获得全面的检索结果至关重要。现有的终身建模方法无法应用于检索阶段,因为它们通过用户与目标项目之间的耦合来提取目标相关项目。此外,当前的检索方法在用户行为序列长度进一步增加时无法精确捕捉用户兴趣。这导致了检索模型在终身用户行为数据建模能力上的差距。本文提出了缺失兴趣的概念,利用互补的思想,作为基于终身行为数据在检索阶段对短期兴趣的补充。具体来说,我们设计了一个缺失兴趣操作符,并将其部署在Kafka数据流中,不会产生延迟或存储成本。该操作符推导出用户之前感兴趣但最近错过的项目的类别和作者,并使用这些作为触发器向下游检索模型输出缺失特征。我们的检索模型是一个完整的双塔结构,结合了用户端的短期兴趣和缺失兴趣,全面描绘了终身行为。自2023年以来,所提出的解决方案已部署在中国最受欢迎的短视频流媒体平台之一——快手,该平台拥有数亿活跃用户。 | code | 0 |
Relative Contrastive Learning for Sequential Recommendation with Similarity-based Positive Sample Selection | Zhikai Wang, Yanyan Shen, Zexi Zhang, Li He, Yichun Li, Hao Gu, Yinghua Zhang | Shanghai Jiao Tong University, Shanghai, China; Meituan, Shanghai, China | Contrastive Learning (CL) enhances the training of sequential recommendation (SR) models through informative self-supervision signals. Existing methods often rely on data augmentation strategies to create positive samples and promote representation invariance. Some strategies such as item reordering and item substitution may inadvertently alter user intent. Supervised Contrastive Learning (SCL) based methods find an alternative to augmentation-based CL methods by selecting same-target sequences (interaction sequences with the same target item) to form positive samples. However, SCL-based methods suffer from the scarcity of same-target sequences and consequently lack enough signals for contrastive learning. In this work, we propose to use similar sequences (with different target items) as additional positive samples and introduce a Relative Contrastive Learning (RCL) framework for sequential recommendation. RCL comprises a dual-tiered positive sample selection module and a relative contrastive learning module. The former module selects same-target sequences as strong positive samples and selects similar sequences as weak positive samples. The latter module employs a weighted relative contrastive loss, ensuring that each sequence is represented closer to its strong positive samples than its weak positive samples. We apply RCL on two mainstream deep learning-based SR models, and our empirical results reveal that RCL achieves an average improvement of 4.88% over the state-of-the-art SR methods on five public datasets and one private dataset. The code can be found at https://github.com/Cloudcatcher888/RCL. 
| 对比学习(CL)通过提供信息丰富的自监督信号,增强了序列推荐(SR)模型的训练。现有方法通常依赖于数据增强策略来创建正样本并促进表示的不变性。一些策略如物品重新排序和物品替换可能会无意中改变用户意图。基于监督对比学习(SCL)的方法通过选择相同目标序列(与相同目标物品的交互序列)来形成正样本,从而为基于增强的CL方法提供了替代方案。然而,SCL方法面临相同目标序列稀缺的问题,因此缺乏足够的对比学习信号。在这项工作中,我们提出使用相似序列(具有不同目标物品)作为额外的正样本,并引入了一个相对对比学习(RCL)框架用于序列推荐。RCL包括一个双层正样本选择模块和一个相对对比学习模块。前者模块选择相同目标序列作为强正样本,并选择相似序列作为弱正样本。后者模块采用加权相对对比损失,确保每个序列的表示更接近其强正样本而非弱正样本。我们将RCL应用于两个主流的基于深度学习的SR模型,我们的实验结果显示,RCL在五个公共数据集和一个私有数据集上平均比最先进的SR方法提高了4.88%。代码可在https://github.com/Cloudcatcher888/RCL找到。 | code | 0 |
Momentum Contrastive Bidirectional Encoding with Self-Distillation for Sequential Recommendation | Dingyi Zhang, Haoyu Wenren, Yue Wang, Yingming Li | Alipay (Hangzhou) Information Technology Co., Ltd, Hangzhou, China; Zhejiang University, Hangzhou, China | In this paper, we propose a new Momentum Contrastive Bidirectional Encoding network with Self-Distillation (MoCoBE-SD) to alleviate the data sparsity and noise issues in sequential recommendation by providing rich informative supervisions from both sequence-level and item-level perspectives. In particular, a Momentum Contrastive Bidirectional Encoding (MoCoBE) network is first proposed by constructing momentum updated encoder based on an online bidirectional self-attention encoder, where a momentum contrastive learning task and a masked item prediction task are simultaneously optimized. Building upon MoCoBE, a well-elaborated Self-Distillation (SD) scheme is incorporated to further suppress the noise influence. Specifically, a well-trained sequence encoder by MoCoBE is adopted as the teacher encoder to provide refined supervision for the masked item prediction, which constitutes our MoCoBE-SD framework. Extensive experiments on three public datasets show that MoCoBE-SD outperforms the existing state-of-the-art methods consistently. | 本文提出了一种新的动量对比双向编码网络,结合自蒸馏技术(MoCoBE-SD),以缓解序列推荐中数据稀疏和噪声问题。通过从序列级和项目级两个角度提供丰富的信息监督来实现这一目标。具体而言,首先提出了一种动量对比双向编码(MoCoBE)网络,该网络基于在线双向自注意力编码器构建了动量更新的编码器,同时优化了动量对比学习任务和掩码项目预测任务。在MoCoBE的基础上,引入了一种精心设计的自蒸馏(SD)方案,以进一步抑制噪声的影响。具体来说,通过MoCoBE训练好的序列编码器被用作教师编码器,为掩码项目预测提供精细化的监督,从而构成了我们的MoCoBE-SD框架。在三个公共数据集上的广泛实验表明,MoCoBE-SD在性能上持续优于现有的最先进方法。 | code | 0 |
A Real-Time Adaptive Multi-Stream GPU System For Online Approximate Nearest Neighborhood Search | Yiping Sun, Yang Shi, Jiaolong Du | In recent years, Approximate Nearest Neighbor Search (ANNS) has played a pivotal role in modern search and recommendation systems, especially in emerging LLM applications like Retrieval-Augmented Generation. There is a growing exploration into harnessing the parallel computing capabilities of GPUs to meet the substantial demands of ANNS. However, existing systems primarily focus on offline scenarios, overlooking the distinct requirements of online applications that necessitate real-time insertion of new vectors. This limitation renders such systems inefficient for real-world scenarios. Moreover, previous architectures struggled to effectively support real-time insertion due to their reliance on serial execution streams. In this paper, we introduce a novel Real-Time Adaptive Multi-Stream GPU ANNS System (RTAMS-GANNS). Our architecture achieves its objectives through three key advancements: 1) We initially examined the real-time insertion mechanisms in existing GPU ANNS systems and discovered their reliance on repetitive copying and memory allocation, which significantly hinders real-time effectiveness on GPUs. As a solution, we introduce a dynamic vector insertion algorithm based on memory blocks, which includes in-place rearrangement. 2) To enable real-time vector insertion in parallel, we introduce a multi-stream parallel execution mode, which differs from existing systems that operate serially within a single stream. Our system utilizes a dynamic resource pool, allowing multiple streams to execute concurrently without additional execution blocking. 
3) Through extensive experiments and comparisons, our approach effectively handles varying QPS levels across different datasets, reducing latency by up to 40%. The proposed system has also been deployed in real-world industrial search and recommendation systems, serving hundreds of millions of users daily, and has achieved good results. | 近年来,近似最近邻搜索(ANNS)在现代搜索和推荐系统中发挥了关键作用,特别是在诸如检索增强生成(Retrieval-Augmented Generation)等新兴的大型语言模型(LLM)应用中。越来越多的研究致力于利用GPU的并行计算能力来满足ANNS的巨大需求。然而,现有的系统主要关注离线场景,忽视了在线应用的独特需求,这些需求需要实时插入新向量。这种局限性使得这些系统在现实场景中效率低下。此外,先前的架构由于依赖串行执行流,难以有效支持实时插入。在本文中,我们介绍了一种新型实时自适应多流GPU ANNS系统(RTAMS-GANNS)。我们的架构通过三个关键进展实现了其目标:1)我们首先研究了现有GPU ANNS系统中的实时插入机制,发现它们依赖于重复的复制和内存分配,这严重阻碍了GPU上的实时效率。作为解决方案,我们引入了一种基于内存块的动态向量插入算法,包括就地重排。2)为了实现并行实时向量插入,我们引入了一种多流并行执行模式,这与现有系统在单一流中串行操作不同。我们的系统利用动态资源池,允许多个流并发执行而无需额外的执行阻塞。3)通过广泛的实验和比较,我们的方法有效地处理了不同数据集上的不同QPS水平,延迟降低了高达40%。所提出的系统也已部署在实际的工业搜索和推荐系统中,每天服务于数亿用户,并取得了良好的效果。 | code | 0 | |
MERLIN: Multimodal & Multilingual Embedding for Recommendations at Large-scale via Item Associations | Sambeet Tiady, Arihant Jain, Dween Rabius Sanny, Khushi Gupta, Srinivas Virinchi, Swapnil Gupta, Anoop Saladi, Deepak Gupta | Amazon.com, Bangalore, India | Product recommendations incentivize customers to make multi-unit purchases by surfacing relevant products, leading to lower cost per unit for e-commerce stores and lower prices for their customers. However, the humongous scale of products, implicit co-purchase asymmetry and variation in co-purchase behavior across different categories, are orthogonal problems to solve. To address these problems, we propose MERLIN (Multimodal & Multilingual Embedding for Recommendations at Large-scale via Item associations), a Graph Neural Network that generates product recommendations from a heterogeneous and directed product graph. We mine category associations to remove noisy product co-purchase associations, leading to higher quality recommendations. Leveraging product co-view relationships, we finetune SentenceBERT model for textual representation, and train a self-supervised knowledge distillation model to learn visual representation, which allows us to learn product representations which are multi-lingual and multi-modal in nature. We selectively align node embeddings leveraging co-viewed products. MERLIN model can handle node asymmetry by learning dual embeddings for each product, and can generate recommendations for cold-start products by employing catalog metadata such as title, category and image. Extensive offline experiments on internal and external datasets show that MERLIN model outperforms state-of-the-art baselines for node recommendation and link prediction task. We conduct ablations to quantify the impact of our model components and choices. Further, MERLIN model delivers significant improvement in sales measured through an A/B experiment. 
| 产品推荐通过展示相关产品激励顾客进行多单位购买,从而降低电商商店的单位成本和顾客的购买价格。然而,产品规模的巨大、隐含的共同购买不对称性以及不同类别间共同购买行为的变化,是相互独立的问题。为了解决这些问题,我们提出了MERLIN(通过项目关联进行大规模推荐的多模态与多语言嵌入),这是一个从异构且有向的产品图中生成产品推荐的图神经网络。我们挖掘类别关联来消除噪声产品共同购买关联,从而提高推荐质量。利用产品共同浏览关系,我们微调了SentenceBERT模型以获取文本表示,并训练了一个自监督的知识蒸馏模型来学习视觉表示,这使我们能够学习到本质上是多语言和多模态的产品表示。我们通过共同浏览的产品有选择地对齐节点嵌入。MERLIN模型通过为每个产品学习双重嵌入来处理节点不对称性,并通过使用标题、类别和图像等目录元数据为冷启动产品生成推荐。在内、外部数据集上的广泛离线实验表明,MERLIN模型在节点推荐和链接预测任务上优于最先进的基线模型。我们进行了消融实验,以量化我们模型组件和选择的影响。此外,通过A/B实验测量的销售数据显示,MERLIN模型带来了显著的改进。 | code | 0 |
Towards Advancing Text-Based User and Item Representation in Personalized Recommendation | Hanjia Lyu | University of Rochester, Rochester, NY, USA | In the realm of personalized recommendation systems, accurately capturing user preferences and item characteristics is important for delivering relevant and satisfying recommendations. This study introduces innovative approaches using Large Language Models (LLMs) to generate detailed textual descriptions that enhance both user and item representations. We propose a dual strategy: for user representation, we employ supervised fine-tuning coupled with Retrieval-Augmented Generation (RAG) to keep the model current with dynamic user preferences; for item representation, we leverage the extensive knowledge base of LLMs to enrich item descriptions and infer traits from user interactions. These methods promise a deeper, more nuanced understanding of both users and items, potentially leading to superior recommendation accuracy. We adopt a rigorous evaluation methodology, ensuring the reliability of our results and the effectiveness of our proposed system. This paper discusses these methodologies, presents our preliminary findings, and highlights the potential of text-augmented profiles in advancing recommendation systems. | 在个性化推荐系统领域,准确捕捉用户偏好和物品特征对于提供相关且令人满意的推荐至关重要。本研究引入了利用大型语言模型(LLMs)生成详细文本描述的创新方法,以增强用户和物品的表示。我们提出了一种双重策略:对于用户表示,我们采用监督微调结合检索增强生成(RAG),以使模型与动态用户偏好保持同步;对于物品表示,我们利用LLMs的广泛知识库来丰富物品描述,并从用户交互中推断特征。这些方法有望对用户和物品实现更深入、更细致的理解,从而可能提高推荐准确性。我们采用严格的评估方法,确保结果的可靠性和所提出系统的有效性。本文讨论了这些方法,展示了初步研究成果,并强调了文本增强型用户和物品描述在推进推荐系统方面的潜力。 | code | 0 |
Contrastive Learning on Medical Intents for Sequential Prescription Recommendation | Arya Hadizadeh Moghaddam, Mohsen Nayebi Kerdabadi, Mei Liu, Zijun Yao | Recent advancements in sequential modeling applied to Electronic Health Records (EHR) have greatly influenced prescription recommender systems. While the recent literature on drug recommendation has shown promising performance, the study of discovering a diversity of coexisting temporal relationships at the level of medical codes over consecutive visits remains less explored. The goal of this study can be motivated from two perspectives. First, there is a need to develop a sophisticated sequential model capable of disentangling the complex relationships across sequential visits. Second, it is crucial to establish multiple and diverse health profiles for the same patient to ensure a comprehensive consideration of different medical intents in drug recommendation. To achieve this goal, we introduce Attentive Recommendation with Contrasted Intents (ARCI), a multi-level transformer-based method designed to capture the different but coexisting temporal paths across a shared sequence of visits. Specifically, we propose a novel intent-aware method with contrastive learning, that links specialized medical intents of the patients to the transformer heads for extracting distinct temporal paths associated with different health profiles. We conducted experiments on two real-world datasets for the prescription recommendation task using both ranking and classification metrics. Our results demonstrate that ARCI has outperformed the state-of-the-art prescription recommendation methods and is capable of providing interpretable insights for healthcare practitioners. 
| 近年来,应用于电子健康记录(EHR)的序列建模的进展极大地影响了处方推荐系统。尽管最近关于药物推荐的文献展示了令人鼓舞的性能,但在连续就诊中,在医疗代码层面发现多种共存的时间关系的研究仍较少探索。本研究的目标可以从两个角度来推动。首先,需要开发一种复杂的序列模型,能够解开跨连续就诊的复杂关系。其次,为同一患者建立多个多样化的健康档案至关重要,以确保在药物推荐中全面考虑不同的医疗意图。为了实现这一目标,我们引入了带有对比意图的注意力推荐(ARCI),这是一种基于多层变换器的方法,旨在捕捉共享序列就诊中不同的但共存的时间路径。具体来说,我们提出了一种新颖的意图感知方法,结合对比学习,将患者的专门医疗意图链接到变换器头部,以提取与不同健康档案相关的独特时间路径。我们在两个真实世界的数据集上进行了处方推荐任务的实验,使用了排名和分类指标。结果表明,ARCI在性能上优于最先进的处方推荐方法,并能够为医疗从业者提供可解释的见解。 | code | 0 | |
Behavior-Dependent Linear Recurrent Units for Efficient Sequential Recommendation | Chengkai Liu, Jianghao Lin, Hanzhou Liu, Jianling Wang, James Caverlee | Sequential recommender systems aim to predict the users' next interaction through user behavior modeling with various operators like RNNs and attention. However, existing models generally fail to achieve the three golden principles for sequential recommendation simultaneously, i.e., training efficiency, low-cost inference, and strong performance. To this end, we propose RecBLR, an Efficient Sequential Recommendation Model based on Behavior-Dependent Linear Recurrent Units to accomplish the impossible triangle of the three principles. By incorporating gating mechanisms and behavior-dependent designs into linear recurrent units, our model significantly enhances user behavior modeling and recommendation performance. Furthermore, we unlock the parallelizable training as well as inference efficiency for our model by designing a hardware-aware scanning acceleration algorithm with a customized CUDA kernel. Extensive experiments on real-world datasets with varying lengths of user behavior sequences demonstrate RecBLR's remarkable effectiveness in simultaneously achieving all three golden principles - strong recommendation performance, training efficiency, and low-cost inference, while exhibiting excellent scalability to datasets with long user interaction histories. | 顺序推荐系统旨在通过使用RNN和注意力等操作符对用户行为进行建模来预测用户的下一次交互。然而,现有的模型通常无法同时实现顺序推荐的三个黄金原则,即训练效率、低成本推理和强大的性能。为此,我们提出了RecBLR,一种基于行为依赖线性循环单元的高效顺序推荐模型,以实现这三个原则的不可能三角。通过将门控机制和行为依赖设计融入线性循环单元,我们的模型显著增强了用户行为建模和推荐性能。此外,我们通过设计一个硬件感知的扫描加速算法和定制的CUDA内核,为我们的模型解锁了可并行化的训练以及推理效率。在具有不同长度用户行为序列的实际数据集上的广泛实验表明,RecBLR在同时实现所有三个黄金原则方面表现出色——强大的推荐性能、训练效率和低成本推理,同时在处理具有长用户交互历史的数据集时展现出卓越的可扩展性。 | code | 0 | |
MultiLoRA: Multi-Directional Low Rank Adaptation for Multi-Domain Recommendation | Zijian Song, Wenhan Zhang, Lifang Deng, Jiandong Zhang, Kaigui Bian, Bin Cui | Lazada Group, Beijing, China; School of CS, Peking University & AI Innovation Center, Peking University, Beijing, China | To address the business needs of industrial recommendation systems, an increasing number of Multi-Domain Recommendation (MDR) methods are designed to improve recommendation performance on multiple domains simultaneously. Most MDR methods follow a multi-task learning paradigm, suffering from poor deployability and negative transfer. Due to the great success of large pre-trained models, the pre-train & fine-tune paradigm is attracting increasing attention. The latest methods introduce parameter-efficient fine-tuning techniques like prompt-tuning, showcasing high efficiency and effectiveness. However, these methods neglect the fundamental differences between recommendation and NLP tasks. The inadequate capacity of recommendation models restricts the effectiveness of prompts and adapters. Worse still, traditional natural domain division may group non-identically distributed samples into the same domain, violating the assumption of independent and identically distributed (i.i.d.) data. In this paper, we propose MultiLoRA, a Multi-directional Low Rank Adaptation paradigm for multi-domain recommendation. First we pre-train a universal model using all data samples. Then we conduct multiple domain divisions on the sample space. Under each division, we fine-tune the pre-trained model to obtain a set of domain-specific LoRAs. Finally, we learn a LoRA fusion module to integrate domain-specific preference patterns across multiple divisions. Experimental results on real-world datasets demonstrate notable advantages of MultiLoRA: (1) achieving SOTA performance, (2) showcasing remarkable compatibility, and (3) proving highly efficient, featuring only 2% trainable parameters compared to the backbone. 
| 为了满足工业推荐系统的业务需求,越来越多的多领域推荐(MDR)方法被设计出来,以同时提升多个领域的推荐性能。大多数MDR方法遵循多任务学习的范式,存在部署性差和负迁移的问题。由于大型预训练模型取得了巨大成功,预训练与微调的范式正受到越来越多的关注。最新的方法引入了如提示调优等参数高效的微调技术,展示了高效性和有效性。然而,这些方法忽略了推荐任务与自然语言处理任务之间的根本差异。推荐模型的不足能力限制了提示词和适配器的有效性。更糟糕的是,传统的自然领域划分可能将非同分布的样本归入同一领域,违反了独立同分布(i.i.d.)数据的假设。在本文中,我们提出了MultiLoRA,一种面向多领域推荐的多向低秩适应范式。首先,我们使用所有数据样本预训练一个通用模型。然后,我们在样本空间上进行多次领域划分。在每次划分下,我们对预训练模型进行微调,以获得一组领域特定的LoRAs。最后,我们学习一个LoRA融合模块,以整合多个划分中的领域特定偏好模式。在真实世界数据集上的实验结果显示了MultiLoRA的显著优势:(1)实现了SOTA性能,(2)展示了出色的兼容性,(3)证明了高效率,仅具有与骨干模型相比2%的可训练参数。 | code | 0 |
Bridging User Dynamics: Transforming Sequential Recommendations with Schrödinger Bridge and Diffusion Models | Wenjia Xie, Rui Zhou, Hao Wang, Tingjia Shen, Enhong Chen | Sequential recommendation has attracted increasing attention due to its ability to accurately capture the dynamic changes in user interests. We have noticed that generative models, especially diffusion models, which have achieved significant results in fields like image and audio, hold considerable promise in the field of sequential recommendation. However, existing sequential recommendation methods based on diffusion models are constrained by a prior distribution limited to Gaussian distribution, hindering the possibility of introducing user-specific information for each recommendation and leading to information loss. To address these issues, we introduce the Schrödinger Bridge into diffusion-based sequential recommendation models, creating the SdifRec model. This allows us to replace the Gaussian prior of the diffusion model with the user's current state, directly modeling the process from a user's current state to the target recommendation. Additionally, to better utilize collaborative information in recommendations, we propose an extended version of SdifRec called con-SdifRec, which utilizes user clustering information as a guiding condition to further enhance the posterior distribution. Finally, extensive experiments on multiple public benchmark datasets have demonstrated the effectiveness of SdifRec and con-SdifRec through comparison with several state-of-the-art methods. Further in-depth analysis has validated their efficiency and robustness. 
| 顺序推荐因其能够准确捕捉用户兴趣的动态变化而受到越来越多的关注。我们注意到,生成模型,特别是扩散模型,在图像和音频等领域取得了显著成果,在顺序推荐领域也展现出巨大的潜力。然而,现有的基于扩散模型的顺序推荐方法受限于仅限于高斯分布的先验分布,这阻碍了在每次推荐中引入特定用户信息的可能性,导致信息损失。为了解决这些问题,我们将薛定谔桥引入基于扩散的顺序推荐模型,创建了SdifRec模型。这使得我们能够将扩散模型的高斯先验替换为用户当前状态,直接模拟从用户当前状态到目标推荐的过程。此外,为了更好地利用推荐中的协同信息,我们提出了SdifRec的扩展版本,称为con-SdifRec,它利用用户聚类信息作为指导条件,进一步增强后验分布。最后,在多个公共基准数据集上的广泛实验通过与几种最先进方法的比较,证明了SdifRec和con-SdifRec的有效性。进一步的深入分析验证了它们的效率和鲁棒性。 | code | 0 | |
Generating Intent-aware Clarifying Questions in Conversational Information Retrieval Systems | Ziliang Zhao, Zhicheng Dou, Yujia Zhou | Gaoling School of Artificial Intelligence, Renmin University of China, Beijing, China | Generating clarifying questions can effectively clarify users' complicated search intent in conversational search systems. However, existing methods based on pre-defined templates are inadequate in understanding explicit user intents, making generated questions monotonous or inaccurate in some cases. In this paper, we define the "intent" of a query as a verb representing the potential behavior, action, or task the user may take. We study generating clarifying questions from a new perspective by incorporating the intents explicitly to form "intent-aware" questions with high informativeness and accuracy. Since obtaining gold intent-aware questions is expensive, we propose a rule-based method and a continual learning model to generate intent-aware questions as weak supervision signals. The former leverages search results to mine contextual intent-aware words or phrases, and the latter relies on parallel corpora to paraphrase template-based questions by incorporating the intents. The generated weak supervision data are then applied to fine-tune a BART-based model for end-to-end intent-aware question generation. We also explore prompting a large language model to generate intent-aware questions. Experimental results on a public clarification dataset demonstrate that our proposed methods improve users' search experience compared to existing methods. | 生成澄清问题可以有效澄清对话搜索系统中用户复杂的搜索意图。然而,现有基于预定义模板的方法在理解明确用户意图方面存在不足,导致生成的问题在某些情况下单调或不准确。本文中,我们将查询的“意图”定义为用户可能采取的潜在行为、动作或任务的动词表示。我们通过明确结合意图,从新的角度研究生成信息丰富且准确的“意图感知”澄清问题。由于获取黄金标准的意图感知问题成本高昂,我们提出了一种基于规则的方法和一种持续学习模型,以生成意图感知的弱监督信号。前者利用搜索结果挖掘上下文中的意图感知词或短语,后者则依赖平行语料库通过结合意图对基于模板的问题进行释义。生成的弱监督数据随后用于微调基于BART的模型,以实现端到端的意图感知问题生成。我们还探索了引导大型语言模型生成意图感知问题的方法。在公开的澄清数据集上的实验结果表明,与现有方法相比,我们提出的方法提升了用户的搜索体验。 | code | 0 |
Enhancing E-Commerce Query Rewriting: A Large Language Model Approach with Domain-Specific Pre-Training and Reinforcement Learning | Aijun Dai, Zhenyu Zhu, Haiqing Hu, Guoyu Tang, Lin Liu, Sulong Xu | JD.com, Beijing, China; JD.com, Beijing, China | In the domain of e-commerce, query rewriting is a potent strategy for bridging the lexical gap between search queries and product descriptions, thereby enhancing the recall rate of search engines. This research introduces a query rewriting framework predicated on large language models (LLM), encompassing three phases of training: domain-specific pre-training, supervised fine-tuning (SFT) and reinforcement learning (RL) for objective alignment. In detail, the process initiates with domain-specific pre-training using consumer behavior data and product descriptions from JD.com. Subsequently, we filter and utilize high-quality query-rewrite pairs for SFT. The final stage employs RL to refine the model's objective alignment, utilizing an offline search system as the simulation environment. The RL's training reward is derived from the recall rate, aiming to optimize the number of relevant products the rewrites retrieve. Through offline evaluations, our method has demonstrated its capacity to substantially enhance the efficacy of LLMs for e-commerce query rewriting. Moreover, online A/B testing has corroborated that our approach significantly boosts the number of purchases made per user (UCVR). Since December 2023, our approach has been successfully implemented on JD.com, one of China's most frequented online shopping platforms. | 在电子商务领域,查询重写是弥合搜索查询与产品描述之间词汇鸿沟的有效策略,从而提高搜索引擎的召回率。本研究引入了一种基于大型语言模型(LLM)的查询重写框架,该框架包括三个训练阶段:领域特定的预训练、有监督的微调(SFT)和用于目标对齐的强化学习(RL)。具体而言,该过程首先使用京东的消费行为数据和产品描述进行领域特定的预训练。随后,我们筛选并利用高质量的查询-重写对进行SFT。最后阶段采用RL来优化模型的目标对齐,利用离线搜索系统作为模拟环境。RL的训练奖励基于召回率,旨在优化重写查询所检索到的相关产品数量。通过离线评估,我们的方法展示了其显著提升LLM在电子商务查询重写中效能的能力。此外,在线A/B测试证实了我们的方法显著提高了每位用户的购买转化率(UCVR)。自2023年12月以来,我们的方法已成功应用于京东,这是中国访问量最大的在线购物平台之一。 | code | 0 |
Deep Uncertainty-Based Explore for Index Construction and Retrieval in Recommendation System | Xin Jiang, Kaiqiang Wang, Yinlong Wang, Fengchang Lv, Taiyang Peng, Shuai Yang, Xianteng Wu, Pengye Zhang, Shuo Yuan, Yifan Zeng | | In recommendation systems, the relevance and novelty of the final results are selected through a cascade system of Matching -> Ranking -> Strategy. The matching model serves as the starting point of the pipeline and determines the upper bound of the subsequent stages. Balancing the relevance and novelty of matching results is a crucial step in the design and optimization of recommendation systems, contributing significantly to improving recommendation quality. However, the typical matching algorithms have not simultaneously addressed the relevance and novelty perfectly. One main reason is that deep matching algorithms exhibit significant uncertainty when estimating long-tail items (e.g., due to insufficient training samples). The uncertainty not only affects the training of the models but also influences the confidence in the index construction and beam search retrieval process of these models. This paper proposes the UICR (Uncertainty-based explore for Index Construction and Retrieval) algorithm, which introduces the concept of uncertainty modeling in the matching stage and achieves multi-task modeling of model uncertainty and index uncertainty. The final matching results are obtained by combining the relevance score and uncertainty score inferred by the model. Experimental results demonstrate that the UICR improves novelty without sacrificing relevance on real-world industrial production environments and multiple open-source datasets. Remarkably, online A/B test results of display advertising in Shopee demonstrate the effectiveness of the proposed algorithm. | 在推荐系统中,最终结果的相关性和新颖性是通过匹配(Matching)->排序(Ranking)->策略(Strategy)的级联系统来选择的。匹配模型作为管道的起点,决定了后续阶段的上限。平衡匹配结果的相关性和新颖性是推荐系统设计和优化的关键步骤,对提高推荐质量有显著贡献。然而,典型的匹配算法并没有同时完美地解决相关性和新颖性问题。主要原因之一是深度匹配算法在估计长尾(例如,由于训练样本不足)项目时表现出显著的不确定性。这种不确定性不仅影响模型的训练,还影响这些模型在索引构建和束搜索检索过程中的置信度。本文提出了基于不确定性的索引构建和检索(UICR)算法,该算法在匹配阶段引入了不确定性建模的概念,实现了模型不确定性和索引不确定性的多任务建模。最终的匹配结果是通过结合模型推断的相关性分数和不确定性分数获得的。实验结果表明,UICR在不牺牲相关性的情况下提高了新颖性,在现实世界的工业生产环境和多个开源数据集上都得到了验证。值得注意的是,Shopee展示广告的在线A/B测试结果证明了所提出算法的有效性。 | code | 0 |
Towards Seamless User Query to REST API Conversion | Han Xu | University of Illinois Urbana-Champaign, Urbana, IL, USA | Integrating Large Language Models (LLMs) with external tools and APIs is essential for fields such as information retrieval and knowledge management. While LLMs have made significant strides, their effective integration with external APIs-essential for real-world applications-remains challenging. This paper introduces RESTful-Llama, a novel method designed to empower open-source LLMs to accurately convert natural language instructions into well-formed RESTful API calls. Moreover, RESTful-Llama utilizes DOC-Prompt, a newly proposed technique for generating fine-tuning datasets from publicly available API documentation. Initial experiments demonstrate that RESTful-Llama significantly enhances the accuracy of generated REST API requests. | 将大型语言模型(LLMs)与外部工具和API集成对于信息检索和知识管理等领域至关重要。尽管LLMs取得了显著进展,但它们与外部API的有效集成——这对于实际应用至关重要——仍然具有挑战性。本文介绍了RESTful-Llama,这是一种新颖的方法,旨在使开源LLMs能够准确地将自然语言指令转换为格式良好的RESTful API调用。此外,RESTful-Llama利用了DOC-Prompt,这是一种新提出的技术,用于从公开可用的API文档生成微调数据集。初步实验表明,RESTful-Llama显著提高了生成的REST API请求的准确性。 | code | 0 |
Product Retrieval and Ranking for Alphanumeric Queries | Hadeel Saadany, Swapnil Bhosale, Samarth Agrawal, Zhe Wu, Constantin Orasan, Diptesh Kanojia | Centre for Translation Studies, University of Surrey, Guildford, United Kingdom; eBay Inc., Seattle, USA; People-Centred AI, University of Surrey, Guildford, United Kingdom; eBay Inc., San Jose, USA | This talk addresses the challenge of improving user experience on e-commerce platforms by enhancing product ranking relevant to user's search queries. Queries such as S2716DG consist of alphanumeric characters where a letter or number can signify important detail for the product/model. Speaker describes recent research where we curate samples from existing datasets at eBay, manually annotated with buyer-centric relevance scores, and centrality scores which reflect how well the product title matches the user's intent. We introduce a User-intent Centrality Optimization (UCO) approach for existing models, which optimizes for the user intent in semantic product search. To that end, we propose a dual-loss based optimization to handle hard negatives, i.e., product titles that are semantically relevant but do not reflect the user's intent. Our contributions include curating a challenging evaluation set and implementing UCO, resulting in significant improvements in product ranking efficiency, observed for different evaluation metrics. Our work aims to ensure that the most buyer-centric titles for a query are ranked higher, thereby, enhancing the user experience on e-commerce platforms. | 本次演讲探讨了通过提升与用户搜索查询相关的产品排序来改善电子商务平台用户体验的挑战。诸如S2716DG之类的查询包含字母数字字符,其中字母或数字可能代表产品/型号的重要细节。演讲者描述了最近的研究,我们在eBay的现有数据集中精选样本,这些样本经过手动标注,具有以买家为中心的相关性评分和中心性评分,后者反映了产品标题与用户意图的匹配程度。我们引入了一种用户意图中心性优化(User-intent Centrality Optimization, UCO)方法,用于现有模型,该方法优化了语义产品搜索中的用户意图。为此,我们提出了一种基于双重损失的优化方法来处理硬负样本,即那些在语义上相关但未反映用户意图的产品标题。我们的贡献包括策划一个具有挑战性的评估集和实现UCO,从而在不同的评估指标下显著提高了产品排序效率。我们的工作旨在确保对于一个查询,最符合买家意图的标题排名更高,从而增强电子商务平台的用户体验。 | code | 0 |
PTSR: Prefix-Target Graph-based Sequential Recommendation | Jiayu Chen, Xiaoyu Du, Yonghua Pan, Jinhui Tang | Nanjing University of Science and Technology, Nanjing, China | Sequential recommendation approaches predict the next items (targets) by analyzing prefix subsequences. These methods primarily model the correlations between prefixes and targets but often neglect the inherent correlations among prefixes and items. In this paper, we propose a Prefix-Target Graph-based Sequential Recommendation Approach (PTSR), which constructs a prefix-target graph (PTG) to collect observed correlations among prefixes and targets. It utilizes a graph neural network to model these inherent correlations, thus improving the item representations used in the predictive model. Specifically, prefixes linked to the same target reflect similar intents, while targets linked to the same prefix indicate available choices. This allows the graph neural network to effectively capture high-level correlations among prefixes and items, enhancing recommendation accuracy. We conduct extensive experiments on four real-world datasets to demonstrate the superiority of PTSR compared to state-of-the-art (SOTA) sequential recommendation methods. The source code of the PTSR is available at https://github.com/TosakRin/PTSR. | 顺序推荐方法通过分析前缀子序列来预测下一个项目(目标)。这些方法主要建模前缀与目标之间的关联,但往往忽略了前缀和项目之间固有的关联。本文提出了一种基于前缀-目标图的顺序推荐方法(PTSR),该方法构建了一个前缀-目标图(PTG)以收集前缀与目标之间观察到的关联。它利用图神经网络来建模这些固有关联,从而改进预测模型中使用的项目表示。具体而言,与同一目标相连的前缀反映了相似的意图,而与同一前缀相连的目标则表示可用的选择。这使得图神经网络能够有效地捕捉前缀和项目之间的高层次关联,从而提高推荐准确性。我们在四个真实世界数据集上进行了广泛的实验,以证明PTSR相较于最先进的(SOTA)顺序推荐方法的优越性。PTSR的源代码可在https://github.com/TosakRin/PTSR获取。 | code | 0 |
PACIFIC: Enhancing Sequential Recommendation via Preference-aware Causal Intervention and Counterfactual Data Augmentation | Jinpeng Chen, Huachen Guan, Huan Li, Fan Zhang, Liwei Huang, Guangyao Pang, Xiongnan Jin | Beijing Institute of Remote Sensing, Beijing, China; The State Key Laboratory of Blockchain and Data Security, Zhejiang University, Hangzhou, China | Sequential recommendation has been receiving increasing attention from researchers. Existing sequential recommendation models leverage deep learning models to capture sequential features. However, these methods ignore confounders in the recommendation process, which can lead the model to learn incorrect correlations and fail to accurately capture users' true preferences. Moreover, these methods rely on extensive interaction sequences, but sequential data often suffers from sparsity issues. To address these limitations, this paper proposes a Preference-aware Causal Intervention and Counterfactual Data Augmentation (Pacific) framework to enhance sequential recommendation. Initially, we model the causal graph of sequential recommendation and categorize user preferences into global long-term preferences, local long-term preferences, and short-term preferences. Then, we introduce the front-door criterion to eliminate the interference of confounders and design different self-attention mechanisms to estimate the causal effects, aiming to capture users' true preferences. In addition, based on counterfactual thinking, we design a counterfactual data augmentation module to generate enriched sequences. Experimental results on four real-world datasets demonstrate the superiority of our proposed approach over state-of-the-art sequential recommendation methods. | 序列推荐近年来引起了研究人员的广泛关注。现有的序列推荐模型利用深度学习模型来捕捉序列特征。然而,这些方法忽略了推荐过程中的混杂因素,这可能导致模型学习到错误的关联,无法准确捕捉用户的真实偏好。此外,这些方法依赖于大量的交互序列,但序列数据往往存在稀疏性问题。为了解决这些局限性,本文提出了一个Preference-aware Causal Intervention and Counterfactual Data Augmentation(Pacific)框架,以增强序列推荐。首先,我们建模了序列推荐的因果图,并将用户偏好分为全局长期偏好、局部长期偏好和短期偏好。然后,我们引入了前门准则来消除混杂因素的干扰,并设计了不同的自注意力机制来估计因果效应,旨在捕捉用户的真实偏好。此外,基于反事实思维,我们设计了一个反事实数据增强模块,以生成丰富的序列。在四个真实世界数据集上的实验结果表明,我们提出的方法优于最先进的序列推荐方法。 | code | 0 |
Context Matters: Enhancing Sequential Recommendation with Context-aware Diffusion-based Contrastive Learning | Ziqiang Cui, Haolun Wu, Bowei He, Ji Cheng, Chen Ma | City University of Hong Kong, Hong Kong SAR, Hong Kong; McGill University, Montréal, Canada | Contrastive learning has been effectively utilized to enhance the training of sequential recommendation models by leveraging informative self-supervised signals. Most existing approaches generate augmented views of the same user sequence through random augmentation and subsequently maximize their agreement in the representation space. However, these methods often neglect the rationality of the augmented samples. Due to significant uncertainty, random augmentation can disrupt the semantic information and interest evolution patterns inherent in the original user sequences. Moreover, pulling semantically inconsistent sequences closer in the representation space can render the user sequence embeddings insensitive to variations in user preferences, which contradicts the primary objective of sequential recommendation. To address these limitations, we propose the Context-aware Diffusion-based Contrastive Learning for Sequential Recommendation, named CaDiRec. The core idea is to leverage context information to generate more reasonable augmented views. Specifically, CaDiRec employs a context-aware diffusion model to generate alternative items for the given positions within a sequence. These generated items are aligned with their respective context information and can effectively replace the corresponding original items, thereby generating a positive view of the original sequence. By considering two different augmentations of the same user sequence, we can construct a pair of positive samples for contrastive learning. To ensure representation cohesion, we train the entire framework in an end-to-end manner, with shared item embeddings between the diffusion model and the recommendation model. Extensive experiments on five benchmark datasets demonstrate the advantages of our proposed method over existing baselines. | 对比学习已被有效利用,通过利用信息丰富的自监督信号来增强序列推荐模型的训练。大多数现有方法通过随机增强生成同一用户序列的增强视图,并在表示空间中最大化它们的共识。然而,这些方法往往忽略了增强样本的合理性。由于存在显著的不确定性,随机增强可能会破坏原始用户序列中固有的语义信息和兴趣演变模式。此外,在表示空间中将语义不一致的序列拉近会导致用户序列嵌入对用户偏好变化的敏感性降低,这与序列推荐的主要目标相悖。为了解决这些局限性,我们提出了基于上下文感知的扩散对比学习用于序列推荐,命名为CaDiRec。其核心思想是利用上下文信息生成更合理的增强视图。具体来说,CaDiRec采用上下文感知的扩散模型为序列中给定位置生成替代项。这些生成的项与其上下文信息对齐,并能有效替换相应的原始项,从而生成原始序列的正视图。通过考虑同一用户序列的两种不同增强,我们可以构建一对用于对比学习的正样本。为确保表示的一致性,我们以端到端的方式训练整个框架,并在扩散模型和推荐模型之间共享项嵌入。在五个基准数据集上的广泛实验证明了我们提出的方法相对于现有基线的优势。 | code | 0 |
A General Strategy Graph Collaborative Filtering for Recommendation Unlearning | Yongjing Hao, Fuzhen Zhuang, Deqing Wang, Guanfeng Liu, Victor S. Sheng, Pengpeng Zhao | Macquarie University, Sydney, Australia; Soochow University, Suzhou, Jiangsu, China; Texas Tech University, Lubbock, USA; Beihang University, Beijing, China | Recommender systems play a crucial role in delivering personalized services to users, but the increasing volume of user data raises significant concerns about privacy, security, and utility. However, existing machine unlearning methods cannot be directly applied to recommendation systems as they overlook the collaborative information shared across users and items. More recently, a method known as RecEraser was introduced, offering partitioning and aggregation-based approaches. Nevertheless, these approaches have limitations due to their inadequate handling of additional overhead costs. In this paper, we propose A General Strategy Graph Collaborative Filtering for Recommendation Unlearning (GSGCF-RU), which is a novel model-agnostic learnable delete operator that optimizes unlearning edge consistency and feature representation consistency. Specifically, the GSGCF-RU model utilizes unlearning edge consistency to eliminate the influence of deleted elements, followed by feature representation consistency to retain knowledge after deletion. Lastly, experimental results on three real-world public benchmarks demonstrate that GSGCF-RU not only achieves efficient recommendation unlearning but also surpasses state-of-the-art methods in terms of model utility. The source code can be found at https://github.com/YongjingHao/GSGCF-RU. | 推荐系统在为用户提供个性化服务方面起着至关重要的作用,但随着用户数据量的增加,隐私、安全和效用问题日益突出。然而,现有的机器遗忘方法无法直接应用于推荐系统,因为它们忽略了用户和物品之间共享的协作信息。最近,一种名为RecEraser的方法被提出,采用了基于分区和聚合的策略。然而,这些方法由于未能充分处理额外的开销成本而存在局限性。本文提出了一种通用策略图协同过滤推荐遗忘(GSGCF-RU)方法,这是一种新颖的模型无关可学习删除操作符,优化了遗忘边缘一致性和特征表示一致性。具体而言,GSGCF-RU模型利用遗忘边缘一致性来消除已删除元素的影响,随后通过特征表示一致性来保留删除后的知识。最后,在三个真实世界的公共基准上的实验结果表明,GSGCF-RU不仅实现了高效的推荐遗忘,而且在模型效用方面超越了最先进的方法。源代码可在https://github.com/YongjingHao/GSGCF-RU找到。 | code | 0 |
Interpretable Triplet Importance for Personalized Ranking | Bowei He, Chen Ma | | Personalized item ranking has been a crucial component contributing to the performance of recommender systems. As a representative approach, pairwise ranking directly optimizes the ranking with user implicit feedback by constructing (user, positive item, negative item) triplets. Several recent works have noticed that treating all triplets equally may hardly achieve the best effects. They assign different importance scores to negative items, user-item pairs, or triplets, respectively. However, almost all the generated importance scores are groundless and hard to interpret, thus far from trustworthy and transparent. To tackle these issues, we propose the Triplet Shapley, a Shapley value-based method to measure the triplet importance in an interpretable manner. Due to the huge number of triplets, we transform the original Shapley value calculation to the Monte Carlo (MC) approximation, where the guarantee for the approximation unbiasedness is also provided. To stabilize the MC approximation, we adopt a control covariates-based method. Finally, we utilize the triplet Shapley value to guide the resampling of important triplets for benefiting the model learning. Extensive experiments are conducted on six public datasets involving classical matrix factorization- and graph neural network-based recommendation models. Empirical results and subsequent analysis show that our model consistently outperforms the state-of-the-art methods. | 个性化项目排序已成为提升推荐系统性能的关键组成部分。作为代表性方法,成对排序通过构建(用户,正项,负项)三元组,直接优化用户隐式反馈的排序。近期研究注意到,同等对待所有三元组可能难以达到最佳效果。因此,它们分别对负项、用户-项目对或三元组分配不同的重要性分数。然而,几乎所有生成的重要性分数都缺乏依据且难以解释,因此远非可靠和透明。为解决这些问题,我们提出了三元组Shapley——一种基于Shapley值的方法,以可解释的方式衡量三元组的重要性。由于三元组数量庞大,我们将原始Shapley值计算转换为蒙特卡洛(MC)近似,并提供了近似无偏性的保证。为稳定MC近似,我们采用了基于控制协变量的方法。最后,我们利用三元组Shapley值来指导重要三元组的重新采样,以促进模型学习。在涉及经典矩阵分解和图神经网络推荐模型的六个公开数据集上进行了广泛的实验。实证结果和后续分析表明,我们的模型始终优于最先进的方法。 | code | 0 |
CausalMed: Causality-Based Personalized Medication Recommendation Centered on Patient Health State | Xiang Li, Shunpan Liang, Yu Lei, Chen Li, Yulei Hou, Dashun Zheng, Tengfei Ma | Hunan University School of Computer Science and Engineering; Yanshan University School of Information Science and Engineering; Yanshan University School of Mechanical Engineering; Xinjiang University of Science & Technology School of Information Science and Engineering | Medication recommendation systems are developed to recommend suitable medications tailored to specific patients. Previous research has primarily focused on learning medication representations, which has yielded notable advances. However, these methods are limited in capturing personalized patient representations due to the following primary limitations: (i) they are unable to capture the differences in the impact of diseases/procedures on patients across various patient health states; (ii) they fail to model the direct causal relationships between medications and specific health state of patients, resulting in an inability to determine which specific disease each medication is treating. To address these limitations, we propose CausalMed, a patient health state-centric model capable of enhancing the personalization of patient representations. Specifically, CausalMed first captures the causal relationship between diseases/procedures and medications through causal discovery and evaluates their causal effects. Building upon this, CausalMed focuses on analyzing the health state of patients, capturing the dynamic differences of diseases/procedures in different health states of patients, and transforming diseases/procedures into medications on direct causal relationships. Ultimately, CausalMed integrates information from longitudinal visits to recommend medication combinations. Extensive experiments on real-world datasets show that our method learns more personalized patient representation and outperforms state-of-the-art models in accuracy and safety. | 药物推荐系统旨在为特定患者推荐合适的药物。以往的研究主要集中在学习药物表征上,取得了显著进展。然而,这些方法由于以下主要限制,无法捕捉个性化的患者表征:(i)无法捕捉疾病/程序对不同患者健康状态影响的差异;(ii)未能建模药物与患者特定健康状态之间的直接因果关系,导致无法确定每种药物具体治疗哪种疾病。为解决这些限制,我们提出了CausalMed,一种以患者健康状态为中心的模型,能够增强患者表征的个性化。具体来说,CausalMed首先通过因果发现捕捉疾病/程序与药物之间的因果关系,并评估其因果效应。在此基础上,CausalMed专注于分析患者的健康状态,捕捉疾病/程序在不同健康状态下的动态差异,并基于直接因果关系将疾病/程序转化为药物。最终,CausalMed整合了纵向就诊信息,推荐药物组合。在真实世界数据集上的广泛实验表明,我们的方法学习到了更个性化的患者表征,并在准确性和安全性方面优于最先进的模型。 | code | 0 |
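PSNE: Efficient Spectral Sparsification Algorithms for Scaling Network Embedding | Longlong Lin, Yunfeng Yu, Zihao Wang, Zeli Wang, Yuying Zhao, Jin Zhao, Tao Jia | | Network embedding has numerous practical applications and has received extensive attention in graph learning, which aims at mapping vertices into a low-dimensional and continuous dense vector space by preserving the underlying structural properties of the graph. Many network embedding methods have been proposed, among which factorization of the Personalized PageRank (PPR for short) matrix has been empirically and theoretically well supported recently. However, several fundamental issues cannot be addressed. (1) Existing methods invoke a seminal Local Push subroutine to approximate a single row or column of the PPR matrix. Thus, they have to execute n (n is the number of nodes) Local Push subroutines to obtain a provable PPR matrix, resulting in prohibitively high computational costs for large n. (2) The PPR matrix has limited power in capturing the structural similarity between vertices, leading to performance degradation. To overcome these issues, we propose PSNE, an efficient spectral sparsification method for scaling network embedding, which can quickly obtain embedding vectors that retain strong structural similarities. Specifically, PSNE first designs a matrix polynomial sparsification method to accelerate the calculation of the PPR matrix, which has a theoretical guarantee in terms of the Frobenius norm. Subsequently, PSNE proposes a simple but effective multiple-perspective strategy to further enhance the representation power of the obtained approximate PPR matrix. Finally, PSNE applies a randomized singular value decomposition algorithm to the sparse and multiple-perspective PPR matrix to obtain the target embedding vectors. Experimental evaluation on real-world and synthetic datasets shows that our solutions are indeed more efficient, effective, and scalable compared with ten competitors. | 网络嵌入在图学习领域具有众多实际应用,并受到了广泛关注。其目标是将顶点映射到一个低维且连续的密集向量空间,同时保留图的底层结构特性。已有多种网络嵌入方法被提出,其中个性化PageRank(简称PPR)矩阵的分解方法近期在实践和理论上都得到了良好的支持。然而,仍存在几个基本问题无法解决。(1)现有方法调用一个开创性的Local Push子程序来近似PPR矩阵的单行或单列。因此,它们需要执行n(n为节点数量)次Local Push子程序以获得可证明的PPR矩阵,这对于大规模n来说计算成本极高。(2)PPR矩阵在捕捉顶点间结构相似性方面能力有限,导致性能下降。为解决这些问题,我们提出了PSNE,这是一种高效的谱稀疏化方法,用于扩展网络嵌入,能够快速获取保留强结构相似性的嵌入向量。具体而言,PSNE首先设计了一种矩阵多项式稀疏化方法,以加速PPR矩阵的计算,该方法在Frobenius范数方面具有理论保证。随后,PSNE提出了一种简单但有效的多视角策略,进一步增强所获得的近似PPR矩阵的表示能力。最后,PSNE对稀疏且多视角的PPR矩阵应用随机奇异值分解算法,以获取目标嵌入向量。对真实世界和合成数据集的实验评估表明,与十个竞争对手相比,我们的解决方案确实更加高效、有效且可扩展。 | code | 0 |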
UniRec: A Dual Enhancement of Uniformity and Frequency in Sequential Recommendations | Yang Liu, Yitong Wang, Chenyue Feng | | Representation learning in sequential recommendation is critical for accurately modeling user interaction patterns and improving recommendation precision. However, existing approaches predominantly emphasize item-to-item transitions, often neglecting the time intervals between interactions, which are closely related to behavior pattern changes. Additionally, broader interaction attributes, such as item frequency, are frequently overlooked. We found that both sequences with more uniform time intervals and items with higher frequency yield better prediction performance. Conversely, non-uniform sequences exacerbate user interest drift and less-frequent items are difficult to model due to sparse sampling, presenting unique challenges inadequately addressed by current methods. In this paper, we propose UniRec, a novel bidirectional enhancement sequential recommendation method. UniRec leverages sequence uniformity and item frequency to enhance performance, particularly improving the representation of non-uniform sequences and less-frequent items. These two branches mutually reinforce each other, driving comprehensive performance optimization in complex sequential recommendation scenarios. Additionally, we present a multidimensional time module to further enhance adaptability. To the best of our knowledge, UniRec is the first method to utilize the characteristics of uniformity and frequency for feature augmentation. Through comparison with eleven advanced models across four datasets, we demonstrate that UniRec significantly outperforms SOTA models. The code is available at https://github.com/Linxi000/UniRec. | 在序列推荐中的表示学习对于准确建模用户交互模式和提升推荐精度至关重要。然而,现有方法主要侧重于项目间的转换,往往忽视了交互之间的时间间隔,这些时间间隔与行为模式的变化密切相关。此外,更广泛的交互属性,如项目频率,也经常被忽略。我们发现,时间间隔更均匀的序列和频率更高的项目能带来更好的预测性能。相反,时间间隔不均匀的序列会加剧用户兴趣的漂移,而采样稀疏的低频项目则难以建模,这为当前方法带来了独特的挑战。在本文中,我们提出了UniRec,一种新颖的双向增强序列推荐方法。UniRec利用序列均匀性和项目频率来提升性能,特别是在改进非均匀序列和低频项目的表示方面。这两个分支相互强化,推动在复杂序列推荐场景中的全面性能优化。此外,我们还引入了一个多维度时间模块,以进一步增强适应性。据我们所知,UniRec是首个利用均匀性和频率特性进行特征增强的方法。通过与四个数据集上的十一种先进模型进行比较,我们展示了UniRec显著优于现有的最先进模型。代码已公开在https://github.com/Linxi000/UniRec。 | code | 0 |
Collaborative Cross-modal Fusion with Large Language Model for Recommendation | Zhongzhou Liu, Hao Zhang, Kuicai Dong, Yuan Fang | | Despite the success of conventional collaborative filtering (CF) approaches for recommendation systems, they exhibit limitations in leveraging semantic knowledge within the textual attributes of users and items. Recent focus on the application of large language models for recommendation (LLM4Rec) has highlighted their capability for effective semantic knowledge capture. However, these methods often overlook the collaborative signals in user behaviors. Some simply instruct-tune a language model, while others directly inject the embeddings of a CF-based model, lacking a synergistic fusion of different modalities. To address these issues, we propose a framework of Collaborative Cross-modal Fusion with Large Language Models, termed CCF-LLM, for recommendation. In this framework, we translate the user-item interactions into a hybrid prompt to encode both semantic knowledge and collaborative signals, and then employ an attentive cross-modal fusion strategy to effectively fuse latent embeddings of both modalities. Extensive experiments demonstrate that CCF-LLM outperforms existing methods by effectively utilizing semantic and collaborative signals in the LLM4Rec context. | 尽管传统的协同过滤(CF)方法在推荐系统中取得了成功,但它们在利用用户和项目文本属性中的语义知识方面存在局限性。最近,大语言模型在推荐系统中的应用(LLM4Rec)强调了其在有效捕捉语义知识方面的能力。然而,这些方法往往忽略了用户行为中的协同信号。有些方法仅仅是通过指令调整语言模型,而另一些方法则直接注入基于协同过滤模型的嵌入,缺乏不同模态之间的协同融合。为了解决这些问题,我们提出了一种名为CCF-LLM的大语言模型协同跨模态融合框架,用于推荐系统。在该框架中,我们将用户-项目交互转换为混合提示,以编码语义知识和协同信号,然后采用一种注意力跨模态融合策略,有效地融合两种模态的潜在嵌入。广泛的实验表明,CCF-LLM在LLM4Rec背景下,通过有效利用语义和协同信号,优于现有的方法。 | code | 0 |
Re-evaluating the Command-and-Control Paradigm in Conversational Search Interactions | Johanne R. Trippas, Luke Gallagher, Joel Mackenzie | The University of Queensland, Brisbane, Australia; RMIT University, Melbourne, Australia; The University of Melbourne, Melbourne, Australia | Conversational assistants are becoming prevalent among the wider population due to their simplicity and increasing utility. However, the shortcomings of these tools are as renowned as their benefits. In this work, we present a "first look" at an extensive collection of conversational queries, aiming to identify limitations and improvement opportunities specifically related to information access (i.e., search interactions). We explore over 600,000 Google Assistant interactions from 173 unique users, examining usage trends and the resulting deficiencies and strengths of these assistants. We aim to provide a balanced assessment, highlighting the assistant's shortcomings in supporting users and delivering relevant information to user needs and areas where it demonstrates a reasonable response to user inputs. Our analysis shows that, although most users conduct information-seeking tasks, there is little evidence of complex information-seeking behaviour, with most interactions consisting of simple, imperative instructions. Finally, we find that conversational devices allow users to benefit from increased naturalistic interactions and the ability to apply acquired information in situ, a novel observation for conversational information seeking. | 对话助手因其简便性和日益增加的实用性,在更广泛的人群中变得普及。然而,这些工具的缺点与其优点同样为人所知。在这项工作中,我们首次对大量对话查询进行了深入分析,旨在识别与信息访问(即搜索交互)相关的局限性和改进机会。我们研究了来自173名独特用户的超过600,000次Google Assistant交互,考察了使用趋势以及这些助手的缺陷和优势。我们的目标是提供一个平衡的评估,突出助手在支持用户和传递相关信息以满足用户需求方面的不足,以及在合理响应用户输入的领域。我们的分析显示,尽管大多数用户进行信息检索任务,但几乎没有证据表明存在复杂的信息检索行为,大多数交互由简单的、命令式的指令组成。最后,我们发现对话设备使用户能够受益于更加自然的交互,并能够在现场应用所获取的信息,这是对话信息检索的一个新颖观察。 | code | 0 |
Collaborative Alignment for Recommendation | Chen Wang, Liangwei Yang, Zhiwei Liu, Xiaolong Liu, Mingdai Yang, Yueqing Liang, Philip S. Yu | University of Illinois Chicago, Chicago, IL, USA; Salesforce AI Research, Palo Alto, CA, USA; Illinois Institute of Technology, Chicago, IL, USA | Traditional recommender systems have primarily relied on identity representations (IDs) to model users and items. Recently, the integration of pre-trained language models (PLMs) has enhanced the capability to capture semantic descriptions of items. However, while PLMs excel in few-shot, zero-shot, and unified modeling scenarios, they often overlook the crucial signals from collaborative filtering (CF), resulting in suboptimal performance when sufficient training data is available. To effectively combine semantic representations with the CF signal and enhance recommender system performance in both warm and cold settings, two major challenges must be addressed: (1) bridging the gap between semantic and collaborative representation spaces, and (2) refining while preserving the integrity of semantic representations. In this paper, we introduce CARec, a novel model that adeptly integrates collaborative filtering signals with semantic representations, ensuring alignment within the semantic space while maintaining essential semantics. We present experimental results from four real-world datasets, which demonstrate significant improvements. By leveraging collaborative alignment, CARec also shows remarkable effectiveness in cold-start scenarios, achieving notable enhancements in recommendation performance. The code is available at https://github.com/ChenMetanoia/CARec. | 传统的推荐系统主要依赖于身份表示(IDs)来建模用户和物品。近年来,预训练语言模型(PLMs)的引入增强了捕捉物品语义描述的能力。然而,尽管PLMs在少样本、零样本和统一建模场景中表现出色,但它们往往忽略了协同过滤(CF)中的关键信号,导致在有足够训练数据时性能不佳。为了有效结合语义表示与CF信号,并在冷启动和热启动场景中提升推荐系统性能,必须解决两个主要挑战:(1)弥合语义与协同表示空间之间的差距,(2)在保持语义表示完整性的同时进行优化。本文介绍了CARec,这是一种新型模型,能够巧妙地将协同过滤信号与语义表示相结合,确保在语义空间内的对齐同时保持基本语义。我们在四个真实世界数据集上进行了实验,结果显示了显著的改进。通过利用协同对齐,CARec在冷启动场景中也表现出显著的有效性,实现了推荐性能的显著提升。代码可在https://github.com/ChenMetanoia/CARec获取。 | code | 0 |
Sparks of Surprise: Multi-objective Recommendations with Hierarchical Decision Transformers for Diversity, Novelty, and Serendipity | Jie Wang, Alexandros Karatzoglou, Ioannis Arapakis, Xin Xin, Xuri Ge, Joemon M. Jose | Shandong University, Qingdao, China; University of Glasgow, Glasgow, United Kingdom; Amazon, Barcelona, Spain; Telefonica Research, Barcelona, Spain | Personalized Session-based Recommendation (PSR) extends the traditional sequential recommendation models, which typically recommend the next item based on a recent active session, to leverage historical sessions of a user for short-term recommendations in the current session. However, existing PSR methods face two limitations: (1) treating offline sessions uniformly as static data and relying on user embeddings to represent personalized information overlook the dynamic evolution of interests over time, which can change significantly as sessions progress in practical application. (2) focusing on accuracy, i.e., recommending items relevant to recent interactions, ignores the balance of multi-faceted requirements for user satisfaction, i.e., diversity, novelty, and serendipity. Therefore, we introduce the Multi-objective PSR (MOPSR) task and propose the Hierarchical Decision Transformers (HDT) framework, which models strictly sequential preference transitions of users across and within sessions to balance recommendation accuracy with the mentioned objectives. To address the first problem, Inter-session DT dynamically tracks the user's long-term preference across sessions by maintaining a goal state. This goal state serves as personalized information to collaboratively make recommendations with short-term state via the Intra-session DT. To tackle the second limitation, we propose inter-session and intra-session unexpected returns to trade off relevant recommendations and user preferences on diversity, novelty, and serendipity. The hierarchical returns help the recommender accurately identify signals of the user's expectations and changes in multi-objective preferences. To verify the effectiveness of our method on the MOPSR, we apply HDT to four state-of-the-art sequential recommendation models and conduct experiments on two publicly available datasets. Experimental results demonstrate that (1) HDT can widely generalize sequential models to solve the MOPSR task in scenarios with incrementally generated sessions, and (2) our method can balance multi-objectives by maintaining and even enhancing accuracy while effectively improving the diversity, novelty, and serendipity objectives. | 个性化会话推荐(PSR)扩展了传统的序列推荐模型——这些模型通常根据最近的活动会话推荐下一个项目——以利用用户的历史会话为当前会话提供短期推荐。然而,现有的PSR方法存在两个局限性:(1)将离线会话统一视为静态数据,并依赖用户嵌入来表示个性化信息,忽略了兴趣随时间的动态演变,这在实际应用中随着会话的进展可能会发生显著变化。(2)专注于准确性,即推荐与最近交互相关的项目,忽略了用户满意度的多方面需求平衡,即多样性、新颖性和意外性。因此,我们引入了多目标PSR(MOPSR)任务,并提出了层次决策变换器(HDT)框架,该框架对用户在会话内外严格顺序的偏好转移进行建模,以平衡推荐准确性与上述目标。为解决第一个问题,会话间DT通过维护一个目标状态来动态追踪用户在会话间的长期偏好。该目标状态作为个性化信息,通过会话内DT与短期状态协作进行推荐。为应对第二个局限性,我们提出了会话间和会话内的意外回报,以权衡相关推荐与用户对多样性、新颖性和意外性的偏好。层次回报有助于推荐系统准确识别用户期望的信号和多目标偏好的变化。为验证我们的方法在MOPSR上的有效性,我们将HDT应用于四种最先进的序列推荐模型,并在两个公开数据集上进行实验。实验结果表明:(1)HDT能够广泛推广序列模型,以解决在会话增量生成场景下的MOPSR任务;(2)我们的方法能够在保持甚至提高准确性的同时,有效提升多样性、新颖性和意外性目标,从而平衡多目标。 | code | 0 |
Content-Based Collaborative Generation for Recommender Systems | Yidan Wang, Zhaochun Ren, Weiwei Sun, Jiyuan Yang, Zhixiang Liang, Xin Chen, Ruobing Xie, Su Yan, Xu Zhang, Pengjie Ren, Zhumin Chen, Xin Xin | WeChat, Tencent, Beijing, China; Shandong University, Qingdao, China; Tencent, Beijing, China; Zhejiang University, Hangzhou, China; Leiden University, Leiden, Netherlands | Generative models have emerged as a promising utility to enhance recommender systems. It is essential to model both item content and user-item collaborative interactions in a unified generative framework for better recommendation. Although some existing large language model (LLM)-based methods contribute to fusing content information and collaborative signals, they fundamentally rely on textual language generation, which is not fully aligned with the recommendation task. How to integrate content knowledge and collaborative interaction signals in a generative framework tailored for item recommendation is still an open research challenge. In this paper, we propose content-based collaborative generation for recommender systems, namely ColaRec. ColaRec is a sequence-to-sequence framework which is tailored for directly generating the recommended item identifier. Precisely, the input sequence comprises data pertaining to the user's interacted items, and the output sequence represents the generative identifier (GID) for the suggested item. To model collaborative signals, the GIDs are constructed from a pretrained collaborative filtering model, and the user is represented as the content aggregation of interacted items. To this end, ColaRec captures both collaborative signals and content information in a unified framework. Then an item indexing task is proposed to conduct the alignment between the content-based semantic space and the interaction-based collaborative space. Besides, a contrastive loss is further introduced to ensure that items with similar collaborative GIDs have similar content representations. To verify the effectiveness of ColaRec, we conduct experiments on four benchmark datasets. Empirical results demonstrate the superior performance of ColaRec. | 生成模型已成为增强推荐系统的有力工具。为了实现更好的推荐效果,在一个统一的生成框架中同时建模项目内容和用户-项目协同交互是至关重要的。尽管一些现有的基于大型语言模型(LLM)的方法有助于融合内容信息和协同信号,但它们本质上依赖于文本语言生成,这并未完全适应推荐任务的需求。如何在专为项目推荐设计的生成框架中整合内容知识和协同交互信号,仍然是一个开放的研究挑战。本文提出了一种基于内容的协同生成推荐系统方法,命名为ColaRec。ColaRec是一个序列到序列框架,专门用于直接生成推荐项目的标识符。具体而言,输入序列包含用户交互过的项目数据,输出序列表示推荐项目的生成标识符(GID)。为了建模协同信号,GID由预训练的协同过滤模型构建,用户则表示为交互项目的聚合内容。通过这种方式,ColaRec在一个统一的框架中捕捉了协同信号和内容信息。随后,提出了一项项目索引任务,以实现基于内容语义空间与基于交互的协同空间之间的对齐。此外,引入对比损失以确保具有相似协同GID的项目具有相似的内容表示。为了验证ColaRec的有效性,我们在四个基准数据集上进行了实验。实证结果表明,ColaRec具有优越的性能。 | code | 0 |
Multi-Task Recommendation with Task Information Decoupling | Ruiran Yan, Rui Fan, Defu Lian | | Multi-task learning (MTL) has become increasingly prevalent in e-commerce recommender systems. However, existing MTL methods, particularly those utilizing the Multi-gate Mixture-of-Experts (MMoE) architecture, face challenges due to their implicit routing mechanisms. These mechanisms can inadvertently lead to negative knowledge transfer, failing to resolve conflicts among tasks and resulting in gradient contradictions on shared parameters. Such issues undermine the generalization capability of MTL models across various tasks. To address these limitations, we introduce the Task Information Decoupling Model (TIDM), designed to alleviate negative transfer by decoupling task knowledge. TIDM incorporates two innovative modules following the expert layer: the Maximize Information Aggregation Module (MIA) and the Automatic Information Selection Module (AIS). The MIA module employs an auxiliary loss to filter out irrelevant task information and aggregates task-specific knowledge using a dissimilar self-attention network. Subsequently, the AIS module automatically selects the most pertinent task-specific information to facilitate task tower learning. Our experiments demonstrate that TIDM outperforms five contemporary MTL models across two datasets, showcasing its effectiveness in extracting task-specific information. This advancement is crucial for enhancing the performance of recommender systems in e-commerce and other complex domains. | 多任务学习(MTL)在电子商务推荐系统中变得越来越普遍。然而,现有的MTL方法,特别是那些采用多门混合专家(MMoE)架构的方法,面临着由于其隐式路由机制带来的挑战。这些机制可能会无意中导致负知识转移,无法解决任务间的冲突,并在共享参数上产生梯度矛盾。这些问题削弱了MTL模型在各种任务中的泛化能力。为了解决这些局限性,我们引入了任务信息解耦模型(TIDM),旨在通过解耦任务知识来减轻负转移。TIDM在专家层之后包含了两个创新模块:最大化信息聚合模块(MIA)和自动信息选择模块(AIS)。MIA模块采用辅助损失来过滤无关任务信息,并使用不相似的自注意力网络聚合任务特定知识。随后,AIS模块自动选择最相关的任务特定信息,以促进任务塔的学习。我们的实验表明,TIDM在两个数据集上优于五种当代MTL模型,展示了其在提取任务特定信息方面的有效性。这一进展对于提升电子商务及其他复杂领域中推荐系统的性能至关重要。 | code | 0 |
MMLRec: A Unified Multi-Task and Multi-Scenario Learning Benchmark for Recommendation | Guanghu Yuan, Jieyu Yang, Shujie Li, Mingjie Zhong, Ang Li, Ke Ding, Yong He, Min Yang, Liang Zhang, Xiaolu Zhang, Linjian Mo | Ant Group, Hangzhou, China; Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China | In recent years, there has been a trend in the field of recommender systems towards multi-task modeling and multi-scenario modeling. The aim is to enhance the performance of various tasks and scenarios by jointly training on multiple tasks or scenarios to learn common patterns and features. Joint modeling of tasks and scenarios has also received widespread attention recently. However, despite the many methods proposed in recent years for Multi-Task Learning (MTL), Multi-Scenario Learning (MSL), and Multi-Task-Multi-Scenario Learning (MTMSL), a comprehensive benchmark for evaluating them is still lacking. Previous studies often employed different datasets, data processing techniques, data partitioning strategies, and hyperparameter settings, making replication of existing research and fair comparison of experimental results challenging. To address this challenge, we introduce MMLRec, the first unified comprehensive benchmark for evaluating MTL, MSL and MTMSL, featuring consistent dataset processing and identical parameter settings. This benchmark implements a range of MTL, MSL, and MTMSL algorithms, and evaluates them on multiple commonly used recommender systems datasets. Through fair comparative experiments, we find that some structurally simplistic recommendation algorithms are underestimated, as they can achieve comparable results to more complex algorithms while maintaining lower complexity. Furthermore, our experimental analysis indicates that more complex methods exhibit better robustness when there are significant differences between tasks or scenarios. 
By providing a unified framework (MMLRec), our goal is to promote rapid evaluation and inspire innovative research in this continuously evolving field. We hope that our open-source benchmark can facilitate swift, equitable evaluations, while also fostering further breakthrough research in the domains of MTL, MSL, and MTMSL. | 近年来,推荐系统领域出现了一种趋势,即向多任务建模和多场景建模发展。其目的是通过联合训练多个任务或场景,以学习共同的模型和特征,从而提升各个任务和场景的性能。任务和场景的联合建模也近来受到了广泛关注。然而,尽管近年来针对多任务学习(MTL)、多场景学习(MSL)以及多任务多场景学习(MTMSL)的方法提出了丰富的建议,但仍缺乏一个全面的基准来评估这些方法。以往的研究往往采用不同的数据集、数据处理技术、数据划分策略和超参数设置,这使得现有研究的复现和实验结果的公平比较变得困难。为了应对这一挑战,我们引入了MMLRec,这是首个用于评估MTL、MSL和MTMSL的统一综合性基准,具有一致的数据集处理和相同的参数设置。该基准实现了一系列MTL、MSL和MTMSL算法,并在多个常用的推荐系统数据集上对其进行了评估。通过公平的比较实验,我们发现一些结构上较为简单的推荐算法被低估了,因为它们能够在保持较低复杂度的同时,取得与更复杂算法相当的结果。此外,我们的实验分析表明,当任务或场景之间存在显著差异时,更复杂的方法表现出更好的鲁棒性。通过提供一个统一的框架(MMLRec),我们的目标是促进该领域的快速评估,并激发创新研究。我们希望我们的开源基准能够促进快速、公平的评估,同时也推动MTL、MSL和MTMSL领域的进一步突破性研究。 | code | 0 |
Reformulating Conversational Recommender Systems as Tri-Phase Offline Policy Learning | Gangyi Zhang, Chongming Gao, Hang Pan, Runzhe Teng, Ruizhe Li | | Existing Conversational Recommender Systems (CRS) predominantly utilize user simulators for training and evaluating recommendation policies. These simulators often oversimplify the complexity of user interactions by focusing solely on static item attributes, neglecting the rich, evolving preferences that characterize real-world user behavior. This limitation frequently leads to models that perform well in simulated environments but falter in actual deployment. Addressing these challenges, this paper introduces the Tri-Phase Offline Policy Learning-based Conversational Recommender System (TPCRS), which significantly reduces dependency on real-time interactions and mitigates overfitting issues prevalent in traditional approaches. TPCRS integrates a model-based offline learning strategy with a controllable user simulation that dynamically aligns with both personalized and evolving user preferences. Through comprehensive experiments, TPCRS demonstrates enhanced robustness, adaptability, and accuracy in recommendations, outperforming traditional CRS models in diverse user scenarios. This approach not only provides a more realistic evaluation environment but also facilitates a deeper understanding of user behavior dynamics, thereby refining the recommendation process. | 现有的对话推荐系统(CRS)主要依赖用户模拟器进行推荐策略的训练和评估。这些模拟器通常过于简化用户交互的复杂性,仅关注静态的物品属性,而忽略了现实世界中用户行为所具有的丰富且不断演变的偏好。这一局限性往往导致模型在模拟环境中表现良好,但在实际应用中却表现不佳。为了应对这些挑战,本文提出了基于三阶段离线策略学习的对话推荐系统(TPCRS),该系统显著减少了对实时交互的依赖,并缓解了传统方法中常见的过拟合问题。TPCRS结合了基于模型的离线学习策略与可控的用户模拟器,后者能够动态地与个性化且不断演变的用户偏好相匹配。通过全面的实验,TPCRS在推荐系统的鲁棒性、适应性和准确性方面表现出色,在多种用户场景下均优于传统的CRS模型。这种方法不仅提供了一个更为真实的评估环境,还促进了对于用户行为动态的深入理解,从而优化了推荐过程。 | code | 0 |
HGCH: A Hyperbolic Graph Convolution Network Model for Heterogeneous Collaborative Graph Recommendation | Lu Zhang, Ning Wu | Huazhong University of Science and Technology, Wuhan, China; Beihang University, Beijing, China | User-item interaction data in collaborative filtering and graph modeling tasks often exhibit power-law characteristics, which suggest the suitability of hyperbolic space modeling. Hyperbolic Graph Convolution Neural Networks (HGCNs) are a novel technique that leverages the advantages of GCN and hyperbolic space, and then achieves remarkable results. However, existing HGCN methods have several drawbacks: they fail to fully leverage hyperbolic space properties due to arbitrary embedding initialization and imprecise tangent space aggregation; they overlook auxiliary information that could enrich the collaborative graph; and their training convergence is slow due to margin ranking loss and random negative sampling. To overcome these challenges, we propose Hyperbolic Graph Collaborative for Heterogeneous Recommendation (HGCH), an enhanced HGCN-based model for collaborative filtering that integrates diverse side information into a heterogeneous collaborative graph and improves training convergence speed. HGCH first preserves the long-tailed nature of the graph by initializing node embeddings with power law prior; then it aggregates neighbors in hyperbolic space using the gyromidpoint method for accurate computation; finally, it fuses multiple embeddings from different hyperbolic spaces by the gate fusion with prior. Moreover, HGCH employs a hyperbolic user-specific negative sampling to speed up convergence. We evaluate HGCH on four real datasets, and the results show that HGCH achieves competitive results and outperforms leading baselines, including HGCNs. Extensive ablation studies further confirm its effectiveness. 
| 在协同过滤和图建模任务中的用户-物品交互数据往往呈现出幂律分布特征,这表明双曲空间建模的适用性。双曲图卷积神经网络(HGCNs)是一种利用GCN和双曲空间优势的新技术,并取得了显著成果。然而,现有的HGCN方法存在几个缺点:由于任意嵌入初始化和不精确的切空间聚合,未能充分利用双曲空间特性;忽略了可以丰富协同图的辅助信息;由于边缘排序损失和随机负采样,训练收敛速度慢。为了克服这些挑战,我们提出了异构推荐的双曲图协同(HGCH),这是一个基于HGCN的协同过滤增强模型,它将多样化的辅助信息整合到异构协同图中,并提高了训练收敛速度。HGCH首先通过使用幂律先验初始化节点嵌入来保留图的长尾特性;然后使用双曲空间中的中点方法聚合邻居以进行精确计算;最后,通过带先验的门融合方法融合来自不同双曲空间的多个嵌入。此外,HGCH采用双曲用户特定的负采样来加速收敛。我们在四个真实数据集上评估了HGCH,结果显示HGCH取得了竞争性的结果,并优于包括HGCNs在内的领先基线。广泛的消融研究进一步证实了其有效性。 | code | 0 |
EMERGE: Enhancing Multimodal Electronic Health Records Predictive Modeling with Retrieval-Augmented Generation | Yinghao Zhu, Changyu Ren, Zixiang Wang, Xiaochen Zheng, Shiyun Xie, Junlan Feng, Xi Zhu, Zhoujun Li, Liantao Ma, Chengwei Pan | China Mobile Research Institute, Beijing, China; Peking University, Beijing, China; Beihang University & Peking University, Beijing, China; Beihang University & Zhongguancun Laboratory, Beijing, China; ETH Zürich, Zürich, Switzerland; Beihang University, Beijing, China | The integration of multimodal Electronic Health Records (EHR) data has significantly advanced clinical predictive capabilities. Existing models, which utilize clinical notes and multivariate time-series EHR data, often fall short of incorporating the necessary medical context for accurate clinical tasks, while previous approaches with knowledge graphs (KGs) primarily focus on structured knowledge extraction. In response, we propose EMERGE, a Retrieval-Augmented Generation (RAG) driven framework to enhance multimodal EHR predictive modeling. We extract entities from both time-series data and clinical notes by prompting Large Language Models (LLMs) and align them with professional PrimeKG, ensuring consistency. In addition to triplet relationships, we incorporate entities' definitions and descriptions for richer semantics. The extracted knowledge is then used to generate task-relevant summaries of patients' health statuses. Finally, we fuse the summary with other modalities using an adaptive multimodal fusion network with cross-attention. Extensive experiments on the MIMIC-III and MIMIC-IV datasets' in-hospital mortality and 30-day readmission tasks demonstrate the superior performance of the EMERGE framework over baseline models. Comprehensive ablation studies and analysis highlight the efficacy of each designed module and robustness to data sparsity. 
EMERGE contributes to refining the utilization of multimodal EHR data in healthcare, bridging the gap with nuanced medical contexts essential for informed clinical predictions. We have publicly released the code at https://github.com/yhzhu99/EMERGE. | 多模态电子健康记录(EHR)数据的整合显著提升了临床预测能力。现有的模型,尽管利用了临床笔记和多元时间序列EHR数据,但往往未能充分纳入必要的医学背景信息,以实现精准的临床任务;而先前基于知识图谱(KGs)的方法则主要集中在结构化知识的提取上。为此,我们提出了EMERGE,一个由检索增强生成(RAG)驱动框架,旨在提升多模态EHR预测建模。我们通过提示大型语言模型(LLMs)从时间序列数据和临床笔记中提取实体,并将其与专业的PrimeKG对齐,以确保一致性。除了三元组关系外,我们还纳入了实体的定义和描述,以丰富语义。提取的知识随后用于生成与任务相关的患者健康状况摘要。最后,我们利用具有交叉注意力的自适应多模态融合网络将该摘要与其他模态数据融合。在MIMIC-III和MIMIC-IV数据集上的住院死亡率和30天再入院任务的广泛实验表明,EMERGE框架优于基线模型。全面的消融研究和分析突显了每个设计模块的有效性及其对数据稀疏性的稳健性。EMERGE有助于优化多模态EHR数据在医疗领域的应用,弥合了与精细医学背景之间的差距,这对于精准的临床预测至关重要。我们已经公开发布了代码,地址为https://github.com/yhzhu99/EMERGE。 | code | 0 |
Pairing Clustered Inverted Indexes with κ-NN Graphs for Fast Approximate Retrieval over Learned Sparse Representations | Sebastian Bruch, Franco Maria Nardini, Cosimo Rulli, Rossano Venturini | | Learned sparse representations form an effective and interpretable class of embeddings for text retrieval. While exact top-k retrieval over such embeddings faces efficiency challenges, a recent algorithm called Seismic has enabled remarkably fast, highly-accurate approximate retrieval. Seismic statically prunes inverted lists, organizes each list into geometrically-cohesive blocks, and augments each block with a summary vector. At query time, each inverted list associated with a query term is traversed one block at a time in an arbitrary order, with the inner product between the query and summaries determining if a block must be evaluated. When a block is deemed promising, its documents are fully evaluated with a forward index. Seismic is one to two orders of magnitude faster than state-of-the-art inverted index-based solutions and significantly outperforms the winning graph-based submissions to the BigANN 2023 Challenge. In this work, we speed up Seismic further by introducing two innovations to its query processing subroutine. First, we traverse blocks in order of importance, rather than arbitrarily. Second, we take the list of documents retrieved by Seismic and expand it to include the neighbors of each document using an offline k-regular nearest neighbor graph; the expanded list is then ranked to produce the final top-k set. Experiments on two public datasets show that our extension, named SeismicWave, can reach almost-exact accuracy levels and is up to 2.2x faster than Seismic. 
 | 学习到的稀疏表示为文本检索提供了一类有效且可解释的嵌入。尽管在这种嵌入上进行精确的top-k检索面临效率挑战,但最近的一种名为Seismic的算法实现了非常快速且高度精确的近似检索。Seismic静态地修剪倒排列表,将每个列表组织成几何上内聚的块,并为每个块增加一个摘要向量。在查询时,与查询词相关的每个倒排列表按任意顺序逐块遍历,通过查询与摘要之间的内积来确定是否需要评估该块。当一个块被认为有前景时,其文档会通过正向索引进行全面评估。Seismic比最先进的基于倒排索引的解决方案快一到两个数量级,并且在2023年BigANN挑战赛中显著优于基于图的获胜提交方案。在本研究中,我们通过在查询处理子程序中引入两项创新来进一步加速Seismic。首先,我们按重要性顺序遍历块,而不是任意顺序。其次,我们使用离线的k-正则最近邻图,将Seismic检索到的文档列表扩展为包含每个文档的邻居;然后对扩展后的列表进行排序以生成最终的top-k集合。在两个公开数据集上的实验表明,我们的扩展版本SeismicWave几乎可以达到完全精确的准确度,并且比Seismic快2.2倍。 | code | 0 |
PP4RNR: Popularity- and Position-Aware Contrastive Learning for Retrieval-Driven News Recommendation | Wenwei Chen, Yewang Chen | College of Computer Science and Technology, Huaqiao University, Xiamen, China | Existing news recommendation systems often overlook the diversity of recommended content and exhibit popularity bias, resulting in suboptimal performance. To address this issue, this paper introduces a novel news recommendation approach, Popularity- and Position-Aware Contrastive Learning for Retrieval-Driven News Recommendation (PP4RNR). It consists of two modules: Entity-Level Retrieval Augmentation (ERA) and Popularity- and Position-Aware Contrastive Learning (PPCL). The ERA module utilizes both entities and titles to retrieve relevant news. Subsequently, retrieval-augmented news is fused with candidate news using our innovative cascaded attention network, leading to richer and more diverse news semantics. The PPCL module introduces perturbations in the news representation using a Gaussian perturbation vector based on the popularity and position information and then employs contrastive learning to regularize the representation space. Hence, this approach not only deepens the understanding of content diversity but also implicitly mitigates the popularity bias prevalent in current models. Rigorous testing on benchmark datasets demonstrates that our method significantly outperforms a range of state-of-the-art techniques. | 现有的新闻推荐系统往往忽视推荐内容的多样性,并表现出流行度偏差,导致推荐效果不佳。为解决这一问题,本文提出了一种新颖的新闻推荐方法——基于检索的新闻推荐的流行度和位置感知对比学习(PP4RNR)。该方法包含两个模块:实体级检索增强(ERA)和流行度与位置感知的对比学习(PPCL)。ERA模块利用实体和标题来检索相关新闻。随后,通过我们创新的级联注意力网络将检索增强的新闻与候选新闻融合,从而丰富和多样化新闻语义。PPCL模块基于新闻的流行度和位置信息引入高斯扰动向量,对新闻表示进行扰动,然后利用对比学习来规范化表示空间。因此,这种方法不仅加深了对内容多样性的理解,还隐式地缓解了当前模型中普遍存在的流行度偏差。在基准数据集上的严格测试表明,我们的方法显著优于一系列最先进的技术。 | code | 0 |
Exploiting Preferences in Loss Functions for Sequential Recommendation via Weak Transitivity | Hyunsoo Chung, Jungtaek Kim, Hyungeun Jo, Hyungwon Choi | | A choice of optimization objective is immensely pivotal in the design of a recommender system as it affects the general modeling process of a user's intent from previous interactions. Existing approaches mainly adhere to three categories of loss functions: pairwise, pointwise, and setwise loss functions. Despite their effectiveness, a critical and common drawback of such objectives is viewing the next observed item as a unique positive while considering all remaining items equally negative. Such a binary label assignment is generally limited to assuring a higher recommendation score of the positive item, neglecting potential structures induced by varying preferences between other unobserved items. To alleviate this issue, we propose a novel method that extends original objectives to explicitly leverage the different levels of preferences as relative orders between their scores. Finally, we demonstrate the superior performance of our method compared to baseline objectives. | 优化目标的选择在推荐系统设计中极为关键,因为它影响从用户先前交互中对用户意图的一般建模过程。现有方法主要遵循三类损失函数:成对损失、逐点损失和集合损失函数。尽管这些方法有效,但它们的一个关键且普遍的缺点是将下一个观察到的项目视为唯一的正样本,而将所有剩余项目视为等同的负样本。这种二元标签分配通常仅限于确保正样本的推荐得分更高,而忽略了其他未观察项目之间因偏好差异所诱导的潜在结构。为解决这一问题,我们提出了一种新方法,将原始目标扩展为显式利用其得分之间的相对顺序来表示不同级别的偏好。最后,我们展示了我们的方法相对于基线目标的优越性能。 | code | 0 |
RECE: Reduced Cross-Entropy Loss for Large-Catalogue Sequential Recommenders | Danil Gusak, Gleb Mezentsev, Ivan V. Oseledets, Evgeny Frolov | | Scalability is a major challenge in modern recommender systems. In sequential recommendations, full Cross-Entropy (CE) loss achieves state-of-the-art recommendation quality but consumes excessive GPU memory with large item catalogs, limiting its practicality. Using a GPU-efficient locality-sensitive hashing-like algorithm for approximating large tensor of logits, this paper introduces a novel RECE (REduced Cross-Entropy) loss. RECE significantly reduces memory consumption while allowing one to enjoy the state-of-the-art performance of full CE loss. Experimental results on various datasets show that RECE cuts training peak memory usage by up to 12 times compared to existing methods while retaining or exceeding performance metrics of CE loss. The approach also opens up new possibilities for large-scale applications in other domains. | 可扩展性是现代推荐系统面临的主要挑战之一。在序列推荐中,全交叉熵(CE)损失实现了最先进的推荐质量,但在处理大型物品目录时会消耗过多的GPU内存,限制了其实用性。本文介绍了一种新的RECE(缩减交叉熵)损失,通过使用GPU高效的类似局部敏感哈希算法来近似大的logits张量。RECE显著减少了内存消耗,同时允许用户享受到全CE损失的最先进性能。在各种数据集上的实验结果表明,与现有方法相比,RECE将训练峰值内存使用量减少了高达12倍,同时保持或超过了CE损失的性能指标。该方法还为其他领域的大规模应用开辟了新的可能性。 | code | 0 |
Enhanced Retrieval Effectiveness through Selective Query Generation | Seyed Mohammad Hosseini, Negar Arabzadeh, Morteza Zihayat, Ebrahim Bagheri | University of Waterloo, Waterloo, Ontario, Canada; Toronto Metropolitan University, Toronto, Ontario, Canada | Prior research has demonstrated that reformulation of queries can significantly enhance retrieval effectiveness. Despite notable successes in neural-based query reformulation methods, identifying optimal reformulations that cover the same information need while enhancing retrieval effectiveness is still challenging. This paper introduces a two-step query reformulation framework for generating and selecting optimal target query variants which not only achieve higher retrieval performance but also preserve the original query's information need. Our comprehensive evaluations on the MS MARCO dataset and TREC Deep Learning tracks demonstrate substantial improvements over the original query's performance. | 先前研究表明,查询的重构可以显著提升检索效果。尽管基于神经网络的查询重构方法取得了显著成功,但识别出既能覆盖相同信息需求又能增强检索效果的最佳重构查询仍然具有挑战性。本文提出了一种两步走的查询重构框架,用于生成和选择最优的目标查询变体,这些变体不仅实现了更高的检索性能,还保留了原始查询的信息需求。我们在MS MARCO数据集和TREC深度学习赛道上的全面评估显示,相较于原始查询,性能有显著提升。 | code | 0 |
Post-Training Embedding Enhancement for Long-Tail Recommendation | Geon Lee, Kyungho Kim, Kijung Shin | KAIST, Seoul, Republic of Korea | Item popularity in real-world data follows a long-tail distribution, where a few items attract most of the attention, while the majority receive much less. This disparity results in high-quality embeddings for popular (head) items, but lower-quality embeddings for unpopular (tail) items, leading to less accurate recommendations for the latter. Our observations confirm that embeddings of tail items often exhibit (1) magnitudes (i.e., norms) that are less reflective of actual popularity and (2) directions that are less effective in capturing user preferences, compared to those of head items. To address this issue, we propose EDGE, a post-training embedding enhancement method for long-tail recommendations. EDGE employs two key strategies: (1) refining embedding magnitudes to better reflect item popularity and (2) adjusting embedding directions by leveraging knowledge from head items. Importantly, EDGE is model-agnostic and can be applied to embeddings learned from any trained recommender system. Experimental results show that EDGE significantly improves tail item recommendation performance and overall system performance, achieving up to an improvement of 211.23% in NDCG@20 over the state-of-the-art method. Our code and datasets are available at https://github.com/geon0325/EDGE. | 现实世界数据中的物品流行度遵循长尾分布,其中少数物品吸引了大部分关注,而大多数物品则受到较少的关注。这种差异导致流行(头部)物品的嵌入质量较高,但不流行(尾部)物品的嵌入质量较低,从而使得后者的推荐准确性降低。我们的观察证实了尾部物品的嵌入通常表现出(1)范数(即模)不太能反映实际流行度,以及(2)方向不太能有效捕捉用户偏好,相比头部物品的嵌入。为了解决这一问题,我们提出了EDGE,一种用于长尾推荐的后训练嵌入增强方法。EDGE采用两种关键策略:(1)优化嵌入范数以更好地反映物品流行度,以及(2)通过利用头部物品的知识来调整嵌入方向。重要的是,EDGE与模型无关,可以应用于从任何训练好的推荐系统中学习到的嵌入。实验结果表明,EDGE显著提升了尾部物品的推荐性能和整体系统性能,在NDCG@20指标上相比最先进的方法提升了高达211.23%。我们的代码和数据集可在https://github.com/geon0325/EDGE获取。 | code | 0 |
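Scalable and Adaptive Spectral Embedding for Attributed Graph Clustering | Yunhui Liu, Tieke He, Qing Wu, Tao Zheng, Jianhua Zhao | | Attributed graph clustering, which aims to group the nodes of an attributed graph into disjoint clusters, has made promising advancements in recent years. However, most existing methods face challenges when applied to large graphs due to the expensive computational cost and high memory usage. In this paper, we introduce Scalable and Adaptive Spectral Embedding (SASE), a simple attributed graph clustering method devoid of parameter learning. SASE comprises three main components: node features smoothing via $k$-order simple graph convolution, scalable spectral clustering using random Fourier features, and adaptive order selection. With these designs, SASE not only effectively captures global cluster structures but also exhibits linear time and space complexity with respect to the graph size. Empirical results demonstrate the superiority of SASE. For example, on the ArXiv dataset with 169K nodes and 1.17M edges, SASE achieves a 6.9% improvement in ACC and a $5.87\times$ speedup compared to the runner-up, S3GC. | 属性图聚类旨在将属性图的节点分组为不相交的集群,近年来取得了显著进展。然而,大多数现有方法在应用于大规模图时面临挑战,主要是因为计算成本高且内存使用量大。本文提出了一种名为可扩展自适应谱嵌入(SASE)的简单属性图聚类方法,该方法无需参数学习。SASE包含三个主要组件:通过$k$阶简单图卷积进行节点特征平滑、使用随机傅里叶特征的可扩展谱聚类以及自适应阶数选择。这些设计使得SASE不仅能够有效捕捉全局聚类结构,而且相对于图的大小表现出线性的时间和空间复杂度。实证结果表明,SASE具有优越性。例如,在拥有169K节点和1.17M边的ArXiv数据集上,SASE在ACC上比次优的S3GC提高了6.9%,并且速度提升了$5.87$倍。 | code | 0 |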
P-Rank+: A Scalable Efficient P-Rank Search Algorithm | Maoyin Zhang, Weiren Yu | Warwick University, Coventry, United Kingdom; Nanjing University of Sci. & Tech., Jiangsu, China | P-Rank (Penetrating-Rank) is a charming measure of structural similarity between objects based on graph topology. It recursively follows the principle that "two objects are considered similar if (a) they are referenced by similar objects and (b) they reference similar objects". The best-known algorithm for computing P-Rank employs two repeated Singular Value Decompositions (SVDs) coupled with the Woodbury matrix identity. However, this method does not scale well on billion-sized graphs. Worse yet, this algorithm only provides a linear approximation of the P-Rank model and cannot deliver accurate P-Rank values. In this paper, we propose P-Rank+, a fast and efficient algorithm for computing P-Rank similarities, which scales well on large graphs with billions of edges. P-Rank+ leverages dimensionality reduction techniques by performing only one SVD of the graph integrated with Hadamard products in the reduced subspace. Moreover, we provide provable error guarantees for P-Rank+ computation. Experiments on various datasets validate that P-Rank+ is 1 to 3 orders of magnitude faster than the best-known competitor while achieving excellent scalability on massive graphs. | P-Rank(渗透排序)是一种基于图拓扑结构的对象间结构相似性的迷人度量方法。它递归地遵循以下原则:“如果两个对象(a)被相似的对象引用,并且(b)引用相似的对象,则认为它们是相似的”。计算P-Rank最著名的算法采用了两个重复的奇异值分解(SVD),并与Woodbury矩阵恒等式相结合。然而,这种方法在处理十亿级大小的图时扩展性不佳。更糟糕的是,该算法仅提供了P-Rank模型的线性近似,无法提供精确的P-Rank值。在本文中,我们提出了P-Rank+,一种快速且高效的计算P-Rank相似性的算法,该算法在拥有数十亿条边的大型图上具有良好的扩展性。P-Rank+通过在降维子空间中执行一次图的SVD结合Hadamard积来利用降维技术。此外,我们为P-Rank+的计算提供了可证明的误差保证。在各种数据集上的实验验证了P-Rank+比已知的最优竞争对手快1到3个数量级,同时在大型图上表现出卓越的可扩展性。 | code | 0 |
Learning the Dynamics in Sequential Recommendation by Exploiting Real-time Information | Rujiao Zhang, Hao Zhang, Yucong Luo, Zhiding Liu, Mingyue Cheng, Qi Liu, Enhong Chen | | Sequential recommender systems offer personalized suggestions by modeling users' interactions chronologically to capture dynamic user interest. Existing approaches typically fail to adequately describe the dynamics of the entire recommender system, including shifts in both user interest and item availability. To address this, we propose a simple yet effective framework with three key perspectives, tailored to the dynamics of recommender system by fully exploiting the time information. Firstly, we propose a dynamic candidate set construction approach to prevent the model from learning future interactions. Secondly, assuming that user behaviors remain consistent over short terms but may evolve over long terms, we employ an interval-weighted optimization target to model the correlation of users' historical interactions. Finally, we introduce a specialized time-aware attention module to enhance recommendations within specific temporal contexts. Extensive experiments demonstrate the effectiveness and generalizability of our framework. We make our codes publicly available. | 顺序推荐系统通过按时间顺序建模用户的交互来捕捉动态用户兴趣,从而提供个性化建议。现有的方法通常未能充分描述整个推荐系统的动态变化,包括用户兴趣和物品可用性的变化。为了解决这一问题,我们提出了一个简单而有效的框架,该框架从三个关键角度出发,充分利用时间信息来适应推荐系统的动态变化。首先,我们提出了一种动态候选集构建方法,以防止模型学习未来的交互。其次,假设用户行为在短期内保持一致,但在长期内可能发生变化,我们采用了一种区间加权优化目标来建模用户历史交互的相关性。最后,我们引入了一个专门的时间感知注意力模块,以增强在特定时间上下文中的推荐效果。大量实验证明了我们框架的有效性和普适性。我们将代码公开发布。 | code | 0 |
VIER: Visual Imagination Enhanced Retrieval in Sponsored Search | Yadong Zhang, Yuqing Song, Siyu Lu, Qiang Liu, Xingxing Wang | Meituan, Beijing, China | Embedding-based Retrieval (EBR) has been a fundamental component in sponsored-search systems, which retrieves high-quality products for the user's search query by encoding the information of the query, user and product into dense embeddings. However, due to the characteristic of location-based service, the user input queries suffer from two extremes: overly brief queries with vague intentions and lengthy queries with substantial noise, both of which make it challenging to discern the exact user search intent. In fact, the e-consumers typically have a mental imagery of the product they intend to search for, reflecting their specific purchasing intentions. In this paper, we propose a Visual Imagination Enhanced Retrieval model (VIER) to explore the implicit imagery of users. Specifically, we design a visual imagination network to reconstruct the imagery embeddings that capture both coarse-grained query commonalities and fine-grained user personalities. These pseudo-image representations are integrated with the query and user behavior to enhance the understanding of user search intentions for improved retrieval. According to online A/B tests on Meituan sponsored-search system, our method significantly outperforms baselines in terms of revenue, clicks and click-through rate. | 基于嵌入的检索(EBR)已成为赞助搜索系统中的基础组件,通过将查询、用户和产品的信息编码为密集嵌入,检索出高质量的产品以满足用户的搜索需求。然而,由于基于位置服务的特性,用户输入的查询呈现出两种极端:过于简短且意图模糊的查询,以及冗长但包含大量噪音的查询,这两者都使得准确识别用户的搜索意图变得困难。实际上,电子消费者通常对其意图搜索的产品有一个心理意象,这反映了他们特定的购买意向。在本文中,我们提出了一种视觉想象力增强的检索模型(VIER),以探索用户的隐含意象。具体而言,我们设计了一个视觉想象力网络,用于重建捕捉粗粒度查询共性和细粒度用户个性的意象嵌入。这些伪图像表示与查询和用户行为相结合,以增强对用户搜索意图的理解,从而改进检索效果。根据在美团赞助搜索系统上的在线A/B测试结果,我们的方法在收入、点击量和点击率方面显著优于基线方法。 | code | 0 |
Transforming Location Retrieval at Airbnb: A Journey from Heuristics to Reinforcement Learning | Dillon Davis, Huiji Gao, Thomas Legrand, Malay Haldar, Alex Deng, Han Zhao, Liwei He, Sanjeev Katariya | | The Airbnb search system grapples with many unique challenges as it continues to evolve. We oversee a marketplace that is nuanced by geography, diversity of homes, and guests with a variety of preferences. Crafting an efficient search system that can accommodate diverse guest needs, while showcasing relevant homes lies at the heart of Airbnb's success. Airbnb search has many challenges that parallel other recommendation and search systems but it has a unique information retrieval problem, upstream of ranking, called location retrieval. It requires defining a topological map area that is relevant to the searched query for homes listing retrieval. The purpose of this paper is to demonstrate the methodology, challenges, and impact of building a machine learning based location retrieval product from the ground up. Despite the lack of suitable, prevalent machine learning based approaches, we tackle cold start, generalization, differentiation and algorithmic bias. We detail the efficacy of heuristics, statistics, machine learning, and reinforcement learning approaches to solve these challenges, particularly for systems that are often unexplored by current literature. | 随着不断发展,Airbnb搜索系统面临着许多独特的挑战。我们管理着一个由地理位置、房屋多样性和具有各种偏好的客人所构成的复杂市场。打造一个能够满足不同客人需求的高效搜索系统,同时展示相关房屋,是Airbnb成功的核心。Airbnb搜索系统面临许多与其他推荐和搜索系统相似的挑战,但它有一个独特的信息检索问题,即在排序之前的位置检索。这需要定义一个与搜索查询相关的拓扑地图区域,以便进行房屋列表检索。本文旨在展示从头构建一个基于机器学习的位置检索产品的过程、挑战及其影响。尽管缺乏合适且普遍的基于机器学习的方法,我们解决了冷启动、泛化、差异化和算法偏差等问题。我们详细介绍了启发式、统计学、机器学习和强化学习方法在这些挑战中的有效性,特别是针对当前文献中较少探索的系统。 | code | 0 |
Pareto-based Multi-Objective Recommender System with Forgetting Curve | Jipeng Jin, Zhaoxiang Zhang, Zhiheng Li, Xiaofeng Gao, Xiongwen Yang, Lei Xiao, Jie Jiang | ; Shanghai Jiao Tong University | Recommender systems with cascading architecture play an increasingly significant role in online recommendation platforms, where the approach to dealing with negative feedback is a vital issue. For instance, in short video platforms, users tend to quickly slip away from candidates that they feel aversive, and recommender systems are expected to receive these explicit negative feedbacks and make adjustments to avoid these recommendations. Considering recency effect in memories, we propose a forgetting model based on Ebbinghaus Forgetting Curve to cope with negative feedback. In addition, we introduce a Pareto optimization solver to guarantee a better trade-off between recency and model performance. In conclusion, we propose Pareto-based Multi-Objective Recommender System with forgetting curve (PMORS), which can be applied to any multi-objective recommendation and shows sufficient superiority when facing explicit negative feedback. We have conducted evaluations of PMORS and achieved favorable outcomes in short-video scenarios on both public dataset and industrial dataset. After being deployed on an online short video platform named WeChat Channels in May, 2023, PMORS has not only demonstrated promising results for both consistency and recency but also achieved an improvement of up to +1.45 | 具有级联架构的推荐系统在在线推荐平台中扮演着越来越重要的角色,其中如何处理负面反馈是一个关键问题。例如,在短视频平台中,用户往往会对感到不喜欢的候选内容迅速失去兴趣,推荐系统需要接收这些明确的负面反馈并进行调整,以避免此类推荐。考虑到记忆中的时效性效应,我们提出了一种基于艾宾浩斯遗忘曲线的遗忘模型来处理负面反馈。此外,我们引入了一种帕累托优化求解器,以确保在时效性和模型性能之间取得更好的平衡。综上所述,我们提出了基于帕累托的多目标推荐系统(PMORS),该系统可以应用于任何多目标推荐,并且在面对明确的负面反馈时表现出足够的优越性。我们对PMORS进行了评估,并在公共数据集和工业数据集的短视频场景中取得了良好的结果。自2023年5月在名为微信视频号的在线短视频平台上部署以来,PMORS不仅在一致性和时效性方面展示了有前景的结果,还实现了高达+1.45的改进。 | code | 0 |
Ads Supply Personalization via Doubly Robust Learning | Wei Shi, Chen Fu, Qi Xu, Sanjian Chen, Jizhe Zhang, Qinqin Zhu, Zhigang Hua, Shuang Yang | Meta Platforms, Inc., Sunnyvale, CA, USA; Meta Platforms, Inc., Menlo Park, CA, USA | Ads supply personalization aims to balance the revenue and user engagement, two long-term objectives in social media ads, by tailoring the ad quantity and density. In the industry-scale system, the challenge for ads supply lies in modeling the counterfactual effects of a conservative supply treatment (e.g., a small density change) over an extended duration. In this paper, we present a streamlined framework for personalized ad supply. This framework optimally utilizes information from data collection policies through the doubly robust learning. Consequently, it significantly improves the accuracy of long-term treatment effect estimates. Additionally, its low-complexity design not only results in computational cost savings compared to existing methods, but also makes it scalable for billion-scale applications. Through both offline experiments and online production tests, the framework consistently demonstrated significant improvements in top-line business metrics over months. The framework has been fully deployed to live traffic in one of the world's largest social media platforms. | 广告供应个性化旨在通过调整广告数量和密度,平衡社交媒体广告中的两个长期目标:收入和用户参与度。在行业规模的系统中,广告供应的挑战在于对保守供应处理(例如,小幅密度变化)在长时间内的反事实效应进行建模。本文提出了一种简化的个性化广告供应框架。该框架通过双重稳健学习最优地利用数据收集策略中的信息,从而显著提高了长期处理效应估计的准确性。此外,其低复杂度的设计不仅在计算成本上优于现有方法,还使其适用于亿级规模的应用。通过离线实验和在线生产测试,该框架在数月内持续显示出对业务关键指标的显著改进。该框架已全面部署到全球最大社交媒体平台之一的实时流量中。 | code | 0 |
DivNet: Diversity-Aware Self-Correcting Sequential Recommendation Networks | Shuai Xiao, Zaifan Jiang | Alibaba Group, Shanghai, China; Alibaba Group, Beijing, China | As the last stage of a typical recommendation system, collective recommendation aims to give the final touches to the recommended items and their layout so as to optimize overall objectives such as diversity and whole-page relevance. In practice, however, the interaction dynamics among the recommended items, their visual appearances and meta-data such as specifications are often too complex to be captured by experts' heuristics or simple models. To address this issue, we propose a diversity-aware self-correcting sequential recommendation network (DivNet) that is able to estimate utility by capturing the complex interactions among sequential items and diversify recommendations simultaneously. Experiments on both offline and online settings demonstrate that DivNet can achieve better results compared to baselines with or without collective recommendations. | 在典型的推荐系统的最后一个阶段,集体推荐旨在对推荐项目及其布局进行最后的调整,以优化多样性和整个页面的相关性等总体目标。然而,在实践中,推荐项目之间的交互动态、它们的视觉外观和规格等元数据往往过于复杂,难以被专家的启发式方法或简单的模型捕捉。为了解决这个问题,我们提出了一种多样性感知的自校正序列推荐网络(DivNet),它能够通过捕捉序列项目之间的复杂交互来估计效用,并同时实现推荐项目的多样化。在离线和在线设置中的实验表明,与有无集体推荐的基线相比,DivNet能够取得更好的结果。 | code | 0 |
Enhancing Playback Performance in Video Recommender Systems with an On-Device Gating and Ranking Framework | Yunfei Yang, Zhenghao Qi, Honghuan Wu, Qi Song, Tieyao Zhang, Hao Li, Yimin Tu, Kaiqiao Zhan, Ben Wang | | Video recommender systems (RSs) have gained increasing attention in recent years. Existing mainstream RSs focus on optimizing the matching function between users and items. However, we noticed that users frequently encounter playback issues such as slow loading or stuttering while browsing the videos, especially in weak network conditions, which will lead to a subpar browsing experience, and may cause users to leave, even when the video content and recommendations are superior. It is quite a serious issue, yet easily overlooked. To tackle this issue, we propose an on-device Gating and Ranking Framework (GRF) that cooperates with server-side RS. Specifically, we utilize a gate model to identify videos that may have playback issues in real-time, and then we employ a ranking model to select the optimal result from a locally-cached pool to replace the stuttering videos. Our solution has been fully deployed on Kwai, a large-scale short video platform with hundreds of millions of users globally. Moreover, it significantly enhances video playback performance and improves overall user experience and retention rates. | 视频推荐系统(RSs)近年来受到了越来越多的关注。现有的主流RSs专注于优化用户与项目之间的匹配函数。然而,我们注意到用户在浏览视频时经常遇到播放问题,如加载缓慢或卡顿,尤其是在网络条件较差的情况下,这将导致浏览体验不佳,甚至可能使用户流失,即使视频内容和推荐本身是优质的。这是一个相当严重但容易被忽视的问题。为了解决这个问题,我们提出了一个设备端门控与排序框架(GRF),该框架与服务器端RS协同工作。具体来说,我们利用门控模型实时识别可能存在播放问题的视频,然后使用排序模型从本地缓存池中选择最佳结果来替换卡顿的视频。我们的解决方案已在快手这一全球拥有数亿用户的大型短视频平台上全面部署。此外,它显著提升了视频播放性能,并改善了整体用户体验和留存率。 | code | 0 |
An Enhanced Batch Query Architecture in Real-time Recommendation | Qiang Zhang, Zhipeng Teng, Disheng Wu, Jiayin Wang | | In industrial recommendation systems on websites and apps, it is essential to recall and predict top-n results relevant to user interests from a content pool of billions within milliseconds. To cope with continuous data growth and improve real-time recommendation performance, we have designed and implemented a high-performance batch query architecture for real-time recommendation systems. Our contributions include optimizing hash structures with a cacheline-aware probing method to enhance coalesced hashing, as well as the implementation of a hybrid storage key-value service built upon it. Our experiments indicate this approach significantly surpasses conventional hash tables in batch query throughput, achieving up to 90% of random memory access when incorporating parallel optimization. The support for NVMe, integrating two-tier storage for hot and cold data, notably reduces resource consumption. Additionally, the system facilitates dynamic updates, automated sharding of attributes and feature embedding tables, and introduces innovative protocols for consistency in batch queries, thereby enhancing the effectiveness of real-time incremental learning updates. This architecture has been deployed and in use in the bilibili recommendation system for over a year, a video content community with hundreds of millions of users, supporting 10x increase in model computation with minimal resource growth, improving outcomes while preserving the system's real-time performance. | 在网站和应用的工业推荐系统中,从数十亿内容库中毫秒级召回并预测与用户兴趣相关的前N个结果至关重要。为应对持续的数据增长并提升实时推荐性能,我们设计并实现了一种高性能的实时推荐系统批量查询架构。我们的贡献包括通过缓存行感知的探测方法优化哈希结构以增强聚合哈希,以及基于此构建的混合存储键值服务。实验表明,该方法在批量查询吞吐量方面显著超越传统哈希表,结合并行优化时随机内存访问可达90%。对NVMe的支持,结合冷热数据的两级存储,显著降低了资源消耗。此外,系统支持动态更新、属性和特征嵌入表的自动分片,并引入了创新的批量查询一致性协议,从而提升了实时增量学习更新的效果。该架构已在拥有数亿用户的视频内容社区bilibili推荐系统中部署并使用超过一年,支持模型计算量10倍增长的同时资源增长最小化,既提升了推荐效果又保持了系统的实时性能。 | code | 0 |
Voting with Generative AI for German Compound Splitting in E-commerce Search | Ümit Yilmaz, Kilian Merkelbach, Daniel Stein, Hasan Oezkan | eBay Inc., Aachen, Germany; eBay Inc., Dreilinden, Germany | Compound words are a grammatical structure that allows forming new words by composing existing words. For e-commerce search in German, it is essential to split these compounds into meaningful parts because item titles often use the joint form while search queries are often split. We propose a method for German compound splitting leveraging a large language model (LLM) with a voting mechanism and a hyperparameter search for automatically optimizing prompt and parameter combinations. Our evaluation of the proposed method on human-created gold standard data for e-commerce shows that it outperforms existing methods for compound splitting in this domain. | 复合词是一种语法结构,通过组合现有词汇来形成新词。在德语的电子商务搜索中,将这些复合词拆分为有意义的组成部分至关重要,因为商品标题通常使用联合形式,而搜索查询则通常是拆分后的形式。我们提出了一种利用大型语言模型(LLM)进行德语复合词拆分的方法,该方法结合了投票机制和超参数搜索,以自动优化提示和参数组合。我们对所提出的方法在人工创建的电子商务金标准数据上的评估显示,它在复合词拆分方面优于现有的方法。 | code | 0 |
AI Agent for Information Retrieval: Generating and Ranking | Yongfeng Zhang, Zhiwei Liu, Qingsong Wen, Linsey Pang, Wei Liu, Philip S. Yu | University of Technology Sydney, Sydney, NSW, Australia; Squirrel Ai Learning, Seattle, WA, USA; Rutgers University, New Brunswick, NJ, USA; Salesforce, San Francisco, CA, USA; University of Illinois at Chicago, Chicago, IL, USA; Salesforce AI Research, Palo Alto, CA, USA | The field of information retrieval has significantly transformed with the integration of AI technologies. AI agents, especially those leveraging LLMs and vast computational power, have revolutionized information retrieval, processing, and presentation. LLM agents, with advanced memory, reasoning, and planning capabilities, can perform complex tasks, engage in coherent conversations, and provide personalized responses. Despite these advancements, challenges such as ensuring relevance and accuracy, mitigating biases, providing real-time responses, and maintaining data security remain. This workshop aims to explore these challenges, share innovative solutions, and discuss future directions. It will provide a platform to bring together researchers, practitioners to discuss the latest theoretical advancements and practical implementations of AI agents in information retrieval. Topics include AI in search, recommendation, and personalization systems. By gathering a diverse group of experts, the workshop seeks to deepen the understanding of AI agents in information retrieval, advance the field, and enhance its societal impact. Participants will gain insights into cutting-edge research, emerging trends, and foster knowledge exchange and collaboration within the community. | 信息检索领域随着AI技术的融合发生了显著变革。AI代理,尤其是那些利用大型语言模型(LLMs)和强大计算能力的代理,已经彻底改变了信息检索、处理和呈现的方式。具备先进记忆、推理和规划能力的LLM代理能够执行复杂任务、进行连贯对话并提供个性化响应。尽管取得了这些进展,但仍面临确保相关性和准确性、减轻偏见、提供实时响应以及维护数据安全等挑战。本次研讨会旨在探讨这些挑战,分享创新解决方案,并讨论未来的发展方向。研讨会将提供一个平台,让研究人员和从业者能够讨论AI代理在信息检索中最新的理论进展和实际应用。主题包括AI在搜索、推荐和个人化系统中的应用。通过汇集多元化的专家群体,研讨会旨在深化对信息检索中AI代理的理解,推动该领域的发展,并增强其社会影响力。参与者将获得关于尖端研究、新兴趋势的见解,并促进社区内的知识交流与合作。 | code | 0 |
UniEmbedding: Learning Universal Multi-Modal Multi-Domain Item Embeddings via User-View Contrastive Learning | Boqi Dai, Zhaocheng Du, Jieming Zhu, Jintao Xu, Deqing Zou, Quanyu Dai, Zhenhua Dong, Rui Zhang, HaiTao Zheng | Huawei Noah's Ark Lab, Shenzhen, China; Huazhong University of Science and Technology, Shenzhen, China; Shenzhen International Graduate School, Tsinghua University, Shenzhen, China; Shenzhen International Graduate School, Tsinghua University & Pengcheng Laboratory, Shenzhen, China | Learning high-quality item embeddings is crucial for recommendation tasks such as matching and ranking. However, existing methods often rely on ID-based item embeddings learned end-to-end with downstream recommendation models, which may suffer from overfitting and limited generalizability. In this paper, we aim to learn universal item embeddings (dubbed UniEmbedding) that capture multi-modal semantics, generalize across multiple domains, and serve different downstream tasks. To achieve this goal, we introduce the UniEmbedding pretraining framework, which includes three modules: a domain-aware multi-modal adapter, a user-view projection module, and contrastive learning objectives across domains. Compared to naive ID embeddings, UniEmbedding provides rich semantic information that generalizes more effectively across domains. Unlike multi-modal embeddings directly extracted from off-the-shelf pretrained models, UniEmbedding achieves better alignment between content semantics and behaviors. We evaluated UniEmbedding on both public and industrial datasets, demonstrating its effectiveness in matching and ranking tasks. Furthermore, UniEmbedding has been deployed in multiple recommendation applications at Huawei, resulting in significant gains in user engagement metrics. | 学习高质量的物品嵌入对于匹配和排序等推荐任务至关重要。然而,现有方法通常依赖于基于ID的物品嵌入,这些嵌入与下游推荐模型端到端学习,可能会遭受过拟合和泛化能力有限的问题。本文旨在学习一种通用的物品嵌入(称为UniEmbedding),这种嵌入能够捕捉多模态语义,跨多个领域泛化,并服务于不同的下游任务。为实现这一目标,我们引入了UniEmbedding预训练框架,该框架包括三个模块:领域感知的多模态适配器、用户视角投影模块以及跨领域的对比学习目标。与简单的ID嵌入相比,UniEmbedding提供了更丰富的语义信息,能更有效地跨领域泛化。与直接从现成的预训练模型中提取的多模态嵌入不同,UniEmbedding在内容语义和行为之间实现了更好的对齐。我们在公共和工业数据集上评估了UniEmbedding,证明了其在匹配和排序任务中的有效性。此外,UniEmbedding已在华为的多个推荐应用中部署,显著提升了用户参与度指标。 | code | 0 |
Modeling User Intent Beyond Trigger: Incorporating Uncertainty for Trigger-Induced Recommendation | Jianxing Ma, Zhibo Xiao, Luwei Yang, Hansheng Xue, Xuanzhou Liu, Wen Jiang, Wei Ning, Guannan Zhang | | To cater to users' desire for an immersive browsing experience, numerous e-commerce platforms provide various recommendation scenarios, with a focus on Trigger-Induced Recommendation (TIR) tasks. However, the majority of current TIR methods heavily rely on the trigger item to understand user intent, lacking a higher-level exploration and exploitation of user intent (e.g., popular items and complementary items), which may result in an overly convergent understanding of users' short-term intent and can be detrimental to users' long-term purchasing experiences. Moreover, users' short-term intent shows uncertainty and is affected by various factors such as browsing context and historical behaviors, which poses challenges to user intent modeling. To address these challenges, we propose a novel model called Deep Uncertainty Intent Network (DUIN), comprising three essential modules: i) Explicit Intent Exploit Module extracting explicit user intent using the contrastive learning paradigm; ii) Latent Intent Explore Module exploring latent user intent by leveraging the multi-view relationships between items; iii) Intent Uncertainty Measurement Module offering a distributional estimation and capturing the uncertainty associated with user intent. Experiments on three real-world datasets demonstrate the superior performance of DUIN compared to existing baselines. Notably, DUIN has been deployed across all TIR scenarios in our e-commerce platform, with online A/B testing results conclusively validating its superiority. | 为了满足用户对沉浸式浏览体验的需求,众多电商平台提供了多种推荐场景,着重于触发式推荐(Trigger-Induced Recommendation, TIR)任务。然而,当前大多数TIR方法过于依赖触发项来理解用户意图,缺乏对用户意图的高层次探索和利用(例如,流行商品和互补商品),这可能导致对用户短期意图的理解过于集中,从而对用户的长期购买体验产生不利影响。此外,用户的短期意图表现出不确定性,并受到浏览上下文和历史行为等多种因素的影响,这对用户意图建模提出了挑战。为了应对这些挑战,我们提出了一种名为深度不确定性意图网络(Deep Uncertainty Intent Network, DUIN)的新模型,该模型包含三个核心模块:i) 显式意图利用模块,通过对比学习范式提取显式用户意图;ii) 潜在意图探索模块,利用商品之间的多视角关系来探索潜在用户意图;iii) 意图不确定性度量模块,提供分布估计并捕捉用户意图的不确定性。在三个真实世界数据集上的实验表明,DUIN相比现有基线方法表现出优越的性能。值得注意的是,DUIN已在我们电商平台的所有TIR场景中部署,在线A/B测试结果有力地验证了其优越性。 | code | 0 |
Confidence-aware Self-Semantic Distillation on Knowledge Graph Embedding | Yichen Liu, Jiawei Chen, Defang Chen, Zhehui Zhou, Yan Feng, Can Wang | | Knowledge Graph Embedding (KGE), which projects entities and relations into continuous vector spaces, has garnered significant attention. Although high-dimensional KGE methods offer better performance, they come at the expense of significant computation and memory overheads. Decreasing embedding dimensions significantly deteriorates model performance. While several recent efforts utilize knowledge distillation or non-Euclidean representation learning to augment the effectiveness of low-dimensional KGE, they either necessitate a pre-trained high-dimensional teacher model or involve complex non-Euclidean operations, thereby incurring considerable additional computational costs. To address this, this work proposes Confidence-aware Self-Knowledge Distillation (CSD) that learns from the model itself to enhance KGE in a low-dimensional space. Specifically, CSD extracts knowledge from embeddings in previous iterations, which would be utilized to supervise the learning of the model in the next iterations. Moreover, a specific semantic module is developed to filter reliable knowledge by estimating the confidence of previously learned embeddings. This straightforward strategy bypasses the need for time-consuming pre-training of teacher models and can be integrated into various KGE methods to improve their performance. Our comprehensive experiments on six KGE backbones and four datasets underscore the effectiveness of the proposed CSD. | 知识图谱嵌入(KGE)将实体和关系投影到连续的向量空间中,引起了广泛关注。尽管高维KGE方法提供了更好的性能,但它们也带来了显著的计算和内存开销。降低嵌入维度会显著降低模型性能。虽然最近的一些研究利用知识蒸馏或非欧几里得表示学习来增强低维KGE的有效性,但它们要么需要预训练的高维教师模型,要么涉及复杂的非欧几里得操作,从而产生了大量的额外计算成本。为了解决这一问题,本文提出了置信度感知的自知识蒸馏(CSD),该方法通过从模型自身学习来增强低维空间的KGE。具体而言,CSD从先前迭代中的嵌入中提取知识,这些知识将被用于监督模型在后续迭代中的学习。此外,本文还开发了一个特定的语义模块,通过估计先前学习嵌入的置信度来过滤可靠的知识。这种直接的策略避免了耗时的教师模型预训练,并且可以集成到各种KGE方法中以提高其性能。我们在六个KGE基线和四个数据集上的综合实验验证了所提出CSD的有效性。 | code | 0 |
SAQRec: Aligning Recommender Systems to User Satisfaction via Questionnaire Feedback | Kepu Zhang, Teng Shi, Sunhao Dai, Xiao Zhang, Yinfeng Li, Jing Lu, Xiaoxue Zang, Yang Song, Jun Xu | Gaoling School of Artificial Intelligence, Renmin University of China, Beijing, China; Kuaishou Technology Co., Ltd., Beijing, China | In real-world recommender systems, user engagement and subjective feedback play pivotal roles in shaping the content distribution mechanism of the platform. When platforms reach a certain scale, they often gather valuable questionnaire feedback data from users to evaluate their satisfaction with recommended items. Compared to traditional user feedback such as likes, questionnaires explicitly capture both satisfaction and dissatisfaction and are unaffected by other users' questionnaires, thus better expressing users' true preferences. In this paper, we aim to leverage the questionnaire feedback to align the recommendation model with users' true preferences. However, due to the platform distribution mechanism and divergent user attitudes toward questionnaires, the questionnaire feedback data frequently becomes sparse and exhibits selection biases, resulting in challenges in feature integration and training process. To address these issues, we introduce a novel user Satisfaction Alignment framework that effectively leverages Questionnaire feedback to enhance Recommendation, named SAQRec. SAQRec begins by training an unbiased satisfaction model to impute satisfaction, addressing selection bias and data sparsity. Then, SAQRec aligns features with users' true preferences by disentangling satisfaction and dissatisfaction from click history and categorizing clicked items into multiple satisfaction levels through the imputed satisfactions. Additionally, the imputed satisfactions from the pre-trained unbiased satisfaction model serve as pseudo-labels to align the model's outputs with users' true preferences. Extensive experiments on both public and commercial datasets demonstrate SAQRec's superior integration of questionnaire feedback in recommendation models. Online A/B testing on a short video platform confirms its effectiveness in boosting user watch time and positive-to-negative feedback ratio, enhancing overall performance and user satisfaction. | 在实际的推荐系统中,用户参与度和主观反馈在塑造平台内容分发机制方面起着关键作用。当平台达到一定规模时,通常会收集用户对推荐项目的满意度问卷反馈数据,以评估用户的满意度。与传统的用户反馈(如点赞)相比,问卷能够明确捕捉用户的满意和不满意情况,并且不受其他用户问卷的影响,因此更能表达用户的真实偏好。本文旨在利用问卷反馈来使推荐模型与用户的真实偏好相一致。然而,由于平台分发机制和用户对问卷的不同态度,问卷反馈数据往往变得稀疏并存在选择偏差,导致特征整合和训练过程面临挑战。为解决这些问题,我们提出了一种新的用户满意度对齐框架,该框架有效利用问卷反馈来增强推荐,命名为SAQRec。SAQRec首先训练一个无偏的满意度模型来填补满意度,解决选择偏差和数据稀疏问题。然后,SAQRec通过对点击历史进行解耦,将满意和不满意分离,并通过填补的满意度将点击项目分类为多个满意度级别,从而使特征与用户的真实偏好对齐。此外,预训练的无偏满意度模型产生的填补满意度作为伪标签,用于使模型的输出与用户的真实偏好对齐。在公共和商业数据集上的广泛实验表明,SAQRec在推荐模型中对问卷反馈的整合具有优越性。在一个短视频平台上的在线A/B测试证实了其在提升用户观看时间和正面反馈与负面反馈比例方面的有效性,从而提高了整体性能和用户满意度。 | code | 0 |
CXSimulator: A User Behavior Simulation using LLM Embeddings for Web-Marketing Campaign Assessment | Akira Kasuga, Ryo Yonetani | | This paper presents the Customer Experience (CX) Simulator, a novel framework designed to assess the effects of untested web-marketing campaigns through user behavior simulations. The proposed framework leverages large language models (LLMs) to represent various events in a user's behavioral history, such as viewing an item, applying a coupon, or purchasing an item, as semantic embedding vectors. We train a model to predict transitions between events from their LLM embeddings, which can even generalize to unseen events by learning from diverse training data. In web-marketing applications, we leverage this transition prediction model to simulate how users might react differently when new campaigns or products are presented to them. This allows us to eliminate the need for costly online testing and enhance the marketers' abilities to reveal insights. Our numerical evaluation and user study, utilizing BigQuery Public Datasets from the Google Merchandise Store, demonstrate the effectiveness of our framework. | 本文介绍了客户体验(CX)模拟器,这是一个新颖的框架,旨在通过用户行为模拟来评估未经测试的网络营销活动的效果。该框架利用大型语言模型(LLMs)将用户行为历史中的各种事件,如查看商品、使用优惠券或购买商品,表示为语义嵌入向量。我们训练了一个模型,从这些LLM嵌入中预测事件之间的转换,该模型甚至可以通过从多样化的训练数据中学习来泛化到未见过的事件。在网络营销应用中,我们利用这种转换预测模型来模拟用户在新活动或新产品呈现给他们时可能产生的不同反应。这使我们能够消除昂贵的在线测试需求,并增强营销人员揭示洞察的能力。我们的数值评估和用户研究,利用了Google商品商店的BigQuery公共数据集,证明了我们框架的有效性。 | code | 0 |
Exploring High-Order User Preference with Knowledge Graph for Recommendation | Caijun Xu, Fuwei Zhang, Zhao Zhang, Fuzhen Zhuang, Rui Liu | Institute of Artificial Intelligence, Beihang University & Zhongguancun Laboratory, Beijing, China; School of Computer Science, Beihang University, Beijing, China; Institute of Artificial Intelligence, Beihang University, Beijing, China; Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China | Knowledge Graph (KG) has proven its effectiveness in recommendation systems. Recent knowledge-aware recommendation methods, which utilize graph neural networks and contrastive learning, underestimate two issues: 1) The neglect of modeling the latent relationships between users and entities; 2) The insufficiency of traditional cross-view contrastive learning whose domain is incapable of covering all nodes in a graph. To address these issues, we propose a novel model named Knowledge-aware User Preference Network (KUPN). Specifically, KUPN first constructs the relational preference view containing a new graph named User Preference Graph (UPG) to model the potential relationships between users and entities. Then, we adopt a novel attentive information aggregation to learn the UPG. In addition, we obtain semantic information of users and entities from collaborative knowledge view which consists of KG and Interaction Graph (IG) as supplementary. Finally, we apply a cross-view contrastive learning for complete domains between dynamic relational preference view and collaborative knowledge view. Extensive experiments on three real-world datasets demonstrate the superiority of KUPN against the state-of-the-art methods. | 知识图谱(KG)在推荐系统中已证明其有效性。近年来,利用图神经网络和对比学习的知识感知推荐方法低估了两个问题:1)忽视了用户与实体之间潜在关系的建模;2)传统跨视图对比学习的领域不足以覆盖图中的所有节点。为解决这些问题,我们提出了一种名为知识感知用户偏好网络(KUPN)的新模型。具体而言,KUPN首先构建了包含用户偏好图(UPG)的关系偏好视图,以建模用户与实体之间的潜在关系。接着,我们采用了一种新颖的注意力信息聚合方法来学习UPG。此外,我们从协同知识视图中获取用户和实体的语义信息,该视图由KG和交互图(IG)组成,作为补充。最后,我们在动态关系偏好视图和协同知识视图之间应用了跨视图对比学习,以实现完整领域的覆盖。在三个真实世界数据集上的广泛实验表明,KUPN相较于最先进的方法具有优越性。 | code | 0 |
EXIT: An EXplicit Interest Transfer Framework for Cross-Domain Recommendation | Lei Huang, Weitao Li, Chenrui Zhang, Jinpeng Wang, Xianchun Yi, Sheng Chen | | Cross-domain recommendation has attracted substantial interest in industrial apps such as Meituan, which serves multiple business domains via knowledge transfer and meets the diverse interests of users. However, existing methods typically follow an implicit modeling paradigm that blends the knowledge from both the source and target domains, and design intricate network structures to share learned embeddings or patterns between domains to improve recommendation accuracy. Since the transfer of interest signals is unsupervised, these implicit paradigms often struggle with the negative transfer resulting from differences in service functions and presentation forms across different domains. In this paper, we propose a simple and effective EXplicit Interest Transfer framework named EXIT to address the stated challenge. Specifically, we propose a novel label combination approach that enables the model to directly learn beneficial source domain interests through supervised learning, while excluding inappropriate interest signals. Moreover, we introduce a scene selector network to model the interest transfer intensity under fine-grained scenes. Offline experiments conducted on the industrial production dataset and online A/B tests validate the superiority and effectiveness of our proposed framework. Without complex network structures or training processes, EXIT can be easily deployed in the industrial recommendation system. EXIT has been successfully deployed in the online homepage recommendation system of Meituan App, serving the main traffic. | 跨领域推荐在美团等工业应用中引起了广泛关注,通过知识转移服务于多个业务领域,满足用户的多样化兴趣。然而,现有方法通常采用隐式建模范式,将源域和目标域的知识混合,设计复杂的网络结构以在域间共享学习到的嵌入或模式,从而提高推荐准确性。由于兴趣信号的转移是无监督的,这些隐式范式往往因不同域间服务功能和呈现形式的差异而遭遇负迁移问题。本文提出了一种简单而有效的显式兴趣转移框架,名为EXIT,以应对上述挑战。具体而言,我们提出了一种新颖的标签组合方法,使模型能够通过监督学习直接学习有益的源域兴趣,同时排除不适当的兴趣信号。此外,我们引入了一个场景选择器网络,以在细粒度场景下建模兴趣转移强度。在工业生产数据集上的离线实验和在线A/B测试验证了我们提出的框架的优越性和有效性。EXIT无需复杂的网络结构或训练过程,可以轻松部署在工业推荐系统中。EXIT已成功部署在美团App的在线首页推荐系统中,服务于主要流量。 | code | 0 |
To Explore or Exploit? A Gradient-informed Framework to Address the Feedback Loop for Graph based Recommendation | Zhigang Huangfu, Binbin Hu, Zhengwei Wu, Fengyu Han, GongDuo Zhang, Lihong Gu, Zhiqiang Zhang | Ant Group, Hangzhou, China; Ant Group, Hang Zhou, China | Graph-based Recommendation Systems (GRSs) have gained prominence for their ability to enhance the accuracy and effectiveness of recommender systems by exploiting structural relationships in user-item interaction data. Despite their advanced capabilities, we find GRSs are susceptible to feedback-loop phenomena that disproportionately diminish the visibility of new and long-tail items, leading to a homogenization of recommendations and the potential emergence of echo chambers. To mitigate this feedback-loop issue, exploration and exploitation (E&E) strategies have been extensively researched. However, conventional E&E methods rest on the assumption that recommendations are independent and identically distributed, an assumption that is not valid for GRSs. To forge an effective E&E approach tailored to GRSs, we introduce a novel framework, the GRADient-informed Exploration and Exploitation (GRADE), designed to adaptively seek out underrepresented or new items with promising rewards. Our method evaluates the potential benefit of exploring an item by assessing the change in the system's empirical risk error pre- and post-exposure. For practical implementation, we approximate this measure using the gradients of potential edges and model parameters, alongside their associated uncertainties. We then orchestrate the balance between exploration and exploitation utilizing Thompson sampling and the Upper Confidence Bound (UCB) strategy. Empirical tests on datasets from two industrial environments demonstrate that GRADE consistently outperforms existing state-of-the-art methods. Additionally, our approach has been successfully integrated into actual industrial systems. | 基于图的推荐系统(GRSs)因其能够通过利用用户-项目交互数据中的结构关系来提高推荐系统的准确性和有效性而备受关注。尽管其功能强大,我们发现GRSs易受反馈循环现象的影响,这种现象不均衡地降低了新项目和长尾项目的可见性,导致推荐内容的同质化,并可能催生回音壁效应。为缓解这一反馈循环问题,探索与利用(E&E)策略得到了广泛研究。然而,传统的E&E方法基于推荐是独立同分布的假设,这一假设对于GRSs并不成立。为了针对GRSs开发一种有效的E&E方法,我们引入了一个新的框架——基于梯度的探索与利用(GRADE),该框架旨在自适应地发掘具有潜在回报的未充分代表或新项目。我们的方法通过评估系统在项目曝光前后的经验风险误差变化,来评估探索某一项目的潜在收益。在实际应用中,我们利用潜在边的梯度和模型参数及其相关的不确定性来近似这一度量。随后,我们利用汤普森采样和上置信界(UCB)策略来协调探索与利用之间的平衡。在两个工业环境数据集上的实证测试表明,GRADE始终优于现有的最先进方法。此外,我们的方法已成功集成到实际的工业系统中。 | code | 0 |
Sequential Optimum Test with Multi-armed Bandits for Online Experimentation | Fang Kong, Penglei Zhao, Shichao Han, Yong Wang, Shuai Li | Shanghai Jiao Tong University, Shanghai, China; Tencent Inc., Shenzhen, China | In large-scale online experimentation platforms, experimenters aim to discover the best treatment (arm) among multiple candidates. Traditional A/B testing and multi-armed bandits (MAB) algorithms are two popular designs. The former usually achieves a higher power but may hurt the customers' satisfaction when always recommending a poor arm, while the latter aims at improving the customers' experience (collecting more rewards) but faces the loss of testing power. Recently, [26] combine the advantage of A/B testing and MAB algorithms to maximize the testing power while maintaining more rewards for experiments with two-arm and Bernoulli rewards. However, in practice, the number of arms is usually larger than two and the reward type also varies. In multi-arm experiments, the required sample size to find the optimal arm blows up to guarantee a false discovery rate with the increase of arm numbers, bringing high opportunity costs to experimenters. To save the cost during the long experimental process, we propose a more efficient sequential test framework named Soptima that can work with general reward types. Inspired by the design of traditional MAB algorithms in chasing rewards and A/B testing in maximizing power, we propose an Elimination-type strategy adapted to this framework to dynamically adjust the traffic split on arms. This strategy cooperating with Soptima simultaneously maintains the advantage of the A/B testing in maximizing the testing power, the sequential test methods in saving the sample size, and the MAB algorithms in collecting rewards. The theoretical analysis gives guarantees on the Type-I, Type-II, and optimality error rates of the proposed approach. A series of experiments from both simulation and industrial historical data sets are conducted to verify the superiority of our approach compared with available baselines. | 在大规模在线实验平台中,实验者的目标是从多个候选方案(臂)中找出最佳方案。传统的A/B测试和多臂老虎机(MAB)算法是两种流行的设计方案。前者通常具有更高的测试效能,但当总是推荐效果不佳的方案时,可能会损害客户的满意度;而后者旨在提升客户体验(收集更多奖励),但面临测试效能的损失。最近,[26]结合了A/B测试和MAB算法的优势,以最大化测试效能,同时在两臂和伯努利奖励的实验中保持更多的奖励。然而,在实践中,臂的数量通常大于两个,且奖励类型也多种多样。在多臂实验中,随着臂数量的增加,为了保证错误发现率,找到最佳臂所需的样本量会急剧增加,这给实验者带来了高昂的机会成本。为了在漫长的实验过程中节省成本,我们提出了一种名为Soptima的高效序列测试框架,该框架适用于一般的奖励类型。受传统MAB算法在追逐奖励和A/B测试在最大化效能的设计启发,我们提出了一种适应此框架的淘汰型策略,以动态调整对各臂的流量分配。这种策略与Soptima合作,同时保持了A/B测试在最大化测试效能、序列测试方法在节省样本量以及MAB算法在收集奖励方面的优势。理论分析为所提出方法的I类错误率、II类错误率和最优性错误率提供了保障。通过一系列来自模拟和工业历史数据集的实验,验证了我们的方法相对于现有基线的优越性。 | code | 0 |
TWIN V2: Scaling Ultra-Long User Behavior Sequence Modeling for Enhanced CTR Prediction at Kuaishou | Zihua Si, Lin Guan, Zhongxiang Sun, Xiaoxue Zang, Jing Lu, Yiqun Hui, Xingchao Cao, Zeyu Yang, Yichen Zheng, Dewei Leng, Kai Zheng, Chenbin Zhang, Yanan Niu, Yang Song, Kun Gai | | The significance of modeling long-term user interests for CTR prediction tasks in large-scale recommendation systems is progressively gaining attention among researchers and practitioners. Existing work, such as SIM and TWIN, typically employs a two-stage approach to model long-term user behavior sequences for efficiency concerns. The first stage rapidly retrieves a subset of sequences related to the target item from a long sequence using a search-based mechanism namely the General Search Unit (GSU), while the second stage calculates the interest scores using the Exact Search Unit (ESU) on the retrieved results. Given the extensive length of user behavior sequences spanning the entire life cycle, potentially reaching up to 10^6 in scale, there is currently no effective solution for fully modeling such expansive user interests. To overcome this issue, we introduced TWIN-V2, an enhancement of TWIN, where a divide-and-conquer approach is applied to compress life-cycle behaviors and uncover more accurate and diverse user interests. Specifically, a hierarchical clustering method groups items with similar characteristics in life-cycle behaviors into a single cluster during the offline phase. By limiting the size of clusters, we can compress behavior sequences well beyond the magnitude of 10^5 to a length manageable for online inference in GSU retrieval. Cluster-aware target attention extracts comprehensive and multi-faceted long-term interests of users, thereby making the final recommendation results more accurate and diverse. Extensive offline experiments on a multi-billion-scale industrial dataset and online A/B tests have demonstrated the effectiveness of TWIN-V2. Under an efficient deployment framework, TWIN-V2 has been successfully deployed to the primary traffic that serves hundreds of millions of daily active users at Kuaishou. | 在大规模推荐系统中,建模长期用户兴趣对点击率(CTR)预测任务的重要性正逐渐受到研究者和从业者的关注。现有的研究工作,如SIM和TWIN,通常采用两阶段方法来高效地建模长期用户行为序列。第一阶段通过基于搜索的机制,即通用搜索单元(GSU),从长序列中快速检索与目标项目相关的子集序列;第二阶段则使用精确搜索单元(ESU)对检索结果计算兴趣分数。鉴于用户行为序列的广泛长度可能跨越整个生命周期,规模可达10^6,目前尚无有效解决方案来全面建模如此广泛的用户兴趣。为解决这一问题,我们引入了TWIN-V2,即TWIN的增强版本,采用分而治之的方法来压缩生命周期行为并揭示更准确和多样的用户兴趣。具体而言,层次聚类方法在离线阶段将具有相似特征的生命周期行为项目分组为一个集群。通过限制集群大小,我们可以将行为序列压缩到远超10^5的规模,使其适合在线推理中的GSU检索。集群感知的目标注意力机制提取了用户全面且多方面的长期兴趣,从而使最终的推荐结果更加准确和多样化。在多十亿规模工业数据集上的广泛离线实验和在线A/B测试证明了TWIN-V2的有效性。在高效部署框架下,TWIN-V2已成功部署到快手的主要流量中,服务数亿日活用户。 | code | 0 |
Understanding the User: An Intent-Based Ranking Dataset | Abhijit Anand, Jurek Leonhardt, Venktesh V, Avishek Anand | As information retrieval systems continue to evolve, accurate evaluation and benchmarking of these systems become pivotal. Web search datasets, such as MS MARCO, primarily provide short keyword queries without accompanying intent or descriptions, posing a challenge in comprehending the underlying information need. This paper proposes an approach to augmenting such datasets to annotate informative query descriptions, with a focus on two prominent benchmark datasets: TREC-DL-21 and TREC-DL-22. Our methodology involves utilizing state-of-the-art LLMs to analyze and comprehend the implicit intent within individual queries from benchmark datasets. By extracting key semantic elements, we construct detailed and contextually rich descriptions for these queries. To validate the generated query descriptions, we employ crowdsourcing as a reliable means of obtaining diverse human perspectives on the accuracy and informativeness of the descriptions. This information can be used as an evaluation set for tasks such as ranking, query rewriting, or others. | 随着信息检索系统不断演进,对其进行准确的评估和基准测试变得至关重要。诸如MS MARCO等网络搜索数据集主要提供简短的关键词查询,缺乏伴随的意图或描述,这给理解背后的信息需求带来了挑战。本文提出了一种增强此类数据集的方法,旨在为查询添加信息丰富的描述,重点关注两个著名的基准数据集:TREC-DL-21和TREC-DL-22。我们的方法涉及利用最先进的LLMs(大型语言模型)来分析和理解基准数据集中各个查询的隐含意图。通过提取关键的语义元素,我们为这些查询构建了详细且上下文丰富的描述。为了验证生成的查询描述,我们采用众包作为获取多样化人类视角的可靠手段,以评估描述的准确性和信息量。这些信息可以作为排序、查询重写等任务的评估集使用。 | code | 0 | |
Domain Alignment with Large Vision-language Models for Cross-domain Remote Sensing Image Retrieval | Yan Chen, Guocan Cai, Fufang Li, Yangtao Wang, Xin Tan, Xiaocui Li | School of Computer Science and Cyber Engineering, Guangzhou University, Guangzhou, China; East China Normal University, Shanghai, China; Hunan University of Technology and Business, Changsha, China | Cross-domain remote sensing image retrieval has been a hotspot in the past few years. Most of the existing methods focus on combining semantic learning with domain adaptation on well-labeled source domain and unlabeled target domain. However, they face two serious challenges. (1) They cannot deal with practical scenarios where the source domain lacks sufficient label supervision. (2) They suffer from severe performance degradation when the data distribution between the source domain and target domain becomes highly inconsistent. To address these challenges, we propose Domain Alignment with Large Vision-language models for cross-domain remote sensing image retrieval (termed as DALV). First, we design a dual-modality prototype guided pseudo-labeling mechanism, which leverages the pre-trained large vision-language model (i.e., CLIP) to assign pseudo-labels for all unlabeled source domain images and target domain images. Second, we compute the confidence scores for these pseudo-labels to distinguish their reliability. Next, we devise a loss reweighting strategy, which incorporates the confidence scores as weight values into the contrastive loss to mitigate the impact of noisy pseudo-labels. Finally, the low-rank adaptation fine-tuning means is adapted to update our model and achieve domain alignment to obtain class discriminative features. Extensive experiments on 12 cross-domain remote sensing image retrieval tasks show that our proposed DALV outperforms the state-of-the-art approaches. The source code is available at https://github.com/ptyy01/DALV. | 跨领域遥感图像检索近年来成为研究热点。现有方法大多集中在结合语义学习和领域适应于标签丰富的源域和无标签的目标域。然而,这些方法面临两个严重挑战:(1)无法处理源域缺乏足够标签监督的实际场景;(2)当源域和目标域的数据分布高度不一致时,性能严重下降。为应对这些挑战,我们提出了基于大规模视觉语言模型的跨领域遥感图像检索的领域对齐方法(简称DALV)。首先,我们设计了一种双模态原型引导的伪标签机制,利用预训练的大规模视觉语言模型(如CLIP)为所有无标签的源域图像和目标域图像分配伪标签。其次,我们计算这些伪标签的置信度分数以区分其可靠性。接着,我们设计了一种损失重加权策略,将置信度分数作为权重值融入对比损失,以减轻噪声伪标签的影响。最后,采用低秩适应微调方法更新模型,实现领域对齐,获取具有类别区分性的特征。在12个跨领域遥感图像检索任务上的广泛实验表明,我们提出的DALV方法优于现有最先进的方法。源代码可在https://github.com/ptyy01/DALV获取。 | code | 0 |
DIIT: A Domain-Invariant Information Transfer Method for Industrial Cross-Domain Recommendation | Heyuan Huang, Xingyu Lou, Chaochao Chen, Pengxiang Cheng, Yue Xin, Chengwei He, Xiang Liu, Jun Wang | Cross-Domain Recommendation (CDR) have received widespread attention due to their ability to utilize rich information across domains. However, most existing CDR methods assume an ideal static condition that is not practical in industrial recommendation systems (RS). Therefore, simply applying existing CDR methods in the industrial RS environment may lead to low effectiveness and efficiency. To fill this gap, we propose DIIT, an end-to-end Domain-Invariant Information Transfer method for industrial cross-domain recommendation. Specifically, We first simulate the industrial RS environment that maintains respective models in multiple domains, each of them is trained in the incremental mode. Then, for improving the effectiveness, we design two extractors to fully extract domain-invariant information from the latest source domain models at the domain level and the representation level respectively. Finally, for improving the efficiency, we design a migrator to transfer the extracted information to the latest target domain model, which only need the target domain model for inference. Experiments conducted on one production dataset and two public datasets verify the effectiveness and efficiency of DIIT. | 跨域推荐(CDR)因其能够利用跨领域的丰富信息而受到广泛关注。然而,大多数现有的CDR方法假设了一个理想的静态条件,这在工业推荐系统(RS)中并不实际。因此,简单地将现有的CDR方法应用于工业RS环境中可能导致效果和效率低下。为了填补这一空白,我们提出了DIIT,一种用于工业跨域推荐的端到端领域不变信息传递方法。具体而言,我们首先模拟了工业RS环境,该环境在多个领域中维护各自的模型,每个模型都以增量模式进行训练。然后,为了提高效果,我们设计了两个提取器,分别从领域级别和表示级别充分提取最新源域模型中的领域不变信息。最后,为了提高效率,我们设计了一个迁移器,将提取的信息传递到最新的目标域模型,该模型仅需要进行目标域模型的推理。在一个生产数据集和两个公共数据集上进行的实验验证了DIIT的有效性和效率。 | code | 0 | |
The Devil is in the Sources! Knowledge Enhanced Cross-Domain Recommendation in an Information Bottleneck Perspective | Binbin Hu, Weifan Wang, Shuhan Wang, Ziqi Liu, Bin Shen, Yong He, Jiawei Chen | Cross-domain Recommendation (CDR) aims to alleviate the data sparsity and the cold-start problems in traditional recommender systems by leveraging knowledge from an informative source domain. However, previously proposed CDR models pursue an imprudent assumption that the entire information from the source domain is equally contributed to the target domain, neglecting the evil part that is completely irrelevant to users' intrinsic interest. To address this concern, in this paper, we propose a novel knowledge enhanced cross-domain recommendation framework named CoTrans, which remolds the core procedures of CDR models with: Compression on the knowledge from the source domain and Transfer of the purity to the target domain. Specifically, following the theory of Graph Information Bottleneck, CoTrans first compresses the source behaviors with the perception of information from the target domain. Then to preserve all the important information for the CDR task, the feedback signals from both domains are utilized to promote the effectiveness of the transfer procedure. Additionally, a knowledge-enhanced encoder is employed to narrow gaps caused by the non-overlapped items across separate domains. Comprehensive experiments on three widely used cross-domain datasets demonstrate that CoTrans significantly outperforms both single-domain and state-of-the-art cross-domain recommendation approaches. | 跨域推荐(CDR)旨在通过利用信息丰富的源域知识,缓解传统推荐系统中的数据稀疏性和冷启动问题。然而,先前提出的CDR模型追求一个不谨慎的假设,即源域的全部信息对目标域的贡献是均等的,忽视了与用户内在兴趣完全无关的有害部分。为了解决这一问题,本文提出了一种名为CoTrans的新型知识增强跨域推荐框架,该框架通过以下核心步骤重构了CDR模型的流程:源域知识的压缩和目标域纯净信息的转移。具体而言,遵循图信息瓶颈理论,CoTrans首先根据目标域的信息感知压缩源域行为。然后,为了保留CDR任务的所有重要信息,利用来自两个域的反馈信号来提升转移过程的有效性。此外,采用知识增强编码器来缩小各域之间非重叠项目造成的差距。在三个广泛使用的跨域数据集上的综合实验表明,CoTrans显著优于单一域和最先进的跨域推荐方法。 | code | 0 | |
MuLe: Multi-Grained Graph Learning for Multi-Behavior Recommendation | Seunghan Lee, Geonwoo Ko, HyunJe Song, Jinhong Jung | Dept. of CSAI, Jeonbuk Nat'l Univ., Jeonju, Republic of Korea; School of Software, Soongsil University, Seoul, Republic of Korea | Multi-behavior recommender systems, rapidly advancing across various domains, utilize plentiful auxiliary interactions on a variety of user behaviors to enhance recommendations for the target behavior, such as purchases. While previous methods have made strides in leveraging such interactions with advanced machine learning methods, they still face challenges in adequately using multi-faceted relationships among behaviors and handling uncertain auxiliary interactions that could potentially lead to purchases or not. In this paper, we propose MuLe (Multi-Grained Graph Learning), a novel graph-based model designed to address these limitations. We design a multi-grained graph learning strategy to capture diverse aspects of behaviors, ranging from unified to specific, and then to target-related behavior interactions. To handle uncertain interactions, we use graph attention, weighting the importance of those interactions related to the target behavior. Afterward, we use an attention mechanism to effectively aggregate diverse behavior embeddings obtained from the multi-grained graph encoders. Extensive experiments show that MuLe significantly outperforms the state-of-the-art methods, achieving improvements of up to 44.6% in HR@10 and 52.9% in NDCG@10, respectively. Our code and datasets are available at https://github.com/geonwooko/MULE. | 多行为推荐系统在各个领域迅速发展,利用丰富的用户行为辅助交互来增强对目标行为(如购买)的推荐。尽管先前的方法通过先进的机器学习方法在这一领域取得了进展,但它们在充分使用行为之间的多方面关系以及处理可能导致或不导致购买的模糊辅助交互方面仍面临挑战。本文提出了MuLe(多粒度图学习),这是一种新颖的基于图的模型,旨在解决这些局限性。我们设计了一种多粒度图学习策略,以捕捉从统一到具体再到与目标相关的行为交互的多样性。为了处理不确定的交互,我们使用图注意力机制,对与目标行为相关的交互进行重要性加权。随后,我们采用注意力机制,有效地聚合从多粒度图编码器获得的各种行为嵌入。广泛的实验表明,MuLe显著优于最先进的方法,HR@10和NDCG@10分别提高了44.6%和52.9%。我们的代码和数据集可在https://github.com/geonwooko/MULE获取。 | code | 0 |
Inferring Visualization Intent from Conversation | Haotian Li, Nithin Chalapathi, Huamin Qu, Alvin Cheung, Aditya G. Parameswaran | UC Berkeley, Berkeley, CA, USA; HKUST, Hong Kong, China | During visual data analysis, users often explore visualizations one at a time, with each visualization leading to new directions of exploration. We consider a conversational approach to visualization, where users specify their needs at each step in natural language, with a visualization being returned in turn. Prior work has shown that visualization generation can be boiled down to the identification of visualization intent and visual encodings. Recognizing that the latter is a well-studied problem with standard solutions, we focus on the former, i.e., identifying visualization intent during conversation. We develop Luna, a framework that comprises a novel combination of language models adapted from BERT and rule-based inference, that together predict various aspects of visualization intent. We compare Luna with other conversational NL-to-visualization and NL-to-SQL approaches (adapted to visualization intent), including GPT-3.5 and GPT-4, and demonstrate that Luna has 14.3% higher accuracy than the state-of-the-art. We also apply Luna to a usage scenario on a dataset of police misconduct, showcasing its benefits relative to other approaches. | 在视觉数据分析过程中,用户通常一次只探索一个可视化图表,每个图表都引导出新的探索方向。我们考虑了一种对话式的可视化方法,用户在每一步以自然语言指定其需求,并依次返回一个可视化图表。先前的研究表明,可视化生成可以简化为可视化意图和视觉编码的识别。鉴于后者是一个已有标准解决方案的成熟问题,我们将重点放在前者,即在对话过程中识别可视化意图。我们开发了Luna框架,该框架结合了从BERT改编的语言模型和基于规则的推理,共同预测可视化意图的各个方面。我们将Luna与其他对话式自然语言到可视化和自然语言到SQL的方法(适配于可视化意图)进行了比较,包括GPT-3.5和GPT-4,并展示了Luna比现有技术高出14.3%的准确性。我们还应用Luna到一个关于警察不当行为的实际数据集场景中,展示了其相对于其他方法的优势。 | code | 0 |
GUME: Graphs and User Modalities Enhancement for Long-Tail Multimodal Recommendation | Guojiao Lin, Zhen Meng, Dongjie Wang, Qingqing Long, Yuanchun Zhou, Meng Xiao | Multimodal recommendation systems (MMRS) have received considerable attention from the research community due to their ability to jointly utilize information from user behavior and product images and text. Previous research has two main issues. First, many long-tail items in recommendation systems have limited interaction data, making it difficult to learn comprehensive and informative representations. However, past MMRS studies have overlooked this issue. Secondly, users' modality preferences are crucial to their behavior. However, previous research has primarily focused on learning item modality representations, while user modality representations have remained relatively simplistic. To address these challenges, we propose a novel Graphs and User Modalities Enhancement (GUME) for long-tail multimodal recommendation. Specifically, we first enhance the user-item graph using multimodal similarity between items. This improves the connectivity of long-tail items and helps them learn high-quality representations through graph propagation. Then, we construct two types of user modalities: explicit interaction features and extended interest features. By using the user modality enhancement strategy to maximize mutual information between these two features, we improve the generalization ability of user modality representations. Additionally, we design an alignment strategy for modality data to remove noise from both internal and external perspectives. Extensive experiments on four publicly available datasets demonstrate the effectiveness of our approach. | 多模态推荐系统(MMRS)因其能够联合利用用户行为、产品图像和文本信息而受到研究界的广泛关注。以往的研究存在两个主要问题。首先,推荐系统中许多长尾项目(long-tail items)的交互数据有限,这使得学习全面且信息丰富的表示变得困难。然而,过去的MMRS研究忽视了这一问题。其次,用户的模态偏好对其行为至关重要。然而,以往的研究主要集中在学习项目模态表示上,而用户模态表示则相对简单。为了解决这些挑战,我们提出了一种新的针对长尾多模态推荐的图与用户模态增强(GUME)方法。具体来说,我们首先通过项目间的多模态相似性来增强用户-项目图,这提高了长尾项目的连通性,并帮助它们通过图传播学习高质量的表示。然后,我们构建了两种用户模态:显式交互特征和扩展兴趣特征。通过使用用户模态增强策略来最大化这两种特征之间的互信息,我们提高了用户模态表示的泛化能力。此外,我们还设计了一种模态数据对齐策略,以从内部和外部角度去除噪声。在四个公开可用的数据集上进行的广泛实验证明了我们方法的有效性。 | code | 0 | |
Multi-Behavior Generative Recommendation | Zihan Liu, Yupeng Hou, Julian J. McAuley | University of California, San Diego, San Diego, CA, USA | Multi-behavior sequential recommendation (MBSR) aims to incorporate behavior types of interactions for better recommendations. Existing approaches focus on the next-item prediction objective, neglecting the value of integrating the target behavior type into the learning objective. In this paper, we propose MBGen, a novel Multi-Behavior sequential Generative recommendation framework. We formulate the MBSR task into a consecutive two-step process: (1) given item sequences, MBGen first predicts the next behavior type to frame the user intention, (2) given item sequences and a target behavior type, MBGen then predicts the next items. To model such a two-step process, we tokenize both behaviors and items into tokens and construct one single token sequence with both behaviors and items placed interleaved. Furthermore, MBGen learns to autoregressively generate the next behavior and item tokens in a unified generative recommendation paradigm, naturally enabling a multi-task capability. Additionally, we exploit the heterogeneous nature of token sequences in the generative recommendation and propose a position-routed sparse architecture to efficiently and effectively scale up models. Extensive experiments on public datasets demonstrate that MBGen significantly outperforms existing MBSR models across multiple tasks. | 多行为序列推荐(MBSR)旨在整合交互行为类型以实现更佳的推荐效果。现有方法主要聚焦于下一项预测目标,忽略了将目标行为类型整合到学习目标中的价值。本文提出了一种名为MBGen的新型多行为序列生成推荐框架。我们将MBSR任务构建成一个连续的两步过程:(1)在给定项目序列的情况下,MBGen首先预测下一行为类型以构建用户意图;(2)在给定项目序列和目标行为类型的基础上,MBGen随后预测下一项目。为模拟这一两步过程,我们将行为和项目都标记化为令牌,并构建一个包含交错放置的行为和项目的单一令牌序列。此外,MBGen在统一的生成推荐范式中自回归地生成下一行为和项目令牌,自然地实现了多任务能力。我们还利用生成推荐中令牌序列的异构性,提出了一种位置路由稀疏架构,以高效且有效地扩展模型规模。在公共数据集上的广泛实验表明,MBGen在多个任务中显著优于现有的MBSR模型。 | code | 0 |
Veracity Estimation for Entity-Oriented Search with Knowledge Graphs | Stefano Marchesin, Gianmaria Silvello, Omar Alonso | University of Padua, Padua, Italy; Amazon, Palo Alto, California, USA | In this paper, we discuss the potential costs that emerge from using a Knowledge Graph (KG) in entity-oriented search without considering its data veracity. We argue for the need for KG veracity analysis to gain insights and propose a scalable assessment framework. Previous assessments focused on relevance, assuming correct KGs, and overlooking the potential risks of misinformation. Our approach strategically allocates annotation resources, optimizing utility and revealing the significant impact of veracity on entity search and card generation. Contributions include a fresh perspective on entity-oriented search extending beyond the conventional focus on relevance, a scalable assessment framework, exploratory experiments highlighting the impact of veracity on ranking and user experience, as well as outlining associated challenges and opportunities. | 本文探讨了在面向实体的搜索中使用知识图谱(KG)而不考虑其数据真实性所可能产生的潜在成本。我们主张进行知识图谱真实性分析以获取洞察,并提出一个可扩展的评估框架。以往的评估主要集中在相关性上,假设知识图谱是正确的,而忽视了错误信息可能带来的潜在风险。我们的方法策略性地分配标注资源,优化效用并揭示真实性对实体搜索和卡片生成的重要影响。主要贡献包括:对面向实体的搜索提出了超越传统相关性关注的新视角,一个可扩展的评估框架,探索性实验突显了真实性对排名和用户体验的影响,以及概述了相关的挑战和机遇。 | code | 0 |
Inductive Knowledge Graph Embedding via Exploring Interaction Patterns of Relations | Chong Mu, Lizong Zhang, Jinchuan Zhang, Qian Huang, Zhiguo Wang | Recent research in inductive reasoning has focused on predicting missing links between entities that are not observed during training. However, most approaches usually require that the relations are known at the inference time. In the real world, new entities and new relations usually emerge concurrently, which greatly challenges the model's generalization ability. In this paper, we propose a novel inductive knowledge graph embedding model that effectively handles unknown entities and relations by capturing their local structural features. Specifically, a relation graph is constructed to learn relation representations. In the relation graph, we employ a four-dimensional vector to represent the interaction patterns between nodes (relations), where each dimension corresponds to a specific type of interaction. For entity representations, our model dynamically initializes entity features using relation features and attentively aggregates neighboring features of entities to update entity features. By modeling interaction patterns between relations and incorporating structural information of entities, our model learns how to aggregate neighboring embeddings using attention mechanisms, thus generating high-quality embeddings for new entities and relations. Extensive experiments on benchmark datasets demonstrate that our model outperforms state-of-the-art methods, particularly in scenarios involving completely new relations. | 最近的研究集中在归纳推理,即预测训练过程中未观察到的实体之间的缺失链接。然而,大多数方法通常要求在推理时已知关系。在现实世界中,新实体和新关系通常同时出现,这对模型的泛化能力提出了巨大挑战。在本文中,我们提出了一种新颖的归纳知识图谱嵌入模型,该模型通过捕捉实体和关系的局部结构特征,有效处理未知实体和关系。具体来说,我们构建了一个关系图来学习关系表示。在关系图中,我们使用一个四维向量来表示节点(关系)之间的交互模式,其中每个维度对应一种特定的交互类型。对于实体表示,我们的模型使用关系特征动态初始化实体特征,并注意聚合实体的邻居特征以更新实体特征。通过建模关系之间的交互模式并结合实体的结构信息,我们的模型学习如何使用注意力机制聚合邻居嵌入,从而生成新实体和关系的高质量嵌入。在基准数据集上的广泛实验表明,我们的模型优于最先进的方法,特别是在涉及完全新关系的场景中。 | code | 0 | |
When LLM Meets Hypergraph: A Sociological Analysis on Personality via Online Social Networks | Zhiyao Shu, Xiangguo Sun, Hong Cheng | Individual personalities significantly influence our perceptions, decisions, and social interactions, which is particularly crucial for gaining insights into human behavior patterns in online social network analysis. Many psychological studies have observed that personalities are strongly reflected in their social behaviors and social environments. In light of these problems, this paper proposes a sociological analysis framework for one's personality in an environment-based view instead of individual-level data mining. Specifically, to comprehensively understand an individual's behavior from low-quality records, we leverage the powerful associative ability of LLMs by designing an effective prompt. In this way, LLMs can integrate various scattered information with their external knowledge to generate higher-quality profiles, which can significantly improve the personality analysis performance. To explore the interactive mechanism behind the users and their online environments, we design an effective hypergraph neural network where the hypergraph nodes are users and the hyperedges in the hypergraph are social environments. We offer a useful dataset with user profile data, personality traits, and several detected environments from the real-world social platform. To the best of our knowledge, this is the first network-based dataset containing both hypergraph structure and social information, which could push forward future research in this area further. By employing the framework on this dataset, we can effectively capture the nuances of individual personalities and their online behaviors, leading to a deeper understanding of human interactions in the digital world. | 个体性格显著影响我们的感知、决策和社会互动,这对于深入理解在线社交网络分析中的人类行为模式尤为关键。许多心理学研究观察到,性格在其社会行为和社会环境中得到了强烈体现。鉴于这些问题,本文提出了一种基于环境视角而非个体层面数据挖掘的社会学分析框架,用于分析个体性格。具体而言,为了从低质量记录中全面理解个体行为,我们利用大型语言模型(LLMs)强大的关联能力,通过设计有效的提示词,使LLMs能够整合各种分散的信息与其外部知识,生成更高质量的个体画像,从而显著提升性格分析的性能。为了探索用户与其在线环境之间的交互机制,我们设计了一种有效的超图神经网络,其中超图节点为用户,超图的超边为社会环境。我们提供了一个包含用户画像数据、性格特征及从真实社交平台检测到的多种环境的实用数据集。据我们所知,这是首个同时包含超图结构和社会信息的网络数据集,有望推动该领域的未来研究。通过在该数据集上应用该框架,我们能够有效捕捉个体性格及其在线行为的细微差别,从而更深入地理解数字世界中的人类互动。 | code | 0 | |
FABLE: Approximate Butterfly Counting in Bipartite Graph Stream with Duplicate Edges | Guozhang Sun, Yuhai Zhao, Yuan Li | North China University of Technology, Beijing, China; Northeastern University, Shenyang, China | Bipartite graph models the relationship between two different sets of entities. Such graph data become more dynamic and are organized as stream with duplicate edges in real-word applications such as customer-product in e-commerce. A butterfly, (2,2)-biclique, is the simplest cohesive substructure and of great importance in a bipartite graph. However, it is challenging to estimate the number of butterflies in large scale and high dynamic bipartite graph stream when given a limited memory. Besides, existing works for butterfly counting assume no duplicate edges in the bipartite graph stream, which cause less accuracy in bipartite graph stream with duplicate edges. In this paper, we propose FABLE, a Fixed-size memory Approximate Butterfly counting algorithm for dupLicate Edges in bipartite graph stream. In FABLE, we compute the number of distinct edges by maintaining an ordered list of edge priorities for replacement and sampling. We provide theoretical proof of unbiasedness and derive the variance of butterfly count. Our extensive experiments on 5 real-world datasets confirm that our approach has higher accuracy compared with the baseline method under the same memory usage. | 二分图模型描述了两组不同实体之间的关系。在电子商务等实际应用中,如客户-产品关系,这种图数据变得更加动态,并以包含重复边的流形式组织。蝴蝶结构,即(2,2)-二分团,是二分图中最重要的简单凝聚子结构。然而,在有限的内存条件下,估计大规模和高动态二分图流中的蝴蝶数量是一个挑战。此外,现有的蝴蝶计数方法假设二分图流中没有重复边,这导致在包含重复边的二分图流中计数精度较低。本文提出了FABLE,一种用于二分图流中重复边的固定内存近似蝴蝶计数算法。在FABLE中,我们通过维护一个用于替换和采样的边优先级有序列表来计算不同边的数量。我们提供了无偏性的理论证明,并推导了蝴蝶计数的方差。在5个真实世界数据集上的广泛实验证实,在相同内存使用条件下,我们的方法相比基线方法具有更高的精度。 | code | 0 |
Learnable Item Tokenization for Generative Recommendation | Wenjie Wang, Honghui Bao, Xinyu Lin, Jizhi Zhang, Yongqi Li, Fuli Feng, SeeKiong Ng, TatSeng Chua | Utilizing powerful Large Language Models (LLMs) for generative recommendation has attracted much attention. Nevertheless, a crucial challenge is transforming recommendation data into the language space of LLMs through effective item tokenization. Current approaches, such as ID, textual, and codebook-based identifiers, exhibit shortcomings in encoding semantic information, incorporating collaborative signals, or handling code assignment bias. To address these limitations, we propose LETTER (a LEarnable Tokenizer for generaTivE Recommendation), which integrates hierarchical semantics, collaborative signals, and code assignment diversity to satisfy the essential requirements of identifiers. LETTER incorporates Residual Quantized VAE for semantic regularization, a contrastive alignment loss for collaborative regularization, and a diversity loss to mitigate code assignment bias. We instantiate LETTER on two models and propose a ranking-guided generation loss to augment their ranking ability theoretically. Experiments on three datasets validate the superiority of LETTER, advancing the state-of-the-art in the field of LLM-based generative recommendation. | 利用强大的大型语言模型(LLMs)进行生成式推荐引起了广泛关注。然而,一个关键挑战是如何通过有效的物品标记化将推荐数据转换为LLMs的语言空间。当前的方法,如基于ID、文本和码本的标识符,在编码语义信息、整合协作信号或处理码分配偏差方面存在不足。为了解决这些限制,我们提出了LETTER(一种用于生成式推荐的LEarnable Tokenizer),它整合了层次语义、协作信号和码分配多样性,以满足标识符的基本要求。LETTER结合了残差量化VAE进行语义正则化,对比对齐损失进行协作正则化,以及多样性损失以减轻码分配偏差。我们在两个模型上实例化了LETTER,并提出了一种排名引导的生成损失,以理论增强其排名能力。在三个数据集上的实验验证了LETTER的优越性,推动了基于LLM的生成式推荐领域的前沿进展。 | code | 0 | |
Improving Adversarial Transferability via Frequency-Guided Sample Relevance Attack | Xinyi Wang, Zhibo Jin, Zhiyu Zhu, Jiayu Zhang, Huaming Chen | Suzhou Yierqi, Suzhou, China; University of Malaya, Kuala Lumpur, Malaysia; The University of Sydney, Sydney, Australia | Deep neural networks (DNNs) are known to be vulnerable to adversarial examples. To facilitate model safety, transfer-based attacks employ surrogate models to craft adversarial examples. In this work, we firstly study the intricate mechanisms of such attacks. We observe a correlation between the sharpness of decision boundaries in model sensitive regions and overfitting during adversarial training, which hampers the adversarial examples' transferability. To address this issue, we propose a novel approach termed Frequency-Guided Sample Relevance Attack (FGSRA). Specifically, we leverage frequency information to explore similar sensitive regions across different models, thereby generating neighborhood samples. Additional similarity weights are subsequently introduced to assess the adversarial contribution of the neighborhood samples. A hybrid gradient is then obtained to thoroughly exploit neighborhood information within input samples. Extensive experiments demonstrate the prominent performance of our approach. Compared to other state-of-the-art benchmarks on surrogate model Inc-v3, our method has an average improvement of 27.21% for normally trained CNNs and 42.1% for adversarially trained CNNs. Moreover, we achieve an average improvement of 24.6% for ViTs. Our code is available at: https://github.com/LMBTough/FGSRA | 深度神经网络(DNNs)在面对对抗样本时表现出脆弱性。为了提升模型的安全性,基于迁移的攻击方法利用代理模型来生成对抗样本。在本研究中,我们首先探讨了这类攻击的复杂机制。我们观察到,在模型的敏感区域中,决策边界的锐度与对抗训练中的过拟合现象之间存在关联,这影响了对抗样本的迁移性。为解决这一问题,我们提出了一种名为频率引导样本相关性攻击(Frequency-Guided Sample Relevance Attack, FGSRA)的新方法。具体而言,我们利用频率信息来探索不同模型间相似的敏感区域,从而生成邻域样本。随后,引入额外的相似性权重来评估这些邻域样本的对抗贡献。通过这种方式,我们获得了一种混合梯度,以充分挖掘输入样本中的邻域信息。广泛的实验结果表明,我们的方法表现出色。与基于代理模型Inc-v3的其他最先进基准相比,我们的方法在常规训练的CNNs上平均提升了27.21%,在对抗训练的CNNs上平均提升了42.1%。此外,我们在ViTs上也实现了平均24.6%的提升。我们的代码已公开,可访问:https://github.com/LMBTough/FGSRA。 | code | 0 |
Image-text Retrieval with Main Semantics Consistency | Yi Xie, Yangtao Wang, Yanzhao Xie, Xin Tan, Jingjing Li, Xiaocui Li, Weilong Peng, Maobin Tang, Meie Fang | School of Computer Science and Cyber Engineering, Guangzhou University, Guangzhou, China; East China Normal University, Shanghai, China; University of Electronic Science and Technology of China, Chengdu, China; Hunan University of Technology and Business, Changsha, China | Image-text retrieval (ITR) has been one of the primary tasks in cross-modal retrieval, serving as a crucial bridge between computer vision and natural language processing. Significant progress has been made to achieve global alignment and local alignment between images and texts by mapping images and texts into a common space to establish correspondences between these two modalities. However, the rich semantic content contained in each image may bring false matches, resulting in the matched text ignoring the main semantics but focusing on the secondary or other semantics of this image. To address this issue, this paper proposes a semantically optimized approach with a novel Main Semantics Consistency (MSC) loss function, which aims to rank the semantically most similar images (or texts) corresponding to the given query at the top position during the retrieval process. First, in each batch of image-text pairs, we separately compute (i) the image-image similarity, i.e., the similarity between every two images, (ii) the text-text similarity, i.e., the similarity between a group of texts (that belong to a certain image) and another group of texts (that belong to another image), and (iii) the image-text similarity, i.e., the similarity between each image and each text. Afterward, our proposed MSC effectively aligns the above image-image, image-text, and text-text similarity, since the main semantics of every two images will be highly close if their text descriptions remain highly semantically consistent. By this means, we can capture the main semantics of each image to be matched with its corresponding texts, prioritizing the semantically most related retrieval results. Extensive experiments on MSCOCO and FLICKR30K verify the superior performance of MSC compared with the SOTA image-text retrieval methods. The source code of this project is released at GitHub: https://github.com/xyi007/MSC. | 图像-文本检索(ITR)一直是跨模态检索中的主要任务之一,作为连接计算机视觉和自然语言处理的关键桥梁。通过将图像和文本映射到共同空间以建立这两种模态之间的对应关系,已经取得了显著的进展,实现了图像与文本之间的全局对齐和局部对齐。然而,每张图像中丰富的语义内容可能会带来错误的匹配,导致匹配的文本忽略了主要语义,而聚焦于次要或其他语义。为了解决这一问题,本文提出了一种语义优化的方法,并引入了一种新颖的主语义一致性(MSC)损失函数,旨在检索过程中将语义上最相似的图像(或文本)排在给定查询结果的最前面。首先,在每一批图像-文本对中,我们分别计算(i)图像-图像相似度,即每两张图像之间的相似度;(ii)文本-文本相似度,即属于某张图像的一组文本与属于另一张图像的另一组文本之间的相似度;以及(iii)图像-文本相似度,即每张图像与每个文本之间的相似度。随后,我们提出的MSC有效地对齐了上述的图像-图像、图像-文本和文本-文本相似度,因为如果两张图像的文本描述在语义上保持高度一致,那么这两张图像的主要语义将会高度接近。通过这种方式,我们可以捕捉每张图像的主要语义,并将其与相应的文本匹配,优先考虑语义上最相关的检索结果。在MSCOCO和FLICKR30K上的大量实验验证了MSC相比最先进的图像-文本检索方法的优越性能。本项目的源代码已在GitHub上发布:https://github.com/xyi007/MSC。 | code | 0 |
Post-Quantum Searchable Encryption Supporting User-Authorization for Outsourced Data Management | Shiyuan Xu, Yibo Cao, Xue Chen, Yu Guo, Yuer Yang, Fangda Guo, SiuMing Yiu | Department of Computer Science, The University of Hong Kong, Hong Kong, China; School of Cyberspace Security, Beijing University of Posts and Telecommunications, Beijing, China; School of Artificial Intelligence, Beijing Normal University, Beijing, China | With the widespread development of database systems, data security has become crucial when it comes to sharing among users and servers. A straightforward approach involves using searchable encryption to ensure the confidentiality of shared data. However, in certain scenarios, varying user tiers are granted disparate data searching privileges, and administrators need to restrict the searchability of ciphertexts to select users exclusively. To address this issue, public key encryption with authorized keyword search (PEAKS) was proposed, wherein solely authorized users possess the ability to conduct targeted keyword searches. Nonetheless, it is vulnerable to resist quantum computing attacks. As a result, research focusing on authorizing users to search for keywords while achieving quantum security is far-reaching. In this paper, we propose a lattice-based variant of PEAKS (L-PEAKS) that enables keyword dataset authorization for outsourced data management. Unlike existing schemes, our design incorporates identity-based encryption (IBE) to overcome the bottleneck of public key management. Besides, we utilize several lattice sampling algorithms to defend against attacks from quantum adversaries. Specifically, each authorized user must obtain a search privilege from an authority. The authority distributes an authorized token to the user within a specific time period, and the user generates a trapdoor for any authorized keywords. Our scheme is proven to be secure against IND-sID-CKA and T-EUF security in a quantum setting. We also conduct comprehensive evaluations on a commodity machine to assess completeness and provide theoretical complexity comparisons with existing state-of-the-art schemes. | 随着数据库系统的广泛发展,数据安全在用户和服务器之间的共享过程中变得至关重要。一个直接的方法是使用可搜索加密来确保共享数据的机密性。然而,在某些情况下,不同的用户层级被授予不同的数据搜索权限,管理员需要将密文的可搜索性限制为仅对选定的用户开放。为了解决这个问题,提出了基于授权关键词搜索的公钥加密(PEAKS),其中只有授权用户能够进行目标关键词搜索。然而,这种方案易受量子计算攻击的影响。因此,研究授权用户进行关键词搜索同时实现量子安全的方案具有深远意义。本文提出了一种基于格的PEAKS变体(L-PEAKS),该变体支持外包数据管理的关键词数据集授权。与现有方案不同,我们的设计结合了基于身份的加密(IBE)来解决公钥管理的瓶颈问题。此外,我们利用多种格采样算法来防御量子敌手的攻击。具体而言,每个授权用户必须从权威机构获取搜索权限。权威机构在特定时间段内向用户分发授权令牌,用户为任何授权关键词生成陷门。我们的方案在量子环境下被证明对IND-sID-CKA和T-EUF安全是安全的。我们还对商用机器进行了全面的评估,以评估其完整性,并提供了与现有最先进方案的理论复杂度比较。 | code | 0 |
Decoupled Behavior-based Contrastive Recommendation | Mengduo Yang, Jie Zhou, Meng Xi, Xiaohua Pan, Yi Yuan, Ying Li, Yangyang Wu, Jinshan Zhang, Jianwei Yin | ; School of Software Technology, Zhejiang University, Ningbo, Zhejiang, China | Recommender systems are crucial tools in online applications, assisting users in discovering relevant content efficiently. Recent studies demonstrate that contrastive learning (CL) based methods yield significant results in collaborative filtering (CF) recommendations, due to their ability to address the issue of data sparsity. However, two inherent limitations remain unexplored in these methods. a) Since the datasets commonly used are binary (0: no interaction; 1: interaction), current methods only provide rudimentary modeling of user behaviors in binary form, which fails to model complex user-item interactions and relationships in real-world recommendation scenarios. b) Existing CL-based methods mostly construct contrastive views through heuristic-based embedding or structure perturbation, which are prone to introduce noise or discard important information, leading to a decreased representation quality. To address these issues, we propose a Decoupled Behavior-based Contrastive Recommendation model (DBCR) that effectively decouples user behaviors from binary datasets for better user-item interaction modeling. The core idea is to decouple latent user behaviors from unlabelled user-item interactions (binary datasets) and utilize self-supervised contrastive learning to optimize CF-based recommendation jointly. Specifically, we introduce latent behavior variables and embed them into user-item interaction modeling within the generalized expectation-maximization (EM) framework. Moreover, we design a contrastive learning task by constructing a preference view instead of unreasonable perturbation to further improve the learned representation. 
Experimental results and analyses on three real-world datasets demonstrate the effectiveness of DBCR and its high efficiency, with an average improvement of 16.9% over state-of-the-art methods. Our code is available on https://github.com/Du-danger/DBCR. | 推荐系统是在线应用中的关键工具,能够帮助用户高效地发现相关内容。最近的研究表明,基于对比学习(Contrastive Learning, CL)的方法在协同过滤(Collaborative Filtering, CF)推荐中取得了显著成果,这主要归功于它们解决数据稀疏性问题的能力。然而,这些方法存在两个内在限制尚未得到充分探讨。a) 由于常用的数据集通常是二值的(0:无交互;1:有交互),当前的方法仅以二值形式对用户行为进行初步建模,未能捕捉真实推荐场景中复杂的用户-物品交互和关系。b) 现有的基于CL的方法大多通过启发式嵌入或结构扰动来构建对比视图,这容易引入噪声或丢弃重要信息,导致表示质量下降。为了解决这些问题,我们提出了一种解耦行为对比推荐模型(Decoupled Behavior-based Contrastive Recommendation model, DBCR),该模型有效地将用户行为从二值数据集中解耦,以更好地建模用户-物品交互。其核心思想是将潜在用户行为从无标签的用户-物品交互(二值数据集)中解耦,并利用自监督对比学习来联合优化基于CF的推荐。具体来说,我们引入了潜在行为变量,并在广义期望最大化(EM)框架内将其嵌入到用户-物品交互建模中。此外,我们设计了一个对比学习任务,通过构建偏好视图而非不合理的扰动来进一步提高学习到的表示质量。在三个真实世界数据集上的实验结果和分析表明,DBCR的有效性和高效性,平均比最先进的方法提高了16.9%。我们的代码可在https://github.com/Du-danger/DBCR获取。 | code | 0 |
Hyperbolic Contrastive Learning for Cross-Domain Recommendation | Xin Yang, Heng Chang, Zhijian Lai, Jinze Yang, Xingrun Li, Yu Lu, Shuaiqiang Wang, Dawei Yin, Erxue Min | Peking University, Beijing, China; Tsinghua University, Beijing, China; University of Tokyo, Tokyo, Japan; Kyoto University, Kyoto, Japan; University of Tsukuba, Tsukuba, Japan; Baidu Inc., Beijing, China | Cross-Domain Recommendation (CDR) seeks to utilize knowledge from different domains to alleviate the problem of data sparsity in the target recommendation domain, and has been gaining more attention in recent years. Although there have been notable advances in this area, most current methods represent users and items in Euclidean space, which is not ideal for handling long-tail distributed data in recommendation systems. Additionally, adding data from other domains can worsen the long-tail characteristics of the entire dataset, making it harder to train CDR models effectively. Recent studies have shown that hyperbolic methods are particularly suitable for modeling long-tail distributions, which has led us to explore hyperbolic representations for users and items in CDR scenarios. However, due to the distinct characteristics of the different domains, applying hyperbolic representation learning to CDR tasks is quite challenging. In this paper, we introduce a new framework called Hyperbolic Contrastive Learning (HCTS), designed to capture the unique features of each domain while enabling efficient knowledge transfer between domains. We achieve this by embedding users and items from each domain separately and mapping them onto distinct hyperbolic manifolds with adjustable curvatures for prediction. To improve the representations of users and items in the target domain, we develop a hyperbolic contrastive learning module for knowledge transfer. Extensive experiments on real-world datasets demonstrate that hyperbolic manifolds are a promising alternative to Euclidean space for CDR tasks. 
The codes are available at https://github.com/EnkiXin/hcts. | 跨域推荐(CDR)旨在利用不同领域的知识来缓解目标推荐领域中的数据稀疏问题,近年来受到越来越多的关注。尽管在这一领域取得了显著进展,但大多数现有方法将用户和物品表示在欧几里得空间中,这对于处理推荐系统中的长尾分布数据并不理想。此外,添加其他领域的数据可能会加剧整个数据集的长尾特性,使得有效训练CDR模型变得更加困难。最近的研究表明,双曲方法特别适合于建模长尾分布,这促使我们探索在CDR场景中使用双曲表示用户和物品。然而,由于不同领域的特性各异,将双曲表示学习应用于CDR任务相当具有挑战性。在本文中,我们引入了一种称为双曲对比学习(HCTS)的新框架,旨在捕捉每个领域的独特特征,同时实现领域间高效的知识转移。我们通过将每个领域的用户和物品分别嵌入,并将它们映射到具有可调曲率的独立双曲流形上进行预测来实现这一点。为了改进目标领域中用户和物品的表示,我们开发了一个双曲对比学习模块用于知识转移。在真实世界数据集上的广泛实验表明,双曲流形是CDR任务中欧几里得空间的一个有前景的替代方案。代码可在https://github.com/EnkiXin/hcts获取。 | code | 0 |
Guaranteeing Accuracy and Fairness under Fluctuating User Traffic: A Bankruptcy-Inspired Re-ranking Approach | Xiaopeng Ye, Chen Xu, Jun Xu, Xuyang Xie, Gang Wang, Zhenhua Dong | | Out of sustainable and economical considerations, two-sided recommendation platforms must satisfy the needs of both users and providers. Previous studies often show that the two sides' needs show different urgency: providers need a relatively long-term exposure demand while users want more short-term and accurate service. However, our empirical study reveals that previous methods for trading off fairness-accuracy often fail to guarantee long-term fairness and short-term accuracy simultaneously in real applications of fluctuating user traffic. Especially, when user traffic is low, the user experience often drops a lot. Our theoretical analysis also confirms that user traffic is a key factor in such a trade-off problem. How to guarantee accuracy and fairness under fluctuating user traffic remains a problem. Inspired by the bankruptcy problem in economics, we propose a novel fairness-aware re-ranking approach named BankFair. Intuitively, BankFair employs the Talmud rule to leverage periods of abundant user traffic to offset periods of user traffic scarcity, ensuring consistent user service at every period while upholding long-term fairness. Specifically, BankFair consists of two modules: (1) employing the Talmud rule to determine the required fairness degree under varying periods of user traffic; and (2) conducting an online re-ranking algorithm based on the fairness degree determined by the Talmud rule. Experiments on two real-world recommendation datasets show that BankFair outperforms all baselines regarding accuracy and provider fairness. 
 | 出于可持续性和经济性的考虑,双边推荐平台必须同时满足用户和提供者的需求。以往的研究通常表明,双方的需求显示出不同的紧迫性:提供者需要相对长期的曝光需求,而用户则希望获得更多短期且准确的服务。然而,我们的实证研究揭示,先前用于权衡公平性与准确性的方法在面对用户流量波动的实际应用中,往往无法同时保证长期的公平性和短期的准确性。特别是在用户流量较低时,用户体验往往会大幅下降。我们的理论分析也证实,用户流量是此类权衡问题中的关键因素。如何在用户流量波动的情况下保证准确性和公平性仍然是一个问题。受经济学中破产问题的启发,我们提出了一种名为BankFair的新型公平感知重排序方法。直观上,BankFair利用用户流量充足期来弥补用户流量稀缺期,通过使用塔木德规则确保每个时期用户服务的连续性,同时维护长期的公平性。具体而言,BankFair包含两个模块:(1)利用塔木德规则确定在不同用户流量时期所需的公平性程度;(2)基于塔木德规则确定的公平性程度进行在线重排序算法。在两个真实世界推荐数据集上的实验表明,BankFair在准确性和提供者公平性方面均优于所有基线方法。 | code | 0 |
EFVAE: Efficient Federated Variational Autoencoder for Collaborative Filtering | Lu Zhang, Qian Rong, Xuanang Ding, Guohui Li, Ling Yuan | Huazhong University of Science and Technology, Wuhan, China | Federated recommender systems are used to address privacy issues in recommendations. Among them, FedVAE extends the representative non-linear recommendation method MultVAE. However, the bottleneck of FedVAE lies in its communication load during training, as the parameter volume of its first and last layers is correlated with the number of items. This leads to significant communication cost during the model's transmission phases (distribution and upload), making FedVAE's implementation extremely challenging. To address these challenges, we propose an Efficient Federated Variational AutoEncoder for collaborative filtering, EFVAE, whose core is the Federated Collaborative Importance Sampling (FCIS) method. FCIS reduces communication costs through a client-to-server collaborative sampling mechanism and provides satisfactory recommendation performance through dynamic multi-stage approximation of the decoding distribution. Extensive experiments and analyses on real-world datasets confirm that EFVAE significantly reduces communication costs by up to 94.51% while maintaining the recommendation performance. Moreover, its recommendation performance is better on sparse datasets, with improvements reaching up to 13.79%. | 联邦推荐系统用于解决推荐中的隐私问题。其中,FedVAE扩展了代表性的非线性推荐方法MultVAE。然而,FedVAE的瓶颈在于其训练过程中的通信负载,因为其第一层和最后一层的参数体积与项目数量相关。这导致模型传输阶段(分发和上传)的通信成本显著增加,使得FedVAE的实现极为困难。为了应对这些挑战,我们提出了一种高效的联邦变分自编码器用于协同过滤,EFVAE,其核心是联邦协同重要性采样(FCIS)方法。FCIS通过客户端到服务器的协同采样机制减少通信成本,并通过解码分布的动态多阶段近似提供令人满意的推荐性能。在真实世界数据集上的广泛实验和分析证实,EFVAE显著减少了高达94.51%的通信成本,同时保持了推荐性能。此外,它在稀疏数据集上的推荐性能更好,提升幅度最高可达13.79%。 | code | 0 |
MSKR: Advancing Multi-modal Structured Knowledge Representation with Synergistic Hard Negative Samples | Shuili Zhang, Hongzhang Mu, Tingwen Liu, Qianqian Tong, Jiawei Sheng | ; Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China | Despite the notable progress achieved by large-scale vision-language pre-training models in a wide range of multi-modal tasks, their performance often falls short in image-text matching challenges that require an in-depth understanding of structured representations. For instance, when distinguishing between texts or images that are generally similar but have distinct structured knowledge (such as entities and relationships in text, or objects and object attributes in images), the model's capabilities are limited. In this paper, we propose Advancing Multi-modal Structured Knowledge Representation with Synergistic Hard Negative Samples (MSKR), an approach that significantly improves the model's matching capability for such data. Specifically, our model comprises a structured knowledge-enhanced encoder designed to bolster the structured knowledge inherent in textual data, such as entities, their attributes, and the relationships among these entities, as well as structured knowledge within images, focusing on elements like objects and their attributes. To further refine the model's learning process, we produce challenging negative samples for both images and text. Extensive experimental evaluations on the Winoground, InpaintCOCO, and MSCOCO benchmarks reveal that MSKR significantly outperforms the baseline model, with an average improvement of 2.66% in structured representation learning. Moreover, general representation results illustrate that our model not only excels in structured representation learning but also maintains its proficiency in general representation learning. 
| 尽管大规模视觉-语言预训练模型在多模态任务中取得了显著进展,但在需要深入理解结构化表示的图像-文本匹配挑战中,其表现往往不尽如人意。例如,在区分文本或图像时,当这些文本或图像总体相似但具有不同的结构化知识(如文本中的实体及其关系,或图像中的对象及其属性)时,模型的能力受到限制。本文提出了一种通过协同硬负样本推进多模态结构化知识表示(MSKR)的方法,从而显著提升了模型对这类数据的匹配能力。具体而言,我们的模型包括一个结构化知识增强的编码器,旨在强化文本数据中固有的结构化知识,如实体、实体属性及其关系,以及图像中关注对象及其属性的结构化知识。为了进一步优化模型的学习过程,我们生成了图像和文本的挑战性负样本。在Winoground、InpaintCOCO和MSCOCO基准上的广泛实验评估表明,MSKR显著优于基线模型,与基线相比,在结构化表示学习中平均提高了2.66%。此外,通用表示结果显示,我们的模型不仅在结构化表示学习中表现出色,而且在通用表示学习中也保持了其熟练度。 | code | 0 |
Watermarking Recommender Systems | Sixiao Zhang, Cheng Long, Wei Yuan, Hongxu Chen, Hongzhi Yin | | Recommender systems embody significant commercial value and represent crucial intellectual property. However, the integrity of these systems is constantly challenged by malicious actors seeking to steal their underlying models. Safeguarding against such threats is paramount to upholding the rights and interests of the model owner. While model watermarking has emerged as a potent defense mechanism in various domains, its direct application to recommender systems remains unexplored and non-trivial. In this paper, we address this gap by introducing Autoregressive Out-of-distribution Watermarking (AOW), a novel technique tailored specifically for recommender systems. Our approach entails selecting an initial item and querying it through the oracle model, followed by the selection of subsequent items with small prediction scores. This iterative process generates a watermark sequence autoregressively, which is then ingrained into the model's memory through training. To assess the efficacy of the watermark, the model is tasked with predicting the subsequent item given a truncated watermark sequence. Through extensive experimentation and analysis, we demonstrate the superior performance and robust properties of AOW. Notably, our watermarking technique exhibits high-confidence extraction capabilities and maintains effectiveness even in the face of distillation and fine-tuning processes. | 推荐系统具有显著的商业价值,并代表着重要的知识产权。然而,这些系统的完整性不断受到恶意行为者的挑战,他们试图窃取其底层模型。保护这些系统免受此类威胁对于维护模型所有者的权益至关重要。尽管模型水印技术在多个领域已成为一种强大的防御机制,但将其直接应用于推荐系统仍未得到探索且具有挑战性。在本文中,我们通过引入自回归分布外水印技术(Autoregressive Out-of-distribution Watermarking, AOW)来填补这一空白,这是一种专为推荐系统量身定制的新技术。我们的方法包括选择初始项目并通过oracle模型查询,随后选择具有较小预测分数的后续项目。这一迭代过程自回归地生成水印序列,然后通过训练将其嵌入模型的记忆中。为了评估水印的有效性,模型被要求在给定截断水印序列的情况下预测后续项目。通过广泛的实验和分析,我们展示了AOW的优越性能和鲁棒特性。值得注意的是,我们的水印技术表现出高置信度的提取能力,并且在面对蒸馏和微调过程时仍能保持有效性。 | code | 0 |
Multi-modal Food Recommendation with Health-aware Knowledge Distillation | Yixin Zhang, Xin Zhou, Fanglin Zhu, Ning Liu, Wei Guo, Yonghui Xu, Zhiqi Shen, Lizhen Cui | Shandong University, Jinan, China; Nanyang Technological University, Singapore, Singapore | Food recommendation systems play a pivotal role in shaping dietary salubrity and fostering sustainable lifestyles by recommending recipes and foodstuffs that align with user preferences. Metadata information of a recipe, encompassing multi-modal descriptions, constituent ingredients, and health-related attributes, can furnish a more holistic perspective on the recipe's profile, thereby augmenting recommendation performance. However, existing state-of-the-art methods often overlook the inherent interdependencies between modalities, ingredients, and health factors, leaving the health information pertaining to recipe characteristics underexploited. Notably, our preliminary investigation on two datasets unveiled that the semantic divergence between health-related knowledge and collaborative filtering signals is more pronounced in comparison to other metadata information, thereby potentially impeding the efficacy of food recommendation systems. To address these limitations, we propose HealthRec, a novel multi-modal food recommendation framework with health-aware knowledge distillation. HealthRec employs a global graph representation learning module to capture high-order dependencies across diverse food-related relations, enriching the representations. Subsequently, a co-attention network is leveraged to capture local, recipe-level knowledge transfer between modality-related and ingredient-related embeddings. Additionally, we exploit external supervision signals derived from WHO recommendations, utilizing knowledge distillation during the training phase to transfer local health-aware knowledge into global collaborative embeddings. 
Extensive experimentation on real-world datasets demonstrates HealthRec's superiority compared to current state-of-the-art recommendation baselines, highlighting its effectiveness in modeling health-aware food recommendations. | 食物推荐系统在通过推荐符合用户偏好的食谱和食品来塑造饮食健康和促进可持续生活方式方面发挥着关键作用。食谱的元数据信息,包括多模态描述、成分成分和健康相关属性,可以提供关于食谱特征的更全面视角,从而增强推荐性能。然而,现有的最先进方法往往忽视了模态、成分和健康因素之间的固有相互依赖性,导致与食谱特征相关的健康信息未得到充分利用。值得注意的是,我们初步研究的两个数据集揭示了健康相关知识与协同过滤信号之间的语义差异相较于其他元数据信息更为显著,这可能阻碍食物推荐系统的有效性。为了解决这些限制,我们提出了HealthRec,一种新颖的多模态食物推荐框架,结合了健康意识的知识蒸馏。HealthRec采用全局图表示学习模块来捕捉跨多种食物相关关系的高阶依赖性,丰富表示内容。随后,利用协同注意力网络来捕捉模态相关和成分相关嵌入之间的局部、食谱级别知识传递。此外,我们利用来自世界卫生组织建议的外部监督信号,在训练阶段通过知识蒸馏将局部健康意识知识传递到全局协同嵌入中。在真实世界数据集上的广泛实验表明,HealthRec相较于当前最先进的推荐基线具有优越性,突显了其在建模健康意识食物推荐方面的有效性。 | code | 0 |
Preference Prototype-Aware Learning for Universal Cross-Domain Recommendation | Yuxi Zhang, Ji Zhang, Feiyang Xu, Lvying Chen, Bohan Li, Lei Guo, Hongzhi Yin | ; Polytechnic Institute, Zhejiang University, Hangzhou, China; College of Computer Science and Technology, Shandong Normal University, Jinan, China | Cross-domain recommendation (CDR) aims to suggest items from new domains that align with potential user preferences, based on their historical interactions. Existing methods primarily focus on acquiring item representations by discovering user preferences under specific, yet possibly redundant, item features. However, user preferences may be more strongly associated with interacted items at higher semantic levels, rather than specific item features. Consequently, this item feature-focused recommendation approach can easily become suboptimal or even obsolete when conducting CDR with disturbances of these redundant features. In this paper, we propose a novel Preference Prototype-Aware (PPA) learning method to quantitatively learn user preferences while minimizing disturbances from the source domain. The PPA framework consists of two complementary components: a mix-encoder and a preference prototype-aware decoder, forming an end-to-end unified framework suitable for various real-world scenarios. The mix-encoder employs a mix-network to learn better general representations of interacted items and capture the intrinsic relationships between items across different domains. The preference prototype-aware decoder implements a learnable prototype matching mechanism to quantitatively perceive user preferences, which can accurately capture user preferences at a higher semantic level. This decoder can also avoid disturbances caused by item features from the source domain. The experimental results on public benchmark datasets in different scenarios demonstrate the superiority of the proposed PPA learning method compared to state-of-the-art counterparts. 
PPA excels not only in providing accurate recommendations but also in offering reliable preference prototypes. Our code is available at https://github.com/zyx-nuaa/PPA-for-CDR. | 跨域推荐(CDR)旨在根据用户的历史交互行为,向他们推荐来自新领域的符合其潜在偏好的项目。现有方法主要集中在通过在特定但可能冗余的项目特征下发现用户偏好来获取项目表示。然而,用户偏好可能更多地与高语义层次上的交互项目相关联,而不是特定的项目特征。因此,这种以项目特征为中心的推荐方法在进行跨域推荐时,当遇到这些冗余特征的干扰时,很容易变得次优甚至失效。本文提出了一种新的偏好原型感知(PPA)学习方法,该方法在最小化源域干扰的同时,定量学习用户偏好。PPA框架由两个互补组件组成:一个混合编码器和一个偏好原型感知解码器,形成一个适用于各种现实场景的端到端统一框架。混合编码器采用混合网络来学习交互项目的更好泛化表示,并捕捉不同领域项目之间的内在关系。偏好原型感知解码器实现了一个可学习的原型匹配机制,以定量感知用户偏好,能够准确捕捉用户在高语义层次上的偏好。该解码器还能避免来自源域项目特征的干扰。在不同场景的公共基准数据集上的实验结果表明,所提出的PPA学习方法相比现有最先进的方法具有优越性。PPA不仅在提供准确推荐方面表现出色,还能提供可靠的偏好原型。我们的代码可在https://github.com/zyx-nuaa/PPA-for-CDR获取。 | code | 0 |
Multi-Task Modeling of Student Knowledge and Behavior | Siqian Zhao, Sherry Sahebi | Department of Computer Science, University at Albany - SUNY, Albany, NY, USA | Knowledge Tracing (KT) and Behavior Modeling (BM) are essential mining and discovery problems in education. KT models student knowledge based on prior performance with learning materials, while BM focuses on patterns such as student preferences, engagement, and procrastination. Traditional research in these areas focuses on each task individually, thereby overlooking their interconnections. However, recent research on multi-activity knowledge tracing suggests that student preferences for learning materials are key to understanding student learning. In this paper, we propose a novel multi-task model, the Multi-Task Student Knowledge and Behavior Model (KTBM), which combines KT and BM to improve both performance and interoperability. KTBM includes a multi-activity KT component and a preference behavior component while enabling robust information transfer between them. We conceptualize this approach as a multi-task learning problem with two objectives: predicting students' performance and their choices concerning learning material types. To address this dual-objective challenge, we employ a Pareto multi-task learning optimization algorithm. Our experiments on three real-world datasets show that KTBM significantly enhances both KT and BM performance, demonstrating improvement across various settings and providing interpretable results. | 知识追踪(KT)和行为建模(BM)是教育领域中重要的挖掘与发现问题。KT根据学生先前在学习材料上的表现来模型化学生的知识水平,而BM则关注学生的偏好、参与度及拖延等模式。传统研究往往单独处理这些任务,忽略了它们之间的关联。然而,近期关于多活动知识追踪的研究表明,学生对学习材料的偏好是理解其学习行为的关键。本文中,我们提出了一种新颖的多任务模型——多任务学生知识与行为模型(KTBM),该模型结合了KT和BM,旨在提升性能与互操作性。KTBM包含一个多活动KT组件和一个偏好行为组件,并支持两者之间的信息稳健传递。我们将这一方法构想为一个具有两个目标的多任务学习问题:预测学生的表现及其对学习材料类型的选择。为应对这一双重目标挑战,我们采用了帕累托多任务学习优化算法。我们在三个真实世界数据集上的实验表明,KTBM显著提升了KT和BM的性能,展示了在各种情境下的改进,并提供了可解释的结果。 | code | 0 |
Accurate Embedding-based Log Determinant Optimization | Daye Eun, Byungkon Kang | SUNY Korea, Incheon, Republic of Korea; MagicFinger, Incheon, Republic of Korea | Many tangible and intangible objects are represented as itemsets; i.e., composition of individual items. In this paper, we address the problem of finding the embedding of such items so as to use those embeddings in tasks like missing item prediction. We approach this problem by means of determinantal point process (DPP) in order to reflect the diversity within each set. Doing so requires an optimization of a log determinant of a symmetric positive definite (SPD) matrix. The standard practice to achieve this is to perform a low-rank decomposition of the matrix and derive update rules for the low rank matrix. In this work, we propose to approach this problem by means of item embedding. That is, we will learn the SPD matrix by trying to find the right vector representations for the given data for a fixed kernel function. To this end, we propose a novel algorithm to accurately compute the gradients of the log determinant with respect to the embedding vectors. We also show that our approach outperforms Autodiff-based learning in terms of gradient direction and running time, and that other general log determinant optimization problems can be addressed. | 许多有形和无形的物体被表示为项集;即,由单个项组成的组合。本文中,我们解决了为这些项寻找嵌入的问题,以便在诸如缺失项预测等任务中使用这些嵌入。我们通过行列式点过程(DPP)来解决这个问题,以反映每个集合内的多样性。这样做需要对对称正定(SPD)矩阵的对数行列式进行优化。实现这一目标的标准做法是对矩阵进行低秩分解,并推导出低秩矩阵的更新规则。在这项工作中,我们提出通过项嵌入的方式来解决这个问题。也就是说,我们将通过尝试为给定数据找到适当的向量表示来学习SPD矩阵,对于一个固定的核函数。为此,我们提出了一种新的算法,能够准确计算对数行列式相对于嵌入向量的梯度。我们还表明,我们的方法在梯度方向和运行时间方面优于基于自动微分的学习,并且可以解决其他一般性的对数行列式优化问题。 | code | 0 |
Knowledge-enhanced Dynamic Modeling framework for Multi-Behavior Recommendation | Xiujuan Li, Nan Wang, Jin Zeng, Yingli Zhong, Zhonghui Shen | Heilongjiang University, Harbin, China | Multi-behavior recommendation (MBR) aims at predicting the items that the user will interact with at the next moment through the target behavior. Most existing MBR models are devoted to designing novel graph convolutional networks to combine multi-behavioral information. However, they ignore the negative impact of auxiliary behaviours, and also fail to take into account the effects of item characteristics. These limitations can lead to model performance degradation and affect user satisfaction. To address these issues, we propose a Knowledge-enhanced Dynamic Modeling framework for Multi-Behavior Recommendation (KDMBR). The algorithm utilises a multi-behavioral interaction module and a knowledge graph module to capture the user's overall interest and feature information respectively. The former designs a behavior-aware attention to distinguish contributions between behaviors. The latter introduces KG to enrich item characteristics and proposes a graph reconstruction strategy to enrich user information. Experiments on two large datasets further demonstrate the effectiveness of KDMBR. | 多行为推荐(MBR)旨在通过目标行为预测用户在下一时刻将与之交互的项目。大多数现有的MBR模型致力于设计新颖的图卷积网络以结合多行为信息。然而,这些模型忽视了辅助行为的负面影响,也未能考虑到项目特征的影响。这些局限性可能导致模型性能下降并影响用户满意度。为了解决这些问题,我们提出了一个知识增强的动态建模框架,用于多行为推荐(KDMBR)。该算法利用多行为交互模块和知识图谱模块分别捕捉用户的整体兴趣和特征信息。前者设计了一种行为感知的注意力机制来区分不同行为之间的贡献。后者引入了知识图谱(KG)来丰富项目特征,并提出了一种图重建策略来丰富用户信息。在两个大型数据集上的实验进一步证明了KDMBR的有效性。 | code | 0 |
Multi-DSI: Non-deterministic Identifier and Concept Alignment for Differentiable Search Index | YuZe Liu, JyunYu Jiang, PuJen Cheng | National Taiwan University, Taipei, Taiwan; Amazon Search, Palo Alto, CA, USA | With the advent of generative deep learning models, generative IR has gained increasing attention. However, existing methods face two issues: (1) when a document is represented by a single semantic ID, the retrieval model may fail to capture the multifaceted and complex content of the document; and (2) when the generated training data exhibits semantic ambiguity, the retrieval model may struggle to distinguish the differences in the content of similar documents. To address these issues, we propose Multi-DSI to (1) offer multiple non-deterministic semantic identifiers and (2) align the concepts of queries and documents to avoid ambiguity. Extensive experiments on two benchmark datasets demonstrate that Multi-DSI significantly outperforms baseline methods by 7.4%. | 随着生成式深度学习模型的兴起,生成式信息检索(IR)受到了越来越多的关注。然而,现有方法面临两个问题:(1)当文档由单一的语义ID表示时,检索模型可能无法捕捉到文档多面且复杂的内容;(2)当生成的训练数据存在语义模糊时,检索模型可能难以区分相似文档内容的差异。为解决这些问题,我们提出了多维度语义标识符(Multi-DSI),以(1)提供多个非确定性的语义标识符,并(2)对齐查询与文档的概念,以避免语义模糊。在两个基准数据集上的广泛实验表明,Multi-DSI显著优于基线方法,性能提升了7.4%。 | code | 0 |
Do We Really Need to Drop Items with Missing Modalities in Multimodal Recommendation? | Daniele Malitesta, Emanuele Rossi, Claudio Pomo, Tommaso Di Noia, Fragkiskos D. Malliaros | | Generally, items with missing modalities are dropped in multimodal recommendation. However, with this work, we question this procedure, highlighting that it would further damage the pipeline of any multimodal recommender system. First, we show that the lack of (some) modalities is, in fact, a widely-diffused phenomenon in multimodal recommendation. Second, we propose a pipeline that imputes missing multimodal features in recommendation by leveraging traditional imputation strategies in machine learning. Then, given the graph structure of the recommendation data, we also propose three more effective imputation solutions that leverage the item-item co-purchase graph and the multimodal similarities of co-interacted items. Our method can be plugged into any multimodal RSs in the literature working as an untrained pre-processing phase, showing (through extensive experiments) that any data pre-filtering is not only unnecessary but also harmful to the performance. | 通常,在多模态推荐中,会丢弃缺少模态的项。然而,通过这项工作,我们质疑这一做法,强调这将进一步损害任何多模态推荐系统的流程。首先,我们展示了在多模态推荐中,某些模态的缺失实际上是一个普遍存在的现象。其次,我们提出了一种流程,通过利用机器学习中的传统插补策略来填补推荐中缺失的多模态特征。然后,鉴于推荐数据的图结构,我们还提出了三种更有效的插补解决方案,这些方案利用了物品间共同购买图和共同交互物品的多模态相似性。我们的方法可以插入到任何现有的多模态推荐系统中,作为一个未经训练的预处理阶段,通过广泛的实验表明,任何数据预过滤不仅是不必要的,而且对性能有害。 | code | 0 |
SOUP: A Unified Shopping Query Suggestion Framework to Optimize Language Model with User Preference | Xu Meng, Zhaohui Luo, Xinxin Wang, Wen Jiang, Wei Ning, Shuhan Qi | Alibaba Group, Hangzhou, China; Harbin Institute of Technology, Shenzhen, Shenzhen, China | The shopping query suggestion offers personalized queries to users and plays a crucial role in search engines. However, existing shopping query suggestion methods suffer from poor task generalization and limited semantic comprehension problems. This paper presents a comprehensive framework for the shopping query suggestion that effectively addresses the shortcomings of existing approaches. Our proposed framework leverages a generative language model and fine-grained preference alignment to enhance semantic comprehension and improve the quality of generated queries. Our key contributions include the introduction of a personalized prompt set for diverse query suggestion tasks, the integration of interaction behavior time to capture user query interests, and the utilization of reinforcement learning techniques to align user preferences. Experimental results demonstrate enhancements in different scenarios. Our codes are available at https://github.com/1170300319/CIKM2024_SOUP. | 购物查询建议为用户提供个性化的查询,并在搜索引擎中发挥关键作用。然而,现有的购物查询建议方法存在任务泛化性差和语义理解有限的问题。本文提出了一种全面的购物查询建议框架,有效解决了现有方法的不足。我们提出的框架利用生成式语言模型和细粒度偏好对齐来增强语义理解,并提高生成查询的质量。我们的主要贡献包括引入个性化提示集以应对多样化的查询建议任务,整合交互行为时间以捕捉用户的查询兴趣,以及利用强化学习技术来对齐用户偏好。实验结果显示在不同场景下均有改进。我们的代码可在 https://github.com/1170300319/CIKM2024_SOUP 获取。 | code | 0 |
LayerPlexRank: Exploring Node Centrality and Layer Influence through Algebraic Connectivity in Multiplex Networks | Hao Ren, Jiaojiao Jiang | | As the calculation of centrality in complex networks becomes increasingly vital across technological, biological, and social systems, precise and scalable ranking methods are essential for understanding these networks. This paper introduces LayerPlexRank, an algorithm that simultaneously assesses node centrality and layer influence in multiplex networks using algebraic connectivity metrics. This method enhances the robustness of the ranking algorithm by effectively assessing structural changes across layers using random walk, considering the overall connectivity of the graph. We substantiate the utility of LayerPlexRank with theoretical analyses and empirical validations on varied real-world datasets, contrasting it with established centrality measures. | 随着复杂网络中中心性计算在技术、生物和社会系统中的日益重要,精确且可扩展的排序方法对于理解这些网络至关重要。本文介绍了LayerPlexRank算法,该算法利用代数连通性指标在多层网络中同时评估节点中心性和层级影响。通过使用随机游走有效评估跨层的结构变化,并考虑图的整体连通性,这种方法增强了排序算法的鲁棒性。我们通过理论分析和在不同真实世界数据集上的实证验证,证明了LayerPlexRank的实用性,并与现有的中心性度量方法进行了对比。 | code | 0 |
Osprey 🪶: A Reference Framework for Online Grooming Detection via Neural Models and Conversation Features | Hamed Waezi, Reza Barzegar, Hossein Fani | School of Computer Science, University of Windsor, Windsor, ON, Canada | Online grooming is the process of an adult initiating a sexual relationship with a minor through online conversation platforms. While neural models are developed to detect such incidents, their practical implications in real-world settings remain moot because of their closed, irreproducible, and poor evaluation methodologies under the sparse distribution of grooming conversations in the training datasets, such as neglecting recall in favor of precision. Furthermore, proposed models overlook characteristic features of grooming in online conversations, including the number of participants, message exchange patterns, and temporal signals, such as the elapsed times between messages. In this paper, we first contribute Osprey, an open-source library to support a standard pipeline and experimental details, incorporating canonical neural models and a variety of vector representation learning for conversations while accommodating new models and training datasets. Further, we incorporate conversation features into the models to improve recall while maintaining precision. Our experiments across neural baselines and vector representations of conversations demonstrated that recurrent neural models, particularly GRU, on the sequence of pretrained transformer-based embeddings of messages in a conversation along with conversation features obtain state-of-the-art performance, winning the best recall with competitive precision. Osprey is available at https://github.com/fani-lab/Osprey/tree/cikm24. 
| 在线“狩猎”是指成年人通过在线聊天平台与未成年人建立性关系的过程。尽管已开发出神经模型来检测此类事件,但由于训练数据集中“狩猎”对话的分布稀疏,这些模型在实际应用中的效果仍不明确,其封闭、不可复现且评估方法不当的问题削弱了召回率而偏重精度。此外,现有模型忽视了在线对话中“狩猎”行为的特征,如参与者数量、消息交换模式及时间信号(如消息之间的时间间隔)。本文首先贡献了Osprey,一个开源库,支持标准流程和实验细节,整合了经典的神经模型和多种对话向量表示学习方法,并兼容新模型和训练数据集。进一步,我们将对话特征融入模型,以提高召回率的同时保持精度。我们的实验表明,在神经基线和对话向量表示中,基于预训练变换器嵌入的消息序列及对话特征的循环神经模型,特别是GRU,取得了最先进的性能,获得了最佳召回率并保持了竞争力的精度。Osprey可通过以下链接获取:https://github.com/fani-lab/Osprey/tree/cikm24。 | code | 0 |
The Effect of Icon Semantic Distance on Preschool Children's Information Search: Evidence from an Eye-Tracking Study | Jiaqi Yang, Pianran Wang | Department of Information Management, Peking University, Beijing, China | Icons are frequently employed in children-oriented information systems due to children's limited literacy. However, the inherent semantic distances of icons, which may influence their affordance to children, are often overlooked in the development of such systems and related research. In this study, we apply semantic distance to measure the explicitness of icons in children-oriented book search, utilizing self-developed icons tailored for indexing picture books. We first gathered data from children through questionnaires to assess the perceived semantic distance of each icon. Subsequently, we conducted eye-tracking experiments with 50 preschool children, measuring their search accuracy, response time, and eye movement patterns while using icons to locate specific picture books. Our findings indicate that preschool children find it easier to search with icons of close semantic distance and with single icons. Additionally, the ability to use icons with distant semantic distances and combination icons significantly improves with age. These findings may contribute to the development of more effective and children-friendly information search systems. | 由于儿童的识字能力有限,图标经常被用于面向儿童的信息系统中。然而,图标固有的语义距离,这可能会影响它们对儿童的可操作性,在开发此类系统和相关研究中往往被忽视。在本研究中,我们应用语义距离来衡量面向儿童的图书搜索中图标的显性程度,使用为索引图画书而定制的自开发图标。我们首先通过问卷从儿童那里收集数据,以评估每个图标的感知语义距离。随后,我们对50名学前儿童进行了眼动追踪实验,测量他们在使用图标定位特定图画书时的搜索准确性、反应时间和眼动模式。我们的研究结果表明,学前儿童更容易使用语义距离较近的图标和单个图标进行搜索。此外,随着年龄的增长,使用语义距离较远的图标和组合图标的能力显著提高。这些发现可能有助于开发更有效且更适合儿童的信息搜索系统。 | code | 0 |
Contrastive Disentangled Representation Learning for Debiasing Recommendation with Uniform Data | Xinxin Yang, Zhen Liu, Xiaoman Lu, Yafan Yuan, Sibo Lu, Yibo Gao | Beijing Jiaotong University, Beijing, China | In recommender systems, learning high-quality user and item representations is crucial for predicting user preferences. However, there are various confounding factors in observational data, resulting in data bias, which hinders the learning of user and item representations. Recent work proposed to use uniform data to alleviate the bias problem. However, these methods fail to learn pure representations for unbiased prediction, which are not affected by confounding factors. This paper introduces a novel disentangled framework, named CDLRec, for learning unbiased representations, leveraging uniform data as a supervisory signal for disentangling. Furthermore, to address the scarcity of uniform data, contrastive learning is used to implement disentanglement by providing augmented samples. Specifically, two contrastive strategies are designed based on different ways of sampling positives and negatives. Extensive experiments are conducted over two real-world datasets and the results demonstrate the superior performance of our proposed method. | 在推荐系统中,学习高质量的用户和物品表示对于预测用户偏好至关重要。然而,观察数据中存在多种混淆因素,导致数据偏差,这阻碍了用户和物品表示的学习。最近的工作提出使用均匀数据来缓解偏差问题。然而,这些方法未能学习到不受混淆因素影响的纯表示,从而无法进行无偏预测。本文引入了一种名为CDLRec的新型解耦框架,利用均匀数据作为解耦的监督信号来学习无偏表示。此外,为了解决均匀数据稀缺的问题,本文利用对比学习通过提供增强样本来实现解耦。具体来说,基于正负样本的不同采样方式设计了两种对比策略。在两个真实世界数据集上进行了大量实验,结果表明我们提出的方法具有优越的性能。 | code | 0 |
Dual-level Intents Modeling for Knowledge-aware Recommendation | Jin Zeng, Nan Wang, Jinbao Li | Qilu University of Technology, Jinan, China; Heilongjiang University, Harbin, China | Previous user-item interaction graphs have typically focused on simple interaction between users and items, failing to identify the important effects of user's intents in the interaction. While recent studies have ventured into exploring intent relationships between users and items for modeling, they predominantly emphasize user preferences manifesting in the interaction, overlooking knowledge-driven insight, thereby limiting the interpretability of intent. In this paper, we utilize the rich interpretable knowledge information in the knowledge graph to design a novel dual-level intents modeling framework called DIM. DIM aims to mine user's true intents, which usually include user popularity preference and personalized preference. Therefore, we extract both the popular and personalized user preferences from attribute tuples within the knowledge graph at the global and local levels, respectively. Experimental results on three datasets demonstrate the superiority of DIM over various state-of-the-art approaches. | 以往的用户-项目交互图主要关注用户与项目之间的简单交互,未能识别出用户意图在交互中的重要影响。虽然近期研究开始探索用户与项目之间的意图关系以进行建模,但它们主要强调交互中表现出的用户偏好,忽视了基于知识的洞察,从而限制了意图的可解释性。本文利用知识图谱中丰富的可解释知识信息,设计了一种新颖的双层次意图建模框架,称为DIM。DIM旨在挖掘用户的真实意图,这些意图通常包括用户的热门偏好和个性化偏好。因此,我们从知识图谱的属性元组中分别提取了全局和局部层面的热门和个性化用户偏好。在三个数据集上的实验结果表明,DIM优于各种最先进的方法。 | code | 0 |
Distilling Knowledge Based on Curriculum Learning for Temporal Knowledge Graph Embeddings | Bin Zhang, Jiayin Li, Yuanfei Dai | Nanjing Tech University, Nanjing, China; Fujian Normal University, Fuzhou, China | Lower-dimensional temporal knowledge graph embedding (TKGE) models are crucial for practical applications and resource-limited scenarios, although existing models employ higher-dimensional embeddings in training. In this paper, we propose a new framework for distilling TKGE models via an easy to hard pedagogical principle. The framework utilizes a learnable curriculum temperature (CT) module to optimize and guide the knowledge distillation process dynamically, ensuring that the entire procedure adheres to the principle. It also employs a self-adaptive attention mechanism to endeavor to achieve efficient transfer of knowledge from higher-dimensional models to lower-dimensional ones. Evaluation on various TKGE models and datasets demonstrates the proposed approach significantly reduces the model's parameters without noticeably affecting its performance. | 低维时间知识图谱嵌入(TKGE)模型对于实际应用和资源有限场景至关重要,尽管现有模型在训练中采用高维嵌入。本文提出了一种基于由易到难教学原则的TKGE模型蒸馏新框架。该框架利用可学习的课程温度(CT)模块动态优化并指导知识蒸馏过程,确保整个过程遵循该原则。同时,采用自适应注意力机制,努力实现从高维模型到低维模型的知识高效传递。在多种TKGE模型和数据集上的评估表明,所提出的方法显著减少了模型参数,且对性能影响甚微。 | code | 0 |
Feedback Reciprocal Graph Collaborative Filtering | Weijun Chen, Yuanchen Bei, Qijie Shen, Hao Chen, Xiao Huang, Feiran Huang | | Collaborative filtering on user-item interaction graphs has achieved success in the industrial recommendation. However, recommending users' truly fascinated items poses a seesaw dilemma for collaborative filtering models learned from the interaction graph. On the one hand, not all items that users interact with are equally appealing. Some items are genuinely fascinating to users, while others are unfascinated. Training graph collaborative filtering models in the absence of distinction between them can lead to the recommendation of unfascinating items to users. On the other hand, disregarding the interacted but unfascinating items during graph collaborative filtering will result in an incomplete representation of users' interaction intent, leading to a decline in the model's recommendation capabilities. To address this seesaw problem, we propose Feedback Reciprocal Graph Collaborative Filtering (FRGCF), which emphasizes the recommendation of fascinating items while attenuating the recommendation of unfascinating items. Specifically, FRGCF first partitions the entire interaction graph into the Interacted Fascinated (I F) graph and the Interacted Unfascinated (I U) graph based on the user feedback. Then, FRGCF introduces separate collaborative filtering on the I F graph and the I U graph with feedback-reciprocal contrastive learning and macro-level feedback modeling. This enables the I F graph recommender to learn multi-grained interaction characteristics from the I U graph without being misdirected by it. Extensive experiments on four benchmark datasets and a billion-scale industrial dataset demonstrate that FRGCF improves the performance by recommending more fascinating items and fewer unfascinating items. Besides, online A/B tests on Taobao's recommender system verify the superiority of FRGCF. | 在用户-物品交互图上的协同过滤在工业推荐中取得了成功。然而,从交互图中学习的协同过滤模型在推荐用户真正感兴趣的物品时面临着跷跷板难题。一方面,用户交互的所有物品并非同样吸引人。有些物品确实能引起用户的兴趣,而有些则不能。在没有区分这些物品的情况下训练图协同过滤模型可能会导致向用户推荐不吸引人的物品。另一方面,在图协同过滤过程中忽略那些虽被交互但不吸引人的物品会导致用户交互意图的表示不完整,从而降低模型的推荐能力。为了解决这一跷跷板问题,我们提出了反馈互惠图协同过滤(FRGCF),该方法强调推荐吸引人的物品,同时减弱对不吸引人物品的推荐。具体来说,FRGCF首先根据用户反馈将整个交互图划分为交互且吸引人(I F)图和交互但不吸引人(I U)图。然后,FRGCF在I F图和I U图上分别引入协同过滤,并结合反馈互惠对比学习和宏观层次的反馈建模。这使得I F图推荐器能够从I U图中学习多粒度的交互特征,而不会被其误导。在四个基准数据集和一个亿级工业数据集上的广泛实验表明,FRGCF通过推荐更多吸引人的物品和更少不吸引人的物品,提升了性能。此外,在淘宝推荐系统上的在线A/B测试验证了FRGCF的优越性。 | code | 0 |
DIFN: A Dual Intention-aware Network for Repurchase Recommendation with Hierarchical Spatio-temporal Fusion | Li Lin, Xin Xu, Hai Wang, Tian He, Desheng Zhang, Shuai Wang | JD Logistics, Beijing, Beijing, China; Southeast University, Nanjing, Jiangsu, China; Rutgers University, Piscataway, New Jersey, USA | Recommendation systems play a crucial role in both industrial applications and research fields, which aim to understand user preferences and intentions to provide personalized services. Compared to conventional recommendations, repurchase recommendations aim to suggest suitable products to users that they used to buy based on their intention evolution. Existing research on product recommendation can mainly be divided into behavior sequence-based methods and graph-based methods. Although these methods represent user interests and preference features effectively, they still fail to model repurchase behaviors because (i) the environment causing repurchase intention change is neglected and (ii) the lack of feedback after purchasing makes it difficult to learn the impacts of diverse behaviors. To comprehensively consider these limitations, we design a Dual Intention-aware Fusion Network framework (DIFN) to understand the effects of environment and after-purchasing feedback on users' intentions. Firstly, a hierarchical graph-based multi-level relational attention module is designed to effectively extract basic user features and spatial features from complex environmental information. Then, we introduce a behavior intention module and a usage intention module for different types of feedback data. Finally, we propose a dual intention fusion network that effectively fuses user basic features with spatial attributes and user intention features with temporal attributes for recommendation. Comprehensive evaluations on real-world datasets show that our method exceeds state-of-the-art baselines, with an average improvement of 8.2% across different metrics. | 推荐系统在工业应用和研究领域中扮演着至关重要的角色,其目标是通过理解用户偏好和意图来提供个性化服务。与传统推荐相比,复购推荐旨在根据用户意图的演变,向其推荐他们曾经购买过的合适产品。现有的产品推荐研究主要可分为基于行为序列的方法和基于图的方法。尽管这些方法能够有效表示用户的兴趣和偏好特征,但它们未能对复购行为进行建模,原因有二:(i) 忽视了导致复购意图变化的环境因素;(ii) 购买后缺乏反馈,使得难以学习多样行为的影响。为了全面考虑这些限制,我们设计了一个双意图感知融合网络框架(DIFN),以理解环境和购买后反馈对用户意图的影响。首先,我们设计了一个基于层次图的多级关系注意力模块,以有效从复杂的环境信息中提取基本用户特征和空间特征。接着,我们引入了行为意图模块和使用意图模块,分别处理不同类型的反馈数据。最后,我们提出了一种双意图融合网络,该网络能够有效融合用户基本特征与空间属性以及用户意图特征与时间属性,用于推荐。在真实世界数据集上的综合评估表明,我们的方法超越了最先进的基线方法,在不同指标上平均提升了8.2%。 | code | 0 |
Building Natural Language Interface for Product Search | Vijit Malik, Vinayak Puranik, Anirban Majumder, Vivek Sembium | Amazon, Bangalore, India | Automatic extraction of attribute preferences from search queries is a critical problem in providing accurate product recommendations to customers. The task becomes even more challenging in cold-start settings where we do not have any supervised/labelled data available to train ML models. In this work, we implement a novel dataset generation pipeline (LLM-API) that leverages Large Language Models (LLMs), search logs and proprietary product information data from an ecommerce website to create a high quality dataset. Our proposed pipeline of LLM-API is robust as it can generalize to any product category with minimal changes in the LLM prompts. For the problem of converting product search queries to API calls we propose a multi-task schema generator model which we train on our generated dataset. Experiments on an internal test set reveal that our proposed model achieves an improvement of ≈9.6% and ≈5% in Exact Match and Micro-F1 respectively, over competitive baselines. Benchmarking our approach on a public test set of search queries further reveals a gain of ≈8.6% and ≈10.5% in Exact Match and Micro-F1. We further demonstrate that our approach outperforms a state-of-the-art LLM (Claude) applied on our task using few-shot prompting and CoT reasoning, while at the same time, achieves improvement in inference latency. | 从搜索查询中自动提取属性偏好是向客户提供准确产品推荐的关键问题。在冷启动场景下,这一任务变得更加具有挑战性,因为我们没有任何监督/标注数据来训练机器学习模型。在这项工作中,我们实现了一种新颖的数据集生成管道(LLM-API),该管道利用大型语言模型(LLMs)、搜索日志以及来自电子商务网站的专有产品信息数据,来创建高质量的数据集。我们提出的LLM-API管道具有鲁棒性,因为它可以推广到任何产品类别,只需对LLM提示进行最小的更改。对于将产品搜索查询转换为API调用的问题,我们提出了一种多任务模式生成器模型,并在我们生成的数据集上进行训练。在内部测试集上的实验表明,我们提出的模型在精确匹配和微观F1得分上分别比竞争基线提高了约9.6%和5%。在公共测试集上的基准测试进一步显示,精确匹配和微观F1得分分别提高了约8.6%和10.5%。我们进一步证明,我们的方法在少样本提示和CoT推理下,优于应用于我们任务的最先进的LLM(Claude),同时在推理延迟方面也实现了改进。 | code | 0 |
EASE: Learning Lightweight Semantic Feature Adapters from Large Language Models for CTR Prediction | Zexuan Qiu, Jieming Zhu, Yankai Chen, Guohao Cai, Weiwen Liu, Zhenhua Dong, Irwin King | The Chinese University of Hong Kong, Hong Kong, China; Huawei Noah's Ark Lab, Shenzhen, China; Cornell University, Ithaca, USA | Recent studies highlight the potential of large language models (LLMs) to enhance content integration in recommender systems by leveraging their semantic understanding capabilities. However, directly incorporating LLMs into an online inference pipeline significantly increases computation costs for large-scale deployment, posing a practical challenge in balancing their benefits and costs. In this work, we propose the EASE framework, which enriches and aligns semantic feature embeddings using LLMs during the training phase while establishing a lightweight inference pipeline that does not directly involve LLMs. Specifically, we train a semantic adapter to align item features with LLMs and simultaneously enrich semantic embeddings through reconstruction tasks from LLMs. During inference, we retain only the item feature encoder and lightweight semantic adapter, thereby eliminating the computation overhead of resource-intensive LLMs. Our EASE framework is flexible, supporting not only text and visual features but also other pre-processed embedding features. Extensive experiments on both public and industrial datasets demonstrate that enriching semantic feature embeddings with our EASE framework yields consistent improvements in downstream click-through rate prediction tasks. | 近期研究强调了大型语言模型(LLMs)通过利用其语义理解能力来增强推荐系统中内容整合的潜力。然而,直接将LLMs整合到在线推理管道中显著增加了大规模部署的计算成本,这使得在平衡其收益和成本方面面临实际挑战。在此工作中,我们提出了EASE框架,该框架在训练阶段利用LLMs丰富和校准语义特征嵌入,同时建立一个不直接涉及LLMs的轻量级推理管道。具体而言,我们训练一个语义适配器来校准项目特征与LLMs,并通过从LLMs进行重建任务来同时丰富语义嵌入。在推理阶段,我们仅保留项目特征编码器和轻量级语义适配器,从而消除了资源密集型LLMs的计算开销。我们的EASE框架具有灵活性,不仅支持文本和视觉特征,还支持其他预处理的嵌入特征。在公共和工业数据集上的广泛实验表明,通过我们的EASE框架丰富语义特征嵌入在下游点击率预测任务中持续提升了性能。 | code | 0 |
Mitigating Extreme Cold Start in Graph-based RecSys through Re-ranking | Alessandro Sbandi, Federico Siciliano, Fabrizio Silvestri | TIM S.p.A. & Sapienza University of Rome, Rome, Italy; Sapienza University of Rome, Rome, Italy | Recommender systems based on Graph Neural Networks (GNN) have become the state-of-the-art approach in recommendation, but they struggle in extreme cold-start settings, where most users or items lack interaction data. This paper proposes a novel framework to address this challenge in four steps: (i) a propensity model to predict item purchase behaviour, with associated explainability to identify the most relevant features, (ii) a link augmentation module to connect users based on previously obtained similarities, (iii) a GNN-based link prediction step on the obtained dense graph and (iv) a final re-ranking stage to increase diversity in predictions leveraging users embeddings. By exploiting the enriched graph structure, the framework generates embeddings for cold-start users and items, enabling diverse recommendations, containing long tail and unsold items, for both established and new users. We validate the framework's effectiveness on real-world industrial data from TIM S.p.A. | 基于图神经网络(GNN)的推荐系统已成为推荐领域的最先进方法,但在极端冷启动情况下表现不佳,此时大多数用户或物品缺乏交互数据。本文提出了一种新颖的框架,通过四个步骤来解决这一挑战:(i)一个倾向模型用于预测物品购买行为,并提供相关解释性以识别最相关的特征;(ii)一个链接增强模块,基于先前获得的相似性连接用户;(iii)在获得的密集图上进行基于GNN的链接预测步骤;(iv)一个最终的重新排序阶段,利用用户嵌入来增加预测的多样性。通过利用丰富的图结构,该框架为冷启动用户和物品生成嵌入,从而能够为既有用户和新用户提供包含长尾和未售物品的多样化推荐。我们在TIM S.p.A.的真实工业数据上验证了该框架的有效性。 | code | 0 |
Sequence-level Semantic Representation Fusion for Recommender Systems | Lanling Xu, Zhen Tian, Bingqian Li, Junjie Zhang, Daoyuan Wang, Hongyu Wang, Jinpeng Wang, Sheng Chen, Wayne Xin Zhao | | With the rapid development of recommender systems, there is increasing side information that can be employed to improve the recommendation performance. Specifically, we focus on the utilization of the associated textual data of items (e.g., product title) and study how text features can be effectively fused with ID features in sequential recommendation. However, there exist distinct data characteristics for the two kinds of item features, making a direct fusion method (e.g., adding text and ID embeddings as item representation) less effective. To address this issue, we propose a novel Text-ID semantic fusion approach for sequential Recommendation, namely . The core idea of our approach is to conduct a sequence-level semantic fusion approach by better integrating global contexts. The key strategy lies in that we transform the text embeddings and ID embeddings by Fourier Transform from time domain to frequency domain. In the frequency domain, the global sequential characteristics of the original sequences are inherently aggregated into the transformed representations, so that we can employ simple multiplicative operations to effectively fuse the two kinds of item features. Our fusion approach can be proved to have the same effects of contextual convolution, thereby achieving sequence-level semantic fusion. In order to further improve the fusion performance, we propose to enhance the discriminability of the text embeddings from the text encoder, by adaptively injecting positional information via a mixture-of-experts (MoE) modulation method. Our implementation is available at this repository: . | 随着推荐系统的快速发展,越来越多的辅助信息可用于提升推荐性能。特别是,我们专注于利用项目的关联文本数据(如产品标题),并研究如何在序列推荐中有效地将文本特征与ID特征融合。然而,这两类项目特征存在显著的数据特性,使得直接的融合方法(如将文本和ID嵌入作为项目表示相加)效果不佳。为了解决这一问题,我们提出了一种新的文本-ID语义融合方法,用于序列推荐,即。我们的方法的核心思想是通过更好地整合全局上下文,进行序列级别的语义融合。关键策略在于,我们通过傅里叶变换将文本嵌入和ID嵌入从时域转换到频域。在频域中,原始序列的全局序列特性自然地聚合到转换后的表示中,从而我们可以使用简单的乘法操作来有效融合这两类项目特征。我们的融合方法可以证明具有与上下文卷积相同的效果,从而实现序列级别的语义融合。为了进一步提高融合性能,我们提出通过混合专家(MoE)调制方法自适应地注入位置信息,以增强文本编码器中文本嵌入的区分性。我们的实现代码可在以下仓库中获取:。 | code | 0 |
Effective Utilization of Large-scale Unobserved Data in Recommendation Systems | Feng Zhang, Yulin Xu, Hongjie Chen, Xu Yuan, Qingwen Liu, Yuning Jiang | Taotian Group, Beijing, China; Taotian Group, Hangzhou, China | Ranking models play an important role in industrial recommendation systems. However, most ranking models are trained only with the observed items but used to retrieve all items in the entire space, which may suffer from the sample selection bias and the exposure bias. Inspired by the entire space learning framework, we carry out detailed data analyses on large-scale unobserved items and find that they contain quite a few "potentially-positive" samples. In this paper, we propose an "Extract and Transfer" (EAT) framework, utilizing quantities of unobserved items and other domains' data to construct more training data for ranking models. Specifically, we first extract "potentially-positive" samples and negative ones according to their ranking scores from the unobserved data, and then design an Entire Space Transfer Learning (ESTL) model to transfer knowledge between observed and unobserved samples, instead of directly mixing them together to avoid negative transfer. Experiments on production data collected from Taobao validate the proposed method's superiority. Besides, we have deployed EAT on the Taobao recommendation system, obtaining 6.22% IPV (Item Page View) and 3.77% CTR improvement. The code is available at https://github.com/Recommender1/EAT.git1. | 排名模型在工业推荐系统中扮演着重要角色。然而,大多数排名模型仅使用观察到的项目进行训练,但在整个项目空间中检索所有项目时使用,这可能导致样本选择偏差和曝光偏差。受整体空间学习框架的启发,我们对大规模未观察到的项目进行了详细的数据分析,发现其中包含相当数量的“潜在正面”样本。本文提出了一种“提取与迁移”(EAT)框架,利用大量未观察到的项目和其他领域的数据来构建更多用于排名模型的训练数据。具体而言,我们首先根据排名分数从未观察到的数据中提取“潜在正面”样本和负样本,然后设计了一个整体空间迁移学习(ESTL)模型,用于在观察到的样本和未观察到的样本之间进行知识迁移,而不是直接将它们混合在一起,以避免负迁移。在从淘宝收集的生产数据上的实验验证了所提出方法的优越性。此外,我们已经在淘宝推荐系统中部署了EAT,获得了6.22%的IPV(商品页面浏览量)和3.77%的CTR提升。代码可在https://github.com/Recommender1/EAT.git1获取。 | code | 0 |
ECRT: Flexible Sequence Enhancement Framework for Cross-Domain Information Reuse in Recommendation | Weiqiang Zhao, ZiYuan Wu, Yatao Yang, Lifeng Hua, Hao Xiong | Alibaba Group, Hangzhou, China; Nanjing University, Nanjing, China | Chronological sequence of user-item interactions is a key feature in recommender systems, as it reveals the transition of users' interests as well as contextual relevance between nearby items. In modern e-commerce applications, various scenarios are usually integrated in one entry page, and the behavior sequence tends to be a combination of user-item interactions across multiple domains, such as on-sale goods, search queries, short videos, livestreams, etc. However, traditional domain-specified recommendations only deal with the interactions within the target domain, which neglects the overall profiles depicted by the behavior across the entire application, leading to overestimation of retargeted items as well as underestimation of unseen ones. So it is crucial to leverage cross-domain data from prominent domains to better supplement user behavior sequences for our targets. To tackle this problem, we propose the Enhanced Cross-domain Relation Transfer (ECRT) framework to make flexible sequence augmentation with the assistance of cross-domain information from other domains. We first employ similarity-based retrieval to obtain relevant sequence information from neighbor domains and build a heterogeneous graph to represent the complex behavior of users. Then we use innovative mining approaches to sample relevant information from the graph to supplement users' behavior sequences, and a hierarchical gated attention structure is used to aggregate this augmented information. We apply our proposed method in the livestream recommendation of Taobao channel pages, and the final experimental results indicate that our method demonstrates excellent performance in both online and offline environments, with an excess of up to 3.6% in main online indicators beyond past SOTA methods. | 用户与物品交互的时间序列是推荐系统中的一个关键特征,因为它揭示了用户兴趣的转变以及相邻物品之间的上下文相关性。在现代电子商务应用中,各种场景通常集成在一个入口页面上,行为序列往往是跨多个领域(如在售商品、搜索查询、短视频、直播等)的用户-物品交互的组合。然而,传统的特定领域推荐系统仅处理目标领域内的交互,忽略了整个应用中行为所描绘的整体轮廓,导致对再定位物品的高估和对未见物品的低估。因此,利用来自显著领域的跨领域数据来更好地补充用户行为序列以实现我们的目标至关重要。为了解决这个问题,我们提出了增强的跨领域关系迁移(ECRT)框架,通过借助其他领域的跨领域信息来实现灵活的序列增强。我们首先采用基于相似性的检索方法从邻近领域获取相关的序列信息,并构建一个异构图来表示用户的复杂行为。然后,我们使用创新的挖掘方法从图中采样相关信息来补充用户的行为序列,并使用分层门控注意力结构来聚合这些增强的信息。我们将所提出的方法应用于淘宝频道页面的直播推荐,最终的实验结果表明,我们的方法在在线和离线环境中均表现出色,主要在线指标比过去的SOTA方法高出多达3.6%。 | code | 0 |
Collaborative Scope: Encountering the Substitution Effect within the Delivery Scope in Online Food Delivery Platform | Yida Zhu, Liying Chen, Chen Zheng, Jia Shi, Daping Xiong, Zewen Huang, Shihao Ren, Shuiping Chen, Jinghua Hao, Renqing He | Meituan, Beijing, China | Online food delivery (OFD) services, known for offering varied meals at home, have gained global popularity. Meituan has recently ventured into the affordable market segment with its "Pinhaofan" service, highlighting the imperative of delivery efficiency. To achieve this, delivery scope is regarded as one of the most effective operational tools. The delivery scope of a merchant refers to the geographical area where they can serve customers. Current methods for generating delivery scopes primarily focus on optimizing a single merchant's efficiency or rely on manual delineation from the merchant's perspective, neglecting the merchant substitution effect and potentially resulting in order loss. In this paper, we propose a novel method, named Collaborative Scope, which views the delivery scope as an assortment optimization problem, considering the substitution effect between merchants from the user's perspective. We introduce the discrete choice model of econometrics and use the Enhanced Multinomial Logit Model to predict user conversion rates in the merchant list. Next, we formulate the delivery scope optimization problem of multiple merchants as a mixed integer programming problem. The city-wide solution of this problem, owing to the large-scale combinatorial optimization triggered by high-dimensional decision variables, incurs high computational complexity. To address this, we propose an approximate solution to the original problem through a first-order Taylor series approximation, which significantly reduces the computation complexity at the expense of a slight decrease in solution accuracy. Offline and online A/B test results indicate that, compared to existing methods, Collaborative Scope significantly improves delivery efficiency by reducing delivery difficulty without hurting order volume. Notably, Collaborative Scope is currently deployed on "Pinhaofan", serving tens of millions of online users. | 在线食品配送(OFD)服务因其能够将多样化的餐食送至家中而风靡全球。美团最近通过其“拼好饭”服务进军了平价市场领域,突显了配送效率的重要性。为此,配送范围被视为最有效的运营工具之一。商家的配送范围指的是他们能够服务顾客的地理区域。目前生成配送范围的方法主要集中在优化单个商家的效率,或依赖商家视角的手动划定,忽视了商家之间的替代效应,可能导致订单损失。本文提出了一种名为“协同范围”的新方法,将配送范围视为一个组合优化问题,从用户的角度考虑商家之间的替代效应。我们引入了计量经济学的离散选择模型,并使用增强的多项式逻辑回归模型来预测用户在商家列表中的转化率。接着,我们将多个商家的配送范围优化问题形式化为一个混合整数规划问题。由于高维决策变量引发的大规模组合优化,该问题的城市级解决方案计算复杂度高。为此,我们通过一阶泰勒级数近似提出了原问题的近似解,显著降低了计算复杂度,但略微降低了解决方案的精度。离线和在线A/B测试结果表明,与现有方法相比,协同范围显著提高了配送效率,减少了配送难度,且不影响订单量。值得注意的是,协同范围目前已在“拼好饭”上部署,服务于数千万在线用户。 | code | 0 |
EDGE: A Conversational Interface driven by Large Language Models for Educational Knowledge Graphs Exploration | Neda Afreen, Giacomo Balloccu, Ludovico Boratto, Gianni Fenu, Francesca Maridina Malloci, Mirko Marras, Andrea Giovanni Martis | University of Cagliari, Cagliari, Italy | As education adopts digital platforms, the vast amount of information from various sources, such as learning management systems and learning object repositories, presents challenges in navigation and elaboration. Traditional interfaces involve a steep learning curve, limited user accessibility, and lack flexibility. Language models alone cannot address these issues as they do not have access to structured information specific to the educational organization. In this paper, we propose EDGE (EDucational knowledge Graph Explorer), a natural language interface that uses knowledge graphs to organize educational information. EDGE translates natural language requests into queries and converts the results back into natural language responses. We show EDGE's versatility using knowledge graphs built from public datasets, providing example interactions of different stakeholders. Demo video: https://u.garr.it/eYq63. | 随着教育转向数字化平台,来自学习管理系统、学习对象资源库等各种来源的海量信息在导航和详细阐述方面带来了挑战。传统界面存在学习曲线陡峭、用户可访问性有限以及缺乏灵活性的问题。仅依赖语言模型无法解决这些问题,因为它们无法访问教育机构特有的结构化信息。本文提出了EDGE(教育知识图谱探索器),这是一个利用知识图谱组织教育信息的自然语言界面。EDGE将自然语言请求转换为查询,并将结果转换回自然语言响应。我们展示了EDGE的多样性,使用了从公共数据集构建的知识图谱,并提供了不同利益相关者的示例交互。演示视频链接:https://u.garr.it/eYq63。 | code | 0 |
A Supervised BERT Model for Identifying Core-Intent Bearing Phrases in e-Commerce Queries | Abhishek Sudhakar Deshmukh, Arnab Dutta | eBay GmbH, Dreilinden, Germany; eBay GmbH, Aachen, Germany | In the realm of e-Commerce, a fundamental problem is accurate interpretation of users' core intent. The intent is often subtly expressed implicitly or stated explicitly with the usage of verbose tokens or key phrases in a user query. In this work, we focus on the latter class of problems where we identify a subset of query tokens which are the primary intent bearing phrases that convey explicit intents. We did not solve this as an intent detection problem but rather an immutable component detection problem because we believe that discovering the immutable phrases in a query entails that those are the intent bearing phrases. Furthermore, identifying a certain set of query tokens as immutable ensures better downstream processing in terms of unprecedented token handling, query category detection or query rewrites. We have developed a BERT-based supervised model which can identify core-intent tokens, thereby improving F1 score over the baseline by over 35%. Furthermore, we integrated our proposed approach for a query recovery strategy which produces approximately 11.9% improvement in offline relevance scores compared to the production model. | 在电子商务领域,一个基本问题是如何准确解读用户的中心意图。用户的意图通常通过隐晦的隐式表达或使用冗长的标记或关键词在用户查询中明确表达。在这项工作中,我们关注的是后一类问题,即识别查询标记中的一部分,这些标记是主要承载意图的短语,传达明确的意图。我们没有将这个问题视为意图检测问题,而是视为不可变组件检测问题,因为我们认为,发现查询中的不可变短语意味着这些短语是承载意图的短语。此外,将某些查询标记识别为不可变,确保了在处理前所未见的标记、查询类别检测或查询重写方面的下游处理效果更好。我们开发了一个基于BERT的有监督学习模型,该模型能够识别核心意图标记,从而将F1得分比基线提高了35%以上。此外,我们将提出的方法集成到一个查询恢复策略中,与生产模型相比,离线相关性评分提高了约11.9%。 | code | 0 |
Traversing the Journey of Data and AI: From Convergence to Translation | Nitesh V. Chawla | University of Notre Dame, Notre Dame, IN, USA | In this talk, I will present our work on fundamental advances in AI, inspired by interdisciplinary problem statements and societal challenges. I will also highlight our innovation journey that encapsulates both the opportunities and challenges inherent in harnessing the full potential of AI for societal benefit, in particular highlighting the realization of societal impact through translational work and partnerships. Additionally, I will highlight our educational endeavors, emphasizing experiential learning and interdisciplinary approaches as fundamental elements of the student experience. | 在本次演讲中,我将介绍我们在人工智能基础性进展方面的工作,这些工作受到跨学科问题陈述和社会挑战的启发。我还将重点介绍我们的创新历程,这一历程既体现了利用人工智能全部潜力造福社会的机遇,也揭示了其中的挑战,特别强调了通过转化工作和合作伙伴关系实现社会影响的重要性。此外,我将强调我们的教育努力,突出体验式学习和跨学科方法作为学生体验的基本要素。 | code | 0 |
Is the Search Engine of the Future a Chatbot? | Suzan Verberne | Leiden Institute of Advanced Computer Science, Leiden University, Leiden, Netherlands | The rise of Large Language Models (LLMs) has had a huge impact on the interaction of users with information. Many people argue that the age of search engines as we know them has ended, while other people argue that retrieval technology is more relevant than ever before, because we need information to be grounded in sources. In my talk I will argue that both statements are true. I will discuss the multiple relations between LLMs and Information Retrieval: how can they strengthen each other, what are the challenges we face, and what directions should we go in our research? | 大型语言模型(LLMs)的崛起对用户与信息的交互方式产生了巨大影响。许多人认为,我们所熟知的搜索引擎时代已经结束,而另一些人则认为,检索技术比以往任何时候都更加重要,因为我们需要的不仅仅是信息,而是基于来源的信息。在我的演讲中,我将主张这两种观点都是正确的。我将探讨LLMs与信息检索之间的多重关系:它们如何相互增强,我们面临的挑战是什么,以及我们的研究应该朝着哪些方向发展? | code | 0 |
Navigating the Landscape of Reproducible Research: A Predictive Modeling Approach | Akhil Pandey Akella, Sagnik Ray Choudhury, David Koop, Hamed Alhoori | Northern Illinois University & Northwestern University, Dekalb, IL, USA; University of North Texas, Denton, TX, USA; Northern Illinois University, Dekalb, IL, USA | The reproducibility of scientific articles is central to the advancement of science. Despite this importance, evaluating reproducibility remains challenging due to the scarcity of ground truth data. Predictive models can address this limitation by streamlining the tedious evaluation process. Typically, a paper's reproducibility is inferred based on the availability of artifacts such as code, data, or supplemental information, often without extensive empirical investigation. To address these issues, we utilized artifacts of papers as fundamental units to develop a novel, dual-spectrum framework that focuses on author-centric and external-agent perspectives. We used the author-centric spectrum, followed by the external-agent spectrum, to guide a structured, model-based approach to quantify and assess reproducibility. We explored the interdependencies between different factors influencing reproducibility and found that linguistic features such as readability and lexical diversity are strongly correlated with papers achieving the highest statuses on both spectrums. Our work provides a model-driven pathway for evaluating the reproducibility of scientific research. The code, methods, and artifacts for our study are publicly available at: https://github.com/reproducibilityproject/NLRR/ | 科学文章的可重复性对于科学的进步至关重要。尽管其重要性不言而喻,但由于缺乏真实数据,评估可重复性仍然具有挑战性。预测模型可以通过简化繁琐的评估过程来解决这一限制。通常,一篇论文的可重复性是基于代码、数据或补充信息等资源的存在来推断的,而往往没有进行广泛的实证调查。为了解决这些问题,我们利用论文的资源作为基本单位,开发了一种新颖的双光谱框架,该框架侧重于以作者为中心和外部代理的视角。我们使用以作者为中心的光谱,随后是外部代理的光谱,来指导一种结构化的、基于模型的方法来量化和评估可重复性。我们探讨了影响可重复性的不同因素之间的相互依赖关系,并发现语言特征如可读性和词汇多样性与论文在两个光谱上达到最高状态之间存在强烈的相关性。我们的工作为评估科学研究的可重复性提供了一种模型驱动的途径。我们研究的代码、方法和资源可在以下公开获取:https://github.com/reproducibilityproject/NLRR/。 | code | 0 |
Aligning Large Language Model with Direct Multi-Preference Optimization for Recommendation | Zhuoxi Bai, Ning Wu, Fengyu Cai, Xinyi Zhu, Yun Xiong | Shanghai University, Shanghai, China; CashCat, Hangzhou, China; Technical University of Darmstadt, Darmstadt, Germany; Beihang University, Beijing, China; Fudan University, Shanghai, China | Large Language Models (LLMs) have shown impressive performance in various domains, prompting researchers to explore their potential application in recommendation systems. However, directly applying LLMs to recommendation tasks has proven to be less effective due to the significant gap between the data used for pre-training LLMs and the specific requirements of recommendation tasks. In this study, we propose Direct Multi-Preference Optimization (DMPO), a streamlined framework to bridge this gap and enhance the alignment of LLMs for recommendation tasks. DMPO can be viewed as a pair-wise ranking loss to distinguish between positive and negative samples in recommendation tasks. Furthermore, DMPO improves the performance of LLM-based recommenders by maximizing the probability of positive samples and minimizing the probability of multiple negative samples at the same time. Experimental evaluations are conducted to compare DMPO with traditional recommendation methods and other LLM-based recommendation methods. The results reveal that DMPO significantly enhances the recommendation capabilities of LLMs across three real-world public datasets in few-shot scenarios. Furthermore, the experiments also demonstrate that DMPO exhibits superior generalization ability in cross-domain recommendation. A case study elucidates the reasons behind these consistent improvements and also underscores DMPO's potential as an explainable recommendation system. Our code and data are available at https://github.com/BZX667/DMPO. | 大型语言模型(LLMs)在多个领域展示了卓越的性能,促使研究人员探索其在推荐系统中的潜在应用。然而,直接将LLMs应用于推荐任务已被证明效果不佳,因为用于预训练LLMs的数据与推荐任务的具体需求之间存在显著差距。在本研究中,我们提出了直接多偏好优化(Direct Multi-Preference Optimization, DMPO),这是一个简化的框架,旨在弥合这一差距并增强LLMs在推荐任务中的适应性。DMPO可以视为一种成对排序损失,用于区分推荐任务中的正样本和负样本。此外,DMPO通过最大化正样本的概率并同时最小化多个负样本的概率,提升了基于LLM的推荐系统的性能。我们进行了实验评估,将DMPO与传统的推荐方法以及其他基于LLM的推荐方法进行了比较。结果显示,在少样本场景下,DMPO在三个真实世界的公开数据集上显著增强了LLMs的推荐能力。此外,实验还表明,DMPO在跨领域推荐中展现出优越的泛化能力。一项案例研究阐明了这些持续改进的原因,并强调了DMPO作为可解释推荐系统的潜力。我们的代码和数据可在https://github.com/BZX667/DMPO 获取。 | code | 0 |
Wise Fusion: Group Fairness Enhanced Rank Fusion | Kathleen Cachel, Elke A. Rundensteiner | Worcester Polytechnic Institute, Worcester, MA, USA | Rank fusion is a technique for combining multiple rankings into a single aggregated ranking, commonly used in high-stakes applications. For hiring decisions, a fused ranking might combine evaluations of different candidates from various job boards into one list. Ideally, such fused rankings are fair, meaning they do not withhold opportunities or resources from marginalized groups of candidates, even if such biases may be present in the to-be-fused rankings. Prior work on fairly aggregating rankings is limited to ensuring proportional (rather than equal) fairness when combining ranked lists containing the same candidate items. Yet, real-world fusion tasks often combine rankings of varying candidate sets, may also contain relevance scores, or are better suited to equal representation. To address fairness in these settings, we present a new plug-and-play fairness-aware fusion strategy: WISE fusion. WISE works in fusion settings where we have closed-box access to a score-powered rank fusion (SRF) method, making it possible to fairness-enhance existing fusion pipelines with little added cost. WISE uses existing evaluations of candidates from an as-is SRF method to achieve proportional or equal rank fairness in the final fused ranking. Our experimental study demonstrates that WISE beats the fairness and utility performance of state-of-the-art methods applied to these new fair rank fusion settings. | 排名融合是一种将多个排名合并为单一综合排名的技术,常见于高风险应用中。例如,在招聘决策中,一个融合后的排名可能会将来自不同求职网站的候选人评估整合成一个名单。理想情况下,这种融合后的排名应该是公平的,即它们不会剥夺边缘化候选人群体的机会或资源,即使待融合的排名本身可能存在偏见。先前的工作在公平地聚合排名方面,仅限于在合并包含相同候选人项目的排名列表时确保比例(而非平等)公平性。然而,现实世界的融合任务通常涉及合并不同候选人集合的排名,可能还包含相关性评分,或者更适合于平等代表性。为了在这些情境中解决公平性问题,我们提出了一种新的即插即用公平感知融合策略:WISE融合。WISE适用于我们对基于分数的排名融合(SRF)方法有封闭式访问权限的融合场景,使得我们能够以较低的额外成本增强现有融合流程的公平性。WISE利用现有SRF方法对候选人的评估,以在最终融合排名中实现比例或平等的排名公平性。我们的实验研究表明,WISE在这些新的公平排名融合设置中,其公平性和效用性能优于最先进的方法。 | code | 0 |
FCS-HGNN: Flexible Multi-type Community Search in Heterogeneous Information Networks | Guoxin Chen, Fangda Guo, Yongqing Wang, Yanghao Liu, Peiying Yu, Huawei Shen, Xueqi Cheng | ; Soochow University Institute of Computing Technology; Chinese Academy of Sciences Institute of Computing Technology; University of Chinese Academy of Sciences Institute of Computing Technology | Community search is a personalized community discovery problem designed to identify densely connected subgraphs containing the query node. Recently, community search in heterogeneous information networks (HINs) has received considerable attention. Existing methods typically focus on modeling relationships in HINs through predefined meta-paths or user-specified relational constraints. However, metapath-based methods are primarily designed to identify single-type communities with nodes of the same type rather than multi-type communities involving nodes of different types. Constraint-based methods require users to have a good understanding of community patterns to define a suitable set of relational constraints, which increases the burden on users. In this paper, we propose FCS-HGNN, a novel method for flexibly identifying both single-type and multi-type communities in HINs. Specifically, FCS-HGNN extracts complementary information from different views and dynamically considers the contribution of each relation instead of treating them equally, thereby capturing more fine-grained heterogeneous information. Furthermore, to improve efficiency on large-scale graphs, we further propose LS-FCS-HGNN, which incorporates i) the neighbor sampling strategy to improve training efficiency, and ii) the depth-based heuristic search strategy to improve query efficiency. We conducted extensive experiments to demonstrate the superiority of our proposed methods over state-of-the-art methods, achieving average improvements of 14.3% and 11.1% on single-type and multi-type communities, respectively. 
| 社区搜索是一个个性化的社区发现问题,旨在识别包含查询节点的密集连接子图。近来,异构信息网络(HINs)中的社区搜索引起了广泛关注。现有方法通常通过预定义的元路径或用户指定的关系约束来建模HINs中的关系。然而,基于元路径的方法主要设计用于识别单一类型的社区,其中节点属于同一类型,而不是涉及不同类型节点的多类型社区。基于约束的方法要求用户对社区模式有良好的理解,以定义合适的关系约束集合,这增加了用户的负担。在本文中,我们提出了FCS-HGNN,一种新颖的方法,用于在HINs中灵活识别单一类型和多类型社区。具体来说,FCS-HGNN从不同视角提取互补信息,并动态考虑每种关系的贡献,而不是平等对待它们,从而捕捉到更细粒度的异构信息。此外,为了提高大规模图上的效率,我们进一步提出了LS-FCS-HGNN,它结合了:i)邻居采样策略以提高训练效率,以及ii)基于深度的启发式搜索策略以提高查询效率。我们进行了广泛的实验,以展示我们提出的方法相对于最先进方法的优越性,在单一类型和多类型社区上分别实现了14.3%和11.1%的平均改进。 | code | 0 |
ELCoRec: Enhance Language Understanding with Co-Propagation of Numerical and Categorical Features for Recommendation | Jizheng Chen, Kounianhua Du, Jianghao Lin, Bo Chen, Ruiming Tang, Weinan Zhang, Yong Yu | | Large language models have been flourishing in the natural language processing (NLP) domain, and their potential for recommendation has attracted much attention. Despite the intelligence shown by the recommendation-oriented finetuned models, LLMs struggle to fully understand the user behavior patterns due to their innate weakness in interpreting numerical features and the overhead for long context, where the temporal relations among user behaviors, subtle quantitative signals among different ratings, and various side features of items are not well explored. Existing works only fine-tune a sole LLM on given text data without introducing that important information to it, leaving these problems unsolved. In this paper, we propose ELCoRec to Enhance Language understanding with Co-Propagation of numerical and categorical features for Recommendation. Concretely, we propose to inject the preference understanding capability into the LLM via a GAT expert model, where the user preference is better encoded by propagating in parallel the temporal relations, rating signals, and various side information of historical items. The parallel propagation mechanism could stabilize heterogeneous features and offer an informative user preference encoding, which is then injected into the language models via soft prompting at the cost of a single token embedding. To further obtain the user's recent interests, we propose a novel Recent interaction Augmented Prompt (RAP) template. Experiment results over three datasets against strong baselines validate the effectiveness of ELCoRec. The code is available at https://anonymous.4open.science/r/CIKM_Code_Repo-E6F5/README.md. | 大型语言模型在自然语言处理(NLP)领域蓬勃发展,其在推荐系统中的潜力也备受关注。尽管面向推荐的微调模型展现了智能性,但大型语言模型(LLMs)由于其固有的对数值特征解释的弱点以及长上下文处理的负担,难以完全理解用户行为模式。这些模式包括用户行为之间的时间关系、不同评分之间的细微数量信号以及项目各种侧边特征等,都未得到充分探索。现有工作仅在给定文本数据上对单一LLM进行微调,而未引入这些重要信息,导致这些问题未得到解决。本文提出ELCoRec,旨在通过数值和类别特征的协同传播来增强语言理解,以提升推荐效果。具体而言,我们提出通过一个图注意力网络(GAT)专家模型将偏好理解能力注入LLM,其中用户偏好通过并行传播时间关系、评分信号以及历史项目的各种侧信息得到更好编码。这种并行传播机制能够稳定异构特征,并提供信息丰富的用户偏好编码,随后通过单个令牌嵌入的软提示方式注入语言模型。为进一步捕捉用户的近期兴趣,我们提出了新颖的近期交互增强提示(RAP)模板。在三个数据集上与强基线方法的实验结果验证了ELCoRec的有效性。代码可访问https://anonymous.4open.science/r/CIKM_Code_Repo-E6F5/README.md获取。 | code | 0 |
Social Influence Learning for Recommendation Systems | Ximing Chen, Pui Ieng Lei, Yijun Sheng, Yanyan Liu, Zhiguo Gong | University of Macau, Macau, China | Social recommendation systems leverage the social relations among users to deal with the inherent cold-start problem in user-item interactions. However, previous models only treat the social graph as the static auxiliary to the user-item interaction graph, rather than dig out the hidden essentials and optimize them for better recommendations. Thus, the potential of social influence is still under-explored. In this paper, we will fill this gap by proposing a novel model for social influence learning to derive the essential influence patterns within the user relationships. Our model views the social influence from the perspectives of (1) the diversity of neighborhood's influence on the users, (2) the disentanglement of neighborhood's influence on the users, and (3) the exploration of underlying implicit social influence. To this end, we first employ a novel layerwise graph-enhanced variational autoencoder for the reconstruction of neighborhoods' representations, which aims to learn the pattern of social influence as well as simulate the social profile of each user for overcoming the sparsity issue in social relation data. Meanwhile, we introduce a layerwise graph attentive network for capturing the most influential scope of neighborhood. Finally, we adopt a dual sampling process to generate new social relations for enhancing the social recommendation. Extensive experiments have been conducted on three widely-used benchmark datasets, verifying the superiority of our proposed model compared with the representative approaches. 
| 社交推荐系统利用用户之间的社交关系来应对用户-物品交互中固有的冷启动问题。然而,以往的模型仅将社交图视为用户-物品交互图的静态辅助,而未深入挖掘隐藏的基本要素并对其进行优化以实现更好的推荐。因此,社交影响力的潜力仍未得到充分探索。本文通过提出一种新的社交影响力学习模型来填补这一空白,旨在从用户关系中提取基本的影响模式。我们的模型从以下三个角度看待社交影响力:(1)邻域对用户影响的多样性,(2)邻域对用户影响的解耦,以及(3)潜在隐性社交影响力的探索。为此,我们首先采用了一种新颖的分层图增强变分自编码器来重构邻域的表示,旨在学习社交影响力的模式并模拟每个用户的社交概况,以克服社交关系数据中的稀疏性问题。同时,我们引入了一种分层图注意力网络来捕捉邻域中最具影响力的范围。最后,我们采用双重采样过程来生成新的社交关系,以增强社交推荐。我们在三个广泛使用的基准数据集上进行了大量实验,验证了我们提出的模型相对于代表性方法的优越性。 | code | 0 |
Enhancing Deep Entity Resolution with Integrated Blocker-Matcher Training: Balancing Consensus and Discrepancy | Wenzhou Dou, Derong Shen, Xiangmin Zhou, Hui Bai, Yue Kou, Tiezheng Nie, Hang Cui, Ge Yu | University of Illinois at Urbana-Champaign, Urbana, USA; Northeastern University, Shenyang, China; RMIT University, Melbourne, Australia | Deep entity resolution (ER) identifies matching entities across data sources using techniques based on deep learning. It involves two steps: a blocker for identifying the potential matches to generate the candidate pairs, and a matcher for accurately distinguishing the matches and non-matches among these candidate pairs. Recent deep ER approaches utilize pretrained language models (PLMs) to extract similarity features for blocking and matching, achieving state-of-the-art performance. However, they often fail to balance the consensus and discrepancy between the blocker and matcher, emphasizing the consensus while neglecting the discrepancy. This paper proposes MutualER, a deep entity resolution framework that integrates and jointly trains the blocker and matcher, balancing both the consensus and discrepancy between them. Specifically, we firstly introduce a lightweight PLM in siamese structure for the blocker and a heavier PLM in cross structure or an autoregressive large language model (LLM) for the matcher. Two optimization techniques named Mutual Sample Selection (MSS) and Similarity Knowledge Transferring (SKT) are designed to jointly train the blocker and matcher. MSS enables the blocker and matcher to mutually select the customized training samples for each other to maintain the discrepancy, while SKT allows them to share the similarity knowledge for improving their blocking and matching capabilities respectively to maintain the consensus. 
Extensive experiments on five datasets demonstrate that MutualER significantly outperforms existing PLM-based and LLM-based approaches, achieving leading performance in both effectiveness and efficiency. | 深度实体解析(ER)利用基于深度学习的技术,识别跨数据源的匹配实体。它包括两个步骤:用于识别潜在匹配以生成候选对的阻塞器,以及用于在这些候选对中准确区分匹配与非匹配的匹配器。最近的深度ER方法利用预训练语言模型(PLMs)来提取用于阻塞和匹配的相似性特征,从而达到最先进的性能。然而,这些方法往往未能平衡阻塞器和匹配器之间的共识与差异,过于强调共识而忽视了差异。本文提出了MutualER,这是一个深度实体解析框架,它整合并联合训练阻塞器和匹配器,平衡两者之间的共识与差异。具体来说,我们首先在阻塞器中引入一个轻量级的PLM,采用孪生结构,而在匹配器中使用更重的PLM,采用交叉结构或自回归大型语言模型(LLM)。我们设计了两种优化技术,名为互样本选择(MSS)和相似性知识转移(SKT),以联合训练阻塞器和匹配器。MSS使阻塞器和匹配器能够相互选择定制的训练样本,以保持差异,而SKT则允许它们共享相似性知识,以分别提高各自的阻塞和匹配能力,从而保持共识。在五个数据集上的广泛实验表明,MutualER显著优于现有的基于PLM和LLM的方法,在效果和效率方面均达到了领先水平。 | code | 0 |
CHDAER: Consistent Hashing-based Data Allocation for Efficient Recommendation in Edge Environment | Zhikang Feng, Chao Yan, Rong Jiang, Xiaolong Xu, Xuyun Zhang, Xiaokang Zhou, Wanchun Dou, Lianyong Qi | | | | code | 0 |
HierRec: Scenario-Aware Hierarchical Modeling for Multi-scenario Recommendations | Jingtong Gao, Bo Chen, Menghui Zhu, Xiangyu Zhao, Xiaopeng Li, Yuhao Wang, Yichao Wang, Huifeng Guo, Ruiming Tang | Huawei Noah's Ark Lab, Shenzhen, China; City University of Hong Kong, Hong Kong, Hong Kong | Click-Through Rate (CTR) prediction is a fundamental technique in recommendation and advertising systems. Recent studies have shown that implementing multi-scenario recommendations contributes to strengthening information sharing and improving overall performance. However, existing multi-scenario models only consider coarse-grained explicit scenario modeling that depends on pre-defined scenario identification from manual prior rules, which is biased and sub-optimal. To address these limitations, we propose a Scenario-Aware Hierarchical Dynamic Network for Multi-Scenario Recommendations (HierRec), which perceives implicit patterns adaptively, and conducts explicit and implicit scenario modeling jointly. In particular, HierRec designs a basic scenario-oriented module based on the dynamic weight to capture scenario-specific representations. Then the hierarchical explicit and implicit scenario-aware modules are proposed to model hybrid-grained scenario information, where the multi-head implicit modeling design contributes to perceiving distinctive patterns from different perspectives. Our experiments on two public datasets and real-world industrial applications on a mainstream online advertising platform demonstrate that HierRec outperforms existing models significantly. The implementation code is available for reproducibility. 
| 点击率(CTR)预测是推荐和广告系统中的基础技术。近期的研究表明,实施多场景推荐有助于增强信息共享并提升整体性能。然而,现有的多场景模型仅考虑基于预定义场景识别的粗粒度显式场景建模,这种识别依赖于人工先验规则,存在偏差且效果不佳。为解决这些局限,我们提出了面向多场景推荐的场景感知分层动态网络(HierRec),该网络能够自适应地感知隐式模式,并联合进行显式和隐式场景建模。具体而言,HierRec设计了一个基于动态权重的基本场景导向模块,以捕捉特定场景的表示。随后,提出了分层的显式和隐式场景感知模块,用于建模混合粒度的场景信息,其中多头的隐式建模设计有助于从不同角度感知独特的模式。我们在两个公开数据集以及主流在线广告平台的实际工业应用中的实验表明,HierRec显著优于现有的模型。实现代码已公开,便于复现。 | code | 0 |
Information Retrieval Optimization for Non-Exemplar Class Incremental Learning | Shuai Guo, Yang Gu, Yuan Ma, Yingwei Zhang, Weining Weng, Jun Liu, Weiwei Dai, Yiqiang Chen | | Existing non-exemplar class-incremental learning (NECIL) methods usually utilize a combination strategy of replay mechanism and knowledge distillation. However, this combination strategy only focuses on the preservation of old information quantitatively, ignoring the preservation quality. When the old knowledge contains erroneous redundant information, catastrophic forgetting is more likely to occur. Therefore, obtaining adequate information with as few impurities as possible and removing invalid or even harmful information has become an effective solution to improve the performance of NECIL. This process is consistent with the information bottleneck (IB) theory. Thus, we propose a new NECIL method based on the IB framework. By using the different information obtained from the new and old class samples and the implicit knowledge in the teacher model training process, the errors caused by learning harmful redundant information are eliminated. Specifically, we propose two optimization strategies that align with the two optimization processes of the information bottleneck. Firstly, we employ a pseudo-prototype selection mechanism that selectively incorporates pseudo-samples into the learning process of new and old categories, thus enhancing the distinction between new and old categories and diminishing the mutual information between the input and intermediate features. Secondly, we introduce an attention-based feature distillation method that regulates the distillation strength between feature pairs based on their similarity, thereby augmenting the mutual information between intermediate features and output prediction. Extensive experiments on three benchmarks demonstrate that the proposed method exhibits significant incremental performance improvements over existing methods. | 现有的非示例类增量学习(NECIL)方法通常采用回放机制和知识蒸馏的组合策略。然而,这种组合策略仅关注旧信息的定量保存,忽略了保存的质量。当旧知识包含错误的冗余信息时,灾难性遗忘更容易发生。因此,尽可能获取无杂质的充足信息并去除无效甚至有害信息,已成为提升NECIL性能的有效解决方案。这一过程与信息瓶颈(IB)理论相一致。因此,我们提出了一种基于IB框架的新NECIL方法。通过利用新旧类别样本获取的不同信息以及教师模型训练过程中蕴含的隐性知识,消除了有害冗余信息学习的错误。具体而言,我们提出了两种优化策略,分别对应信息瓶颈的两个优化过程。首先,我们采用伪原型选择机制,有选择地将伪样本纳入新旧类别的学习过程,从而增强新旧类别之间的区分度,并减少输入与中间特征之间的互信息。其次,我们引入了一种基于注意力的特征蒸馏方法,根据特征对之间的相似性调节蒸馏强度,从而增强中间特征与输出预测之间的互信息。在三个基准上的大量实验表明,所提出的方法相较于现有方法显著提升了增量学习性能。 | code | 0 |
Fragment Allocations for Partially Replicated Databases Considering Data Modifications and Changing Workloads | Stefan Halfpap, Rainer Schlosser | BIFOLD, TU Berlin, Berlin, Germany; Hasso Plattner Institute, Potsdam, Germany | Columnar database systems can process complex mixed workloads on a single node. In case of increasing and peak analytical processing demand, we can offload read-only queries to replicas. Partial replication, i.e., duplicating only data subsets to additional nodes, is more cost-efficient than full replication for two primary reasons: (i) Partial replicas require less storage and can be set up faster. (ii) Partial replicas must synchronize only stored data subsets, allowing better scalability. However, determining which queries to offload is challenging for larger workloads because queries access overlapping data subsets and cause synchronization costs. This paper shows how to calculate optimized replica configurations that consider reallocation and data modification costs using integer linear programming (ILP) techniques. While ILP is effective for solving assignment problems, it does not scale well. For larger problems, users often fall back to simple heuristics, which can lose optimization potential. This paper demonstrates that scalable heuristics can be built on ILP, preserving its strengths. The three proposed approaches for reducing the calculation time allow trading solution quality flexibly. Our evaluations using TPC-H, TPC-DS, and a large real-world accounting workload show that our approach outperforms state-of-the-art solutions, often reducing reallocated data by more than 80% and halving modification costs. At the same time, the new allocations reduce the storage consumption by over 30%, with solutions computed in just a few seconds. 
| 列式数据库系统能够在单个节点上处理复杂的混合工作负载。在分析处理需求增加和达到峰值的情况下,我们可以将只读查询卸载到副本上。部分复制,即仅将数据子集复制到附加节点,相较于全复制更为成本高效,主要原因有两点:(i) 部分副本所需的存储较少,且设置速度更快;(ii) 部分副本仅需同步存储的数据子集,从而实现更好的可扩展性。然而,对于较大的工作负载,确定哪些查询应被卸载是一个挑战,因为查询访问重叠的数据子集,并导致同步成本增加。本文展示了如何使用整数线性规划(ILP)技术计算优化的副本配置,考虑了重新分配和数据修改成本。尽管ILP在解决分配问题方面有效,但其扩展性不佳。对于较大的问题,用户通常会回退到简单的启发式方法,这可能会失去优化的潜力。本文证明,可扩展的启发式方法可以建立在ILP基础上,保留其优势。所提出的三种减少计算时间的方法允许灵活地权衡解决方案的质量。我们使用TPC-H、TPC-DS和大规模真实会计工作负载的评估显示,我们的方法优于最先进的解决方案,通常将重新分配的数据减少超过80%,并将修改成本减半。同时,新的分配方案将存储消耗减少了30%以上,且解决方案在几秒钟内即可计算完成。 | code | 0 |
Practical and Robust Safety Guarantees for Advanced Counterfactual Learning to Rank | Shashank Gupta, Harrie Oosterhuis, Maarten de Rijke | | Counterfactual learning to rank (CLTR) can be risky; various circumstances can cause it to produce sub-optimal models that hurt performance when deployed. Safe CLTR was introduced to mitigate these risks when using inverse propensity scoring to correct for position bias. However, the existing safety measure for CLTR is not applicable to state-of-the-art CLTR, it cannot handle trust bias, and its guarantees rely on specific assumptions about user behavior. Our contributions are two-fold. First, we generalize the existing safe CLTR approach to make it applicable to state-of-the-art doubly robust (DR) CLTR and trust bias. Second, we propose a novel approach, proximal ranking policy optimization (PRPO), that provides safety in deployment without assumptions about user behavior. PRPO removes incentives for learning ranking behavior that is too dissimilar to a safe ranking model. Thereby, PRPO imposes a limit on how much learned models can degrade performance metrics, without relying on any specific user assumptions. Our experiments show that both our novel safe doubly robust method and PRPO provide higher performance than the existing safe inverse propensity scoring approach. However, when circumstances are unexpected, the safe doubly robust approach can become unsafe and bring detrimental performance. In contrast, PRPO always maintains safety, even in maximally adversarial situations. By avoiding assumptions, PRPO is the first method with unconditional safety in deployment that translates to robust safety for real-world applications. | 反事实学习排序(CLTR)存在风险;多种情况可能导致其生成次优模型,从而在部署时损害性能。安全CLTR被引入以在使用逆倾向评分来纠正位置偏差时减轻这些风险。然而,现有的CLTR安全措施不适用于最先进的CLTR,无法处理信任偏差,并且其安全保证依赖于对用户行为的特定假设。我们的贡献有两方面。首先,我们将现有的安全CLTR方法推广到适用于最先进的双重稳健(DR)CLTR和信任偏差。其次,我们提出了一种新颖的方法,即近端排序策略优化(PRPO),该方法在部署时提供安全性,而无需对用户行为做出任何假设。PRPO消除了学习与安全排序模型过于不同的排序行为的激励。因此,PRPO在不依赖任何特定用户假设的情况下,对学习模型可能降低性能指标的程度施加了限制。我们的实验表明,我们的新颖安全双重稳健方法和PRPO都比现有的安全逆倾向评分方法提供了更高的性能。然而,在遇到意外情况时,安全双重稳健方法可能会变得不安全并带来有害的性能。相比之下,PRPO始终保持安全,即使在最恶劣的对抗情况下也是如此。通过避免假设,PRPO是第一种在部署时具有无条件安全性的方法,这为实际应用带来了强大的安全性。 | code | 0 |
Quantum Cognition-Inspired EEG-based Recommendation via Graph Neural Networks | Jinkun Han, Wei Li, Yingshu Li, Zhipeng Cai | Department of Computer Science, Georgia State University, Atlanta, Georgia, USA | Current recommendation systems recommend goods by considering users' historical behaviors, social relations, ratings, and other multi-modal signals. Although outdated user information reflects the trends of a user's interests, no recommendation system can truly know users' real-time thoughts. With the development of brain-computer interfaces, it is time to explore next-generation recommenders that capture users' real-time thoughts without delay. Electroencephalography (EEG) is a promising method of collecting brain signals because of its convenience and mobility. Currently, there is little research on EEG-based recommendation due to the complexity of learning human brain activity. To explore the utility of EEG-based recommendation, we propose a novel neural network model, QUARK, combining Quantum Cognition Theory and Graph Convolutional Networks for accurate item recommendations. Compared with the state-of-the-art recommendation models, the superiority of QUARK is confirmed via extensive experiments. | 当前的推荐系统通过考虑用户的历史行为、社交关系、评分以及其他多模态信息来推荐商品。尽管过时的用户信息展示了用户兴趣的趋势,但没有任何推荐系统能够真正了解用户的实时想法。随着脑机接口的发展,是时候探索下一代推荐系统,这些系统能够即时展示用户的实时想法。脑电图(EEG)是一种有前景的收集脑信号的方法,因其便捷性和移动性。目前,由于学习人类脑活动复杂性,基于脑电图的推荐研究还很少。为了探索基于脑电图的推荐的实用性,我们提出了一种新颖的神经网络模型——QUARK,它结合了量子认知理论和图卷积网络,以实现准确的项目推荐。通过广泛的实验,QUARK相较于最先进的推荐模型展现出了优越性。 | code | 0 |
From Retrieval to Generation: Efficient and Effective Entity Set Expansion | Shulin Huang, Shirong Ma, Yangning Li, Yinghui Li, Hai-Tao Zheng | | Entity Set Expansion (ESE) is a critical task aiming at expanding entities of the target semantic class described by seed entities. Most existing ESE methods are retrieval-based frameworks that need to extract contextual features of entities and calculate the similarity between seed entities and candidate entities. To achieve the two purposes, they iteratively traverse the corpus and the entity vocabulary, resulting in poor efficiency and scalability. Experimental results indicate that the time consumed by the retrieval-based ESE methods increases linearly with entity vocabulary and corpus size. In this paper, we first propose the Generative Entity Set Expansion (GenExpan) framework, which utilizes a generative pre-trained auto-regressive language model to accomplish the ESE task. Specifically, a prefix tree is employed to guarantee the validity of entity generation, and automatically generated class names are adopted to guide the model to generate target entities. Moreover, we propose Knowledge Calibration and Generative Ranking to further bridge the gap between the generic knowledge of the language model and the goal of the ESE task. For efficiency, the expansion time consumed by GenExpan is independent of entity vocabulary and corpus size, and GenExpan achieves an average 600% speedup compared to strong baselines. For expansion effectiveness, our framework outperforms previous state-of-the-art ESE methods. | 实体集扩展(Entity Set Expansion, ESE)是一项关键任务,旨在通过种子实体扩展目标语义类别的实体。大多数现有的ESE方法基于检索框架,需要提取实体的上下文特征并计算种子实体与候选实体之间的相似度。为了实现这两个目的,它们需要迭代遍历语料库和实体词汇表,导致效率和扩展性较差。实验结果表明,基于检索的ESE方法所消耗的时间随实体词汇表和语料库大小的增加而线性增长。在本文中,我们首先提出了生成式实体集扩展(Generative Entity Set Expansion, GenExpan)框架,该框架利用生成式预训练的自回归语言模型来完成ESE任务。具体而言,我们使用前缀树来保证实体生成的有效性,并采用自动生成的类别名称来指导模型生成目标实体。此外,我们提出了知识校准和生成式排序,以进一步缩小语言模型的通用知识与ESE任务目标之间的差距。在效率方面,GenExpan所消耗的扩展时间与实体词汇表和语料库大小无关,并且与强基线相比,GenExpan实现了平均600%的加速。在扩展效果方面,我们的框架优于之前最先进的ESE方法。 | code | 0 |
RD-P: A Trustworthy Retrieval-Augmented Prompter with Knowledge Graphs for LLMs | Yubo Huang, Guosun Zeng | Tongji University, Shanghai, China | Large Language Models (LLMs) face challenges due to hallucination issues. Current solutions use retrieval-augmented generation (RAG), integrating LLMs with external knowledge to enhance answer accuracy. However, the misuse of irrelevant external knowledge can be misleading. In this paper, we propose a novel method called Retrieve-and-Discriminate Prompter (RD-P), which leverages knowledge graphs (KGs) for trustworthy RAG by synchronizing knowledge retrieval and discrimination in a unified model. Specifically, we train a prompter based on a pre-trained language model with shared parameters. It has two key modules: the retriever and the discriminator. The retriever identifies relevant reasoning paths in the KG, while the discriminator evaluates their credibility through "logical coverage calculation" and in turn instructs the retrieval process. Prompts are then constructed to guide LLMs in reasoning and answering questions using both retrieved and implicit knowledge. Experiments on knowledge-intensive question answering (QA) tasks demonstrate that our method significantly improves answer coverage rate while reducing the retrieval scale, achieving superior performance in complex KGQA tasks compared with state-of-the-art RAG methods at a low cost. | 大型语言模型(LLMs)面临幻觉问题的挑战。当前的解决方案采用检索增强生成(RAG),将LLMs与外部知识整合以提高答案的准确性。然而,不相关外部知识的误用可能会产生误导。在本文中,我们提出了一种名为“检索与鉴别提示器”(Retrieve-and-Discriminate Prompter, RD-P)的新方法,该方法通过在统一模型中同步知识检索与鉴别,利用知识图谱(KGs)实现可信的RAG。具体而言,我们基于一个具有共享参数的预训练语言模型训练了一个提示器,该提示器包含两个关键模块:检索器和鉴别器。检索器在KG中识别相关的推理路径,而鉴别器通过“逻辑覆盖计算”评估这些路径的可信度,并反过来指导检索过程。随后,构建提示以指导LLMs使用检索到的知识和隐含知识进行推理并回答问题。在知识密集型问答(QA)任务上的实验表明,我们的方法显著提高了答案覆盖率,同时减少了检索规模,在复杂KGQA任务中以较低成本实现了优于现有RAG方法的性能。 | code | 0 |
Understanding GNNs for Boolean Satisfiability through Approximation Algorithms | Jan Hula, David Mojzísek, Mikolás Janota | | The paper deals with the interpretability of Graph Neural Networks in the context of Boolean Satisfiability. The goal is to demystify the internal workings of these models and provide insightful perspectives into their decision-making processes. This is done by uncovering connections to two approximation algorithms studied in the domain of Boolean Satisfiability: Belief Propagation and Semidefinite Programming Relaxations. Revealing these connections has empowered us to introduce a suite of impactful enhancements. The first significant enhancement is a curriculum training procedure, which incrementally increases the problem complexity in the training set, together with increasing the number of message passing iterations of the Graph Neural Network. We show that the curriculum, together with several other optimizations, reduces the training time by more than an order of magnitude compared to the baseline without the curriculum. Furthermore, we apply decimation and sampling of initial embeddings, which significantly increase the percentage of solved problems. | 本文探讨了在布尔可满足性(Boolean Satisfiability)背景下图神经网络(Graph Neural Networks, GNN)的可解释性问题。其目标在于揭秘这些模型的内部运作机制,并深入洞察其决策过程。通过揭示与布尔可满足性领域内两种近似算法——信念传播(Belief Propagation)和半定规划松弛(Semidefinite Programming Relaxations)之间的联系,我们得以引入一系列具有影响力的改进措施。首先,一个显著的改进是课程训练程序,该程序在训练集中逐步增加问题复杂度,同时提升图神经网络的消息传递迭代次数。研究表明,结合课程训练与其他多项优化措施,相较于无课程训练的基线模型,训练时间减少了超过一个数量级。此外,我们还采用了初始嵌入的降维和采样技术,这显著提高了问题解决的百分比。 | code | 0 |
HiQuE: Hierarchical Question Embedding Network for Multimodal Depression Detection | Juho Jung, Chaewon Kang, Jeewoo Yoon, Seungbae Kim, Jinyoung Han | | The utilization of automated depression detection significantly enhances early intervention for individuals experiencing depression. Despite numerous proposals on automated depression detection using recorded clinical interview videos, limited attention has been paid to considering the hierarchical structure of the interview questions. In clinical interviews for diagnosing depression, clinicians use a structured questionnaire that includes routine baseline questions and follow-up questions to assess the interviewee's condition. This paper introduces HiQuE (Hierarchical Question Embedding network), a novel depression detection framework that leverages the hierarchical relationship between primary and follow-up questions in clinical interviews. HiQuE can effectively capture the importance of each question in diagnosing depression by learning mutual information across multiple modalities. We conduct extensive experiments on the widely-used clinical interview data, DAIC-WOZ, where our model outperforms other state-of-the-art multimodal depression detection models and emotion recognition models, showcasing its clinical utility in depression detection. | 自动化抑郁症检测的利用显著增强了针对抑郁症患者的早期干预。尽管已有众多关于使用录制的临床访谈视频进行自动化抑郁症检测的提议,但很少有研究考虑到访谈问题之间的层次结构。在诊断抑郁症的临床访谈中,临床医生使用包含常规基线问题和后续问题的结构化问卷来评估受访者的状况。本文介绍了HiQuE(分层问题嵌入网络),这是一种利用临床访谈中主要问题与后续问题之间层次关系的新型抑郁症检测框架。HiQuE通过跨多种模态学习互信息,能够有效捕捉每个问题在抑郁症诊断中的重要性。我们在广泛使用的临床访谈数据DAIC-WOZ上进行了大量实验,结果表明,我们的模型在多模态抑郁症检测和情感识别模型中表现优于其他最先进的模型,展示了其在抑郁症检测中的临床实用性。 | code | 0 |
Embedding Knowledge Graphs in Function Spaces | Louis Mozart Kamdem Teyou, Caglar Demir, Axel-Cyrille Ngonga Ngomo | | We introduce a novel embedding method that diverges from conventional approaches by operating within finite-dimensional function spaces rather than finite-dimensional vector spaces, thus departing significantly from standard knowledge graph embedding techniques. Initially employing polynomial functions to compute embeddings, we progress to more intricate representations using neural networks with varying layer complexities. We argue that employing functions for embedding computation enhances expressiveness and allows for more degrees of freedom, enabling operations such as composition, differentiation, and antidifferentiation (primitives) of entity representations. Additionally, we meticulously outline the step-by-step construction of our approach and provide code for reproducibility, thereby facilitating further exploration and application in the field. | 我们提出了一种新颖的嵌入方法,该方法与传统方法不同,它在有限维函数空间中进行操作,而非有限维向量空间,从而显著区别于标准的知识图谱嵌入技术。初始阶段使用多项式函数来计算嵌入,随后我们采用具有不同层复杂度的神经网络来实现更为复杂的表示。我们认为,使用函数进行嵌入计算能够增强表达能力,并提供更多的自由度,使得实体表示的组合、导数和原函数等操作成为可能。此外,我们详细描述了该方法的逐步构建过程,并提供了可重复使用的代码,从而促进该领域内的进一步探索和应用。 | code | 0 |
Federated Deep Equilibrium Learning: Harnessing Compact Global Representations to Enhance Personalization | Long Tan Le, Tuan Dung Nguyen, TungAnh Nguyen, Choong Seon Hong, Suranga Seneviratne, Wei Bao, Nguyen H. Tran | The University of Pennsylvania, Philadelphia, PA, USA; Kyung Hee University, Yongin-si, Republic of Korea; The University of Sydney, Sydney, NSW, Australia | Federated Learning (FL) has emerged as a groundbreaking distributed learning paradigm enabling clients to train a global model collaboratively without exchanging data. Despite enhancing privacy and efficiency in information retrieval and knowledge management contexts, training and deploying FL models confront significant challenges such as communication bottlenecks, data heterogeneity, and memory limitations. To comprehensively address these challenges, we introduce FeDEQ, a novel FL framework that incorporates deep equilibrium learning and consensus optimization to harness compact global data representations for efficient personalization. Specifically, we design a unique model structure featuring an equilibrium layer for global representation extraction, followed by explicit layers tailored for local personalization. We then propose a novel FL algorithm rooted in the alternating directions method of multipliers (ADMM), which enables the joint optimization of a shared equilibrium layer and individual personalized layers across distributed datasets. Our theoretical analysis confirms that FeDEQ converges to a stationary point, achieving both compact global representations and optimal personalized parameters for each client. Extensive experiments on various benchmarks demonstrate that FeDEQ matches the performance of state-of-the-art personalized FL methods, while significantly reducing communication size by up to 4 times and memory footprint by 1.5 times during training. 
| 联邦学习(Federated Learning, FL)作为一种突破性的分布式学习范式,使得客户端能够在不交换数据的情况下协作训练全局模型。尽管在信息检索和知识管理领域中增强了隐私和效率,但训练和部署FL模型仍面临重大挑战,如通信瓶颈、数据异质性和内存限制。为了全面应对这些挑战,我们提出了FeDEQ,这是一种新颖的FL框架,它结合了深度平衡学习和共识优化,以利用紧凑的全局数据表示实现高效个性化。具体而言,我们设计了一种独特的模型结构,包括一个用于提取全局表示的平衡层,随后是专门为本地个性化定制的显式层。接着,我们提出了一种基于交替方向乘子法(ADMM)的新型FL算法,该算法能够联合优化分布式数据集中的共享平衡层和各个个性化层。我们的理论分析证实,FeDEQ能够收敛到一个平稳点,为每个客户端实现紧凑的全局表示和最优的个性化参数。在多个基准上的广泛实验表明,FeDEQ与最先进的个性化FL方法性能相当,同时在训练过程中将通信量减少了高达4倍,内存占用减少了1.5倍。 | code | 0 |
Privacy-preserving Spatial Dataset Search in Cloud | Pengyue Li, Hua Dai, Sheng Wang, Wenzhe Yang, Geng Yang | School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing, China; School of Computer Science, Wuhan University, Wuhan, China | The development of cloud computing has met the growing demand for dataset search in the era of massive data. In the field of spatial dataset search, the high prevalence of sensitive information in spatial datasets underscores the necessity of privacy-preserving search processing in the cloud. However, existing spatial dataset search schemes are designed on plaintext datasets and do not consider privacy protection in search processing. In this paper, we first propose a privacy-preserving spatial dataset search scheme. The density distribution-based similarity model is proposed to measure the similarity between spatial datasets, and then the order-preserving encrypted similarity is designed to achieve secure similarity calculation. With the above idea, the baseline search scheme (PriDAS) is proposed. To improve the search efficiency, a two-layer index is designed to filter candidate datasets and accelerate the similarity calculation between datasets. By using the index, the optimized search scheme (PriDAS+) is proposed. To analyze the security of the proposed schemes, the game simulation-based proof is presented. Experimental results on three real-world spatial data repositories with 100,000 spatial datasets show that PriDAS+ only needs less than 0.4 seconds to accomplish the search processing. | 云计算的发展满足了大数据时代对数据集搜索日益增长的需求。在空间数据集搜索领域,空间数据集中敏感信息的高普及率突显了在云环境中进行隐私保护搜索处理的必要性。然而,现有的空间数据集搜索方案设计基于明文数据集,并未考虑搜索处理中的隐私保护。本文首先提出了一种隐私保护的空间数据集搜索方案。基于密度分布的相似性模型被提出用于衡量空间数据集之间的相似性,然后设计了顺序保持加密的相似性以实现安全的相似性计算。基于上述思路,提出了基线搜索方案(PriDAS)。为了提高搜索效率,设计了一个两层索引以过滤候选数据集并加速数据集之间的相似性计算。通过使用该索引,提出了优化的搜索方案(PriDAS+)。为了分析所提出方案的安全性,提出了基于博弈模拟的证明。在包含10万个空间数据集的三个真实世界空间数据存储库上的实验结果表明,PriDAS+仅需不到0.4秒即可完成搜索处理。 | code | 0 |
Privacy-Preserving Graph Embedding based on Local Differential Privacy | Zening Li, RongHua Li, Meihao Liao, Fusheng Jin, Guoren Wang | Beijing Institute of Technology, Beijing, China | Graph embedding has become a powerful tool for learning latent representations of nodes in a graph. Despite its superior performance in various graph-based machine learning tasks, serious privacy concerns arise when the graph data contains personal or sensitive information. To address this issue, we investigate and develop graph embedding algorithms that satisfy local differential privacy (LDP). We introduce a novel privacy-preserving graph embedding framework, named PrivGE, to protect node data privacy. Specifically, we propose an LDP mechanism to obfuscate node data and utilize personalized PageRank as the proximity measure to learn node representations. Furthermore, we provide a theoretical analysis of the privacy guarantees and utility offered by the PrivGE framework. Extensive experiments on several real-world graph datasets demonstrate that PrivGE achieves an optimal balance between privacy and utility, and significantly outperforms existing methods in node classification and link prediction tasks. | 图嵌入已成为学习图中节点潜在表示的有力工具。尽管在各种基于图的机器学习任务中表现优异,但当图数据包含个人信息或敏感信息时,严重的隐私问题也随之产生。为解决这一问题,我们研究并开发了满足局部差分隐私(LDP)的图嵌入算法。我们引入了一种新的隐私保护图嵌入框架,命名为PrivGE,以保护节点数据隐私。具体而言,我们提出了一种LDP机制来混淆节点数据,并利用个性化PageRank作为接近度度量来学习节点表示。此外,我们对PrivGE框架提供的隐私保障和效用进行了理论分析。在多个真实世界图数据集上的广泛实验表明,PrivGE在隐私与效用之间实现了最佳平衡,并在节点分类和链接预测任务中显著优于现有方法。 | code | 0 |
On Evaluation Metrics for Diversity-enhanced Recommendations | Xueqi Li, Gao Cong, Guoqing Xiao, Yang Xu, Wenjun Jiang, Kenli Li | Nanyang Technological University, Singapore, Singapore; College of Computer Science and Electronic Engineering, Hunan University, Changsha, China | Diversity is increasingly recognized as a crucial factor in recommendation systems for enhancing user satisfaction. However, existing studies on diversity-enhanced recommendation systems primarily focus on designing recommendation strategies, often overlooking the development of evaluation metrics. Widely used diversity metrics such as CC, ILAD, and ILMD are typically assessed independently of accuracy. This separation leads to a critical limitation: existing diversity measures are unable to distinguish between diversity improvements from effective recommendations and those from ineffective recommendations. Our evaluations reveal that the diversity improvements are primarily contributed by ineffective recommendations, which often do not positively contribute to user satisfaction. Furthermore, existing diversity metrics disregard the feature distribution of ground-truth items, potentially skewing the assessment of diversity performance. To address these limitations, we design three new accuracy-aware metrics: DCC, FDCC, and DILAD, and conduct a re-evaluation using these metrics. Surprisingly, our results illustrate that the diversity improvements of existing diversity-enhanced approaches are limited and even negative compared to those of accurate recommendations. This finding underscores the need to explore more sophisticated diversity-enhanced techniques for improving the diversity within effective recommendations.
| 多样性在提升用户满意度的推荐系统中日益被视为一个关键因素。然而,现有关于增强多样性的推荐系统的研究主要集中在设计推荐策略上,往往忽视了评估指标的开发。广泛使用的多样性指标如CC、ILAD和ILMD通常独立于准确性进行评估。这种分离导致了一个关键的局限性:现有的多样性测量无法区分多样性改进是来自有效的推荐还是无效的推荐。我们的评估显示,多样性改进主要由无效推荐贡献,这些推荐往往对用户满意度没有积极贡献。此外,现有的多样性指标忽略了真实物品的特征分布,这可能扭曲了对多样性表现的评估。为了解决这些限制,我们设计了三种新的关注准确性的指标:DCC、FDCC和DILAD,并使用这些指标进行了重新评估。令人惊讶的是,我们的结果表明,现有增强多样性的方法在多样性改进方面是有限的,甚至相比于准确推荐是负面的。这一发现强调了需要探索更复杂的多样性增强技术,以提高有效推荐中的多样性。 | code | 0 |
RecDiff: Diffusion Model for Social Recommendation | Zongwei Li, Lianghao Xia, Chao Huang | Social recommendation has emerged as a powerful approach to enhance personalized recommendations by leveraging the social connections among users, such as following and friend relations observed in online social platforms. The fundamental assumption of social recommendation is that socially-connected users exhibit homophily in their preference patterns. This means that users connected by social ties tend to have similar tastes in user-item activities, such as rating and purchasing. However, this assumption is not always valid due to the presence of irrelevant and false social ties, which can contaminate user embeddings and adversely affect recommendation accuracy. To address this challenge, we propose a novel diffusion-based social denoising framework for recommendation (RecDiff). Our approach utilizes a simple yet effective hidden-space diffusion paradigm to alleviate the noisy effect in the compressed and dense representation space. By performing multi-step noise diffusion and removal, RecDiff possesses a robust ability to identify and eliminate noise from the encoded user representations, even when the noise levels vary. The diffusion module is optimized in a downstream task-aware manner, thereby maximizing its ability to enhance the recommendation process. We conducted extensive experiments to evaluate the efficacy of our framework, and the results demonstrate its superiority in terms of recommendation accuracy, training efficiency, and denoising effectiveness. The source code for the model implementation is publicly available at: https://github.com/HKUDS/RecDiff.
| 社交推荐作为一种利用用户间社交关系(如在线社交平台中的关注和好友关系)来增强个性化推荐的方法,已经崭露头角。社交推荐的基本假设是,通过社交关系连接的用户在偏好模式上表现出同质性。这意味着通过社交纽带连接的用户在用户-项目活动(如评分和购买)中往往具有相似的品味。然而,由于存在无关和虚假的社交关系,这一假设并不总是成立,这些关系可能会污染用户嵌入,从而对推荐准确性产生负面影响。为了应对这一挑战,我们提出了一种基于扩散的社交去噪推荐框架(RecDiff)。我们的方法采用了一种简单而有效的隐空间扩散范式,以减轻压缩和密集表示空间中的噪声影响。通过执行多步噪声扩散和去除,RecDiff具有强大的能力来识别和消除编码用户表示中的噪声,即使在噪声水平变化的情况下也是如此。扩散模块以任务感知的方式进行优化,从而最大化其增强推荐过程的能力。我们进行了广泛的实验来评估我们框架的有效性,结果表明它在推荐准确性、训练效率和去噪效果方面具有优越性。该模型的实现代码已在以下公开可用:https://github.com/HKUDS/RecDiff。 | code | 0 | |
Efficient and Robust Regularized Federated Recommendation | Langming Liu, Wanyu Wang, Xiangyu Zhao, Zijian Zhang, Chunxu Zhang, Shanru Lin, Yiqi Wang, Lixin Zou, Zitao Liu, Xuetao Wei, Hongzhi Yin, Qing Li | The University of Queensland, Brisbane, Australia; Jilin University, Changchun, China; The Hong Kong Polytechnic University, Hong Kong, China; City University of Hong Kong, Hong Kong, China; Michigan State University, East Lansing, USA; Jinan University, Guangzhou, China; Southern University of Science and Technology, Shenzhen, China; Wuhan University, Wuhan, China | Recommender systems play a pivotal role across practical scenarios, showcasing remarkable capabilities in user preference modeling. However, the centralized learning paradigm predominantly used raises serious privacy concerns. The federated recommender system (FedRS) addresses this by updating models on clients, while a central server orchestrates training without accessing private data. Existing FedRS approaches, however, face unresolved challenges, including non-convex optimization, vulnerability, potential privacy leakage risk, and communication inefficiency. This paper addresses these challenges by reformulating the federated recommendation problem as a convex optimization issue, ensuring convergence to the global optimum. Based on this, we devise a novel method, RFRec, to tackle this optimization problem efficiently. In addition, we propose RFRecF, a highly efficient version that incorporates non-uniform stochastic gradient descent to improve communication efficiency. In user preference modeling, both methods learn local and global models, collaboratively learning users' common and personalized interests under the federated learning setting. Moreover, both methods significantly enhance communication efficiency, robustness, and privacy protection, with theoretical support. 
Comprehensive evaluations on four benchmark datasets demonstrate RFRec and RFRecF's superior performance compared to diverse baselines. | 推荐系统在实际应用场景中扮演着关键角色,展现了在用户偏好建模方面的显著能力。然而,目前主要采用的集中式学习模式引发了严重的隐私问题。联邦推荐系统(FedRS)通过在客户端更新模型来解决这一问题,同时中央服务器在不访问私人数据的情况下协调训练。尽管如此,现有的FedRS方法仍面临一些未解决的挑战,包括非凸优化、易受攻击性、潜在的隐私泄露风险以及通信效率低下。本文通过将联邦推荐问题重新表述为凸优化问题,确保了全局最优解的收敛性,从而应对这些挑战。基于此,我们设计了一种新型方法RFRec,以高效解决这一优化问题。此外,我们还提出了RFRecF,这是一种高效版本,结合了非均匀随机梯度下降以提高通信效率。在用户偏好建模方面,这两种方法都学习局部和全局模型,在联邦学习设置下协同学习用户的共同和个性化兴趣。此外,这两种方法在理论支持下显著提升了通信效率、鲁棒性和隐私保护。对四个基准数据集的综合评估表明,RFRec和RFRecF相较于多种基线方法表现更为优越。 | code | 0 |
Two Heads are Better than One: Zero-shot Cognitive Reasoning via Multi-LLM Knowledge Fusion | Liang Liu, Dong Zhang, Shoushan Li, Guodong Zhou, Erik Cambria | College of Computing and Data Science, Nanyang Technological University, Singapore, Singapore; School of Computer Science and Technology, Soochow University, Suzhou, China | Cognitive reasoning holds a significant place within Natural Language Processing (NLP). Yet, the exploration of zero-shot scenarios, which align more closely with real-life situations than supervised scenarios, has been relatively limited. While a few studies have employed Large Language Models (LLMs) to tackle zero-shot cognitive reasoning tasks, they still grapple with two key challenges: 1) Traditional approaches rely on the chain-of-thought (CoT) mechanism, wherein LLMs are provided with a "Let's think step by step" prompt. However, this schema may not accurately understand the meaning of a given question and ignores the possible learned knowledge (e.g., background or commonsense) of the LLMs about the questions, leading to incorrect answers. 2) Previous CoT methods normally exploit a single Large Language Model (LLM) and design many strategies to augment this LLM. We argue that the power of a single LLM is typically finite since it may not have learned some relevant knowledge about the question. To address these issues, we propose a Multi-LLM Knowledge Fusion (MLKF) approach, which resorts to heterogeneous knowledge emerging from multiple LLMs, for zero-shot cognitive reasoning tasks. Through extensive experiments and detailed analysis, we demonstrate that our MLKF can outperform the existing zero-shot or unsupervised state-of-the-art methods on four kinds of zero-shot tasks: aspect sentiment analysis, named entity recognition, question answering, and mathematical reasoning.
Our code is available at https://github.com/trueBatty/MLKF | 认知推理在自然语言处理(NLP)中占有重要地位。然而,零样本场景的研究相对较少,这些场景比监督场景更接近现实情况。尽管一些研究已经使用大型语言模型(LLMs)来解决零样本认知推理任务,但它们仍然面临两个关键挑战:1)传统方法依赖于思维链(CoT)机制,其中LLMs通过“让我们一步一步地思考”提示进行推理。然而,这种模式可能无法准确理解给定问题的含义,并且忽略了LLMs关于问题的潜在学习知识(例如背景或常识),导致答案错误。2)之前的CoT方法通常利用单一的大型语言模型(LLM),并设计多种策略来增强该LLM。我们认为,单一LLM的能力通常是有限的,因为它可能没有学习到与问题相关的某些知识。为了解决这些问题,我们提出了一种多LLM知识融合(MLKF)方法,该方法利用多个LLMs中涌现的异构知识来进行零样本认知推理任务。通过广泛的实验和详细分析,我们证明了我们的MLKF在四种零样本任务(方面情感分析、命名实体识别、问答和数学推理)上可以超越现有的零样本或无监督的最先进方法。我们的代码可在https://github.com/trueBatty/MLKF获取。 | code | 0 |
Collaborative Fraud Detection on Large Scale Graph Using Secure Multi-Party Computation | Xin Liu, Xiaoyu Fan, Rong Ma, Kun Chen, Yi Li, Guosai Wang, Wei Xu | Tsinghua University, Beijing, China; Independent Researcher, Beijing, China; Ant Group, Beijing, China; Tsingjiao Information Technology Co. Ltd., Beijing, China | Enabling various parties to share data enhances online fraud detection capabilities considering fraudsters tend to reuse resources attacking multiple platforms. Multi-party computation (MPC) techniques, such as secret sharing, offer potential privacy-preserving solutions but face efficiency challenges when handling large-scale data. This paper presents a novel approach, SecureFD (Secure Fraud Detector), aimed at detecting fraud in multi-party graph data, ensuring privacy, accuracy, and scalability. We propose a graph neural network EPR-GNN, which is MPC-friendly, as the base detector. Then we design a framework that allows multiple parties to train EPR-GNN collaboratively on secure sparse graphs in a privacy-preserving manner. The oblivious node embedding sharing protocol in the collaborative training procedure achieves up to a 45× speed-up, supporting over four million users compared to the naive solution. Additionally, we further reduce secure computation by locally pruning a significant number of non-suspicious users and selecting only the most valuable resources for sharing. Experiments on real datasets demonstrate that by securely integrating data from different parties, SecureFD achieves superior detection performance compared to state-of-the-art local detectors. And the local pruning greatly improves the scalability without compromising detection accuracies.
| 使各方能够共享数据增强了在线欺诈检测能力,因为欺诈者倾向于重复使用资源攻击多个平台。多方计算(MPC)技术,如秘密共享,提供了潜在的隐私保护解决方案,但在处理大规模数据时面临效率挑战。本文提出了一种新方法,SecureFD(安全欺诈检测器),旨在检测多方图数据中的欺诈行为,确保隐私、准确性和可扩展性。我们提出了一种图神经网络EPR-GNN,它对MPC友好,作为基础检测器。然后,我们设计了一个框架,允许多方在隐私保护的方式下,在安全稀疏图上协作训练EPR-GNN。协作训练过程中的不经意节点嵌入共享协议实现了高达45倍的速度提升,相比朴素解决方案,支持超过四百万用户。此外,我们通过本地修剪大量非可疑用户并仅选择最有价值的资源进行共享,进一步减少了安全计算。在真实数据集上的实验表明,通过安全地整合来自不同方的数据,SecureFD相比最先进的本地检测器实现了更优越的检测性能。而本地修剪极大地提高了可扩展性,且不影响检测准确性。 | code | 0 |
AlignRec: Aligning and Training in Multimodal Recommendations | Yifan Liu, Kangning Zhang, Xiangyuan Ren, Yanhua Huang, Jiarui Jin, Yingjie Qin, Ruilong Su, Ruiwen Xu, Yong Yu, Weinan Zhang | With the development of multimedia systems, multimodal recommendations are playing an essential role, as they can leverage rich contexts beyond interactions. Existing methods mainly regard multimodal information as an auxiliary, using them to help learn ID features; however, there exist semantic gaps among multimodal content features and ID-based features, for which directly using multimodal information as an auxiliary would lead to misalignment in representations of users and items. In this paper, we first systematically investigate the misalignment issue in multimodal recommendations, and propose a solution named AlignRec. In AlignRec, the recommendation objective is decomposed into three alignments, namely alignment within contents, alignment between content and categorical ID, and alignment between users and items. Each alignment is characterized by a specific objective function and is integrated into our multimodal recommendation framework. To effectively train AlignRec, we propose starting from pre-training the first alignment to obtain unified multimodal features and subsequently training the following two alignments together with these features as input. As it is essential to analyze whether each multimodal feature helps in training and accelerate the iteration cycle of recommendation models, we design three new classes of metrics to evaluate intermediate performance. Our extensive experiments on three real-world datasets consistently verify the superiority of AlignRec compared to nine baselines. We also find that the multimodal features generated by AlignRec are better than currently used ones, which are to be open-sourced in our repository https://github.com/sjtulyf123/AlignRec_CIKM24.
| 随着多媒体系统的发展,多模态推荐系统正发挥着至关重要的作用,因为它们能够利用超越交互的丰富上下文信息。现有的方法主要将多模态信息视为辅助手段,利用它们来帮助学习ID特征;然而,多模态内容特征与基于ID的特征之间存在语义鸿沟,直接将多模态信息作为辅助使用会导致用户和物品表示之间的对齐错误。在本文中,我们首先系统地研究了多模态推荐中的对齐问题,并提出了一种名为AlignRec的解决方案。在AlignRec中,推荐目标被分解为三种对齐方式,即内容内部对齐、内容与类别ID之间的对齐,以及用户与物品之间的对齐。每种对齐方式都由一个特定的目标函数表征,并被整合到我们的多模态推荐框架中。为了有效训练AlignRec,我们提出从预训练第一个对齐开始,以获得统一的多模态特征,随后将这些特征作为输入,同时训练后续的两个对齐。由于分析每个多模态特征是否有助于训练并加速推荐模型的迭代周期至关重要,我们设计了三类新的指标来评估中间性能。我们在三个真实世界数据集上的广泛实验一致验证了AlignRec相对于九个基线的优越性。我们还发现,由AlignRec生成的多模态特征优于当前使用的特征,这些特征将在我们的代码库https://github.com/sjtulyf123/AlignRec_CIKM24中开源。 | code | 0 | |
A Universal Sets-level Optimization Framework for Next Set Recommendation | Yuli Liu, Min Liu, Christian Walder, Lexing Xie | ; Google DeepMind, Montreal, Canada; Australian National University, Canberra, Australia; Qinghai University & Australian National University, Xining, China | Next Set Recommendation (NSRec), encompassing related tasks such as next basket recommendation and temporal sets prediction, stands as a trending research topic. Although numerous attempts have been made on this topic, there are certain drawbacks: (i) Existing studies are still confined to utilizing objective functions commonly found in Next Item Recommendation (NIRec), such as binary cross entropy and BPR, which are calculated based on individual item comparisons; (ii) They place emphasis on building sophisticated learning models to capture intricate dependency relationships across sequential sets, but frequently overlook pivotal dependency in their objective functions; (iii) Diversity factor within sequential sets is frequently overlooked. In this research, we endeavor to unveil a universal and Sets-level optimization framework for Next Set Recommendation (SNSRec), offering a holistic fusion of diversity distribution and intricate dependency relationships within temporal sets. To realize this, the following contributions are made: (i) We directly model the temporal set in a sequence as a cohesive entity, leveraging the Structured Determinantal Point Process (SDPP), wherein the probabilistic DPP distribution prioritizes collections of structures (sequential sets) instead of individual items; (ii) We introduce a co-occurrence representation to discern and acknowledge the importance of different sets; (iii) We propose a sets-level optimization criterion, which integrates the diversity distribution and dependency relations across the entire sequence of sets, guiding the model to recommend relevant and diversified set. 
Extensive experiments on real-world datasets show that our approach consistently outperforms previous methods on both relevance and diversity. | 下一集合推荐(NSRec),包括下一篮子推荐和时间集合预测等相关的任务,已经成为一个热门的研究课题。尽管在这个领域已经有很多尝试,但仍存在一些不足:(i)现有的研究仍然局限于使用在下一项目推荐(NIRec)中常见的目标函数,如二元交叉熵和BPR,这些函数是基于单个项目的比较计算的;(ii)它们注重构建复杂的学习模型来捕捉序列集合之间的复杂依赖关系,但往往忽略了目标函数中的关键依赖关系;(iii)序列集合内的多样性因素经常被忽视。在本研究中,我们努力揭示一个通用的、集合级别的优化框架,用于下一集合推荐(SNSRec),提供了一个将多样性分布和时间集合内的复杂依赖关系全面融合的方案。为了实现这一点,我们做出了以下贡献:(i)我们将时间序列集合直接建模为一个有凝聚力的实体,利用结构化行列式点过程(SDPP),其中概率DPP分布优先考虑结构集合(序列集合)而不是单个项目;(ii)我们引入了一个共现表示来识别和承认不同集合的重要性;(iii)我们提出了一种集合级别的优化标准,该标准整合了整个序列集合的多样性分布和依赖关系,指导模型推荐相关且多样化的集合。在真实世界数据集上的广泛实验表明,我们的方法在相关性和多样性方面始终优于以前的方法。 | code | 0 |
Adversarial Text Rewriting for Text-aware Recommender Systems | Sejoon Oh, Gaurav Verma, Srijan Kumar | Text-aware recommender systems incorporate rich textual features, such as titles and descriptions, to generate item recommendations for users. The use of textual features helps mitigate cold-start problems, and thus, such recommender systems have attracted increased attention. However, we argue that the dependency on item descriptions makes the recommender system vulnerable to manipulation by adversarial sellers on e-commerce platforms. In this paper, we explore the possibility of such manipulation by proposing a new text rewriting framework to attack text-aware recommender systems. We show that the rewriting attack can be exploited by sellers to unfairly uprank their products, even though the adversarially rewritten descriptions are perceived as realistic by human evaluators. Methodologically, we investigate two different variations to carry out text rewriting attacks: (1) two-phase fine-tuning for greater attack performance, and (2) in-context learning for higher text rewriting quality. Experiments spanning 3 different datasets and 4 existing approaches demonstrate that recommender systems exhibit vulnerability against the proposed text rewriting attack. Our work adds to the existing literature around the robustness of recommender systems, while highlighting a new dimension of vulnerability in the age of large-scale automated text generation. | 文本感知推荐系统利用丰富的文本特征,如标题和描述,为用户生成物品推荐。使用文本特征有助于缓解冷启动问题,因此这类推荐系统引起了越来越多的关注。然而,我们认为,对物品描述的依赖使得推荐系统容易受到电子商务平台上对抗性卖家的操纵。在本文中,我们通过提出一个新的文本重写框架来探索这种操纵的可能性,以攻击文本感知推荐系统。我们展示了重写攻击可以被卖家利用来不公平地提升其产品的排名,即使对抗性重写的描述被人类评估者认为是真实的。在方法上,我们研究了两种不同的变体来进行文本重写攻击:(1)两阶段微调以提高攻击性能,(2)上下文内学习以提高文本重写质量。跨越3个不同数据集和4种现有方法的实验表明,推荐系统对所提出的文本重写攻击表现出脆弱性。我们的工作增加了关于推荐系统鲁棒性的现有文献,同时在大规模自动化文本生成时代突显了新的脆弱性维度。 | code | 0 | |
Towards Completeness-Oriented Tool Retrieval for Large Language Models | Changle Qu, Sunhao Dai, Xiaochi Wei, Hengyi Cai, Shuaiqiang Wang, Dawei Yin, Jun Xu, JiRong Wen | Recently, integrating external tools with Large Language Models (LLMs) has gained significant attention as an effective strategy to mitigate the limitations inherent in their pre-training data. However, real-world systems often incorporate a wide array of tools, making it impractical to input all tools into LLMs due to length limitations and latency constraints. Therefore, to fully exploit the potential of tool-augmented LLMs, it is crucial to develop an effective tool retrieval system. Existing tool retrieval methods primarily focus on semantic matching between user queries and tool descriptions, frequently leading to the retrieval of redundant, similar tools. Consequently, these methods fail to provide a complete set of diverse tools necessary for addressing the multifaceted problems encountered by LLMs. In this paper, we propose a novel model-agnostic COllaborative Learning-based Tool Retrieval approach, COLT, which captures not only the semantic similarities between user queries and tool descriptions but also takes into account the collaborative information of tools. Specifically, we first fine-tune the PLM-based retrieval models to capture the semantic relationships between queries and tools in the semantic learning stage. Subsequently, we construct three bipartite graphs among queries, scenes, and tools and introduce a dual-view graph collaborative learning framework to capture the intricate collaborative relationships among tools during the collaborative learning stage. Extensive experiments on both the open benchmark and the newly introduced ToolLens dataset show that COLT achieves superior performance. Notably, the performance of BERT-mini (11M) with our proposed model framework outperforms BERT-large (340M), which has 30 times more parameters.
Furthermore, we will release ToolLens publicly to facilitate future research on tool retrieval. | 近期,将外部工具与大型语言模型(LLMs)集成作为一种有效策略,以缓解其预训练数据固有的局限性,已引起广泛关注。然而,现实世界系统通常包含多种工具,由于长度限制和延迟约束,将所有工具输入LLMs并不现实。因此,为了充分挖掘工具增强型LLMs的潜力,开发一个高效的工具检索系统至关重要。现有的工具检索方法主要集中在用户查询与工具描述之间的语义匹配上,这往往导致检索出冗余、相似的工具。因此,这些方法无法提供一套多样化的工具来解决LLMs面临的多方面问题。本文提出了一种新颖的模型无关的基于协同学习的工具检索方法,称为COLT,该方法不仅捕捉用户查询与工具描述之间的语义相似性,还考虑了工具之间的协同信息。具体而言,我们首先微调基于PLM的检索模型,以在语义学习阶段捕捉查询与工具之间的语义关系。随后,我们在查询、场景和工具之间构建三个二部图,并引入一个双重视图的图协同学习框架,以在协同学习阶段捕捉工具之间复杂的协同关系。在公开基准和新引入的ToolLens数据集上的广泛实验表明,COLT表现优异。值得注意的是,在我们提出的模型框架下,BERT-mini(11M参数)的性能超过了BERT-large(340M参数),后者参数数量是前者的30倍。此外,我们将公开发布ToolLens数据集,以促进未来在工具检索领域的研究。 | code | 0 | |
No Query Left Behind: Query Refinement via Backtranslation | Delaram Rajaei, Zahra Taheri, Hossein Fani | School of Computer Science, University of Windsor, Windsor, ON., Canada | Query refinement is to enhance the relevance of search results by modifying users' original queries to refined versions. State-of-the-art query refinement models have been trained on web query logs, which are predisposed to topic drifts. To fill the gap, little work has been proposed to generate benchmark datasets of (query → refined query) pairs through an overwhelming application of unsupervised or supervised modifications to the original query while controlling topic drifts. In this paper, however, we propose leveraging natural language backtranslation, a round-trip translation of a query from a source language via target languages, as a simple yet effective unsupervised approach to scale up generating gold-standard benchmark datasets. Backtranslation can (1) uncover terms that are omitted in a query for being commonly understood in a source language, but may not be known in a target language (e.g., ‘figs’→(tamil) ‘அத்திமரங்கள்’→‘the fig trees’), (2) augment a query with context-aware synonyms in a target language (e.g., ‘italian nobel prize winners’→(farsi) ’برنده های ایتالیایی جایزه نوبل‘ →‘italian nobel laureates’), and (3) help with the semantic disambiguation of polysemous terms and collocations (e.g., ‘custer's last stand’→(malay) ‘pertahan terakhir custer’→‘custer's last defence’). Our experiments across 5 query sets with different query lengths and topics and 10 languages from 7 language families using 2 neural machine translators validated the effectiveness of query backtranslation in generating a more extensive gold-standard dataset for query refinement. We open-sourced our research at https://github.com/fani-lab/RePair/tree/nqlb. |
查询精炼是通过修改用户原始查询为精炼版本,以增强搜索结果的相关性。目前最先进的查询精炼模型已经在网络查询日志上进行了训练,这些日志容易出现主题偏移。为了填补这一空白,很少有研究提出通过广泛应用无监督或监督的修改方法来生成(查询-精炼查询)对的基准数据集,同时控制主题偏移。然而,本文提出利用自然语言回译(一种通过目标语言进行源语言查询的往返翻译)作为一种简单而有效的无监督方法,来扩展生成黄金标准的基准数据集。回译可以(1)揭示在源语言中因常见而被省略但在目标语言中可能不为人知的术语(例如,‘figs’→(tamil) ‘அத்திமரங்கள்’→‘the fig trees’),(2)通过目标语言中的上下文相关同义词来增强查询(例如,‘italian nobel prize winners’→(farsi) ’برنده های ایتالیایی جایزه نوبل‘ →‘italian nobel laureates’),以及(3)帮助消除多义词和搭配的语义歧义(例如,‘custer's last stand’→(malay) ‘pertahan terakhir custer’→‘custer's last defence’)。我们在5个不同查询长度和主题的查询集以及来自7个语系的10种语言上进行的实验,使用了2种神经机器翻译器,验证了查询回译在生成更广泛的查询精炼黄金标准数据集方面的有效性。我们在https://github.com/fani-lab/RePair/tree/nqlb 开源了我们的研究。 | code | 0 |
Retrieval-enhanced Knowledge Editing in Language Models for Multi-Hop Question Answering | Yucheng Shi, Qiaoyu Tan, Xuansheng Wu, Shaochen Zhong, Kaixiong Zhou, Ninghao Liu | Large Language Models (LLMs) have shown proficiency in question-answering tasks but often struggle to integrate real-time knowledge, leading to potentially outdated or inaccurate responses. This problem becomes even more challenging when dealing with multi-hop questions, since they require LLMs to update and integrate multiple knowledge pieces relevant to the questions. To tackle the problem, we propose the Retrieval-Augmented model Editing (RAE) framework for multi-hop question answering. RAE first retrieves edited facts and then refines the language model through in-context learning. Specifically, our retrieval approach, based on mutual information maximization, leverages the reasoning abilities of LLMs to identify chain facts that traditional similarity-based searches might miss. In addition, our framework includes a pruning strategy to eliminate redundant information from the retrieved facts, which enhances the editing accuracy and mitigates the hallucination problem. Our framework is supported by theoretical justification for its fact retrieval efficacy. Finally, comprehensive evaluation across various LLMs validates RAE's ability in providing accurate answers with updated knowledge. Our code is available at: https://github.com/sycny/RAE. | 大型语言模型(LLMs)在问答任务中表现出色,但往往难以整合实时知识,导致可能提供过时或不准确的信息。在处理多跳问题时,这一问题变得更加复杂,因为这要求LLMs更新并整合与问题相关的多个知识片段。为了解决这一问题,我们提出了多跳问答的检索增强模型编辑(RAE)框架。RAE首先检索编辑后的事实,然后通过上下文学习对语言模型进行精炼。具体而言,我们的检索方法基于互信息最大化,利用LLMs的推理能力来识别传统基于相似性搜索可能遗漏的链式事实。此外,我们的框架还包括一种剪枝策略,以消除检索事实中的冗余信息,从而提高编辑的准确性并缓解幻觉问题。我们的框架得到了理论上的支持,证明了其在事实检索中的有效性。最后,通过对多种LLMs的综合评估,验证了RAE在提供更新知识的基础上准确回答问题的能力。我们的代码可在以下链接获取:https://github.com/sycny/RAE。 | code | 0 | |
Large Language Models Enhanced Collaborative Filtering | Zhongxiang Sun, Zihua Si, Xiaoxue Zang, Kai Zheng, Yang Song, Xiao Zhang, Jun Xu | Kuaishou Technology Co., Ltd; Renmin University of China Gaoling School of Artificial Intelligence | Recent advancements in Large Language Models (LLMs) have attracted considerable interest among researchers to leverage these models to enhance Recommender Systems (RSs). Existing work predominantly utilizes LLMs to generate knowledge-rich texts or utilizes LLM-derived embeddings as features to improve RSs. Although the extensive world knowledge embedded in LLMs generally benefits RSs, the application can only take limited number of users and items as inputs, without adequately exploiting collaborative filtering information. Considering its crucial role in RSs, one key challenge in enhancing RSs with LLMs lies in providing better collaborative filtering information through LLMs. In this paper, drawing inspiration from the in-context learning and chain of thought reasoning in LLMs, we propose the Large Language Models enhanced Collaborative Filtering (LLM-CF) framework, which distils the world knowledge and reasoning capabilities of LLMs into collaborative filtering. We also explored a concise and efficient instruction-tuning method, which improves the recommendation capabilities of LLMs while preserving their general functionalities (e.g., not decreasing on the LLM benchmark). Comprehensive experiments on three real-world datasets demonstrate that LLM-CF significantly enhances several backbone recommendation models and consistently outperforms competitive baselines, showcasing its effectiveness in distilling the world knowledge and reasoning capabilities of LLM into collaborative filtering.
| 近期大型语言模型(LLMs)的进展引起了研究者的广泛关注,他们试图利用这些模型来提升推荐系统(RSs)的性能。现有研究主要利用LLMs生成知识丰富的文本,或使用从LLM派生的嵌入作为特征来改进推荐系统。尽管LLMs嵌入的广泛世界知识通常对推荐系统有益,但这些应用只能处理有限数量的用户和物品作为输入,未能充分挖掘协同过滤信息。考虑到协同过滤在推荐系统中的关键作用,利用LLMs提升推荐系统的一个主要挑战在于通过LLMs提供更好的协同过滤信息。本文受LLMs中的上下文学习和思维链推理的启发,提出了大型语言模型增强的协同过滤(LLM-CF)框架,该框架将LLMs的世界知识和推理能力提炼到协同过滤中。我们还探索了一种简洁高效的指令调优方法,该方法在保留LLMs通用功能(如在LLM基准测试中不降低性能)的同时,提升了其推荐能力。在三个真实世界数据集上的综合实验表明,LLM-CF显著增强了多个骨干推荐模型,并持续优于竞争基线,展示了其将LLM的世界知识和推理能力提炼到协同过滤中的有效性。 | code | 0 |
Natural Language-Assisted Multi-modal Medication Recommendation | Jie Tan, Yu Rong, Kangfei Zhao, Tian Bian, Tingyang Xu, Junzhou Huang, Hong Cheng, Helen Meng | The Chinese University of Hong Kong, HongKong, China; DAMO Academy, Alibaba Group, Hupan Lab, Hangzhou, China; The Chinese University of Hong Kong, Hong Kong, China; Beijing Institute of Technology, Beijing, China; University of Texas at Arlington, Arlington, TX, USA | Combinatorial medication recommendation (CMR) is a fundamental task of healthcare, which offers opportunities for clinical physicians to provide more precise prescriptions for patients with intricate health conditions, particularly in the scenarios of long-term medical care. Previous research efforts have sought to extract meaningful information from electronic health records (EHRs) to facilitate combinatorial medication recommendations. Existing learning-based approaches further consider the chemical structures of medications, but ignore the textual medication descriptions in which the functionalities are clearly described. Furthermore, the textual knowledge derived from the EHRs of patients remains largely underutilized. To address these issues, we introduce the Natural Language-Assisted Multi-modal Medication Recommendation (NLA-MMR), a multimodal alignment framework designed to learn knowledge from the patient view and medication view jointly. Specifically, NLA-MMR formulates CMR as an alignment problem from patient and medication modalities. In this vein, we employ pretrained language models (PLMs) to extract in-domain knowledge regarding patients and medications, serving as the foundational representation for both modalities. In the medication modality, we exploit both chemical structures and textual descriptions to create medication representations. In the patient modality, we generate the patient representations based on textual descriptions of diagnosis, procedure, and symptom. 
Extensive experiments conducted on three publicly accessible datasets demonstrate that NLA-MMR achieves new state-of-the-art performance, with a notable average improvement of 4.72% in Jaccard score. | 组合药物推荐(CMR)是医疗保健中的一个基本任务,它为临床医生提供了为病情复杂的患者提供更精确处方的机会,特别是在长期医疗护理的情景中。以往的研究致力于从电子健康记录(EHRs)中提取有意义的信息,以促进组合药物推荐。现有的基于学习的方法进一步考虑了药物的化学结构,但忽略了药物描述文本,这些文本中清楚地描述了药物的功能。此外,从患者EHRs中提取的文本知识在很大程度上未被充分利用。为了解决这些问题,我们引入了自然语言辅助的多模态药物推荐(NLA-MMR),这是一个多模态对齐框架,旨在从患者视角和药物视角共同学习知识。具体来说,NLA-MMR将CMR表述为从患者和药物模态出发的对齐问题。为此,我们采用预训练语言模型(PLMs)来提取关于患者和药物的领域内知识,作为两种模态的基础表示。在药物模态中,我们利用化学结构和文本描述来创建药物表示。在患者模态中,我们基于诊断、治疗程序和症状的文本描述生成患者表示。在三个公开可用的数据集上进行的广泛实验表明,NLA-MMR达到了新的最先进性能,Jaccard分数平均提高了4.72%。 | code | 0 |
LAMRec: Label-aware Multi-view Drug Recommendation | Yunsen Tang, Ning Liu, Haitao Yuan, Yonghe Yan, Lei Liu, Weixing Tan, Lizhen Cui | Shandong University, Jinan, China; Nanyang Technological University, Singapore, Singapore; Shandong Research Institute of Industrial Technology, Jinan, China | The drug recommendation task aims to predict safe and effective drug prescriptions based on the patients' historical electronic health records (EHRs). However, existing drug recommendation models generally have two limitations. First, they neglect the inherent characteristics of multiple views existing in patients' clinical data (e.g., diagnoses and procedures), leading to fragmented and inconsistent patient representations. Second, they do not fully exploit drug label information. Most models do not explicitly establish a mapping relationship between drug labels and patients' historical visits. To address these two problems, we proposed a label-aware multi-view drug recommendation model named LAMRec. In particular, LAMRec uses a cross-attention module to fuse information from the diagnosis and procedure views, and increases the mutual information of patient multi-view representations through multi-view contrastive loss; the label-wise attention mechanism fully explores drug label information by constructing an adaptive mapping of drug-visit to generate personalized representations that are aware of the drug-related visit information. Experiments on three real world medical datasets demonstrated the superiority of LAMRec, with a relative reduction of 5.25% in DDI compared to the optimal baseline, a relative improvement of 4.20% in Jaccard similarity scores, and a relative improvement of 3.10% in F1 scores. We released the code online at: https://github.com/Tyunsen/LAMRec. 
| 药物推荐任务旨在根据患者的电子健康记录(EHR)历史数据预测安全有效的药物处方。然而,现有的药物推荐模型普遍存在两个局限性。首先,它们忽略了患者临床数据中多视图(如诊断和手术)的固有特性,导致患者表示碎片化和不一致。其次,它们未能充分利用药物标签信息。大多数模型没有明确建立药物标签与患者历史就诊之间的映射关系。为了解决这两个问题,我们提出了一种名为LAMRec的标签感知多视图药物推荐模型。具体而言,LAMRec通过交叉注意力模块融合诊断和手术视图的信息,并通过多视图对比损失增加患者多视图表示的互信息;标签感知注意力机制通过构建药物-就诊的自适应映射,充分挖掘药物标签信息,生成包含药物相关就诊信息的个性化表示。在三个真实世界的医疗数据集上的实验表明,LAMRec具有优越性,与最优基线相比,DDI相对减少5.25%,Jaccard相似度分数相对提高4.20%,F1分数相对提高3.10%。我们在网上发布了代码:https://github.com/Tyunsen/LAMRec。 | code | 0 |
Retrieval Augmented Deep Anomaly Detection for Tabular Data | Hugo Thimonier, Fabrice Popineau, Arpad Rimmel, BichLiên Doan | | Deep learning for tabular data has garnered increasing attention in recent years, yet employing deep models for structured data remains challenging. While these models excel with unstructured data, their efficacy with structured data has been limited. Recent research has introduced retrieval-augmented models to address this gap, demonstrating promising results in supervised tasks such as classification and regression. In this work, we investigate using retrieval-augmented models for anomaly detection on tabular data. We propose a reconstruction-based approach in which a transformer model learns to reconstruct masked features of normal samples. We test the effectiveness of KNN-based and attention-based modules to select relevant samples to help in the reconstruction process of the target sample. Our experiments on a benchmark of 31 tabular datasets reveal that augmenting this reconstruction-based anomaly detection (AD) method with sample-sample dependencies via retrieval modules significantly boosts performance. The present work supports the idea that retrieval modules are useful to augment any deep AD method to enhance anomaly detection on tabular data. | 近年来,深度学习在表格数据处理领域引起了越来越多的关注,然而,将深度模型应用于结构化数据仍然充满挑战。尽管这些模型在非结构化数据上表现出色,但它们在结构化数据上的有效性却受到限制。最近的研究引入了检索增强模型来填补这一空白,在分类和回归等监督任务中展示了有前景的结果。在这项工作中,我们探讨了使用检索增强模型进行表格数据异常检测的方法。我们提出了一种基于重构的方法,其中Transformer模型学习重构正常样本的掩码特征。我们测试了基于KNN和基于注意力模块的有效性,以选择相关样本来辅助目标样本的重构过程。我们在31个表格数据集的基准测试中进行的实验表明,通过检索模块增强这种基于重构的异常检测(AD)方法,利用样本间的依赖关系,显著提升了性能。本研究支持了这样一个观点:检索模块对于增强任何深度AD方法,以提高表格数据上的异常检测效能是有益的。 | code | 0 |
On Causally Disentangled State Representation Learning for Reinforcement Learning based Recommender Systems | Siyu Wang, Xiaocong Chen, Lina Yao | | In Reinforcement Learning-based Recommender Systems (RLRS), the complexity and dynamism of user interactions often result in high-dimensional and noisy state spaces, making it challenging to discern which aspects of the state are truly influential in driving the decision-making process. This issue is exacerbated by the evolving nature of user preferences and behaviors, requiring the recommender system to adaptively focus on the most relevant information for decision-making while preserving generalizability. To tackle this problem, we introduce an innovative causal approach for decomposing the state and extracting Causal-InDispensable State Representations (CIDS) in RLRS. Our method concentrates on identifying the Directly Action-Influenced State Variables (DAIS) and Action-Influence Ancestors (AIA), which are essential for making effective recommendations. By leveraging conditional mutual information, we develop a framework that not only discerns the causal relationships within the generative process but also isolates critical state variables from the typically dense and high-dimensional state representations. We provide theoretical evidence for the identifiability of these variables. Then, by making use of the identified causal relationship, we construct causal-indispensable state representations, enabling the training of policies over a more advantageous subset of the agent's state space. We demonstrate the efficacy of our approach through extensive experiments, showing that our method outperforms state-of-the-art methods. | 在基于强化学习的推荐系统(RLRS)中,用户交互的复杂性和动态性常常导致高维度和噪声状态空间,使得难以辨别哪些状态方面真正影响决策过程。这一问题因用户偏好和行为的不断演变而加剧,要求推荐系统在保持泛化能力的同时,自适应地专注于决策中最相关的信息。为解决此问题,我们引入了一种创新的因果分解方法,用于在RLRS中提取Causal-InDispensable State Representations(CIDS)。我们的方法专注于识别Directly Action-Influenced State Variables(DAIS)和Action-Influence Ancestors(AIA),这些变量对于做出有效推荐至关重要。通过利用条件互信息,我们开发了一个框架,不仅能够辨别生成过程中的因果关系,还能从通常密集且高维的状态表示中隔离出关键状态变量。我们提供了这些变量可识别性的理论证据。随后,通过利用识别出的因果关系,我们构建了因果不可或缺的状态表示,使得在代理状态空间中更具优势的子集上训练策略成为可能。我们通过广泛的实验展示了我们方法的有效性,证明其优于现有最先进的方法。 | code | 0 |
Topology-aware Retrieval Augmentation for Text Generation | Yu Wang, Nedim Lipka, Ruiyi Zhang, Alexa F. Siu, Yuying Zhao, Bo Ni, Xin Wang, Ryan A. Rossi, Tyler Derr | | Despite the impressive advancements of Large Language Models (LLMs) in generating text, they are often limited by the knowledge contained in the input and prone to producing inaccurate or hallucinated content. To tackle these issues, Retrieval-augmented Generation (RAG) is employed as an effective strategy to enhance the available knowledge base and anchor the responses in reality by pulling additional texts from external databases. In real-world applications, texts are often linked through entities within a graph, such as citations in academic papers or comments in social networks. This paper exploits these topological relationships to guide the retrieval process in RAG. Specifically, we explore two kinds of topological connections: proximity-based, focusing on closely connected nodes, and role-based, which looks at nodes sharing similar subgraph structures. Our empirical research confirms their relevance to text relationships, leading us to develop a Topology-aware Retrieval-augmented Generation framework. This framework includes a retrieval module that selects texts based on their topological relationships and an aggregation module that integrates these texts into prompts to stimulate LLMs for text generation. We have curated established text-attributed networks and conducted comprehensive experiments to validate the effectiveness of this framework, demonstrating its potential to enhance RAG with topological awareness. | 尽管大型语言模型(LLM)在生成文本方面取得了显著进展,但它们通常受限于输入中的知识,容易产生不准确或虚构的内容。为了解决这些问题,检索增强生成(RAG)被用作一种有效的策略,通过从外部数据库中提取额外的文本来增强可用知识库,并将响应锚定在现实世界中。在实际应用中,文本通常通过图中的实体相互关联,例如学术论文中的引用或社交网络中的评论。本文利用这些拓扑关系来指导RAG中的检索过程。具体而言,我们探索了两类拓扑连接:基于接近度的连接,关注紧密连接的节点;以及基于角色的连接,关注共享相似子图结构的节点。我们的实证研究表明它们与文本关系的相关性,从而开发了一种拓扑感知检索增强生成框架。该框架包括一个基于拓扑关系选择文本的检索模块和一个将这些文本整合到提示中以刺激LLM进行文本生成的聚合模块。我们精心构建了现有的文本属性网络,并进行了全面的实验以验证该框架的有效性,展示了其通过拓扑感知增强RAG的潜力。 | code | 0 |
LLM4MSR: An LLM-Enhanced Paradigm for Multi-Scenario Recommendation | Yuhao Wang, Yichao Wang, Zichuan Fu, Xiangyang Li, Wanyu Wang, Yuyang Ye, Xiangyu Zhao, Huifeng Guo, Ruiming Tang | | As the demand for more personalized recommendation grows and a dramatic boom in commercial scenarios arises, the study on multi-scenario recommendation (MSR) has attracted much attention, which uses the data from all scenarios to simultaneously improve their recommendation performance. However, existing methods tend to integrate insufficient scenario knowledge and neglect learning personalized cross-scenario preferences, thus leading to suboptimal performance and inadequate interpretability. Meanwhile, though large language model (LLM) has shown great capability of reasoning and capturing semantic information, the high inference latency and high computation cost of tuning hinder its implementation in industrial recommender systems. To fill these gaps, we propose an effective efficient interpretable LLM-enhanced paradigm LLM4MSR in this work. Specifically, we first leverage LLM to uncover multi-level knowledge including scenario correlations and users' cross-scenario interests from the designed scenario- and user-level prompt without fine-tuning the LLM, then adopt hierarchical meta networks to generate multi-level meta layers to explicitly improve the scenario-aware and personalized recommendation capability. Our experiments on KuaiSAR-small, KuaiSAR, and Amazon datasets validate two significant advantages of LLM4MSR: (i) the effectiveness and compatibility with different multi-scenario backbone models (achieving 1.51deployability on industrial recommender systems, and (iii) improved interpretability. The implemented code and data is available to ease reproduction. | 随着对更个性化推荐的需求日益增长以及商业场景中的显著繁荣,多场景推荐(MSR)研究引起了广泛关注。该研究利用所有场景的数据,旨在同时提升各场景的推荐性能。然而,现有方法往往整合不足的场景知识,忽视了跨场景个性化偏好的学习,导致性能次优且解释性不足。同时,尽管大型语言模型(LLM)在推理和捕捉语义信息方面展现出强大能力,但其高推理延迟和高计算成本的调优阻碍了其在工业推荐系统中的应用。 |
为填补这些空白,本文提出了一种高效的、可解释的LLM增强范式LLM4MSR。具体而言,我们首先利用LLM在不进行微调的情况下,通过设计场景级和用户级提示,揭示包括场景关联和用户跨场景兴趣在内的多层次知识。随后,采用分层元网络生成多层次元层,以显式提升场景感知和个性化推荐能力。我们在KuaiSAR-small、KuaiSAR和Amazon数据集上的实验验证了LLM4MSR的两个显著优势:(i)有效性和与不同多场景骨干模型的兼容性(在工业推荐系统中实现1.51的部署性),以及(iii)提升的解释性。我们提供的实现代码和数据将有助于复现研究成果。|code|0| |Time-Sensitve Retrieval-Augmented Generation for Question Answering|Feifan Wu, Lingyuan Liu, Wentao He, Ziqi Liu, Zhiqiang Zhang, Haofen Wang, Meng Wang|Southeast University, Nanjing, China; Ant Group, Hangzhou, China; Southeast University & XAI Lab, Tongji University, Nanjing, China; College of Design and Innovation, Tongji University, Shanghai, China|Retrieval-augmented generation (RAG) enhances large language models (LLMs) by accessing external data sources, offering a promising way to improve accuracy and reliability. Despite its potential, conventional retrievers encounter bias and flaws with time-sensitive queries. In this paper, a benchmark query dataset is constructed to retrieve documents containing time-evolving facts, and the results show that current embedding-based similarity-matching methods struggle to handle queries with explicit temporal constraints. Therefore, we propose a novel approach that integrates supervised contrastive learning with tailored negative sample pairs for temporal constraints to train the retriever of an RAG system, along with query-side fine-tuning and routing techniques. Experimental results show that our approach significantly enhances the retriever performance of time-sensitive queries while ensuring the effectiveness of general queries. 
We will make the code and dataset publicly available at https://github.com/suzhou-22/TS-Retriever.|检索增强生成(RAG)通过访问外部数据源,增强了大型语言模型(LLMs)的能力,为提高准确性和可靠性提供了一种有前景的方式。尽管其潜力巨大,传统的检索器在处理与时间敏感的查询时仍面临偏差和缺陷。本文构建了一个基准查询数据集,用于检索包含时间演化事实的文档,结果显示当前基于嵌入的相似性匹配方法在处理带有明确时间约束的查询时表现不佳。因此,我们提出了一种新方法,将监督对比学习与针对时间约束定制的负样本对相结合,用于训练RAG系统的检索器,同时结合查询端的微调和路由技术。实验结果表明,我们的方法显著提升了对时间敏感查询的检索器性能,同时确保了通用查询的有效性。我们将在https://github.com/suzhou-22/TS-Retriever公开代码和数据集。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Time-Sensitve+Retrieval-Augmented+Generation+for+Question+Answering)|0| |Bridge the Gap between Past and Future: Siamese Model Optimization for Context-Aware Document Ranking|Songhao Wu, Quan Tu, Mingjie Zhong, Hong Liu, Jia Xu, Jinjie Gu, Rui Yan|Ant Group, Hangzhou, China; Gaoling School of Artificial Intelligence, Renmin University of China, Beijing, China|In the realm of information retrieval, users often engage in multi-turn interactions with search engines to acquire information, leading to the formation of sequences of user feedback behaviors. Leveraging the session context has proven to be beneficial for inferring user search intent and document ranking. A multitude of approaches have been proposed to exploit in-session context for improved document ranking. Despite these advances, the limitation of historical session data for capturing evolving user intent remains a challenge. In this work, we explore the integration of future contextual information into the session context to enhance document ranking. We present the siamese model optimization framework, comprising a history-conditioned model and a future-aware model. The former processes only the historical behavior sequence, while the latter integrates both historical and anticipated future behaviors. Both models are trained collaboratively using the supervised labels and pseudo labels predicted by the other. 
The history-conditioned model, referred to as ForeRanker, progressively learns future-relevant information to enhance ranking, while it singly uses historical session at inference time. To mitigate inconsistencies during training, we introduce the peer knowledge distillation method with a dynamic gating mechanism, allowing models to selectively incorporate contextual information. Experimental results on benchmark datasets demonstrate the effectiveness of our ForeRanker, showcasing its superior performance compared to existing methods.|在信息检索领域,用户通常通过与搜索引擎的多轮交互来获取信息,从而形成一系列用户反馈行为。利用会话上下文已被证明有助于推断用户搜索意图和文档排序。已经提出了多种方法来利用会话内上下文以改进文档排序。尽管取得了这些进展,但捕捉用户意图演变的历史会话数据的局限性仍然是一个挑战。在这项工作中,我们探讨了将会话上下文与未来上下文信息相结合以增强文档排序的方法。我们提出了孪生模型优化框架,包括一个历史条件模型和一个未来感知模型。前者仅处理历史行为序列,而后者则结合了历史和预期的未来行为。两个模型通过监督标签和另一个模型预测的伪标签进行协同训练。历史条件模型,称为ForeRanker,逐步学习与未来相关的信息以提升排序,而在推理时单独使用历史会话。为了减少训练过程中的不一致性,我们引入了具有动态门控机制的同行知识蒸馏方法,使模型能够选择性地整合上下文信息。在基准数据集上的实验结果证明了我们ForeRanker的有效性,展示了其相对于现有方法的优越性能。|code|0| |Federated Node Classification over Distributed Ego-Networks with Secure Contrastive Embedding Sharing|Han Xie, Li Xiong, Carl Yang|Emory University, Atlanta, GA, USA|Federated learning on graphs (a.k.a., federated graph learning - FGL) has recently received increasing attention due to its capacity to enable collaborative learning over distributed graph datasets without compromising local clients' data privacy. In previous works, clients of FGL typically represent institutes or organizations that possess sets of entire graphs (e.g., molecule graphs in biochemical research) or parts of a larger graph (e.g., sub-user networks of e-commerce platforms). However, another natural paradigm exists where clients act as remote devices retaining the graph structures of local neighborhoods centered around the device owners (i.e., ego-networks), which can be modeled for specific graph applications such as user profiling on social ego-networks and infection prediction on contact ego-networks. 
FGL in such novel yet realistic ego-network settings faces the unique challenge of incomplete neighborhood information for non-ego local nodes since they likely appear and have different sets of neighbors in multiple ego-networks. To address this challenge, we propose an FGL method for distributed ego-networks in which clients obtain complete neighborhood information of local nodes through sharing node embeddings with other clients. A contrastive learning mechanism is proposed to bridge the gap between local and global node embeddings and stabilize the local training of graph neural network models, while a secure embedding sharing protocol is employed to protect individual node identity and embedding privacy against the server and other clients. Comprehensive experiments on various distributed ego-network datasets successfully demonstrate the effectiveness of our proposed embedding sharing method on top of different federated model sharing frameworks, and we also provide discussions on the potential efficiency and privacy drawbacks of the method as well as their future mitigation.|图上的联邦学习(又称联邦图学习 - FGL)近年来因其能够在不损害本地客户端数据隐私的情况下实现分布式图数据集上的协作学习而受到越来越多的关注。在以往的研究中,FGL的客户端通常代表拥有完整图集(例如,生物化学研究中的分子图)或大型图的一部分(例如,电子商务平台的子用户网络)的机构或组织。然而,还存在另一种自然范式,其中客户端作为远程设备,保留以设备所有者为中心的本地邻域的图结构(即自我网络),这些图结构可以用于特定的图应用,例如社交自我网络上的用户画像和接触自我网络上的感染预测。在这种新颖且现实的自我网络设置中,FGL面临一个独特的挑战,即非自我本地节点的邻域信息不完整,因为它们可能出现在多个自我网络中并具有不同的邻居集合。为了应对这一挑战,我们提出了一种针对分布式自我网络的FGL方法,其中客户端通过与其他客户端共享节点嵌入来获取本地节点的完整邻域信息。我们提出了一种对比学习机制,以弥合本地和全局节点嵌入之间的差距,并稳定图神经网络模型的本地训练,同时采用了一种安全的嵌入共享协议,以保护服务器和其他客户端对个体节点身份和嵌入隐私的访问。在各种分布式自我网络数据集上的综合实验成功证明了我们提出的嵌入共享方法在不同联邦模型共享框架上的有效性,并且我们还讨论了该方法潜在的效率和隐私缺陷及其未来的缓解措施。|code|0| |UniMPC: Towards a Unified Framework for Multi-Party Conversations|Yunhe Xie, Chengjie Sun, Yifan Liu, Zhenzhou Ji, Bingquan Liu|Faculty of Computing, Harbin Institute of Technology, Harbin, China|The Multi-Party Conversation (MPC) system has gained attention for its relevance in modern communication. 
Recent work has focused on developing specialized models for different MPC subtasks, improving state-of-the-art (SOTA) performance. However, since MPC demands often arise collaboratively, managing multiple specialized models is impractical. Additionally, dialogue evolves through diverse meta-information, where knowledge from specific subtasks can influence others. To address this, we propose UniMPC, a unified framework that consolidates common MPC subtasks. UniMPC uses a graph network with utterance nodes, a global node for combined local and global information, and two adaptable free nodes. It also incorporates discourse parsing to enhance model updates. We introduce MPCEval, a new benchmark for evaluating MPC systems. Experiments show UniMPC achieves over 95% of SOTA performance across all subtasks, with some surpassing existing SOTA, highlighting the effectiveness of the global node, free nodes, and dynamic discourse-aware graphs.|多方对话(Multi-Party Conversation, MPC)系统因其与现代通信的相关性而受到关注。近期研究致力于为不同的MPC子任务开发专用模型,以提升最先进(SOTA)的性能。然而,由于MPC需求通常是协同产生的,管理多个专用模型并不现实。此外,对话通过多样化的元信息演变,特定子任务的知识可以影响其他子任务。为了解决这一问题,我们提出了UniMPC,这是一个整合常见MPC子任务的统一框架。UniMPC采用了一个图网络,其中包括话语节点、一个用于结合局部和全局信息的全局节点以及两个可适应的自由节点。它还集成了话语解析以增强模型更新。我们引入了MPCEval,一个新的用于评估MPC系统的基准。实验表明,UniMPC在所有子任务上达到了超过95%的SOTA性能,其中一些子任务超过了现有的SOTA,突显了全局节点、自由节点和动态话语感知图的有效性。|code|0| |AlignGroup: Learning and Aligning Group Consensus with Member Preferences for Group Recommendation|Jinfeng Xu, Zheyu Chen, Jinze Li, Shuo Yang, Hewei Wang, Edith C. H. Ngai||Group activities are important behaviors in human society, providing personalized recommendations for groups is referred to as the group recommendation task. Existing methods can usually be categorized into two strategies to infer group preferences: 1) determining group preferences by aggregating members' personalized preferences, and 2) inferring group consensus by capturing group members' coherent decisions after common compromises. 
However, the former would suffer from the lack of group-level considerations, and the latter overlooks the fine-grained preferences of individual users. To this end, we propose a novel group recommendation method AlignGroup, which focuses on both group consensus and individual preferences of group members to infer the group decision-making. Specifically, AlignGroup explores group consensus through a well-designed hypergraph neural network that efficiently learns intra- and inter-group relationships. Moreover, AlignGroup innovatively utilizes a self-supervised alignment task to capture fine-grained group decision-making by aligning the group consensus with members' common preferences. Extensive experiments on two real-world datasets validate that our AlignGroup outperforms the state-of-the-art on both the group recommendation task and the user recommendation task, as well as outperforms the efficiency of most baselines.|群体活动在人类社会中是重要的行为,为群体提供个性化推荐被称为群体推荐任务。现有的方法通常可以分为两种策略来推断群体偏好:1) 通过聚合成员的个性化偏好来确定群体偏好,2) 通过捕捉群体成员在共同妥协后的连贯决策来推断群体共识。然而,前者缺乏对群体层面的考虑,而后者则忽略了用户的细粒度偏好。为此,我们提出了一种新的群体推荐方法AlignGroup,该方法同时关注群体共识和群体成员的个体偏好,以推断群体决策。具体而言,AlignGroup通过一个精心设计的超图神经网络来探索群体共识,该网络有效地学习群体内和群体间的关系。此外,AlignGroup创新性地利用自监督对齐任务,通过将群体共识与成员的共同偏好对齐来捕捉细粒度的群体决策。在两个真实世界数据集上的广泛实验验证了我们的AlignGroup在群体推荐任务和用户推荐任务上均优于现有技术水平,并且在效率上也优于大多数基线方法。|code|0|
|Shape-aware Graph Spectral Learning|Junjie Xu, Enyan Dai, Dongsheng Luo, Xiang Zhang, Suhang Wang||Spectral Graph Neural Networks (GNNs) are gaining attention for their ability to surpass the limitations of message-passing GNNs. They rely on supervision from downstream tasks to learn spectral filters that capture the graph signal's useful frequency information. However, some works empirically show that the preferred graph frequency is related to the graph homophily level. This relationship between graph frequency and graphs with homophily/heterophily has not been systematically analyzed and considered in existing spectral GNNs. To mitigate this gap, we conduct theoretical and empirical analyses revealing a positive correlation between low-frequency importance and the homophily ratio, and a negative correlation between high-frequency importance and the homophily ratio. Motivated by this, we propose shape-aware regularization on a Newton Interpolation-based spectral filter that can (i) learn an arbitrary polynomial spectral filter and (ii) incorporate prior knowledge about the desired shape of the corresponding homophily level. Comprehensive experiments demonstrate that NewtonNet can achieve graph spectral filters with desired shapes and superior performance on both homophilous and heterophilous datasets.|谱图神经网络(GNNs)因其能够超越消息传递GNN的局限性而受到关注。它们依赖于下游任务的监督来学习捕捉图信号有用频谱信息的谱滤波器。然而,一些研究表明,优选的图频谱与图同质性水平相关。这种图频谱与具有同质性/异质性的图之间的关系尚未在现有的谱GNN中得到系统的分析和考虑。为了弥补这一差距,我们进行了理论和实证分析,揭示了低频重要性与同质性比率之间的正相关关系,以及高频重要性与同质性比率之间的负相关关系。受此启发,我们提出了一种基于牛顿插值的谱滤波器的形状感知正则化方法,该方法能够(i)学习任意多项式谱滤波器,(ii)结合关于所需同质性水平的先验知识。综合实验表明,NewtonNet能够实现具有所需形状的图谱滤波器,并在同质性和异质性数据集上均表现出优越的性能。|code|0|
|Topological Anonymous Walk Embedding: A New Structural Node Embedding Approach|Yuchen Yan, Yongyi Hu, Qinghai Zhou, Shurang Wu, Dingsu Wang, Hanghang Tong|Shanghai Jiao Tong University, Minhang, Shanghai, China; University of Science and Technology of China, Hefei, Anhui, China; University of Illinois at Urbana-Champaign, Urbana, IL, USA|Network embedding is a commonly used technique in graph mining and plays an important role in a variety of applications. Most network embedding works can be categorized into positional node embedding methods and target at capturing the proximity/relative position of node pairs. Recently, structural node embedding has attracted tremendous research interest, which is intended to perceive the local structural information of node, i.e., nodes can share similar local structures in different positions of graphs. 
Although numerous structural node embedding methods are designed to encode such structural information, most, if not all, of these methods cannot simultaneously achieve the following three desired properties: (1) bijective mapping between embedding and local structure of node; (2) inductive capability; and (3) good interpretability of node embedding. To address this challenge, in this paper, we propose a novel structural node embedding algorithm named topological anonymous walk embedding (TAWE). Specifically, TAWE creatively integrates anonymous walk and breadth-first search (BFS) to construct the bijective mapping between node embedding and local structure of node. In addition, TAWE possesses inductive capability and good interpretability of node embedding. Experimental results on both synthetic and real-world datasets demonstrate the effectiveness of the proposed TAWE algorithm in both structural node classification task and structural node clustering task.|网络嵌入是图挖掘中常用的一种技术,在多种应用中发挥着重要作用。大多数网络嵌入工作可以归类为位置节点嵌入方法,旨在捕捉节点对的接近度/相对位置。最近,结构节点嵌入引起了极大的研究兴趣,其目的是感知节点的局部结构信息,即节点可以在图的不同位置共享相似的局部结构。尽管设计了大量结构节点嵌入方法来编码这种结构信息,但大多数(如果不是全部)这些方法无法同时实现以下三个期望属性:(1)嵌入与节点局部结构之间的双射映射;(2)归纳能力;(3)节点嵌入的良好可解释性。为了解决这一挑战,本文提出了一种名为拓扑匿名游走嵌入(Topological Anonymous Walk Embedding, TAWE)的新型结构节点嵌入算法。具体来说,TAWE创新地将匿名游走与广度优先搜索(BFS)相结合,构建了节点嵌入与其局部结构之间的双射映射。此外,TAWE具有归纳能力和良好的节点嵌入可解释性。在合成数据集和真实世界数据集上的实验结果表明,所提出的TAWE算法在结构节点分类任务和结构节点聚类任务中均表现出了有效性。|code|0| |Spectral-Aware Augmentation for Enhanced Graph Representation Learning|Kaiqi Yang, Haoyu Han, Wei Jin, Hui Liu|Michigan State University; Emory University|Graph Contrastive Learning (GCL) has demonstrated remarkable effectiveness in learning representations on graphs in recent years. To generate ideal augmentation views, the augmentation generation methods should preserve essential information while discarding less relevant details for downstream tasks. 
However, current augmentation methods usually involve random topology corruption in the spatial domain, which fails to adequately address information spread across different frequencies in the spectral domain. Our preliminary study highlights this issue, demonstrating that spatial random perturbations impact all frequency bands almost uniformly. Given that task-relevant information typically resides in specific spectral regions that vary across graphs, this one-size-fits-all approach can pose challenges. We argue that indiscriminate spatial random perturbation might unintentionally weaken task-relevant information, reducing its effectiveness. To tackle this challenge, we propose applying perturbations selectively, focusing on information specific to different frequencies across diverse graphs. In this paper, we present GASSER, a model that applies tailored perturbations to specific frequencies of graph structures in the spectral domain, guided by spectral hints. Through extensive experimentation and theoretical analysis, we demonstrate that the augmentation views generated by GASSER are adaptive, controllable, and intuitively aligned with the homophily ratios and spectrum of graph structures.|图对比学习(GCL)近年来在图的表示学习中展示了显著的有效性。为了生成理想的增强视图,增强生成方法应在保留关键信息的同时,去除与下游任务不相关的细节。然而,当前的增强方法通常涉及空间域中的随机拓扑破坏,这未能充分解决频谱域中不同频率上的信息分布问题。我们的初步研究表明了这一问题,表明空间随机扰动几乎均匀地影响所有频段。鉴于任务相关的信息通常位于特定图谱区域,这些区域在不同图中有所不同,这种一刀切的方法可能会带来挑战。我们认为,不分青红皂白的空间随机扰动可能会无意中削弱与任务相关的信息,从而降低其有效性。为了应对这一挑战,我们提出有选择地应用扰动,专注于不同频率上特定于不同图的信息。本文中,我们介绍了GASSER模型,该模型在频谱域中根据频谱提示对图结构的特定频率应用定制的扰动。通过广泛的实验和理论分析,我们证明GASSER生成的增强视图具有自适应性、可控性,并且直观地与图结构的同质性比率和频谱相一致。|code|0| |Efficient Pruned Top-K Subgraph Matching with Topology-Aware Bounds|Linglin Yang, Yuqi Zhou, Yue Pang, Lei Zou|Peking University, Beijing, China|Given a query graph, top-k subgraph matching finds up to k matches in a data graph with the highest scores according to a user-defined scoring function. 
It has wide applications across many fields, including knowledge graphs and social networks. Due to the enormous search space, existing methods are not efficient enough on large graphs. In this paper, we propose PTAB, an efficient algorithm for top-k subgraph matching. It traverses an efficiently pruned search space by topology-aware sub-space score upper bounds computed from a novel hop index, which stores the range of node properties in a constrained multi-hop neighborhood of each node. Additionally, PTAB integrates a cost-aware root selection strategy, which chooses query nodes leading to a search process that utilizes the pruning power of the hop index as much as possible. Furthermore, we use a novel edge-cut strategy to handle general query graphs with cycles. Experimental results on real and synthetic datasets demonstrate that our method outperforms existing methods.|给定一个查询图,top-k子图匹配任务是在数据图中找到至多k个匹配度最高的子图,匹配度由用户定义的评分函数决定。该任务在多个领域中有广泛应用,包括知识图谱和社会网络。由于搜索空间巨大,现有方法在大规模图上效率不足。本文提出了PTAB,一种高效的top-k子图匹配算法。它通过一种新颖的跳跃索引计算出的拓扑感知子空间评分上界,遍历经过高效剪枝的搜索空间,该跳跃索引存储了每个节点在受限多跳邻域内节点属性的范围。此外,PTAB集成了一个成本感知的根节点选择策略,选择能够引导搜索过程尽可能利用跳跃索引剪枝能力的查询节点。我们还采用了一种新颖的边切策略来处理包含循环的一般查询图。在真实和合成数据集上的实验结果表明,我们的方法优于现有方法。|code|0| |A New Framework for Evaluating Faithfulness of Video Moment Retrieval against Multiple Distractors|Nakyeong Yang, Minsung Kim, Seunghyun Yoon, Joongbo Shin, Kyomin Jung|Seoul National University; Adobe Research; LG AI Research|With the explosion of multimedia content, video moment retrieval (VMR), which aims to detect a video moment that matches a given text query from a video, has been studied intensively as a critical problem. However, the existing VMR framework evaluates video moment retrieval performance, assuming that a video is given, which may not reveal whether the models exhibit overconfidence in the falsely given video. 
In this paper, we propose the MVMR (Massive Videos Moment Retrieval for Faithfulness Evaluation) task that aims to retrieve video moments within a massive video set, including multiple distractors, to evaluate the faithfulness of VMR models. For this task, we suggest an automated massive video pool construction framework to categorize negative (distractors) and positive (false-negative) video sets using textual and visual semantic distance verification methods. We extend existing VMR datasets using these methods and newly construct three practical MVMR datasets. To solve the task, we further propose a strong informative sample-weighted learning method, CroCs, which employs two contrastive learning mechanisms: (1) weakly-supervised potential negative learning and (2) cross-directional hard-negative learning. Experimental results on the MVMR datasets reveal that existing VMR models are easily distracted by the misinformation (distractors), whereas our model shows significantly robust performance, demonstrating that CroCs is essential to distinguishing positive moments against distractors. 
Our code and datasets are publicly available: https://github.com/yny0506/Massive-Videos-Moment-Retrieval.|随着多媒体内容的激增,视频时刻检索(VMR)——旨在从视频中检测与给定文本查询匹配的视频片段——已被广泛研究为一个关键问题。然而,现有的VMR框架在评估视频时刻检索性能时,假设视频是已知的,这可能无法揭示模型是否在错误提供的视频上表现出过度自信。在本文中,我们提出了MVMR(大规模视频时刻检索以评估模型忠实度)任务,该任务旨在从包含多个干扰项的大规模视频集中检索视频片段,以评估VMR模型的忠实度。为此任务,我们建议了一种自动大规模视频池构建框架,通过文本和视觉语义距离验证方法来分类负样本(干扰项)和正样本(假负例)视频集。我们使用这些方法扩展了现有的VMR数据集,并新构建了三个实用的MVMR数据集。为了解决该任务,我们进一步提出了一种强信息样本加权学习方法CroCs,该方法采用两种对比学习机制:(1)弱监督潜在负样本学习;(2)跨方向硬负样本学习。在MVMR数据集上的实验结果表明,现有VMR模型容易被错误信息(干扰项)所迷惑,而我们的模型表现出显著的鲁棒性能,证明了CroCs在区分正样本时刻与干扰项方面的重要性。我们的代码和数据集已公开发布:https://github.com/yny0506/Massive-Videos-Moment-Retrieval。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+New+Framework+for+Evaluating+Faithfulness+of+Video+Moment+Retrieval+against+Multiple+Distractors)|0| |Attacking Visually-aware Recommender Systems with Transferable and Imperceptible Adversarial Styles|Shiyi Yang, Chen Wang, Xiwei Xu, Liming Zhu, Lina Yao|Data61, CSIRO & The University of New South Wales, Eveleigh, Australia; The University of New South Wales & Data61, CSIRO, Sydney, Australia|The inclusion of the images opens up a security vulnerability of visually-aware recommender systems (VARSs). It can be exploited by unscrupulous parties to upload well-crafted adversarial images for certain malicious purposes (e.g., promoting their own products for profits). Some studies have focused on attacking VARSs to gain insights into their robustness, while they are still far from practical, i.e., the attacks often 1) lack diversity in perturbations, 2) are easily perceived and 3) have limited transferability, which may lead to overestimation of defenses in practice. To tackle the problems, we propose to perturb the style of the product, which is an unnoticeable but important property of visual recommendations. Specifically, we propose a novel Style perturbation-based Practical Attack Framework (SPAF). 
Unlike existing attacks that change pixels within l∞-norm constraints, SPAF interferes with styles in latent feature space so that the attack becomes unbounded in the pixel space to reflect possible actual perturbations. SPAF formulates attack objectives as an optimization problem and adopts an adaptive adversarial style transfer network to solve it so that transferable and imperceptible attacks can be generated. Comprehensive experiments on real-world datasets demonstrate that SPAF significantly outperforms state-of-the-art attacks.|图像的引入为视觉感知推荐系统(VARSs)带来了一个安全漏洞。不法分子可以利用这一漏洞上传精心制作的对抗性图像以达到某些恶意目的(例如,推广自己的产品以获取利润)。一些研究专注于攻击VARSs以了解其鲁棒性,但这些攻击方法在实际应用中仍存在不足,主要表现为:1) 扰动缺乏多样性,2) 容易被察觉,3) 转移性有限,这可能导致对防御措施的实际效果产生高估。为解决这些问题,我们提出对产品风格进行扰动,这是一种不易察觉但影响视觉推荐的重要属性。具体而言,我们提出了一种基于风格扰动的实用攻击框架(SPAF)。与现有在l∞范数约束下改变像素的攻击方法不同,SPAF在潜在特征空间中干扰风格,使得攻击在像素空间中变得无边界,以反映可能的实际扰动。SPAF将攻击目标形式化为一个优化问题,并采用自适应对抗风格转移网络来解决该问题,从而生成可转移且不可察觉的攻击。在真实世界数据集上的全面实验表明,SPAF显著优于现有的最先进攻击方法。|code|0| |A Cause-Focused Query Optimizer Alert System|Runfan Ye, Zibo Liang, Xu Chen, Shuncheng Liu, Kai Zheng||A series of studies apply machine learning to assist cost-based query optimizers in DBMS, emphasizing incorporating uncertainty predictions to guide decision-making. While these approaches have demonstrated advancement in some benchmarks, their drawbacks, such as unstable performance, stem from the inherent challenges of using machine learning models to predict the cost of execution plans and the lack of exploration of the intrinsic characteristics of suboptimal plans. In this paper, we introduce an alert system for query optimization, which is built upon cost models to reduce the selection of regressed plans. The key insight is that there are differences in the predictive uncertainty that lead to query optimization and the regression of execution plans. We investigate the causes of these differences in uncertainty and design a discriminator to filter out execution plans with higher risks of regression. 
The alert system can be integrated with various cost models, enhancing the robustness of query optimizers. In our experiments, the system further reduces execution time by 20% compared to learned optimizers. Meanwhile, the proportion of optimized queries reduced by the alert system is just 15% of the proportion of regressed queries diminished.|一系列研究将机器学习应用于数据库管理系统(DBMS)中的基于成本的查询优化器,强调了将不确定性预测纳入决策过程的重要性。尽管这些方法在一些基准测试中展示了进步,但其缺点,如性能不稳定,源于使用机器学习模型预测执行计划成本的固有挑战以及对次优计划内在特性的探索不足。本文介绍了一种查询优化预警系统,该系统基于成本模型来减少选择退化的执行计划。关键见解在于,导致查询优化和执行计划退化的预测不确定性之间存在差异。我们研究了这些不确定性差异的原因,并设计了一个判别器来筛选出更有可能退化的执行计划。该预警系统可以与各种成本模型集成,增强了查询优化器的鲁棒性。在我们的实验中,该系统相比学习型优化器进一步减少了20%的执行时间。同时,预警系统减少的优化查询比例仅为减少的退化查询比例的15%。|code|0| |DAMe: Personalized Federated Social Event Detection with Dual Aggregation Mechanism|Xiaoyan Yu, Yifan Wei, Pu Li, Shuaishuai Zhou, Hao Peng, Li Sun, Liehuang Zhu, Philip S. Yu||Training social event detection models through federated learning (FedSED) aims to improve participants' performance on the task. However, existing federated learning paradigms are inadequate for achieving FedSED's objective and exhibit limitations in handling the inherent heterogeneity in social data. This paper proposes a personalized federated learning framework with a dual aggregation mechanism for social event detection, namely DAMe. We present a novel local aggregation strategy utilizing Bayesian optimization to incorporate global knowledge while retaining local characteristics. Moreover, we introduce a global aggregation strategy to provide clients with maximum external knowledge of their preferences. In addition, we incorporate a global-local event-centric constraint to prevent local overfitting and “client-drift”. Experiments within a realistic simulation of a natural federated setting, utilizing six social event datasets spanning six languages and two social media platforms, along with an ablation study, have demonstrated the effectiveness of the proposed framework. 
Further robustness analyses have shown that DAMe is resistant to injection attacks.|通过联邦学习(FedSED)训练社交事件检测模型的目的是提高参与者在任务中的表现。然而,现有的联邦学习范式不足以实现FedSED的目标,并且在处理社交数据固有的异质性方面存在局限性。本文提出了一种具有双重聚合机制的个性化联邦学习框架,用于社交事件检测,即DAMe。我们提出了一种新颖的局部聚合策略,利用贝叶斯优化来融合全局知识同时保留局部特征。此外,我们引入了一种全局聚合策略,以向客户端提供与其偏好相关的最大外部知识。此外,我们结合了一个全局-局部以事件为中心的约束,以防止局部过拟合和“客户端漂移”。在一个现实的联邦设置模拟实验中,使用了跨越六种语言和两个社交媒体平台的六个社交事件数据集,以及一项消融研究,已证明了所提出框架的有效性。进一步的鲁棒性分析表明,DAMe对注入攻击具有抵抗力。|code|0| |Transformer Based Bayesian Network Embedding for Efficient Multiple Probabilistic Inferences|Kun Yue, Zhiwei Qi, Liang Duan|Chengdu Univ Informat Technol, Sch Software Engn, Chengdu, Peoples R China; Kunming Univ Sci & Technol, Fac Informat Engn & Automat, Kunming, Yunnan, Peoples R China; Yunnan Univ, Sch Informat Sci & Engn, Kunming, Yunnan, Peoples R China|Bayesian network (BN) is a well adopted framework for representing and inferring uncertain knowledge. By the existing methods, multiple probabilistic inferences on the same BN are often fulfilled one by one via repeated searches and calculations of probabilities. However, lots of intermediate results of probability calculations cannot be shared and reused among different probabilistic inferences. It is necessary to improve the overall efficiency of multiple probabilistic inferences on the same BN by incorporating an easy-to-calculate representation of BN and an easy-to-reuse technique for common calculations in multiple inferences. In this paper, we first propose the method of Bayesian network embedding to generate the easy-to-reuse node embeddings. Specifically, we transform BN into the point mutual information (PMI) matrix to simultaneously preserve the directed acyclic graph (DAG) and conditional probability tables (CPTs). Then, we give the singular value decomposition (SVD) based method to factorize the PMI matrix for generating node embeddings. 
Secondly, we propose a novel method of random sampling to make multiple probabilistic inferences via similarity calculation between node embeddings. Experimental results show that the runtime of our proposed BNERS performing 10 times of inferences is 30% faster than Gibbs sampling (GS) and 50% faster than forward sampling (FS) on LINK BN (very large network), while maintaining almost the same results as GS and FS.|贝叶斯网络(BN)是一个广泛采用的框架,用于表示和推断不确定的知识。现有的方法通常通过重复搜索和概率计算来逐一完成同一贝叶斯网络上的多个概率推断。然而,不同概率推断之间无法共享和重用大量的概率计算中间结果。为了提高同一贝叶斯网络上多个概率推断的整体效率,有必要结合易于计算的贝叶斯网络表示和易于在多次推断中重用的通用计算技术。本文首先提出了贝叶斯网络嵌入方法,以生成易于重用的节点嵌入。具体而言,我们将贝叶斯网络转换为点互信息(PMI)矩阵,以同时保留有向无环图(DAG)和条件概率表(CPTs)。接着,我们给出了基于奇异值分解(SVD)的方法,用于对PMI矩阵进行分解以生成节点嵌入。其次,我们提出了一种新的随机采样方法,通过节点嵌入之间的相似度计算来进行多个概率推断。实验结果表明,在我们提出的BNERS方法中,进行10次推断的运行时间比吉布斯采样(GS)快30%,比前向采样(FS)快50%,并且在LINK BN(非常大的网络)上保持了与GS和FS几乎相同的结果。|code|0| |Do We Really Need Graph Convolution During Training? Light Post-Training Graph-ODE for Efficient Recommendation|Weizhi Zhang, Liangwei Yang, Zihe Song, Henry Peng Zou, Ke Xu, Liancheng Fang, Philip S. Yu||The efficiency and scalability of graph convolution networks (GCNs) in training recommender systems (RecSys) have been persistent concerns, hindering their deployment in real-world applications. This paper presents a critical examination of the necessity of graph convolutions during the training phase and introduces an innovative alternative: the Light Post-Training Graph Ordinary-Differential-Equation (LightGODE). Our investigation reveals that the benefits of GCNs are more pronounced during testing rather than training. Motivated by this, LightGODE utilizes a novel post-training graph convolution method that bypasses the computation-intensive message passing of GCNs and employs a non-parametric continuous graph ordinary-differential-equation (ODE) to dynamically model node representations. 
This approach drastically reduces training time while achieving fine-grained post-training graph convolution to avoid the distortion of the original training embedding space, termed the embedding discrepancy issue. We validate our model across several real-world datasets of different scales, demonstrating that LightGODE not only outperforms GCN-based models in terms of efficiency and effectiveness but also significantly mitigates the embedding discrepancy commonly associated with deeper graph convolution layers. Our LightGODE challenges the prevailing paradigms in RecSys training and suggests re-evaluating the role of graph convolutions, potentially guiding future developments of efficient large-scale graph-based RecSys.|图卷积网络(GCNs)在训练推荐系统(RecSys)中的效率和可扩展性一直是持续关注的问题,阻碍了其在实际应用中的部署。本文对训练阶段图卷积的必要性进行了批判性审视,并提出了一种创新替代方案:轻量级后训练图常微分方程(LightGODE)。我们的研究揭示,GCNs的优势在测试阶段比在训练阶段更为显著。基于此,LightGODE采用了一种新颖的后训练图卷积方法,该方法绕过了GCNs计算密集的消息传递过程,并采用非参数连续图常微分方程(ODE)来动态建模节点表示。这种方法大幅减少了训练时间,同时实现了细粒度的后训练图卷积,以避免原始训练嵌入空间的扭曲,即所谓的嵌入差异问题。我们在多个不同规模的实际数据集上验证了模型的有效性,结果表明LightGODE不仅在效率和效果上优于基于GCN的模型,而且显著缓解了深层图卷积层常见的嵌入差异问题。我们的LightGODE挑战了当前推荐系统训练的主流范式,并建议重新评估图卷积的作用,可能为未来高效大规模基于图的推荐系统的发展提供指导。|code|0| |ROLeR: Effective Reward Shaping in Offline Reinforcement Learning for Recommender Systems|Yi Zhang, Ruihong Qiu, Jiajun Liu, Sen Wang||Offline reinforcement learning (RL) is an effective tool for real-world recommender systems with its capacity to model the dynamic interest of users and its interactive nature. Most existing offline RL recommender systems focus on model-based RL through learning a world model from offline data and building the recommendation policy by interacting with this model. 
Although these methods have made progress in the recommendation performance, the effectiveness of model-based offline RL methods is often constrained by the accuracy of the estimation of the reward model and the model uncertainties, primarily due to the extreme discrepancy between offline logged data and real-world data in user interactions with online platforms. To fill this gap, a more accurate reward model and uncertainty estimation are needed for the model-based RL methods. In this paper, a novel model-based Reward Shaping in Offline Reinforcement Learning for Recommender Systems, ROLeR, is proposed for reward and uncertainty estimation in recommendation systems. Specifically, a non-parametric reward shaping method is designed to refine the reward model. In addition, a flexible and more representative uncertainty penalty is designed to fit the needs of recommendation systems. Extensive experiments conducted on four benchmark datasets showcase that ROLeR achieves state-of-the-art performance compared with existing baselines. 
The source code can be downloaded at https://github.com/ArronDZhang/ROLeR.|离线强化学习(RL)凭借其对用户动态兴趣的建模能力和交互特性,成为现实世界推荐系统的有效工具。现有的多数离线RL推荐系统侧重于基于模型的RL方法,即通过从离线数据中学习世界模型,并通过与该模型交互来构建推荐策略。尽管这些方法在推荐性能上取得了进展,但基于模型的离线RL方法的有效性往往受限于奖励模型估计的准确性和模型不确定性,主要原因在于离线日志数据与用户在在线平台上的真实交互数据之间存在极大的差异。为填补这一差距,基于模型的RL方法需要更准确的奖励模型和不确定性估计。本文提出了一种新颖的基于模型的离线强化学习推荐系统奖励塑造方法——ROLeR,用于推荐系统中的奖励和不确定性估计。具体而言,设计了一种非参数的奖励塑造方法来优化奖励模型。此外,还设计了一种灵活且更具代表性的不确定性惩罚机制,以满足推荐系统的需求。在四个基准数据集上进行的大量实验表明,ROLeR相较于现有的基线方法,实现了最先进的性能。源代码可在https://github.com/ArronDZhang/ROLeR下载。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=ROLeR:+Effective+Reward+Shaping+in+Offline+Reinforcement+Learning+for+Recommender+Systems)|0| |Aligning Explanations for Recommendation with Rating and Feature via Maximizing Mutual Information|Yurou Zhao, Yiding Sun, Ruidong Han, Fei Jiang, Lu Guan, Xiang Li, Wei Lin, Weizhi Ma, Jiaxin Mao||Providing natural language-based explanations to justify recommendations helps to improve users' satisfaction and gain users' trust. However, as current explanation generation methods are commonly trained with an objective to mimic existing user reviews, the generated explanations are often not aligned with the predicted ratings or some important features of the recommended items, and thus, are suboptimal in helping users make informed decision on the recommendation platform. To tackle this problem, we propose a flexible model-agnostic method named MMI (Maximizing Mutual Information) framework to enhance the alignment between the generated natural language explanations and the predicted rating/important item features. Specifically, we propose to use mutual information (MI) as a measure for the alignment and train a neural MI estimator. 
Then, we treat a well-trained explanation generation model as the backbone model and further fine-tune it through reinforcement learning with guidance from the MI estimator, which rewards a generated explanation that is more aligned with the predicted rating or a pre-defined feature of the recommended item. Experiments on three datasets demonstrate that our MMI framework can boost different backbone models, enabling them to outperform existing baselines in terms of alignment with predicted ratings and item features. Additionally, user studies verify that MI-enhanced explanations indeed facilitate users' decisions and are favorable compared with other baselines due to their better alignment properties.|提供基于自然语言的解释以证明推荐理由,有助于提升用户的满意度并赢得用户的信任。然而,当前的解释生成方法通常以模仿现有用户评论为目标进行训练,导致生成的解释往往与预测的评分或推荐项目的重要特征不一致,从而在帮助用户在推荐平台上做出明智决策方面表现不佳。为解决这一问题,我们提出了一种灵活的、与模型无关的方法,名为MMI(最大化互信息)框架,以增强生成的自然语言解释与预测评分或重要项目特征之间的一致性。具体而言,我们建议使用互信息(MI)作为一致性的度量,并训练一个神经互信息估计器。随后,我们将一个训练良好的解释生成模型作为基础模型,并通过强化学习对其进行进一步微调,强化学习的指导来自互信息估计器,该估计器奖励那些与预测评分或预定义的项目特征更一致的生成解释。在三个数据集上的实验表明,我们的MMI框架能够提升不同的基础模型,使其在预测评分和项目特征的一致性方面优于现有的基线模型。此外,用户研究表明,经过互信息增强的解释确实有助于用户做出决策,并且由于其更好的对齐特性,相比其他基线方法更受用户青睐。|code|0| |Interaction-level Membership Inference Attack against Recommender Systems with Long-tailed Distribution|Da Zhong, Xiuling Wang, Zhichao Xu, Jun Xu, Wendy Hui Wang|Stevens Institute of Technology, Hoboken, NJ, USA; University of Utah, Salt Lake City, UT, USA|Recommender systems (RSs) are susceptible to Interaction-level Membership Inference Attacks (IMIAs), which aim to determine whether specific user-item interactions are present in the training data of the target RS. However, existing IMIAs struggle with inferring the membership of tail interactions, i.e., the interactions involving tail items, due to the limited information available about these items. This paper introduces MINER, a new IMIA designed to enhance attack performance against RSs with long-tailed item distribution. 
MINER addresses the information scarcity of tail items at both the feature and sample levels. At the feature level, MINER leverages the Knowledge Graphs (KGs) to obtain the auxiliary knowledge of tail items. At the sample level, MINER designs a Bilateral-Branch Network (BBN) as the attack model. The BBN trains two branches independently, with one branch trained on interaction samples with the original long-tailed item distribution and the other on interaction samples with a more balanced item distribution. The outputs of the two branches are aggregated using a cumulative learning component. Our experimental results demonstrate that MINER significantly enhances the attack accuracy of IMIA, especially for tail interactions. Beyond attack design, we design a defense mechanism named RGL to defend against MINER. Empirical evaluations demonstrate that RGL effectively mitigates the privacy risks posed by MINER while preserving recommendation accuracy. Our code is available at https://github.com/dzhong2/MINER.|推荐系统(RSs)容易受到交互级别成员推断攻击(IMIA)的影响,这种攻击旨在确定特定用户-项目交互是否存在于目标推荐系统的训练数据中。然而,现有的IMIA在推断尾部交互(即涉及尾部项目的交互)的成员身份时遇到困难,因为这些项目的信息有限。本文介绍了MINER,这是一种新的IMIA,旨在提高对具有长尾项目分布的推荐系统的攻击性能。MINER在特征和样本两个层面上解决了尾部项目的信息稀缺问题。在特征层面上,MINER利用知识图谱(KGs)获取尾部项目的辅助知识。在样本层面上,MINER设计了一个双分支网络(BBN)作为攻击模型。BBN独立训练两个分支,其中一个分支在具有原始长尾项目分布的交互样本上训练,另一个分支在具有更平衡项目分布的交互样本上训练。两个分支的输出通过累积学习组件进行聚合。我们的实验结果表明,MINER显著提高了IMIA的攻击准确性,尤其是对尾部交互的攻击。除了攻击设计,我们还设计了一种名为RGL的防御机制来抵御MINER。实证评估表明,RGL在保持推荐准确性的同时,有效减轻了MINER带来的隐私风险。我们的代码可在https://github.com/dzhong2/MINER获取。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Interaction-level+Membership+Inference+Attack+against+Recommender+Systems+with+Long-tailed+Distribution)|0| |A Power Method to Alleviate Over-smoothing for Recommendation|Peng Zhou, Yachao Cui, Han Cao|Shaanxi Normal University, Xi'an, Shaanxi, China|In recent years, graph convolution networks (GCNs) have been widely used in recommender systems due to high-order node information propagation and aggregation 
mechanisms. However, existing GCN-based recommender systems drop sharply in performance as the depth of the network increases. This phenomenon is called over-smoothing, which refers to the fact that the embeddings of all nodes become more similar and indistinguishable. Previous works have rarely explored over-smoothing from characteristics of the recommendation field. Specifically, we found experimentally that too many layers can lead to such large loss values that they are difficult to decrease. After theoretical analysis, we can effectively solve the problem of difficulty in decreasing the loss value by adding only a hyperparameter, called "power". This hyperparameter can effectively control the smoothness and alleviate the over-smoothing problem. Experiments on four public datasets demonstrate that this hyperparameter can effectively improve performance.|近年来,图卷积网络(GCN)由于其高阶节点信息传播和聚合机制,在推荐系统中得到了广泛应用。然而,现有的基于GCN的推荐系统在网络深度增加时性能急剧下降。这种现象被称为过平滑,即所有节点的嵌入变得更为相似且难以区分。以往的研究很少从推荐领域的特性出发探讨过平滑问题。具体来说,我们通过实验发现,过多的层数会导致损失值变得如此之大,以至于难以进一步降低。经过理论分析,我们可以通过仅添加一个称为“幂”的超参数来有效解决损失值难以降低的问题。这一超参数能有效控制平滑度并缓解过平滑问题。在四个公开数据集上的实验结果表明,该超参数能有效提升性能。|code|0| |Not All Negatives are Equally Negative: Soft Contrastive Learning for Unsupervised Sentence Representations|Haojie Zhuang, Wei Emma Zhang, Jian Yang, Weitong Chen, Quan Z. Sheng|The University of Adelaide, Adelaide, Australia; Macquarie University, Sydney, Australia|Contrastive learning has been extensively studied in sentence representation learning as it demonstrates effectiveness in various downstream applications, where the same sentence with different dropout masks (or other augmentation methods) is considered as positive pair while taking other sentences in the same mini-batch as negative pairs. However, these methods mostly treat all negative examples equally and overlook the different similarities between the negative examples and the anchors, which thus fail to capture the fine-grained semantic information of the sentences. 
To address this issue, we explicitly differentiate the negative examples by their similarities with the anchor, and thus propose a simple yet effective method SoftCSE that individualizes either the weight or temperature of each negative pair in the standard InfoNCE loss according to the similarities of the negative examples and the anchors. We further provide the theoretical analysis of our methods to show why and how SoftCSE works, including the optimal solution, gradient analysis and the connection with other loss. Empirically, we conduct extensive experiments on semantic textual similarity (STS) and transfer (TR) tasks, as well as text retrieval and reranking, where we observe significant performance improvements compared to strong baseline models.|对比学习在句子表征学习中得到了广泛研究,因为它在各种下游应用中展示了有效性,其中同一个句子在不同的dropout掩码(或其他增强方法)下被视为正样本对,而同一小批次中的其他句子则被视为负样本对。然而,这些方法大多将所有负样本平等对待,忽略了负样本与锚点之间的不同相似性,从而未能捕捉到句子的细粒度语义信息。为了解决这一问题,我们根据负样本与锚点的相似性显式区分负样本,并提出了一种简单而有效的方法SoftCSE,该方法根据负样本与锚点的相似性,在标准的InfoNCE损失中个性化地调整每个负样本对的权重或温度。我们进一步提供了方法的理论分析,以展示SoftCSE为何及如何工作,包括最优解、梯度分析以及与其他损失函数的联系。在实验上,我们在语义文本相似性(STS)和迁移(TR)任务以及文本检索和重排序任务中进行了广泛的实验,观察到与强基线模型相比显著的性能提升。|code|0| |Professionalism-Aware Pre-Finetuning for Profitability Ranking|ChungChi Chen, Hiroya Takamura, Ichiro Kobayashi, Yusuke Miyao|The University of Tokyo, Tokyo, Japan; National Institute of Advanced Industrial Science and Technology, Tokyo, Japan; Ochanomizu University, Tokyo, Japan|Opinion mining, specifically in the investment sector, has experienced a significant increase in interest over recent years. This paper presents a novel approach to overcome current limitations in assessing and ranking investor opinions based on profitability. The study introduces a pre-finetuning scheme to improve language models' capacity to distinguish professionalism, thus enabling ranking of all available opinions. 
Furthermore, the paper evaluates ranking results using traditional metrics and suggests the use of a pairwise setting for better performances over a regression setting. Lastly, our method is shown to be effective across various investor opinion tasks, encompassing both professional and amateur investors. The results indicate that this approach significantly enhances the efficiency and accuracy of opinion mining in the investment sector.|观点挖掘,特别是在投资领域,近年来引起了极大的关注。本文提出了一种新颖的方法,以克服当前在根据盈利能力评估和排序投资者观点方面的局限性。研究引入了一种预微调方案,以提高语言模型区分专业性的能力,从而实现对所有可用观点的排序。此外,本文使用传统指标评估排序结果,并建议采用成对设置以在回归设置中获得更好的性能。最后,我们的方法在各种投资者观点任务中显示出有效性,涵盖了专业和业余投资者。结果表明,这种方法显著提高了投资领域观点挖掘的效率和准确性。|code|0| |Boosting Large Language Models with Socratic Method for Conversational Mathematics Teaching|Yuyang Ding, Hanglei Hu, Jie Zhou, Qin Chen, Bo Jiang, Liang He||With the introduction of large language models (LLMs), automatic math reasoning has seen tremendous success. However, current methods primarily focus on providing solutions or using techniques like Chain-of-Thought to enhance problem-solving accuracy. In this paper, we focus on improving the capability of mathematics teaching via a Socratic teaching-based LLM (), which guides learners toward profound thinking with clarity and self-discovery via conversation. We collect and release a high-quality mathematical teaching dataset, named , which provides Socratic-style conversations of problems with extra knowledge. Also, we propose a knowledge-enhanced LLM as a strong baseline to generate reliable responses with review, guidance/heuristic, rectification, and summarization. Experimental results show the great advantages of by comparing it with several strong generative models. 
The codes and datasets are available on .|随着大型语言模型(LLMs)的引入,自动数学推理取得了显著的成功。然而,当前的方法主要集中在提供解决方案或使用链式思维(Chain-of-Thought)等技术来提高问题解决的准确性。在本文中,我们专注于通过基于苏格拉底教学法的LLM()来提升数学教学能力,该模型通过对话引导学习者进行清晰且自我发现的深刻思考。我们收集并发布了一个高质量的数学教学数据集,命名为,该数据集提供了包含额外知识的苏格拉底式问题对话。此外,我们提出了一个知识增强的LLM作为强基线模型,以生成包含审查、指导/启发、纠正和总结的可靠响应。实验结果表明,通过与几个强大的生成模型进行比较,显示出显著的优势。代码和数据集可在上获取。|code|0| |Towards Better Utilization of Multiple Views for Bundle Recommendation|Kyungho Kim, Sunwoo Kim, Geon Lee, Kijung Shin|KAIST, Seoul, Republic of Korea|Bundle recommender systems aim to recommend suitable collections (i.e., bundles) of items to each user, meeting their diverse needs with all-in-one convenience. Typically, they utilize three distinct types of information: user-bundle purchase interactions (U-B view), user-item purchase interactions (U-I view), and bundle-item affiliations (B-I view). Our focus is on better integrating these three perspectives (i.e., views) to deliver more accurate bundle recommendations. Our examination of different role (main or sub-views) combinations of the views reveals two key observations: (1) the best combination varies across target users (i.e., who receive recommendations), and (2) the U-I view is relatively weak as the main role. Driven by these observations, we propose PET, which synergizes the three views through (1) personalized view weighting, (2) U-I view enhancement, and (3) two-pronged contrastive learning. Our extensive experiments demonstrate that PET significantly outperforms existing methods in all popular benchmark datasets. 
Our code and datasets are available at https://github.com/K-Kyungho/PET.|捆绑推荐系统旨在向每位用户推荐合适的物品集合(即捆绑包),以一站式服务的便利性满足他们的多样化需求。通常,这些系统利用三种不同类型的信息:用户-捆绑包购买交互(U-B视图)、用户-物品购买交互(U-I视图)以及捆绑包-物品关联(B-I视图)。我们的重点是更好地整合这三种视角(即视图),以提供更准确的捆绑推荐。我们对不同角色(主视图或子视图)组合的视图进行了考察,发现了两个关键观察结果:(1)最佳组合因目标用户(即接收推荐的用户)而异,(2)U-I视图作为主角色时相对较弱。基于这些观察,我们提出了PET,它通过以下方式协同整合三种视图:(1)个性化视图加权,(2)U-I视图增强,以及(3)双管齐下的对比学习。我们的广泛实验表明,PET在所有流行的基准数据集上显著优于现有方法。我们的代码和数据集可在https://github.com/K-Kyungho/PET获取。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Towards+Better+Utilization+of+Multiple+Views+for+Bundle+Recommendation)|0| |Improving Prompt-based News Recommendation with Individual Template and Customized Answer|Yijiang Li, Jun Wu||Prompt learning plays a key role in aligning the task of news recommendation (NR) with the Pre-trained Language Models (PLMs). However, current prompt-based NR methods utilize fixed templates and answer words, ignoring the personalization of user's demand and the diversity between news topics. To this end, we propose an Automatic Prompt based NR (AutoPNR) scheme, which automatically generates individual templates for users according to their potential interests, and customized answer words w.r.t. the topics of candidate news. Concretely, such an individual template utilizes several specific tokens to encode a user's interest extracted from her/his reading history, while a pair of customized answer words are retrieved from a large vocabulary (often existing alongside PLMs) based on the topic of candidate news. 
Through extensive experiments on the real-world datasets, we show that our AutoPNR works well with different PLMs, and considerably outperforms state-of-the-art NR techniques.|提示学习在将新闻推荐(NR)任务与预训练语言模型(PLMs)对齐方面起着关键作用。然而,当前基于提示的NR方法使用固定的模板和答案词,忽视了用户需求的个性化以及新闻主题之间的多样性。为此,我们提出了一种基于自动提示的NR(AutoPNR)方案,该方案根据用户的潜在兴趣自动生成个性化的模板,并根据候选新闻的主题定制答案词。具体而言,这种个性化模板使用多个特定标记来编码从用户的阅读历史中提取的兴趣,而一对定制的答案词则根据候选新闻的主题从大型词汇表(通常与PLMs一起存在)中检索。通过对真实世界数据集的广泛实验,我们展示了AutoPNR在不同PLMs上的良好表现,并显著优于现有的最先进NR技术。|code|0| |RecPrompt: A Self-tuning Prompting Framework for News Recommendation Using Large Language Models|Dairui Liu, Boming Yang, Honghui Du, Derek Greene, Neil Hurley, Aonghus Lawlor, Ruihai Dong, Irene Li|University College Dublin Insight Centre for Data Analytics; The University of Tokyo Information Technology Center|News recommendations heavily rely on Natural Language Processing (NLP) methods to analyze, understand, and categorize content, enabling personalized suggestions based on user interests and reading behaviors. Large Language Models (LLMs) like GPT-4 have shown promising performance in understanding natural language. However, the extent of their applicability to news recommendation systems remains to be validated. This paper introduces RecPrompt, the first self-tuning prompting framework for news recommendation, leveraging the capabilities of LLMs to perform complex news recommendation tasks. This framework incorporates a news recommender and a prompt optimizer that applies an iterative bootstrapping process to enhance recommendations through automatic prompt engineering. Extensive experimental results with 400 users show that RecPrompt can achieve an improvement of 3.36 MRR, 9.64 Additionally, we introduce TopicScore, a novel metric to assess explainability by evaluating LLM's ability to summarize topics of interest for users. 
The results show LLM's effectiveness in accurately identifying topics of interest and delivering comprehensive topic-based explanations.|新闻推荐系统严重依赖自然语言处理(NLP)方法来分析、理解和分类内容,从而根据用户的兴趣和阅读行为提供个性化的建议。像GPT-4这样的大型语言模型(LLMs)在理解自然语言方面展示了良好的性能。然而,它们在新闻推荐系统中的适用性仍需验证。本文介绍了RecPrompt,这是首个用于新闻推荐的自调优提示框架,利用LLMs的能力执行复杂的新闻推荐任务。该框架整合了一个新闻推荐器和一个提示优化器,通过自动提示工程的迭代引导过程来增强推荐效果。对400名用户进行的广泛实验结果显示,RecPrompt能够实现3.36的MRR改进和9.64的额外提升。此外,我们引入了TopicScore,一种评估LLM对用户感兴趣主题的总结能力的新指标,以评估解释性。结果表明,LLM在准确识别用户感兴趣的主题并提供全面基于主题的解释方面表现出色。|code|0| |Enhanced Privacy Bound for Shuffle Model with Personalized Privacy|Yixuan Liu, Yuhan Liu, Li Xiong, Yujie Gu, Hong Chen||The shuffle model of Differential Privacy (DP) is an enhanced privacy protocol which introduces an intermediate trusted server between local users and a central data curator. It significantly amplifies the central DP guarantee by anonymizing and shuffling the local randomized data. Yet, deriving a tight privacy bound is challenging due to its complicated randomization protocol. While most existing work focuses on unified local privacy settings, this work focuses on deriving the central privacy bound for a more practical setting where personalized local privacy is required by each user. To bound the privacy after shuffling, we first need to capture the probability of each user generating clones of the neighboring data points. Second, we need to quantify the indistinguishability between two distributions of the number of clones on neighboring datasets. Existing works either inaccurately capture the probability, or underestimate the indistinguishability between neighboring datasets. Motivated by this, we develop a more precise analysis, which yields a general and tighter bound for arbitrary DP mechanisms. Firstly, we derive the clone-generating probability from a hypothesis testing perspective, which leads to a more accurate characterization of the probability. 
Secondly, we analyze the indistinguishability in the context of f-DP, where the convexity of the distributions is leveraged to achieve a tighter privacy bound. Theoretical and numerical results demonstrate that our bound remarkably outperforms the existing results in the literature.|差分隐私(DP)的洗牌模型是一种增强隐私协议,它在本地用户和中央数据管理员之间引入了一个中间的可信服务器。通过匿名化和洗牌本地随机化的数据,它显著增强了中央DP的保障。然而,由于其复杂的随机化协议,推导出一个紧密的隐私界限是具有挑战性的。尽管大多数现有工作集中在统一的本地隐私设置上,但本文关注的是为每个用户需要个性化本地隐私的更实际设置推导中央隐私界限。为了在洗牌后界定隐私,我们首先需要捕捉每个用户生成相邻数据点副本的概率。其次,我们需要量化相邻数据集上副本数量的两个分布之间的不可区分性。现有工作要么不准确地捕捉概率,要么低估了相邻数据集之间的不可区分性。受此启发,我们开发了一种更精确的分析方法,为任意DP机制提供了一个更通用且更紧密的界限。首先,我们通过假设检验的角度推导出副本生成的概率,从而更准确地描述了概率。其次,我们在f-DP的背景下分析不可区分性,利用分布的凸性来实现更紧密的隐私界限。理论和数值结果表明,我们的界限在文献中显著优于现有结果。|code|0| |Channel-Aware Low-Rank Adaptation in Time Series Forecasting|Tong Nie, Yuewen Mei, Guoyang Qin, Jian Sun, Wei Ma||The balance between model capacity and generalization has been a key focus of recent discussions in long-term time series forecasting. Two representative channel strategies are closely associated with model expressivity and robustness, including channel independence (CI) and channel dependence (CD). The former adopts individual channel treatment and has been shown to be more robust to distribution shifts, but lacks sufficient capacity to model meaningful channel interactions. The latter is more expressive for representing complex cross-channel dependencies, but is prone to overfitting. To balance the two strategies, we present a channel-aware low-rank adaptation method to condition CD models on identity-aware individual components. As a plug-in solution, it is adaptable for a wide range of backbone architectures. Extensive experiments show that it can consistently and significantly improve the performance of both CI and CD models with demonstrated efficiency and flexibility. 
The code is available at https://github.com/tongnie/C-LoRA.|在长期时间序列预测中,模型容量与泛化能力的平衡一直是近期讨论的焦点。两种具有代表性的通道策略与模型的表达能力和鲁棒性密切相关,分别是通道独立性(CI)和通道依赖性(CD)。前者采用独立通道处理,已被证明对分布偏移更具鲁棒性,但缺乏足够的容量来建模有意义的通道交互。后者在表示复杂的跨通道依赖方面更具表达力,但容易过拟合。为了平衡这两种策略,我们提出了一种通道感知的低秩适应方法,将CD模型条件化为具有身份感知的独立组件。作为一种即插即用的解决方案,它适用于广泛的主干架构。大量实验表明,该方法能够持续且显著地提升CI和CD模型的性能,并展现出高效性和灵活性。代码可在https://github.com/tongnie/C-LoRA获取。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Channel-Aware+Low-Rank+Adaptation+in+Time+Series+Forecasting)|0| |Learning Links for Adaptable and Explainable Retrieval|Jianqiang Shen, Yuchin Juan, Ping Liu, Wen Pu, Shaobo Zhang, Qianqi Shen, Liangjie Hong, Wenjing Zhang|LinkedIn, Mountain View, CA, USA|Web-scale search systems typically tackle the scalability challenge with a two-step paradigm: retrieval and ranking. The retrieval step, also known as candidate selection, often involves extracting entities, creating an inverted index, and performing term matching for retrieval. Such traditional methods require manual and time-consuming development of retrieval models. In this paper, we propose a framework for constructing a graph that integrates human knowledge with user activity data analysis. The learned links are utilized for retrieval purposes. The model is easy to explain, debug, and tune. The system implementation is straightforward and can directly leverage existing inverted index systems. 
We applied this retrieval framework to enhance the job search and recommendation systems on a large professional networking portal, resulting in significant performance improvements.|网络规模的搜索引擎通常采用两步范式来应对可扩展性挑战:检索和排序。检索步骤,也称为候选选择,通常涉及提取实体、创建倒排索引以及执行术语匹配以进行检索。这些传统方法需要手动且耗时的检索模型开发。在本文中,我们提出了一种构建图的框架,该框架将人类知识与用户活动数据分析相结合。学习到的链接用于检索目的。该模型易于解释、调试和调整。系统实现简单直接,可以直接利用现有的倒排索引系统。我们将此检索框架应用于大型专业社交门户网站上的职位搜索和推荐系统,显著提升了性能。|code|0| |Preliminary Study on Incremental Learning for Large Language Model-based Recommender Systems|Tianhao Shi, Yang Zhang, Zhijian Xu, Chong Chen, Fuli Feng, Xiangnan He, Qi Tian|University of Science and Technology of China Hefei; Huawei Cloud BU|Adapting Large Language Models for Recommendation (LLM4Rec) has shown promising results. However, the challenges of deploying LLM4Rec in real-world scenarios remain largely unexplored. In particular, recommender models need incremental adaptation to evolving user preferences, while the suitability of traditional incremental learning methods within LLM4Rec remains ambiguous due to the unique characteristics of Large Language Models (LLMs). In this study, we empirically evaluate two commonly employed incremental learning strategies (full retraining and fine-tuning) for LLM4Rec. Surprisingly, neither approach shows significant improvements in the performance of LLM4Rec. Instead of dismissing the role of incremental learning, we attribute the lack of anticipated performance enhancement to a mismatch between the LLM4Rec architecture and incremental learning: LLM4Rec employs a single adaptation module for learning recommendations, limiting its ability to simultaneously capture long-term and short-term user preferences in the incremental learning context. To test this speculation, we introduce a Long- and Short-term Adaptation-aware Tuning (LSAT) framework for incremental learning in LLM4Rec. 
Unlike the single adaptation module approach, LSAT utilizes two distinct adaptation modules to independently learn long-term and short-term user preferences. Empirical results verify that LSAT enhances performance, thereby validating our speculation. We release our code at: https://github.com/TianhaoShi2001/LSAT.|将大型语言模型(LLM)应用于推荐系统(LLM4Rec)已显示出良好的前景。然而,在实际场景中部署LLM4Rec的挑战仍未得到充分探索。特别是,推荐模型需要对不断变化的用户偏好进行增量适应,而由于大型语言模型的独特特性,传统增量学习方法在LLM4Rec中的适用性尚不明确。在本研究中,我们实证评估了两种常用的增量学习策略(全量重新训练和微调)在LLM4Rec中的应用。令人意外的是,这两种方法均未显著提升LLM4Rec的性能。我们并未因此否定增量学习的作用,而是认为预期的性能提升未能实现的原因在于LLM4Rec架构与增量学习之间的不匹配:LLM4Rec采用单一适应模块进行推荐学习,这限制了其在增量学习情境下同时捕捉长期和短期用户偏好的能力。为验证这一推测,我们引入了长短期适应感知调优(Long- and Short-term Adaptation-aware Tuning, LSAT)框架,用于LLM4Rec中的增量学习。与单一适应模块方法不同,LSAT使用两个独立的适应模块分别学习长期和短期用户偏好。实证结果验证了LSAT能够提升性能,从而证实了我们的推测。我们已在以下链接公开了代码:https://github.com/TianhaoShi2001/LSAT。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Preliminary+Study+on+Incremental+Learning+for+Large+Language+Model-based+Recommender+Systems)|0| | Tabularis Revilio: Converting Text to Tables|Mukul Singh, Gust Verbruggen, Vu Le, Sumit Gulwani|Microsoft, Redmond, WA, USA; Microsoft, Keerbergen, Belgium|Copying tables from documents and applications without proper tabular support, like PDF documents, web pages or images, surprisingly remains a challenge. In this paper, we present Revilio, a novel neurosymbolic system for reconstructing tables when their column boundaries have been lost. Revilio addresses this task by detecting headers, generating an initial table sketch using a large language model, and using that sketch as a guiding representation during an enumerate-and-test strategy that evaluates syntactic and semantic table structures. We evaluate Revilio on a diverse set of datasets, demonstrating significant improvements over existing table parsing methods. Revilio outperforms traditional techniques in both accuracy and scalability, handling large tables with over 100,000 rows. 
Our experiments find an increase in reconstruction accuracy by 5.8-11.3% over both neural and symbolic baseline systems|从缺乏适当表格支持的文档和应用程序(如PDF文档、网页或图像)中复制表格,依然是一个出乎意料的挑战。本文介绍了Revilio,这是一种新颖的神经符号系统,用于在表格的列边界丢失时重建表格。Revilio通过检测表头、利用大型语言模型生成初始表格草图,并在此草图的指导下,采用枚举与测试策略来评估句法和语义表格结构,从而解决这一任务。我们在多个数据集上对Revilio进行了评估,结果显示其显著优于现有的表格解析方法。Revilio在准确性和可扩展性方面均超越了传统技术,能够处理包含超过100,000行的大型表格。我们的实验发现,与神经和符号基线系统相比,Revilio的重建准确性提高了5.8%至11.3%。|code|0| |STAR: Sparse Text Approach for Recommendation|Anna Tigunova, Ghazaleh Haratinezhad Torbati, Andrew Yates, Gerhard Weikum|Max Planck Institute for Informatics, Saarbrücken, Germany; Amazon, Berlin, Germany; University of Amsterdam, Amsterdam, Netherlands|In this work we propose to adapt Learned Sparse Retrieval, an emerging approach in IR, to text-centric content-based recommendations, leveraging the strengths of transformer models for an efficient and interpretable user-item matching. We conduct extensive experiments, showing that our LSR-based recommender, dubbed STAR, outperforms existing dense bi-encoder baselines on three recommendation domains. The obtained word-level representations of users and items are easy to examine and result in over 10x more compact indexes.|在这项工作中,我们提出将新兴的信息检索(IR)方法——学习稀疏检索(Learned Sparse Retrieval, LSR),应用于以文本为中心的内容推荐系统中,利用转换器模型的优势实现高效且可解释的用户-物品匹配。我们进行了广泛的实验,结果表明,基于LSR的推荐系统(称为STAR)在三个推荐领域中均优于现有的密集双编码器基线。所获得的用户和物品的词级表示易于检查,并且生成的索引比传统方法紧凑10倍以上。|code|0| |Harnessing Empathy and Ethics for Relevance Detection and Information Categorization in Climate and COVID-19 Tweets|Apoorva Upadhyaya, Wolfgang Nejdl, Marco Fisichella|L3S Research Center, Hannover, Germany|In this work, we aim to understand the general public perception of societal issues related to the current climate crisis and the COVID-19 pandemic on Twitter (X). Social media discussions on such matters often lead to misleading information, resulting in delays in initiatives proposed by governments or policymakers. 
Hence, we focus on extracting relevant information from the conversations on climate change and COVID that could be useful for authorities to curb the spread of potentially biased information by proposing the classification tasks of relevance detection (RD) and information categorization (IC). We first curate the datasets for the RD and IC tasks for the climate domain and extend the COVID-19 benchmark attention-worthy Twitter dataset for the IC task through manual annotation. We initially conduct experiments with LLMs and observe that LLMs can extract the relevant information in zero and few-shot settings based on multi-perspective reasoning in the form of cognitive empathy and ethical standards, but still perform worse than fine-tuned small language models. Based on the initial findings, we conclude that LLMs may not be the best extractor of relevant information, but induce cognitive empathy and ethical reasonings that can intuitively guide supervised models. To achieve this idea, we develop a cognitive empathy and ethical reasoning-based multi-tasking pipelined network for RD and IC tasks. 
Our proposed approach provides valuable insights that could be useful in real-world scenarios for governments, policymakers, and other researchers to decode the overall public outlook on societal issues.|在这项工作中,我们的目标是理解公众在Twitter(X)上对当前气候危机和COVID-19大流行相关社会问题的看法。社交媒体上关于这些话题的讨论往往会导致误导性信息的传播,从而延误政府或政策制定者提出的倡议。因此,我们专注于从气候变化和COVID的对话中提取相关信息,这些信息对当局来说可能有助于遏制潜在偏见信息的传播,具体通过提出相关性检测(RD)和信息分类(IC)的分类任务来实现。我们首先为气候领域的RD和IC任务策划数据集,并通过手动注释扩展了COVID-19基准的值得关注的Twitter数据集以用于IC任务。我们最初使用大型语言模型(LLMs)进行实验,观察到LLMs在零样本和少样本设置下能够基于认知共情和伦理标准的多角度推理提取相关信息,但仍然表现不如经过微调的小型语言模型。基于初步发现,我们得出结论,LLMs可能不是相关信息的最佳提取器,但能引发认知共情和伦理推理,这些可以直观地指导监督模型。为了实现这一想法,我们开发了一个基于认知共情和伦理推理的多任务流水线网络,用于RD和IC任务。我们提出的方法提供了有价值的见解,这些见解在现实世界中对政府、政策制定者和其他研究人员解读公众对社会问题的整体看法时可能非常有用。|code|0| |FashionLOGO: Prompting Multimodal Large Language Models for Fashion Logo Embeddings|Zhen Wang, Da Li, Yulin Su, Min Yang, Minghui Qiu, Walton Wang||Logo embedding models convert the product logos in images into vectors, enabling their utilization for logo recognition and detection within e-commerce platforms. This facilitates the enforcement of intellectual property rights and enhances product search capabilities. However, current methods treat logo embedding as a purely visual problem. A noteworthy issue is that visual models capture features more than logos. Instead, we view this as a multimodal task, using text as auxiliary information to facilitate the visual model's understanding of the logo. The emerging Multimodal Large Language Models (MLLMs) have demonstrated remarkable capabilities in both visual and textual understanding. Inspired by this, we propose an approach, FashionLOGO, to explore how to prompt MLLMs to generate appropriate text for product images, which can help visual models achieve better logo embeddings. We adopt a cross-attention transformer block that enables visual embedding to automatically learn supplementary knowledge from textual embedding. 
Our extensive experiments on real-world datasets prove that FashionLOGO is capable of generating generic and robust logo embeddings, achieving state-of-the-art performance in all benchmarks.|Logo嵌入模型将图像中的产品Logo转换为向量,从而使其能够在电子商务平台中用于Logo识别和检测。这有助于知识产权的保护并提升产品搜索能力。然而,现有方法将Logo嵌入视为纯粹的视觉问题。一个值得注意的问题是,视觉模型捕捉到的特征往往超出Logo本身。相反,我们将其视为一个多模态任务,利用文本作为辅助信息来帮助视觉模型更好地理解Logo。新兴的多模态大型语言模型(MLLMs)在视觉和文本理解方面展示出卓越的能力。受此启发,我们提出了一种名为FashionLOGO的方法,探索如何引导MLLMs为产品图像生成适当的文本,这有助于视觉模型实现更好的Logo嵌入。我们采用了一个交叉注意力转换器块,使视觉嵌入能够自动从文本嵌入中学习补充知识。我们在真实世界数据集上的广泛实验证明,FashionLOGO能够生成通用且鲁棒的Logo嵌入,在所有基准测试中均达到了最先进的性能。|code|0| |CrossPred: A Cross-City Mobility Prediction Framework for Long-Distance Travelers via POI Feature Matching|Shuai Xu, Donghai Guan|Nanjing University of Aeronautics and Astronautics, Nanjing, China|Current studies mainly rely on overlapping users (who leave trajectories in both cities) as a medium to learn travelers' preference in the target city, however it is unrealistic to find overlapping users when two cities are far apart, thus a severe data scarcity issue exists for this problem. Besides, due to the mixture of mobility pattern from both cities, directly applying the model trained in the source city may lead to negative transfer in the target city. To tackle these issues, in this paper, we conceive and implement a novel framework called CrossPred to predict the cross-city mobility of long-distance travelers in the target city. Specifically, POI features including popularity, textual description, spatial distribution as well as sequential pattern are considered for cross-city POI matching, which further acts as a vital link for jointly modeling native user mobility preference in both source and target cities. Maximum Mean Discrepancy (MMD) is adopted to strengthen the shared POI features among cities and weaken the unique POI features, thereby promoting cross-city POI feature matching. 
Extensive experiments on real-world datasets demonstrate the effectiveness and superiority of the proposed framework.|当前的研究主要依赖于重叠用户(在两个城市都留下轨迹的用户)作为媒介来学习目标城市中旅行者的偏好,然而当两个城市相距甚远时,找到重叠用户是不现实的,因此这个问题存在严重的数据稀缺问题。此外,由于两个城市的移动模式混合,直接应用在源城市训练的模型可能会导致在目标城市中出现负迁移。为了解决这些问题,本文构思并实现了一个名为CrossPred的新框架,用于预测目标城市中长途旅行者的跨城市移动。具体而言,POI特征包括流行度、文本描述、空间分布以及序列模式被考虑用于跨城市POI匹配,这进一步作为联合建模源城市和目标城市中本地用户移动偏好的关键环节。最大均值差异(MMD)被采用来加强城市间的共享POI特征并削弱独特的POI特征,从而促进跨城市POI特征匹配。在真实世界数据集上的广泛实验证明了所提出框架的有效性和优越性。|code|0| |Enhancing Content-based Recommendation via Large Language Model|Wentao Xu, Qianqian Xie, Shuo Yang, Jiangxia Cao, Shuchao Pang||In real-world applications, users express different behaviors when they interact with different items, including implicit click/like interactions, and explicit comments/reviews interactions. Nevertheless, almost all recommender works are focused on how to describe user preferences by the implicit click/like interactions, to find the synergy of people. For the content-based explicit comments/reviews interactions, some works attempt to utilize them to mine the semantic knowledge to enhance recommender models. However, they still neglect the following two points: (1) The content semantic is a universal world knowledge; how do we extract the multi-aspect semantic information to empower different domains? (2) The user/item ID feature is a fundamental element for recommender models; how do we align the ID and content semantic feature space? In this paper, we propose a "plugin" semantic knowledge transferring method LoID, which includes two major components: (1) LoRA-based large language model pretraining to extract multi-aspect semantic information; (2) ID-based contrastive objective to align their feature spaces. 
We conduct extensive experiments with SOTA baselines on real-world datasets, the detailed results demonstrating significant improvements of our method LoID.|在实际应用中,用户在与不同项目互动时表现出不同的行为,包括隐式的点击/点赞互动和显式的评论/评价互动。然而,几乎所有的推荐系统都专注于如何通过隐式的点击/点赞互动来描述用户偏好,以发现人群的协同效应。对于基于内容的显式评论/评价互动,一些研究尝试利用它们来挖掘语义知识以增强推荐模型。但是,这些研究仍然忽略了以下两点:(1)内容语义是一种普遍的世界知识;我们如何提取多方面的语义信息以赋能不同领域?(2)用户/项目ID特征是推荐模型的基本元素;我们如何对齐ID和内容语义特征空间?在本文中,我们提出了一种名为LoID的“插件”语义知识转移方法,该方法包括两个主要组件:(1)基于LoRA的大语言模型预训练,以提取多方面的语义信息;(2)基于ID的对比目标,以对齐它们的特征空间。我们在真实世界的数据集上进行了广泛的实验,与最先进的基线方法相比,详细结果显示我们的方法LoID显著提升了性能。|code|0| |Learn From Mistakes: Guidance on Zero-shot Conversational Text-to-SQL|Wenshuo Zhai, Xiang Zhao, Jinzhi Liao, Ziyang Chen||Large language models (LLMs) possess powerful contextual comprehension capabilities and have demonstrated remarkable success in conversational tasks. However, existing works that apply LLMs to conversational text-to-SQL task have the problem of repetitive mistakes, which results in the failure to bring out the performance of LLMs. In this paper, we propose a novel approach that provides guidance through learning from mistakes. Specifically, the guidance offered by our approach includes tailored suggestions, corrective feedback, and personalized strategies aimed at improving learning outcomes. Furthermore, we employ chain-of-thought (CoT) to utilize guidance that is not suitable directly as prompts. Our method rigorously analyzes actual errors and strategizes on how to utilize the derived guidance effectively. 
Experimental results demonstrate that our approach improves the state-of-the-art (SOTA) performance metrics, increasing QEX performance from 66.3% to 70.9% (an absolute improvement of 4.6%) and IEX performance from 37.4% to 45.1% (an absolute improvement of 7.7%) on the CoSQL dataset.|大型语言模型(LLMs)具备强大的上下文理解能力,并在对话任务中展示了显著的成功。然而,现有将LLMs应用于对话式文本到SQL任务的研究存在重复错误的问题,导致无法充分发挥LLMs的性能。本文提出了一种通过从错误中学习来提供指导的新方法。具体而言,我们的方法提供的指导包括定制建议、纠正反馈和个性化策略,旨在提升学习效果。此外,我们采用思维链(Chain-of-Thought, CoT)来利用不适合直接作为提示的指导。我们的方法严格分析实际错误,并策略性地规划如何有效利用所得到的指导。实验结果表明,我们的方法提升了最先进(SOTA)的性能指标,在CoSQL数据集上,QEX性能从66.3%提高到70.9%(绝对提升4.6%),IEX性能从37.4%提高到45.1%(绝对提升7.7%)。|code|0| |Mamba Retriever: Utilizing Mamba for Effective and Efficient Dense Retrieval|Hanqi Zhang, Chong Chen, Lang Mei, Qi Liu, Jiaxin Mao||In the information retrieval (IR) area, dense retrieval (DR) models use deep learning techniques to encode queries and passages into embedding space to compute their semantic relations. It is important for DR models to balance both efficiency and effectiveness. Pre-trained language models (PLMs), especially Transformer-based PLMs, have been proven to be effective encoders of DR models. However, the self-attention component in Transformer-based PLM results in a computational complexity that grows quadratically with sequence length, and thus exhibits a slow inference speed for long-text retrieval. Some recently proposed non-Transformer PLMs, especially the Mamba architecture PLMs, have demonstrated not only comparable effectiveness to Transformer-based PLMs on generative language tasks but also better efficiency due to linear time scaling in sequence length. This paper implements the Mamba Retriever to explore whether Mamba can serve as an effective and efficient encoder of DR model for IR tasks. We fine-tune the Mamba Retriever on the classic short-text MS MARCO passage ranking dataset and the long-text LoCoV0 dataset. 
Experimental results show that (1) on the MS MARCO passage ranking dataset and BEIR, the Mamba Retriever achieves comparable or better effectiveness compared to Transformer-based retrieval models, and the effectiveness grows with the size of the Mamba model; (2) on the long-text LoCoV0 dataset, the Mamba Retriever can extend to longer text length than its pre-trained length after fine-tuning on retrieval task, and it has comparable or better effectiveness compared to other long-text retrieval models; (3) the Mamba Retriever has superior inference speed for long-text retrieval. In conclusion, Mamba Retriever is both effective and efficient, making it a practical model, especially for long-text retrieval.|在信息检索(IR)领域,密集检索(DR)模型利用深度学习技术将查询和文档编码到嵌入空间中,以计算它们的语义关系。对于DR模型来说,平衡效率和效果至关重要。预训练语言模型(PLMs),特别是基于Transformer的PLMs,已被证明是有效的DR模型编码器。然而,基于Transformer的PLM中的自注意力组件导致了计算复杂度随序列长度呈二次增长的特性,因此在长文本检索中表现出较慢的推理速度。最近提出的一些非Transformer的PLMs,特别是Mamba架构的PLMs,不仅在生成性语言任务上展示了与基于Transformer的PLMs相当的有效性,而且由于序列长度线性时间缩放的特性,还表现出更高的效率。本文实现了Mamba检索器,以探讨Mamba是否可以作为IR任务中DR模型的有效且高效的编码器。我们在经典的短文本MS MARCO文档排序数据集和长文本LoCoV0数据集上对Mamba检索器进行了微调。实验结果表明:(1)在MS MARCO文档排序数据集和BEIR上,Mamba检索器与基于Transformer的检索模型相比,达到了相当或更好的效果,并且效果随着Mamba模型规模的增大而提升;(2)在长文本LoCoV0数据集上,Mamba检索器在检索任务上微调后,可以扩展到比预训练时更长的文本长度,并且与其他长文本检索模型相比,具有相当或更好的效果;(3)Mamba检索器在长文本检索中具有优越的推理速度。综上所述,Mamba检索器既有效又高效,尤其适用于长文本检索,是一个实用的模型。|code|0| |Generating Cross-model Analytics Workloads Using LLMs|Xiuwen Zheng, Arun Kumar, Amarnath Gupta|University of California, San Diego, La Jolla, USA|Data analytics applications today often require processing heterogeneous data from different data models, including relational, graph, and text data, for more holistic analytics. While query optimization for single data models, especially relational data, has been studied for decades, there is surprisingly little work on query optimization for cross-model data analytics. 
Cross-model query optimization can benefit from the long line of prior work in query optimization in the relational realm, wherein cost-based and/or machine learning-based (ML-based) optimizers are common. Both approaches require a large and diverse set of query workloads to measure, tune, and evaluate a query optimizer. To the best of our knowledge, there are still no large public cross-model benchmark workloads, a significant obstacle for systems researchers in this space. In this paper, we take a step toward filling this research gap by generating new query workloads spanning relational and graph data, which are ubiquitous in analytics applications. Our approach leverages large language models (LLMs) via different prompting strategies to generate queries and proposes new rule-based post-processing methods to ensure query correctness. We evaluate the pros and cons of each strategy and perform an in-depth analysis by categorizing the syntactic and semantic errors of the generated queries. So far, we have produced over 4000 correct cross-model queries, the largest set ever. Our code, prompts, data, and query workloads will all be released publicly.|当今的数据分析应用通常需要处理来自不同数据模型的异构数据,包括关系型数据、图数据和文本数据,以实现更全面的分析。尽管针对单一数据模型的查询优化,尤其是关系型数据,已经研究了几十年,但关于跨模型数据分析的查询优化研究却出乎意料地少。跨模型查询优化可以借鉴关系领域中丰富的查询优化先前工作,其中基于成本和/或基于机器学习(ML-based)的优化器是常见的。这两种方法都需要大量且多样化的查询工作负载来测量、调整和评估查询优化器。据我们所知,目前仍没有大型公共的跨模型基准工作负载,这对该领域的系统研究人员来说是一个重大障碍。在本文中,我们朝着填补这一研究空白迈出了一步,生成了涵盖关系型和图数据的新查询工作负载,这些数据在分析应用中非常普遍。我们的方法利用大型语言模型(LLMs)通过不同的提示策略生成查询,并提出了新的基于规则的后处理方法以确保查询的正确性。我们评估了每种策略的优缺点,并通过分类生成的查询的句法和语义错误进行了深入分析。到目前为止,我们已经生成了超过4000个正确的跨模型查询,这是迄今为止最大的集合。我们的代码、提示、数据和查询工作负载将全部公开发布。|code|0| |Deep Journey Hierarchical Attention Networks for Conversion Predictions in Digital Marketing|Girim Ban, Hyeonseok Yun, Banseok Lee, David Sung, Simon S. 
Woo|Korea Telecom (KT) NexR, Seoul, Republic of Korea; Sungkyunkwan University, Suwon, Republic of Korea|In digital marketing, precise audience targeting is crucial for campaign efficiency. However, digital marketing agencies often struggle with incomplete user profiles and interaction details from Advertising Identifier (ADID) data in user behavior modeling. To address this, we introduce the Deep Journey Hierarchical Attention Networks (DJHAN). This novel method enhances conversion predictions by leveraging heterogeneous action sequences associated with ADIDs and encapsulating these interactions into structured journeys. These journeys are hierarchically aggregated to effectively represent ADID's behavioral attributes. Moreover, DJHAN incorporates three specialized attention mechanisms: temporal attention for time-sensitive contexts, action attention for emphasizing key behaviors, and journey attention for highlighting influential journeys in the purchase conversion process. Empirically, DJHAN surpasses state-of-the-art (SOTA) models across three diverse datasets, including real-world data from NasMedia, a leading media representative in Asia. 
In backtesting simulations with three advertisers, DJHAN outperforms existing baselines, achieving the highest improvements in Conversion Rate (CVR) and Return on Ad Spend (ROAS) across three advertisers, demonstrating its practical potential in digital marketing.|在数字营销中,精准的受众定位对于提升广告活动效率至关重要。然而,数字营销机构在用户行为建模中常常面临用户资料不完整以及从广告标识符(ADID)数据中获取的互动细节不足的问题。为解决这一问题,我们提出了深度旅程分层注意力网络(DJHAN)。这一创新方法通过利用与ADID相关的异构行为序列,并将这些互动封装成结构化的旅程,从而提高转化预测的准确性。这些旅程被分层聚合,以有效表示ADID的行为属性。此外,DJHAN结合了三种专门的注意力机制:时间注意力用于处理时间敏感的上下文,行为注意力用于强调关键行为,以及旅程注意力用于突出购买转化过程中具有影响力的旅程。实验证明,DJHAN在三个不同数据集上均超越了当前最先进(SOTA)的模型,其中包括来自亚洲领先媒体代表NasMedia的真实世界数据。在针对三家广告商的回测模拟中,DJHAN优于现有的基准模型,在转化率(CVR)和广告支出回报率(ROAS)方面均实现了最高的提升,展示了其在数字营销中的实际应用潜力。|code|0| |LiNR: Model Based Neural Retrieval on GPUs at LinkedIn|Fedor Borisyuk, Qingquan Song, Mingzhou Zhou, Ganesh Parameswaran, Madhu Arun, Siva Popuri, Tugrul Bingol, Zhuotao Pei, KuangHsuan Lee, Lu Zheng, Qizhan Shao, Ali Naqvi, Sen Zhou, Aman Gupta||This paper introduces LiNR, LinkedIn's large-scale, GPU-based retrieval system. LiNR supports a billion-sized index on GPU models. We discuss our experiences and challenges in creating scalable, differentiable search indexes using TensorFlow and PyTorch at production scale. In LiNR, both items and model weights are integrated into the model binary. Viewing index construction as a form of model training, we describe scaling our system for large indexes, incorporating full scans and efficient filtering. A key focus is on enabling attribute-based pre-filtering for exhaustive GPU searches, addressing the common challenge of post-filtering in KNN searches that often reduces system quality. We further provide multi-embedding retrieval algorithms and strategies for tackling cold start issues in retrieval. Our advancements in supporting larger indexes through quantization are also discussed. We believe LiNR represents one of the industry's first Live-updated model-based retrieval indexes. 
Applied to out-of-network post recommendations on LinkedIn Feed, LiNR has contributed to a 3% relative increase in professional daily active users. We envisage LiNR as a step towards integrating retrieval and ranking into a single GPU model, simplifying complex infrastructures and enabling end-to-end optimization of the entire differentiable infrastructure through gradient descent.|本文介绍了LiNR,即LinkedIn基于GPU的大规模检索系统。LiNR支持在GPU模型上处理十亿级索引。我们探讨了在生产规模下使用TensorFlow和PyTorch创建可扩展、可微分搜索索引的经验和挑战。在LiNR中,项目和模型权重都被整合到模型二进制文件中。将索引构建视为一种模型训练的形式,我们描述了如何扩展系统以处理大规模索引,包括全扫描和高效过滤。一个关键重点是启用基于属性的预过滤,以支持全面的GPU搜索,解决KNN搜索中常见的后过滤问题,这些问题通常会降低系统质量。此外,我们还提供了多嵌入检索算法和策略,用于解决检索中的冷启动问题。我们还在通过量化支持更大索引方面取得了进展。我们相信,LiNR代表了业界首批实时更新的基于模型的检索索引之一。应用于LinkedIn Feed的非网络帖子推荐,LiNR已为专业每日活跃用户带来了3%的相对增长。我们设想LiNR是朝着将检索和排序整合到单一GPU模型中的方向迈出的一步,简化了复杂的基础设施,并通过梯度下降实现了整个可微分基础设施的端到端优化。|code|0| |Personalized Video Summarization by Multimodal Video Understanding|Brian Y. Chen, Xiangyuan Zhao, Yingnan Zhu|VDIL, Samsung Research America, Irvine, CA, USA; VDIL, Samsung Research America, irvine, CA, USA|Video summarization techniques have been proven to improve the overall user experience when it comes to accessing and comprehending video content. If the user's preference is known, video summarization can identify significant information or relevant content from an input video, aiding them in obtaining the necessary information or determining their interest in watching the original video. Adapting video summarization to various types of video and user preferences requires significant training data and expensive human labeling. To facilitate such research, we proposed a new benchmark for video summarization that captures various user preferences. Also, we present a pipeline called Video Summarization with Language (VSL) for user-preferred video summarization that is based on pre-trained visual language models (VLMs) to avoid the need to train a video summarization system on a large training dataset. 
The pipeline takes both video and closed captioning as input and performs semantic analysis at the scene level by converting video frames into text. Subsequently, the user's genre preference was used as the basis for selecting the pertinent textual scenes. The experimental results demonstrate that our proposed pipeline outperforms current state-of-the-art unsupervised video summarization models. We show that our method is more adaptable across different datasets compared to supervised query-based video summarization models. In the end, the runtime analysis demonstrates that our pipeline is more suitable for practical use when scaling up the number of user preferences and videos.|视频摘要技术已被证明在用户访问和理解视频内容时能够提升整体用户体验。如果用户的偏好已知,视频摘要可以从输入视频中识别出重要信息或相关内容,帮助用户获取必要信息或判断是否对观看原视频感兴趣。将视频摘要适应于各种类型的视频和用户偏好需要大量的训练数据和昂贵的人工标注。为了促进此类研究,我们提出了一种新的视频摘要基准,该基准捕捉了多种用户偏好。此外,我们提出了一种名为Video Summarization with Language(VSL)的管道,用于基于预训练的视觉语言模型(VLMs)的用户偏好视频摘要,以避免在大规模训练数据集上训练视频摘要系统的需求。该管道接受视频和封闭字幕作为输入,并通过将视频帧转换为文本来在场景级别进行语义分析。随后,用户的类型偏好被用作选择相关文本场景的基础。实验结果表明,我们提出的管道优于当前最先进的无监督视频摘要模型。我们展示了与基于监督查询的视频摘要模型相比,我们的方法在不同数据集上更具适应性。最后,运行时分析表明,当扩展用户偏好和视频数量时,我们的管道更适合实际应用。|code|0| |Blind-Match: Efficient Homomorphic Encryption-Based 1: N Matching for Privacy-Preserving Biometric Identification|Hyunmin Choi, Jiwon Kim, Chiyoung Song, Simon S. Woo, Hyoungshick Kim||We present Blind-Match, a novel biometric identification system that leverages homomorphic encryption (HE) for efficient and privacy-preserving 1:N matching. Blind-Match introduces a HE-optimized cosine similarity computation method, where the key idea is to divide the feature vector into smaller parts for processing rather than computing the entire vector at once. By optimizing the number of these parts, Blind-Match minimizes execution time while ensuring data privacy through HE. Blind-Match achieves superior performance compared to state-of-the-art methods across various biometric datasets. 
On the LFW face dataset, Blind-Match attains 99.63% accuracy, demonstrating its robustness in face recognition tasks. For fingerprint identification, Blind-Match achieves a remarkable 99.55% accuracy on the PolyU dataset, even with a compact 16-dimensional feature vector, significantly outperforming the state-of-the-art method, Blind-Touch, which achieves only 59.17%. Blind-Match is also efficient in large-scale biometric identification scenarios, such as Naver Cloud's FaceSign, processing 6,144 biometric samples in 0.74 seconds using a 128-dimensional feature vector.|我们提出了Blind-Match,这是一种新颖的生物识别身份验证系统,利用同态加密(HE)实现高效且隐私保护的1:N匹配。Blind-Match引入了一种HE优化的余弦相似度计算方法,其关键思想是将特征向量分成较小的部分进行处理,而不是一次性计算整个向量。通过优化这些部分的数量,Blind-Match在确保通过HE保护数据隐私的同时,最小化了执行时间。与现有最先进的方法相比,Blind-Match在各种生物识别数据集上表现出色。在LFW人脸数据集上,Blind-Match达到了99.63%的准确率,展示了其在人脸识别任务中的鲁棒性。对于指纹识别,Blind-Match在PolyU数据集上达到了99.55%的准确率,即使使用的是紧凑的16维特征向量,也显著优于最先进的方法Blind-Touch(后者仅为59.17%)。此外,Blind-Match在大规模生物识别场景(如Naver Cloud的FaceSign)中同样高效,使用128维特征向量可在0.74秒内处理6,144个生物识别样本。|code|0| |Automated Contrastive Learning Strategy Search for Time Series|Baoyu Jing, Yansen Wang, Guoxin Sui, Jing Hong, Jingrui He, Yuqing Yang, Dongsheng Li, Kan Ren|Microsoft Research Asia; ShanghaiTech University; University of Illinois at Urbana-Champaign; Ruijin Hospital|In recent years, Contrastive Learning (CL) has become a predominant representation learning paradigm for time series. Most existing methods in the literature focus on manually building specific Contrastive Learning Strategies (CLS) by human heuristics for certain datasets and tasks. However, manually developing CLS usually requires excessive prior knowledge about the datasets and tasks, e.g., professional cognition of the medical time series in healthcare, as well as huge human labor and massive experiments to determine the detailed learning configurations. 
In this paper, we present an Automated Machine Learning (AutoML) practice at Microsoft, which automatically learns to contrastively learn representations for various time series datasets and tasks, namely Automated Contrastive Learning (AutoCL). We first construct a principled universal search space of size over 3x10^12, covering data augmentation, embedding transformation, contrastive pair construction and contrastive losses. Further, we introduce an efficient reinforcement learning algorithm, which optimizes CLS from the performance on the validation tasks, to obtain more effective CLS within the space. Experimental results on various real-world tasks and datasets demonstrate that AutoCL could automatically find the suitable CLS for a given dataset and task. From the candidate CLS found by AutoCL on several public datasets/tasks, we compose a transferable Generally Good Strategy (GGS), which has a strong performance for other datasets. We also provide empirical analysis as a guidance for future design of CLS.|近年来,对比学习(Contrastive Learning, CL)已成为时间序列的主要表示学习范式。现有文献中的大多数方法侧重于通过人类启发式方法为特定数据集和任务手动构建特定的对比学习策略(Contrastive Learning Strategy, CLS)。然而,手动开发CLS通常需要对数据集和任务有大量的先验知识,例如医疗保健领域中对医疗时间序列的专业认知,以及大量的人力劳动和实验来确定详细的学习配置。在本文中,我们介绍了微软在自动化机器学习(Automated Machine Learning, AutoML)方面的一项实践,该实践能够自动学习为各种时间序列数据集和任务进行对比学习表示,即自动化对比学习(Automated Contrastive Learning, AutoCL)。我们首先构建了一个原则性的通用搜索空间,其大小超过3x10^12,涵盖了数据增强、嵌入变换、对比对构建和对比损失。此外,我们引入了一种高效的强化学习算法,该算法从验证任务的性能出发优化CLS,以在空间内获得更有效的CLS。在各种真实世界任务和数据集上的实验结果表明,AutoCL能够自动找到适合给定数据集和任务的CLS。从AutoCL在几个公共数据集/任务上找到的候选CLS中,我们组合了一个可迁移的通用良好策略(Generally Good Strategy, GGS),该策略在其他数据集上表现出色。我们还提供了经验分析,作为未来设计CLS的指导。|code|0| |REAPER: Reasoning based Retrieval Planning for Complex RAG Systems|Ashutosh Joshi, Sheikh Muhammad Sarwar, Samarth Varshney, Sreyashi Nag, Shrivats Agrawal, Juhi Naik||Complex dialog systems often use retrieved evidence to facilitate factual responses. 
Such RAG (Retrieval Augmented Generation) systems retrieve from massive heterogeneous data stores that are usually architected as multiple indexes or APIs instead of a single monolithic source. For a given query, relevant evidence needs to be retrieved from one or a small subset of possible retrieval sources. Complex queries can even require multi-step retrieval. For example, a conversational agent on a retail site answering customer questions about past orders will need to retrieve the appropriate customer order first and then the evidence relevant to the customer's question in the context of the ordered product. Most RAG Agents handle such Chain-of-Thought (CoT) tasks by interleaving reasoning and retrieval steps. However, each reasoning step directly adds to the latency of the system. For large models (>100B parameters) this latency cost is significant – in the order of multiple seconds. Multi-agent systems may classify the query to a single Agent associated with a retrieval source, though this means that a (small) classification model dictates the performance of a large language model. In this work we present REAPER (REAsoning-based PlannER) - an LLM based planner to generate retrieval plans in conversational systems. We show significant gains in latency over Agent-based systems and are able to scale easily to new and unseen use cases as compared to classification-based planning. 
Though our method can be applied to any RAG system, we show our results in the context of Rufus – Amazon's conversational shopping assistant.|复杂的对话系统通常使用检索到的证据来支持事实性回答。这类RAG(检索增强生成)系统从通常架构为多个索引或API而非单一整体源的庞大异构数据存储中进行检索。对于给定的查询,相关证据需要从一个或少数几个可能的检索源中检索出来。复杂的查询甚至可能需要多步检索。例如,一个零售网站上的对话代理在回答关于过去订单的客户问题时,首先需要检索到适当的客户订单,然后在此订单产品的上下文中检索与客户问题相关的证据。大多数RAG代理通过交错推理和检索步骤来处理此类思维链(Chain-of-Thought, CoT)任务。然而,每个推理步骤都会直接增加系统的延迟。对于大型模型(超过1000亿参数),这种延迟成本是显著的——通常在几秒的量级。多代理系统可能会将查询分类到一个与检索源相关的单一代理,尽管这意味着一个(小型)分类模型决定了大型语言模型的性能。在这项工作中,我们提出了REAPER(基于推理的计划器)——一个基于LLM的计划器,用于在对话系统中生成检索计划。我们展示了在基于代理的系统中显著的延迟改进,并且能够轻松扩展到新的和未见过的用例,相比于基于分类的计划。虽然我们的方法可以应用于任何RAG系统,但我们展示了在Rufus——亚马逊的对话购物助手——中的结果。|code|0| |RL-ISLAP: A Reinforcement Learning Framework for Industrial-Scale Linear Assignment Problems at Alipay|Hanjie Li, Yue Ning, Yang Bao, Changsheng Li, Boxiao Chen, Xingyu Lu, Ye Yuan, Guoren Wang|Beijing Institute of Technology, Beijing, China; Independent Researcher, Shanghai, China; Independent Researcher, Hangzhou, China|Industrial-scale linear assignment problems (LAPs) are frequently encountered in various industrial scenarios, e.g., asset allocation within the domain of credit management. However, optimization algorithms for such problems (e.g., PJ-ADMM) are highly sensitive to hyper-parameters. Existing solving systems rely on empirical parameter selection, which is challenging to achieve convergence and extremely time-consuming. Additionally, the resulting parameter rules are often inefficient. To alleviate this issue, we propose RL-ISLAP, an efficient and lightweight Reinforcement Learning framework for Industrial-Scale Linear Assignment Problems. We formulate the hyper-parameter selection for PJ-ADMM as a sequential decision problem and leverage reinforcement learning to enhance its convergence. 
Addressing the sparse reward challenge inherent in learning policies for such problems, we devise auxiliary rewards to provide dense signals for policy optimization, and present a rollback mechanism to prevent divergence in the solving process. Experiments on OR-Library benchmark demonstrate that our method is competitive to SOTA stand-alone solvers. Furthermore, the scale-independent design of observations enables us to transfer the acquired hyper-parameter policy to a scenario of LAPs in varying scales. On two real-world industrial-scale LAPs with up to 10 millions of decision variables, our proposed RL-ISLAP achieves solutions of comparable quality in 2/3 of the time when compared to the SOTA distributed solving system employing fine-tuned empirical parameter rules.|在各种工业场景中,例如信用管理领域的资产分配,经常会遇到大规模的线性分配问题(LAPs)。然而,针对此类问题的优化算法(例如PJ-ADMM)对超参数非常敏感。现有的求解系统依赖于经验参数选择,这不仅难以实现收敛,而且极其耗时。此外,由此产生的参数规则往往效率低下。为了缓解这一问题,我们提出了RL-ISLAP,一个针对工业规模线性分配问题的高效且轻量级的强化学习框架。我们将PJ-ADMM的超参数选择问题形式化为一个序列决策问题,并利用强化学习来增强其收敛性。针对此类问题中固有的稀疏奖励挑战,我们设计了辅助奖励以提供密集的信号用于策略优化,并提出了一种回滚机制以防止求解过程中的发散。在OR-Library基准测试中的实验表明,我们的方法与最先进的独立求解器具有竞争力。此外,观察结果的规模无关设计使我们能够将获得的超参数策略迁移到不同规模的LAP场景中。在两个具有多达1000万个决策变量的真实工业规模LAP问题上,与使用精细调参的经验参数规则的最先进分布式求解系统相比,我们提出的RL-ISLAP在2/3的时间内实现了同等质量的解决方案。|code|0| |Explainable and Coherent Complement Recommendation Based on Large Language Models|Zelong Li, Yan Liang, Ming Wang, Sungro Yoon, Jiaying Shi, Xin Shen, Xiang He, Chenwei Zhang, Wenyi Wu, Hanbo Wang, Jin Li, Jim Chan, Yongfeng Zhang|Amazon.com, Seattle, WA, USA; Rutgers University, New Brunswick, NJ, USA|A complementary item is an item that pairs well with another item when consumed together. In the context of e-commerce, providing recommendations for complementary items is essential for both customers and stores. Current models for suggesting complementary items often rely heavily on user behavior data, such as co-purchase relationships. 
However, just because two items are frequently bought together does not necessarily mean they are truly complementary. Relying solely on co-purchase data may not align perfectly with the goal of making meaningful complementary recommendations. In this paper, we introduce the concept of "coherent complement recommendation", where "coherent" implies that recommended item pairs are compatible and relevant. Our approach builds upon complementary item pairs, with a focus on ensuring that recommended items are well used together and contextually relevant. To enhance the explainability and coherence of our complement recommendations, we fine-tune the Large Language Model (LLM) with coherent complement recommendation and explanation generation tasks since LLM has strong natural language explanation generation ability and multi-task fine-tuning enhances task understanding. Experimental results indicate that our model can provide more coherent complementary recommendations than existing state-of-the-art methods, and human evaluation validates that our approach achieves up to a 48% increase in the coherent rate of complement recommendations.|互补商品是指在共同消费时能够良好搭配的商品。在电子商务领域,为顾客和商家提供互补商品的推荐至关重要。当前推荐互补商品的模型通常严重依赖用户行为数据,如共同购买关系。然而,仅仅因为两种商品经常被一起购买并不一定意味着它们真正具有互补性。仅依赖共同购买数据可能无法完全实现提供有意义的互补推荐的目标。本文中,我们引入了“连贯互补推荐”的概念,其中“连贯”意味着推荐的商品对是兼容且相关的。我们的方法基于互补商品对,重点确保推荐的商品在使用时能够良好搭配并具有上下文相关性。为了增强互补推荐的解释性和连贯性,我们通过连贯互补推荐和解释生成任务对大型语言模型(LLM)进行微调,因为LLM具有强大的自然语言解释生成能力和多任务微调可以增强任务理解。实验结果表明,我们的模型能够提供比现有最先进方法更为连贯的互补推荐,而人工评估验证了我们的方法在互补推荐连贯率上实现了高达48%的提升。|code|0| |Boosting LLM-based Relevance Modeling with Distribution-Aware Robust Learning|Hong Liu, Saisai Gong, Yixin Ji, Kaixin Wu, Jia Xu, Jinjie Gu|Ant Group, Hangzhou, China|Relevance modeling plays a crucial role in e-commerce search engines, striving to identify the utmost pertinent items corresponding to a given search query. 
With the rapid advancement of pre-trained large language models (LLMs), recent endeavors have leveraged the capabilities of LLMs in relevance modeling, resulting in enhanced performance. This is usually done through the process of fine-tuning LLMs on specifically annotated datasets to determine the relevance between queries and items. However, there are two limitations when LLMs are naively employed for relevance modeling through fine-tuning and inference. First, it is not inherently efficient for performing nuanced tasks beyond simple yes or no answers, such as assessing search relevance. It may therefore tend to be overconfident and struggle to distinguish fine-grained degrees of relevance (e.g., strong relevance, weak relevance, irrelevance) used in search engines. Second, it exhibits significant performance degradation when confronted with data distribution shift in real-world scenarios. In this paper, we propose a novel Distribution-Aware Robust Learning framework (DaRL) for relevance modeling in Alipay Search. Specifically, we design an effective loss function to enhance the discriminability of LLM-based relevance modeling across various fine-grained degrees of query-item relevance. To improve the generalizability of LLM-based relevance modeling, we first propose the Distribution-Aware Sample Augmentation (DASA) module. This module utilizes out-of-distribution (OOD) detection techniques to actively select appropriate samples that are not well covered by the original training set for model fine-tuning. Furthermore, we adopt a multi-stage fine-tuning strategy to simultaneously improve in-distribution (ID) and OOD performance, bridging the performance gap between them. DaRL has been deployed online to serve the Alipay's insurance product search. 
Both offline experiments on real-world industry data and online A/B testing show that DaRL effectively improves the performance of relevance modeling.|相关性建模在电子商务搜索引擎中扮演着至关重要的角色,旨在识别与给定搜索查询最相关的商品。随着预训练大型语言模型(LLMs)的快速发展,近期研究已利用LLMs在相关性建模中的能力,从而提升了性能。这通常通过在专门标注的数据集上微调LLMs来实现,以确定查询与商品之间的相关性。然而,当LLMs通过微调和推理简单地用于相关性建模时,存在两个局限性。首先,LLMs并不擅长执行超出简单是或否回答的细微任务,例如评估搜索相关性。因此,它们可能倾向于过度自信,难以区分搜索引擎中使用的细粒度相关性程度(如强相关、弱相关、不相关)。其次,在面对现实场景中的数据分布偏移时,LLMs表现出显著的性能下降。
本文提出了一个名为分布感知鲁棒学习框架(DaRL)的新方法,用于支付宝搜索中的相关性建模。具体而言,我们设计了一种有效的损失函数,以增强基于LLM的相关性建模在不同细粒度查询-商品相关性上的区分能力。为了提高基于LLM的相关性建模的泛化能力,我们首先提出了分布感知样本增强(DASA)模块。该模块利用分布外(OOD)检测技术,主动选择原始训练集未充分覆盖的适当样本进行模型微调。此外,我们采用多阶段微调策略,以同时提升分布内(ID)和分布外(OOD)性能,缩小两者之间的性能差距。DaRL已部署上线,服务于支付宝的保险产品搜索。基于真实行业数据的线下实验和在线A/B测试均表明,DaRL有效提升了相关性建模的性能。|code|0|
|A Self-Adaptive Fairness Constraint Framework for Industrial Recommender System|Zhiqiang Liu, Xiaoxiao Xu, Jiaqi Yu, Han Xu, Lantao Hu, Han Li, Kun Gai|Kuaishou Technology, Beijing, China; Unaffiliated, Beijing, China|Achieving fairness among different individuals or groups is an essential task for industrial recommender systems. Due to the group's personalized selection tendencies and the non-uniform population distributions, existing industrial recommenders tend to make unfair predictions towards the preferences of minority groups. To alleviate this unfairness, we propose a model-agnostic self-adaptive fairness constraint framework (SaFair) based on the posterior preferences of different groups. We construct group-level and individual-level fairness constraints. The former measures consistency between group-level posterior preferences and predicted interests, and the latter relies on the degree of consistency in interests between a user and their associated group to perform self-adaptive constraints. In particular, to balance effectiveness and fairness, we utilize uncertainty estimation to adjust the intensity of constraints according to the model's learning status, which we call self-adaptive constraints. Extensive offline experiments and online A/B testing are conducted and the results validate the superiority of our proposed method over the baselines. SaFair has been successfully deployed in Kuaishou, one of China's most popular short-video streaming platforms with hundreds of millions of active users.|在工业推荐系统中,实现不同个体或群体之间的公平性是一项至关重要的任务。由于群体的个性化选择倾向和非均匀的人口分布,现有的工业推荐系统往往对少数群体的偏好做出不公平的预测。为了缓解这种不公平性,我们提出了一个基于不同群体后验偏好的模型无关的自适应公平约束框架(SaFair)。我们构建了群体级别和个人级别的公平约束。前者衡量群体级别后验偏好与预测兴趣之间的一致性,而后者则依赖于用户与其所属群体之间兴趣一致性的程度来执行自适应约束。特别地,为了平衡有效性和公平性,我们利用不确定性估计根据模型的学习状态调整约束的强度,称为自适应约束。我们进行了广泛的离线实验和在线A/B测试,结果验证了我们提出的方法相对于基线的优越性。SaFair已成功部署在中国最受欢迎的短视频流媒体平台之一——快手,该平台拥有数亿活跃用户。|code|0|
|GLaD: Synergizing Molecular Graphs and Language Descriptors for Enhanced Power Conversion Efficiency Prediction in Organic Photovoltaic Devices|Thao Nguyen, Tiara TorresFlores, Changhyun Hwang, Carl Edwards, Ying Diao, Heng Ji|University of Illinois Urbana-Champaign Siebel School of Computing and Data Science; University of Illinois Urbana-Champaign Department of Chemical & Biomolecular Engineering|This paper presents a novel approach for predicting Power Conversion Efficiency (PCE) of Organic Photovoltaic (OPV) devices, called GLaD: synergizing molecular Graphs and Language Descriptors for enhanced PCE prediction. Due to the lack of high-quality experimental data, we collect a dataset consisting of 500 pairs of OPV donor and acceptor molecules along with their corresponding PCE values, which we utilize as the training data for our predictive model. In this low-data regime, GLaD leverages properties learned from large language models (LLMs) pretrained on extensive scientific literature to enrich molecular structural representations, allowing for a multimodal representation of molecules. GLaD achieves precise predictions of PCE, thereby facilitating the synthesis of new OPV molecules with improved efficiency. Furthermore, GLaD showcases versatility, as it applies to a range of molecular property prediction tasks (BBBP, BACE, ClinTox, and SIDER), not limited to those concerning OPV materials. Especially, GLaD proves valuable for tasks in low-data regimes within the chemical space, as it enriches molecular representations by incorporating molecular property descriptions learned from large-scale pretraining. 
This capability is significant in real-world scientific endeavors like drug and material discovery, where access to comprehensive data is crucial for informed decision-making and efficient exploration of the chemical space.|本文提出了一种名为GLaD(结合分子图和语言描述符以增强功率转换效率预测)的新方法,用于预测有机光伏(OPV)器件的功率转换效率(PCE)。由于缺乏高质量的实验数据,我们收集了一个包含500对OPV供体和受体分子及其相应PCE值的数据集,并将其用作预测模型的训练数据。在这种数据量较少的情况下,GLaD利用从广泛科学文献预训练的大型语言模型(LLMs)中学习到的属性来丰富分子结构表示,从而实现分子的多模态表示。GLaD能够精确预测PCE,从而促进合成效率更高的新OPV分子。此外,GLaD展示了其多功能性,适用于一系列分子属性预测任务(BBBP、BACE、ClinTox和SIDER),不仅限于与OPV材料相关的任务。特别是在化学空间中的低数据任务中,GLaD通过整合从大规模预训练中学习到的分子属性描述来丰富分子表示,这被证明具有重要价值。这一能力在药物和材料发现等实际科学工作中至关重要,因为全面的数据对于明智的决策和化学空间的有效探索至关重要。|code|0|
|Cross-contextual Sequential Optimization via Deep Reinforcement Learning for Algorithmic Trading|Kaiming Pan, Yifan Hu, Li Han, Haoyu Sun, Dawei Cheng, Yuqi Liang|Seek Data Group, Emoney Inc., Shanghai, China; Software Engineering Institute, East China Normal University, Shanghai, China; Department of Computer Science, Tongji University, Shanghai, China|High-frequency algorithmic trading has consistently attracted attention in both academic and industrial fields, which is formally modeled as a near real-time sequential decision problem. DRL methods are treated as a promising direction compared with the traditional approaches, as they have shown great potential in chasing maximum accumulative return. However, the financial data gathered from volatile market change rapidly, which makes it dramatically difficult to grasp crucial factors for effective decision-making. Existing works mainly focus on capturing temporal relations while ignoring deriving essential factors across features. Therefore, we propose a DRL-based cross-contextual sequential optimization (CCSO) method for algorithmic trading. In particular, we employ a convolution module in the first stage to derive latent factors via inter-sequence aggregation and apply a well-designed self-attention module in the second stage to capture market dynamics by aggregating temporal intra-sequence details. With the two-stage extractor as encoder and a RNN-based decision-maker as decoder, an Encoder-Decoder module is established as the policy network to conduct potent feature analysis and suggest action plans. Then, we design a dynamic programming based learning method to address the challenge of complex network updates in reinforcement learning, leading to considerable enhancement in learning stability and efficiency. To the best of our knowledge, this is the first work that solves the sequential optimization problem by joint representation of trading data across time and features in the DRL framework. 
Extensive experiments demonstrate the superior performance of our method compared to other state-of-the-art algorithmic trading approaches in various widely-used metrics.|高频算法交易在学术界和工业界持续受到关注,其正式模型被视为一种近实时序列决策问题。与传统方法相比,深度强化学习(DRL)方法被视为一个有前景的方向,因为它们在追求最大累积回报方面展现了巨大潜力。然而,从市场波动中收集的金融数据变化迅速,这使得捕捉有效决策的关键因素变得极其困难。现有研究主要集中在捕捉时间关系,而忽视了跨特征推导重要因素。因此,我们提出了一种基于DRL的跨上下文序列优化(CCSO)方法用于算法交易。具体而言,我们在第一阶段采用卷积模块通过序列间聚合推导潜在因素,并在第二阶段应用设计良好的自注意力模块通过聚合时间序列内细节来捕捉市场动态。通过将两阶段提取器作为编码器和基于RNN的决策者作为解码器,构建了一个编码器-解码器模块作为策略网络,以进行强大的特征分析并提出行动计划。随后,我们设计了一种基于动态规划的学习方法来应对强化学习中复杂网络更新的挑战,从而显著提升了学习稳定性和效率。据我们所知,这是首次在DRL框架中通过时间与特征的联合表示来解决序列优化问题的工作。广泛的实验证明,我们的方法在各种广泛使用的指标上优于其他最先进的算法交易方法。|code|0|
|STIR: Siamese Transformer for Image Retrieval Postprocessing|Aleksei Shabanov, Aleksei Tarasov, Sergey I. Nikolenko||Current metric learning approaches for image retrieval are usually based on learning a space of informative latent representations where simple approaches such as the cosine distance will work well. Recent state of the art methods such as HypViT move to more complex embedding spaces that may yield better results but are harder to scale to production environments. In this work, we first construct a simpler model based on triplet loss with hard negatives mining that performs at the state of the art level but does not have these drawbacks. Second, we introduce a novel approach for image retrieval postprocessing called Siamese Transformer for Image Retrieval (STIR) that reranks several top outputs in a single forward pass. Unlike previously proposed Reranking Transformers, STIR does not rely on global/local feature extraction and directly compares a query image and a retrieved candidate on pixel level with the usage of attention mechanism. The resulting approach defines a new state of the art on standard image retrieval datasets: Stanford Online Products and DeepFashion In-shop. 
We also release the source code at https://github.com/OML-Team/open-metric-learning/tree/main/pipelines/postprocessing/ and an interactive demo of our approach at https://dapladoc-oml-postprocessing-demo-srcappmain-pfh2g0.streamlit.app/|当前用于图像检索的度量学习方法通常基于学习一个信息丰富的潜在表示空间,其中简单的度量方法(如余弦距离)表现良好。最近的最先进方法,如HypViT,转向了更复杂的嵌入空间,可能带来更好的结果,但更难以扩展到生产环境中。在这项工作中,我们首先构建了一个基于三元组损失与难负样本挖掘的简单模型,该模型达到了最先进的性能水平,但没有这些缺点。其次,我们引入了一种名为Siamese Transformer for Image Retrieval(STIR)的图像检索后处理新方法,通过一次前向传播对多个顶部输出进行重新排序。与之前提出的重新排序变压器不同,STIR不依赖于全局/局部特征提取,而是直接在像素级别上使用注意力机制比较查询图像和检索到的候选图像。该方法在标准图像检索数据集上定义了新的最先进水平:Stanford Online Products和DeepFashion In-shop。我们还发布了源代码,链接为https://github.com/OML-Team/open-metric-learning/tree/main/pipelines/postprocessing/,并提供了一个交互式演示,链接为https://dapladoc-oml-postprocessing-demo-srcappmain-pfh2g0.streamlit.app/。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=STIR:+Siamese+Transformer+for+Image+Retrieval+Postprocessing)|0|
|Enhancing Taobao Display Advertising with Multimodal Representations: Challenges, Approaches and Insights|XiangRong Sheng, Feifan Yang, Litong Gong, Biao Wang, Zhangming Chan, Yujing Zhang, Yueyao Cheng, YongNan Zhu, Tiezheng Ge, Han Zhu, Yuning Jiang, Jian Xu, Bo Zheng||Despite the recognized potential of multimodal data to improve model accuracy, many large-scale industrial recommendation systems, including Taobao display advertising system, predominantly depend on sparse ID features in their models. In this work, we explore approaches to leverage multimodal data to enhance the recommendation accuracy. We start from identifying the key challenges in adopting multimodal data in a manner that is both effective and cost-efficient for industrial systems. To address these challenges, we introduce a two-phase framework, including: 1) the pre-training of multimodal representations to capture semantic similarity, and 2) the integration of these representations with existing ID-based models. Furthermore, we detail the architecture of our production system, which is designed to facilitate the deployment of multimodal representations. Since the integration of multimodal representations in mid-2023, we have observed significant performance improvements in Taobao display advertising system. We believe that the insights we have gathered will serve as a valuable resource for practitioners seeking to leverage multimodal data in their systems.|尽管多模态数据被认为具有提高模型准确性的潜力,但许多大规模工业推荐系统,包括淘宝展示广告系统,主要依赖于模型中的稀疏ID特征。在这项工作中,我们探索了利用多模态数据来增强推荐准确性的方法。我们从识别在工业系统中有效且成本高效地采用多模态数据的关键挑战开始。为了解决这些挑战,我们引入了一个两阶段框架,包括:1)多模态表示的预训练以捕捉语义相似性,以及2)将这些表示与现有的基于ID的模型进行整合。此外,我们详细介绍了我们的生产系统架构,该架构设计用于促进多模态表示的部署。自2023年年中多模态表示整合以来,我们在淘宝展示广告系统中观察到了显著的性能提升。我们相信,我们获得的这些见解将为寻求在其系统中利用多模态数据的从业者提供宝贵的资源。|code|0|
|LLM-based Automated Web Retrieval and Text Classification of Food Sharing Initiatives|Hao Wu, Hyunji Cho, Anna R. Davies, Gareth J. F. Jones|ADAPT Centre, School of Computing, Dublin City University, Dublin, Ireland; Geography, School of Natural Sciences, Trinity College Dublin, Dublin, Ireland|Urban and peri-urban (UPU) food systems encounter challenges in sustainability and are fragile and vulnerable to shocks. Addressing these issues is one of the key drivers of food sharing initiatives (FSIs) which focus on collective acts around food across the food system. FSIs range from seed sharing and surplus food redistribution to community composting. We describe our development and deployment of web retrieval and content classification tools designed to provide automated mapping of FSIs at scale to populate databases of FSIs within cities. We present our novel automated system tailored for retrieving, identifying, categorizing and real-time monitoring of FSIs in over 200 European cities. Developed within the European CULTIVATE project, this system not only aids in comprehending the complex dynamics of the food sharing economy, but also enhances its visibility and operational efficiency. The automation of these processes plays a vital role in supporting the goals of the CULTIVATE project, notably in promoting sustainable food practices and resilient local food networks. Our system integrates web search using queries constructed automatically using domain-specific vocabulary resources with Large Language Model (LLM) query writing and classification methods. Experimental results using a collection of data derived from real online FSI content underscore the potential of digital automation to make significant contributions to innovative digital solutions to contemporary sustainability challenges. 
As such, the findings of this work pave the way for future research and implementation in similar contexts.|城市和城郊(UPU)食品系统在可持续性方面面临挑战,且容易受到冲击的影响,表现出脆弱性。解决这些问题是食品共享倡议(FSIs)的关键驱动力之一,这些倡议侧重于整个食品系统中的集体食品行为。FSIs的范围从种子共享和剩余食品再分配到社区堆肥。我们描述了开发和部署的网络检索和内容分类工具,这些工具旨在大规模自动绘制FSIs地图,以填充城市内的FSIs数据库。我们展示了一种专为在200多个欧洲城市中检索、识别、分类和实时监控FSIs而设计的自动化系统。该系统是在欧洲CULTIVATE项目中开发的,不仅有助于理解食品共享经济的复杂动态,还增强了其可见性和运营效率。这些过程的自动化对支持CULTIVATE项目的目标起着至关重要的作用,特别是在促进可持续食品实践和弹性本地食品网络方面。我们的系统集成了利用领域特定词汇资源自动构建查询的网络搜索与大型语言模型(LLM)查询编写和分类方法。使用从真实在线FSI内容中提取的数据集进行的实验结果,突显了数字自动化对当代可持续性挑战创新数字解决方案的潜在贡献。因此,这项工作的发现为未来在类似情境中的研究和实施铺平了道路。|code|0|
|Breaking the Barrier: Utilizing Large Language Models for Industrial Recommendation Systems through an Inferential Knowledge Graph|Qian Zhao, Hao Qian, Ziqi Liu, GongDuo Zhang, Lihong Gu|Ant Group, Hangzhou|Recommendation systems are widely used in e-commerce websites and online platforms to address information overload. However, existing systems primarily rely on historical data and user feedback, making it difficult to capture user intent transitions. Recently, Knowledge Base (KB)-based models have been proposed to incorporate expert knowledge, but they struggle to adapt to new items and the evolving e-commerce environment. To address these challenges, we propose a novel Large Language Model based Complementary Knowledge Enhanced Recommendation System (LLM-KERec). It introduces an entity extractor that extracts unified concept terms from item and user information. To provide cost-effective and reliable prior knowledge, entity pairs are generated based on entity popularity and specific strategies. The large language model determines complementary relationships in each entity pair, constructing a complementary knowledge graph. Furthermore, a new complementary recall module and an Entity-Entity-Item (E-E-I) weight decision model refine the scoring of the ranking model using real complementary exposure-click samples. Extensive experiments conducted on three industry datasets demonstrate the significant performance improvement of our model compared to existing approaches. Additionally, detailed analysis shows that LLM-KERec enhances users' enthusiasm for consumption by recommending complementary items. 
In summary, LLM-KERec addresses the limitations of traditional recommendation systems by incorporating complementary knowledge and utilizing a large language model to capture user intent transitions, adapt to new items, and enhance recommendation efficiency in the evolving e-commerce landscape.|推荐系统在电子商务网站和在线平台上被广泛使用,以应对信息过载问题。然而,现有系统主要依赖历史数据和用户反馈,难以捕捉用户意图的转变。最近,基于知识库(KB)的模型被提出以整合专家知识,但它们难以适应新商品和不断变化的电子商务环境。为解决这些挑战,我们提出了一种新颖的大语言模型(LLM)增强的补充知识推荐系统(LLM-KERec)。该系统引入了一个实体提取器,从商品和用户信息中提取统一的概念术语。为了提供成本效益高且可靠的先验知识,实体对基于实体流行度和特定策略生成。大语言模型确定每个实体对中的补充关系,构建一个补充知识图。此外,一个新的补充召回模块和一个实体-实体-商品(E-E-I)权重决策模型使用真实的补充曝光-点击样本来优化排序模型的评分。在三个行业数据集上进行的大量实验表明,我们的模型相较于现有方法显著提升了性能。详细分析显示,LLM-KERec通过推荐补充商品增强了用户的消费热情。总之,LLM-KERec通过整合补充知识并利用大语言模型捕捉用户意图转变,适应新商品,并在不断变化的电子商务环境中提升推荐效率,从而解决了传统推荐系统的局限性。|code|0|
|STaR: Space and Time-aware Statistic Query Answering|Oana Balalau, Simon Ebel, Helena Galhardas, Théo Galizzi, Ioana Manolescu|INESC-ID & IST, Universidade Lisboa, Lisbon, Portugal; Inria & Institut Polytechnique de Paris, Palaiseau, France|High-quality data is essential for informed public debate. High-quality statistical data sources provide valuable reference information for verifying claims. To assist journalists and fact-checkers, user queries about specific claims should be automatically answered using statistical tables. However, the large number and variety of these sources make this task challenging. We propose to demonstrate STaR, a novel method for Space and Time-aware STatistic Retrieval, based on a user natural language query. STaR is deployed within our system StatCheck, which we developed and shared with fact-checking journalists. STaR improves the quality of statistic fact retrieval by treating space and time separately from the other parts of the statistics dataset. Specifically, we use them as dimensions of the data (and the query), and focus the linguistic part of our dataset search on the rich, varied language present in the data. Our demonstration uses statistic datasets from France, Europe, and a few beyond, allowing users to query and explore along space and time dimensions.|高质量的数据对于公众辩论至关重要。高质量的统计数据源为验证声明提供了宝贵的参考信息。为了协助记者和事实核查人员,应能够自动使用统计表格回答用户关于特定声明的查询。然而,这些来源的数量和多样性使得这项任务颇具挑战性。我们提出了STaR,一种基于用户自然语言查询的空间和时间感知统计检索新方法。STaR被部署在我们的系统StatCheck中,该系统是我们开发并与事实核查记者共享的。STaR通过将空间和时间与统计数据集的其他部分分开处理,提高了统计事实检索的质量。具体而言,我们将它们用作数据的维度(以及查询的维度),并将数据集中搜索的语言部分聚焦于数据中丰富多样的语言。我们的演示使用了来自法国、欧洲及其它几个地区的统计数据集,允许用户沿空间和时间维度进行查询和探索。|code|0|
|FairRankTune: A Python Toolkit for Fair Ranking Tasks|Kathleen Cachel, Elke A. Rundensteiner|Worcester Polytechnic Institute, Worcester, MA, USA|We present FairRankTune, a multi-purpose open-source Python toolkit offering three primary services: quantifying fairness-related harms, leveraging bias mitigation algorithms, and constructing custom fairness-relevant datasets. FairRankTune provides researchers and practitioners with a self-contained resource for fairness auditing, experimentation, and advancing research. The central piece of FairRankTune is a novel fairness-tunable ranked data generator, RankTune, that streamlines the creation of custom fairness-relevant ranked datasets. FairRankTune also offers numerous fair ranking metrics and fairness-aware ranking algorithms within the same plug-and-play package. We demonstrate the key innovations of FairRankTune, focusing on features that are valuable to stakeholders via use cases highlighting workflows in the end-to-end process of mitigating bias in ranking systems. FairRankTune addresses the gap of limited publicly available datasets, auditing tools, and implementations for fair ranking.|我们介绍了FairRankTune,这是一个多用途的开源Python工具包,提供三项主要服务:量化与公平性相关的损害、运用偏差缓解算法以及构建自定义的与公平性相关的数据集。FairRankTune为研究人员和实践者提供了一个自包含的资源,用于公平性审计、实验和推动研究进展。该工具包的核心是一个新颖的可调公平性的排名数据生成器RankTune,它简化了自定义公平相关排名数据集的创建过程。FairRankTune还在同一即插即用包中提供了众多公平排名指标和公平意识排名算法。我们展示了FairRankTune的关键创新,重点介绍了通过端到端流程中的用例突出显示的工作流,这些工作流对利益相关者具有重要价值,特别是在缓解排名系统中的偏差方面。FairRankTune解决了公开可用数据集、审计工具和公平排名实现有限的缺口问题。|code|0|
|LLM-PQA: LLM-enhanced Prediction Query Answering|Ziyu Li, Wenjie Zhao, Asterios Katsifodimos, Rihan Hai||The advent of Large Language Models (LLMs) provides an opportunity to change the way queries are processed, moving beyond the constraints of conventional SQL-based database systems. However, using an LLM to answer a prediction query is still challenging, since an external ML model has to be employed and inference has to be performed in order to provide an answer. This paper introduces LLM-PQA, a novel tool that addresses prediction queries formulated in natural language. LLM-PQA is the first to combine the capabilities of LLMs and a retrieval-augmented mechanism for the needs of prediction queries by integrating data lakes and model zoos. This integration provides users with access to a vast spectrum of heterogeneous data and diverse ML models, facilitating dynamic prediction query answering. In addition, LLM-PQA can dynamically train models on demand, based on specific query requirements, ensuring reliable and relevant results even when no pre-trained model in a model zoo is available for the task.|大规模语言模型(LLMs)的出现为我们提供了一个改变查询处理方式的契机,使其超越了传统基于SQL的数据库系统的限制。然而,使用LLM来回答预测性查询仍然具有挑战性,因为需要借助外部机器学习模型并进行推理才能提供答案。本文介绍了LLM-PQA,这是一种新颖的工具,专门用于处理以自然语言表达的预测性查询。LLM-PQA首次将LLM的能力与检索增强机制相结合,以满足预测查询的需求,通过整合数据湖和模型库来实现这一目标。这种整合使用户能够访问广泛且异构的数据集以及多样化的机器学习模型,从而促进动态预测查询的解答。此外,LLM-PQA能够根据特定查询需求动态训练模型,即使在模型库中没有适用于该任务的预训练模型时,也能确保结果的可靠性和相关性。|code|0|
|Unified Argument Retrieval System from German News Articles Using Large Language Models|Piriyakorn Piriyatamwong, Saikishore Kalloori, Fabio Zünd|ETH Zürich, Zürich, Switzerland|The rapid growth in the number of news articles published daily can create challenges for users to explore specific topics and gather different perspectives around the topics to make neutral and unbiased conclusions. The system's ability to intelligently cluster news articles from multiple sources and retrieve concise (pro/con) relevant arguments is necessary for users' well-informed decision-making. In this paper, we introduce our unified argument retrieval system that uses our clustering model to cluster news articles and subsequently extracts the core arguments from news articles using the argument prediction model. We conducted a user study to understand the system's usability and users' satisfaction with the quality of clusters and arguments extracted.|随着每日发布的新闻文章数量迅速增长,用户在探索特定话题并收集不同观点以做出中立和无偏见的结论方面面临挑战。系统能够智能地从多个来源聚类新闻文章并检索简明的(支持/反对)相关论点,这对于用户做出明智决策是必要的。在本文中,我们介绍了一个统一的论点检索系统,该系统使用我们的聚类模型对新闻文章进行聚类,并随后使用论点预测模型从新闻文章中提取核心论点。我们进行了一项用户研究,以了解系统的可用性以及用户对聚类和提取论点质量的满意度。|code|0|
|Empowering Shoppers with Event-focused Search|Austin R. Ward, Omar Alonso|Amazon, Palo Alto, CA, USA; Amazon, Seattle, WA, USA|We present Event-focused Search, an automated and scalable pipeline designed to facilitate event discovery and enhance event-based search. This is done by leveraging large language models (LLMs) to populate event datasets, perform temporal search based on selected dates, and aggregate search results based on appropriate events based on those searches. We illustrate this pipeline through proof-of-concept interfaces in an e-commerce context, though such a framework is applicable to different types of search scenarios (e.g., sports, entertainment).|我们提出了事件聚焦搜索(Event-focused Search),这是一个自动化且可扩展的流程,旨在促进事件发现并增强基于事件的搜索。通过利用大型语言模型(LLMs)来填充事件数据集,基于选定日期执行时间搜索,并根据这些搜索结果聚合相关事件的搜索结果,我们实现了这一目标。我们通过在电子商务环境中展示概念验证界面来说明这一流程,尽管该框架同样适用于不同类型搜索场景(例如,体育、娱乐)。|code|0|
|Multi-turn Classroom Dialogue Dataset: Assessing Student Performance from One-on-one Conversations|Jiahao Chen, Zitao Liu, Mingliang Hou, Xiangyu Zhao, Weiqi Luo|Guangdong Institute of Smart Education, Jinan University, Guangzhou, China; TAL Education Group, Beijing, China; School of Data Science, City University of Hong Kong, Hong Kong, China|Accurately judging student on-going performance is crucial for adaptive teaching. In this work, we focus on the task of automatically predicting students' levels of mastery of math questions from teacher-student classroom dialogue data in online one-on-one classes. As a step toward this direction, we introduce the Multi-turn Classroom Dialogue (MCD) dataset as a benchmark testing the capabilities of machine learning models in classroom conversation understanding of student performance judgment. Our dataset contains aligned multi-turn spoken language of 5000+ unique samples of solving grade-8 math questions collected from 500+ hours' worth of online one-on-one tutoring classes. In our experiments, we assess various state-of-the-art models on the MCD dataset, highlighting the importance of understanding multi-turn dialogues and handling noisy ASR transcriptions. Our findings demonstrate the dataset's utility in advancing research on automated student performance assessment. To encourage reproducible research, we make our data publicly available at https://github.com/ai4ed/MCD.|准确判断学生的实时表现对于适应性教学至关重要。本研究聚焦于从在线一对一课堂中的师生对话数据中自动预测学生对数学问题的掌握程度。为此,我们引入了多轮课堂对话(Multi-turn Classroom Dialogue, MCD)数据集,作为评估机器学习模型在课堂对话理解与学生表现判断能力方面的基准。该数据集包含了5000多个独特样本的多轮口语对话,这些样本来自500多小时的在线一对一辅导课程,内容涉及八年级数学问题的解答。在实验中,我们评估了多种最先进的模型在MCD数据集上的表现,强调了理解多轮对话和处理噪声ASR(自动语音识别)转录文本的重要性。研究结果表明,该数据集在推动自动化学生表现评估研究方面具有重要价值。为促进可重复研究,我们公开了数据集,访问地址为:https://github.com/ai4ed/MCD。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Multi-turn+Classroom+Dialogue+Dataset:+Assessing+Student+Performance+from+One-on-one+Conversations)|0|
|An Evaluation Framework for Attributed Information Retrieval using Large Language Models|Hanane Djeddal, Pierre Erbacher, Raouf Toukal, Laure Soulier, Karen PinelSauvagnat, Sophia Katrenko, Lynda Tamine||With the growing success of Large Language models (LLMs) in information-seeking scenarios, search engines are now adopting generative approaches to provide answers along with in-line citations as attribution. While existing work focuses mainly on attributed question answering, in this paper, we target information-seeking scenarios which are often more challenging due to the open-ended nature of the queries and the size of the label space in terms of the diversity of candidate-attributed answers per query. We propose a reproducible framework to evaluate and benchmark attributed information seeking, using any backbone LLM, and different architectural designs: (1) Generate (2) Retrieve then Generate, and (3) Generate then Retrieve. Experiments using HAGRID, an attributed information-seeking dataset, show the impact of different scenarios on both the correctness and attributability of answers.|随着大型语言模型(LLMs)在信息检索场景中的应用日益成功,搜索引擎正采用生成式方法来提供带有内联引用作为归属的答案。尽管现有研究主要集中在归属问答上,本文则针对更具挑战性的信息检索场景,这类场景由于查询的开放性及每个查询对应的候选归属答案多样性导致的标签空间庞大而显得尤为复杂。我们提出了一种可复现的框架,用于评估和基准测试归属信息检索,使用任何骨干LLM,并结合不同的架构设计:(1)生成式(2)检索后生成,以及(3)生成后检索。通过使用HAGRID这一归属信息检索数据集进行的实验,展示了不同场景对答案正确性和归属性的影响。|code|0|
|AnnoRank: A Comprehensive Web-Based Framework for Collecting Annotations and Assessing Rankings|Clara Rus, Gabrielle Poerwawinata, Andrew Yates, Maarten de Rijke|University of Amsterdam, Amsterdam, Netherlands|We present AnnoRank, a web-based user interface (UI) framework designed to facilitate collecting crowdsource annotations in the context of information retrieval. AnnoRank enables the collection of explicit and implicit annotations for a specified query and a single or multiple documents, allowing for the observation of user-selected items and the assignment of relevance judgments. Furthermore, AnnoRank allows for ranking comparisons, allowing for the visualization and evaluation of a ranked list generated by different fairness interventions, along with its utility and fairness metrics. Fairness interventions in the annotation pipeline are necessary to prevent the propagation of bias when a user selects the top-k items in a ranked list. With the widespread use of ranking systems, the application supports multimodality through text and image document formats. We also support the assessment of agreement between annotators to ensure the quality of the annotations. AnnoRank is integrated with the Ranklib library, offering a vast range of ranking models that can be applied to the data and displayed in the UI. AnnoRank is designed to be flexible, configurable, and easy to deploy to meet diverse annotation needs in information retrieval. 
AnnoRank is publicly available as open-source software, together with detailed documentation at https://github.com/ClaraRus/AnnoRank.|我们提出了AnnoRank,这是一个基于网络的用户界面(UI)框架,旨在促进在信息检索背景下收集众包标注。AnnoRank能够收集针对指定查询和一个或多个文档的显式和隐式标注,使用户能够观察到用户选择的项目并进行相关性判断。此外,AnnoRank支持排序比较,允许可视化和评估由不同公平性干预措施生成的排序列表,以及其效用和公平性指标。在标注流程中实施公平性干预是必要的,以防止用户在选择排序列表中的前k个项目时偏差的传播。随着排序系统的广泛应用,该系统通过文本和图像文档格式支持多模态。我们还支持评估标注者之间的一致性,以确保标注的质量。AnnoRank与Ranklib库集成,提供了一系列广泛的排序模型,这些模型可以应用于数据并在UI中展示。AnnoRank设计灵活、可配置,并且易于部署,以满足信息检索中多样化的标注需求。AnnoRank作为开源软件公开发布,详细文档可在https://github.com/ClaraRus/AnnoRank获取。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=AnnoRank:+A+Comprehensive+Web-Based+Framework+for+Collecting+Annotations+and+Assessing+Rankings)|0|
|Advancing Misinformation Awareness in Recommender Systems for Social Media Information Integrity|Royal Pathak|Boise State University, Boise, ID, USA|Recommender systems play an essential role in determining the content users encounter on social media platforms and in uncovering relevant news. However, they also present significant risks, such as reinforcing biases, over-personalizing content, fostering filter bubbles, and inadvertently promoting misinformation. The spread of false information is rampant across various online platforms, such as Twitter (now X), Meta, and TikTok, especially noticeable during events like the COVID-19 pandemic and the US Presidential elections. These instances underscore the critical necessity for transparency and regulatory oversight in the development of recommender systems. Given the challenge of balancing free speech with the risks of outright removal of fake news, this paper aims to address the spread of misinformation from algorithmic biases in recommender systems using a social science perspective.|推荐系统在决定用户在社交媒体平台上接触的内容以及揭示相关新闻方面发挥着至关重要的作用。然而,它们也带来了重大风险,例如强化偏见、过度个性化内容、助长信息茧房以及无意中推广虚假信息。虚假信息的传播在各种在线平台上十分猖獗,如Twitter(现为X)、Meta和TikTok,尤其在COVID-19疫情和美国大选等事件期间更为明显。这些情况凸显了在推荐系统开发中透明度和监管监督的迫切需要。鉴于在平衡言论自由与彻底删除虚假新闻的风险方面的挑战,本文旨在从社会科学的角度探讨推荐系统中的算法偏差如何导致虚假信息的传播。|code|0|
|Multi-Granularity Modeling in Recommendation: from the Multi-Scenario Perspective|Yuhao Wang|City University of Hong Kong, Hong Kong, Hong Kong|In today's digital landscape, Deep Recommender Systems (DRS) play a crucial role in navigating and customizing online content for individual preferences. However, conventional methods, which mainly depend on a single recommendation task, scenario, data modality, and user behavior, are increasingly seen as insufficient due to their inability to accurately reflect users' complex and changing preferences. This gap underscores the need for multi-granularity modeling, which is central to overcoming these limitations by integrating diverse tasks, scenarios, modalities, and behaviors in the recommendation process, thus promising significant enhancements in recommendation precision, efficiency, and customization. In this paper, from the multi-scenario perspective, we illustrate our existing explorations and present results. Ultimately, we wish to highlight that our multi-granularity approach sheds light on building the next generation of recommender systems.|在当今的数字环境中,深度推荐系统(DRS)在根据个人偏好导航和定制在线内容方面发挥着关键作用。然而,传统方法主要依赖于单一推荐任务、场景、数据模态和用户行为,由于无法准确反映用户复杂且多变的偏好,这些方法的局限性日益凸显。这一差距突显了多粒度建模的必要性,该方法通过在推荐过程中整合多种任务、场景、模态和行为,有望显著提升推荐的准确性、效率和个性化水平。本文从多场景的角度,展示了我们现有的探索成果,并介绍了相关结果。最终,我们希望强调我们的多粒度方法为构建下一代推荐系统提供了重要启示。|code|0|
|Unifying Spectral and Spatial Graph Neural Networks|Zhiqian Chen, Lei Zhang, Liang Zhao|Emory University, Atlanta, GA, USA; Mississippi State University, Starkville, MS, USA; Northern Illinois University, Dekalb, IL, USA|In recent years, Graph Neural Networks (GNNs) have attracted considerable attention. However, the rapid emergence of diverse GNN models, each grounded in different theoretical foundations, complicates the model selection process, as these models are not easily understood within a unified framework. Initial GNNs were constructed using spectral theory, while others were developed based on spatial theory. This theoretical divergence makes direct comparisons difficult. Furthermore, the variety of models within each theoretical domain further complicates their evaluation. In this tutorial, we explore state-of-the-art GNNs and present a comprehensive framework that bridges the spatial and spectral domains, clarifying their interrelationship. This framework deepens our understanding of GNN operations. The tutorial delves into key paradigms, such as spatial and spectral methods, through a synthesis of spectral graph theory and approximation theory. We conduct an in-depth analysis of recent research advancements, addressing emerging issues like over-smoothing, using well-established GNN models to illustrate the universality of our framework.|近年来,图神经网络(GNNs)引起了广泛关注。然而,由于不同理论基础支撑的多样化GNN模型的快速涌现,模型选择过程变得复杂,因为这些模型难以在一个统一的框架内被理解。最初的GNN模型是基于谱理论构建的,而其他模型则基于空间理论发展。这种理论上的分歧使得直接比较变得困难。此外,每个理论领域内模型的多样性进一步增加了评估的复杂性。在本教程中,我们探讨了最先进的GNN,并提出了一个综合框架,该框架连接了空间和谱域,阐明了它们之间的相互关系。这一框架加深了我们对GNN操作的理解。教程深入探讨了关键范式,如空间和谱方法,通过结合谱图理论和近似理论的合成来进行分析。我们对最近的研究进展进行了深入分析,并使用成熟的GNN模型来解决新兴问题,如过平滑问题,以说明我们框架的普遍性。|code|0|
|Tutorial on Landing Generative AI in Industrial Social and E-commerce Recsys|Da Xu, Danqing Zhang, Lingling Zheng, Bo Yang, Guangyu Yang, Shuyuan Xu, Cindy Liang|LinkedIn, Sunnyvale, California, USA; TikTok, San Jose, California, USA; Microsoft, Redmond, Washington, USA; TikTok, Santa Clara, California, USA; Amazon, Palo Alto, California, USA|Over the past two years, GAI has evolved rapidly, influencing various fields including social and e-commerce Recsys. Despite exciting advances, landing these innovations in real-world Recsys remains challenging due to the sophistication of modern industrial products and systems. Our tutorial begins with a brief overview of building industrial Recsys and GAI fundamentals, followed by the ongoing efforts and opportunities to enhance personalized recommendations with foundation models. We then explore the integration of curation capabilities into Recsys, such as repurposing raw content, incorporating external knowledge, and generating personalized insights/explanations to foster transparency and trust. Next, the tutorial illustrates how AI agents can transform Recsys through interactive reasoning and action loops, shifting away from traditional passive feedback models. Finally, we offer insights on real-world solutions for human-AI alignment and responsible GAI practices. A critical component of the tutorial is detailing the AI, Infrastructure, LLMOps, and Product roadmap (including the evaluation and responsible AI practices) derived from the production solutions in LinkedIn, Amazon, TikTok, and Microsoft. While GAI in Recsys is still in its early stages, this tutorial provides valuable insights and practical solutions for the Recsys and GAI communities.|在过去两年中,生成式人工智能(GAI)迅速发展,影响了包括社交和电子商务推荐系统(Recsys)在内的多个领域。尽管取得了令人振奋的进展,但将这些创新应用于实际的推荐系统仍然面临挑战,这主要是由于现代工业产品和系统的复杂性。我们的教程首先简要概述了构建工业级推荐系统的基础知识以及生成式人工智能的基本原理,随后探讨了利用基础模型提升个性化推荐的努力和机遇。接着,我们探讨了如何将内容策划能力整合到推荐系统中,例如重新利用原始内容、整合外部知识,以及生成个性化的见解和解释,以促进透明度和用户信任。然后,教程展示了人工智能代理如何通过交互式推理和行动循环来转变推荐系统,从而摆脱传统的被动反馈模式。最后,我们分享了关于人机协作和负责任的生成式人工智能实践的实际解决方案。教程的一个重要部分是详细介绍了从LinkedIn、Amazon、TikTok和Microsoft的生产解决方案中提炼出的AI、基础设施、LLMOps和产品路线图(包括评估和负责任的AI实践)。尽管生成式人工智能在推荐系统中的应用仍处于早期阶段,但本教程为推荐系统和生成式人工智能社区提供了宝贵的见解和实用的解决方案。|code|0|
|Reviewerly: Modeling the Reviewer Assignment Task as an Information Retrieval Problem|Negar Arabzadeh, Sajad Ebrahimi, Sara Salamat, Mahdi Bashari, Ebrahim Bagheri|Reviewerly, Guelph, ON, Canada; Reviewerly, Toronto, ON, Canada|The peer review process is a fundamental aspect of academic publishing, ensuring the quality and credibility of scholarly work. In this talk, we will explore the critical challenges associated specifically with the assignment of reviewers to submitted papers. We will introduce Reviewerly, our innovative solution designed to enhance the efficiency and effectiveness of reviewer assignments by leveraging data from diverse sources, including OpenAlex, PubMed, and DBLP. By modeling the reviewer assignment problem as an information retrieval task, we focus on retrieving a pool of relevant and diverse reviewers for each paper.|同行评审过程是学术出版的一个基本环节,确保学术工作的质量和可信度。在本次演讲中,我们将探讨与提交论文的审稿人分配相关的关键挑战。我们将介绍Reviewerly,这是我们的一项创新解决方案,旨在通过利用来自OpenAlex、PubMed和DBLP等多种来源的数据,提高审稿人分配的效率和效果。通过将审稿人分配问题建模为信息检索任务,我们专注于为每篇论文检索一组相关且多样化的审稿人。|code|0|
|AI-safe Autocompletion with RAG and Relevance Curation|Kilian Merkelbach, Ksenia Riabinova, Arnab Dutta|eBay Inc., Aachen, Germany; eBay Inc., Dreilinden, Germany|In search, autocomplete (AC) is an essential tool that provides suggestions for each keystroke, functioning well with token-based queries. However, it is challenging to handle at scale when input queries are conversational and semantically rich. Identifying relevant queries for sub-tokens requires efficient lookup strategies, real-time ranking, and relevance in the results. This work integrates Retrieval-Augmented Generation (RAG), AI safety, and relevance ranking to produce autocomplete suggestions for conversational queries in a production system. RAG-based responses ensure a high hit ratio for popular AC inputs and maintain a very low risk category by not triggering any critical AI safety concerns.|在搜索领域,自动补全(Autocomplete,简称AC)是一项关键工具,它能够针对每个按键提供建议,与基于词汇的查询配合良好。然而,当输入的查询是会话式的且语义丰富时,大规模处理这些查询变得极具挑战性。为子词汇识别相关查询需要高效的查找策略、实时排序以及结果的相关性。本研究将检索增强生成(Retrieval-Augmented Generation,简称RAG)、人工智能安全性和相关性排序相结合,以在生产系统中为会话式查询生成自动补全建议。基于RAG的响应确保了对常见AC输入的高命中比率,并通过不触发任何关键的人工智能安全问题,保持了极低的风险类别。|code|0|
|Towards Real-Time and Personalized Code Generation|Han Xu, Xingyuan Wang, Haipeng Chen|University of Illinois Urbana-Champaign, Urbana, IL, USA; William & Mary, Williamsburg, VA, USA; Meta Platforms Inc., Seattle, WA, USA|Large language models (LLMs) have transformed automated code generation. However, their high computational demands often lead to server overload and increased latency in SaaS deployments. To address this, we present SpeCoder, a framework that accelerates server-side code generation using speculative sampling (SpS) and supervised fine-tuning (SFT). SpS allows lower latency in the code generation, whereas SFT enables more personalized code generation tailored to the user's needs.|大型语言模型(LLMs)已经彻底改变了自动化代码生成的领域。然而,其高计算需求常常导致服务器过载,并在软件即服务(SaaS)部署中增加了延迟。为此,我们提出了SpeCoder框架,该框架通过使用推测采样(SpS)和监督微调(SFT)来加速服务器端代码生成。SpS降低了代码生成的延迟,而SFT则使代码生成更加个性化,以满足用户的特定需求。|code|0|
|Advertiser Content Understanding via LLMs for Google Ads Safety|Joseph Wallace, Tushar Dogra, Wei Qiao, Yuan Wang||Ads Content Safety at Google requires classifying billions of ads for Google Ads content policies. Consistent and accurate policy enforcement is important for advertiser experience and user safety and it is a challenging problem, so there is a lot of value for improving it for advertisers and users. Inconsistent policy enforcement causes increased policy friction and poor experience with good advertisers, and bad advertisers exploit the inconsistency by creating multiple similar ads in the hope that some will get through our defenses. This study proposes a method to understand advertiser's intent for content policy violations, using Large Language Models (LLMs). We focus on identifying good advertisers to reduce content over-flagging and improve advertiser experience, though the approach can easily be extended to classify bad advertisers too. We generate advertiser's content profile based on multiple signals from their ads, domains, targeting info, etc. We then use LLMs to classify the advertiser content profile, along with relying on any knowledge the LLM has of the advertiser, their products or brand, to understand whether they are likely to violate a certain policy or not. After minimal prompt tuning our method was able to reach 95% accuracy on a small test set.|谷歌的广告内容安全工作要求对数十亿条广告进行分类,以符合谷歌广告的内容政策。一致且准确的政策执行对于广告主体验和用户安全至关重要,这也是一个具有挑战性的问题,因此改进这一工作对广告主和用户都有很大价值。政策执行的不一致会导致政策摩擦增加,并对遵守规则的广告主带来不良体验,而违规广告主则利用这种不一致性,通过创建多个相似广告,寄希望于其中部分广告能够绕过我们的防御机制。本研究提出了一种利用大型语言模型(LLMs)来理解广告主内容政策违规意图的方法。我们专注于识别遵守规则的广告主,以减少内容过度标记并提升广告主体验,尽管该方法同样可以轻松扩展用于分类违规广告主。我们基于广告主的广告、域名、定向信息等多个信号生成其内容画像,然后利用LLMs对广告主内容画像进行分类,并结合LLMs对广告主、其产品或品牌的已有知识,判断其是否可能违反特定政策。在经过最小化的提示调优后,我们的方法在小型测试集上达到了95%的准确率。|code|0|
|Generative AI and Retrieval-Augmented Generation (RAG) Systems for Enterprise|Anbang Xu, Tan Yu, Min Du, Pritam Gundecha, Yufan Guo, Xinliang Zhu, May Wang, Ping Li, Xinyun Chen|Palo Alto Networks, Santa Clara, California, USA; Amazon, Seattle, Washington, USA; VECML, Seattle, Washington, USA; Amazon, Palo Alto, Washington, USA; Google Brain, Mountain View, California, USA; NVIDIA, Santa Clara, California, USA|This workshop introduces generative AI applications for enterprise, with a focus on retrieval-augmented generation (RAG) systems. Generative AI is a field of artificial intelligence that can create new content and solve complex problems. RAG systems are a novel generative AI technique that combines information retrieval with text generation to generate rich and diverse responses. RAG systems can leverage enterprise data, which is often specific, structured, and dynamic, to provide customized solutions for various domains. However, enterprise data also poses challenges such as scalability, security, and data quality. This workshop convenes researchers and practitioners to explore RAG and other generative AI systems in real-world enterprise scenarios, fostering knowledge exchange, collaboration, and identification of future directions. Relevant to the CIKM community, the workshop intersects with core areas of data science and machine learning, offering potential benefits across various domains.|本次研讨会介绍了生成式AI在企业中的应用,重点聚焦于检索增强生成(Retrieval-Augmented Generation,RAG)系统。生成式AI是人工智能领域的一个分支,能够创建新内容并解决复杂问题。RAG系统是一种新颖的生成式AI技术,它将信息检索与文本生成相结合,以生成丰富多样的响应。RAG系统能够利用企业数据,这些数据通常具有特定性、结构性和动态性,从而为各个领域提供定制化的解决方案。然而,企业数据也带来了诸如可扩展性、安全性和数据质量等方面的挑战。本次研讨会汇聚了研究人员和实践者,共同探讨RAG及其他生成式AI系统在实际企业场景中的应用,促进知识交流、合作以及未来发展方向的识别。研讨会与CIKM社区相关,涉及数据科学和机器学习的核心领域,为多个领域带来了潜在的益处。|code|0|
|A Bayesian Multi-Armed Bandit Algorithm for Bid Shading in Online Display Advertising|Mengzhuo Guo, Wuqi Zhang, Congde Yuan, Binfeng Jia, Guoqing Song, Hua Hua, Shuangyang Wang, Qingpeng Zhang|The University of Hong Kong, Hong Kong, Hong Kong; Sichuan University, Chengdu, Sichuan, China; Tencent, Shenzhen, Guangdong, China|In real-time bidding systems, ad exchanges and supply-side platforms (SSP) are switching from the second-price auction (SPA) to the first-price auction (FPA), where the advertisers should pay what they bid if they win the auction. To avoid overpaying, advertisers are motivated to conceal their truthful evaluations of impression opportunities through bid shading methods. However, advertisers consistently face a trade-off between the probability and cost-saving of winning, due to information asymmetry: advertisers lack knowledge about their competitors' bids in the market. To address this challenge, we propose a Bayesian Multi-Armed Bandit (BayesMAB) algorithm for bid shading when the winning price is unknown to advertisers who lose the impression opportunity. BayesMAB incorporates the mechanism of FPA to infer each price interval's winning rate by progressively updating the market price hidden by SSP. In this way, BayesMAB better approximates the winning rates of price intervals and thus is able to derive the optimal shaded bid that balances the trade-off between the probability and cost-saving of winning the impression opportunity. We conducted large-scale A/B tests on Tencent's online display advertising platform. The cost-per-mille (CPM) and cost-per-action (CPA) decreased by 13.06% and 11.90%, respectively, whereas the return on investment (ROI) increased by 12.31% with only 2.7% sacrifice of the winning rate. We also validated BayesMAB's superior performance in an offline semi-simulated experiment with SPA data sets. BayesMAB has been deployed online and is impacting billions of traffic every day. Code is available at https://github.com/BayesMAB/BayesMAB.|在实时竞价系统中,广告交易平台和供应方平台(SSP)正从第二价格拍卖(SPA)转向第一价格拍卖(FPA),在这种拍卖中,广告主如果赢得竞价,则需支付其出价金额。为了避免支付过高,广告主倾向于通过出价遮蔽(bid shading)方法隐藏其对展示机会的真实评估。然而,由于信息不对称——广告主缺乏对市场上竞争对手出价的了解——他们始终面临着一个在赢得竞价概率与成本节约之间的权衡。为应对这一挑战,我们提出了一种贝叶斯多臂赌博机(BayesMAB)算法,用于在广告主未能赢得展示机会且无法知晓获胜价格时进行出价遮蔽。BayesMAB结合了FPA机制,通过逐步更新SSP隐藏的市场价格来推断每个价格区间的获胜率。通过这种方式,BayesMAB能够更好地逼近价格区间的获胜率,从而得出能够平衡赢得展示机会概率与成本节约之间权衡的最优遮蔽出价。我们在腾讯的在线展示广告平台上进行了大规模A/B测试。结果显示,每千次展示成本(CPM)和每次行动成本(CPA)分别下降了13.06%和11.90%,而投资回报率(ROI)提升了12.31%,同时仅牺牲了2.7%的获胜率。我们还在基于SPA数据集的线下半模拟实验中验证了BayesMAB的优越性能。BayesMAB已在线部署,每天影响数十亿流量。代码可在https://github.com/BayesMAB/BayesMAB获取。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=A+Bayesian+Multi-Armed+Bandit+Algorithm+for+Bid+Shading+in+Online+Display+Advertising)|0|
|SGFL-Attack: A Similarity-Guidance Strategy for Hard-Label Textual Adversarial Attack Based on Feedback Learning|Panjia Qiu, Guanghao Zhou, Mingyuan Fan, Cen Chen, Yaliang Li, Wenming Zhou|East China Normal University, Shanghai, China; Alibaba Group, Hangzhou, China|Hard-label black-box textual adversarial attack presents a challenging task where only the predictions of the victim model are available. Moreover, several constraints further complicate the task of launching such attacks, including the inherent discrete and non-differentiable nature of text data and the need to introduce subtle perturbations that remain imperceptible to humans while preserving semantic similarity. Despite the considerable research efforts dedicated to this problem, existing methods still suffer from several limitations. For example, algorithms based on complex heuristic searches necessitate extensive querying, rendering them computationally expensive. The introduction of continuous gradient strategies into discrete text spaces often leads to estimation errors. Meanwhile, geometry-based strategies are prone to falling into local optima. To address these limitations, in this paper, we introduce SGFL-Attack, a novel approach that leverages a Similarity-Guidance strategy based on Feedback Learning for hard-label textual adversarial attack, with limited query budget. Specifically, the proposed SGFL-Attack utilizes word embedding vectors to assess the importance of words and positions in text sequences, and employs a feedback learning mechanism to determine reward or punishment based on changes in predicted labels caused by replacing words. In each iteration, SGFL-Attack guides the search based on knowledge acquired from the feedback learning mechanism, generating more similar samples while maintaining low perturbations. Moreover, to reduce the query budget, we incorporate local hash mapping to avoid redundant queries during the search process. Extensive experiments on seven widely used datasets show that the proposed SGFL-Attack method significantly outperforms state-of-the-art baselines and defenses over multiple language models.|硬标签黑箱文本对抗攻击是一项具有挑战性的任务,其中仅能获取受害模型的预测结果。此外,多种约束条件进一步增加了实施此类攻击的难度,包括文本数据固有的离散性和不可微性,以及需要在保持语义相似性的同时引入细微扰动且不被人察觉。尽管针对这一问题已有大量研究工作,现有方法仍存在诸多局限性。例如,基于复杂启发式搜索的算法需要大量查询,导致计算成本高昂;将连续梯度策略引入离散文本空间往往导致估计误差;而基于几何的策略则容易陷入局部最优。为解决这些局限性,本文提出了SGFL-Attack,这是一种基于反馈学习的相似性引导策略的新型硬标签文本对抗攻击方法,旨在有限的查询预算下实现攻击。具体而言,所提出的SGFL-Attack方法利用词嵌入向量评估文本序列中词语及其位置的重要性,并通过反馈学习机制根据替换词语导致的预测标签变化来决定奖励或惩罚。在每次迭代中,SGFL-Attack根据反馈学习机制获取的知识引导搜索,生成相似度更高的样本同时保持较低的扰动。此外,为减少查询预算,我们引入了局部哈希映射以避免搜索过程中的冗余查询。在七个广泛使用的数据集上的大量实验表明,所提出的SGFL-Attack方法在多种语言模型上显著优于当前最先进的基线和防御方法。|code|0|
|Factor Model-Based Large Covariance Estimation from Streaming Data Using a Knowledge-Based Sketch Matrix|Xiao Tan, Zhaoyang Wang, Hao Qian, Jun Zhou, Peibo Duan, Dian Shen, Meng Wang, Beilun Wang|Southeast University, Nanjing, China; Ant Group, Hangzhou, China; Tongji University, Shanghai, China; Monash University, Melbourne, Australia|Covariance matrix estimation is an important problem in statistics, with wide applications in finance, neuroscience, meteorology, oceanography, and other fields. However, when the data are high-dimensional and constantly generated and updated in a streaming fashion, the covariance matrix estimation faces huge challenges, including the curse of dimensionality and limited memory space. The existing methods either assume sparsity, ignoring any possible common factor among the variables, or obtain poor performance in recovering the covariance matrix directly from sketched data. To address these issues, we propose a novel method - KEEF: Knowledge-based Time and Memory Efficient Covariance Estimator in Factor Model and its extended variation. Our method leverages historical data to train a knowledge-based sketch matrix, which is used to accelerate the factor analysis of streaming data and directly estimates the covariance matrix from the sketched data. We provide theoretical guarantees, showing the advantages of our method in terms of time and space complexity, as well as accuracy. We conduct extensive experiments on synthetic and real-world data, comparing KEEF with several state-of-the-art methods, demonstrating the superior performance of our method.|协方差矩阵估计是统计学中的一个重要问题,广泛应用于金融、神经科学、气象学、海洋学等多个领域。然而,当数据具有高维度且以流式方式持续生成和更新时,协方差矩阵估计面临巨大挑战,包括维度灾难和有限的内存空间。现有的方法要么假设数据稀疏,忽视了变量之间可能存在的共同因子,要么在从草图数据中直接恢复协方差矩阵时表现不佳。为解决这些问题,我们提出了一种新颖的方法——KEEF:基于知识的时-空高效协方差估计器,适用于因子模型及其扩展变体。我们的方法利用历史数据训练一个基于知识的草图矩阵,该矩阵用于加速流数据的因子分析,并直接从草图数据中估计协方差矩阵。我们提供了理论保证,证明了该方法在时间和空间复杂度以及准确性方面的优势。我们在合成数据和真实数据上进行了广泛的实验,将KEEF与几种最先进的方法进行了比较,展示了我们方法的优越性能。|code|0|
|Teach Harder, Learn Poorer: Rethinking Hard Sample Distillation for GNN-to-MLP Knowledge Distillation|Lirong Wu, Yunfan Liu, Haitao Lin, Yufei Huang, Stan Z. Li||To bridge the gaps between powerful Graph Neural Networks (GNNs) and lightweight Multi-Layer Perceptrons (MLPs), GNN-to-MLP Knowledge Distillation (KD) proposes to distill knowledge from a well-trained teacher GNN into a student MLP. In this paper, we revisit the knowledge samples (nodes) in teacher GNNs from the perspective of hardness, and identify that hard sample distillation may be a major performance bottleneck of existing graph KD algorithms. The GNN-to-MLP KD involves two different types of hardness, one student-free knowledge hardness describing the inherent complexity of GNN knowledge, and the other student-dependent distillation hardness describing the difficulty of teacher-to-student distillation. However, most of the existing work focuses on only one of these aspects or regards them as one thing. This paper proposes a simple yet effective Hardness-aware GNN-to-MLP Distillation (HGMD) framework, which decouples the two hardnesses and estimates them using a non-parametric approach. Finally, two hardness-aware distillation schemes (i.e., HGMD-weight and HGMD-mixup) are further proposed to distill hardness-aware knowledge from teacher GNNs into the corresponding nodes of student MLPs. As non-parametric distillation, HGMD does not involve any additional learnable parameters beyond the student MLPs, but it still outperforms most of the state-of-the-art competitors. HGMD-mixup improves over the vanilla MLPs by 12.95% over seven real-world datasets.|为了弥合强大的图神经网络(GNN)与轻量级多层感知器(MLP)之间的差距,GNN-to-MLP知识蒸馏(KD)提出将经过良好训练的教师GNN的知识蒸馏到学生MLP中。本文从难度的角度重新审视教师GNN中的知识样本(节点),并识别出硬样本蒸馏可能是现有图KD算法的主要性能瓶颈。GNN-to-MLP KD涉及两种不同类型的难度:一种是描述GNN知识固有复杂性的学生无关知识难度,另一种是描述教师到学生蒸馏难度的学生依赖蒸馏难度。然而,现有的大部分工作只关注其中一个方面,或将两者视为一体。本文提出了一种简单而有效的硬度感知GNN-to-MLP蒸馏(HGMD)框架,该框架将两种难度解耦,并使用非参数方法进行估计。最后,进一步提出了两种硬度感知蒸馏方案(即HGMD-weight和HGMD-mixup),以将硬度感知的知识从教师GNN蒸馏到学生MLP的相应节点中。作为一种非参数蒸馏方法,HGMD除了学生MLP外不涉及任何额外的可学习参数,但仍优于大多数最先进的竞争对手。在七个真实世界数据集上,HGMD-mixup相较于普通MLP提升了12.95%。|code|0|
|Correcting Biases of Shapley Value Attributions for Informative Machine Learning Model Explanations|Ningsheng Zhao, Jia Yuan Yu, Trang Bui, Krzysztof Dzieciolowski|Concordia University, Montreal, Quebec, Canada; University of Waterloo, Waterloo, Ontario, Canada; Concordia University and Daesys Inc., Montreal, Quebec, Canada|Shapley value attribution (SVA) is an increasingly popular Explainable AI (XAI) approach that has been widely used in many recent applied studies to gain new insights into the underlying information systems. However, most existing SVA methods are error-prone, providing biased or unreliable explanations that fail to correctly capture the informational dependencies between features and model outputs. These explanation errors can be decomposed into two components: 1) observation bias which stems from data sparsity and leads to over-informativeness; and 2) structural bias which stems from distributional assumptions and leads to under-informativeness. To alleviate these biases, in this paper, we propose a series of refinement methods that combine out-of-distribution (OOD) detection and importance sampling. In essence, our methods aim to rectify the distribution drift caused by distributional assumptions. We apply our refinement methods to two popular SVAs: the marginal SVA and the surrogate model-based SVA. Our extensive experiments show that the proposed methods significantly enhance the informativeness of both local and global Shapley value-based explanations.|Shapley值归因(SVA)是一种日益流行的可解释人工智能(XAI)方法,近年来在许多应用研究中被广泛使用,以深入了解底层信息系统。然而,现有的SVA方法大多存在误差,提供有偏或不可靠的解释,未能正确捕捉特征与模型输出之间的信息依赖关系。这些解释误差可以分解为两个部分:1)观测偏差,源于数据稀疏性,导致过度信息量;2)结构偏差,源于分布假设,导致信息量不足。为了缓解这些偏差,本文提出了一系列结合分布外(OOD)检测和重要性采样的改进方法。本质上,我们的方法旨在纠正由分布假设引起的分布偏移。我们将这些改进方法应用于两种流行的SVA方法:边际SVA和基于代理模型的SVA。大量实验表明,所提出的方法显著提高了基于Shapley值的局部和全局解释的信息量。|code|0|
|HiMTM: Hierarchical Multi-Scale Masked Time Series Modeling with Self-Distillation for Long-Term Forecasting|Shubao Zhao, Ming Jin, Zhaoxiang Hou, Chengyi Yang, Zengxiang Li, Qingsong Wen, Yi Wang|Monash University; Digital Research Institute of ENN Group; The University of Hong Kong|Time series forecasting is a critical and challenging task in practical applications. Recent advancements in pre-trained foundation models for time series forecasting have gained significant interest. However, current methods often overlook the multi-scale nature of time series, which is essential for accurate forecasting. To address this, we propose HiMTM, a hierarchical multi-scale masked time series modeling framework with self-distillation for long-term forecasting. HiMTM integrates four key components: (1) hierarchical multi-scale transformer (HMT) to capture temporal information at different scales; (2) decoupled encoder-decoder (DED) that directs the encoder towards feature extraction while the decoder focuses on pretext tasks; (3) hierarchical self-distillation (HSD) for multi-stage feature-level supervision signals during pre-training; and (4) cross-scale attention fine-tuning (CSA-FT) to capture dependencies between different scales for downstream tasks. These components collectively enhance multi-scale feature extraction in masked time series modeling, improving forecasting accuracy. Extensive experiments on seven mainstream datasets show that HiMTM surpasses state-of-the-art self-supervised and end-to-end learning methods by a considerable margin of 3.16-68.54%. Additionally, HiMTM outperforms the latest robust self-supervised learning method, PatchTST, in cross-domain forecasting by a significant margin of 2.3%. The effectiveness of HiMTM is further demonstrated through its application in natural gas demand forecasting.|时间序列预测在实际应用中是一项重要且具有挑战性的任务。近年来,针对时间序列预测的预训练基础模型取得了显著进展,并引起了广泛关注。然而,现有方法往往忽视了时间序列的多尺度特性,而这一特性对于精确预测至关重要。为此,我们提出了HiMTM,一种用于长期预测的分层多尺度掩码时间序列建模方法,结合了自蒸馏机制。HiMTM整合了四个关键组件:(1)分层多尺度Transformer(HMT),用于捕捉不同尺度的时间信息;(2)解耦编码器-解码器(DED),使编码器专注于特征提取,而解码器则聚焦于前置任务;(3)分层自蒸馏(HSD),在预训练过程中提供多阶段的特征级监督信号;(4)跨尺度注意力微调(CSA-FT),用于捕捉下游任务中不同尺度之间的依赖关系。这些组件共同增强了掩码时间序列建模中的多尺度特征提取能力,从而提高了预测精度。在七个主流数据集上的广泛实验表明,HiMTM在自监督和端到端学习方法上显著超越了现有最先进的方法,提升幅度达3.16%至68.54%。此外,HiMTM在跨领域预测中显著优于最新的鲁棒自监督学习方法PatchTST,提升幅度为2.3%。通过在天然气需求预测中的应用,进一步证明了HiMTM的有效性。|code|0|
|GLFNet: Global and Local Frequency-domain Network for Long-term Time Series Forecasting|Xucheng Zhou, Yuwen Liu, Lianyong Qi, Xiaolong Xu, Wanchun Dou, Xuyun Zhang, Yang Zhang, Xiaokang Zhou|Nanjing University of Information Science and Technology, Nanjing, China; Macquarie University, Sydney, Australia; Nanjing University, Nanjing, China; Kansai University, Osaka, Japan; China University of Petroleum (East China), Qingdao, China; China University of Petroleum (East China) & Qufu Normal University, Qingdao, China|Recently, patch-based transformer methods have demonstrated strong effectiveness in time series forecasting. However, the complexity of self-attention imposes demands on memory and compute resources. In addition, though patches can capture comprehensive temporal information while preserving locality, temporal information within patches remains important for time series prediction. The existing methods mainly focus on modeling long-term dependencies across patches, while paying little attention to the short-term dependencies within patches. In this paper, we propose the Global and Local Frequency-domain Network (GLFNet), a novel architecture that efficiently learns global time dependencies and local time relationships in the frequency domain. Specifically, we design a frequency filtering layer to learn the temporal interactions instead of self-attention. Then we devise a dual filtering block consisting of a global filter block and a local filter block, which learn the global dependencies across patches and the local dependencies within patches, respectively. Experiments on seven benchmark datasets demonstrate that our approach achieves superior performance with improved efficiency.|近年来,基于分块的Transformer方法在时间序列预测中展现了强大的有效性。然而,自注意力机制的复杂性对内存和计算资源提出了较高要求。此外,尽管分块方法能够在保留局部性的同时捕捉全面的时间信息,但分块内部的时间信息对于时间序列预测仍然至关重要。现有方法主要关注跨分块的长程依赖建模,而对分块内部短程依赖的关注较少。本文提出了一种新颖的架构——全局与局部频域网络(GLFNet),该架构能够在频域中高效学习全局时间依赖关系和局部时间关系。具体而言,我们设计了一个频域滤波层来学习时间交互,而非使用自注意力机制。随后,我们提出了一种双滤波块结构,由全局滤波块和局部滤波块组成,分别学习跨分块的全局依赖关系和分块内部的局部依赖关系。在七个基准数据集上的实验表明,我们的方法在提升效率的同时实现了卓越的性能。|code|0|
|Facets of Disparate Impact: Evaluating Legally Consistent Bias in Machine Learning|Jarren Briscoe, Assefaw H. Gebremedhin|Washington State University, Pullman, WA, USA|Leveraging current legal standards, we define bias through the lens of marginal benefits and objective testing with the novel metric "Objective Fairness Index". This index combines the contextual nuances of objective testing with metric stability, providing a legally consistent and reliable measure. Utilizing the Objective Fairness Index, we provide fresh insights into sensitive machine learning applications, such as COMPAS (recidivism prediction), highlighting the metric's practical and theoretical significance. The Objective Fairness Index allows one to differentiate between discriminatory tests and systemic disparities.|借助当前的法律标准,我们通过边际效益和客观测试的视角定义了偏见,并提出了一个新的指标——“客观公平指数”。该指数结合了客观测试的上下文细微差别与指标稳定性,提供了一种法律上一致且可靠的衡量标准。利用客观公平指数,我们对敏感的机器学习应用(如COMPAS再犯预测)进行了深入分析,突显了该指标在实际应用和理论上的重要性。客观公平指数能够区分歧视性测试与系统性差异。|code|0|
|Bubble Sketch: A High-performance and Memory-efficient Sketch for Finding Top-k Items in Data Streams|Lu Cao, Qilong Shi, Yuxi Liu, Hanyue Zheng, Yao Xin, Wenjun Li, Tong Yang, Yangyang Wang, Yang Xu, Weizhe Zhang, Mingwei Xu|Peking University, Beijing, China; Pengcheng Laboratory, Shenzhen, China; Tsinghua University, Beijing, China; Harbin Institute of Technology, Harbin, China; Guangzhou University, Guangzhou, China; Harbin Institute of Technology, Shenzhen, China; Fudan University, Shanghai, China|Sketch algorithms are crucial for identifying top-k items in large-scale data streams. Existing methods often compromise between performance and accuracy, unable to efficiently handle increasing data volumes with limited memory. We present Bubble Sketch, a compact algorithm that excels in both performance and accuracy. Bubble Sketch achieves this by (1) Recording only full keys of hot items, significantly reducing memory usage, and (2) Using threshold relocation to resolve conflicts, enhancing detection accuracy. Unlike traditional methods, Bubble Sketch eliminates the need for a Min-Heap, ensuring fast processing speeds. Experiments show Bubble Sketch outperforms the other seven algorithms compared, with the highest throughput and precision, and surpasses HeavyKeeper in accuracy by up to two orders of magnitude.|草图算法在识别大规模数据流中的前k个项目方面至关重要。现有方法通常在性能和准确性之间做出妥协,无法在有限内存下高效处理日益增长的数据量。我们提出了Bubble Sketch,这是一种在性能和准确性方面表现出色的紧凑型算法。Bubble Sketch通过以下方式实现这一目标:(1) 仅记录热门项目的完整键,显著减少内存使用;(2) 使用阈值重定位来解决冲突,提高检测准确性。与传统方法不同,Bubble Sketch无需使用Min-Heap,从而确保了快速的处理速度。实验表明,Bubble Sketch在与其他七种算法相比时,具有最高的吞吐量和精确度,并且在准确性上超越了HeavyKeeper多达两个数量级。|code|0|
|Optimizing Numerical Estimation and Operational Efficiency in the Legal Domain through Large Language Models|JiaHong Huang, ChaoChun Yang, Yixian Shen, Alessio M. Pacces, Evangelos Kanoulas||The legal landscape encompasses a wide array of lawsuit types, presenting lawyers with challenges in delivering timely and accurate information to clients, particularly concerning critical aspects like potential imprisonment duration or financial repercussions. Compounded by the scarcity of legal experts, there's an urgent need to enhance the efficiency of traditional legal workflows. Recent advances in deep learning, especially Large Language Models (LLMs), offer promising solutions to this challenge. Leveraging LLMs' mathematical reasoning capabilities, we propose a novel approach integrating LLM-based methodologies with specially designed prompts to address precision requirements in legal Artificial Intelligence (LegalAI) applications. The proposed work seeks to bridge the gap between traditional legal practices and modern technological advancements, paving the way for a more accessible, efficient, and equitable legal system. To validate this method, we introduce a curated dataset tailored to precision-oriented LegalAI tasks, serving as a benchmark for evaluating LLM-based approaches. Extensive experimentation confirms the efficacy of our methodology in generating accurate numerical estimates within the legal domain, emphasizing the role of LLMs in streamlining legal processes and meeting the evolving demands of LegalAI.|法律领域的诉讼类型繁多,给律师在向客户提供及时、准确的信息方面带来了挑战,尤其是在涉及潜在监禁时间或财务影响等关键问题上。加之法律专家的稀缺,提升传统法律工作流程的效率显得尤为迫切。近年来,深度学习的进展,特别是大型语言模型(LLMs)的发展,为这一挑战提供了有前景的解决方案。利用LLMs的数学推理能力,我们提出了一种将基于LLM的方法与专门设计的提示相结合的新方法,以满足法律人工智能(LegalAI)应用中的精确性要求。该研究旨在弥合传统法律实践与现代技术进步之间的差距,为构建更加便捷、高效和公平的法律体系铺平道路。为验证这一方法,我们引入了一个针对精确导向的LegalAI任务定制的数据集,作为评估基于LLM方法的基准。广泛的实验证实了我们的方法在法律领域内生成准确数值估计的有效性,突显了LLMs在简化法律流程和满足LegalAI不断变化需求中的重要作用。|code|0|
|A Multi-Node Multi-GPU Distributed GNN Training Framework for Large-Scale Online Advertising|Xuewu Jiao, Xinsheng Luo, Miao Li, Jiang Bian, Junchao Yang, Wei Hu, Mingqing Hu, Weipeng Lu, Shikun Feng, Danlei Feng, Dongxu Yang, Haoyi Xiong, Shuanglong Li, Lin Liu|NVIDIA, CA, USA; Big Data Lab, Baidu Inc., Beijing, China; Baidu Inc., Beijing, China|Graph Neural Networks (GNNs) have become critical in various domains such as online advertising but face scalability challenges due to the growing size of graph data, leading to the need for advanced distributed GPU computation strategies across multiple nodes. This paper presents PGLBox-Cluster, a robust distributed graph learning framework constructed atop the PaddlePaddle platform, implemented to efficiently process graphs comprising billions of nodes and edges. Through strategic partitioning of the model, node attributes, and graph data and leveraging industrial-grade RPC and NCCL for communication, PGLBox-Cluster facilitates effective distributed computation. The extensive experimental results confirm that PGLBox-Cluster achieves a 1.94x to 2.93x speedup over the single-node configuration, significantly elevating graph neural network scalability and efficiency by handling datasets exceeding 3 billion nodes and 120 billion edges with its novel asynchronous communication and graph partitioning techniques. The repository is released at This Link.|图神经网络(GNN)在在线广告等多个领域中已成为关键技术,但由于图数据规模的不断增长,面临着可扩展性挑战,这促使需要跨多个节点的先进分布式GPU计算策略。本文介绍了PGLBox-Cluster,这是一个构建在PaddlePaddle平台之上的强大分布式图学习框架,旨在高效处理包含数十亿节点和边的图数据。通过策略性地对模型、节点属性和图数据进行分区,并利用工业级的RPC和NCCL进行通信,PGLBox-Cluster促进了有效的分布式计算。广泛的实验结果证实,PGLBox-Cluster相较于单节点配置实现了1.94倍至2.93倍的加速,通过其新颖的异步通信和图分区技术,显著提升了图神经网络的可扩展性和效率,能够处理超过30亿节点和1200亿条边的数据集。该框架的代码库已在此链接发布。|code|0|
|3M-Health: Multimodal Multi-Teacher Knowledge Distillation for Mental Health Detection|Rina Carines Cabral, Siwen Luo, Josiah Poon, Soyeon Caren Han||The significance of mental health classification is paramount in contemporary society, where digital platforms serve as crucial sources for monitoring individuals' well-being. However, existing social media mental health datasets primarily consist of text-only samples, potentially limiting the efficacy of models trained on such data. Recognising that humans utilise cross-modal information to comprehend complex situations or issues, we present a novel approach to address the limitations of current methodologies. In this work, we introduce a Multimodal and Multi-Teacher Knowledge Distillation model for Mental Health Classification, leveraging insights from cross-modal human understanding. Unlike conventional approaches that often rely on simple concatenation to integrate diverse features, our model addresses the challenge of appropriately representing inputs of varying natures (e.g., texts and sounds). To mitigate the computational complexity associated with integrating all features into a single model, we employ a multimodal and multi-teacher architecture. By distributing the learning process across multiple teachers, each specialising in a particular feature extraction aspect, we enhance the overall mental health classification performance. Through experimental validation, we demonstrate the efficacy of our model in achieving improved performance.|心理健康分类在当代社会中具有至关重要的意义,数字平台作为监测个体健康状况的关键来源。然而,现有的社交媒体心理健康数据集主要由纯文本样本组成,这可能限制了基于此类数据训练的模型的有效性。认识到人类利用跨模态信息来理解复杂情境或问题,我们提出了一种新颖的方法来解决当前方法的局限性。在这项工作中,我们引入了一种多模态和多教师知识蒸馏模型用于心理健康分类,借鉴了跨模态人类理解的见解。与传统方法通常依赖简单拼接来整合不同特征不同,我们的模型解决了适当表示不同性质输入(如文本和声音)的挑战。为了减轻将所有特征整合到单一模型中带来的计算复杂性,我们采用了多模态和多教师架构。通过将学习过程分布到多个教师中,每个教师专门负责特定的特征提取方面,我们提升了整体心理健康分类的性能。通过实验验证,我们展示了该模型在提高性能方面的有效性。|code|0|
|Hypergraph Hash Learning for Efficient Trajectory Similarity Computation|Yuan Cao, Lei Li, Xiangru Chen, Xue Xu, Zuojin Huang, Yanwei Yu|Computer Science and Technology, Ocean University of China, Qingdao, Shandong, China|Trajectory similarity computation is a fundamental problem in various applications (e.g., transportation optimization, behavioral study). Recent research learns trajectory representations instead of point matching to realize more accurate and efficient trajectory similarity computation. However, these methods still cannot be scaled to large datasets due to high computational cost. In this paper, we propose a novel hash learning method to encode the trajectories into binary hash codes and compute trajectory similarities by Hamming distances, which is much more efficient. To the best of our knowledge, this is the first work to conduct hash learning for trajectory similarity computation. Furthermore, unlike the Word2Vec model based on random walk strategy, we utilize hypergraph neural networks for the first time to learn the representations for the grids by constructing the hyperedges according to the real-life trajectories, resulting in more representative grid embeddings. In addition, we design a residual network into the multi-layer GRU to learn more discriminative trajectory representations. The proposed Hypergraph Hash Learning for Trajectory similarity computation is an end-to-end framework named HHL-Traj. Experimental results on two real-world trajectory datasets (i.e., Porto and Beijing) demonstrate that the proposed framework achieves up to 6.23% and 15.42% accuracy gains compared with state-of-the-art baselines in unhashed and hashed cases, respectively. The efficiency of trajectory similarity computation based on hash codes is also verified. Our code is available at https://github.com/caoyuan57/HHL-Traj.|轨迹相似度计算是众多应用中的一个基础问题(例如,交通优化、行为研究)。近期的研究通过学习轨迹表示而非点匹配来实现更精确和高效的轨迹相似度计算。然而,这些方法由于高计算成本仍无法扩展到大规模数据集。本文提出了一种新颖的哈希学习方法,将轨迹编码为二进制哈希码,并通过汉明距离计算轨迹相似度,从而显著提高效率。据我们所知,这是首次将哈希学习应用于轨迹相似度计算的工作。此外,与基于随机游走策略的Word2Vec模型不同,我们首次利用超图神经网络,根据现实生活中的轨迹构建超边,从而学习更具代表性的网格表示。此外,我们将残差网络设计融入多层GRU中,以学习更具辨别力的轨迹表示。所提出的超图哈希学习框架用于轨迹相似度计算,是一个端到端的框架,命名为HHL-Traj。在两个真实世界的轨迹数据集(即波尔图和北京)上的实验结果表明,与未哈希和哈希情况下的最先进基线相比,所提出的框架分别实现了高达6.23%和15.42%的准确性提升。基于哈希码的轨迹相似度计算效率也得到了验证。我们的代码可在https://github.com/caoyuan57/HHL-Traj获取。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Hypergraph+Hash+Learning+for+Efficient+Trajectory+Similarity+Computation)|0|
|Towards Online and Safe Configuration Tuning with Semi-supervised Anomaly Detection|Haitian Chen, Xu Chen, Zibo Liang, Xiushi Feng, Jiandong Xie, Han Su, Kai Zheng|; Huawei Technologies Co., Ltd., Chengdu, China|The performance of modern database management systems highly relies on hundreds of adjustable knobs. Traditionally, these knobs are manually adjusted by database administrators, a process that is both inefficient and ineffective for tuning large-scale databases in cloud environments. Recent research has explored the use of machine learning techniques to enable the automatic tuning of database configurations. Although most existing learning-based methods achieve satisfactory results on static workloads, they often experience performance degradation and low sampling efficiency in real-world environments. According to our study, this is primarily due to a lack of safety guarantees during the configuration sampling process. To address the aforementioned issues, we propose SafeTune, an online tuning system that adapts to dynamic workloads. Our core idea is to filter out a large number of configurations with potential risks during the configuration sampling process. We employ a two-stage filtering approach: The first stage utilizes a semi-supervised outlier ensemble with feature learning to achieve high-quality feature representation. The second stage employs a ranking-based classifier to refine the filtering process. In addition, to alleviate the cold-start problem, we leverage the historical tuning experience to provide high-quality initial samples during the initialization phase. We conducted comprehensive evaluations on static and dynamic workloads. In comparison to offline baseline methods, SafeTune reduces unsafe configuration suggestions by 95.6%-98.6%. In contrast with state-of-the-art methods, SafeTune has improved cumulative performance by 10.5%-46.6% and tuning speed by 15.1%-35.4%.|现代数据库管理系统的表现高度依赖于数百个可调节的参数。传统上,这些参数由数据库管理员手动调整,这种方法在云环境中对大规模数据库进行调优时既不高效也不有效。最近的研究探索了使用机器学习技术来自动调整数据库配置。尽管大多数现有的基于学习的方法在静态工作负载上取得了令人满意的结果,但它们在现实环境中往往会出现性能下降和采样效率低下的问题。根据我们的研究,这主要是由于在配置采样过程中缺乏安全保障。为了解决上述问题,我们提出了SafeTune,一个适应动态工作负载的在线调优系统。我们的核心思想是在配置采样过程中过滤掉大量可能存在风险的配置。我们采用两阶段过滤方法:第一阶段利用带有特征学习的半监督异常值集成来实现高质量的特征表示;第二阶段采用基于排名的分类器来优化过滤过程。此外,为了缓解冷启动问题,我们利用历史调优经验在初始化阶段提供高质量的初始样本。我们对静态和动态工作负载进行了全面的评估。与离线基线方法相比,SafeTune减少了95.6%-98.6%的不安全配置建议。与最先进的方法相比,SafeTune的累计性能提高了10.5%-46.6%,调优速度提高了15.1%-35.4%。|code|0|
|Urban Traffic Accident Risk Prediction Revisited: Regionality, Proximity, Similarity and Sparsity|Minxiao Chen, Haitao Yuan, Nan Jiang, Zhifeng Bao, Shangguang Wang||Traffic accidents pose a significant risk to human health and property safety. Therefore, to prevent traffic accidents, predicting their risks has garnered growing interest. We argue that a desired prediction solution should demonstrate resilience to the complexity of traffic accidents. In particular, it should adequately consider the regional background, accurately capture both spatial proximity and semantic similarity, and effectively address the sparsity of traffic accidents. However, these factors are often overlooked or difficult to incorporate. In this paper, we propose a novel multi-granularity hierarchical spatio-temporal network. Initially, we innovate by incorporating remote sensing data, facilitating the creation of a hierarchical multi-granularity structure and the comprehension of regional background. We construct multiple high-level risk prediction tasks to enhance the model's ability to cope with sparsity. Subsequently, to capture both spatial proximity and semantic similarity, region features and multi-view graphs undergo encoding processes to distill effective representations. Additionally, we propose a message passing and adaptive temporal attention module that bridges different granularities and dynamically captures time correlations inherent in traffic accident patterns. Finally, a multivariate hierarchical loss function is devised considering the complexity of the prediction purpose. Extensive experiments on two real datasets verify the superiority of our model against the state-of-the-art methods.|交通事故对人类健康和财产安全构成重大威胁。因此,为预防交通事故,预测其风险的重要性日益凸显。我们认为,理想的预测方案应能应对交通事故的复杂性。具体而言,它应充分考虑区域背景,准确捕捉空间邻近性和语义相似性,并有效处理交通事故数据的稀疏性。然而,这些因素往往被忽视或难以整合。本文提出了一种新颖的多粒度层次时空网络。首先,我们创新性地引入遥感数据,有助于构建层次化的多粒度结构并理解区域背景。我们构建了多个高层风险预测任务,以增强模型应对稀疏性的能力。接着,为捕捉空间邻近性和语义相似性,我们对区域特征和多视角图进行编码,提炼出有效的表示。此外,我们提出了消息传递和自适应时间注意力模块,这些模块连接不同粒度,并动态捕捉交通事故模式中的时间相关性。最后,针对预测目的的复杂性,我们设计了一种多元层次损失函数。在两个真实数据集上的广泛实验验证了我们的模型相较于现有最先进方法的优越性。|code|0|
|Hyperedge Importance Estimation via Identity-aware Hypergraph Attention Network|Yin Chen, Xiaoyang Wang, Chen Chen|Zhejiang Gongshang University, Hangzhou, China; University of New South Wales, Sydney, Australia; University of Wollongong, Wollongong, Australia|Hypergraphs provide a more flexible representation for group interactions in complex systems compared to ordinary graphs, where each hyperedge can connect any number of nodes. In practice, data modeled as hypergraphs often contain hyperedge importance values, which indicate the influence or popularity of the group collaborations. For example, in a co-authorship hypergraph, a paper (hyperedge) is co-authored by multiple authors (nodes). The number of citations a paper receives can be regarded as the importance value of its corresponding hyperedge, reflecting its academic influence and significance. In this work, we introduce hyperedge importance estimation as a new problem in hypergraph learning. The flexibility of hyperedges enables hypergraph modeling to capture high-order relationships between entities, which has attracted widespread attention. The importance value of hyperedge has also been proven to be highly valuable in many applications. To address this problem, we propose the Identity-aware Hypergraph Attention Network (ID-HAN) for efficient hyperedge importance estimation. ID-HAN employs a special attention mechanism to model the importance contribution of each node within the hyperedge, which injects identity information according to the hyperedge-dependent node labels. Additionally, a centrality-aware positional encoding module generates learnable positional embeddings of nodes and hyperedges based on the relative order of degree centrality and identity information, thereby enhancing the consistency between message passing and importance propagation. Extensive experiments on four real-world datasets demonstrate that ID-HAN significantly outperforms the state-of-the-art hypergraph neural networks on the hyperedge importance estimation task.|相较于普通图,超图在复杂系统中为群体交互提供了更为灵活的表示方式,其中每条超边可以连接任意数量的节点。在实际应用中,以超图形式建模的数据通常包含超边的重要性值,这些值反映了群体协作的影响力或受欢迎程度。例如,在合著超图中,一篇论文(超边)由多位作者(节点)共同撰写。一篇论文的引用次数可以视为其对应超边的重要性值,体现了其学术影响力和重要性。在本研究中,我们引入了超边重要性估计作为超图学习中的一个新问题。超边的灵活性使得超图建模能够捕捉实体间的高阶关系,这一特性受到了广泛关注。超边的重要性值在许多应用中也已被证明具有极高的价值。为解决这一问题,我们提出了身份感知超图注意力网络(Identity-aware Hypergraph Attention Network, ID-HAN),用于高效的超边重要性估计。ID-HAN采用了一种特殊的注意力机制来建模超边内每个节点的重要性贡献,该机制根据超边依赖的节点标签注入了身份信息。此外,一个中心性感知的位置编码模块基于节点的度中心性和身份信息的相对顺序,生成了可学习的节点和超边位置嵌入,从而增强了消息传递与重要性传播之间的一致性。在四个真实世界数据集上的广泛实验表明,ID-HAN在超边重要性估计任务上显著优于当前最先进的超图神经网络。|code|0|
|PIXEL: Prompt-based Zero-shot Hashing via Visual and Textual Semantic Alignment|Zeyu Dong, Qingqing Long, Yihang Zhou, Pengfei Wang, Zhihong Zhu, Xiao Luo, Yidong Wang, Pengyang Wang, Yuanchun Zhou|; University of Macau, Macau, China; Peking University, Beijing, China; University of California, Los Angeles, Los Angeles, USA|Zero-Shot Hashing (ZSH) has aroused significant attention due to its efficiency and generalizability in multi-modal retrieval scenarios, which aims to encode semantic information into hash codes without needing unseen labeled training samples. In addition to commonly used visual images as visual semantics and class labels as global semantics, the corresponding attribute descriptions contain critical local semantics with detailed information. However, most existing methods focus on leveraging the extracted attribute numerical values, without exploring the textual semantics in attribute descriptions. To bridge this gap, in this paper, we propose Prompt-based zero-shot hashing via vIsual and teXtual sEmantic aLignment, namely PIXEL. Concretely, we design the attribute prompt template depending on attribute descriptions to make the model capture the corresponding local semantics. Then, having obtained the textual and visual embeddings, we propose an alignment module to model the intra- and inter-class contrastive distances. In addition, the attribute-wise constraint and class-wise constraint are utilized to collaboratively learn the hash code, image representation, and visual attributes more effectively. Finally, extensive experimental results demonstrate the superiority of PIXEL.|零样本哈希(Zero-Shot Hashing, ZSH)因其在大规模多模态检索场景中的高效性和通用性而备受关注,其目标是将语义信息编码为哈希码,而无需使用未见过的标注训练样本。除了常用的视觉图像作为视觉语义和类别标签作为全局语义外,相应的属性描述还包含了具有详细信息的局部语义。然而,现有的大多数方法主要利用提取的属性数值,而未探索属性描述中的文本语义。为了填补这一空白,本文提出了一种基于提示的零样本哈希方法,通过视觉与文本语义对齐来实现,命名为PIXEL。具体而言,我们根据属性描述设计了属性提示模板,以使模型能够捕捉相应的局部语义。随后,在获得文本嵌入和视觉嵌入后,我们提出了一种对齐模块,用于建模类内和类间对比距离。此外,利用属性级约束和类别级约束来协同学习哈希码、图像表示和视觉属性,从而更有效地实现目标。最后,广泛的实验结果证明了PIXEL的优越性。|code|0|
|Progressive Multimodal Pivot Learning: Towards Semantic Discordance Understanding as Humans|Junlin Fang, Wenya Wang, Tianze Luo, Yanyong Huang, Fengmao Lv|; School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu, China; School of Computer Science and Engineering, Nanyang Technological University, Singapore, Singapore|Multimodal recognition can achieve enhanced performance by leveraging the complementary information from different modalities. However, in real-world scenarios, multimodal samples often express discordant semantic meanings across modalities, lacking evident complementary information. Unlike humans who can easily understand the intrinsic semantic information of these semantically discordant samples, existing multimodal recognition models show poor performance on them. With the motivation of improving the robustness of multimodal recognition models in practical scenarios, this work poses a new challenge in multimodal recognition, which is coined as Semantic Discordance Understanding. Unlike existing works only focusing on detecting semantically discordant samples as noisy data, this new challenge requires deep models to follow humans' ability in understanding the inherent semantic meanings of semantically discordant samples. To address this challenge, we further propose the Progressive Multimodal Pivot Learning (PMPL) approach by introducing a learnable pivot memory to explore the inherent semantic meaning hidden under discordant modalities. To this end, our approach inserts Pivot Memory Learning (PML) modules into multiple layers of unimodal foundation models to progressively trade off the conflicting information across modalities. By introducing the multimodal pivot learning paradigm for multimodal recognition, the proposed PMPL approach can alleviate the negative effect of semantic discordance caused by the cross-modal information exchange mechanism of existing multimodal recognition models. Experiments on different benchmarks validate the superiority of our approach. Code is available at https://github.com/tiggers23/PMPL.|多模态识别通过利用不同模态间的互补信息,能够实现性能的提升。然而,在现实场景中,多模态样本往往在不同模态间表现出不一致的语义含义,缺乏明显的互补信息。与人类能够轻松理解这些语义不一致样本的内在语义信息不同,现有的多模态识别模型在这些样本上的表现较差。为了提高多模态识别模型在实际场景中的鲁棒性,本研究提出了一项新的多模态识别挑战,称为“语义不一致理解”。与现有工作仅关注将语义不一致样本检测为噪声数据不同,这一新挑战要求深度模型具备像人类一样理解语义不一致样本内在语义信息的能力。为应对这一挑战,我们进一步提出了渐进式多模态枢纽学习(Progressive Multimodal Pivot Learning, PMPL)方法,通过引入可学习的枢纽记忆体来探索隐藏在不一致模态下的内在语义信息。为此,我们的方法在单模态基础模型的多个层中插入了枢纽记忆学习(Pivot Memory Learning, PML)模块,以逐步权衡模态间的冲突信息。通过引入多模态枢纽学习范式,所提出的PMPL方法能够缓解现有多模态识别模型中跨模态信息交换机制导致的语义不一致的负面影响。在不同基准上的实验验证了我们方法的优越性。代码可在https://github.com/tiggers23/PMPL获取。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Progressive+Multimodal+Pivot+Learning:+Towards+Semantic+Discordance+Understanding+as+Humans)|0|
|Precision Meets Resilience: Cross-Database Generalization with Uncertainty Quantification for Robust Cost Estimation|Shuhuan Fan, Mengshu Hou, Rui Xi, Wenwen Ma|; University of Electronic Science and Technology of China, Chengdu, China|Learning-based models have shown promise in addressing query optimization challenges in the database field, where the learned cost model plays a central role. While these models outperform traditional optimizers on static datasets, their resilience and reliability in real-world applications remain a concern, limiting their widespread adoption. In this paper, we take a step towards a practical cost estimation model, named Tosure, which can quantify the uncertainty for cost estimation and generalizes to unseen databases accurately and efficiently. It consists primarily of two modules: a Cross-Database Representation (CDR) module and a Cost Estimation with Uncertainty (CEU) module. The CDR module captures the transferable features by focusing the minimal set based on deep-learning network, thereby enhancing the model's generalization capabilities. The CEU module introduces a novel Neural Network Gaussian Process (NNGP) to quantify the uncertainty in cost estimation, ensuring more robust estimations with an upper bound. To improve the model's performance, we perform pre-training on diverse large-scale datasets. Furthermore, we implement the model and integrate it with a traditional query optimizer to validate its usability and effectiveness in real-world scenarios. Extensive experimentation demonstrates that Tosure outperforms state-of-the-art methods, achieving a 20% improvement in cost estimation accuracy and twice the robustness.|基于学习的模型在解决数据库领域中的查询优化挑战方面展现了潜力,其中学习到的成本模型起着核心作用。尽管这些模型在静态数据集上优于传统优化器,但它们在实际应用中的韧性和可靠性仍然是一个问题,限制了它们的广泛采用。本文中,我们朝着实现一个实用的成本估算模型迈出了一步,该模型名为Tosure,能够量化成本估算的不确定性,并能够准确高效地推广到未见过的数据库。它主要由两个模块组成:跨数据库表示(CDR)模块和带不确定性的成本估算(CEU)模块。CDR模块通过聚焦基于深度学习网络的最小特征集来捕捉可迁移的特征,从而增强模型的泛化能力。CEU模块引入了一种新颖的神经网络高斯过程(NNGP)来量化成本估算中的不确定性,确保更鲁棒的估算并带有上限。为了提升模型的性能,我们在多样的大规模数据集上进行预训练。此外,我们还实现了该模型并将其与传统查询优化器集成,以验证其在实际场景中的可用性和有效性。广泛的实验表明,Tosure优于最先进的方法,成本估算准确性提高了20%,并且鲁棒性提升了两倍。|code|0|
|ACDM: An Effective and Scalable Active Clustering with Pairwise Constraint|Xun Fu, WenBo Xie, Bin Chen, Tao Deng, Tian Zou, Xin Wang|School of Computer Science and Software Engineering, Southwest Petroleum University, Chengdu, China|Clustering is fundamentally a subjective task: a single dataset can be validly clustered in various ways, and without further information, clustering systems cannot determine the appropriate clustering to perform. This underscores the importance of integrating constraints into clustering, enabling users to convey their preferences to the system. Active constraint-based clustering approaches prioritize the identification of the most valuable constraints to inquire about, striving to achieve effective clustering with the minimal number of constraints needed. We propose an Active Clustering with Diffusion Model (ACDM). ACDM applies the nearest-neighbor technique to construct a diffusion graph, and utilizes an online framework to refine the clustering result iteratively. In each iteration, (a) nodes with high uncertainty and representativeness are selected in batch mode, (b) then a novel neighborhood-set-based query is used for categorizing the selected nodes, using pairwise constraints, and (c) the categorized nodes are used as source nodes in the diffusion model for cluster refinement. We experimentally demonstrate that ACDM outperforms state-of-the-art methods in terms of clustering quality and scalability.|聚类本质上是一项主观任务:单一数据集可以通过多种方式进行合理聚类,而在缺乏进一步信息的情况下,聚类系统无法确定应执行的适当聚类方式。这突显了将约束条件整合到聚类过程中的重要性,使用户能够向系统传达其偏好。基于主动约束的聚类方法优先识别最有价值的约束条件以进行询问,力求以最少数量的约束实现有效的聚类。我们提出了一种基于扩散模型的主动聚类方法(Active Clustering with Diffusion Model,简称ACDM)。ACDM采用最近邻技术构建扩散图,并利用在线框架迭代优化聚类结果。在每次迭代中,(a)以批处理模式选择具有高不确定性和代表性的节点,(b)然后使用基于邻域集的新型查询方法,通过成对约束对所选节点进行分类,(c)将分类后的节点作为扩散模型中的源节点,用于进一步的聚类优化。实验结果表明,ACDM在聚类质量和可扩展性方面均优于当前最先进的聚类方法。|code|0|
|Compositional and Hierarchical Semantic Learning Model for Hospital Readmission Prediction|Weiting Gao, Xiangyu Gao, Yi Chen|New Jersey Institute of Technology, Newark, NJ, USA|Clinical notes provide a wealth of patient information that is valuable for predicting clinical outcomes. In particular, predicting hospital 30-day readmission is important to improve healthcare outcomes and reduce cost. Previous works on outcome prediction using clinical notes overlook complex semantic compositions and syntactic structure when learning the note level embedding, which may fail to capture the note semantics and make accurate predictions. To address these limitations, we propose a Compositional and Hierarchical Semantic Learning Model (CHSLM). It formulates the semantic learning of clinical notes into three hierarchies: word, composition, and note, and aggregates the semantics in a bottom-up manner. To aggregate the semantics from words to compositions, we construct heterogeneous medical-composition graphs to represent word interactions within and between medical compositions and use Graph Neural Networks to learn the composition embedding. To aggregate the semantics from composition- to note-level, we incorporate a mutual BiAffine transformation process. The experimental results on 30-day readmission prediction using two types of clinical notes demonstrate the effectiveness of our method over the state-of-the-art clinical prediction models.|临床记录提供了丰富的患者信息,这些信息对于预测临床结果具有重要价值。特别是,预测患者在出院后30天内再次入院的情况对于改善医疗结果和降低成本至关重要。以往利用临床记录进行结果预测的研究在学习笔记级别的嵌入时,忽视了复杂的语义组合和句法结构,这可能导致无法准确捕捉笔记的语义并做出精确的预测。为解决这些局限性,我们提出了一种组合与层次语义学习模型(CHSLM)。该模型将临床记录的语义学习划分为三个层次:词、组合和笔记,并以自底向上的方式聚合语义。为了从词到组合聚合语义,我们构建了异构的医疗组合图,以表示医疗组合内外的词交互,并使用图神经网络来学习组合嵌入。为了从组合层级到笔记层级聚合语义,我们引入了一个双向BiAffine转换过程。在利用两种类型的临床记录进行30天再入院预测的实验中,我们的方法展示了其优于现有最先进临床预测模型的有效性。|code|0|
|Mitigating Cold-Start Problems in Knowledge Tracing with Large Language Models: An Attribute-aware Approach|Yuxiang Guo, Shuanghong Shen, Qi Liu, Zhenya Huang, Linbo Zhu, Yu Su, Enhong Chen|; Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, China|Knowledge Tracing (KT) is a crucial research task for dynamically monitoring students' knowledge states, particularly in online education systems. Recently, knowledge tracing has gained significant attention and in-depth research. Most existing methods rely on students' response data for question understanding and modeling, which helps better update students' knowledge states. Meanwhile, question ID is utilized to indicate and represent questions. However, this presents a challenge when transitioning to new, cold-start questions that few students have answered before. Also, prior work has overlooked the semantic modeling of questions, which could better assist in modeling the transfer of students' knowledge states. In this paper, we explore leveraging the power of Large Language Models (LLMs) to help understand questions for knowledge tracing, which helps mitigate cold-start and sparsity problems and model the transfer of students' knowledge states in a sophisticated manner. Specifically, we first design an attribute estimation module to estimate the attributes of the questions (e.g., difficulty, ability requirements, expected response time) by prompting Large Language Models. Subsequently, we develop a question embedding module that incorporates a graph attention network to effectively utilize these attributes. Extensive experiments on various datasets demonstrate that our model outperforms existing state-of-the-art models and effectively addresses the problems of cold-start and sparsity. In addition, due to the estimation of multiple attributes of the questions, our model exhibits superior interpretability.|知识追踪(Knowledge Tracing, KT)是动态监测学生知识状态的关键研究任务,尤其在在线教育系统中具有重要意义。近年来,知识追踪受到了广泛关注并进行了深入研究。现有的大多数方法依赖于学生的回答数据来理解问题并进行建模,从而更好地更新学生的知识状态。同时,问题ID被用来指示和表示问题。然而,这种方法在面对新出现的“冷启动”问题时存在挑战,尤其是那些之前很少有学生回答过的问题。此外,先前的研究忽视了问题的语义建模,这本可以更好地辅助建模学生知识状态的转移。本文探讨了利用大语言模型(Large Language Models, LLMs)来帮助理解问题,以进行知识追踪,这有助于缓解冷启动和数据稀疏问题,并以更精细的方式建模学生知识状态的转移。具体而言,我们首先设计了一个属性估计模块,通过提示大语言模型来估计问题的属性(如难度、能力要求、预期回答时间)。随后,我们开发了一个问题嵌入模块,结合图注意力网络,以有效利用这些属性。在多个数据集上的广泛实验表明,我们的模型优于现有的最先进模型,并有效地解决了冷启动和稀疏性问题。此外,由于对问题多个属性的估计,我们的模型展现出更强的可解释性。|code|0|
|HeckmanCD: Exploiting Selection Bias in Cognitive Diagnosis|Dongxuan Han, Qi Liu, Siqi Lei, Shiwei Tong, Wei Huang|; Tencent Company, Shenzhen, China; Item Bank Department, National Education Examinations Authority, Beijing, China|Cognitive diagnosis, a fundamental task in education assessments, aims to quantify the students' proficiency level based on the historical test logs. However, the interactions between students and exercises are incomplete and even sparse, which means that only a few exercise scores of a specific student are observed. A key finding is that the pattern of this missingness is non-random, which could induce bias in the estimated proficiency value. To this end, we formulate cognitive diagnosis as a sample selection problem where observations are sampled through non-random probabilities that correlate with both the student's response correctness and the features of the student and exercise. We propose a simple but effective method called HeckmanCD, adapting the Heckman two-stage approach to mitigate this endogeneity issue. We first employ an interaction model to predict the occurrence probability of a specific student-exercise pair. After that, a selection variable, derived from this interaction model, is incorporated as a controlled independent variable in the cognitive diagnosis framework. Our analysis reveals that the vanilla estimations of the item response theory model are inherently biased in the presence of confounders, and our method can correct this bias by capturing the covariance. The proposed HeckmanCD can be applied to most existing cognitive diagnosis models, including deep models, and the empirical evaluation demonstrates the effectiveness of our method while no other auxiliary information is required such as textual descriptions of exercises.|认知诊断,作为教育评估中的基础任务,旨在基于学生的历史测试记录量化其熟练度水平。然而,学生与练习之间的互动往往是不完整的,甚至是稀疏的,这意味着只能观察到特定学生的少数练习成绩。一个关键发现是,这种缺失的模式并非随机,可能会导致估计的熟练度值出现偏差。为此,我们将认知诊断问题形式化为一个样本选择问题,其中观察值是通过与学生回答正确性和学生及练习特征相关的非随机概率进行采样的。我们提出了一种简单但有效的方法,称为HeckmanCD,该方法采用Heckman两阶段方法来缓解这种内生性问题。首先,我们使用一个交互模型来预测特定学生-练习对的发生概率。随后,从该交互模型中得出的选择变量被纳入认知诊断框架中,作为受控的自变量。我们的分析表明,在存在混杂因素的情况下,项目反应理论模型的朴素估计本质上是有偏的,而我们的方法通过捕捉协方差可以纠正这种偏差。所提出的HeckmanCD方法可以应用于大多数现有的认知诊断模型,包括深度模型,并且实证评估显示了该方法的有效性,而无需其他辅助信息,如练习的文本描述。|code|0|
|Spatio-Temporal Transformer Network with Physical Knowledge Distillation for Weather Forecasting|Jing He, Junzhong Ji, Minglong Lei|College of Computer Science, Beijing University of Technology, Beijing, China|Weather forecasting has become a popular research topic recently, which mainly benefits from the development of spatio-temporal neural networks to effectively extract useful patterns from weather data. Generally, the weather changes in the meteorological system are governed by physical principles. However, it is challenging for spatio-temporal methods to capture the physical knowledge of meteorological dynamics. To address this problem, we propose in this paper a spatio-temporal Transformer network with physical knowledge distillation (PKD-STTN) for weather forecasting. First, the teacher network is implemented by a differential equation network that models weather changes by the potential energy in the atmosphere to reveal the physical mechanism of atmospheric movements. Second, the student network uses a spatio-temporal Transformer that concurrently utilizes three attention modules to comprehensively capture the semantic spatial correlation, geographical spatial correlation, and temporal correlation from weather data. Finally, the physical knowledge of the teacher network is transferred to the student network by inserting a distillation position encoding into the Transformer. Notice that the output of the teacher network is distilled to the position encoding rather than the output of the student network, which can largely utilize physical knowledge without influencing the feature extraction process of Transformers. Experiments on benchmark datasets show that the proposed method can effectively utilize physical principles of weather changes and has obvious performance advantages compared with several strong baselines.|天气预报近年来成为热门研究课题,主要得益于时空神经网络的发展,能够有效从天气数据中提取有用模式。通常,气象系统中的天气变化受物理原理支配。然而,时空方法难以捕捉气象动力学的物理知识。为解决此问题,本文提出一种融入物理知识蒸馏的时空Transformer网络(PKD-STTN)用于天气预报。首先,教师网络采用微分方程网络,通过大气中的势能建模天气变化,揭示大气运动的物理机制。其次,学生网络使用时空Transformer,同时利用三个注意力模块全面捕捉天气数据中的语义空间相关性、地理空间相关性和时间相关性。最后,通过插入蒸馏位置编码,将教师网络的物理知识传递给学生网络。注意,教师网络的输出被蒸馏至位置编码而非学生网络的输出,这样可以在不影响Transformer特征提取过程的情况下充分利用物理知识。在基准数据集上的实验表明,所提方法能有效利用天气变化的物理原理,相比多个强基线方法具有明显的性能优势。|code|0|
|New Localization Frameworks: User-centric Approaches to Source Localization in Real-world Propagation Scenarios|Dongpeng Hou, Yuchen Wang, Chao Gao, Xianghua Li, Zhen Wang|Northwestern Polytechnical University, Xi'an, Shaanxi, China|Source localization in social platforms is critical for managing and controlling the misinformation spreading. Despite all the recent advancements, existing methods do not consider the dynamic and heterogeneous propagation behaviors of users and are developed based on simulated data with strong model assumptions, limiting the application in real-world scenarios. This research addresses this limitation by presenting a novel framework for source localization, grounded in real-world propagation cascades from platforms like Weibo and Twitter. What's more, recognizing the user-driven nature of users in information spread, we systematically crawl and integrate user-specific profiles, offering a realistic understanding of user-driven propagation dynamics. In summary, by developing datasets derived from real-world propagation cascades, we set a precedent in enhancing the authenticity and practice of source identification for social media. Our comprehensive experiments not only validate the feasibility and rationale of our novel user-centric localization approaches but also emphasize the significance of considering user profiles in real-world propagation scenarios. The code is available at https://github.com/cgao-comp/NFSL.|社交平台中的信息源定位对于管理和控制虚假信息的传播至关重要。尽管近年来取得了诸多进展,但现有方法并未考虑用户传播行为的动态性和异质性,并且这些方法基于具有强假设的模拟数据开发,限制了其在现实场景中的应用。本研究通过提出一种基于微博和推特等平台真实传播链的全新源定位框架,解决了这一局限性。此外,鉴于信息传播中用户驱动的特性,我们系统地爬取并整合了用户特定的个人资料,从而提供了对用户驱动传播动态的真实理解。总之,通过开发源自真实传播链的数据集,我们为增强社交媒体源识别的真实性和实用性树立了先例。我们的全面实验不仅验证了我们以用户为中心的新型定位方法的可行性和合理性,还强调了在现实传播场景中考虑用户个人资料的重要性。代码已公开,详见 https://github.com/cgao-comp/NFSL。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=New+Localization+Frameworks:+User-centric+Approaches+to+Source+Localization+in+Real-world+Propagation+Scenarios)|0|
|Physics-guided Active Sample Reweighting for Urban Flow Prediction|Wei Jiang, Tong Chen, Guanhua Ye, Wentao Zhang, Lizhen Cui, Zi Huang, Hongzhi Yin||Urban flow prediction is a spatio-temporal modeling task that estimates the throughput of transportation services like buses, taxis, and ride-sharing, where data-driven models have become the most popular solution in the past decade. Meanwhile, the implicitly learned mapping from historical observations to the prediction targets tends to over-simplify the dynamics of real-world urban flows, leading to suboptimal predictions. Some recent spatio-temporal prediction solutions bring remedies with the notion of physics-guided machine learning (PGML), which describes spatio-temporal data with nuanced and principled physics laws, thus enhancing both the prediction accuracy and interpretability. However, these spatio-temporal PGML methods are built upon a strong assumption that the observed data fully conforms to the differential equations that define the physical system, which can quickly become ill-posed in urban flow prediction tasks. The observed urban flow data, especially when sliced into time-dependent snapshots to facilitate predictions, is typically incomplete and sparse, and prone to inherent noise incurred in the collection process. As a result, such physical inconsistency between the data and PGML model significantly limits the predictive power and robustness of the solution. Moreover, due to the interval-based predictions and intermittent nature of data filing in many transportation services, the instantaneous dynamics of urban flows can hardly be captured, rendering differential equation-based continuous modeling a loose fit for this setting. To overcome the challenges, we develop a discretized physics-guided network (PN), and propose a data-aware framework Physics-guided Active Sample Reweighting (P-GASR) to enhance PN. Experimental results on four real-world datasets demonstrate that our method achieves state-of-the-art performance with a demonstrable improvement in robustness.|城市流量预测是一项时空建模任务,旨在估算公交车、出租车和共享出行等交通服务的吞吐量,其中数据驱动模型在过去十年中已成为最受欢迎的解决方案。然而,隐式学习的历史观测与预测目标之间的映射关系往往过于简化现实世界中城市流量的动态特性,导致预测效果不佳。近期,一些时空预测解决方案引入了物理引导机器学习(PGML)的概念,通过利用细致且基于物理定律的描述来增强预测的准确性和可解释性。然而,这些时空PGML方法基于一个强假设,即观测数据完全符合定义物理系统的微分方程,这在城市流量预测任务中往往难以成立。观测到的城市流量数据,尤其是为了便于预测而分割成时间依赖的快照时,通常是不完整且稀疏的,并且容易受到采集过程中固有噪声的影响。因此,数据与PGML模型之间的物理不一致性显著限制了该解决方案的预测能力和鲁棒性。此外,由于许多交通服务中的数据记录是基于时间间隔的,并且具有间歇性,难以捕捉城市流量的瞬时动态,使得基于微分方程的连续建模方法在此场景下并不适用。为应对这些挑战,我们开发了一种离散化的物理引导网络(PN),并提出了一种数据感知框架——物理引导主动样本重加权(P-GASR),以增强PN的性能。在四个真实世界数据集上的实验结果表明,我们的方法在实现最先进性能的同时,显著提升了模型的鲁棒性。|code|0|
|Federated Heterogeneous Contrastive Distillation for Molecular Representation Learning|Jinjia Feng, Zhen Wang, Zhewei Wei, Yaliang Li, Bolin Ding, Hongteng Xu|Sun Yat-sen University, Guangzhou, China; Peng Cheng Laboratory & Renmin University of China, Shenzhen, China; Renmin University of China, Beijing, China; Alibaba Group, Bellevue, WA, USA; Alibaba Group, Bellevue, CA, USA|With the increasing application of deep learning to solve scientific problems in biochemistry, molecular federated learning has become popular due to its ability to offer distributed privacy-preserving solutions. However, most existing molecular federated learning methods rely on joint training with public datasets, which are difficult to obtain in practice. These methods also fail to leverage multi-modal molecular representations effectively. To address the above issues, we propose a novel framework, Federated Heterogeneous Contrastive Distillation (FedHCD), which enables joint training of global models from clients with heterogeneous data modalities, learning tasks, and molecular models. To aggregate data representations of different modalities in a data-free manner, we design a global multi-modal contrastive strategy to align the representations of clients without a public dataset. Utilizing intrinsic characteristics of molecular data in different modalities, we tackle the exacerbation of local model drift and data Non-IIDness caused by multi-modal clients. We introduce a multi-view contrastive knowledge transfer to extract features from atoms, substructures, and molecules, solving the issue of information distillation failure due to dimensional biases in different data modalities. Our evaluations on eight real-world molecular datasets and ablation experiments show that FedHCD outperforms other state-of-the-art FL methods, irrespective of whether or not they use public datasets.|随着深度学习在解决生物化学领域科学问题中的应用日益增多,分子联邦学习因其能够提供分布式隐私保护解决方案而受到广泛关注。然而,现有的大多数分子联邦学习方法依赖于与公共数据集的联合训练,这在实际中难以获取。此外,这些方法未能有效利用多模态分子表示。为解决上述问题,我们提出了一种新颖的框架——联邦异构对比蒸馏(Federated Heterogeneous Contrastive Distillation, FedHCD),该框架能够从具有异构数据模态、学习任务和分子模型的客户端中联合训练全局模型。为了在无需数据的情况下聚合不同模态的数据表示,我们设计了一种全局多模态对比策略,以对齐客户端的表示,而不依赖于公共数据集。利用不同模态分子数据的内禀特性,我们解决了多模态客户端导致的局部模型漂移和数据非独立同分布(Non-IID)问题。我们引入了一种多视角对比知识迁移方法,从原子、子结构和分子中提取特征,解决了由于不同数据模态间的维度偏差导致的信息蒸馏失败问题。我们在八个真实世界的分子数据集上进行了评估和消融实验,结果表明,无论是否使用公共数据集,FedHCD均优于其他最先进的联邦学习方法。|code|0|
|Discrepancy-guided Channel Dropout for Domain Generalization|Seonggyeom Kim, Byeongtae Park, Harim Lee, DongKyu Chae|Hanyang University, Seoul, Republic of Korea|Deep Neural Networks (DNNs) tend to perform poorly on unseen domains due to domain shifts. Domain Generalization (DG) aims to improve the performance on such scenarios by minimizing the distribution discrepancy between source domains. Among many studies, dropout-based DG approaches which remove domain-specific features have gained attention. However, they are limited in minimizing the upper bound of generalization risk because they do not explicitly consider the distribution discrepancy when discarding features. In this paper, we propose a novel Discrepancy-guided Channel Dropout (DgCD) for DG that explicitly derives the discrepancy between domains and drops the channels with significant distribution discrepancy. Given a training batch, we perform two ways of standardization: (1) based on the variance/mean of the batch (i.e., sampled from all source domains) and (2) based on the variance/mean of domain-wise samples in the batch. With the two normal distributions, we explicitly derive the discrepancy using KL-divergence and backpropagate it towards each channel. A channel with a higher contribution to the discrepancy is more likely to be dropped. Experimental results show the superiority of DgCD over the state-of-the-art DG baselines, demonstrating the effectiveness of our dropout strategy which is directly coupled to reducing the domain discrepancy. Our code is available at: https://github.com/gyeomo/DgCD|深度神经网络(DNNs)在面对未见过的领域时往往表现不佳,这是由于领域转移造成的。领域泛化(Domain Generalization, DG)旨在通过最小化源领域之间的分布差异来提升在这些场景下的性能。在众多研究中,基于dropout的DG方法因其能够去除领域特定特征而受到关注。然而,这些方法在最小化泛化风险的上界方面存在局限,因为它们在丢弃特征时并未明确考虑分布差异。本文提出了一种新颖的差异引导通道dropout(Discrepancy-guided Channel Dropout, DgCD)方法,用于领域泛化,该方法明确地推导出领域间的差异,并丢弃具有显著分布差异的通道。在给定的训练批次中,我们执行两种标准化方式:(1)基于批次的方差/均值(即从所有源领域中采样);(2)基于批次中各领域样本的方差/均值。通过这两个正态分布,我们使用KL散度明确地推导出差异,并将其反向传播到每个通道。对差异贡献较大的通道更有可能被丢弃。实验结果表明,DgCD在性能上优于当前最先进的领域泛化基线方法,证明了我们的dropout策略能直接减少领域差异的有效性。我们的代码已公开,地址为:https://github.com/gyeomo/DgCD。|[code](https://paperswithcode.com/search?q_meta=&q_type=&q=Discrepancy-guided+Channel+Dropout+for+Domain+Generalization)|0|
|Efficient and Secure Contribution Estimation in Vertical Federated Learning|Juan Li, Rui Deng, Tianzi Zang, Mingqi Kong, Kun Zhu|Nanjing University of Aeronautics and Astronautics, Nanjing, China|As necessary information about whether cooperation can be reached, rewards should be determined in advance in Vertical Federated Learning (VFL). To determine reasonable rewards, participant contributions should be estimated precisely. We propose a Vertically Federated Contribution Estimation (VF-CE) method. VF-CE calculates Mutual Information (MI) between distributed features and the label using a neural network trained via VFL itself. Note that compensation for CE is low as it only covers computation costs, and reward for real VFL training is high as it needs to cover training costs as well as participants' contributions to model performance and the resulting business benefits. Because MI presents a strong positive correlation with the final model performance, contributions to model performance can be estimated based on contributions to MI. We integrate a scalar-level attention mechanism in the MI neural network. The attention weights of participants are treated as their contributions. We find that attention weights can effectively measure contribution redundancy, as their Spearman correlation coefficient with Shapley value is as high as 0.963. We demonstrate that VF-CE also satisfies properties of balance, zero element, and symmetry concerning fairness, which are hallmark properties of Shapley value. Compared with existing work, we consider contribution redundancy precisely, efficiently output approximated Shapley values through one MI calculation instead of 2^n where n is the number of participants, and introduce no extra privacy risk except the inherent risk in VFL, i.e., gradient transmission.|在纵向联邦学习(Vertical Federated Learning, VFL)中,为了确定合作是否能够达成,奖励应提前设定。为了设定合理的奖励,参与者的贡献应被精确估计。我们提出了一种纵向联邦贡献估计(Vertically Federated Contribution Estimation, VF-CE)方法。VF-CE通过使用VFL自身训练的神经网络计算分布式特征与标签之间的互信息(Mutual Information, MI)。需要注意的是,CE的补偿较低,因为它仅涵盖计算成本,而实际VFL训练的奖励较高,因为它需要涵盖训练成本以及参与者对模型性能和由此产生的业务效益的贡献。由于MI与最终模型性能呈现强正相关,因此可以通过MI的贡献来估计对模型性能的贡献。我们在MI神经网络中集成了一个标量级别的注意力机制,将参与者的注意力权重视为他们的贡献。我们发现,注意力权重能够有效衡量贡献冗余,其与Shapley值的Spearman相关系数高达0.963。我们证明,VF-CE在公平性方面也满足平衡性、零元素和对称性等特性,这些是Shapley值的标志性特性。与现有工作相比,我们精确考虑了贡献冗余,通过一次MI计算高效输出近似的Shapley值,而不是2^n次计算(其中n是参与者的数量),并且除了VFL固有的梯度传输隐私风险外,不引入额外的隐私风险。|code|0|
|MoTTo: Scalable Motif Counting with Time-aware Topology Constraint for Large-scale Temporal Graphs|Jiantao Li, Jianpeng Qi, Yueling Huang, Lei Cao, Yanwei Yu, Junyu Dong|Ocean University of China, Qingdao, Shandong, China; University of Arizona, Tucson, USA; Ocean University of China, Qingdao, China|Temporal motifs are recurring subgraph patterns in temporal graphs, and are present in various domains such as social networks, fraud detection, and biological networks. Despite their significance, counting temporal motifs efficiently remains a challenge, particularly on moderately sized datasets with millions of motif instances. To address this challenge, we propose a novel algorithm called Scalable Motif Counting with Time-aware Topology Constraint (MoTTo). MoTTo focuses on accurately counting temporal motifs with up to three nodes and three edges. It first utilizes a topology constraint-based pruning strategy to eliminate nodes that cannot participate in forming temporal motifs before the counting process. Then, it adopts a time-aware topology constraint-based pruning strategy to split large-scale datasets into independent partitions and filter out the unrelated ones, ensuring that the counting results remain unaffected. By investigating the second pruning strategy, we also find that MoTTo can be implemented in a multi-thread manner, further accelerating the counting process significantly. Experimental results on several real-world datasets of varying sizes demonstrate that MoTTo outperforms state-of-the-art methods in terms of efficiency, achieving up to a nine-fold improvement in total temporal motif counting. Specifically, the efficiency of counting triangular temporal motifs is enhanced by up to 31 times compared to state-of-the-art baselines.|时态模体是在时态图中重复出现的子图模式,广泛存在于社交网络、欺诈检测和生物网络等多个领域。尽管它们具有重要意义,但高效地统计时态模体仍然是一个挑战,尤其是在包含数百万模体实例的中等规模数据集上。为了应对这一挑战,我们提出了一种名为“带有时态拓扑约束的可扩展模体计数”(MoTTo)的新算法。MoTTo专注于精确统计包含最多三个节点和三条边的时态模体。首先,它利用基于拓扑约束的剪枝策略,在计数过程之前排除无法参与形成时态模体的节点。接着,它采用基于时态拓扑约束的剪枝策略,将大规模数据集分割成独立的分区并过滤掉无关部分,从而确保计数结果不受影响。通过研究第二种剪枝策略,我们还发现MoTTo可以实现多线程处理,进一步显著加速计数过程。在多个不同规模的实际数据集上的实验结果表明,MoTTo在效率方面优于当前最先进的方法,总时态模体计数效率提升高达九倍。具体而言,与最先进的基线方法相比,三角形时态模体的计数效率提高了多达31倍。|code|0|
|LagCNN: A Fast yet Effective Model for Multivariate Long-term Time Series Forecasting|Linsen Li, Chunfei Jian, Feng Wan, Dongdong Geng, Ziquan Fang, Lu Chen, Yunjun Gao, Weihao Jiang, Jiang Zhu|Zhejiang University, Hangzhou, China; Hikvision Research Institute, Hangzhou, China; Zhejiang University & Hikvision Research Institute, Hangzhou, China|Long-term time series forecasting has gained significant attention in recent years due to its wide application in various fields. Transformer-based models have gained popularity for the ability to capture long-sequence interactions. However, these models are limited in real-world use because of the memory consumption and computation explosion. The CNN-based models are also one of the main models used for time series prediction, but their performance has always been inferior to the transformer-based models in previous works. We have reconsidered the role of CNN components and redefined the way CNN basic components are used for time series prediction. In addition, the time lags information between periods in the time series is important. Unfortunately, existing works lack consideration of this classic but important information. Motivated by these factors, we propose a fast yet effective CNN model with time lags for multivariate long-term time series forecasting, named LagCNN. Specifically, the time series is transformed into lag-patches to capture the correlation between periods. Then, a fast CNN model is performed in the feature dimension rather than the time dimension as most previous works do. Meanwhile, information aggregation is performed in the time dimension to extract complex temporal patterns. LagCNN significantly outperforms state-of-the-art methods on multiple publicly available datasets. Furthermore, LagCNN exhibits significant efficiency advantages over the most efficient Transformer model (PatchTST), resulting in a significant reduction in memory usage (4.4×) and runtime (10.7×).|长期时间序列预测近年来因其广泛的应用而备受关注。基于Transformer的模型因其捕捉长序列交互的能力而受到欢迎。然而,这些模型在实际应用中受到限制,因为它们存在内存消耗和计算爆炸的问题。基于CNN的模型也是用于时间序列预测的主要模型之一,但它们在以往的研究中的表现一直不如基于Transformer的模型。我们重新考虑了CNN组件的作用,并重新定义了用于时间序列预测的CNN基本组件的使用方式。此外,时间序列中各周期之间的时间滞后信息非常重要。遗憾的是,现有研究缺乏对这一经典但重要信息的考虑。受这些因素的启发,我们提出了一种快速且有效的CNN模型,用于多元长期时间序列预测,命名为LagCNN。具体来说,时间序列被转换为滞后补丁,以捕捉周期之间的相关性。然后,在特征维度而不是像大多数先前的工作那样在时间维度上执行快速CNN模型。同时,在时间维度上进行信息聚合,以提取复杂的时间模式。LagCNN在多个公开可用的数据集上显著优于最先进的方法。进一步地,LagCNN在与最有效的Transformer模型(PatchTST)相比时表现出显著的效率优势,导致内存使用量(4.4倍)和运行时间(10.7倍)大幅减少。|code|0|
|Noise-Resilient Unsupervised Graph Representation Learning via Multi-Hop Feature Quality Estimation|Shiyuan Li, Yixin Liu, Qingfeng Chen, Geoffrey I. Webb, Shirui Pan||Unsupervised graph representation learning (UGRL) based on graph neural networks (GNNs) has received increasing attention owing to its efficacy in handling graph-structured data. However, existing UGRL methods ideally assume that the node features are noise-free, which makes them fail to distinguish between useful information and noise when applied to real data with noisy features, thus affecting the quality of learned representations. This urges us to take noisy node features into account in real-world UGRL. With empirical analysis, we reveal that feature propagation, the essential operation in GNNs, acts as a "double-edged sword" in handling noisy features - it can both denoise and diffuse noise, leading to varying feature quality across nodes, even within the same node at different hops. Building on this insight, we propose a novel UGRL method based on Multi-hop feature Quality Estimation (MQE for short). Unlike most UGRL models that directly utilize propagation-based GNNs to generate representations, our approach aims to learn representations through estimating the quality of propagated features at different hops. Specifically, we introduce a Gaussian model that utilizes a learnable "meta-representation" as a condition to estimate the expectation and variance of multi-hop propagated features via neural networks. In this way, the "meta-representation" captures the semantic and structural information underlying multiple propagated features but is naturally less susceptible to interference by noise, thereby serving as high-quality node representations beneficial for downstream tasks. Extensive experiments on multiple real-world datasets demonstrate that MQE is effective in learning reliable node representations in scenarios with diverse types of feature noise.|基于图神经网络(GNN)的无监督图表示学习(UGRL)因其有效处理图结构数据的能力而受到越来越多的关注。然而,现有的UGRL方法理想化地假设节点特征是无噪声的,这使得它们在应用于具有噪声特征的真实数据时,无法区分有用信息和噪声,从而影响学习到的表示质量。这促使我们在现实世界的UGRL中考虑节点噪声特征。通过实证分析,我们揭示了特征传播——GNN中的基本操作——在处理噪声特征时起到了“双刃剑”的作用——它既能去噪又能扩散噪声,导致不同节点之间的特征质量不同,甚至在同一节点的不同跳数之间也存在差异。基于这一见解,我们提出了一种基于多跳特征质量估计(MQE)的新型UGRL方法。与大多数直接利用基于传播的GNN生成表示的UGRL模型不同,我们的方法旨在通过估计不同跳数传播特征的质量来学习表示。具体来说,我们引入了一个高斯模型,该模型利用可学习的“元表示”作为条件,通过神经网络估计多跳传播特征的期望和方差。通过这种方式,“元表示”捕捉了多个传播特征背后的语义和结构信息,但自然较少受到噪声的干扰,从而作为高质量的节点表示,有利于下游任务。在多个真实世界数据集上的广泛实验表明,MQE在处理具有多种特征噪声的场景中能够学习到可靠的节点表示。|code|0|
|Aligning Large Language Models to a Domain-specific Graph Database for NL2GQL|Yuanyuan Liang, Keren Tan, Tingyu Xie, Wenbiao Tao, Siyuan Wang, Yunshi Lan, Weining Qian||||code|0|
|ITIU: Intention Understanding via Interactive Table in Large Language Models|Zenghua Liao, Jinzhi Liao, Xiang Zhao||||code|0|
|Unveiling Intellectual Property Vulnerabilities of GAN-Based Distributed Machine Learning through Model Extraction Attacks|Mengyao Ma, Shuofeng Liu, M. A. P. Chamikara, Mohan Baruwal Chhetri, Guangdong Bai||||code|0|
|Semantic Prototypes: Enhancing Transparency without Black Boxes|Orfeas MenisMastromichalakis, Giorgos Filandrianos, Jason Liartis, Edmund Dervakos, Giorgos Stamou||||code|0|
|Revisiting Optimal Window Aggregation in Data Streams: The Prefix-Sum Approach|José Martinez, Guillaume Raschia||||code|0|
|Adaptive Cascading Network for Continual Test-Time Adaptation|Kien X. Nguyen, Fengchun Qiao, Xi Peng||||code|0|
|Exploring Robustness of GNN against Universal Injection Attack from a Worst-case Perspective|Dandan Ni, Sheng Zhang, Cong Deng, Han Liu, Gang Chen, Minhao Cheng, Hongyang Chen||||code|0|
|CADIF-OSN: Detecting Cloned Accounts with Missing Profile Attributes on Online Social Networks|Dewei Ning, YongFeng Ge, Hua Wang, Changjun Zhou||||code|0|
|Distilling Large Language Models for Text-Attributed Graph Learning|Bo Pan, Zheng Zhang, Yifei Zhang, Yuntong Hu, Liang Zhao||||code|0|
|Table-Filling via Mean Teacher for Cross-domain Aspect Sentiment Triplet Extraction|Kun Peng, Lei Jiang, Qian Li, Haoran Li, Xiaoyan Yu, Li Sun, Shuo Sun, Yanxian Bi, Hao Peng||||code|0|
|Periormer: Periodic Transformer for Seasonal and Irregularly Sampled Time Series|Xiaobin Ren, Kaiqi Zhao, Katerina Taskova, Patricia Riddle, Lianyan Li||||code|0|
|Self-supervised One-Stage Learning for RF-based Multi-Person Pose Estimation|Seunghwan Shin, Yusung Kim||||code|0|
|DFLStar: A Decentralized Federated Learning Framework with Self-Knowledge Distillation and Participant Selection|Behnaz Soltani, Venus Haghighi, Yipeng Zhou, Quan Z. Sheng, Lina Yao||||code|0|
|TEXT CAN BE FAIR: Mitigating Popularity Bias with PLMs by Learning Relative Preference|Zuoli Tang, Zhaoxin Huan, Zihao Li, Shirui Hu, Xiaolu Zhang, Jun Zhou, Lixin Zou, Chenliang Li||||code|0|
|LTBoost: Boosted Hybrids of Ensemble Linear and Gradient Algorithms for the Long-term Time Series Forecasting|Hubert Truchan, Christian Kalfar, Zahra Ahmadi||||code|0|
|Why Misinformation is Created? Detecting them by Integrating Intent Features|Bing Wang, Ximing Li, Changchun Li, Bo Fu, Songwen Pei, Shengsheng Wang||||code|0|
|Bots Shield Fake News: Adversarial Attack on User Engagement based Fake News Detection|Lanjun Wang, Zehao Wang, Le Wu, AnAn Liu||||code|0|
|Learning to Differentiate Pairwise-Argument Representations for Implicit Discourse Relation Recognition|Zhipang Wang, Yu Hong, Yuxiang Lu, Xiabing Zhou, Jianmin Yao, Guodong Zhou||||code|0|
|Identifying Disinformation from Online Social Media via Dynamic Modeling across Propagation Stages|Shuai Xu, Jianqiu Xu, Shuo Yu, Bohan Li||||code|0|
|SGES: A General and Space-efficient Framework for Graphlet Counting in Graph Streams|Chen Yang, Lailong Luo, Yuliang Lu, Chu Huang, Qianzhen Zhang, Guozheng Yang, Deke Guo||||code|0|
|Behavior-Aware Hypergraph Convolutional Network for Illegal Parking Prediction with Multi-Source Contextual Information|Guang Yang, Meiqi Tu, Zelong Li, Jinquan Hang, Taichi Liu, Ruofeng Liu, Yi Ding, Yu Yang, Desheng Zhang||||code|0|
|Distilling Multi-Scale Knowledge for Event Temporal Relation Extraction|HaoRen Yao, Luke Breitfeller, Aakanksha Naik, Chunxiao Zhou, Carolyn P. Rosé||||code|0|
|Debiased Graph Poisoning Attack via Contrastive Surrogate Objective|Kanghoon Yoon, Yeonjun In, Namkyeong Lee, Kibum Kim, Chanyoung Park||||code|0|
|Language Models-enhanced Semantic Topology Representation Learning For Temporal Knowledge Graph Extrapolation|Tianli Zhang, Tongya Zheng, Zhenbang Xiao, Zulong Chen, Liangyue Li, Zunlei Feng, Dongxiang Zhang, Mingli Song||||code|0|
|SaLa: Scenario-aware Label Graph Interaction for Multi-intent Spoken Language Understanding|Zhihong Zhu, Xuxin Cheng, Zhanpeng Chen, Zhichang Wang, Zhiqi Huang, Yuexian Zou||||code|0|
|Distributed Boosting: An Enhancing Method on Dataset Distillation|Xuechao Chen, Wenchao Meng, Peiran Wang, Qihang Zhou||||code|0|
|The Factuality of Large Language Models in the Legal Domain|Rajaa El Hamdani, Thomas Bonald, Fragkiskos D. Malliaros, Nils Holzenberger, Fabian M. Suchanek||||code|0|
|Beyond Language Bias: Overcoming Multimodal Shortcut and Distribution Biases for Robust Visual Question Answering|Jingliang Gu, Zhixin Li||||code|0|
|A Contextual Combinatorial Semi-Bandit Approach to Network Bottleneck Identification|Fazeleh Sadat Hoseini, Niklas Åkerblom, Morteza Haghir Chehreghani||||code|0|
|Nonparametric Estimation of Non-Smooth Divergences|M. Mahbub Hossain, Alan Wisler, Kevin R. Moon||||code|0|
|LEX-GNN: Label-Exploring Graph Neural Network for Accurate Fraud Detection|Woochang Hyun, Insoo Lee, Bongwon Suh||||code|0|
|GraphVAE: Unveiling Dynamic Stock Relationships with Variational Autoencoder-based Factor Modeling|Yulong Jia, Guanxing Li, Ganlong Zhao, Xiangru Lin, Guanbin Li||||code|0|
|Covariate Ordered Systematic Sampling as an Improvement to Randomized Controlled Trials|Deddy Jobson, Yilin Li, Naoki Nishimura, Koya Ohashi, Jie Yang, Takeshi Matsumoto||||code|0|
|Flexi-clique: Exploring Flexible and Sub-linear Clique Structures|Song Kim, Junghoon Kim, Susik Yoon, Jungeun Kim||||code|0|
|Intricate Object Detection in Self Driving Environments with Edge-Adaptive Depth Estimation(EADE)|SuBi Kim, Jieun Kang, Yongik Yoon||||code|0|
|Ocean Significant Wave Height Estimation with Spatio-temporally Aware Large Language Models|Zhe Li, Ronghui Xu, Jilin Hu, Zhong Peng, Xi Lu, Chenjuan Guo, Bin Yang||||code|0|
|The Elusiveness of Detecting Political Bias in Language Models|Riccardo Lunardi, David La Barbera, Kevin Roitero||||code|0|
|ILTS: Inducing Intention Propagation in Decentralized Multi-Agent Tasks with Large Language Models|Xihe Qiu, Haoyu Wang, Xiaoyu Tan, Chao Qu||||code|0|
|Automation of Text-Based Economic Indicator Construction: A Pilot Exploration on Economic Policy Uncertainty Index|HsiuHsuan Yeh, YuLieh Huang, Ziho Park, ChungChi Chen||||code|0|
|Prioritized Binary Transformation Method for Efficient Multi-label Classification of Data Streams with Many Labels|Onur Yildirim, Sepehr Bakhshi, Fazli Can||||code|0|
|Forecasting Live Chat Intent from Browsing History|Seeun Yoon, Ahmad Bin Rabiah, Zaid Alibadi, Surya Kallumadi, Julian J. McAuley||||code|0|
|XRDMamba: Large-scale Crystal Material Space Group Identification with Selective State Space Model|Liheng Yu, Pengkun Wang, Zhe Zhao, Zhongchao Yi, Sun Nan, Di Wu, Yang Wang||||code|0|
|Long-Term Hydrologic Time Series Prediction with LSPM|Sicheng Zhou, David C. Anastasiu||||code|0|
|Boosting Entity Recognition by leveraging Cross-task Domain Models for Weak Supervision|Sanjay Agrawal, Srujana Merugu, Vivek Sembium||||code|0|
|PlayBest: Professional Basketball Player Behavior Synthesis via Planning with Diffusion|Xiusi Chen, WeiYao Wang, Ziniu Hu, David Reynoso, Kun Jin, Mingyan Liu, P. Jeffrey Brantingham, Wei Wang||||code|0|
|GraphScale: A Framework to Enable Machine Learning over Billion-node Graphs|Vipul Gupta, Xin Chen, Ruoyun Huang, Fanlong Meng, Jianjun Chen, Yujun Yan||||code|0|
|Cryptocurrency Price Forecasting using Variational Autoencoder with Versatile Quantile Modeling|Sungchul Hong, SeungHwan An, JongJune Jeon||||code|0|
|DAMOCRO: A Data Migration Framework Using Online Classification and Reordering|Zhongxin Hu, Kaiyu Li, Xingjian Mao, Jingfeng Pan, Yunfei Peng, Aijun An, Xiaohui Yu, Dariusz Jania||||code|0|
|XCapsUTL: Cross-domain Unsupervised Transfer Learning Framework using a Capsule Neural Network|Naman Khetan, Sanyog Dewani, Gokul Swamy, Vikalp Gajbhiye||||code|0|
|EFfECT-RL: Enabling Framework for Establishing Causality and Triggering engagement through RL|Debanjan Sadhukhan, Deepanshi Seth, Sanjay Agrawal, Tridib Mukherjee||||code|0|
|CPFD: Confidence-aware Privileged Feature Distillation for Short Video Classification|Jinghao Shi, Xiang Shen, Kaili Zhao, Xuedong Wang, Vera Wen, Zixuan Wang, Yifan Wu, Zhixin Zhang||||code|0|
|Dynamic Graph-based Deep Reinforcement Learning with Long and Short-term Relation Modeling for Portfolio Optimization|Haoyu Sun, Yuxuan Bian, Li Han, Peng Zhu, Dawei Cheng, Yuqi Liang||||code|0|
|Behavior-aware Sparse Trajectory Recovery in Last-mile Delivery with Multi-scale Attention Fusion|Hai Wang, Shuai Wang, Li Lin, Yu Yang, Shuai Wang, Hongkai Wen||||code|0|
|CourIRL: Predicting Couriers' Behavior in Last-Mile Delivery Using Crossed-Attention Inverse Reinforcement Learning|Shuai Wang, Tongtong Kong, Baoshen Guo, Li Lin, Haotian Wang||||code|0|
|Multiscale Representation Enhanced Temporal Flow Fusion Model for Long-Term Workload Forecasting|Shiyu Wang, Zhixuan Chu, Yinbo Sun, Yu Liu, Yuliang Guo, Yang Chen, Huiyang Jian, Lintao Ma, Xingyu Lu, Jun Zhou||||code|0|
|Revolutionizing Biomarker Discovery: Leveraging Generative AI for Bio-Knowledge-Embedded Continuous Space Exploration|Wangyang Ying, Dongjie Wang, Xuanming Hu, Ji Qiu, Jin Park, Yanjie Fu||||code|0|
|A Behavior-aware Cause Identification Framework for Order Cancellation in Logistics Service|Shuxin Zhong, Yahan Gu, Wenjun Lyu, Hongyu Lin, Guang Yang, Yao Lu, Guang Wang, Yu Yang, Desheng Zhang||||code|0|
|LSR-IGRU: Stock Trend Prediction Based on Long Short-Term Relationships and Improved GRU|Peng Zhu, Yuante Li, Yifan Hu, Qinyuan Liu, Dawei Cheng, Yuqi Liang||||code|0|
|RevEx: An Online Consumer Reviews Extraction Tool|Julián Alarte, Carlos Galindo, Carlos Martín, Josep Silva||||code|0|
|Introducing CausalBench: A Flexible Benchmark Framework for Causal Analysis and Machine Learning|Ahmet Kapkiç, Pratanu Mandal, Shu Wan, Paras Sheth, Abhinav Gorantla, Yoonhyuk Choi, Huan Liu, K. Selçuk Candan||||code|0|
|Multi-Graph Explorer: A Framework for Advanced Multi-Graph Analysis and Method Development|Yorgos Tsitsikas, Evangelos E. Papalexakis||||code|0|
|GongBu: Easily Fine-tuning LLMs for Domain-specific Adaptation|Bolin Zhang, Yimin Tian, Shengwei Wang, Zhiying Tu, Dianhui Chu, Zhiqi Shen||||code|0|
|Covid19-twitter: A Twitter-based Dataset for Discourse Analysis in Sentence-level Sentiment Classification|Shashank Gupta, Mohamed Reda Bouadjenek, Antonio Robles-Kelly, Tsz-Kwan Lee, Thanh Thi Nguyen, Asef Nazari, Dhananjay R. Thiruvady||||code|0|
|CH-Mits: A Cross-Modal Dataset for User Sentiment Analysis on Chinese Social Media|Juhao Ma, Shuai Xu, Yilin Liu, Xiaoming Fu||||code|0|
|The Veracity Problem: Detecting False Information and its Propagation on Online Social Media Networks|Sarah Condran||||code|0|
|Evaluating Social Media Reach via Mainstream Media Discourse|Himarsha R. Jayanetti||||code|0|
|PTM-Mamba: A PTM-Aware Protein Language Model with Bidirectional Gated Mamba Blocks|Zhangzhi Peng||||code|0|
|Towards Making Effective Machine Learning Decisions Against Out-of-Distribution Data|Lakpa Dorje Tamang||||code|0|
|The 'Path' to Clarity: Identifying False Claims Through a Knowledge Graph Exploration|Wenbo Wang||||code|0|
|Hands-On Introduction to Quantum Machine Learning|Samuel YenChi Chen, Joongheon Kim||||code|0|
|Data Quality-aware Graph Machine Learning|Yu Wang, Kaize Ding, Xiaorui Liu, Jian Kang, Ryan A. Rossi, Tyler Derr||||code|0|
|Systems for Scalable Graph Analytics and Machine Learning: Trends and Methods|Da Yan, Lyuheng Yuan, Akhlaque Ahmad, Saugat Adhikari||||code|0|
|Neural Additive Tensor Decomposition for Sparse Tensors|Dawon Ahn, Uday Singh Saini, Evangelos E. Papalexakis, Ali Payani||||code|0|
|A Geometric Perspective for High-Dimensional Multiplex Graphs|Kamel Abdous, Nairouz Mrabah, Mohamed Bouguessa||||code|0|
|Ensembles for Outlier Detection and Evaluation|Charu C. Aggarwal||||code|0|
|Out-of-Distribution Aware Classification for Tabular Data|Amirhossein Ansari, Ke Wang, Pulei Xiong||||code|0|
|Spatio-temporal Graph Normalizing Flow for Probabilistic Traffic Prediction|Yang An, Zhibin Li, Wei Liu, Haoliang Sun, Meng Chen, Wenpeng Lu, Yongshun Gong||||code|0|
|Can LLMs Reason Like Humans? Assessing Theory of Mind Reasoning in LLMs for Open-Ended Questions|Maryam Amirizaniani, Elias Martin, Maryna Sivachenko, Afra Mashhadi, Chirag Shah||||code|0|
|Advances in Citation Text Generation: Leveraging Multi-Source Seq2Seq Models and Large Language Models|Avinash Anand, Ashwin R. Nair, Kritarth Prasad, Vrinda Narayan, Naman Lal, Debanjan Mahata, Yaman Singla, Rajiv Ratn Shah||||code|0|
|City Foundation Models for Learning General Purpose Representations from OpenStreetMap|Pasquale Balsebre, Weiming Huang, Gao Cong, Yi Li||||code|0|
|A Learning-based Approach for Explaining Language Models|Oren Barkan, Yonatan Toib, Yehonatan Elisha, Noam Koenigstein||||code|0|
|Discovering Denial Constraints Based on Deep Reinforcement Learning|Lingfeng Bian, Weidong Yang, Jingyi Xu, Zijing Tan||||code|0|
|Covering a Graph with Dense Subgraph Families, via Triangle-Rich Sets|Sabyasachi Basu, Daniel PaulPena, Kun Qian, C. Seshadhri, Edward W. Huang, Karthik Subbian||||code|0|
|Hierarchical Graph Latent Diffusion Model for Conditional Molecule Generation|Tian Bian, Yifan Niu, Heng Chang, Divin Yan, Junzhou Huang, Yu Rong, Tingyang Xu, Jia Li, Hong Cheng||||code|0|
|Finding MIDDLE Ground: Scalable and Secure Distributed Learning|Marco Bornstein, Nawaf Nazir, Ján Drgona, Soumya Kundu, Veronica Adetola||||code|0|
|MATCC: A Novel Approach for Robust Stock Price Prediction Incorporating Market Trends and Cross-time Correlations|Zhiyuan Cao, Jiayu Xu, Chengqi Dong, Peiwen Yu, Tian Bai||||code|0|
|DiHAN: A Novel Dynamic Hierarchical Graph Attention Network for Fake News Detection|YaTing Chang, Zhibo Hu, Xiaoyu Li, Shuiqiao Yang, Jiaojiao Jiang, Nan Sun||||code|0|
|Improving Message-Passing GNNs by Asynchronous Aggregation|Jialong Chen, Tianchi Liao, Chuan Chen, Zibin Zheng||||code|0|
|ERASE: Error-Resilient Representation Learning on Graphs for Label Noise Tolerance|LingHao Chen, Yuanshuo Zhang, Taohua Huang, Liangcai Su, Zeyi Lin, Xi Xiao, Xiaobo Xia, Tongliang Liu||||code|0|
|Assessing Image Inpainting via Re-Inpainting Self-Consistency Evaluation|Tianyi Chen, Jianfu Zhang, Yan Hong, Yiyi Zhang, Liqing Zhang||||code|0|
|DTFormer: A Transformer-Based Method for Discrete-Time Dynamic Graph Representation Learning|Xi Chen, Yun Xiong, Siwei Zhang, Jiawei Zhang, Yao Zhang, Shiyang Zhou, Xixi Wu, Mingyang Zhang, Tengfei Liu, Weiqiang Wang||||code|0|
|Honest-Majority Maliciously Secure Skyline Queries on Outsourced Data|Yu Chen, Lin Liu, Rongmao Chen, Shaojing Fu, Yuexiang Yang||||code|0|
|SVIPTR: Fast and Efficient Scene Text Recognition with Vision Permutable Extractor|Xianfu Cheng, Weixiao Zhou, Xiang Li, Jian Yang, Hang Zhang, Tao Sun, Wei Zhang, Yuying Mai, Tongliang Li, Xiaoming Chen, Zhoujun Li||||code|0|
|TESSM: Tree-based Selective State Space Models for Efficient Join Order Selection Learning|Yaohui Chu, Yizhe Liu, Yue Zhang, Xuan Hou, Longfei Yu, Zhaohui Peng||||code|0|
|Automatic Large Language Model Evaluation via Peer Review|Zhumin Chu, Qingyao Ai, Yiteng Tu, Haitao Li, Yiqun Liu||||code|0|
|Empowering Private Tutoring by Chaining Large Language Models|Yulin Chen, Ning Ding, HaiTao Zheng, Zhiyuan Liu, Maosong Sun, Bowen Zhou||||code|0|
|Time is Not Enough: Time-Frequency based Explanation for Time-Series Black-Box Models|Hyunseung Chung, Sumin Jo, Yeonsu Kwon, Edward Choi||||code|0|
|PROSPECT: Learn MLPs on Graphs Robust against Adversarial Structure Attacks|Bowen Deng, Jialong Chen, Yanming Hu, Zhiyong Xu, Chuan Chen, Tao Zhang||||code|0|
|ByGCN: Spatial Temporal Byroad-Aware Graph Convolution Network for Traffic Flow Prediction in Road Networks|Tangpeng Dan, Xiao Pan, Bolong Zheng, Xiaofeng Meng||||code|0|
|ALDF: An Adaptive Logical Decision Framework for Multimodal Named Entity Recognition|Guohui Ding, Tianhao Jiang, Rui Zhou, Qian Gao||||code|0|
|DRFormer: Multi-Scale Transformer Utilizing Diverse Receptive Fields for Long Time-Series Forecasting|Ruixin Ding, Yuqi Chen, YuTing Lan, Wei Zhang||||code|0|
|Effective Illicit Account Detection on Large Cryptocurrency MultiGraphs|Zhihao Ding, Jieming Shi, Qing Li, Jiannong Cao||||code|0|
|SGOOD: Substructure-enhanced Graph-Level Out-of-Distribution Detection|Zhihao Ding, Jieming Shi, Shiqi Shen, Xuequn Shang, Jiannong Cao, Zhipeng Wang, Zhi Gong||||code|0|
|Boosting Certificate Robustness for Time Series Classification with Efficient Self-Ensemble|Chang George Dong, Zhengyang David Li, Liangwei Nathan Zheng, Weitong Chen, Wei Emma Zhang||||code|0|
|FZR: Enhancing Knowledge Transfer via Shared Factors Composition in Zero-Shot Relational Learning|Zhijun Dong, Likang Wu, Kai Zhang, Ye Liu, Yanghai Zhang, Zhi Li, Hongke Zhao, Enhong Chen||||code|0|
|Explainable Stock Price Movement Prediction using Contrastive Learning|Kelvin Du, Rui Mao, Frank Xing, Erik Cambria||||code|0|
|Towards Uncertainty Quantification for Time Series Segmentation|Erick Draayer, Huiping Cao||||code|0|
|iMIRACLE: An Iterative Multi-View Graph Neural Network to Model Intercellular Gene Regulation From Spatial Transcriptomic Data|Ziheng Duan, Siwei Xu, Cheyu Lee, Dylan Riffle, Jing Zhang||||code|0|
|Low Carbon Footprint Training for 1D-CNNs with Temporal Max-Pooling|Anandharaju Durai Raju, Ke Wang||||code|0|
|Integrating Fair Representation Learning with Fairness Regularization for Intersectional Group Fairness|David Quashigah Dzakpasu, Jixue Liu, Jiuyong Li, Lin Liu||||code|0|
|Probabilistic Path Integration with Mixture of Baseline Distributions|Yehonatan Elisha, Oren Barkan, Noam Koenigstein||||code|0|
|A Spatio-Temporal Diffusion Model for Missing and Real-Time Financial Data Inference|Yupeng Fang, Ruirui Liu, Huichou Huang, Peilin Zhao, Qingyao Wu||||code|0|
|PARs: Predicate-based Association Rules for Efficient and Accurate Anomaly Explanation|Cheng Feng||||code|0|
|SINKT: A Structure-Aware Inductive Knowledge Tracing Model with Large Language Model|Lingyue Fu, Hao Guan, Kounianhua Du, Jianghao Lin, Wei Xia, Weinan Zhang, Ruiming Tang, Yasheng Wang, Yong Yu||||code|0|
|Graph Local Homophily Network for Anomaly Detection|Ronghui Guo, Minghui Zou, Sai Zhang, Xiaowang Zhang, Zhizhi Yu, Zhiyong Feng||||code|0|
|Look Globally and Reason: Two-stage Path Reasoning over Sparse Knowledge Graphs|Saiping Guan, Jiyao Wei, Xiaolong Jin, Jiafeng Guo, Xueqi Cheng||||code|0|
|Tiled Bit Networks: Sub-Bit Neural Network Compression Through Reuse of Learnable Binary Vectors|Matt Gorbett, Hossein Shirazi, Indrakshi Ray||||code|0|
|MSTEM: Masked Spatiotemporal Event Series Modeling for Urban Undisciplined Events Forecasting|Zehao Gu, Shiyang Zhou, Yun Xiong, Yang Luo, Hongrun Ren, Qiang Wang, Xiaofeng Gao, Philip S. Yu||||code|0|
|Multi-Modal Sarcasm Detection via Graph Convolutional Network and Dynamic Network|Jiaqi Hao, Junfeng Zhao, Zhigang Wang||||code|0|
|On the Sensitivity of Individual Fairness: Measures and Robust Algorithms|Xinyu He, Jian Kang, Ruizhong Qiu, Fei Wang, Jose Sepulveda, Hanghang Tong||||code|0|
|NC2D: Novel Class Discovery for Node Classification|Yue Hou, Xueyuan Chen, He Zhu, Ruomei Liu, Bowen Shi, Jiaheng Liu, Junran Wu, Ke Xu||||code|0|
|Accurate Neural Network Option Pricing Methods with Control Variate Techniques and Data Synthesis/Cleaning with Financial Rationality|ChiaWei Hsu, TianShyr Dai, ChuanJu Wang, YingPing Chen||||code|0|
|PIECE: Protagonist Identification and Event Chronology Extraction for Enhanced Timeline Summarization|TzHuan Hsu, LiHsuan Chin, YenHao Huang, YiShin Chen||||code|0|
|Prompt-Based Spatio-Temporal Graph Transfer Learning|Junfeng Hu, Xu Liu, Zhencheng Fan, Yifang Yin, Shili Xiang, Savitha Ramasamy, Roger Zimmermann||||code|0|
|APTNESS: Incorporating Appraisal Theory and Emotion Support Strategies for Empathetic Response Generation|Yuxuan Hu, Minghuan Tan, Chenwei Zhang, Zixuan Li, Xiaodan Liang, Min Yang, Chengming Li, Xiping Hu||||code|0|
|A Payment Transaction Pre-training Model for Fraud Transaction Detection|Wenxi Huang, Zhangyi Zhao, Xiaojun Chen, Qin Zhang, Mark Junjie Li, Hanjing Su, Qingyao Wu||||code|0|
|Fast and Accurate PARAFAC2 Decomposition for Time Range Queries on Irregular Tensors|JunGi Jang, Yongchan Park, U Kang||||code|0|
|HiLite: Hierarchical Level-implemented Architecture Attaining Part-Whole Interpretability|Yoo Hyun Jeong, Sunghyun Hwang, DongKyu Chae||||code|0|
|GameTrail: Probabilistic Lifecycle Process Model for Deep Game Understanding|Shanyang Jiang, Lan Zhang, Hui Xu, Jiahui Huang, Qi He, Xing Zhou, Lei Huang, Jie Jiang||||code|0|
|Causality-Aware Spatiotemporal Graph Neural Networks for Spatiotemporal Time Series Imputation|Baoyu Jing, Dawei Zhou, Kan Ren, Carl Yang||||code|0|
|Tackling Noisy Clients in Federated Learning with End-to-end Label Correction|Xuefeng Jiang, Sheng Sun, Jia Li, Jingjing Xue, Runhan Li, Zhiyuan Wu, Gang Xu, Yuwei Wang, Min Liu||||code|0|
|Effectively Capturing Label Correlation for Tabular Multi-Label Classification|Sajjad Kamali Siahroudi, Zahra Ahmadi, Daniel Kudenko||||code|0|
|Transformer for Point Anomaly Detection|Harim Kim, Chang Ha Lee, Charmgil Hong||||code|0|
|PolarDSN: An Inductive Approach to Learning the Evolution of Network Polarization in Dynamic Signed Networks|MinJeong Kim, YeonChang Lee, SangWook Kim||||code|0|
|Enhancing Anomaly Detection via Generating Diversified and Hard-to-distinguish Synthetic Anomalies|Hyuntae Kim, Changhee Lee||||code|0|
|FaDE: A Face Segment Driven Identity Anonymization Framework For Fair Face Recognition|Ziyi Kou, Yijun Tian, Meng Jiang, Xiangliang Zhang||||code|0|
|Vision Language Model is NOT All You Need: Augmentation Strategies for Molecule Language Models|Namkyeong Lee, Siddhartha Laghuvarapu, Chanyoung Park, Jimeng Sun||||code|0|
|FastSimiFeat: A Fast and Generalized Approach Utilizing k-NN for Noisy Data Handling|Jungi Lee, Hwiwoo Park, Myounghwan Kim, Jiseong Yoon, Kwangsun Yoo, SeokJoo Byun||||code|0|
|Learning Fair Invariant Representations under Covariate and Correlation Shifts Simultaneously|Dong Li, Chen Zhao, Minglai Shao, Wenjun Wang||||code|0|
|Dynamic Neural Control Flow Execution: an Agent-Based Deep Equilibrium Approach for Binary Vulnerability Detection|Li Tao Li, Steven H. H. Ding, Andrew Walenstein, Philippe Charland, Benjamin C. M. Fung||||code|0|
|Integrating Structure and Text for Enhancing Hyper-relational Knowledge Graph Representation via Structure Soft Prompt Tuning|Lijie Li, Hui Wang, Jiahang Li, Xiaodi Xu, Ye Wang, Tao Ren||||code|0|
|Seeing the Forest for the Trees: Road-Level Insights Assisted Lane-Level Traffic Prediction|Shuhao Li, Yue Cui, Jingyi Xu, Jing Zhao, Fan Zhang, Weidong Yang, Xiaofang Zhou||||code|0|
|LLM-Empowered Few-Shot Node Classification on Incomplete Graphs with Real Node Degrees|Yun Li, Yi Yang, Jiaqi Zhu, Hui Chen, Hongan Wang||||code|0|
|Design Element Aware Poster Layout Generation|Yinan Li, Jia Chen, Yin Bai, Jia Cheng, Jun Lei||||code|0|
|Learning from Novel Knowledge: Continual Few-shot Knowledge Graph Completion|Zhuofeng Li, Haoxiang Zhang, Qiannan Zhang, Ziyi Kou, Shichao Pei||||code|0|
|Higher-order Spatio-temporal Physics-incorporated Graph Neural Network for Multivariate Time Series Imputation|Guojun Liang, Prayag Tiwari, Slawomir Nowaczyk, Stefan Byttner||||code|0|
|Towards Robust Vision Transformer via Masked Adaptive Ensemble|Fudong Lin, Jiadong Lou, Xu Yuan, NianFeng Tzeng||||code|0|
|Hierarchical Spatio-Temporal Graph Learning Based on Metapath Aggregation for Emergency Supply Forecasting|Li Lin, Kaiwen Xia, Anqi Zheng, Shijie Hu, Shuai Wang||||code|0|
|Self-Supervision Improves Diffusion Models for Tabular Data Imputation|Yixin Liu, Thalaiyasingam Ajanthan, Hisham Husain, Vu Nguyen||||code|0|
|KMCT: k-Means Clustering of Trajectories Efficiently in Location-Based Services|Yuanjun Liu, Guanfeng Liu, Qingzhi Ma, Zhixu Li, Shiting Wen, Lei Zhao, An Liu||||code|0|
|A Universal and Interpretable Method for Enhancing Stock Price Prediction|Yuchen Liu, Shimin Di, Lei Chen, Xiaofang Zhou, Fei Lin||||code|0|
|Multivariate Time-Series Anomaly Detection based on Enhancing Graph Attention Networks with Topological Analysis|Zhe Liu, Xiang Huang, Jingyun Zhang, Zhifeng Hao, Li Sun, Hao Peng||||code|0|
|MOAT: Graph Prompting for 3D Molecular Graphs|Qingqing Long, Yuchen Yan, Wentao Cui, Wei Ju, Zhihong Zhu, Yuanchun Zhou, Xuezhi Wang, Meng Xiao||||code|0|
|A Knowledge-Enhanced Transformer-FL Method for Fault Root Cause Localization|Zhe Lv, Yaqiong Liu, Xidian Wang, Peng Gao, Zhouyuan Li, Yuanzhen Jiang||||code|0|
|Hierarchical Structure Construction on Hypergraphs|Qi Luo, Wenjie Zhang, Zhengyi Yang, Dong Wen, Xiaoyang Wang, Dongxiao Yu, Xuemin Lin||||code|0|
|Data Void Exploits: Tracking & Mitigation Strategies|Miro Mannino, Junior Garcia, Reem Hazim, Azza Abouzied, Paolo Papotti||||code|0|
|PIP: Prototypes-Injected Prompt for Federated Class Incremental Learning|Muhammad Anwar Ma'sum, Mahardhika Pratama, Savitha Ramasamy, Lin Liu, Habibullah Habibullah, Ryszard Kowalczyk||||code|0|
|Link Polarity Prediction from Sparse and Noisy Labels via Multiscale Social Balance|Marco Minici, Federico Cinus, Francesco Bonchi, Giuseppe Manco||||code|0|
|LLaVA-Chef: A Multi-modal Generative Model for Food Recipes|Fnu Mohbat, Mohammed J. Zaki||||code|0|
|Let Silence Speak: Enhancing Fake News Detection with Generated Comments from Large Language Models|Qiong Nan, Qiang Sheng, Juan Cao, Beizhe Hu, Danding Wang, Jintao Li||||code|0|
|Saliency Detection in Educational Videos: Analyzing the Performance of Current Models, Identifying Limitations and Advancement Directions|Evelyn Navarrete, Ralph Ewerth, Anett Hoppe||||code|0|
|Towards Fair Graph Anomaly Detection: Problem, Benchmark Datasets, and Evaluation|Neng Kai Nigel Neo, YeonChang Lee, Yiqiao Jin, SangWook Kim, Srijan Kumar||||code|0|
|Cultural Commonsense Knowledge for Intercultural Dialogues|TuanPhong Nguyen, Simon Razniewski, Gerhard Weikum||||code|0|
|Reviving the Context: Camera Trap Species Classification as Link Prediction on Multimodal Knowledge Graphs|Vardaan Pahuja, Weidi Luo, Yu Gu, ChengHao Tu, HongYou Chen, Tanya Y. Berger-Wolf, Charles V. Stewart, Song Gao, WeiLun Chao, Yu Su||||code|0|
|The Impact of External Sources on the Friedkin-Johnsen Model|Charlotte Out, Sijing Tu, Stefan Neumann, Ahad N. Zehmakan||||code|0|
|Novelty-aware Graph Traversal and Expansion for Hierarchical Reinforcement Learning|Jongchan Park, Seungjun Oh, Yusung Kim||||code|0|
|Exploiting Pre-trained Models for Drug Target Affinity Prediction with Nearest Neighbors|Qizhi Pei, Lijun Wu, Zhenyu He, Jinhua Zhu, Yingce Xia, Shufang Xie, Rui Yan||||code|0|
|Towards Deconfounded Visual Question Answering via Dual-causal Intervention|Daowan Peng, Wei Wei||||code|0|
|Beyond Over-smoothing: Uncovering the Trainability Challenges in Deep Graph Neural Networks|Jie Peng, Runlin Lei, Zhewei Wei||||code|0|
|Bi-directional Learning of Logical Rules with Type Constraints for Knowledge Graph Completion|Kunxun Qi, Jianfeng Du, Hai Wan||||code|0|
|UniMEL: A Unified Framework for Multimodal Entity Linking with Large Language Models|Qi Liu, Yongyi He, Tong Xu, Defu Lian, Che Liu, Zhi Zheng, Enhong Chen||||code|0|
|PISeL: Pipelining DNN Inference for Serverless Computing|Masoud Rahimi Jafari, Jianchang Su, Yifan Zhang, Oliver Wang, Wei Zhang||||code|0|
|SmartHash: Perceptual Hashing for Image Tampering Detection and Authentication|Priyanka Samanta, Shweta Jain||||code|0|
|Mining Path Association Rules in Large Property Graphs|Yuya Sasaki, Panagiotis Karras||||code|0|
|Leveraging Trustworthy Node Attributes for Effective Network Alignment|DongHyuk Seo, JaeHwan Lim, WonYong Shin, SangWook Kim||||code|0|
|Structural Representation Learning and Disentanglement for Evidential Chinese Patent Approval Prediction|Jinzhi Shan, Qi Zhang, Chongyang Shi, Mengting Gui, Shoujin Wang, Usman Naseem||||code|0|
|Fast Human Action Recognition via Millimeter Wave Radar Point Cloud Sequences Learning|Tongfei Shao, Zheyu Du, Chuanyou Li, Tianxing Wu, Meng Wang||||code|0|
|Robust Federated Unlearning|Xinyi Sheng, Wei Bao, Liming Ge||||code|0|
|AgentRE: An Agent-Based Framework for Navigating Complex Information Landscapes in Relation Extraction|Yuchen Shi, Guochao Jiang, Tian Qiu, Deqing Yang||||code|0|
|Discovering Graph Generating Dependencies for Property Graph Profiling|Larissa Capobianco Shimomura, Nikolay Yakovets, George Fletcher||||code|0|
|XCrowd: Combining Explainability and Crowdsourcing to Diagnose Models in Relation Extraction|Alisa Smirnova, Jie Yang, Philippe Cudré-Mauroux||||code|0|
|HTFabric: A Fast Re-ordering and Parallel Re-execution Method for a High-Throughput Blockchain|Jaeyub Song, Juyeong Jeong, Jemin Lee, Inju Na, MinSoo Kim||||code|0|
|How Much Do Prompting Methods Help LLMs on Quantitative Reasoning with Irrelevant Information?|Seok Hwan Song, Wallapak Tavanapong||||code|0|
|Breaking the Bottleneck on Graphs with Structured State Spaces|Yunchong Song, Siyuan Huang, Jiacheng Cai, Xinbing Wang, Chenghu Zhou, Zhouhan Lin||||code|0|
|A Learning-path based Supervised Method for Concept Prerequisite Relations Extraction in Educational Data|Jingwen Sun, Yu He, Yiyu Xu, Jingwei Sun, Guangzhong Sun||||code|0|
|Multimodal Misinformation Detection using Large Vision-Language Models|Sahar Tahmasebi, Eric Müller-Budack, Ralph Ewerth||||code|0|
|EasyST: A Simple Framework for Spatio-Temporal Prediction|Jiabin Tang, Wei Wei, Lianghao Xia, Chao Huang||||code|0|
|GAS-Norm: Score-Driven Adaptive Normalization for Non-Stationary Time Series Forecasting in Deep Learning|Edoardo Urettini, Daniele Atzeni, Reshawn Ramjattan, Antonio Carta||||code|0|
|Causal Probing for Dual Encoders|Jonas Wallat, Hauke Hinrichs, Avishek Anand||||code|0|
|HC-GST: Heterophily-aware Distribution Consistency based Graph Self-training|Fali Wang, Tianxiang Zhao, Junjie Xu, Suhang Wang||||code|0|
|MMPolymer: A Multimodal Multitask Pretraining Framework for Polymer Property Prediction|Fanmeng Wang, Wentao Guo, Minjie Cheng, Shen Yuan, Hongteng Xu, Zhifeng Gao||||code|0|
|Trojan Activation Attack: Red-Teaming Large Language Models using Steering Vectors for Safety-Alignment|Haoran Wang, Kai Shu||||code|0|
|MANA-Net: Mitigating Aggregated Sentiment Homogenization with News Weighting for Enhanced Market Prediction|Mengyu Wang, Tiejun Ma||||code|0|
|DDIPrompt: Drug-Drug Interaction Event Prediction based on Graph Prompt Learning|Yingying Wang, Yun Xiong, Xixi Wu, Xiangguo Sun, Jiawei Zhang, Guangyong Zheng||||code|0|
|Inferring Information Diffusion Networks without Timestamps|Yuchen Wang, Dongpeng Hou, Chao Gao, Xianghua Li, Zhen Wang||||code|0|
|A Mixed-Curvature Graph Diffusion Model|Yujie Wang, Shuo Zhang, Junda Ye, Hao Peng, Li Sun||||code|0|
|GAD: A Generalized Framework for Anomaly Detection at Different Risk Levels|Rulan Wei, Zewei He, Martin Pavlovski, Fang Zhou||||code|0|
|OptDist: Learning Optimal Distribution for Customer Lifetime Value Prediction|Yunpeng Weng, Xing Tang, Zhenhao Xu, Fuyuan Lyu, Dugang Liu, Zexu Sun, Xiuqiang He||||code|0|
|Identifying Contemporaneous and Lagged Dependence Structures by Promoting Sparsity in Continuous-time Neural Networks|Fan Wu, Woojin Cho, David Korotky, Sanghyun Hong, Donsub Rim, Noseong Park, Kookjin Lee||||code|0|
|StatioCL: Contrastive Learning for Time Series via Non-Stationary and Temporal Contrast|Yu Yvonne Wu, Ting Dang, Dimitris Spathis, Hong Jia, Cecilia Mascolo||||code|0|
|Advancing Certified Robustness of Explanation via Gradient Quantization|Yang Xiao, Zijie Zhang, Yuchen Fang, Da Yan, Yang Zhou, WeiShinn Ku, Bo Hui||||code|0|
|GetCom: An Efficient and Generalizable Framework for Community Detection|Kaiyu Xiong, Yucheng Jin, Yun Xiong, Jiawei Zhang||||code|0|
|Editing Factual Knowledge and Explanatory Ability of Medical Large Language Models|Derong Xu, Ziheng Zhang, Zhihong Zhu, Zhenxi Lin, Qidong Liu, Xian Wu, Tong Xu, Wanyu Wang, Yuyang Ye, Xiangyu Zhao, Enhong Chen, Yefeng Zheng||||code|0|
|Contrasformer: A Brain Network Contrastive Transformer for Neurodegenerative Condition Identification|Jiaxing Xu, Kai He, Mengcheng Lan, Qingtian Bian, Wei Li, Tieying Li, Yiping Ke, Miao Qiao||||code|0|
|scACT: Accurate Cross-modality Translation via Cycle-consistent Training from Unpaired Single-cell Data|Siwei Xu, Junhao Liu, Jing Zhang||||code|0|
|Source Prompt: Coordinated Pre-training of Language Models on Diverse Corpora from Multiple Sources|Yipei Xu, Dakuan Lu, Jiaqing Liang, Jin Zhao, Xintao Wang, Hengkui Wu, Ken Chen, Liujiang Liu, Yingsi Xin, Xuepeng Liu, Yanghua Xiao, Zhixu Li||||code|0|
|CLR2G: Cross modal Contrastive Learning on Radiology Report Generation|Hongchen Xue, Qingzhi Ma, Guanfeng Liu, Jianfeng Qu, Yuanjun Liu, An Liu||||code|0|
|Enhancing the Completeness of Rationales for Multi-Step Question Answering|Shangzi Xue, Zhenya Huang, Xin Lin, Jiayu Liu, Longhu Qin, Tianhuang Su, Haifeng Liu, Qi Liu||||code|0|
|Predicting Scientific Impact Through Diffusion, Conformity, and Contribution Disentanglement|Zhikai Xue, Guoxiu He, Zhuoren Jiang, Sichen Gu, Yangyang Kang, Star Zhao, Wei Lu||||code|0|
|Buffalo: Biomedical Vision-Language Understanding with Cross-Modal Prototype and Federated Foundation Model Collaboration|Bingjie Yan, Qian Chen, Yiqiang Chen, Xinlong Jiang, Wuliang Huang, Bingyu Wang, Zhirui Wang, Chenlong Gao, Teng Zhang||||code|0|
|ST-ECP: A Novel Spatial-Temporal Framework for Energy Consumption Prediction of Vehicle Trajectory|Biao Yang, Yun Xiong, Xi Chen, Xuejing Feng, Meng Wang, Jun Ma||||code|0|
|MalLight: Influence-Aware Coordinated Traffic Signal Control for Traffic Signal Malfunctions|Qinchen Yang, Zejun Xie, Hua Wei, Desheng Zhang, Yu Yang||||code|0|
|Leveraging Local Structure for Improving Model Explanations: An Information Propagation Approach|Ruo Yang, Binghui Wang, Mustafa Bilgic||||code|0|
|TrafCL: Robust Encrypted Malicious Traffic Detection via Contrastive Learning|Xiaodu Yang, Sijie Ruan, Jinyu Li, Yinliang Yue, Bo Sun||||code|0|
|Breaking State-of-the-Art Poisoning Defenses to Federated Learning: An Optimization-Based Attack Framework|Yuxin Yang, Qiang Li, Chenfei Nie, Yuan Hong, Binghui Wang||||code|0|
|What a Surprise! Computing Rewritten Modules Can Be as Efficient as Computing Subset Modules|Zhihao Yang, Yizheng Zhao||||code|0|
|Adaptive Differentially Private Structural Entropy Minimization for Unsupervised Social Event Detection|Zhiwei Yang, Yuecen Wei, Haoran Li, Qian Li, Lei Jiang, Li Sun, Xiaoyan Yu, Chunming Hu, Hao Peng||||code|0|
|Combining Incomplete Observational and Randomized Data for Heterogeneous Treatment Effects|Dong Yao, Caizhi Tang, Qing Cui, Longfei Li||||code|0|
|CKNN: Cleansed k-Nearest Neighbor for Unsupervised Video Anomaly Detection|Jihun Yi, Sungroh Yoon||||code|0|
|GraphCBAL: Class-Balanced Active Learning for Graph Neural Networks via Reinforcement Learning|Chengcheng Yu, Jiapeng Zhu, Xiang Li||||code|0|
|Rethinking Attention Mechanism for Spatio-Temporal Modeling: A Decoupling Perspective in Traffic Flow Prediction|Qi Yu, Weilong Ding, Hao Zhang, Yang Yang, Tianpu Zhang||||code|0|
|Time-Series Representation Learning via Dual Reference Contrasting|Rui Yu, Yongshun Gong, Shoujin Wang, Jiasheng Si, Xueping Peng, Bing Xu, Wenpeng Lu||||code|0|
|Using Distributed Ledgers To Build Knowledge Graphs For Decentralized Computing Ecosystems|Tarek Zaarour, Ahmed Khalid, Preeja Pradeep, Ahmed H. Zahran||||code|0|
|Chain-of-Layer: Iteratively Prompting Large Language Models for Taxonomy Induction from Limited Examples|Qingkai Zeng, Yuyang Bai, Zhaoxuan Tan, Shangbin Feng, Zhenwen Liang, Zhihan Zhang, Meng Jiang||||code|0|
|Benchmarking Challenges for Temporal Knowledge Graph Alignment|Weixin Zeng, Jie Zhou, Xiang Zhao||||code|0|
|M2ConceptBase: A Fine-Grained Aligned Concept-Centric Multimodal Knowledge Base|Zhiwei Zha, Jiaan Wang, Zhixu Li, Xiangru Zhu, Wei Song, Yanghua Xiao||||code|0|
|Cost-Effective Framework with Optimized Task Decomposition and Batch Prompting for Medical Dialogue Summary|Chi Zhang, Tao Chen, Jiehao Chen, Hao Wang, Jiyun Shi, Zhaojing Luo, Meihui Zhang||||code|0|
|InfoMLP: Unlocking the Potential of MLPs for Semi-Supervised Learning with Structured Data|Hengrui Zhang, Qitian Wu, Chenxiao Yang, Philip S. Yu||||code|0|
|Revisit Orthogonality in Graph-Regularized MLPs|Hengrui Zhang, Shen Wang, Vassilis N. Ioannidis, Soji Adeshina, Jiani Zhang, Xiao Qin, Christos Faloutsos, Da Zheng, George Karypis, Philip S. Yu||||code|0|
|CYCLE: Cross-Year Contrastive Learning in Entity-Linking|Pengyu Zhang, Congfeng Cao, Klim Zaporojets, Paul Groth||||code|0|
|Data Imputation from the Perspective of Graph Dirichlet Energy|Weiqi Zhang, Guanlue Li, Jianheng Tang, Jia Li, Fugee Tsung||||code|0|
|DPCAG: A Community Affiliation Graph Generation Model for Preserving Group Relationships|Xinjian Zhang, Bo Ning, Chengfei Liu||||code|0|
|A GAIL Fine-Tuned LLM Enhanced Framework for Low-Resource Knowledge Graph Question Answering|Zhiqiang Zhang, Liqiang Wen, Wen Zhao||||code|0|
|NeutronCache: An Efficient Cache-Enhanced Distributed Graph Neural Network Training System|Chu Zhao, Shengjie Dong, Yuhai Zhao, Yuan Li||||code|0|
|Towards Coarse-grained Visual Language Navigation Task Planning Enhanced by Event Knowledge Graph|Kaichen Zhao, Yaoxian Song, Haiquan Zhao, Haoyu Liu, Tiefeng Li, Zhixu Li||||code|0|
|Zero-shot Knowledge Graph Question Generation via Multi-agent LLMs and Small Models Synthesis|Runhao Zhao, Jiuyang Tang, Weixin Zeng, Ziyang Chen, Xiang Zhao||||code|0|
|Devil in the Tail: A Multi-Modal Framework for Drug-Drug Interaction Prediction in Long Tail Distinction|Liangwei Nathan Zheng, Chang George Dong, Wei Emma Zhang, Xin Chen, Lin Yue, Weitong Chen||||code|0|
|Irregularity-Informed Time Series Analysis: Adaptive Modelling of Spatial and Temporal Dynamics|Liangwei Nathan Zheng, Zhengyang Li, Chang George Dong, Wei Emma Zhang, Lin Yue, Miao Xu, Olaf Maennel, Weitong Chen||||code|0|
|FGITrans: Cross-City Transformer for Fine-grained Urban Flow Inference|Yuhao Zheng, Yishuo Cai, Zihao Cai, Changjun Fan, Senzhang Wang, Jianxin Wang||||code|0|
|AdaTM: Fine-grained Urban Flow Inference with Adaptive Knowledge Transfer across Multiple Cities|Yuhao Zheng, Jinyang Wu, Zihao Cai, Senzhang Wang, Jianxin Wang||||code|0|
|AdaTrans: Adaptive Transfer Time Prediction for Multi-modal Transportation Modes|Shuxin Zhong, Hua Wei, Wenjun Lyu, Guang Yang, Zhiqing Hong, Guang Wang, Yu Yang, Desheng Zhang||||code|0|
|Learning Cross-modal Knowledge Reasoning and Heuristic-prompt for Visual-language Navigation|Dongming Zhou, Zhengbin Pang, Wei Li||||code|0|
|LST2A: Lexical-Syntactic Targeted Adversarial Attack for Texts|Guanghao Zhou, Panjia Qiu, Mingyuan Fan, Cen Chen, Yaliang Li, Wenmeng Zhou||||code|0|
|MTSCI: A Conditional Diffusion Model for Multivariate Time Series Consistent Imputation|Jianping Zhou, Junhao Li, Guanjie Zheng, Xinbing Wang, Chenghu Zhou||||code|0|
|Graph Anomaly Detection with Adaptive Node Mixup|Qinghai Zhou, Yuzhong Chen, Zhe Xu, Yuhang Wu, Menghai Pan, Mahashweta Das, Hao Yang, Hanghang Tong||||code|0|
|REDI: Recurrent Diffusion Model for Probabilistic Time Series Forecasting|Shiyang Zhou, Zehao Gu, Yun Xiong, Yang Luo, Qiang Wang, Xiaofeng Gao||||code|0|
|Scalable Transformer for High Dimensional Multivariate Time Series Forecasting|Xin Zhou, Weiqing Wang, Wray L. Buntine, Shilin Qu, Abishek Sriramulu, Weicong Tan, Christoph Bergmeir||||code|0|
|Regularized Unconstrained Weakly Submodular Maximization|Yanhui Zhu, Samik Basu, A. Pavan||||code|0|
|PRISM: Mitigating EHR Data Sparsity via Learning from Missing Feature Calibrated Prototype Patient Representations|Yinghao Zhu, Zixiang Wang, Long He, Shiyun Xie, Xiaochen Zheng, Liantao Ma, Chengwei Pan||||code|0|
|L-APPLE: Language-agnostic Prototype Prefix Learning for Cross-lingual Event Detection|Ziqin Zhu, Xutan Peng, Qian Li, Cheng Ji, Qingyun Sun, Jianxin Li||||code|0|
|MV-BART: Multi-view BART for Multi-modal Sarcasm Detection|Xingjie Zhuang, Fengling Zhou, Zhixin Li||||code|0|
|Enhancing Event Detection with Inter-Event Dependencies in Large Ontologies|Samireh Abdi||||code|0|
|COSCO: A Sharpness-Aware Training Framework for Few-shot Multivariate Time Series Classification|Jesus Barreda, Ashley Gomez, Ruben Puga, Kaixiong Zhou, Li Zhang||||code|0|
|Accurate Path Prediction of Provenance Traces|Raza Ahmad, HeeYoung Jung, Yuta Nakamura, Tanu Malik||||code|0|
|Fractional Budget Allocation for Influence Maximization under General Marketing Strategies|Akhil Bhimaraju, Eliot W. Robson, Lav R. Varshney, Abhishek K. Umrawal||||code|0|
|IEcons: A New Consensus Approach Using Multi-Text Representations for Clustering Task|Karima Boutalbi, Rafika Boutalbi, Hervé Verjus, Kavé Salamatian, David Telisson, Olivier Le Van||||code|0|
|Scalable Unsupervised Feature Selection with Reconstruction Error Guarantees via QMR Decomposition|Ciwan Ceylan, Kambiz Ghoorchian, Danica Kragic||||code|0|
|End-to-End Aspect Based Sentiment Analysis Using Graph Attention Network|Abir Chakraborty||||code|0|
|Deep Noise-Aware Quality Loss for Speaker Verification|Pantid Chantangphol, Theerat Sakdejayont, Monchai Lertsutthiwong, Tawunrat Chalothorn||||code|0|
|Empowering LLMs for Multi-Page Layout Generation via Consistency-Oriented In-Context Learning|Mengyao Chen, Xinghua Zhang, Junhao Zhang, Quangang Li, Tingwen Liu||||code|0|
|CMG: A Causality-enhanced Multi-view Graph Model for Stock Trend Prediction|Xi Cheng, Liang Wang, Yunan Zeng, Qiang Liu||||code|0|
|MSG-Chart: Multimodal Scene Graph for ChartQA|Yue Dai, Soyeon Caren Han, Wei Liu||||code|0|
|Quantifying Uncertainty in Neural Networks through Residuals|Dalavai Udbhav Mallanna, Rini Smita Thakur, Rajeev Ranjan Dwivedi, Vinod K. Kurmi||||code|0|
|A Systematic Evaluation of Generated Time Series and Their Effects in Self-Supervised Pretraining|Audrey Der, ChinChia Michael Yeh, Xin Dai, Huiyuan Chen, Yan Zheng, Yujie Fan, Zhongfang Zhuang, Vivian Lai, Junpeng Wang, Liang Wang, Wei Zhang, Eamonn J. Keogh||||code|0|
|Quantum Inverse Contextual Vision Transformers (Q-ICVT): A New Frontier in 3D Object Detection for AVs|Sanjay Bhargav Dharavath, Tanmoy Dam, Supriyo Chakraborty, Prithwiraj Roy, Aniruddha Maiti||||code|0|
|Efficient Global Message Passing for Heterophilous Graphs|Yanfei Dong, Mohammed Haroon Dupty, Lambert Deng, Yong Liang Goh, Wee Sun Lee||||code|0|
|General Time Transformer: an Encoder-only Foundation Model for Zero-Shot Multivariate Time Series Forecasting|Cheng Feng, Long Huang, Denis Krompass||||code|0|
|Effective Clean-Label Backdoor Attacks on Graph Neural Networks|Xuanhao Fan, Enyan Dai||||code|0|
|Retrogressive Document Manipulation of US Federal Environmental Websites|Lesley Frew, Michael L. Nelson, Michele C. Weigle||||code|0|
|Application of Large Language Models in Chemistry Reaction Data Extraction and Cleaning|Xiaobao Huang, Mihir Surve, Yuhan Liu, Tengfei Luo, Olaf Wiest, Xiangliang Zhang, Nitesh V. Chawla||||code|0|
|Error Detection and Constraint Recovery in Hierarchical Multi-Label Classification without Prior Knowledge|Joshua Shay Kricheli, Khoa Vo, Aniruddha Datta, Spencer Ozgur, Paulo Shakarian||||code|0|
|Learning Prompt-Level Quality Variance for Cost-Effective Text-to-Image Generation|Dongkeun Lee, Wonjun Lee||||code|0|
|HypMix: Hyperbolic Representation Learning for Graphs with Mixed Hierarchical and Non-hierarchical Structures|Eric Wonhee Lee, Bo Xiong, Carl Yang, Joyce C. Ho||||code|0|
|Document-Level Relation Extraction Based on Heterogeneous Graph Reasoning|Dong Li, Miao Li, ZhiLei Lei, Baoyan Song, Xiaohuan Shan||||code|0|
|Beyond Aggregation: Efficient Federated Model Consolidation with Heterogeneity-Adaptive Weights Diffusion|Jiaqi Li, Xiaoyang Qu, Wenbo Ding, Zihao Zhao, Jianzong Wang||||code|0|
|ChefFusion: Multimodal Foundation Model Integrating Recipe and Food Image Generation|Peiyu Li, Xiaobao Huang, Yijun Tian, Nitesh V. Chawla||||code|0|
|Coresets for Deletion-Robust k-Center Clustering|Ruien Li, Yanhao Wang, Michael Mathioudakis||||code|0|
|Effective Job-market Mobility Prediction with Attentive Heterogeneous Knowledge Learning and Synergy|Sida Lin, Zhouyi Zhang, Yankai Chen, Chenhao Ma, Yixiang Fang, Shan Dai, Guangli Lu||||code|0|
|An Explainable Multi-atlas Fusion Model based on Spatial Overlap for ASD Diagnosis|Yuefeng Ma, Xiaochen Mu, Tengfei Zhang||||code|0|
|ToxVI: a Multimodal LLM-based Framework for Generating Intervention in Toxic Code-Mixed Videos|Krishanu Maity, A. S. Poornash, Sriparna Saha, Kitsuchart Pasupa||||code|0|
|Extended Japanese Commonsense Morality Dataset with Masked Token and Label Enhancement|Takumi Ohashi, Tsubasa Nakagawa, Hitoshi Iyatomi||||code|0|
|Progressive Label Disambiguation for Partial Label Learning in Homogeneous Graphs|Rajat Patel, Aakarsh Malhotra, Sudipta Modak, Siddharth Yerramsetty||||code|0|
|MEDFuse: Multimodal EHR Data Fusion with Masked Lab-Test Modeling and Large Language Models|Phan Nguyen Minh Thao, CongTinh Dao, Chenwei Wu, JianZhe Wang, Shun Liu, JunEn Ding, David S. Restrepo, Feng Liu, FangMing Hung, WenChih Peng||||code|0|
|Improving German News Clustering with Contrastive Learning|Piriyakorn Piriyatamwong, Saikishore Kalloori, Fabio Zünd||||code|0|
|Hol-Light: A Holistic framework for Efficient and Dynamic Traffic Signal Management|Siyao Qiao, Jia Wu||||code|0|
|ExPrompt: Augmenting Prompts Using Examples as Modern Baseline for Stance Classification|Umair Qudus, Michael Röder, Daniel Vollmers, AxelCyrille Ngonga Ngomo||||code|0|
|A Mixture of Experts in Forecasting Student Performance in Classroom Programming Activities|Moqsadur Rahman, Monika Akbar, Justice T. Walker, Mahmud Shahriar Hossain||||code|0|
|Compressed Models are NOT Miniature Versions of Large Models|Rohit Raj Rai, Rishant Pal, Amit Awekar||||code|0|
|Generative AI for Energy: Multi-Horizon Power Consumption Forecasting using Large Language Models|Kevin Roitero, Gianluca D'Abrosca, Andrea Zancola, Vincenzo Della Mea, Stefano Mizzaro||||code|0|
|Scalable Expressiveness through Preprocessed Graph Perturbations|Danial Saber, Amirali Salehi-Abari||||code|0|
|EDGE: Evaluation Framework for Logical vs. Subgraph Explanations for Node Classifiers on Knowledge Graphs|Rupesh Sapkota, Dominik Köhler, Stefan Heindorf||||code|0|
|Empowering Traffic Speed Prediction with Auxiliary Feature-Aided Dependency Learning|DongHyuk Seo, Jiwon Son, Namhyuk Kim, WonYong Shin, SangWook Kim||||code|0|
|QuestGen: Effectiveness of Question Generation Methods for Fact-Checking Applications|Ritvik Setty, Vinay Setty||||code|0|
|M2IoU: A Min-Max Distance-based Loss Function for Bounding Box Regression in Medical Imaging|Anurag Shandilya, Kalash Shah, Bhavik Kanekar, Akshat Gautam, Pavni Tandon, Ganesh Ramakrishnan, Kshitij S. Jadhav||||code|0|
|Enhancing SPARQL Generation by Triplet-order-sensitive Pre-training|Chang Su, Jiexing Qi, He Yan, Kai Zou, Zhouhan Lin||||code|0|
|Revealing the Power of Masked Autoencoders in Traffic Forecasting|Jiarui Sun, Yujie Fan, ChinChia Michael Yeh, Wei Zhang, Girish Chowdhary||||code|0|
|Spatio-Temporal Sequence Modeling for Traffic Signal Control|Qian Sun, Le Zhang, Jingbo Zhou, Rui Zha, Yu Mei, Chujie Tian, Hui Xiong||||code|0|
|SurvReLU: Inherently Interpretable Survival Analysis via Deep ReLU Networks|Xiaotong Sun, Peijie Qiu, Shengfan Zhang||||code|0|
|Over-penalization for Extra Information in Neural IR Models|Kota Usuha, Makoto P. Kato, Sumio Fujita||||code|0|
|Enhancing Temporal and Geographical Named Entity Recognition in Chinese Ancient Texts with External Time-series Knowledge Bases|Xiaotong Wang, Xuanning Liu, Shuai Zhong, Xinming Chen, Bin Wu||||code|0|
|Does Knowledge Localization Hold True? Surprising Differences Between Entity and Relation Perspectives in Language Models|Yifan Wei, Xiaoyan Yu, Yixuan Weng, Huanhuan Ma, Yuanzhe Zhang, Jun Zhao, Kang Liu||||code|0|
|DP-FedFace: Privacy-Preserving Facial Recognition in Real Federated Scenarios|Wenjing Wang, Si Li||||code|0|
|Attentional Neural Integral Equation for Temporal Knowledge Graph Forecasting|Likang Xiao, Zijie Chen, Richong Zhang, Junfan Chen||||code|0|
|MPHDetect: Multi-View Prompting and Hypergraph Fusion for Malevolence Detection in Dialogues|Bo Xu, Xuening Qiao, Hongfei Lin, Linlin Zong||||code|0|
|SparseBF: Enhancing Scalability and Efficiency for Sparsely Filled Privacy-Preserving Record Linkage|Han Xu, Yuhong Shao, Kareem Benaissa, Yutong Li||||code|0|
|Learning Counterfactual Explanations with Intervals for Time-series Classification|Akihiro Yamaguchi, Ken Ueno, Ryusei Shingaki, Hisashi Kashima||||code|0|
|GeoReasoner: Reasoning On Geospatially Grounded Context For Natural Language Understanding|Yibo Yan, Joey Lee||||code|0|
|Multi-Scale Contrastive Attention Representation Learning for Encrypted Traffic Classification|Shuo Yang, Xinran Zheng, Jinze Li, Jinfeng Xu, Edith C. H. Ngai||||code|0|
|You Can't Ignore Either: Unifying Structure and Feature Denoising for Robust Graph Learning|Tianmeng Yang, Jiahao Meng, Min Zhou, Yaming Yang, Yujing Wang, Xiangtai Li, Yunhai Tong||||code|0|
|CAG: A Consistency-Adaptive Text-Image Alignment Generation for Joint Multimodal Entity-Relation Extraction|Xinjie Yang, Xiaocheng Gong, Binghao Tang, Yang Lei, Yayue Deng, Huan Ouyang, Gang Zhao, Lei Luo, Yunling Feng, Bin Duan, Si Li, Yajing Xu||||code|0|
|Robust Heterophily Graph Learning via Uniformity Augmentation|Xusheng Yang, Zhengyu Chen, Yuexian Zou||||code|0|
|Multi-Stage Refined Visual Captioning for Baidu Ad Creatives Generation|Yi Yang, Xinyu Zhao, Kang Zhao, Zhipeng Jin, Wen Tao, Lin Liu, Shuanglong Li||||code|0|
|BART-based Hierarchical Attentional Network for Sentence Ordering|Yiping Yang, Baiyun Cui, Yingming Li||||code|0|
|Span Confusion is All You Need for Chinese Spelling Correction|Dezhi Ye, Haomei Jia, Bowen Tian, Jie Liu, Haijin Liang, Jin Ma, Wenmin Wang||||code|0|
|GaQR: An Efficient Generation-augmented Question Rewriter|Oliver Young, Yixing Fan, Ruqing Zhang, Jiafeng Guo, Maarten de Rijke, Xueqi Cheng||||code|0|
|Meta-Prompt Tuning Vision-Language Model for Multi-Label Few-Shot Image Recognition|Feng Zhang, Wei Chen, Fei Ding, Tengjiao Wang, Dawei Lu, Jiabin Zheng||||code|0|
|Multi-view Temporal Knowledge Graph Reasoning|Fuwei Zhang, Zhao Zhang, Fuzhen Zhuang, Zhiqiang Zhang, Jun Zhou, Deqing Wang||||code|0|
|Evolving to the Future: Unseen Event Adaptive Fake News Detection on Social Media|Jiajun Zhang, Zhixun Li, Qiang Liu, Shu Wu, Zilei Wang, Liang Wang||||code|0|
|H2D: Hierarchical Heterogeneous Graph Learning Framework for Drug-Drug Interaction Prediction|Ran Zhang, Xuezhi Wang, Sheng Wang, Kunpeng Liu, Yuanchun Zhou, Pengfei Wang||||code|0|
|In Situ Answer Sentence Selection at Web-scale|Zeyu Zhang, Thuy Vu, Alessandro Moschitti||||code|0|
|CNN to GNN: Unsupervised Multi-level Knowledge Learning|Ziheng Jiao, Hongyuan Zhang, Xuelong Li||||code|0|
|A Structural Information Guided Hierarchical Reconstruction for Graph Anomaly Detection|Dongcheng Zou, Hao Peng, Chunyang Liu||||code|0|
|UGAD: Universal Generative AI Detector utilizing Frequency Fingerprints|Inzamamul Alam, Muhammad Shahid Muneer, Simon S. Woo||||code|0|
|iRAG: Advancing RAG for Videos with an Incremental Approach|Md. Adnan Arefeen, Biplob Debnath, Md. Yusuf Sarwar Uddin, Srimat Chakradhar||||code|0|
|Leveraging Large Language Models for Improving Keyphrase Generation for Contextual Targeting|Xiao Bai, Xue Wu, Ivan Stojkovic, Kostas Tsioutsiouliklis||||code|0|
|LLP-Bench: A Large Scale Tabular Benchmark for Learning from Label Proportions|Anand Brahmbhatt, Mohith Pokala, Rishi Saket, Aravindan Raghuveer||||code|0|
|DeepClair: Utilizing Market Forecasts for Effective Portfolio Selection|Donghee Choi, Jinkyu Kim, Mogan Gim, Jinho Lee, Jaewoo Kang||||code|0|
|Causal Interventional Prediction System for Robust and Explainable Effect Forecasting|Zhixuan Chu, Hui Ding, Guang Zeng, Shiyu Wang, Yiming Li||||code|0|
|Automated Nanoparticle Image Processing Pipeline for AI-Driven Materials Characterization|Alexandra L. Day, Carolin B. Wahl, Roberto dos Reis, Weikeng Liao, Vinayak P. Dravid, Alok N. Choudhary, Ankit Agrawal||||code|0|
|Parallel-friendly Spatio-Temporal Graph Learning for Photovoltaic Degradation Analysis at Scale|Yangxin Fan, Raymond Wieser, Laura S. Bruckman, Roger H. French, Yinghui Wu||||code|0|
|GraphWeaver: Billion-Scale Cybersecurity Incident Correlation|Scott Freitas, Amir Gharib||||code|0|
|PODTILE: Facilitating Podcast Episode Browsing with Auto-generated Chapters|Azin Ghazimatin, Ekaterina Garmash, Gustavo Penha, Kristen Sheets, Martin Achenbach, Oguz Semerci, Remi Galvez, Marcus Tannenberg, Sahitya Mantravadi, Divya Narayanan, Ofeliya Kalaydzhyan, Douglas Cole, Ben Carterette, Ann Clifton, Paul N. Bennett, Claudia Hauff, Mounia Lalmas||||code|0|
|CancerKG.ORG - A Web-scale, Interactive, Verifiable Knowledge Graph-LLM Hybrid for Assisting with Optimal Cancer Treatment and Care|Michael N. Gubanov, Anna Pyayt, Aleksandra Karolak||||code|0|
|Quality Prediction in Arc Welding: Leveraging Transformer Models and Discrete Representations from Vector Quantised-VAE|Yannik Hahn, Robert F. Maack, Hasan Tercan, Tobias Meisen, Marion Purrio, Guido Buchholz, Matthias Angerhausen||||code|0|
|Reinforcement Feature Transformation for Polymer Property Performance Prediction|Xuanming Hu, Dongjie Wang, Wangyang Ying, Yanjie Fu||||code|0|
|Robust Sequence-Based Self-Supervised Representation Learning for Anti-Money Laundering|Shuaibin Huang, Yun Xiong, Yi Xie, Tianyu Qiu, Guangzhong Wang||||code|0|
|LAPIS: Language Model-Augmented Police Investigation System|Heedou Kim, Dain Kim, Jiwoo Lee, Chanwoong Yoon, Donghee Choi, Mogan Gim, Jaewoo Kang||||code|0|
|XploitSQL: Advancing Adversarial SQL Injection Attack Generation with Language Models and Reinforcement Learning|Daniel Leung, Omar Tsai, Kourosh Hashemi, Bardia Tayebi, Mohammad A. Tayebi||||code|0|
|RealTCD: Temporal Causal Discovery from Interventional Data with Large Language Model|Peiwen Li, Xin Wang, Zeyang Zhang, Yuan Meng, Fang Shen, Yue Li, Jialong Wang, Yang Li, Wenwu Zhu||||code|0|
|Bridging Dynamic Factor Models and Neural Controlled Differential Equations for Nowcasting GDP|Seonkyu Lim, Jeongwhan Choi, Noseong Park, SangHa Yoon, ShinHyuck Kang, YoungMin Kim, Hyunjoong Kang||||code|0|
|Hierarchical Information Propagation and Aggregation in Disentangled Graph Networks for Audience Expansion|Li Lin, Xinyao Chen, Kaiwen Xia, Shuai Wang, Desheng Zhang, Tian He||||code|0|
|DECO: Cooperative Order Dispatching for On-Demand Delivery with Real-Time Encounter Detection|Yao Lu, Shuai Wang, Yu Yang, Hai Wang, Baoshen Guo, Desheng Zhang, Shuai Wang, Tian He||||code|0|
|Combat Greenwashing with GoalSpotter: Automatic Sustainability Objective Detection in Heterogeneous Reports|Mohammad Mahdavi, Ramin Baghaei Mehr, Tom Debus||||code|0|
|Multi-view Causal Graph Fusion Based Anomaly Detection in Cyber-Physical Infrastructures|Arun Vignesh Malarkkan, Dongjie Wang, Yanjie Fu||||code|0|
|Ericsogate: Advancing Analytics and Management of Data from Diverse Sources within Ericsson Using Knowledge Graphs|Abdelghny Orogat, Sri Lakshmi Vadlamani, Dimple Thomas, Ahmed ElRoby||||code|0|
|COKE: Causal Discovery with Chronological Order and Expert Knowledge in High Proportion of Missing Manufacturing Data|TingYun Ou, Ching Chang, WenChih Peng||||code|0|
|LawLLM: Law Large Language Model for the US Legal System|Dong Shu, Haoran Zhao, Xukun Liu, David Demeter, Mengnan Du, Yongfeng Zhang||||code|0|
|"Reasoning before Responding": Towards Legal Long-form Question Answering with Interpretability|Utkarsh Ujwal, Sai Sri Harsha Surampudi, Sayantan Mitra, Tulika Saha||||code|0|
|COIN: Chance-Constrained Imitation Learning for Safe and Adaptive Resource Oversubscription under Uncertainty|Lu Wang, Mayukh Das, Fangkai Yang, Chao Du, Bo Qiao, Hang Dong, Chetan Bansal, Si Qin, Saravan Rajmohan, Qingwei Lin, Dongmei Zhang, Qi Zhang||||code|0|
|RCAgent: Cloud Root Cause Analysis by Autonomous Agents with Tool-Augmented Large Language Models|Zefan Wang, Zichuan Liu, Yingying Zhang, Aoxiao Zhong, Jihong Wang, Fengbin Yin, Lunting Fan, Lingfei Wu, Qingsong Wen||||code|0|
|Process-Informed Deep Learning for Enhanced Order Fulfillment Cycle Time Prediction in On-Demand Grocery Retailing|Jiawen Wei, Ziwen Ye, Chuan Yang, Chen Chen, Guangrui Ma||||code|0|
|G2PTL: A Geography-Graph Pre-trained Model|Lixia Wu, Jianlin Liu, Junhong Lou, Minhui Deng, Jianbin Zheng, Haomin Wen, Chao Song, Shu He||||code|0|
|Deep Learning-Based Compressed Sensing for Mobile Device-Derived Sensor Data|Liqiang Xu, Yuuki Nishiyama, Kota Tsubouchi, Kaoru Sezaki||||code|0|
|Towards a Zero-Day Anomaly Detector in Cyber Physical Systems Using a Hybrid VAE-LSTM-OCSVM Model|Romarick Yatagha, Betelhem Nebebe, Karl Waedt, Christoph Ruland||||code|0|
|An End-to-End Reinforcement Learning Based Approach for Micro-View Order-Dispatching in Ride-Hailing|Xinlang Yue, Yiran Liu, Fangzhou Shi, Sihong Luo, Chen Zhong, Min Lu, Zhe Xu||||code|0|
|On the Fly Detection of Root Causes from Observed Data with Application to IT Systems|Lei Zan, Charles K. Assaad, Emilie Devijver, Éric Gaussier, Ali AïtBachir||||code|0|
|Scaling Vision-Language Foundation Model to 12 Billion Parameters in Baidu Dynamic Image Advertising|Xinyu Zhao, Kang Zhao, Zhipeng Jin, Yi Yang, Wen Tao, Xiaodong Chen, Cong Han, Shuanglong Li, Lin Liu||||code|0|
|Confidence-Aware Multi-Field Model Calibration|Yuang Zhao, Chuhan Wu, Qinglin Jia, Hong Zhu, Jia Yan, Libin Zong, Linxuan Zhang, Zhenhua Dong, Muyu Zhang||||code|0|
|Adaptive Cross-platform Transportation Time Prediction for Logistics|Shuxin Zhong, Wenjun Lyu, Zhiqing Hong, Guang Yang, Weijian Zuo, Haotian Wang, Guang Wang, Yu Yang, Desheng Zhang||||code|0|
|Understanding and Modeling Job Marketplace with Pretrained Language Models|Yaochen Zhu, Liang Wu, Binchi Zhang, Song Wang, Qi Guo, Liangjie Hong, Luke Simon, Jundong Li||||code|0|
|XplainScreen: Unveiling the Black Box of Graph Neural Network Drug Screening Models with a Unified XAI Framework|Geonhee Ahn, Md. Mahim Anjum Haque, Subhashis Hazarika, Soo Kyung Kim||||code|0|
|AuditLLM: A Tool for Auditing Large Language Models Using Multiprobe Approach|Maryam Amirizaniani, Elias Martin, Tanya Roosta, Aman Chadha, Chirag Shah||||code|0|
|Preserving Old Memories in Vivid Detail: Human-Interactive Photo Restoration Framework|SeungYeon Back, Geonho Son, Dahye Jeong, Eunil Park, Simon S. Woo||||code|0|
|FactCheckBureau: Build Your Own Fact-Check Analysis Pipeline|Oana Balalau, Pablo BertaudVelten, Younes El Fraihi, Garima Gaur, Oana Goga, Samuel Guimaraes, Ioana Manolescu, Brahim Saadi||||code|0|
|Music2P: A Multi-Modal AI-Driven Tool for Simplifying Album Cover Design|Joong Ho Choi, Geonyeong Choi, Ji Eun Han, Wonjin Yang, ZhiQi Cheng||||code|0|
|Shaded Route Planning Using Active Segmentation and Identification of Satellite Images|Longchao Da, Rohan Chhibba, Rushabh Jaiswal, Ariane Middel, Hua Wei||||code|0|
|A Skill Proficiency Framework for Workforce Learning and Development|Rebecca Dew, Mingzhao Li, Sandya Baratha Raj||||code|0|
|Human-in-the-Loop Feature Discovery for Tabular Data|Andra Ionescu, Zeger Mouw, Efthimia Aivaloglou, Rihan Hai, Asterios Katsifodimos||||code|0|
|DirDense: A Tool for Mining Dense Subgraphs from a Big Directed Graph|Jalal Khalil, Akhlaque Ahmad, Da Yan, Lyuheng Yuan, Saugat Adhikari, Yang Zhou, Zhe Jiang||||code|0|
|A One-Health Platform for Antimicrobial Resistance Data Analytics|Benoit Lange, Reza Akbarinia, Florent Masseglia||||code|0|
|DiaKoP: Dialogue-based Knowledge-oriented Programming for Neural-symbolic Knowledge Base Question Answering|Zhicheng Lee, Zhidian Huang, Zijun Yao, Jinxin Liu, Amy Xin, Lei Hou, Juanzi Li||||code|0|
|EHR-Based Mobile and Web Platform for Chronic Disease Risk Prediction Using Large Language Multimodal Models|ChunChieh Liao, WeiTing Kuo, IHsuan Hu, YenChen Shih, JunEn Ding, Feng Liu, FangMing Hung||||code|0|
|Demonstrating PARS: A Decision Support System for Developing Vertical Partitioning Plans|Pengju Liu, Kai Zhong, Cuiping Li, Hong Chen||||code|0|
|OpenTOS: Open-source System for Transfer Learning Bayesian Optimization|Peili Mao, Ke Li||||code|0|
|GARF: A Self-supervised Data Cleaning System with SeqGAN|Jinfeng Peng, Hanghai Cui, Derong Shen, Yue Kou, Tiezheng Nie, Tianlong Guo||||code|0|
|CourtsightTV: An Interactive Visualization Software for Labeling Key Basketball Moments|Alexander Russakoff, Kenny Miller, Vahid Mahzoon, Parsa Esmaeilkhani, Christine Cho, Jaffar Alzeidi, Sandro Hauri, Slobodan Vucetic||||code|0|
|A Scalable Tool for Democratizing Variant Calling on Human Genomes Using Commodity Clusters|Khawar Shehzad, Ajay Kumar, Matthew Schutz, Chase Webb, Polycarp Nalela, Manas Jyoti Das, Praveen Rao||||code|0|
|Demonstration of a Multi-agent Framework for Text to SQL Applications with Large Language Models|Chen Shen, Jin Wang, Sajjadur Rahman, Eser Kandogan||||code|0|
|LINKin-PARK: Land Valuation Information and Knowledge in Predictive Analysis and Reporting Kit via Dual Attention-DCCNN|TengYuan Tsou, ShihYu Lai, HsuanChing Chen, JungTsang Yeh, PeiXuan Li, TzuChang Lee, HsunPing Hsieh||||code|0|
|DeliLaw: A Chinese Legal Counselling System Based on a Large Language Model|Nan Xie, Yuelin Bai, Hengyuan Gao, Ziqiang Xue, Feiteng Fang, Qixuan Zhao, Zhijian Li, Liang Zhu, Shiwen Ni, Min Yang||||code|0|
|myCADI: my Contextual Anomaly Detection using Isolation|Véronne Yepmo, Grégory Smits||||code|0|
|Mastodoner: A Command-line Tool and Python Library for Public Data Collection from Mastodon|Haris Bin Zia, Ignacio Castro, Gareth Tyson||||code|0|
|DetCat: Detecting Categorical Outliers in Relational Datasets|Arthur Zylinski, Abdulhakim Ali Qahtan||||code|0|
|3DLNews: A Three-decade Dataset of US Local News Articles|Gangani Ariyarathne, Alexander C. Nwala||||code|0|
|BioMAISx: A Corpus for Aspect-Based Sentiment Analysis of Media Representations of Agricultural Biotechnologies in Africa|Patricia Chiril, Trevor Spreadbury, Joeva Rock, Brian DowdUribe, David Uminsky||||code|0|
|Moving Region Representations on the Spread of a Forest Fire|Henrique Macías da Silva, Tiago F. R. Ribeiro, Rogério Luís C. Costa, José Manuel Moreira||||code|0|
|pyPANTERA: A Python PAckage for Natural language obfuscaTion Enforcing pRivacy & Anonymization|Francesco Luigi De Faveri, Guglielmo Faggioli, Nicola Ferro||||code|0|
|VHAKG: A Multi-modal Knowledge Graph Based on Synchronized Multi-view Videos of Daily Activities|Shusaku Egami, Takanori Ugai, Swe Nwe Nwe Htun, Ken Fukuda||||code|0|
|A Generative Benchmark Creation Framework for Detecting Common Data Table Versions|Daniel C. Fox, Aamod Khatiwada, Roee Shraga||||code|0|
|Dataset Generation for Korean Urban Parks Analysis with Large Language Models|Honggu Kim, Minwoo Kang, Hyeyoung Choi, YunGyung Cheong||||code|0|
|EUvsDisinfo: A Dataset for Multilingual Detection of Pro-Kremlin Disinformation in News Articles|João Augusto Leite, Olesya Razuvayevskaya, Kalina Bontcheva, Carolina Scarton||||code|0|
|LeDQA: A Chinese Legal Case Document-based Question Answering Dataset|Bulou Liu, Zhenhao Zhu, Qingyao Ai, Yiqun Liu, Yueyue Wu||||code|0|
|Refining Wikidata Taxonomy using Large Language Models|Yiwen Peng, Thomas Bonald, Mehwish Alam||||code|0|
|InfinityMath: A Scalable Instruction Tuning Dataset in Programmatic Mathematical Reasoning|BoWen Zhang, Yan Yan, Lin Li, Guang Liu||||code|0|
|Advancing Multivariate Time Series Anomaly Detection: A Comprehensive Benchmark with Real-World Data from Alibaba Cloud|Chaoli Zhang, Yingying Zhang, Lanshu Peng, Qingsong Wen, Yiyuan Yang, ChongJiong Fan, Minqi Jiang, Lunting Fan, Liang Sun||||code|0|
|ELF-Gym: Evaluating Large Language Models Generated Features for Tabular Prediction|Yanlin Zhang, Ning Li, Quan Gan, Weinan Zhang, David Wipf, Minjie Wang||||code|0|
|M3: A Multi-Image Multi-Modal Entity Alignment Dataset|Shiqi Zhang, Weixin Zeng, Zhen Tan, Xiang Zhao, Weidong Xiao||||code|0|
|CheckGuard: Advancing Stolen Check Detection with a Cross-Modal Image-Text Benchmark Dataset|Fei Zhao, Jiawen Chen, Bin Huang, Chengcui Zhang, Gary Warner||||code|0|
|GeoAI for Natural Disaster Assessment|Saugat Adhikari||||code|0|
|Assessing Human Viewpoints in Theory of Mind for Large Language Models in Open-Ended Questioning|Maryam Amirizaniani||||code|0|
|Leveraging Knowledge Graphs and LLMs to Support and Monitor Legislative Systems|Andrea Colombo||||code|0|
|Realistic Synthetic Signed Network Generation and Analysis|Aikta Arya||||code|0|
|Demystifying Financial Texts Using Natural Language Processing|Sohom Ghosh||||code|0|
|Reliable Knowledge Graph Reasoning with Uncertainty Quantification|Bo Ni||||code|0|
|Graph-theoretical Approach to Enhance Accuracy of Financial Fraud Detection Using Synthetic Tabular Data Generation|DaeYoung Park||||code|0|
|Towards Effective Fusion and Forecasting of Multimodal Spatio-temporal Data for Smart Mobility|Chenxing Wang||||code|0|
|Causal Discovery from Heterogenous Multivariate Time Series|Lei Zan||||code|0|
|Submodular Optimization: Variants, Theory and Applications|Yanhui Zhu||||code|0|
|Fairness in Large Language Models in Three Hours|Thang Viet Doan, Zichong Wang, Nhat Nguyen Minh Hoang, Wenbin Zhang||||code|0|
|On the Use of Large Language Models for Table Tasks|Yuyang Dong, Masafumi Oyamada, Chuan Xiao, Haochen Zhang||||code|0|
|Tabular Data-centric AI: Challenges, Techniques and Future Perspectives|Yanjie Fu, Dongjie Wang, Hui Xiong, Kunpeng Liu||||code|0|
|Frontiers of Large Language Model-Based Agentic Systems - Construction, Efficacy and Safety|Jia He, Reshmi Ghosh, Kabir Walia, Jieqiu Chen, Tushar Dhadiwal, April Hazel, Chandra Inguva||||code|0|
|Towards Efficient Temporal Graph Learning: Algorithms, Frameworks, and Tools|Ruijie Wang, Wanyu Zhao, Dachun Sun, Charith Mendis, Tarek F. Abdelzaher||||code|0|
|Transforming Digital Forensics with Large Language Models: Unlocking Automation, Insights, and Justice|Eric Xu, Wenbin Zhang, Weifeng Xu||||code|0|
|Collecting and Analyzing Public Data from Mastodon|Haris Bin Zia, Ignacio Castro, Gareth Tyson||||code|0|
|Bridging Knowledge Gaps in LLMs via Function Calls|Kinjal Basu||||code|0|
|Planes, Trains and Automobiles: Leverage Multimodal In-Mission Signals for Shopping Journeys|Viet HaThuc, Shasha Li, Arnau Ramisa, Xinliang Zhu||||code|0|
|Towards Energy-Efficient Llama2 Architecture on Embedded FPGAs|Han Xu, Xingyuan Wang, Shihao Ji||||code|0|
|Trustworthy and Responsible AI for Information and Knowledge Management System|Huaming Chen, Jun Zhuang, Yu Yao, Wei Jin, Haohan Wang, Yong Xie, ChiHung Chi, KimKwang Raymond Choo||||code|0|
|Knowledge Graphs for Responsible AI|Edlira Vakaj, Nandana Mihindukulasooriya, Manas Gaur, Arijit Khan||||code|0|