arxiv.json (+35 lines)
@@ -38470,5 +38470,40 @@
 "pub_date": "2025-01-09",
 "summary": "In this work, we explore the application of Large Language Models to zero-shot Lay Summarisation. We propose a novel two-stage framework for Lay Summarisation based on real-life processes, and find that summaries generated with this method are increasingly preferred by human judges for larger models. To help establish best practices for employing LLMs in zero-shot settings, we also assess the ability of LLMs as judges, finding that they are able to replicate the preferences of human judges. Finally, we take the initial steps towards Lay Summarisation for Natural Language Processing (NLP) articles, finding that LLMs are able to generalise to this new domain, and further highlighting the greater utility of summaries generated by our proposed approach via an in-depth human evaluation.",
+"title": "kANNolo: Sweet and Smooth Approximate k-Nearest Neighbors Search",
+"url": "http://arxiv.org/abs/2501.06121v1",
+"pub_date": "2025-01-10",
+"summary": "Approximate Nearest Neighbors (ANN) search is a crucial task in several applications like recommender systems and information retrieval. Current state-of-the-art ANN libraries, although being performance-oriented, often lack modularity and ease of use. This translates into them not being fully suitable for easy prototyping and testing of research ideas, an important feature to enable. We address these limitations by introducing kANNolo, a novel research-oriented ANN library written in Rust and explicitly designed to combine usability with performance effectively. kANNolo is the first ANN library that supports dense and sparse vector representations made available on top of different similarity measures, e.g., euclidean distance and inner product. Moreover, it also supports vector quantization techniques, e.g., Product Quantization, on top of the indexing strategies implemented. These functionalities are managed through Rust traits, allowing shared behaviors to be handled abstractly. This abstraction ensures flexibility and facilitates an easy integration of new components. In this work, we detail the architecture of kANNolo and demonstrate that its flexibility does not compromise performance. The experimental analysis shows that kANNolo achieves state-of-the-art performance in terms of speed-accuracy trade-off while allowing fast and easy prototyping, thus making kANNolo a valuable tool for advancing ANN research. Source code available on GitHub: https://github.com/TusKANNy/kannolo.",
+"title": "Recommender Systems for Social Good: The Role of Accountability and\n Sustainability",
+"url": "http://arxiv.org/abs/2501.05964v1",
+"pub_date": "2025-01-10",
+"summary": "This work examines the role of recommender systems in promoting sustainability, social responsibility, and accountability, with a focus on alignment with the United Nations Sustainable Development Goals (SDGs). As recommender systems become increasingly integrated into daily interactions, they must go beyond personalization to support responsible consumption, reduce environmental impact, and foster social good. We explore strategies to mitigate the carbon footprint of recommendation models, ensure fairness, and implement accountability mechanisms. By adopting these approaches, recommender systems can contribute to sustainable and socially beneficial outcomes, aligning technological advancements with the SDGs focused on environmental sustainability and social well-being.",
+"title": "Navigating Tomorrow: Reliably Assessing Large Language Models\n Performance on Future Event Prediction",
+"url": "http://arxiv.org/abs/2501.05925v1",
+"pub_date": "2025-01-10",
+"summary": "Predicting future events is an important activity with applications across multiple fields and domains. For example, the capacity to foresee stock market trends, natural disasters, business developments, or political events can facilitate early preventive measures and uncover new opportunities. Multiple diverse computational methods for attempting future predictions, including predictive analysis, time series forecasting, and simulations have been proposed. This study evaluates the performance of several large language models (LLMs) in supporting future prediction tasks, an under-explored domain. We assess the models across three scenarios: Affirmative vs. Likelihood questioning, Reasoning, and Counterfactual analysis. For this, we create a dataset by finding and categorizing news articles based on entity type and popularity. We gather news articles before and after the LLMs' training cutoff date in order to thoroughly test and compare model performance. Our research highlights LLMs' potential and limitations in predictive modeling, providing a foundation for future improvements.",
+"title": "Text2Playlist: Generating Personalized Playlists from Text on Deezer",
+"url": "http://arxiv.org/abs/2501.05894v1",
+"pub_date": "2025-01-10",
+"summary": "The streaming service Deezer heavily relies on its search to help users navigate through its extensive music catalog. Nonetheless, it is primarily designed to find specific items and does not lead directly to a smooth listening experience. We present Text2Playlist, a stand-alone tool that addresses these limitations. Text2Playlist leverages generative AI, music information retrieval and recommendation systems to generate query-specific and personalized playlists, successfully deployed at scale.",
+"title": "VideoRAG: Retrieval-Augmented Generation over Video Corpus",
+"url": "http://arxiv.org/abs/2501.05874v1",
+"pub_date": "2025-01-10",
+"summary": "Retrieval-Augmented Generation (RAG) is a powerful strategy to address the issue of generating factually incorrect outputs in foundation models by retrieving external knowledge relevant to queries and incorporating it into their generation process. However, existing RAG approaches have primarily focused on textual information, with some recent advancements beginning to consider images, and they largely overlook videos, a rich source of multimodal knowledge capable of representing events, processes, and contextual details more effectively than any other modality. While a few recent studies explore the integration of videos in the response generation process, they either predefine query-associated videos without retrieving them according to queries, or convert videos into the textual descriptions without harnessing their multimodal richness. To tackle these, we introduce VideoRAG, a novel framework that not only dynamically retrieves relevant videos based on their relevance with queries but also utilizes both visual and textual information of videos in the output generation. Further, to operationalize this, our method revolves around the recent advance of Large Video Language Models (LVLMs), which enable the direct processing of video content to represent it for retrieval and seamless integration of the retrieved videos jointly with queries. We experimentally validate the effectiveness of VideoRAG, showcasing that it is superior to relevant baselines.",
+"translated": "检索增强生成(Retrieval-Augmented Generation, RAG)是一种强大的策略,旨在通过检索与查询相关的外部知识并将其整合到生成过程中,来解决基础模型生成事实错误输出的问题。然而,现有的RAG方法主要集中在文本信息上,尽管最近的一些进展开始考虑图像,但它们大多忽视了视频这一多模态知识的丰富来源。视频能够比任何其他模态更有效地表示事件、过程和上下文细节。尽管最近有几项研究探索了在响应生成过程中整合视频的方法,但它们要么预定义了与查询相关的视频而不根据查询进行检索,要么将视频转换为文本描述,而没有充分利用其多模态的丰富性。\n\n为了解决这些问题,我们提出了VideoRAG,这是一个新颖的框架,不仅能够根据视频与查询的相关性动态检索相关视频,还能在输出生成过程中利用视频的视觉和文本信息。此外,为了实现这一目标,我们的方法围绕最近的大规模视频语言模型(Large Video Language Models, LVLMs)的进展展开,这些模型能够直接处理视频内容以进行检索,并将检索到的视频与查询无缝整合。我们通过实验验证了VideoRAG的有效性,展示了其优于相关基线方法的性能。"