Paper: Retrieval Augmented Generation or Long-Context LLMs? A Comprehensive Study and Hybrid Approach
Authors: Zhuowan Li, Cheng Li, Mingyang Zhang, Qiaozhu Mei, Michael Bendersky
Abstract: Retrieval Augmented Generation (RAG) has been a powerful tool for Large Language Models (LLMs) to efficiently process overly lengthy contexts. However, recent LLMs like Gemini-1.5 and GPT-4 show exceptional capabilities to understand long contexts directly. We conduct a comprehensive comparison between RAG and long-context (LC) LLMs, aiming to leverage the strengths of both. We benchmark RAG and LC across various public datasets using three latest LLMs. Results reveal that when resourced sufficiently, LC consistently outperforms RAG in terms of average performance. However, RAG's significantly lower cost remains a distinct advantage. Based on this observation, we propose Self-Route, a simple yet effective method that routes queries to RAG or LC based on model self-reflection. Self-Route significantly reduces the computation cost while maintaining a comparable performance to LC. Our findings provide a guideline for long-context applications of LLMs using RAG and LC.
Link: https://arxiv.org/abs/2407.16833
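For context, the Self-Route idea from the abstract (try the cheap RAG path first, let the model reflect on whether the retrieved chunks suffice, and fall back to the full long context only when they do not) can be sketched in a few lines. The sketch below is a minimal illustration under stated assumptions, not the paper's implementation; `retrieve` and `call_llm` are hypothetical placeholders to be swapped for a real retriever and LLM client.

```python
# Minimal sketch of a Self-Route-style pipeline. `retrieve` and `call_llm`
# are hypothetical placeholders, not the paper's actual code or API.

def retrieve(query: str, corpus: list[str], k: int = 5) -> list[str]:
    """Toy retriever: rank chunks by word overlap with the query."""
    q_words = set(query.lower().split())
    return sorted(corpus, key=lambda c: -len(q_words & set(c.lower().split())))[:k]

def call_llm(prompt: str) -> str:
    """Stub for an LLM call (e.g. Gemini-1.5 or GPT-4); plug in a real client."""
    raise NotImplementedError

def self_route(query: str, full_context: str, corpus: list[str]) -> str:
    # Step 1 (cheap): answer with RAG, letting the model decline via self-reflection.
    chunks = "\n".join(retrieve(query, corpus))
    rag_prompt = (
        "Answer the question using only the chunks below. If they are not "
        "sufficient, reply exactly 'unanswerable'.\n\n"
        f"Chunks:\n{chunks}\n\nQuestion: {query}"
    )
    answer = call_llm(rag_prompt)
    if "unanswerable" not in answer.lower():
        return answer  # RAG sufficed; the long-context call is never paid for.

    # Step 2 (expensive): fall back to feeding the model the full long context.
    return call_llm(f"Context:\n{full_context}\n\nQuestion: {query}")
```

The cost saving comes from the fact that most queries return at step 1, so the expensive long-context call is only made for the hard residue.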
Reasoning: determine the is_lm_paper label. We start by examining the title and abstract for key terms and concepts related to language models. The title mentions "Retrieval Augmented Generation" and "Long-Context LLMs," both of which relate directly to language models. The abstract further discusses the comparison between RAG and long-context LLMs, specifically mentioning Gemini-1.5 and GPT-4, which are well-known language models. The focus on benchmarking these models and proposing a method to optimize their use also indicates a strong emphasis on language models. Based on this evidence, is_lm_paper is set to true.
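The keyword screen this reasoning describes could look roughly like the sketch below; the pattern list is an illustrative assumption, not the classifier's actual vocabulary.

```python
import re

# Illustrative LM-related patterns -- an assumption for demonstration only.
LM_PATTERNS = [
    r"\blanguage models?\b",
    r"\bllms?\b",
    r"\bgpt-?[\d.]*\b",
    r"\bgemini\b",
    r"\bretrieval[- ]augmented generation\b",
]

def is_lm_paper(title: str, abstract: str) -> bool:
    """Flag a paper as LM-related if its title or abstract matches any pattern."""
    text = f"{title} {abstract}".lower()
    return any(re.search(p, text) for p in LM_PATTERNS)

# The paper above matches on "Retrieval Augmented Generation" and "LLMs".
```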