
Rag Advanced example #7

Open
ogbozoyan opened this issue Nov 16, 2024 · 6 comments

Comments

@ogbozoyan

Your implementation of the RAG advisor looks much better and more professional. Would you like to make a pull request and add it to the Spring AI project?

Also, I didn't quite understand: what's the purpose of these interfaces?

package com.thomasvitale.ai.spring.rag.orchestration.routing.QueryRouter;
package com.thomasvitale.ai.spring.rag.postretrieval.document.reranking.DocumentReranker;
package com.thomasvitale.ai.spring.rag.postretrieval.document.selection.DocumentSelector;

and an empty package

com/thomasvitale/ai/spring/rag/postretrieval/document/compression/
@ThomasVitale
Owner

@ogbozoyan thanks for your message! I'm currently working on a Modular RAG implementation in Spring AI. You can follow the status of the first stage of the implementation in this issue: spring-projects/spring-ai#1603. Soon, documentation will be available for all the new functionality introduced in that issue. It will be part of the upcoming M4 release of Spring AI.

What you find in this repo in the advanced-rag example is actually part of what I was validating before delivering the changes to Spring AI. I started that for convenience, but soon I will move that research-based code elsewhere to avoid confusion, while still leaving the RAG-related examples here.

@ogbozoyan
Author

@ThomasVitale thanks, great news! I'll be looking forward to it.

@ogbozoyan
Author

ogbozoyan commented Nov 23, 2024

@ThomasVitale, while using your Advanced RAG implementation, I'm facing the following problem:
After saving my files, which contain information about finite fields (Galois fields), to the vector database, I start chatting with the model about cyber security:

Input:
query: What is Galois field?
Answer:
*some answer from the model, containing the answer from the file*
Another input:
What is man in the middle?
Answer:
I can't find information about Man in the Middle in this context. If you would like information about this term, I can suggest you look it up on the Internet or in specialized cybersecurity and computer security sources

I think this is because org.springframework.ai.rag.generation.augmentation.ContextualQueryAugmenter has certain instructions that restrict the model from continuing to "think" when it lacks context. How do you manage that situation?

private static final PromptTemplate DEFAULT_PROMPT_TEMPLATE = new PromptTemplate("""
			Context information is below.

			---------------------
			{context}
			---------------------

			Given the context information and no prior knowledge, answer the query.

			Follow these rules:

			1. If the answer is not in the context, just say that you don't know.
			2. Avoid statements like "Based on the context..." or "The provided information...".

			Query: {query}

			Answer:
			""");

MY CONFIGS:
System prompt:

INSTRUCTIONS:
You are a cyber security assistant. You know everything about computer security and information security and are always sharing your knowledge. You need to help and answer questions related to computer, information security. Be ready to explain complex technical terms in simple words. Help users understand the importance of security measures in internet and digital life. Recommend secure practices and tools to protect personal data and devices.
You are friendly and always try to help the person you're talking to. If YOU don't know the answer to a question, gently tell the user that you're in doubt, but never make anything up.
Your role is not only to provide information, but also to motivate people to take care of their digital security. Be helpful and supportive!

ChatClient configuration

@Bean
    fun ollamaClient(): ChatClient {
        val documentRetriever: VectorStoreDocumentRetriever = VectorStoreDocumentRetriever.builder()
            .vectorStore(vectorStore)
            .similarityThreshold(0.50)
            .topK(3)
            .build()

        return chatClientBuilder
            .defaultAdvisors(
//                MessageChatMemoryAdvisor(inMemoryChatMemory(), DEFAULT_CHAT_MEMORY_CONVERSATION_ID, 50),
                VectorStoreChatMemoryAdvisor(vectorStore, DEFAULT_CHAT_MEMORY_CONVERSATION_ID, 5),
                SimpleLoggerAdvisor(),
                RetrievalAugmentationAdvisor.builder()
                    .documentRetriever(documentRetriever)
                    .order(Ordered.HIGHEST_PRECEDENCE)
                    .build()
            )
            .build()
    }
application.yml (excerpt):

  ai:
    ollama:
      base-url: ${OLLAMA_BASE_URL:http://localhost:11434/}
      embedding:
        options:
          model: ${AI_OLLAMA_EMBEDDING_OPTIONS_MODEL:mxbai-embed-large:latest}
      chat:
        options:
          model: ${OLLAMA_CHAT_MODEL:llama3.2:3b}
          temperature: ${OLLAMA_CHAT_TEMPERATURE:0.6}
          num-ctx: 16384
      vectorstore:
        pgvector:
          initialize-schema: true
          index-type: HNSW
          distance-type: COSINE_DISTANCE
          dimensions: 1024

@ogbozoyan ogbozoyan reopened this Nov 23, 2024
@ThomasVitale
Owner

ThomasVitale commented Nov 23, 2024

If I understood correctly, you would like the model to answer based on its own knowledge in case the retrieval step returns no document?

By default, if no documents are found in the retrieval step, the model is instructed not to answer, in order to mitigate the risk of hallucinations. You can customize the ContextualQueryAugmenter to allow an empty context and still ask the model to answer the question.

You can find an example here: https://github.com/ThomasVitale/llm-apps-java-spring-ai/blob/main/rag/rag-sequential/rag-naive/src/main/java/com/thomasvitale/ai/spring/RagControllerEmptyContext.java#L27
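For reference, here is a minimal sketch of that wiring in Kotlin, assuming the Spring AI RAG API introduced for the M4 release (ContextualQueryAugmenter and RetrievalAugmentationAdvisor):

```kotlin
// Sketch: let the model answer from its own knowledge when retrieval
// returns no documents, instead of refusing due to an empty context.
// Assumes the Spring AI M4 RAG API; adapt to your actual version.
val queryAugmenter = ContextualQueryAugmenter.builder()
    .allowEmptyContext(true) // lift the "say you don't know" restriction on empty context
    .build()

val ragAdvisor = RetrievalAugmentationAdvisor.builder()
    .documentRetriever(documentRetriever) // the VectorStoreDocumentRetriever from your config
    .queryAugmenter(queryAugmenter)
    .build()
```

With allowEmptyContext(true), an empty retrieval result leads to a prompt that asks the model to answer the query anyway, rather than instructing it to decline.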

@ogbozoyan
Author

You understood it right, but it seems to me the problem is in another advisor:

                VectorStoreChatMemoryAdvisor(vectorStore, DEFAULT_CHAT_MEMORY_CONVERSATION_ID, 5),

I've taken a look at the prompt that goes to the model, and I see a lot of junk cached by VectorStoreChatMemory.

Did you test the RAG with memorization?
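One possible workaround, sketched from the commented-out line in the config above (constructor signatures assumed from the Spring AI version in use): keep chat memory in an in-memory message history instead of the vector store, so conversation history and RAG documents don't mix in the same store.

```kotlin
// Sketch: swap VectorStoreChatMemoryAdvisor for MessageChatMemoryAdvisor so
// chat history is kept separately from the RAG vector store (assumed API).
return chatClientBuilder
    .defaultAdvisors(
        MessageChatMemoryAdvisor(InMemoryChatMemory(), DEFAULT_CHAT_MEMORY_CONVERSATION_ID, 50),
        SimpleLoggerAdvisor(),
        RetrievalAugmentationAdvisor.builder()
            .documentRetriever(documentRetriever)
            .order(Ordered.HIGHEST_PRECEDENCE) // run retrieval before the other advisors
            .build()
    )
    .build()
```

This keeps the vector store dedicated to document retrieval, so the retriever's similarity search cannot pull back fragments of earlier conversation turns.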

@ogbozoyan
Author

Also, I had some conversation in pull request spring-projects/spring-ai#1528 (comment), where I saw an example using a self-implemented PgVectorChatMemory with your RAG.
Maybe that implementation will help to avoid the issue, or to use memory correctly.
