Add support for prompt augmentation via external API #909
Comments
So the main difference from a RAG API (if I understand correctly) is that you want your system to return different things based on the content of the user prompt, while RAG will just fetch a static asset, be it a web page or a PDF, even if it does sentence similarity afterwards. It seems to me that this could be accomplished using some kind of function-calling API. I think this feature would be a nice next step for chat-ui, but we should discuss what it would look like. In the meantime, if you want to fork and add something a bit more custom-built for your own use case, maybe have a look at:
Let me know if you need any other help!
Yes, the other system would take the prompt and do a similarity search in a bunch of pre-indexed material (a library of sorts), then augment the prompt with the results of that search. That's a little different from some RAG flows (like chat-ui's web search or open-webui's PDF upload) where material is fetched and indexed as part of a conversation, but we're still augmenting the prompt with relevant results from a data store. From chat-ui's perspective, though, it doesn't matter what I do in the library, which is good separation of concerns IMO. Someone else could implement whatever makes sense for them, as long as they follow the API spec. I'll have a look at the web search code you mentioned and see what I can do. I'm sure I'll have questions. :)
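To make that concrete, here is a minimal sketch of the library-side flow, assuming some embedding function and a vector index; `embed`, `VectorIndex`, and the prompt template are all stand-ins for illustration, not part of chat-ui or any existing spec.

```ts
// Sketch of the library-side flow: embed the prompt, run a similarity
// search over pre-indexed chunks, and fold the hits into the prompt.
// embed() and VectorIndex are stand-ins for whatever embedding model
// and vector store the library actually uses.
interface IndexedChunk {
  text: string;
  score: number;
}

interface VectorIndex {
  query(vector: number[], topK: number): Promise<IndexedChunk[]>;
}

async function augment(
  prompt: string,
  embed: (text: string) => Promise<number[]>,
  index: VectorIndex
): Promise<string> {
  const hits = await index.query(await embed(prompt), 5);
  const context = hits.map((c) => c.text).join("\n---\n");
  return `Context from the library:\n${context}\n\nQuestion: ${prompt}`;
}
```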
I've managed to get the basic flow working, but I haven't surfaced anything in the UI yet. I was picturing a "Library search" toggle at the bottom of the conversation, next to the "Web search" toggle. If enabled by env vars, the toggle would be present and off by default. If you turn it on, you would still see the prompt exactly as you typed it, but there could be updates in the UpdatePad about how the prompt was being augmented. Alternatively, there could be something at the bottom of the message showing the augmented prompt, similar to the WebSearch sources. Thoughts, @nsarrazin ?
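A minimal sketch of that env-var gating, with an invented variable name:

```ts
// LIBRARY_SEARCH_URL is an invented env var name for illustration.
// The toggle would only render when it is set, and each conversation
// would start with it switched off.
export const librarySearchConfigured = Boolean(process.env.LIBRARY_SEARCH_URL);
```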
Would a custom search engine work for you, as part of the web search, instead of having a whole different feature? The way it currently works is as follows: (entry point here)
Seems to me that for your use case you could hook your feature in at step 2, replacing the Google search with a custom search engine that takes a search query and returns links to the relevant chunks (in plain text) hosted on your server. If you return 8 chunks or fewer, they will all get added to the prompt. Currently the search engines are all hardcoded, but it should be very easy to add your own there.
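As a sketch of that hook, assuming the dispatch picks an engine based on env vars and that engines return a list of result links; the names here are illustrative, not chat-ui's actual internals (the real shapes live under `src/lib/server/websearch`).

```ts
// Illustrative sketch of slotting a custom engine into the web-search
// dispatch. Function and field names are assumptions, not chat-ui's
// actual internals.
interface SearchResult {
  link: string;
  title?: string;
}

async function searchLibrary(query: string): Promise<SearchResult[]> {
  // The library server answers a search query with links to plain-text
  // chunks it hosts; returning 8 or fewer means they all reach the prompt.
  const res = await fetch(
    `${process.env.LIBRARY_SEARCH_URL}/search?q=${encodeURIComponent(query)}`
  );
  const { chunks } = (await res.json()) as { chunks: SearchResult[] };
  return chunks.slice(0, 8);
}

async function searchWeb(query: string): Promise<SearchResult[]> {
  // Step 2 of the flow: pick an engine based on configuration.
  if (process.env.LIBRARY_SEARCH_URL) {
    return searchLibrary(query);
  }
  // ...fall through to the existing hardcoded engines here.
  throw new Error("No search engine configured");
}
```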
Related:
Integrating it as another search engine would be easier and have minimal impact on the UI. I'll try that route. The only drawback I can think of is that it's not technically "web" search anymore... :D
I've got the basic flow working. The only place where Web Search didn't feel like a good fit was a step I needed to skip. I will probably also look for a way to skip the chunking, since the remote server will be returning chunks rather than whole documents. Do you think some sort of size check makes sense here? Like, if the document is already under 200 characters, don't bother chunking?
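Something like the following, as a sketch of that size check; `MIN_CHUNKING_LENGTH` and `chunk` are placeholders, not existing chat-ui names.

```ts
// Sketch of the proposed size check: skip chunking when the fetched
// document is already chunk-sized. The 200-character threshold is the
// one suggested above; chunk() stands in for the actual splitter.
const MIN_CHUNKING_LENGTH = 200;

function maybeChunk(text: string, chunk: (t: string) => string[]): string[] {
  if (text.length < MIN_CHUNKING_LENGTH) {
    // The remote server already returned a chunk; pass it through.
    return [text];
  }
  return chunk(text);
}
```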
Another thing this should support is forwarding the identity of the logged-in chat user (if configured with OpenID).
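For illustration, assuming the OpenID login leaves an access token in the session, forwarding the identity could look like this; the Bearer header and token source are assumptions, not something chat-ui does today.

```ts
// Sketch of forwarding the logged-in user's identity to the library
// server so it can enforce per-user access control. The Authorization
// header and accessToken parameter are assumptions for illustration.
async function searchLibraryAsUser(
  query: string,
  accessToken?: string
): Promise<Response> {
  return fetch(
    `${process.env.LIBRARY_SEARCH_URL}/search?q=${encodeURIComponent(query)}`,
    {
      headers: accessToken ? { Authorization: `Bearer ${accessToken}` } : {},
    }
  );
}
```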
Problem
I love HF chat-ui, and I'd like to deploy it for more teams. However, missing RAG features prevent me from deploying it as widely as I'd like.
Some teams need RAG with PDFs as a data source. Many of these PDFs are oddly formatted, and different types of PDFs require different kinds of parsing/chunking/embedding/whatever. This is not chat-ui's concern - it's mine - and no feature of any LLM chat UI will ever solve this problem in the way I need it solved.
Possible solution
What I'd really like is a feature where I can tell chat-ui to take a prompt from the user, call a REST API to have the prompt translated/augmented/whatever, and then send the resulting prompt to the LLM. (Some way to hover and see what the actual augmented prompt looked like would also be nice, in case something weird happens and the user wants to know why.)
I'll build the indexing system and expose the REST API to augment prompts; I just need a UI that will use it. I would even be happy to use an existing API as a reference, or to adopt a standard if one exists, but I haven't seen one. Maybe we'll set the standard here.
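As a starting point for discussion, one possible shape for that contract, seen from chat-ui's side; the endpoint path, `LIBRARY_API_URL` variable, and field names are all invented here, not an existing standard.

```ts
// Hypothetical contract for the augmentation API and the call chat-ui
// would make before sending the prompt to the LLM. Endpoint, env var,
// and field names are invented for illustration.
interface AugmentRequest {
  prompt: string;
}

interface AugmentResponse {
  augmentedPrompt: string; // prompt with retrieved material folded in
  sources?: { title: string; link: string }[]; // optional provenance for the UI
}

async function augmentPrompt(prompt: string): Promise<AugmentResponse> {
  const body: AugmentRequest = { prompt };
  const res = await fetch(`${process.env.LIBRARY_API_URL}/augment`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  if (!res.ok) {
    throw new Error(`Augmentation service returned ${res.status}`);
  }
  return res.json();
}
```

The optional `sources` field is one way the hover/inspection UI mentioned above could be fed.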
My plan
I was thinking of forking chat-ui to add hooks for "prompt translation" or "prompt augmentation" or whatever the best name is.
Questions
Any advice on where to start? Where in the code would this logic go if I added it?
Are maintainers open to the idea of merging a feature like this if it works?
(Because you never know) Is this already present in chat-ui and I just haven't noticed?