Allow us to use MCP servers to extend OpenHands' functionality #5781
Comments
If it helps anyone, I could offer a small "bounty" payment for implementing this?
I must first deal with my GUI & CLI issue, but the next thing I'm planning is this one if no one else is interested.
Agree that MCP servers in OpenHands seem like necessary table stakes in the near future :)
I figured out the GUI & CLI thing and am working on this right now. @orangejon, do you think users should add/remove tools themselves, or should OpenHands figure out which tools it might use and install them?
I think it could be either, or even both. The way Cline does it with describing a tool by its capability seems ideal, because then I don't have to search online for a suitable tool first. I agree that in this case a confirmation step is probably worthwhile, especially if there are multiple tools that match. I suppose if I have a particular tool in mind then it would be good to be able to just give the name or URL, though I guess that could also be via a prompt. Removing might be easier to do by just clicking a button on a list of installed servers, though? But any way is fine for me really; so long as there's a reasonably easy and clearly documented way to use MCP servers, I'll figure it out :) I've not used Cline much yet, but I've got it installed and will be experimenting with it over the next few days. I'll report back!
@motin I've had a chance to use Cline for a few days now, so I can report back on my initial experience. So far I've used it to create a simple Ruby on Rails web app. Because macOS really creates headaches with Ruby versions (which Cline tried to find solutions to for over an hour, but only succeeded temporarily), I decided to use a GitHub codespace (basically a VPS running Ubuntu) and connect VS Code to that, which works really well: effortless setup, fast, reliable, and it automatically configures port forwarding so you can see your web app as if it were running on your local machine. This had the nice side effects that it runs faster, as the VPS has more resources than my laptop, and that I don't have to worry about the terminal commands Cline runs, as the worst-case scenario would be wasting a few minutes rebuilding the virtual machine if it really screwed it up (which, so far, it hasn't).
The code generated is pretty decent when using Claude Sonnet 3.5 (via OpenRouter), but my attempt to use Gemini was pretty unsuccessful, hitting various errors regardless of which model I selected. Claude can use MCP tools but it doesn't seem to do so unless you directly tell it to in the user prompt; e.g. I added to the "system" prompt that when encountering an error it should use the search1api MCP client to read the documentation, but it never did. At least it (usually) listened to my instruction to run all unit tests and a Playwright browser-based integration test, so it does usually catch its own errors and fix them before asking for user input. I just swear it would be faster and burn fewer tokens if it Googled for a solution or documentation rather than just randomly changing the code, sometimes even calling functions that don't exist.
Also, although the documentation sounds like you can just prompt Cline with "Add a tool that..." and it will install the correct tool, that's not what it does. Typing that prompt seems to create a new MCP client from scratch which, seeing as it doesn't read the API documentation, is very unlikely to actually work! Instead you have to search for the "configure MCP servers" dialog, which then makes you manually edit the JSON configuration file to insert the MCP client definition. Then it works fine (and displays a nice "status" thing on the dialog with the various functions you can call, like in Claude Desktop) but it's a bit of a faff. I'd rather just paste in the URL of an MCP client definition and have it added for me.
Still, I have to say, my initial impression of Cline is generally pretty positive. The areas where it falls short currently are:
If you've got any questions, let me know. I'm still a fan of the OpenHands approach, and FOSS in general, so I'm happy to help if I can. It's just that Cline is working well for me so I will stick to using it for the time being.
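For anyone following along, the manual JSON edit mentioned above usually amounts to adding one entry under an `mcpServers` key. Below is a minimal Python sketch, assuming the Claude Desktop-style layout (`command`, `args`, optional `env`). The helper `add_mcp_server`, the package name `search1api-mcp`, and the `SEARCH1API_KEY` variable are hypothetical placeholders for illustration, not taken from the Cline or search1api documentation.

```python
import json
from typing import Optional


def add_mcp_server(
    config: dict,
    name: str,
    command: str,
    args: list,
    env: Optional[dict] = None,
) -> dict:
    """Return a copy of `config` with one more server entry (hypothetical helper)."""
    # Copy the existing server map so the caller's config is not mutated.
    servers = dict(config.get("mcpServers", {}))
    entry = {"command": command, "args": list(args)}
    if env:
        entry["env"] = dict(env)
    servers[name] = entry
    return {**config, "mcpServers": servers}


# Assumed example values: the package name and env variable are placeholders.
config = add_mcp_server(
    {},
    name="search1api",
    command="npx",
    args=["-y", "search1api-mcp"],
    env={"SEARCH1API_KEY": "<your-key>"},
)
print(json.dumps(config, indent=2))
```

A "paste a URL and it gets added" flow, as wished for above, would presumably fetch such a definition and run the same kind of merge before writing the file back.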
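It would be great to have MCP/RAG built-in options for popular knowledge bases like wikis, Confluence, PDFs, and web links. Key benefits of this feature:
1. Enhanced knowledge retrieval capabilities
2. Improved integration with common information sources
3. Increased efficiency in accessing relevant data
Potential implementation ideas:
1. Develop connectors for popular wiki platforms and Confluence
2. Implement PDF parsing and indexing functionality
3. Create a system for crawling and updating web link content
This enhancement would significantly expand OpenHands' ability to leverage existing knowledge repositories, making it more versatile and powerful for users working with various information sources.
Would love to hear your thoughts on this suggestion!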
I think the point of MCP is not that these types of functionality are
"built in"; the idea is that you can add MCP clients for whatever you
need, then OpenHands will access whatever it needs.
(At least in theory; I've been using Cline + Claude Sonnet 3.5 recently,
which supports MCP, and it rarely ever uses any MCP clients, no matter how
much I prompt it to!)
…On Thu, 9 Jan 2025 at 23:46, Alexander Dorofeev ***@***.***> wrote:
It would be great to have MCP/RAG built-in options for popular knowledge
bases like wikis, Confluence, PDFs, and web links.
Key benefits of this feature:
1. Enhanced knowledge retrieval capabilities
2. Improved integration with common information sources
3. Increased efficiency in accessing relevant data
Potential implementation ideas:
1. Develop connectors for popular wiki platforms and Confluence
2. Implement PDF parsing and indexing functionality
3. Create a system for crawling and updating web link content
This enhancement would significantly expand OpenHands' ability to
leverage existing knowledge repositories, making it more versatile and
powerful for users working with various information sources.
Would love to hear your thoughts on this suggestion!
Agreed, integrating MCP is essential. However, how can we design MCP usage to be model-agnostic? It doesn’t seem like a good approach to develop a feature that only works with Claude 3.5/Sonnet.
MCP is (at least theoretically) an open standard that other LLMs can implement. As far as I know, it's only Anthropic's models that have implemented it so far, though.
I think there is a way to implement a middleware like MCP-Bridge (https://github.com/SecretiveShell/MCP-Bridge), whose main idea is to provide an OpenAI-compatible endpoint that can call MCP tools, @orangejon. However, whether it is appropriate still requires in-depth evaluation.
I've been looking into this and I think we can implement this pretty easily. We already use LiteLLM which lets us do tool/function calling with any model - just like how librechat handles tools globally (they do it with langchain) regardless of which provider or model you're using. How do you prefer to add tools - should users manually configure them, or should OpenHands try to discover and suggest relevant tools? I'm leaning towards automated discovery with approval prompt since it would make things easier for users while keeping them in control. But let me know what you think would work best for your use cases.
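To make the "LiteLLM as compatibility layer" idea concrete, here is a rough sketch of the glue it would need: turning a tool description as an MCP server advertises it (`name`, `description`, `inputSchema`) into the OpenAI-style function schema that LiteLLM accepts in its `tools` argument. The function name `mcp_tool_to_openai` and the example tool are hypothetical, and this is not how OpenHands actually implements it.

```python
def mcp_tool_to_openai(tool: dict) -> dict:
    """Convert one MCP tool description into an OpenAI-style tool schema.

    Hypothetical helper: assumes `tool` has the MCP fields `name`,
    `description` (optional) and `inputSchema` (a JSON Schema object).
    """
    return {
        "type": "function",
        "function": {
            "name": tool["name"],
            "description": tool.get("description", ""),
            # Fall back to an empty object schema if the server gave none.
            "parameters": tool.get(
                "inputSchema", {"type": "object", "properties": {}}
            ),
        },
    }


# Made-up tool description, shaped like an entry from an MCP tool listing.
example_tool = {
    "name": "web_search",
    "description": "Search the web and return the top results.",
    "inputSchema": {
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
}

tools = [mcp_tool_to_openai(example_tool)]
print(tools[0]["function"]["name"])
```

The resulting `tools` list could then be passed to the model call; when the model answers with a tool call, the agent would forward the name and arguments to the MCP server and feed the result back as a tool message.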
Automated discovery sounds great if it works well, because then if I'm
coding something and realise I need a tool (or the LLM realises?) then I
don't have to go off to search the web for a solution. However, if there
are multiple MCP tools then it might be preferable to select one manually,
not necessarily for fear of skynet situations but more because some of
the MCP tools are pretty flakey!
Also there's the case that's been more common for me so far: I'm browsing
the web looking for tools that can improve my workflow, and want to add one
that I've found. So it's not necessarily something that is essential in
that moment or that OpenHands (or Cline) can't operate without, but it's
something that seems generally useful (e.g. web search and scraping). Also
most tools need me to create an account, add payment details and get an API
key, so unless OpenHands will do that automatically, there's not a
significant benefit in the discovery step happening automatically.
In short, being able to install (and uninstall) tools manually would
certainly be useful, and presumably it's easier to implement, so it might
make sense to add that first.
…On Thu, 16 Jan 2025 at 09:53, Goku ***@***.***> wrote:
I've been looking into this and I think we can implement this pretty
easily. We already use LiteLLM which lets us do tool/function calling with
any model - just like how librechat handles tools globally (they do it with
langchain) regardless of which provider or model you're using.
@RPirruccio <https://github.com/RPirruccio> - good point about model
agnostic design. That's exactly why we don't need MCP-Bridge here - LiteLLM
already handles the compatibility layer for us.
Before I start implementing this, I'd like to hear from everyone:
How do you prefer to add tools - should users manually configure them, or
should OpenHands try to discover and suggest relevant tools?
Should we add an approval step for tool installation like Claude Desktop
does?
I'm leaning towards automated discovery with approval prompt since it
would make things easier for users while keeping them in control. But let
me know what you think would work best for your use cases.
We may also need to let the headless version configure its own tools,
but IMO the tools available to it should be limited to prevent any
*skynet becomes self aware* moment.
Ok I'm working on it.
Interesting conversation here! I just want to add a few ideas based on my experience crafting coding agents using MCP with Claude Desktop. There’s a belief that adding more tools makes LLMs smarter, but more often, it just creates confusion and fills the context with noise. In my opinion, additional tools should be part of a message and include extra in-context learning materials. Since I can’t do this with Claude Desktop, I started looking for alternatives. A few ideas worth exploring:
- Augmenting microagents with tools (MCP)
- Using a separate reasoning thread
- Leveraging CLI-based tools
This also makes me think: the shell is an underutilized platform for LLM tools. It’s straightforward to provide RAG, web search, and many other functions via CLIs. If an LLM can call native tools, why wouldn't it use shell-based tools with the same efficiency? I’d love to hear critical opinions on this approach. What am I missing? Are there hidden downsides?
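As a rough illustration of the shell-as-tool-platform idea, a CLI can be exposed to an agent with a thin wrapper that runs the command and returns its output in a uniform shape. This is only a sketch under assumptions: `run_cli_tool` and its return format are made up for this example, and a real integration would also need sandboxing and an allow-list of commands.

```python
import subprocess
import sys


def run_cli_tool(argv: list, timeout: float = 30.0) -> dict:
    """Run a command-line tool and return its result for the LLM.

    Hypothetical helper: takes an argument list (no shell interpolation)
    and reports failures as data instead of raising.
    """
    try:
        proc = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout
        )
    except FileNotFoundError:
        return {"ok": False, "output": f"command not found: {argv[0]}"}
    except subprocess.TimeoutExpired:
        return {"ok": False, "output": f"timed out after {timeout}s"}
    # Prefer stdout; fall back to stderr so error messages reach the model.
    return {
        "ok": proc.returncode == 0,
        "output": (proc.stdout or proc.stderr).strip(),
    }


# Stand-in for a real CLI (e.g. a search or RAG command): a tiny Python one-liner.
result = run_cli_tool([sys.executable, "-c", "print('hello from a CLI tool')"])
print(result)
```

One downside this makes visible: unlike an MCP tool, a bare CLI carries no machine-readable schema, so the model has to learn the arguments from `--help` text or from examples placed in the prompt.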
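I think MCP is cool and we can benefit from adding it to OpenHands. Just to note quickly, @anzax I do agree.
- Re: Augmenting microagents with tools (MCP): we need MCP first IMO; once MCP is integrated, we can already use this, I think.
- Re: Using a separate reasoning thread: underlying support for reasoning LLMs and workflow is coming (#6189).
- Re: Leveraging CLI-based tools: we had what we call agent skills (https://github.com/All-Hands-AI/OpenHands/tree/main/openhands/runtime/plugins/agent_skills), implemented in Python and run via a Jupyter server in the runtime. We consider some of them deprecated right now. I don't know what new tools would look like, but if you want to try it, please feel free to!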
I switched to using Cline (with Claude Sonnet 3.5) mostly because it
supports MCP tools, but I have been disappointed how infrequently it uses
them. I suspect it's because the LLM was mainly trained on content like
StackOverflow, where people offer solutions to the problem the developer is
currently facing. These solutions don't often say "Now Google for the
latest API documentation and check your API calls are correct" because the
solutions offered were correct at the time of writing. I guess having a
separate reasoning thread might help, because, if prompted appropriately,
it could encourage the LLM to first plan out how to approach a problem
methodically, as a good developer would, instead of just randomly making
changes that are just as likely to cause new problems as to fix the bug!
…On Tue, 4 Feb 2025 at 22:27, Engel Nyst ***@***.***> wrote:
I think MCP is cool and we can benefit from adding it to openhands.
Just to note quickly, @anzax <https://github.com/anzax> I do agree.
- Re: Augmenting microagents with tools (MCP) - we need MCP first IMO,
once MCP is integrated, we can already use this I think
- Re: Using a separate reasoning thread - underlying support for
reasoning llm and workflow is coming
<#6189>
- Re: Leveraging CLI-based tools - we had what we call agent skills
<https://github.com/All-Hands-AI/OpenHands/tree/main/openhands/runtime/plugins/agent_skills>,
implemented in Python and run via a Jupyter server in the runtime. We consider
some of them deprecated right now. I don't know what new tools would look
like, but if you want to try it, please feel free to!
What problem or use case are you trying to solve?
OpenHands' functionality is currently fairly limited, but Anthropic's MCP standard provides a way for LLMs to interact with many additional services and use them as "tools". This could allow for much more complex workflows, e.g. to use Puppeteer or Playwright to test the code in the browser, then if it fails use OpenAI o1 (via MCP) to debug/rewrite it, etc.
Describe the UX of the solution you'd like
I guess the ideal would be to be able to install MCP servers in one click or a prompt. The implementation in Cline is neat:
... but the main thing is to be able to access them. Perhaps a list of the installed servers could be good to verify they have been recognised, like in Claude Desktop:
Do you have thoughts on the technical implementation?
I don't know OpenHands' architecture, but please be sure to add clear documentation with step-by-step instructions so I know any setup that's required to use this functionality.
Describe alternatives you've considered
Using Claude Desktop instead of OpenHands, because I can probably replicate a lot of the same functionality by just combining MCP servers. But the UI probably wouldn't be as good and I'm not sure if it would work as effectively.
Additional context