Hey! This is a great idea -- thanks for laying it all out so neatly. I think you should go for it, but I would ask that this provider-specific configuration live in a file separate from the main config file.
When you use it for programming, it sometimes needs to run the endpoint command itself (such as npm run). In that case it should first kill all existing node.exe processes and then run the startup command, so that it does not keep launching duplicate instances that occupy port 3000.
I was thinking about adding imports; it makes things complicated at the moment. I will include that, but it is much lower in priority. My primary focus is to deliver small, incremental wins that have a tangible impact on my personal workflow and on others' as well. Ergonomic features that help but are not mandatory in my opinion (like having split config files) would be good to have, but not now.
@lily-de I will outline my main challenges in the following comments, each in its own message, so that we can discuss them one by one.
Challenge 1

Goose sends a LOT of tokens, and I keep getting rate limited. I am on tier 5 with OpenAI and tier 4 with Anthropic, yet this still plagues me, and I am sure others are facing it too. I have a few ideas about smart truncation and compression of the chat message thread, and I have been looking into some approaches, but for now the quickest win is to set up a rate limiter with a simple strategy.
Challenge 2

I have found that when coding, I needed to …

After experimenting with pretty much all models (O3 included), Claude Sonnet is still the best one when it comes to agentic workflows. O1-Pro is the only model that surpasses it, but you have to use it through …

The issue (and advantage) of Claude Sonnet is that it is very opinionated; it will do what it "thinks" is best to achieve whatever it was able to "understand" from your prompt, sometimes even blatantly ignoring direct instructions in the system prompt (this behavior becomes more pronounced as the number of tokens in the context window grows). In my experiments, the best way to make models follow specific instructions more closely and keep them from going off the rails is (a) setting the temperature to …
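To make the sampling-parameter point concrete, here is a rough sketch of how a per-provider override could be expressed under the `providers` block proposed later in this thread (key names and values are illustrative only, not a final schema):

```yaml
providers:
  - name: anthropic-strict
    type: anthropic
    # Lower temperature and a tighter top_p tend to make the model follow
    # explicit instructions more closely (illustrative values).
    temperature: 0.1
    top_p: 0.8
```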
Challenge 3

Only supporting well-known, larger LLM inference providers is very limiting, while virtually every provider supports OpenAI-compatible inference endpoints. I have found myself relying on more niche LLMs such as Perplexity's …

Adding an option to fine-tune configuration based on provider type would have a few advantages: it would enable more fine-tuned workflows; in my example …
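As a sketch of what this could enable (endpoint URL, key names, and auth handling are examples, not a final schema), an OpenAI-compatible provider such as Perplexity could be configured by pointing a generic `openai` type at a different base URL:

```yaml
providers:
  - name: perplexity
    # Reuses an OpenAI-compatible client; only the endpoint differs.
    type: openai
    api_base: https://api.perplexity.ai
    additional_headers:
      # Hypothetical header entry; actual auth handling may differ.
      Authorization: "Bearer ${PERPLEXITY_API_KEY}"
```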
@lily-de @alexhancock if it is OK with you, I will lay out a more detailed incremental implementation plan: …
@da-moon -- all those ideas sound great to me! I think the best implementation, given that this is somewhat experimental, is one where people can opt in and the default goose flow is left as is. As we experiment with these parameters and rate limiting, if we find a setup that performs really well, we can make that the default experience for users.
Hey @da-moon -- I wanted to make sure we are all aligned before you spend time implementing anything. Right now the team is working to overhaul … To that end, I've started this discussion topic here that I'd love for you to look at and provide feedback on. This is the main reason I asked that the model-specific config live in a different file, but if you see a way forward with a single config file and a strong reason to do so, let us know!
Context

To enable finer control over how we connect to LLM providers, and to introduce the capability to throttle API calls, we propose a new, unified `providers` configuration block. This block will group all provider-specific parameters (such as `name`, `type`, `api_base`, `temperature`, `top_p`, `cache`, `max_tokens`, and `additional_headers`) and include a new, configurable rate limiting mechanism.

For this epic, our focus is on implementing a "simple" rate limiting strategy; while we will document a token bucket approach for future work, that remains out of scope.

I wanted to start a discussion around this, and with your blessing I can get started on this change.
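As a rough sketch of the shape this grouping could take (a fuller example appears under Additional Context below; provider names, URLs, and values are illustrative only):

```yaml
providers:
  - name: openai
    type: openai
    api_base: https://api.openai.com/v1
    max_tokens: 4096
  - name: anthropic
    type: anthropic
    api_base: https://api.anthropic.com
    max_tokens: 8192
```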
Value

- Consolidates all provider-specific settings into a single, unified `providers` configuration block.
- Lets users configure rate limiting with request and token usage thresholds.
- Helps users throttle API calls according to their quota limits.
- Keeps the initial change small ("simple" strategy only) so it can be delivered and merged quickly.
- Lays the groundwork for more advanced rate limiting (e.g. token bucket) in future iterations.
Acceptance Criteria

- The configuration schema includes a new `providers` section containing: `name`, `type`, `api_base`, `temperature`, `top_p`, `cache`, `max_tokens`, and `additional_headers`.
- Each provider entry supports a `ratelimit` sub-section (initially supporting the "simple" strategy).
- Existing provider settings are migrated to, or read from, the new `providers` block.
- The simple strategy supports configurable limits on requests and tokens per period.
- Retry behavior (delay and maximum retries) is configurable for handling rate limit errors.
- Existing setups keep working without any change in default behavior.
- An example YAML configuration is documented (see Additional Context below).
Measurement

Persona(s)

- Users who hit provider rate limits and want finer control over API usage.
- Developers and power users who benefit from increased configuration flexibility and system robustness.
In Scope

- A new `providers` configuration element that groups all provider-specific parameters into a single block.
- Support for the parameters `name`, `type`, `api_base`, `temperature`, `top_p`, `cache`, `max_tokens`, and `additional_headers`.
- Migration of existing provider settings into the `providers` block.
- Existing environment variables (e.g. `GOOSE_PROVIDER`) are either mapped or properly deprecated.
- Extending `providers` to support the simple rate limiting mechanism via a `ratelimit` sub-element:
  - `type: simple`
  - `config.requests` (with `limit` and `period_seconds`)
  - `config.tokens` (with `limit` and `period_seconds`)
  - `config.retry` (with `delay_seconds` and `max_retries`)
- Integrating the rate limiter into the API request workflow.
- Documentation for the simple rate limiting mechanism.
- Notes on a token bucket strategy for future work (documentation only for this epic).
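Concretely, the nesting implied by the dotted names above might look like this (values are placeholders, not recommendations):

```yaml
ratelimit:
  type: simple
  config:
    requests:
      limit: 60           # max requests per window
      period_seconds: 60  # window length
    tokens:
      limit: 100000       # max tokens per window
      period_seconds: 60
    retry:
      delay_seconds: 30   # wait before retrying after a rate limit error
      max_retries: 3
```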
Out of Scope

- Advanced rate limiting strategies (e.g. token bucket, sliding window).
- Configuration changes beyond the new `providers` block.

Complexities

- Coordinating changes across configuration parsing, provider setup, and rate limiting logic.
- Enforcing request and token limits without introducing significant performance penalties.
Additional Context

Below is an example YAML configuration that combines existing configuration elements with the new `providers` block and a simple rate limiting mechanism:
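A sketch along those lines, using the fields and `ratelimit` schema described above (provider, header, and limit values are placeholders, not a final schema):

```yaml
providers:
  - name: anthropic
    type: anthropic
    api_base: https://api.anthropic.com
    temperature: 0.2
    top_p: 0.9
    cache: true
    max_tokens: 8192
    additional_headers:
      X-Example-Header: "example-value"  # placeholder header
    ratelimit:
      type: simple
      config:
        requests:
          limit: 50
          period_seconds: 60
        tokens:
          limit: 80000
          period_seconds: 60
        retry:
          delay_seconds: 30
          max_retries: 3
```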