What's the system load on the server? What's the memory consumption?
The output of top doesn't seem to indicate any kind of throttling. I believe the userspace CPU load is somewhat higher during the query, but still very low.
Memory, as reported by free -h:
me@graphite-dev:~$ free -h
               total        used        free      shared  buff/cache   available
Mem:            30Gi       3.1Gi       628Mi       1.0Mi        27Gi        27Gi
Swap:             0B          0B          0B
This is our dev host.
If possible, please provide the profiler's output (see https://blog.golang.org/pprof for instructions on how to do that; debug handlers can be enabled in carbonapi's config on a separate port). Go allows different types of profiles. It's always better to provide a CPU profile, a memory profile, and a lock profile, but that depends on the symptoms.
Two CPU profiles captured during the request are attached.
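For anyone reproducing this, a minimal sketch of how such profiles can be captured once the pprof debug handlers are enabled on a separate port; the address 127.0.0.1:7071 below is an assumption, not our actual setting:

# CPU profile, sampled for 30 seconds while the slow query is running
go tool pprof -seconds 30 http://127.0.0.1:7071/debug/pprof/profile
# heap (memory) profile
go tool pprof http://127.0.0.1:7071/debug/pprof/heap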
Debug logs could be helpful as well
Setting the log level to debug has a severe impact on performance, so for now I've kept it at "info". The only two relevant log entries for this request (grepped for a\.\*b\*\.c\.d\.\*) are:
"level":"ERROR","timestamp":"2024-02-18T13:51:21.454Z","logger":"access","message":"request failed","data":{"handler":"render","carbonapi_uuid":"d1118526-3647-4c52-99ea-49d19b549f74","url":"/render","peer_ip":"10.42.136.247","host":"{our graphite dev host}:4040","format":"json","use_cache":true,"targets":["a.*b*.c.d.*"],"cache_timeout":180,"runtime":59.89556336,"http_code":503,"reason":"timeout while fetching Response","from":1708263920,"until":1708264222,"from_raw":"1708263920","until_raw":"1708264222","uri":"/render","from_cache":false,"used_backend_cache":false,"request_headers":{"X-Grafana-Org-Id":"1","X-Panel-Id":"1"}}}
and a tens-of-megabytes-long entry that starts like this:
{"level":"WARN","timestamp":"2024-02-18T13:52:40.013Z","logger":"zipper","message":"errors occurred while getting results","type":"protoV2Group","name":"http://127.0.0.1:8080","type":"fetch","request":"&MultiFetchRequest{Metrics:[]FetchRequest{FetchRequest{Name:
{{ removed for brevity }}
,"carbonapi_uuid":"d1118526-3647-4c52-99ea-49d19b549f74","errors":"max tries exceeded","errorsVerbose":"max tries exceeded\nHTTP Code: 500\n\ngithub.com/go-graphite/carbonapi/zipper/types.init\n\t/root/go/src/github.com/go-graphite/carbonapi/zipper/types/errors.go:25\nruntime.doInit\n\t/usr/local/go/src/runtime/proc.go:6321\nruntime.doInit\n\t/usr/local/go/src/runtime/proc.go:6298\nruntime.doInit\n\t/usr/local/go/src/runtime/proc.go:6298\nruntime.doInit\n\t/usr/local/go/src/runtime/proc.go:6298\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:233\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1594\n\nCaused By: failed to fetch data from server/group\nHTTP Code: 500\n\ngithub.com/go-graphite/carbonapi/zipper/types.init\n\t/root/go/src/github.com/go-graphite/carbonapi/zipper/types/errors.go:27\nruntime.doInit\n\t/usr/local/go/src/runtime/proc.go:6321\nruntime.doInit\n\t/usr/local/go/src/runtime/proc.go:6298\nruntime.doInit\n\t/usr/local/go/src/runtime/proc.go:6298\nruntime.doInit\n\t/usr/local/go/src/runtime/proc.go:6298\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:233\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1594\n\nCaused By: error while fetching Response\n\ngithub.com/go-graphite/carbonapi/zipper/types.init\n\t/root/go/src/github.com/go-graphite/carbonapi/zipper/types/errors.go:34\nruntime.doInit\n\t/usr/local/go/src/runtime/proc.go:6321\nruntime.doInit\n\t/usr/local/go/src/runtime/proc.go:6298\nruntime.doInit\n\t/usr/local/go/src/runtime/proc.go:6298\nruntime.doInit\n\t/usr/local/go/src/runtime/proc.go:6298\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:233\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1594"
Query that causes problems
Grafana issues the query; it's a render query. The query target is a.*b*.c.d.*.
According to the go-carbon logs, ~65.5K files (i.e. metrics) are returned.
Resolution: I think it's irrelevant.
Number of datapoints: also irrelevant, to my mind.
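To take Grafana and nginx out of the equation, the same request can be replayed directly against carbonapi with curl; the from/until values and port 4040 are taken from the access-log entry above, while the 127.0.0.1 host is an assumption:

curl -sS -o /dev/null -w 'HTTP %{http_code} in %{time_total}s\n' \
  'http://127.0.0.1:4040/render?target=a.*b*.c.d.*&from=1708263920&until=1708264222&format=json'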
Additional context
The carbonapi log reports a status code of 503. The nginx reverse proxy logs a 499, which means the client closed the connection before a response was sent. Either way, one minute feels far too long.
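To narrow down where that minute goes, it may help to time the go-carbon backend on its own, i.e. the carbonserver endpoint the zipper talks to; the address 127.0.0.1:8080 is taken from the zipper log above, and format=json is an assumption about which find formats carbonserver has enabled. This only times the glob expansion to ~65.5K metric names, not the data fetch itself:

curl -sS -o /dev/null -w 'HTTP %{http_code} in %{time_total}s\n' \
  'http://127.0.0.1:8080/metrics/find/?format=json&query=a.*b*.c.d.*'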
Problem description
Query
a.*b*.c.d.* times out after a minute.
carbonapi's version
v0.16.0
Did this happen before?
Not applicable (I didn't run an earlier version).
carbonapi's config
Will provide if relevant.
backend software and config
Go-Carbon 0.17.3.
Config will be provided if needed.
carbonapi performance metrics (enable expvars in the config and provide its output)
I believe I have the required expvars section in the config, but I don't know what output this question refers to.
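For reference, a minimal sketch of what that expvar section typically looks like, modelled on carbonapi's example config; the listen address is an assumption and the key names should be double-checked against the version in use:

expvar:
  enabled: true
  pprofEnabled: true          # also exposes the /debug/pprof/ handlers used for the profiles above
  listen: "127.0.0.1:7071"    # assumed separate, local-only debug port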