Bug Report: VTGate fatal error: concurrent map iteration and map write #17410

Closed
quinox opened this issue Dec 19, 2024 · 0 comments · Fixed by #17417

quinox commented Dec 19, 2024

Overview of the Issue

VTGate crashes with the following message when a specific query is sent to it:

fatal error: concurrent map iteration and map write

We observed it in production every time a specific set of queries was performed (see the 6 attached stack traces), and we can reproduce it consistently in our dev environment. We found a workaround: the queries crash VTGate only when they are run in OLAP mode; in OLTP mode they do not crash it. We don't know why the workload setting matters.

Potentially related to: #17411

Reproduction Steps

We moved our platform to Vitess last night and were happy to rely on the workaround to get everything online and stable, so we have not yet had time to track down which specific query causes the crash. Hopefully the stack traces are enough for you to work with. If not, we'll try to narrow it down further next week.

Binary Version

# vtgate --version
vtgate version Version: 21.0.1 (Git revision 3d4f41db2fbc32611c7d2ea2af3dc68b9d962415 branch 'HEAD') built on Tue Dec  3 05:39:35 UTC 2024 by runner@fv-az2029-313 using go1.23.3 linux/amd64

# Installed via the deb package

Operating System and Environment details

# lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 24.04.1 LTS
Release:	24.04
Codename:	noble

# uname -sr
Linux 6.8.0-1019-aws

# uname -m
x86_64

Log Fragments

https://ytec.nl/media/4a0c2556-81f7-4203-aecc-abd6bb82808e/2024-12-18_server_b1_crash_01_map_iteration.txt
https://ytec.nl/media/4a0c2556-81f7-4203-aecc-abd6bb82808e/2024-12-18_server_b1_crash_02_map_iteration.txt
https://ytec.nl/media/4a0c2556-81f7-4203-aecc-abd6bb82808e/2024-12-18_server_b1_crash_03_map_iteration.txt
https://ytec.nl/media/4a0c2556-81f7-4203-aecc-abd6bb82808e/2024-12-18_server_b2_crash_01_map_iteration.txt
https://ytec.nl/media/4a0c2556-81f7-4203-aecc-abd6bb82808e/2024-12-18_server_c1_crash_01_map_iteration.txt
https://ytec.nl/media/4a0c2556-81f7-4203-aecc-abd6bb82808e/2024-12-18_server_c1_crash_02_map_iteration.txt

@quinox quinox added Needs Triage This issue needs to be correctly labelled and triaged Type: Bug labels Dec 19, 2024
@GuptaManan100 GuptaManan100 added Component: Query Serving and removed Needs Triage This issue needs to be correctly labelled and triaged labels Dec 19, 2024
@GuptaManan100 GuptaManan100 self-assigned this Dec 19, 2024