Cygwin: Build with SRT is ok, but crash when running. #3251

Open
Tracked by #2532
winlinvip opened this issue Nov 20, 2022 · 2 comments
Labels
EnglishNative This issue is conveyed exclusively in English. SRT It's about SRT protocol.

winlinvip commented Nov 20, 2022

Workaround: SRT is now disabled by default on Cygwin.

The stack is below:

gdb: unknown target exception 0x20474343 at 0x7fff993c039c

Thread 1 "srs" received signal ?, Unknown signal.
0x00007fff993c039c in RaiseException () from /cygdrive/c/Windows/System32/KERNELBASE.dll
(gdb) bt
#0  0x00007fff993c039c in RaiseException () from /cygdrive/c/Windows/System32/KERNELBASE.dll
#1  0x00000003ffc3cca1 in cyggcc_s-seh-1!_Unwind_RaiseException () from /usr/bin/cyggcc_s-seh-1.dll
#2  0x00000003ff20819b in cygstdc++-6!.cxa_throw () from /usr/bin/cygstdc++-6.dll
#3  0x00000001008c86c7 in CUDTUnited::accept(int, sockaddr*, int*) ()
#4  0x00000001008d17c2 in CUDT::accept(int, sockaddr*, int*) ()
#5  0x0000000100491d4c in SrsSrtSocket::accept (this=0x8000ed540, client_srt_fd=0x6fffff0afcb4) at ./src/protocol/srs_protocol_srt.cpp:757
#6  0x0000000100580360 in SrsSrtListener::cycle (this=0x8000e77b0) at ./src/app/srs_app_srt_listener.cpp:87
#7  0x00000001004ca5a9 in SrsFastCoroutine::cycle (this=0x8000fd360) at ./src/app/srs_app_st.cpp:285
#8  0x00000001004ca638 in SrsFastCoroutine::pfn (arg=0x8000fd360) at ./src/app/srs_app_st.cpp:300
#9  0x00000001005c6573 in _st_thread_main () at sched.c:371
#10 0x00000001005c6ee8 in st_thread_create (start=0x1, arg=0x18017c85d <dlcalloc+109>, joinable=16, stk_size=-14592) at sched.c:657
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
@winlinvip winlinvip added the SRT It's about SRT protocol. label Nov 20, 2022
@winlinvip winlinvip added the EnglishNative This issue is conveyed exclusively in English. label Jul 29, 2023
winlinvip commented Mar 16, 2024

Based on the work of @xiaozhihong, I took the time to examine this issue in detail:

  1. Multithreading itself is not the problem, as shown by the verification program in "ST: Research adds examples that demos pthread and helloworld. v6.0.118" #3989.
  2. C++ exceptions work on other platforms without issues but fail on Cygwin, as shown by the same verification program in #3989.
  3. Attempts to fix the ST stack by copying the entire stack were unsuccessful, as noted in PR "ST: Add support to save RBP register that allows stack backtrace" #3987.
  4. For research on exception handling mechanisms such as SEH, DWARF, and SJLJ, see the branch win-st-seh.
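
To make the kind of check described in the list above concrete, here is a minimal sketch that throws and catches a C++ exception on a heap-allocated stack. It is not the verification program from #3989: it assumes POSIX ucontext (getcontext, makecontext, swapcontext) instead of ST, and the names run_throw_on_heap_stack and coroutine_entry are hypothetical. On Linux with glibc this is expected to succeed; whether it behaves the same under Cygwin is exactly the open question of this issue.

```cpp
#include <ucontext.h>

#include <stdexcept>
#include <string>
#include <vector>

// Assumption: POSIX ucontext stands in for ST's context switching.
static ucontext_t g_main_ctx;  // caller context, running on the OS-created stack
static ucontext_t g_co_ctx;    // coroutine context, running on a heap-allocated stack
static std::string g_result;   // message captured by the catch block

// Entry point executed on the heap-allocated stack.
static void coroutine_entry() {
    try {
        throw std::runtime_error("thrown on heap stack");
    } catch (const std::exception& e) {
        g_result = e.what();
    }
    // Returning from this function resumes the context stored in uc_link.
}

// Hypothetical helper: runs coroutine_entry on a heap stack and
// returns the message of the exception it caught there.
std::string run_throw_on_heap_stack() {
    std::vector<char> stack(256 * 1024);  // heap memory used as the coroutine stack
    g_result.clear();

    if (getcontext(&g_co_ctx) != 0) return "getcontext failed";
    g_co_ctx.uc_stack.ss_sp = stack.data();
    g_co_ctx.uc_stack.ss_size = stack.size();
    g_co_ctx.uc_link = &g_main_ctx;       // where to continue when the entry returns
    makecontext(&g_co_ctx, coroutine_entry, 0);

    // Switch to the coroutine; control returns here once it finishes.
    if (swapcontext(&g_main_ctx, &g_co_ctx) != 0) return "swapcontext failed";
    return g_result;
}
```

The exception is thrown and caught inside the same coroutine, so the unwinder only has to walk frames that live on the heap-allocated stack.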

To summarize, here are several conclusions:

  1. The fundamental issue is that ST does not support Windows SEH (structured exception handling) and will not support it in the future because of the high complexity involved. Supporting it would require hacking the entire Windows SEH mechanism, which would significantly reduce maintainability and stability.
  2. The only possible solution is to rewrite the SRT protocol without using C++ exception handling to achieve better portability. This custom-implemented protocol stack could be enabled specifically for the Cygwin platform.
  3. Since SRS has its own implementation of protocols, adding an implementation for the SRT protocol would be in line with its conventions. Supporting general Windows platform users is considered worthwhile.
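
As an illustration of the second conclusion, the sketch below shows an accept-style call that reports failure through a return value instead of throwing. Everything in it is hypothetical: Listener, AcceptError, and AcceptResult are not part of libsrt or SRS, and the pending queue merely stands in for real protocol state.

```cpp
#include <deque>

// Hypothetical error codes; not taken from libsrt or SRS.
enum class AcceptError { Ok, WouldBlock, InvalidSocket };

// Result of an accept call; fd is meaningful only when err == Ok.
struct AcceptResult {
    AcceptError err;
    int fd;
};

// Hypothetical listener whose accept() never throws.
class Listener {
public:
    explicit Listener(bool valid) : valid_(valid) {}

    // Simulates a peer that has completed its handshake.
    void push_pending(int fd) { pending_.push_back(fd); }

    // Reports every failure as an error code, so no unwinding is ever needed.
    AcceptResult accept() noexcept {
        if (!valid_) return {AcceptError::InvalidSocket, -1};
        if (pending_.empty()) return {AcceptError::WouldBlock, -1};
        int fd = pending_.front();
        pending_.pop_front();
        return {AcceptError::Ok, fd};
    }

private:
    bool valid_;
    std::deque<int> pending_;
};
```

Because no exception ever crosses a frame on a coroutine stack, a design like this does not depend on how the platform unwinds the stack.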

See also:

  1. Mixing Win32 SEH with heap-allocated stack frames
  2. SEH setup for fibers with exception chain validation (SEHOP) active

@winlinvip winlinvip assigned xiaozhihong and unassigned wenjiegit Mar 16, 2024
winlinvip commented Apr 9, 2024

Coroutines that run on their own stacks are not compatible with Windows SEH exceptions, which libsrt introduces. Therefore, I believe that if we call libsrt APIs in the main coroutine, we can bypass this issue. This works when SEH exceptions are raised in the main coroutine, or in threads forked by the main coroutine. The main coroutine is actually the primordial main thread, whose stack is created by the OS rather than by us, so this stack should be compatible with SEH exceptions.

A possible solution is to run libsrt on the primordial coroutine. Since the primordial coroutine is a normal stack without any modifications, it should support SEH, but this requires further research.
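
The idea of keeping exceptions on an OS-created stack could be sketched as follows. This is only an illustration under assumptions: call_on_os_thread and CallResult are hypothetical names, std::thread stands in for "a thread forked by the main coroutine", and the callable stands in for a libsrt call that may throw. Whether this actually avoids the crash on Cygwin is unverified.

```cpp
#include <exception>
#include <functional>
#include <future>
#include <stdexcept>
#include <string>
#include <thread>

// Hypothetical result type: the exception is converted to plain data
// before it leaves the OS thread.
struct CallResult {
    bool ok;
    int value;
    std::string error;
};

// Runs fn on a freshly created OS thread (OS-allocated stack), catches any
// exception there, and hands the outcome back as a value.
CallResult call_on_os_thread(const std::function<int()>& fn) {
    std::promise<CallResult> promise;
    std::future<CallResult> future = promise.get_future();

    std::thread worker([&promise, &fn]() {
        CallResult r{true, 0, ""};
        try {
            r.value = fn();
        } catch (const std::exception& e) {
            r.ok = false;
            r.error = e.what();
        } catch (...) {
            r.ok = false;
            r.error = "unknown exception";
        }
        promise.set_value(r);
    });

    CallResult result = future.get();  // blocks the caller until fn finishes
    worker.join();
    return result;
}
```

For example, call_on_os_thread([]() { return 42; }) yields ok == true and value == 42, while a callable that throws std::runtime_error yields ok == false with the exception message in error. Note that future.get() blocks the calling thread; a coroutine-based server would need a non-blocking way to wait for the result.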
