Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Lock order violation: m_ConnectionLock and m_LSLock. #2970

Closed
1 task done
maxsharabayko opened this issue Jul 1, 2024 · 1 comment · Fixed by #2998
Closed
1 task done

[BUG] Lock order violation: m_ConnectionLock and m_LSLock. #2970

maxsharabayko opened this issue Jul 1, 2024 · 1 comment · Fixed by #2998
Labels
[core] Area: Changes in SRT library core Type: Bug Indicates an unexpected problem or unintended behavior
Milestone

Comments

@maxsharabayko
Copy link
Collaborator

maxsharabayko commented Jul 1, 2024

This issue was initially listed in this comment.
Creating a separate ticket on it to track progress and investigation results.

Lock order "m_ConnectionLock before m_LSLock" is violated.

Case 1. Retrieve Socket Options from a Listen Callback

CRcvQueue::worker_ProcessConnectionRequest(..) locks m_LSLock to protect CRcvQueue::m_pListener from modifications while processing a connection request.

Within this function a listen callback may be called where an application can retrieve socket options srt_getsockopt for the newly created socket (being considered for accepting). The srt_getsockopt locks m_ConnectionLock while a lock on m_LSLock is still being held.

Thread checker report (click to expand)
0x4FCEB70 - m_ConnectionLock
0x4FD0A48 - m_LSLock
Thread #9: lock order "0x4FCEB70 before 0x4FD0A48" violated

Observed (incorrect) order is: acquisition of lock at 0x4FD0A48
   at 0x48510B8: mutex_lock_WRK (hg_intercepts.c:944)
   by 0x4E7487: srt::sync::Mutex::lock() (sync_posix.cpp:220)
   by 0x4E74FB: srt::sync::ScopedLock::ScopedLock(srt::sync::Mutex&) (sync_posix.cpp:236)
   by 0x56C6CF: srt::CRcvQueue::worker_ProcessConnectionRequest(srt::CUnit*, srt::sockaddr_any const&) (queue.cpp:1407)
   by 0x56BFA7: srt::CRcvQueue::worker(void*) (queue.cpp:1247)
   by 0x4D11FD7: ??? (in /lib/libpthread-2.23.so)

 followed by a later acquisition of lock at 0x4FCEB70
   at 0x48510B8: mutex_lock_WRK (hg_intercepts.c:944)
   by 0x4E7487: srt::sync::Mutex::lock() (sync_posix.cpp:220)
   by 0x4E74FB: srt::sync::ScopedLock::ScopedLock(srt::sync::Mutex&) (sync_posix.cpp:236)
   by 0x51A513: srt::CUDT::getOpt(SRT_SOCKOPT, void*, int&) (core.cpp:459)  ---> m_ConnectionLock
   by 0x4F59EF: srt::CUDT::getsockopt(int, int, SRT_SOCKOPT, void*, int*) (api.cpp:3783)
   by 0x4E5BEF: srt_getsockopt (srt_c_api.cpp:163)
   by 0x4BBE87: SrtConn_ListenCB (srtc_mod.c:9753)
   by 0x53C993: srt::CUDT::runAcceptHook(srt::CUDT*, sockaddr const*, srt::CHandShake const&, srt::CPacket const&) (core.cpp:11814)
   by 0x4E9E0B: srt::CUDTUnited::newConnection(int, srt::sockaddr_any const&, srt::CPacket const&, srt::CHandShake&, int&, srt::CUDT*&) (api.cpp:611)
   by 0x53AA93: srt::CUDT::processConnectRequest(srt::sockaddr_any const&, srt::CPacket&) (core.cpp:11101)
   by 0x56C7F3: srt::CRcvQueue::worker_ProcessConnectionRequest(srt::CUnit*, srt::sockaddr_any const&) (queue.cpp:1411)
   by 0x56BFA7: srt::CRcvQueue::worker(void*) (queue.cpp:1247)

Required order was established by acquisition of lock at 0x4FCEB70
   at 0x48510B8: mutex_lock_WRK (hg_intercepts.c:944)
   by 0x4E7487: srt::sync::Mutex::lock() (sync_posix.cpp:220)
   by 0x4E74FB: srt::sync::ScopedLock::ScopedLock(srt::sync::Mutex&) (sync_posix.cpp:236)
   by 0x51BDBF: srt::CUDT::setListenState() (core.cpp:1005)
   by 0x4EB46B: srt::CUDTUnited::listen(int, int) (api.cpp:1012)
   by 0x4F433F: srt::CUDT::listen(int, int) (api.cpp:3571)
   by 0x4E593F: srt_listen (srt_c_api.cpp:114)
   by 0x47B2A7: srtconn_SetupListener (srtc_mod.c:4291)
   by 0x48155F: srtgroup_Listen (srtc_mod.c:4803)
   by 0x48324B: srtgroup_Open (srtc_mod.c:4910)
   by 0x4848A3: srtconn_Open (srtc_mod.c:5044)
   by 0x489EA3: SrtConn_Open (srtc_mod.c:5308)

 followed by a later acquisition of lock at 0x4FD0A48
   at 0x48510B8: mutex_lock_WRK (hg_intercepts.c:944)
   by 0x4E7487: srt::sync::Mutex::lock() (sync_posix.cpp:220)
   by 0x4E74FB: srt::sync::ScopedLock::ScopedLock(srt::sync::Mutex&) (sync_posix.cpp:236)
   by 0x56D31F: srt::CRcvQueue::setListener(srt::CUDT*) (queue.cpp:1693)
   by 0x51BECB: srt::CUDT::setListenState() (core.cpp:1018)
   by 0x4EB46B: srt::CUDTUnited::listen(int, int) (api.cpp:1012)
   by 0x4F433F: srt::CUDT::listen(int, int) (api.cpp:3571)
   by 0x4E593F: srt_listen (srt_c_api.cpp:114)
   by 0x47B2A7: srtconn_SetupListener (srtc_mod.c:4291)
   by 0x48155F: srtgroup_Listen (srtc_mod.c:4803)
   by 0x48324B: srtgroup_Open (srtc_mod.c:4910)
   by 0x4848A3: srtconn_Open (srtc_mod.c:5044)

Case 2. Data Race if App Sets POST Socket Options Right After a Call to srt_listen()

Simultaneous access to listener's config (CUDT::m_config) from SRT internal newConnection(..) and the app thread called for srt_listen(). Although it does not seem the right thing to do, still the data race has to be avoided.
Described in this comment.
This way srt-xtransmit sets POST socket options.

Thread checker report (click to expand)
WARNING: ThreadSanitizer: data race (pid=2590)
  Read of size 8 at 0x7ba8000000a8 by thread T3 (mutexes: write M227):
    #0 memcpy ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:827 (libtsan.so.0+0x6243e)
    #1 memcpy ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:819 (libtsan.so.0+0x6243e)
    #2 srt::CSrtConfig::operator=(srt::CSrtConfig const&) /mnt/d/Projects/srt/srt-xtransmit-multi/submodule/srt/srtcore/socketconfig.h:167 (srt-xtransmit+0x2b6d98)
    #3 srt::CUDT::CUDT(srt::CUDTSocket*, srt::CUDT const&) /mnt/d/Projects/srt/srt-xtransmit-multi/submodule/srt/srtcore/core.cpp:335 (srt-xtransmit+0x2796e6)
    #4 srt::CUDTSocket::CUDTSocket(srt::CUDTSocket const&) /mnt/d/Projects/srt/srt-xtransmit-multi/submodule/srt/srtcore/api.h:113 (srt-xtransmit+0x24de9e)
    #5 srt::CUDTUnited::newConnection(int, srt::sockaddr_any const&, srt::CPacket const&, srt::CHandShake&, int&, srt::CUDT*&) /mnt/d/Projects/srt/srt-xtransmit-multi/submodule/srt/srtcore/api.cpp:544 (srt-xtransmit+0x233312)
    #6 srt::CUDT::processConnectRequest(srt::sockaddr_any const&, srt::CPacket&) /mnt/d/Projects/srt/srt-xtransmit-multi/submodule/srt/srtcore/core.cpp:11024 (srt-xtransmit+0x2b2175)
    #7 srt::CRcvQueue::worker_ProcessConnectionRequest(srt::CUnit*, srt::sockaddr_any const&) /mnt/d/Projects/srt/srt-xtransmit-multi/submodule/srt/srtcore/queue.cpp:1406 (srt-xtransmit+0x309743)
    #8 srt::CRcvQueue::worker(void*) /mnt/d/Projects/srt/srt-xtransmit-multi/submodule/srt/srtcore/queue.cpp:1238 (srt-xtransmit+0x30828c)

  Previous write of size 1 at 0x7ba8000000a9 by main thread (mutexes: write M188, write M193, write M192):
    #0 (anonymous namespace)::CSrtConfigSetter<(SRT_SOCKOPT)2>::set(srt::CSrtConfig&, void const*, int) /mnt/d/Projects/srt/srt-xtransmit-multi/submodule/srt/srtcore/socketconfig.cpp:213 (srt-xtransmit+0x31c8c3)
    #1 (anonymous namespace)::dispatchSet(SRT_SOCKOPT, srt::CSrtConfig&, void const*, int) /mnt/d/Projects/srt/srt-xtransmit-multi/submodule/srt/srtcore/socketconfig.cpp:903 (srt-xtransmit+0x31acf0)
    #2 srt::CSrtConfig::set(SRT_SOCKOPT, void const*, int) /mnt/d/Projects/srt/srt-xtransmit-multi/submodule/srt/srtcore/socketconfig.cpp:954 (srt-xtransmit+0x31a7fd)
    #3 srt::CUDT::setOpt(SRT_SOCKOPT, void const*, int) /mnt/d/Projects/srt/srt-xtransmit-multi/submodule/srt/srtcore/core.cpp:404 (srt-xtransmit+0x27bbe4)
    #4 srt::CUDT::setsockopt(int, int, SRT_SOCKOPT, void const*, int) /mnt/d/Projects/srt/srt-xtransmit-multi/submodule/srt/srtcore/api.cpp:3641 (srt-xtransmit+0x243f71)
    #5 srt_setsockopt /mnt/d/Projects/srt/srt-xtransmit-multi/submodule/srt/srtcore/srt_c_api.cpp:165 (srt-xtransmit+0x3227c9)
    #6 xtransmit::socket::srt::configure_post(int) /mnt/d/Projects/srt/srt-xtransmit-multi/xtransmit/srt_socket.cpp:324 (srt-xtransmit+0x10b67a)
    #7 xtransmit::socket::srt::listen() /mnt/d/Projects/srt/srt-xtransmit-multi/xtransmit/srt_socket.cpp:104 (srt-xtransmit+0x10968a)
    #8 xtransmit::create_connection(std::vector<UriParser, std::allocator<UriParser> > const&, std::shared_ptr<xtransmit::socket::isocket>&) /mnt/d/Projects/srt/srt-xtransmit-multi/xtransmit/misc.cpp:70 (srt-xtransmit+0xd330f)
    #9 xtransmit::common_run(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, xtransmit::stats_config const&, xtransmit::conn_config const&, std::atomic<bool> const&, std::function<void (std::shared_ptr<xtransmit::socket::isocket>, std::atomic<bool> const&)>&) /mnt/d/Projects/srt/srt-xtransmit-multi/xtransmit/misc.cpp:145 (srt-xtransmit+0xd37f6)
    #10 xtransmit::receive::run(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, xtransmit::receive::config const&, std::atomic<bool> const&) /mnt/d/Projects/srt/srt-xtransmit-multi/xtransmit/receive.cpp:141 (srt-xtransmit+0xe190f)
    #11 main /mnt/d/Projects/srt/srt-xtransmit-multi/xtransmit/xtransmit-app.cpp:231 (srt-xtransmit+0x127092)

Status

Some ongoing work is in PR #2961.

@maxsharabayko maxsharabayko added Type: Bug Indicates an unexpected problem or unintended behavior [core] Area: Changes in SRT library core labels Jul 1, 2024
@maxsharabayko maxsharabayko added this to the v1.5.4 milestone Jul 1, 2024
@ethouris
Copy link
Collaborator

ethouris commented Jul 1, 2024

This is a known problem, but not much of a solution has been proposed so far. Fortunately the situation executed with different lock ordering can't happen simultaneously, except for a situation when trying to close the socket in the listener callback handler.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[core] Area: Changes in SRT library core Type: Bug Indicates an unexpected problem or unintended behavior
Projects
None yet
2 participants