Properly checking the return value of ipc_sendrecv_with_fds #624

Closed
wants to merge 2 commits into from

deanlee
Contributor

@deanlee deanlee commented Jun 26, 2024

This PR addresses an issue where the VIPC server closes the socket if the required stream type is unavailable:

if (buffers.count(type) <= 0) {
  std::cout << "got request for invalid buffer type: " << type << std::endl;
  close(fd);
  continue;
}

This causes the ipc_sendrecv_with_fds function in the VIPC client to return 0. However, the client does not handle this case. Instead, it relies on two assertions that remain true when that happens:
r = ipc_sendrecv_with_fds(false, socket_fd, &bufs, sizeof(bufs), fds, VISIONIPC_MAX_FDS, &num_buffers);
assert(num_buffers >= 0);
assert(r == sizeof(VisionBuf) * num_buffers);
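
The zero return value is ordinary socket behavior: when the peer closes a connection-oriented socket without sending anything, the receive call reports 0 bytes rather than an error. The following is a minimal sketch of that behavior, not code from this repository. The helper name `recv_after_peer_close` is hypothetical, and it uses a plain `SOCK_STREAM` socket pair and `recv`, which is an assumption and may differ from the socket type and call that VIPC uses.

```cpp
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: returns what recv() reports on one end of a Unix
// socket pair after the other end was closed without sending any data.
ssize_t recv_after_peer_close() {
  int fds[2];
  if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0) {
    return -1;  // could not create the pair
  }
  close(fds[1]);  // the "server" side closes, like close(fd) in the snippet above

  char buf[16];
  ssize_t r = recv(fds[0], buf, sizeof(buf), 0);  // the "client" waits for a reply
  close(fds[0]);
  return r;  // 0 signals an orderly shutdown by the peer
}
```

Because 0 is a normal return value and not an error code, a caller that only asserts on the size relationship will not notice that the request was rejected.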

This PR adds an explicit check that handles a return value from ipc_sendrecv_with_fds that is less than or equal to 0. The change resolves the following bugs:
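
As a rough illustration of the difference (a sketch only, not the code in this PR), the two hypothetical helpers below compare the original assertion conditions with an explicit check. `original_asserts_hold` and `reply_is_valid` are made-up names, `buf_size` stands in for `sizeof(VisionBuf)`, and the sketch assumes `num_buffers` is left at 0 when the server closes the socket.

```cpp
#include <cstddef>
#include <sys/types.h>

// Mirrors the two original assertions as a boolean (hypothetical helper).
bool original_asserts_hold(ssize_t r, int num_buffers, size_t buf_size) {
  return num_buffers >= 0 &&
         static_cast<size_t>(r) == buf_size * static_cast<size_t>(num_buffers);
}

// Explicit check (hypothetical helper): reject closed sockets and errors
// before looking at the size relationship.
bool reply_is_valid(ssize_t r, int num_buffers, size_t buf_size) {
  if (r <= 0) return false;            // 0: peer closed the socket, <0: receive error
  if (num_buffers <= 0) return false;  // nothing usable was received
  return static_cast<size_t>(r) == buf_size * static_cast<size_t>(num_buffers);
}
```

With `r == 0` and `num_buffers == 0`, `original_asserts_hold` returns true while `reply_is_valid` returns false, which is the case the assertions miss.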

1. Assertion in getAvailableStreams

The VIPC client can still connect to the IPC path even if camerad is not running. This could happen when camerad did not shut down properly and the IPC path was not removed.

In this case ipc_sendrecv_with_fds returns -1, which causes the following assertion to fail:

ui: cereal/visionipc/visionipc_client.cc:133: static std::set VisionIpcClient::getAvailableStreams(const std::string &, bool): Assertion `(r >= 0) && (r % sizeof(VisionStreamType) == 0)' failed.

This bug can be reproduced by clicking the "preview driver camera" button quickly and repeatedly in the UI.
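
One way to avoid the assertion is to treat a failed or empty reply as "no streams available". The sketch below is only an illustration under stated assumptions: `StreamType` is a hypothetical stand-in for `VisionStreamType`, and `streams_from_reply` is a made-up helper, not the actual body of getAvailableStreams.

```cpp
#include <cstddef>
#include <set>
#include <sys/types.h>

// Hypothetical stand-in for VisionStreamType.
enum StreamType : int { STREAM_ROAD = 0, STREAM_DRIVER = 1, STREAM_WIDE_ROAD = 2 };

// Turns a raw reply of r bytes into a set of stream types.
// Returns an empty set instead of asserting when the reply is unusable.
std::set<StreamType> streams_from_reply(const StreamType *buf, ssize_t r) {
  std::set<StreamType> streams;
  if (r <= 0) return streams;  // closed socket or receive error
  const size_t bytes = static_cast<size_t>(r);
  if (bytes % sizeof(StreamType) != 0) return streams;  // truncated reply
  for (size_t i = 0; i < bytes / sizeof(StreamType); ++i) {
    streams.insert(buf[i]);
  }
  return streams;
}
```

The caller can then handle an empty set as "server not ready" and retry later.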

2. VisionIpcClient::connect returns true even when connecting to a non-existent stream type
ipc_sendrecv_with_fds returns 0 if the required stream type is not available. With this PR, connect returns false in this case.

This bug can be reproduced by running watch3 and then replay --demo. Quit replay after the road camera is displayed in watch3, then run replay --demo --ecam --dcam. watch3 will hit the following assertion and quit:

watch3: msgq_repo/msgq/visionipc/visionipc_client.cc:96: VisionBuf *VisionIpcClient::recv(VisionIpcBufExtra *, const int): Assertion `packet->idx < num_buffers' failed.

3. crash in VideoWidget::vipcAvailableStreamsUpdated:

#2 0x00007f882bce5729 in __assert_fail_base (fmt=0x7f882be7b588 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x5618e8 "", file=0x5617ff "repeat-1", line=263, function=) at assert.c:92
#3 0x00007f882bcf6fd6 in __GI___assert_fail (assertion=0x5618e8 "", file=0x5617ff "repeat-1", line=263, function=0x5618be "\311?") at assert.c:101
#4 0x000000000047ebce in QString::QString(char const*) (this=0x7fffde080230, ch=0x2 <error: Cannot access memory at address 0x2>) at /usr/include/x86_64-linux-gnu/qt5/QtCore/qstring.h:700
#5 VideoWidget::vipcAvailableStreamsUpdated(std::set<VisionStreamType, std::less, std::allocator >) (this=Python Exception <class 'gdb.MemoryError'> Cannot access memory at address 0x0:
0x6546a8 StreamNotifier::instance()::notifier+8, streams=#6 0x000000000047e9a2 in VideoWidget::loopPlaybackClicked() (this=) at tools/cabana/videowidget.cc:198
#7 0x0000000001bb68e0 in ()
#8 0x0000000001bb68b0 in ()
#9 0x0000000001eb9680 in ()
#10 0x0000000001c4ef10 in ()
#11 0x00007fffde080240 in ()
#12 0x00007f882cca5b40 in () at /lib/x86_64-linux-gnu/libQt5Core.so.5
#13 0x00007f882ca10328 in QMetaObject::activate(QObject*, int, int, void**) () at /lib/x86_64-linux-gnu/libQt5Core.so.5

@deanlee deanlee force-pushed the fix_available_streams branch 2 times, most recently from db715bd to 2bada53 Compare June 26, 2024 10:21
@deanlee deanlee force-pushed the fix_available_streams branch from 2bada53 to c7d540a Compare August 8, 2024 13:55
@sshane
Contributor

sshane commented Dec 17, 2024

@deanlee should we check for the specific errno in this case so we can still catch unexpected errors? (ECONNRESET)
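
A possible shape for such a check, as a sketch only: the `RecvOutcome` enum and `classify_recv` helper below are hypothetical names, and treating only ECONNRESET as an expected disconnect is an assumption taken from the question above, not a confirmed design.

```cpp
#include <cerrno>
#include <sys/types.h>

// Hypothetical classification of a receive result.
enum class RecvOutcome { Ok, PeerGone, Unexpected };

// r is the return value of the receive call, err is the errno value
// captured immediately after it.
RecvOutcome classify_recv(ssize_t r, int err) {
  if (r > 0) return RecvOutcome::Ok;
  if (r == 0) return RecvOutcome::PeerGone;           // server closed the socket
  if (err == ECONNRESET) return RecvOutcome::PeerGone;  // expected disconnect
  return RecvOutcome::Unexpected;                     // still worth asserting on
}
```

The caller could return false for `PeerGone` and keep an assertion for `Unexpected`, so other failures remain visible.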

@sshane
Contributor

sshane commented Dec 17, 2024

Looks like there's another crash when spamming the driver view, even with this PR:

(gdb) bt
#0  0x0000007fb4d40de0 in main_arena () from /lib/aarch64-linux-gnu/libc.so.6
#1  0x0000007fb3c16624 in ?? () from /lib/aarch64-linux-gnu/libffi.so.8
#2  0x0000007fb3c1383c in ?? () from /lib/aarch64-linux-gnu/libffi.so.8
#3  0x0000007fb50567c4 in ?? () from /lib/aarch64-linux-gnu/libwayland-client.so.0
#4  0x0000007fb505711c in ?? () from /lib/aarch64-linux-gnu/libwayland-client.so.0
#5  0x0000007fb5057444 in wl_display_dispatch_queue_pending () from /lib/aarch64-linux-gnu/libwayland-client.so.0
#6  0x0000007fafe36078 in EglWaylandWlWindowSurface::CommitBuffer(void*, unsigned int) () from /lib/aarch64-linux-gnu/libeglSubDriverWayland.so
#7  0x0000007fafe37e78 in EglWaylandUpdater::UpdaterThread() () from /lib/aarch64-linux-gnu/libeglSubDriverWayland.so
#8  0x0000007fb4c1597c in start_thread (arg=0x0) at ./nptl/pthread_create.c:447
#9  0x0000007fb4c7b7dc in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:79

2 participants