Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Race condition in MultiThreadedExecutor #1393

Open
felixdivo opened this issue Jan 4, 2025 · 4 comments
Open

Race condition in MultiThreadedExecutor #1393

felixdivo opened this issue Jan 4, 2025 · 4 comments

Comments

@felixdivo
Copy link
Contributor

felixdivo commented Jan 4, 2025

Bug report

Required Info:

  • Operating System:
    • Ubuntu 24.04.1
  • Installation type:
    • Binaries
  • Version or commit hash:
  • DDS implementation:
    • unknown
  • Client library (if applicable):
    • rclpy

Steps to reproduce the issue

Documented here: felixdivo/ros2-easy-test#45. It is not 100% reproducible, yet it occurs frequently in my repo.

Expected behavior

No exception is raised.

Actual behavior

self = <rclpy.executors.MultiThreadedExecutor object at 0x7fb80c2f32f0>
timeout_sec = <rclpy.executors.TimeoutObject object at 0x7fb80c41cb90>
wait_condition = <bound method Future.done of <rclpy.task.Task object at 0x7fb80c2f30b0>>

    def _spin_once_impl(
        self,
        timeout_sec: Optional[Union[float, TimeoutObject]] = None,
        wait_condition: Callable[[], bool] = lambda: False
    ) -> None:
        try:
            handler, entity, node = self.wait_for_ready_callbacks(
                timeout_sec, None, wait_condition)
        except ExternalShutdownException:
            pass
        except ShutdownException:
            pass
        except TimeoutException:
            pass
        except ConditionReachedException:
            pass
        else:
            self._executor.submit(handler)
            self._futures.append(handler)
            # make a copy of the list that we iterate over while modifying it
            # (https://stackoverflow.com/q/1207406/3753684)
            for future in self._futures[:]:
                if future.done():
>                   self._futures.remove(future)
E                   ValueError: list.remove(x): x not in list

/opt/ros/rolling/lib/python3.12/site-packages/rclpy/executors.py:895: ValueError

Additional information

Previously, I had similar issues caused by race conditions reported in #1129. This again appears to be caused by multiple threads simultaneously acting on the executor. Is there some unintended usage by me?

Possible solution

If not, I suggest going with the it is "Easier to ask for forgiveness than permission" and just wrap the self._futures.remove(future) in a try-except block catching and ValueErrors. I can also do that if you confirm it is a reasonable path forward.

@Barry-Xu-2018
Copy link
Contributor

This again appears to be caused by multiple threads simultaneously acting on the executor. Is there some unintended usage by me?

In your test, is it possible to call _spin_once_impl() simultaneously from different threads for the same executor ?

@felixdivo
Copy link
Contributor Author

felixdivo commented Jan 6, 2025

I thought not; that's why I was skeptical. But upon further inspection, there is a case!

  1. One place is here, where the executor spins until the test case is complete.
  2. The other one is in the test case, where waiting for the action to conclude eventually calls spin_until_future_complete() too.

They necessarily run in parallel. Is that intended usage of a rclpy.executors.MultiThreadedExecutor? The documentation around its thread safety is a bit ... thin.

@Barry-Xu-2018
Copy link
Contributor

They necessarily run in parallel. Is that intended usage of a rclpy.executors.MultiThreadedExecutor?

For the same executor, spin_until_future_complete() should not be run in parallel. So I think this isn't the correct way to use it.

@felixdivo
Copy link
Contributor Author

Hmm, is there some documentation on what is thread-safe and what isn't? I must've missed it if it exists.

I'm honestly a bit lost on properly doing this kind of thing if not that way. Is there some other way in which one can wait for a Future to be available that is not busy-waiting?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants