really long merge time on big networks #5
Comments
Could you make a list of all the definitions / requirements that are present before and after the transition? It would help to create a test case to profile what goes wrong. |
Running through a controller:
Started in the syskit shell:
The state machine minimal_demo was in state
Then the event "align_auv" happened, which caused a success_event in find_pipe_with_localization, which should start this state in minimal_demo:
|
I need to know which definitions were running before AND after the |
Do you also mean all the definitions resolved by syskit? That could be a LOT (including instance requirements?). The state machine find_pipe_with_localization had finished and was in state pipe_detector (which is the only state of this machine, btw). Or how do I figure out what you mean by definitions? |
I mean the instance requirements that are active at the time of the switch. Basically, I want to know what syskit had deployed before the transition. The problem is with before the transition (since it uses a 2014-07-04 16:13 GMT+02:00 Matthias Goldhoorn [email protected]:
|
How can I gather this information? Do I have to extract it from the Roby logs? Best, On 04.07.2014 16:27, Sylvain Joyeux wrote:
|
Well, you can get them by looking at your state machine definitions. They are basically the last state of find_pipe_with_localization plus any depends_on you have declared plus the ones you start manually. |
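The rule described above (last state, plus declared depends_on, plus manually started ones) can be sketched as plain set arithmetic. This is only an illustration of the bookkeeping, not Syskit code: the function name and every requirement name except pipe_detector are hypothetical placeholders.

```python
def active_requirements(last_state, depends_on, manually_started):
    """Union of the three sources of active instance requirements."""
    return set(last_state) | set(depends_on) | set(manually_started)


# Hypothetical requirement names, used only to show the comparison.
before = active_requirements(
    last_state=["pipe_detector"],
    depends_on=["localization"],
    manually_started=["camera_logger"],
)
after = active_requirements(
    last_state=["pipe_follower"],
    depends_on=["localization"],
    manually_started=["camera_logger"],
)

stopped = before - after   # active only before the transition
started = after - before   # active only after the transition
kept = before & after      # active on both sides of the transition

print("stopped:", sorted(stopped))
print("started:", sorted(started))
print("kept:", sorted(kept))
```

Listing the two sets side by side like this is roughly the information a profiling test case for the switch would need.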
Uhm, you are confusing me; this is what I wrote in the third comment.
and after
|
Okay, we have a new negative record. This caused a reconfiguration of 2 minutes, and after that the SV went completely insane. The text output was:
|
To correct myself: we used another state machine, but this part of it is the same as in the one explained earlier... And I have the patch proposed in #3 active. |
Well .. correct yourself one more time. Yes, syskit really takes too long to compute these changes. But syskit's algorithm is deterministic: ask it for the same transition, and it is going to take the same time, +/- 10%. So I went looking somewhere else and found out that I am logging the CPU time spent in Roby/Syskit. From the Roby logs, it seems that you are starving the syskit/roby process. In the 75s cycle in which the switch happened, syskit only took 21s of CPU time (which matches your current - bad - experience with long switches). You can generate this data with
You'll have to remove the first lines, which are not CSV-compliant, before you can feed that into localc or something equivalent. The orogen_loaders branch takes 1.8s to switch on a system that is roughly equivalent to yours (Core i7 2.0 GHz). I don't think it is worth our collective time to continue discussing this on this thread until we get orogen_loaders to work for you. The patch in #3 is wrong (broken, as you can see in the output you just posted). I'll answer directly there. |
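For the CSV cleanup step mentioned above, a minimal sketch follows. It assumes that the real table is whatever row width occurs most often in the file and drops everything before the first such row. The column names (cycle, wall_time, cpu_time) and the sample preamble are made up for illustration; the actual log export may look different.

```python
import csv
import io
from collections import Counter


def strip_non_csv_preamble(text, delimiter=","):
    """Drop leading lines that do not match the dominant column count."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return ""
    # Parse each line on its own so one odd line cannot affect the others.
    widths = [len(next(csv.reader([line], delimiter=delimiter))) for line in lines]
    # Assumption: the most common row width belongs to the real table.
    table_width, _ = Counter(widths).most_common(1)[0]
    first = widths.index(table_width)
    return "\n".join(lines[first:]) + "\n"


# Hypothetical export: two non-CSV lines, then a header and two data rows.
sample = (
    "loading log file\n"
    "2 warnings\n"
    "cycle,wall_time,cpu_time\n"
    "2083,0.10,0.08\n"
    "2084,75.0,21.0\n"
)

cleaned = strip_non_csv_preamble(sample)
rows = list(csv.DictReader(io.StringIO(cleaned)))

# Share of the cycle that was actually spent on the CPU (21s out of 75s).
slow = rows[-1]
cpu_share = float(slow["cpu_time"]) / float(slow["wall_time"])
print(f"cycle {slow['cycle']}: {cpu_share:.0%} of wall time spent on CPU")
```

A low CPU share in a long cycle points at the process waiting on something else (I/O, scheduling) rather than computing.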
Well, for the same transition syskit indeed takes the same time. Nevertheless, I fully agree that we should focus on orogen_loaders and fix it soon instead of trying to debug this old version. We can postpone this issue until after the orogen_loaders merge, in case the merge does not fix it... Best, |
Yeah ... and did you consider that maybe adding the loggers could have some effect on the I/O? Given that Roby only used 21s of CPU, that would definitely be my guess. If you have not turned off debug logging during the merge, that would definitely hit you. It could also make the main Roby thread wait for the logger. Try increasing LOGGED_EVENTS_QUEUE_SIZE by a factor of 10 to 20 and see if it improves the situation. Bottom line: the 2-minute switch is most probably not a syskit problem, and I would even argue that it is not a Roby one. This is a system problem: you're getting low on some resource and starving the roby process one way or the other. |
I'm closing this. This particular issue got a lot better, even though the merge algorithm still has some complexity problems for graphs with a lot of redundant loops. |
In some cases the network generation takes ages.
I uploaded a log file, http://auv.informatik.uni-bremen.de/framework/logs/long-reconfiguration-time/, which shows that generating the new network took 15 seconds.
The CPU load on this single-CPU quad-core i7 system is roughly 1.00; the normal vehicle processes take no more than 40% each. Only ruby jumps to ~100% when the event occurs.
In the attached log file, the relevant iteration is 2084 at 16:09:30, where the pipeline detector detects a pipe and the state machine therefore switches to follow pipe.
This is part of the avalon profile "find_pipe_with_localization" in this file: https://github.com/auv-avalon/bundle/blob/interrim_working/models/actions/core.rb