Add Balance ionice idle priority #66
use ionice to make balance less disruptive.
Unfortunately this won't work as expected. The ionice priority does not apply to the threads that do the IO or generate the CPU load, because that work is asynchronous and done by kernel threads, and their priority can't be set like this. Scrub has the same problem: the throttling would have to be done on the side where the IO is started, which means code changes. |
Sounds logical, but why is the process itself consuming so much CPU if the work is asynchronous? I can also see IO attributed to the process in iotop. That is why I changed it, and for me it has worked since then: my PCs no longer freeze nearly as much. By the way, nice didn't help for me; it had to be ionice. Is there any link between the process and the kernel that may be affected that way? |
I would also think that the niceness of a user process has to be passed to the kernel code doing the actual work since the user code can't access any resources - it always has to pass the work to some kernel code. @kdave Can you add some proof (maybe links to the kernel code which does the rebalance) to your argument? |
I would suggest using cgroups to control io weight rather than ionice. With cgroups the resulting kernel threads can, AFAIK, inherit the io.weight settings. If not, they could be added with the script. |
The relocation component of balance runs in process context, so ionice would have the intended effect there. What it won't touch is anything that happens in transaction context. With qgroups and lots of reflinked extents (or snapshots) that can also have a big impact on the system. |
FWIW, the proposed changes are effectively redundant to the service setting. |
So either systemd has a bug where it doesn't pass those settings to the kernel, or the kernel has a bug where it ignores them for the code btrfsmaintenance calls, or maybe both. |
Re transaction context: I'm not exactly sure what we're talking about here but would it be possible to throttle transactions? Or is the problem that some transactions just become too big (and the kernel hangs in a lock until it's done)? |
I read about cgroup2 via https://code.fb.com/open-source/linux/. Btrfs is the only supported filesystem :-) More docs here: https://facebookmicrosites.github.io/cgroup2/docs/io-controller.html I wonder if they use it internally to mitigate this (#66) issue? If not, maybe it would be possible to contact the author[s] of the cgroup2 patches and ask them to add support? In particular I'm very excited about future uses of the |
Wouldn't an I/O scheduler also be a way to handle it? BFQ uses cgroup2 as well iirc. I haven't adopted BTRFS yet, so I've not personally experienced this issue. I do find it confusing that users are claiming a fix is to do something that the systemd service file is already doing? Can you check what the niceness is prior and confirm that systemd didn't apply it? |
It's a bit hard to get direct confirmation (i.e. check this process status while it is running) for me because when it is running the system is more or less frozen until it has done its task. But I can reproduce the freezing by triggering the script again manually (which runs it without any nice and ionice settings), while if I run it with nice -n 19 ionice -c3 then it runs without freezing. |
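The manual workaround reported above can be wrapped in a small shell helper. This is only a sketch: `run_idle` is a hypothetical name, not part of btrfsmaintenance, and whether the idle IO class reaches the kernel-side work at all is exactly what this thread disputes. As far as I know, the IO class also only takes effect with an IO scheduler that honors it, such as BFQ.

```shell
# Hypothetical helper (not part of btrfsmaintenance): run any command with
# the lowest CPU niceness and the idle IO scheduling class, mirroring the
# `nice -n 19 ionice -c3` invocation reported above.
run_idle() {
    nice -n 19 ionice -c3 "$@"
}

# Usage (the balance arguments are only an illustration):
# run_idle btrfs balance start -dusage=10 /
```

Calling `run_idle nice` prints the niceness the wrapped command inherited, which is a quick way to confirm that the CPU side of the wrapper was applied.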
Try Share what priority it had before, might show that idle was not applied for some reason? |
Not possible, this is a service run on a schedule, and when it is running the system is frozen. I can still get that though. I'll add this line in the script so it can log its own priority for me in a file in my home folder, ps -l $$ > /my/home/folder/btrfsmaintenance.log then trigger it by changing its schedule in systemd. |
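A slightly more detailed version of that logging line might look like the sketch below. `log_priority` and its arguments are hypothetical; it assumes the procps `ps` and the util-linux `ionice` are installed, and records the CPU niceness, CPU scheduling class, and IO scheduling class of the given process.

```shell
# Hypothetical helper: append the CPU niceness (NI), CPU scheduling class
# (CLS) and IO scheduling class of a process to a log file.
# $1 = PID to inspect, $2 = log file path (both are assumptions).
log_priority() {
    {
        date
        ps -o pid,ni,cls,comm -p "$1"   # NI and CLS columns as reported by procps
        ionice -p "$1"                  # prints the IO scheduling class and priority
    } >> "$2" 2>&1
}

# Inside the maintenance script, $$ is the script's own PID:
# log_priority $$ /my/home/folder/btrfsmaintenance.log
```

Appending rather than overwriting keeps the output of earlier runs, so a scheduled run and a manually triggered run can be compared afterwards.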
You can just run the service on demand? Out of curiosity, what is the disk type: HDD (capacity, RPM?) or SSD? (While advice generally says not to defrag SSDs, the BTRFS wiki mentions that fragmentation manifests as higher CPU usage.) Do you know how many snapshots your system has? I've also heard that large numbers of snapshots can contribute to performance issues. While setting idle priority is working for you, I'm just curious whether the balance alone is causing the high usage or whether it's a mix of the above built up over time. Looking forward to the priority logged when you test next :) |
Beware that the service changes scheduling policy (cf. priority), i.e. Also, from the reports it seems that lowering the IO scheduling priority has little effect (for various reasons) but CPU policing affects the "freezing" outcome (although it's not clear what the issue is: some describe it as excessive IO, others as hogging a CPU; those might actually be two issues [1]). So if anyone gets down to testing, I'd suggest trying the following in addition to tinkering with the CPU scheduling policy above.
(Run Nice has only relative effect within one cpu cgroup (so snippet below would only work if no cpu hierarchy is created (can be seen e.g. as CPU granularity in
[1] And the solution for either of them may be different than enclosing the load under some constraints. |
Running balance generates a lot of IO load, rendering most home office PCs useless for several minutes. This enhancement makes balance use the idle IO priority, which lowers the impact considerably.