add functions to get cpu usage from /proc/stat #49
Conversation
Reviewer's Guide by Sourcery

This pull request adds functions to measure and report CPU usage by reading from /proc/stat, enabling the recording and calculation of average CPU load over time. It also introduces dynamic worker coproc spawning based on CPU usage and read queue length, adds the ability to read input from a file descriptor other than stdin, and implements a custom bash loadable builtin for more efficient stdin reading on x86_64 platforms.
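For readers unfamiliar with the underlying technique, here is a minimal standalone sketch of CPU-usage sampling from /proc/stat (illustrative only; the variable names and the 1-second sampling interval are assumptions, not the PR's actual implementation):

    # Sketch: CPU usage over a 1-second interval, from the aggregate "cpu" line
    # of /proc/stat (fields: user nice system idle iowait irq softirq steal ...)
    read -r _ user0 nice0 sys0 idle0 iowait0 irq0 softirq0 steal0 _ < /proc/stat
    sleep 1
    read -r _ user1 nice1 sys1 idle1 iowait1 irq1 softirq1 steal1 _ < /proc/stat
    total0=$(( user0 + nice0 + sys0 + idle0 + iowait0 + irq0 + softirq0 + steal0 ))
    total1=$(( user1 + nice1 + sys1 + idle1 + iowait1 + irq1 + softirq1 + steal1 ))
    idleAll0=$(( idle0 + iowait0 ))
    idleAll1=$(( idle1 + iowait1 ))
    # usage = busy delta / total delta, as an integer percentage
    printf 'CPU usage: %s%%\n' $(( 100 * ( ( total1 - total0 ) - ( idleAll1 - idleAll0 ) ) / ( total1 - total0 ) ))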
Hey @jkool702 - I've reviewed your changes - here's some feedback:
Overall Comments:
- Consider adding error handling for file operations and arithmetic calculations to improve robustness.
- Minimize the use of global variables and consider passing state between functions to improve modularity and testability.
- Be cautious of potential integer overflow in calculations, especially when dealing with cumulative values over time.
Here's what I looked at during the review
- 🟡 General issues: 4 issues found
- 🟢 Security: all looks good
- 🟢 Testing: all looks good
- 🟢 Complexity: all looks good
- 🟢 Documentation: all looks good
forkrun.bash (outdated)

    }

    load_toc() {
suggestion: Add error handling for potential failures when reading from /proc/stat
Consider adding checks to ensure the read operation succeeds and provide appropriate error messages if it fails.
Suggested change:

    load_toc() {
        if ! [[ -r /proc/stat ]]; then
            echo "Error: Unable to read /proc/stat" >&2
            return 1
        fi
forkrun.bash (outdated)

    tLOAD1=$(( cpu_LOAD1 - cpu_LOAD0 ))
    tALL1=$(( cpu_ALL1 - cpu_ALL0 ))

    (( tALL00 > ( 10 * tALL1 ) )) && tALL00=$(( 10 * tALL1 ))
question: Clarify the purpose and logic behind the tALL00 update
The reasoning behind multiplying by 10 and the overall purpose of this operation is not clear. Consider adding a comment explaining the logic or refining the approach if possible.
forkrun.bash (outdated)

    # 0 --> no load    1000000 --> 100% load
    # optional: pass argument to use a different number for 100% load

    local -i loadMaxVal="${1:-1000000}"
suggestion: Clarify the usage of loadMaxVal parameter
The comment describes loadMaxVal as representing 100% load, but it's used as a scaling factor. Consider updating the comment or renaming the variable to better reflect its purpose.
Suggested change:

    local -i loadScaleFactor="${1:-1000000}"
forkrun.bash (outdated)

    (( tALL00 > ( 10 * tALL1 ) )) && tALL00=$(( 10 * tALL1 ))

    pLOAD=$(( ( loadMaxVal * tLOAD1 ) / tALL1 ))
suggestion (bug_risk): Consider adding checks for potential integer overflow
The arithmetic operations could potentially lead to integer overflow. Consider adding checks or using a different approach to handle large values safely.
Suggested change:

    if (( tALL1 != 0 )); then
        # (guarding tLOAD1 > 0 avoids a divide-by-zero in the overflow check)
        if (( tLOAD1 > 0 )) && (( loadMaxVal > ( (2**63-1) / tLOAD1 ) )); then
            pLOAD=$((2**63-1))
        else
            pLOAD=$(( ( loadMaxVal * tLOAD1 ) / tALL1 ))
        fi
    else
        pLOAD=0
    fi
…r of coprocs to spawn
…rom file descriptor other than stdin. existing '-u' flag changed to '-U'.
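For context on the new file-descriptor input feature, this is the general bash mechanism for reading from a descriptor other than stdin (a minimal sketch; the fd number, filename, and loop body are made up for illustration - the PR's actual flag semantics are defined in forkrun's source):

    # open an arbitrary file on fd 5, then read line-by-line from that fd
    exec 5<"./input.txt"
    while IFS= read -r -u 5 line; do
        printf 'got: %s\n' "$line"
    done
    exec 5<&-    # close fd 5 when done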
@sourcery-ai review
Hey @jkool702 - I've reviewed your changes and they look great!
Here's what I looked at during the review
- 🟡 General issues: 4 issues found
- 🟢 Security: all looks good
- 🟢 Testing: all looks good
- 🟡 Complexity: 1 issue found
- 🟡 Documentation: 5 issues found
    : "${pLOAD_max:=9500}" "${nProcsMax:=$((2*${nCPU}))}" "${nQueueLastCountGoal:=5}"

    mapfile -t pLOADA < <(_forkrun_get_load -i)
suggestion (performance): Evaluate performance impact of _forkrun_get_load function
The new _forkrun_get_load function adds complexity to the worker spawning logic. It's important to ensure this doesn't introduce any significant performance overhead. Consider profiling the script with and without this change to quantify its impact.
Suggested change:

    local pLOADA
    if command -v _forkrun_get_load_cached >/dev/null 2>&1; then
        pLOADA=($(_forkrun_get_load_cached -i))
    else
        mapfile -t pLOADA < <(_forkrun_get_load -i)
    fi
    export -fp _forkrun_get_load &>/dev/null && export -nf _forkrun_get_load

    _forkrun_get_load() (
suggestion (performance): Consider optimizing _forkrun_get_load function for efficiency
The _forkrun_get_load function is called frequently to determine when to spawn new workers. Given its importance, it's crucial to ensure it's as efficient as possible. Consider reviewing the function for any potential optimizations, particularly in how it reads and processes data from /proc/stat.
    _forkrun_get_load() {
        local -a cpu_stats
        local -i total field
        read -ra cpu_stats < /proc/stat
        # sum all time fields (skipping the leading "cpu" label) to get total jiffies
        total=0
        for field in "${cpu_stats[@]:1}"; do
            (( total += field ))
        done
        awk -v idle="${cpu_stats[4]}" -v total="${total}" \
            'BEGIN {printf "%.2f", (1 - idle / total) * 100}'
    }
The efficiency of _forkrun_get_load is less critical than you might think. It doesn't even come into play until some initial worker coprocs are forked, and (since this is being called from a coproc itself) the only thing that is "slowed down" by waiting for it is how rapidly additional coprocs are forked. Additionally, dynamically determining the coproc count is more about reducing CPU usage than reducing runtime, so slowing down how rapidly new worker coprocs are forked is actually good for efficiency (it reduces CPU usage) and helps ensure that not too many are forked (since, once forked, they run until the end).
Thank you for the clarification, @jkool702. It's great to hear that the dynamic coprocess count adjustment is more focused on optimizing CPU usage rather than just speed. Your explanation makes sense, especially since the function is called from a coprocess and the goal is to manage resource usage efficiently. If the current implementation meets the performance and efficiency goals of the project, then it sounds like it's well-suited for your needs. However, if you ever notice any performance bottlenecks in the future, revisiting this function for potential optimizations could be beneficial. Let me know if there's anything else you'd like to discuss or if you need further assistance!
    @@ -72,7 +92,7 @@ C1[5]=' >/dev/null'
    mkdir -p "${hfdir0}"/file_lists

    nArgs=(1024 4096 16384 65536 262144 1048576)

    cksumAlgsA=(sha1sum sha256sum sha512sum sha224sum sha384sum md5sum "sum -s" "sum -r" cksum b2sum "cksum -a sm3" xxhsum "xxhsum -H3")
suggestion: Consider parameterizing the list of checksum algorithms
To make the benchmark script more flexible, consider allowing the list of checksum algorithms to be passed as a parameter or read from a configuration file. This would make it easier to add or remove algorithms without modifying the script.
    # Read checksum algorithms from a config file (skip the read if the file is missing)
    cksumAlgsFile="checksum_algorithms.txt"
    [[ -r "$cksumAlgsFile" ]] && mapfile -t cksumAlgsA < "$cksumAlgsFile"
    # Fall back to default algorithms if the file is empty or doesn't exist
    if [ ${#cksumAlgsA[@]} -eq 0 ]; then
        cksumAlgsA=(sha1sum sha256sum sha512sum sha224sum sha384sum md5sum "sum -s" "sum -r" cksum b2sum "cksum -a sm3" xxhsum "xxhsum -H3")
    fi
    @@ -3,18 +3,23 @@
    `forkrun` is an *extremely* fast pure-bash function that leverages bash coprocs to efficiently run several commands simultaniously in parallel (i.e., it's a "loop parallelizer").

    `forkrun` is used in much the same way that `xargs` or `parallel` are, but is faster (see the `hyperfine_benchmark` subdirectory for benchmarks) while still being full-featured and only requires having a fairly recent `bash` version (4.0+) to run<sup>1</sup>. `forkrun`:
    * offers more features than `xargsd` and is mildly faster than it's fastest invocation (`forkrun` without any flags is functionally equivilant to `xargs -P $*(nproc) -d $'\n'`)
suggestion (documentation): Fix typo: 'equivilant' should be 'equivalent'
Suggested change:

    - * offers more features than `xargsd` and is mildly faster than it's fastest invocation (`forkrun` without any flags is functionally equivilant to `xargs -P $*(nproc) -d $'\n'`)
    + * offers more features than `xargsd` and is mildly faster than it's fastest invocation (`forkrun` without any flags is functionally equivalent to `xargs -P $*(nproc) -d $'\n'`),
: >"${tmpDir}"/.spawned | ||
|
||
(( ${verboseLevel} > 1 )) && printf '\n\n%s WORKER COPROCS FORKED\n\n' "${nProcs}" >&${fd_stderr} | ||
|
||
# setup dynamically coproc to spawn new workers based on read queue length |
issue (complexity): Consider simplifying the dynamic worker spawning implementation to reduce code complexity and improve maintainability.
While the introduction of dynamic worker spawning could potentially improve resource utilization, the current implementation adds significant complexity to the code. Consider the following suggestions to simplify and improve maintainability:
- Extract the load calculation logic into a separate, well-documented function or file. This will keep the main forkrun function cleaner and easier to understand.
    # In a separate file, e.g., load_calculator.sh
    calculate_system_load() {
        # Implementation of _forkrun_get_load goes here
    }

    # In the main forkrun file
    source load_calculator.sh

    # Use the function in the pQueue coproc
    calculate_system_load "${pLOADA[@]}"
- Consider using a simpler heuristic for worker spawning. The current implementation might be overly complex for most use cases. For example:
    spawn_new_worker() {
        local current_workers=$1
        local queue_depth=$2
        local max_workers=$3
        if (( current_workers < max_workers && queue_depth > 1 )); then
            return 0  # Spawn new worker
        else
            return 1  # Don't spawn
        fi
    }

    # Use in pQueue coproc
    if spawn_new_worker "${kkProcs}" "${nQueue}" "${nProcsMax}"; then
        source /proc/self/fd/0 <<<"${coprocSrcCode//'{<#>}'/"${kkProcs}"}"
        ((kkProcs++))
    fi
- Simplify the pQueue coproc by breaking it into smaller, more focused functions:
    handle_queue_input() {
        # Handle input from fd_nQueue
    }

    manage_worker_count() {
        # Logic for spawning/removing workers
    }

    # In pQueue coproc
    while true; do
        handle_queue_input
        manage_worker_count
    done
These changes would maintain the new dynamic scaling functionality while significantly reducing the complexity and improving the readability of the code.
1st suggestion: I'm trying to keep forkrun "self-contained" in a single file as much as possible. Also, sourcing the function that computes load when forkrun is sourced means it is already available when forkrun runs, helping to (slightly) reduce runtime.
2nd suggestion: using the simpler spawning protocol doesn't work well in practice. The issue is that once a coproc is spawned it is active until the end...there is no "un-spawning" or removing it. So we need to be more selective about when to spawn them; otherwise, every time forkrun is run you will just end up spawning "$max_workers" worker coprocs...which sort of defeats the point of dynamically adjusting how many workers to spawn, since "always spawning $max_workers" isn't dynamic.
3rd suggestion: for reasons related to ensuring there aren't any processes left running when the end condition is met and forkrun exits, it is (in my opinion) somewhat preferable to spawn new coprocs directly from the forkrun function (or one of its coproc children) and not from another function.
Thanks for sharing your thoughts, @jkool702. I understand your perspective on keeping forkrun self-contained and the challenges with dynamically managing coprocesses. Here are some additional thoughts:

- Self-Contained File: It's perfectly reasonable to want to keep everything in a single file for ease of distribution and usage. If you decide to keep the load calculation within the same file, consider using a dedicated section or comments to clearly separate and document this logic. This can help maintain readability without splitting the code into multiple files.

- Dynamic Worker Spawning: Given the constraints of not being able to "un-spawn" workers, your approach to being selective about when to spawn them makes sense. It might be helpful to document this behavior clearly in the code comments or documentation, so future maintainers understand the rationale behind the complexity.

- Direct Coproc Management: Your reasoning for managing coprocesses directly within forkrun to ensure clean exits is valid. If breaking it into smaller functions isn't feasible, consider using inline comments to explain the flow and any critical logic. This can help others (or even yourself in the future) quickly grasp the purpose and flow of the code.

Overall, it sounds like you've thought through these challenges carefully. If you have any specific parts of the code that you're still unsure about or need further feedback on, feel free to share!
In a future update I might even add a stochastic element to spawning the coprocs. E.g., when all the existing conditions are met, get a random number, and if it is above some cutoff, spawn the coproc. The cutoff would change such that initially you are almost 100% guaranteed to get a random number above the cutoff, but as you get closer and closer to having "$max_workers" coprocs, the chance of spawning another decreases more and more.
It's surprisingly tricky to get the number of additional workers dynamically spawned to not always end up at either 0 or at $max_workers...it is really hard to find criteria that stabilize this process so you end up somewhere in the middle (after potentially tens of thousands of iterations, if not more) in a way that makes sense for dynamically adjusting the number of coprocs...
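A minimal sketch of that stochastic gate, assuming a linear cutoff schedule (the function name, the schedule, and the spawn_coproc placeholder are hypothetical illustrations, not part of this PR):

    # spawn-probability gate: near-certain when few workers exist, increasingly
    # unlikely as the count approaches the max (uses bash's $RANDOM, range 0-32767)
    _maybe_spawn_worker() {
        local -i nCur="$1" nMax="$2" cutoff
        # cutoff rises linearly from 0 to 32768 as nCur approaches nMax,
        # so the spawn probability falls linearly from ~100% to 0%
        cutoff=$(( ( 32768 * nCur ) / nMax ))
        (( RANDOM >= cutoff ))
    }

    # usage (after all existing spawn conditions are met):
    #   _maybe_spawn_worker "$kkProcs" "$nProcsMax" && spawn_coproc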
That sounds like an interesting approach, @jkool702! Introducing a stochastic element could indeed help in achieving a more balanced distribution of coprocesses, avoiding the extremes of either 0 or max workers. It could add a layer of flexibility and adaptability to the system, especially in environments with fluctuating loads.
When implementing this, you might want to consider:

- Tuning the Probability Function: Carefully design the probability function to ensure it adapts well to different load conditions. You might want to experiment with different functions or thresholds to find the right balance.

- Logging and Monitoring: Implement logging to track how often new workers are spawned and under what conditions. This can provide valuable insights into how well the stochastic approach is working and help in fine-tuning the parameters.

- Testing: Conduct thorough testing under various load scenarios to ensure that the stochastic element behaves as expected and doesn't introduce any unintended side effects.

- Documentation: Clearly document this new behavior, including the rationale and how it affects the spawning logic. This will be helpful for anyone else working with the code in the future.

If you decide to pursue this, it could be a great enhancement to the dynamic worker management system. Feel free to reach out if you need any further feedback or assistance as you develop this feature!
Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
@sourcery-ai can you combine the various "typo correction" edits you just suggested into a single commit?
I'm sorry, I don't understand the command. Please use …
Summary by Sourcery
Add functions to compute CPU usage from /proc/stat and dynamically adjust coprocess count based on system load. Introduce support for reading input from a specified file descriptor. Enhance performance by using lseek for efficient data reading on x86_64 platforms. Update documentation to reflect these changes.