Process only needed steps for large speedup #34

Open

xyzzy42 wants to merge 3 commits into master from trentpi/faster-processing
Conversation

@xyzzy42 (Contributor) commented Apr 2, 2021

I measured this as reducing CPU usage from ~78% to ~45%, almost doubling the speed, which is expected as described below.

Tg processes the audio in a series of steps that double in size: 2, 4, 8, and then 16 seconds long. These are the one to four dots shown in the display.

Tg starts at step 1 (2 seconds) and goes up, stopping if a step doesn't pass. Data from smaller steps isn't used if a larger step passes.

This means when running at a constant step 4 with good signal, CPU usage is almost double what it needs to be, since steps 1-3 are processed but unused, and they add up to 14 seconds, almost as much as step 4's 16 seconds.

Change this to start at the previous iteration's step. With good signal, only step 4 will be processed and CPU usage is cut almost in half. If the previous step fails, it will try smaller steps. If it passes and is not yet the top step, it will try larger steps.

The end result should be the same step as before, just reached sooner. It could be slower if the step drops a lot, e.g. from 4 to 0, but this happens far less often than the step staying the same or nearly the same.
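A minimal sketch of that step-selection policy in C. The names (`select_step`, `try_step`, `NSTEPS`) are illustrative placeholders, not tg's actual identifiers; `try_step()` stands in for processing and checking one averaging interval.

```c
#define NSTEPS 4  /* steps 1..4 = 2, 4, 8, 16 second buffers */

/* Returns the highest step that passed this iteration, or 0 if none did. */
int select_step(int prev_step, int (*try_step)(int step))
{
	int step = prev_step > 0 ? prev_step : 1;

	if (try_step(step)) {
		/* Passed: keep trying larger steps while they also pass. */
		while (step < NSTEPS && try_step(step + 1))
			step++;
		return step;
	}
	/* Failed: fall back through smaller steps until one passes. */
	while (--step >= 1)
		if (try_step(step))
			return step;
	return 0;  /* nothing passed: no usable signal this iteration */
}
```

With a steady signal at step 4, only one step gets processed per iteration instead of four, which is where the roughly 2x saving comes from.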

I also changed the dot display to consistently indicate the averaging interval and amplitude measurement; it did both before, but in an inconsistent way.

See commit messages for more details.

xyzzy42 added 2 commits April 1, 2021 17:44
I measured this as reducing CPU usage from ~78% to ~45%.  Almost
doubling the speed, which is expected as described below.

Tg processes the audio in a series of steps that double in size:  2, 4,
8, and then 16 seconds long.  This is the one to four dots shown in the
display.

Tg starts at step 1 (2 seconds) and goes up, stopping if a step doesn't
pass.  Data from smaller steps isn't used if a larger step passes.

This means when running at a constant step 4 with good signal, CPU usage
is almost double what it needs to be, since steps 1-3 are processed and
unused and add up to 14 seconds, almost as much as step 4's 16 seconds.

Change this to start at the previous iteration's step.  With good
signal, only step 4 will be processed and CPU usage is cut almost in
half.  If the previous step fails, it will try smaller steps.  If it
passes and is not yet the top step, it will try larger steps.

The end result should be the same step as before, just reached sooner.
It could be slower if the step drops a lot, e.g. from 4 to 0, but this
happens far less often than the step staying the same or nearly the
same.

To do this, I moved the step logic out of analyze_pa_data() and
entirely into compute_update().  analyze_pa_data() now just processes
one step and compute_update() decides which step(s) to run.
Previously, analyze_pa_data() did all steps and compute_update()
decided which step to actually use.

There is a small change to the algorithm.  To be good, a step needs to
pass a number of checks done in process().  Then there is one more
check, of sigma vs period, done in compute_update().

Previously a step didn't need to pass the sigma check; it still
counted enough to increase last_tic and show up as a signal dot in the
display.  But the data wasn't actually used unless it passed the sigma
check too.

I no longer keep track of "partially" passing steps like this.  Either
it passes all checks, including the sigma check, or not.  last_tic and
signal level only count fully passing steps.

I see no practical difference in my tests, but I think it could show up
with some kind of marginal signal that has a high error in period
estimation with the longer steps.
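Roughly, the pass/fail decision after this change reduces to something like the sketch below. The struct, field names, and the sigma threshold are placeholders rather than tg's real code; the point is only that a step either passes every check, including the sigma-vs-period check, or it does not count at all.

```c
/* Illustrative only; field names and the threshold are not tg's. */
struct step_result {
	int    ok;       /* all checks in process() passed for this step */
	double period;   /* estimated beat period */
	double sigma;    /* sample standard deviation of the period */
};

/* A step counts only if it passes every check, including sigma vs period;
 * there is no longer a "partially passed" state that bumps last_tic. */
static int step_fully_passes(const struct step_result *r, double max_rel_sigma)
{
	return r->ok && r->sigma < r->period * max_rel_sigma;
}
```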
Previously, max signal level (4 dots) with no amplitude measured was
counted as signal level 3, but level 3 or lower with no amplitude still
counted as the same level.

Have compute_update() no longer do this signal level adjustment and just
report the level used, which indicates the averaging interval.

The dot graphic will now indicate "no amplitude" by using a hollow dot
for the final signal level's dot.

This way the number of dots always shows the averaging interval and a
hollow dot always shows that the signal is too poor to measure
amplitude.
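As an illustration only (this is not tg's drawing code), the rule reduces to something like this, with '*' as a filled dot and 'o' as a hollow one:

```c
#include <stdio.h>

/* Number of dots = step (averaging interval); the last dot is hollow
 * when the signal was too poor to measure amplitude. */
static void print_signal_dots(int step, int have_amplitude)
{
	for (int i = 1; i <= step; i++)
		putchar(i == step && !have_amplitude ? 'o' : '*');
	putchar('\n');
}

int main(void)
{
	print_signal_dots(4, 1);  /* "****": step 4, amplitude measured */
	print_signal_dots(4, 0);  /* "***o": step 4, no amplitude       */
	print_signal_dots(2, 0);  /* "*o":   step 2, no amplitude       */
	return 0;
}
```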
@xyzzy42 force-pushed the trentpi/faster-processing branch from e199e2d to 87e0622 on April 2, 2021 00:44
When there is just one period in the processing buffer it is not
possible to calculate the sample standard deviation (sigma).  In this
case, the period value was being used as the sigma estimate, which
effectively gives a huge sigma when there is 1 period in the buffer.

This results in the processing buffer being rejected as bad, so a
larger buffer, which would have multiple periods, is never tried.

One period per buffer would happen with a combination of long period and
short buffer, e.g. a small BPH in light mode, since light mode uses
buffers half as long as normal mode.

Use 0 as the sigma value for 1 sample.  This means the buffer will pass
the sigma check and a larger buffer will be tried.
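A sketch of the idea, assuming a plain array of measured periods (tg's real data structures differ): with fewer than two samples the sample standard deviation is undefined, so report 0 rather than falling back to the period itself.

```c
#include <math.h>

static double period_sigma(const double *periods, int n)
{
	if (n < 2)
		return 0.0;  /* one period in the buffer: no spread to estimate */

	double mean = 0.0;
	for (int i = 0; i < n; i++)
		mean += periods[i];
	mean /= n;

	double ss = 0.0;
	for (int i = 0; i < n; i++) {
		double d = periods[i] - mean;
		ss += d * d;
	}
	return sqrt(ss / (n - 1));  /* sample standard deviation */
}
```

Returning 0 lets the single-period buffer pass the sigma check, so the next, larger buffer actually gets tried.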