Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC, WIP: public function to find and return bad channels #214

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

scott-huberty
Copy link
Member

@scott-huberty scott-huberty commented Dec 24, 2024

RFC (Request For comments)

I am opening this as a PR instead of an issue just so that there is a workable example of what I am proposing, to facilitate discussion.

Background

When using the flag_channels_by_fixed_threshold method in the pipeline class, users are encouraged to specify a threshold that is appropriate for their data. However, finding the right threshold often involves a process of trial and error, and I am increasingly finding this to be tedious with the current API.

To streamline this process, I propose adding a standalone function that takes an epochs object and returns a list of "bad" channels based on a given threshold. This would allow users to iterate more quickly in determining an optimal threshold. Then, the method flag_channels_fixed_threshold just wraps this function, so that we aren't duplicating code.

Example

I.e. instead of doing this:

import mne
import pylossless as ll

fname = mne.datasets.sample.data_path() + '/MEG/sample/sample_audvis-ave.fif'
raw = mne.io.read_raw_fif(fname)

config = ll.Config().load_default()
config["flag_channels_fixed_threshold"] = {"threshold": 5e-5}
pipeline = ll.Pipeline(config)
pipeline.raw = raw
pipeline.flag_channels_by_fixed_threshold(epochs)
pipeline.flags["ch"].items()

You would like to be able to do

import pylossless as ll
import mne

fname = mne.datasets.sample.data_path() + '/MEG/sample/sample_audvis-ave.fif'
raw = mne.io.read_raw_fif(fname)
epochs = mne.make_fixed_length_epochs(raw)
for threshold in [1e-5, 5e-5, 1e-4]:
    print(ll.find_bad_channels_by_threshold(epochs, threshold))

If we find this design to be helpful, it could eventually implement it for the other pipeline steps. But for now, flag_channels_fixed_threshold isn't even a part of the default pipeline (we only implemented it in order to replicate the MATLAB pipeline for the paper), and so I think that it makes for a good guinea pig for this design (since few if any users probably use it).

@christian-oreilly WDYT?

Copy link

codecov bot commented Dec 24, 2024

Codecov Report

Attention: Patch coverage is 14.28571% with 6 lines in your changes missing coverage. Please review.

Project coverage is 83.19%. Comparing base (c985968) to head (35db8bd).

Files with missing lines Patch % Lines
pylossless/pipeline.py 14.28% 6 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #214      +/-   ##
==========================================
- Coverage   83.43%   83.19%   -0.24%     
==========================================
  Files          26       26              
  Lines        1696     1702       +6     
==========================================
+ Hits         1415     1416       +1     
- Misses        281      286       +5     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@christian-oreilly
Copy link
Collaborator

Please note that it is not flag_channels_by_fixed_threshold but flag_channels_fixed_threshold in the code.

Seems reasonable to me.

As is often necessary with processing steps using thresholds, we will probably want to implement at some point a function that estimate the optimal threshold based on the properties of the data, rather than using a fixed and arbitrary value. That is more or less what you are saying you are doing with a trial-and-error process for now.

@scott-huberty
Copy link
Member Author

As is often necessary with processing steps using thresholds, we will probably want to implement at some point a function that estimate the optimal threshold based on the properties of the data, rather than using a fixed and arbitrary value.

That would be awesome.

OK, I'll clean this up and get the CI's green, and then request a review from you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants