-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ignore case in original extension #183
Conversation
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
PR-Agent was enabled for this repository. To continue using it, please link your git user with your CodiumAI identity here. PR Reviewer Guide 🔍Here are some key observations to aid the review process:
|
PR-Agent was enabled for this repository. To continue using it, please link your git user with your CodiumAI identity here. PR Code Suggestions ✨Explore these optional code suggestions:
|
@@ -59,7 +60,7 @@ def _removesuffix(x: str, suffix: str, /) -> str: | |||
|
|||
raw_bkp = raw | |||
for current_extension, target_extension in extension_convert_dict.items(): | |||
if raw.endswith(current_extension): | |||
if raw.lower().endswith(current_extension.lower()): | |||
raw = _removesuffix(raw, current_extension) | |||
raw += target_extension | |||
if not any(raw.endswith(x) for x in possible_extension): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you will need to change this,too in case the user uses RAW and does no conversion of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here I assume that the final openms experimental design file has the same suffix. Is that right? For example
Example 1: raw:mzML
:
1.raw -> 1.mzML
2.Raw -> 2.mzML
3.RAW -> 3.mzML
Example 2: raw:RAW
:
1.raw -> 1.RAW
2.Raw -> 2.RAW
3. RAW -> 3.RAW
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And also we assumed only suppoted four format in the ending file: "raw", "mzML", "mzml", "d"
sdrf-pipelines/sdrf_pipelines/openms/openms.py
Lines 66 to 67 in d87d22c
if not any(raw.endswith(x) for x in possible_extension): | |
raise RuntimeError( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I am not mistaken your example raw:RAW will fail.
If you only allow ".raw" as possible extension you have to add RAW:raw or raw:raw as a default conversion to the pipeline which is a bit weird.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not allow the possible extensions case-insensitive then? (i.e. "raw", "mzml", "d" in any possible casing)
Would solve a lot of problems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It depends a bit on if downstream tools allow different cases. (e.g. does DIANN allow .D?)
Is this conversion to openms experimental design even used in the DIA branch of the pipeline?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good ideas. I need to test it. Wait a while. Also is ProteomicsLFQ allowed different cases? For example input .mzML
file, but in the experimental design file it's mzml
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ProteomicLFQ expects exact matches of the full filename of the files that you pass as -in and what is written in the experimental design (including extension, excluding paths).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DIA-NN don't support .D
format. So I suggest the possible extensions is case-sensitive. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this conversion to openms experimental design even used in the DIA branch of the pipeline?
Yes. It also used in DIA branch https://github.com/bigbio/quantms/blob/b7c9b6a416723c4210effd3889dadce3d9032938/workflows/quantms.nf#L55
@ypriverol I think this was too quick |
Another related question: Do we need to support this format ? bigbio/quantms#430 |
@daichengxin I have asked how the |
PR Type
enhancement, documentation
Description
_removesuffix
function to handle suffixes in a case-insensitive manner.parse_sdrf.py
to reflect the case-insensitive nature of extension conversion.Changes walkthrough 📝
__init__.py
Bump version to 0.0.32
sdrf_pipelines/init.py
openms.py
Implement case-insensitive extension handling
sdrf_pipelines/openms/openms.py
_removesuffix
function to ignore case when checking suffix.parse_sdrf.py
Update documentation for case-insensitive extension conversion
sdrf_pipelines/parse_sdrf.py