Merge branch 'main' into ci_test_validate
mwalzer authored Jul 22, 2024
2 parents 119eeef + f0b232d commit 81ff832
Showing 15 changed files with 224 additions and 28 deletions.
18 changes: 8 additions & 10 deletions .github/workflows/linkcheck.yml
@@ -7,15 +7,13 @@ jobs:
runs-on: ubuntu-latest
name: Check for broken links
steps:
- name: Check for broken links
id: link-report
uses: celinekurpershoek/[email protected]
- name: Setup Python
uses: actions/setup-python@v5
with:
# Required:
url: 'https://hupo-psi.github.io/mzQC/'
# optional:
honorRobotExclusions: false
ignorePatterns: 'github,google'
recursiveLinks: false # Check all URLs on all reachable pages (could take a while)
python-version: '3.10'
- run: pip3 install linkchecker
- name: Link Checker
id: link-report
run: linkchecker -ofailures --check-extern --ignore-url="\s*\.md" https://hupo-psi.github.io/mzQC/
- name: Get the result
run: echo "${{steps.link-report.outputs.result}}"
run: echo linkchecker/failures
13 changes: 7 additions & 6 deletions docs/pages/examples.md
@@ -5,9 +5,10 @@ permalink: /examples/
---

Here are a number of worked examples that, each for its own use case, go step by step through the different parts of an mzQC file.
* [Individual run QC](individual-runs/)
* [QC sample mzQC](QC2-sample-example/)
* [in mzML](mzml-mzqc-example/)
* [Using USI with mzQC](USI-example/)
* [Sets of runs](set-of-runs/)
* [Batch correction](metabo-batches/)

- [Single mass spectrometry run](intro_run/)
- [Sets of runs](set-of-runs/)
- [QC sample mzQC](QC2-sample-example/)
- [in mzML](mzml-mzqc-example/)
- [Using USI with mzQC](USI-example/)
- [Batch correction](metabo-batches/)
44 changes: 44 additions & 0 deletions docs/pages/resource-guide.md
@@ -21,6 +21,50 @@ Metrics generating software with mzQC support:
- [OpenMS](https://github.com/OpenMS/OpenMS): Open-source software C++ library for LC/MS data management and analyses.
- [QCCalculator](https://github.com/bigbio/qccalculator): Python tool for base QC metric calculation from mzML, mzIdentML, and MaxQuant input files.
- [Yamato / SwaMe / Prognosticator](https://github.com/PaulBrack/Yamato): SWATH-MS QC metrics generation tools.
- [MsQuality](https://bioconductor.org/packages/release/bioc/html/MsQuality.html): An R Bioconductor package that calculates quality metrics for mass spectrometry-derived spectral data at the per-sample level. MsQuality relies on the mzQC framework of quality metrics and supports the calculation of the following metrics:
- chromatography duration (MS:4000053)
- TIC quarters RT fraction (MS:4000054)
- MS1 quarter RT fraction (MS:4000055)
- MS2 quarter RT fraction (MS:4000056)
- MS1 TIC-change quartile ratios (MS:4000057)
- MS1 TIC quartile ratios (MS:4000058)
- number of MS1 spectra (MS:4000059)
- number of MS2 spectra (MS:4000060)
- m/z acquisition range (MS:4000069)
- retention time acquisition range (MS:4000070)
- MS1 signal jump (10x) count (MS:4000097)
- MS1 signal fall (10x) count (MS:4000098)
- number of empty MS1 scans (MS:4000099)
- number of empty MS2 scans (MS:4000100)
- number of empty MS3 scans (MS:4000101)
- MS2 precursor intensity distribution Q1, Q2, Q3 (MS:4000116)
- MS2 precursor intensity distribution mean (MS:4000117)
- MS2 precursor intensity distribution sigma (MS:4000118)
- MS2 precursor median m/z of identified quantification data points (MS:4000152)
- interquartile RT period for identified quantification data points (MS:4000153)
- rate of the interquartile RT period for identified quantification data points (MS:4000154)
- area under TIC (MS:4000155)
- area under TIC RT quantiles (MS:4000156)
- extent of identified MS2 precursor intensity (MS:4000157)
- median of TIC values in the RT range in which the middle half of quantification data points are identified (MS:4000158)
- median of TIC values in the shortest RT range in which half of the quantification data points are identified (MS:4000159)
- MS2 precursor intensity range (MS:4000160)
- identified MS2 precursor intensity distribution Q1, Q2, Q3 (MS:4000161)
- unidentified MS2 precursor intensity distribution Q1, Q2, Q3 (MS:4000162)
- identified MS2 precursor intensity distribution mean (MS:4000163)
- unidentified MS2 precursor intensity distribution mean (MS:4000164)
- identified MS2 precursor intensity distribution sigma (MS:4000165)
- unidentified MS2 precursor intensity distribution sigma (MS:4000166)
- ratio of 1+ over 2+ of all MS2 known precursor charges (MS:4000167)
- ratio of 1+ over 2+ of identified MS2 known precursor charges (MS:4000168)
- ratio of 3+ over 2+ of all MS2 known precursor charges (MS:4000169)
- ratio of 3+ over 2+ of identified MS2 known precursor charges (MS:4000170)
- ratio of 4+ over 2+ of all MS2 known precursor charges (MS:4000171)
- ratio of 4+ over 2+ of identified MS2 known precursor charges (MS:4000172)
- mean MS2 precursor charge in all spectra (MS:4000173)
- mean MS2 precursor charge in identified spectra (MS:4000174)
- median MS2 precursor charge in all spectra (MS:4000175)
- median MS2 precursor charge in identified spectra (MS:4000176)
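
Each entry in the list above pairs a metric name with its PSI-MS CV accession. In an mzQC file, such a pair appears as a `qualityMetric` object with `accession`, `name`, and `value` fields, plus an optional `unit`. The following is a minimal sketch of that mapping, not the MsQuality or pymzqc API: the helper `make_quality_metric` is hypothetical and the value `12345` is a placeholder, not a real measurement.

```python
import json


def make_quality_metric(accession, name, value, unit=None):
    """Build a dict shaped like an mzQC qualityMetric (hypothetical helper)."""
    metric = {"accession": accession, "name": name, "value": value}
    # The unit is optional in mzQC; only include it when one is given.
    if unit is not None:
        metric["unit"] = unit
    return metric


# Placeholder value for illustration only.
metric = make_quality_metric("MS:4000059", "number of MS1 spectra", 12345)
print(json.dumps(metric, indent=2))
```

Keeping the accession and the name together lets a reader of the file look up the exact metric definition in the controlled vocabulary.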

### A Hub for QC
If you are looking for a home to your QC software or library, check out [MS-Quality-Hub](https://github.com/MS-Quality-Hub). All the above libraries have their development home there and some other very useful repositories.
6 changes: 3 additions & 3 deletions docs/pages/tutorials.md
@@ -7,13 +7,13 @@ permalink: /tutorials/
We have a couple of tutorials and guides to offer; this page will lead you to them:

### with python
* read mzQC: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/MS-Quality-hub/pymzqc/blob/v1.0.0rc1/jupyter/colab/read_in_5_minutes.ipynb)
* write mzQC: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/MS-Quality-hub/pymzqc/blob/v1.0.0rc1/jupyter/colab/write_in_5_minutes.ipynb)
* read mzQC: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/MS-Quality-hub/pymzqc/blob/v1.0.0rc2/jupyter/colab/read_in_5_minutes.ipynb)
* write mzQC: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/MS-Quality-hub/pymzqc/blob/v1.0.0rc2/jupyter/colab/write_in_5_minutes.ipynb)
* demo video
<iframe width="560" height="315" src="https://www.youtube.com/embed/vZXJuPl2yGw" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

### with R
* [Rlib for mzQC(TBA)](TBA)
* [rmzQC](https://cran.r-project.org/web/packages/rmzqc/index.html)
* PTXQC demo
<iframe width="560" height="315" src="https://www.youtube.com/embed/sb-mydbNRS4" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

4 changes: 2 additions & 2 deletions docs/pages/use-case-stories/mzQC_for_analytical_chemists.md
@@ -17,7 +17,7 @@ While many of the papers describing QC metrics for mass spectrometry have been b
4. What fraction of MS/MS scans came from +2 precursor ions?
_63.67%_

Naturally, it’s much easier if you use software to produce these values. In this case, I am using values reported by the QuaMeter software[1] in “IDFree” mode.
Naturally, it’s much easier if you use software to produce these values. In this case, I am using values reported by the QuaMeter software [1](#bumbershoot) in “IDFree” mode.


## Format metrics in JSON
@@ -124,5 +124,5 @@ Choosing JSON rather than XML is intended to reduce the effort required to write


---
[1]: Bumbershoot from proteowizard.sourceforge.net. X. Wang et al. Anal. Chem. (2014) 86: 2497
<a id="bumbershoot">[1]</a>: Bumbershoot from proteowizard.sourceforge.net. X. Wang et al. Anal. Chem. (2014) 86: 2497
Original draft: https://docs.google.com/document/d/16b1n_LXYWsxLK2PQfXWj_WvtiwJOXu2EZx1Q8WzZNlc/edit#
4 changes: 2 additions & 2 deletions docs/pages/use-cases.md
@@ -6,13 +6,13 @@ permalink: /use-cases/
This page should give you a (non-exclusive) overview of the use cases covered by mzQC:

## Handover Format
With **mzQC**, it is easy to get relevant QC information and to put your data into context (of measurement realities). That makes it a preferred medium to hand over quality information. Read more about it in [mzQC at a glance](at-a-glance/) and explore [a small mzQC example](../examples/individual-runs/).
With **mzQC**, it is easy to get relevant QC information and to put your data into context (of measurement realities). That makes it a preferred medium to hand over quality information. Read more about it in [mzQC at a glance](at-a-glance/) and explore [a small mzQC example](../examples/intro_run/).

## Quality Reports
With JSON at its core, mzQC follows a '_works online, works everywhere_' approach. Even for single spectra, as we show with the [universal spectrum identifier example](../examples/USI-example/).

## Archival
The format is an optimal QC tool for analytical chemists and instrument operators keeping track of (and archiving) instrument performance. Read on with an [introduction to mzQC for analytical chemists](analytical-chemists/) or explore our [QC sample example](../examples/qc-sample-run/). You can even embed mzQC in mzML, should you choose to. [View an example here](../examples/mzML-mzQC/).
The format is an optimal QC tool for analytical chemists and instrument operators keeping track of (and archiving) instrument performance. Read on with an [introduction to mzQC for analytical chemists](analytical-chemists/) or explore our [QC sample example](../examples/QC2-sample-example/). You can even embed mzQC in mzML, should you choose to. [View an example here](../examples/mzml-mzqc-example/).

## Common currency
With mzQC for archival, quality reports, and as handover format, [mzQC can serve as a common currency](mzQC-common-currency/) for data repositories, journals, and collaborators.
2 changes: 1 addition & 1 deletion docs/pages/worked-examples/QC2-sample-example.mzQC.md
@@ -53,4 +53,4 @@ Since each column is in turn defined by a cv term, the column can also be assign


### This is the mzQC file once again, in full:
**[QC2-sample-example.mzQC](https://github.com/HUPO-PSI/mzQC/tree/main/specification_documents/draft_v1/examples/QC2-sample-example.mzQC)**
**[QC2-sample-example.mzQC](https://github.com/HUPO-PSI/mzQC/tree/main/specification_documents/examples/QC2-sample-example.mzQC)**
2 changes: 1 addition & 1 deletion docs/pages/worked-examples/USI-example.mzQC.md
@@ -53,4 +53,4 @@ Each row represents one spectrum that can be directly looked up, in the case of
[even directly from the web](https://www.proteomicsdb.org/use/?usi=mzspec:PXD000966:CPTAC_CompRef_00_iTRAQ_01_2Feb12_Cougar_11-10-09:scan:2).

### This is the mzQC file once again, in full:
**[USI-example.mzQC](https://github.com/HUPO-PSI/mzQC/tree/main/specification_documents/draft_v1/examples/USI-example.mzQC)**
**[USI-example.mzQC](https://github.com/HUPO-PSI/mzQC/tree/main/specification_documents/examples/USI-example.mzQC)**
2 changes: 1 addition & 1 deletion docs/pages/worked-examples/metabo-batches.mzQC.md
@@ -87,4 +87,4 @@ before | after


### This is the mzQC file once again, in full:
**[metabo-batches.mzQC](https://github.com/HUPO-PSI/mzQC/tree/main/specification_documents/draft_v1/examples/metabo-batches.mzQC)**
**[metabo-batches.mzQC](https://github.com/HUPO-PSI/mzQC/tree/main/specification_documents/examples/metabo-batches.mzQC)**
2 changes: 1 addition & 1 deletion docs/pages/worked-examples/mzml-mzqc-example.md
@@ -84,4 +84,4 @@ The `<cv>` elements are again very similar to the respective entries in _mzQC_,
Note that you should also add your software to `<software>` and `<dataProcessing>`.

### This is the mzQC file once again, in full:
**[mzml-mzqc-example.mzML](https://github.com/HUPO-PSI/mzQC/tree/main/specification_documents/draft_v1/examples/mzml-mzqc-example.mzML)**
**[mzml-mzqc-example.mzML](https://github.com/HUPO-PSI/mzQC/tree/main/specification_documents/examples/mzml-mzqc-example.mzML)**
2 changes: 1 addition & 1 deletion docs/pages/worked-examples/set-of-runs.mzQC.md
@@ -456,4 +456,4 @@ On the other hand, omitting the `healthy`/`diseased` setQualities is not sensib
}
```
### This is the mzQC file once again, in full:
**[sets-of-runs.mzQC](https://github.com/HUPO-PSI/mzQC/tree/main/specification_documents/draft_v1/examples/set-of-runs.mzQC)**
**[sets-of-runs.mzQC](https://github.com/HUPO-PSI/mzQC/tree/main/specification_documents/examples/set-of-runs.mzQC)**
41 changes: 41 additions & 0 deletions meeting_notes/2024/20240319_psi2024.md
@@ -0,0 +1,41 @@
# HUPO-PSI Spring Meeting 2024

## March 19, 2024

- Wout Bittremieux
- Kozo Nishida
- Tim Van Den Bossche
- Mathias Walzer

## Tutorials and examples

Going through the [tutorials and examples](https://hupo-psi.github.io/mzQC/examples/) to make sure they are compliant with the 1.0.0 format. Some tutorials were fully updated, some need new CV terms, others still need to be revised.

- Common issues with the examples:
- CV version needs to be updated.
- Some outdated CV terms used.
- Units were not always specified for all relevant metrics.
- Make sure that the file name and location match, and that the latter is a proper URI.
- Warnings by the validator if optional information is missing. For completeness of these examples, optional information has now been included.
- Reorder examples going from introduction to mid-level to advanced topics.
- Issues in the Python notebooks could be resolved by upgrading the Python version to 3.10+. Some link fixes were needed as well.

This also revealed some bugs in pymzqc:

- Installation issues due to outdated version of Pronto and fastobo.
- Activate GitHub Actions for Python 3.10+.
- Incorrect specification of `unit` in `qualityMetric`, make it of ctype `cvParameter`.
- Allow `proteinGroups.txt` from MaxQuant as an identification file.
- Fix installation of offline validator script.

Validation issues:

- Faulty URIs flagged as inconsistent inputs.
- The file extension is inconsistently split from the input file name and location. Instead, don't remove the extension before checking whether the two correspond to each other.
- Display the active version number on the validator web page.
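
The name/location consistency check discussed above can be sketched as follows. This is an illustrative assumption about how such a check might look, not the validator's actual code: the function `name_matches_location` is hypothetical, and it compares the full file name, extension included, against the last path segment of the location URI.

```python
from pathlib import PurePosixPath
from urllib.parse import unquote, urlparse


def name_matches_location(name: str, location: str) -> bool:
    """Return True if `name` equals the final path segment of `location`.

    Hypothetical helper: the extension is deliberately NOT stripped,
    so 'sample.raw.mzML' and 'sample.mzML' are treated as different files.
    """
    # Take the path component of the URI and decode percent-escapes.
    path = unquote(urlparse(location).path)
    return PurePosixPath(path).name == name


print(name_matches_location("sample.raw.mzML", "file:///data/sample.raw.mzML"))
print(name_matches_location("sample.mzML", "file:///data/sample.raw.mzML"))
```

Comparing the unmodified names avoids having to decide where a multi-part extension such as `.raw.mzML` begins.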

## Promoting mzQC

Most current material is geared towards bioinformaticians. Instead, our main audience should be the broader scientific community, to educate them on how QC can improve their work.
Therefore, besides technical tutorials and examples, we should clearly describe which end-to-end tools are available and how to harness mzQC. This already needs to be obvious from the website's main page!
Some more details were added to the [tools page](https://hupo-psi.github.io/mzQC/resource-guide/#qc-software), but this still needs to be significantly extended with information on the various tools implementing mzQC and which QC metrics they can calculate.
22 changes: 22 additions & 0 deletions meeting_notes/2024/20240403_telco.md
@@ -0,0 +1,22 @@
# QC working group teleconference 3 April 2024

- Chris Bielow
- Wout Bittremieux
- Tim Van Den Bossche
- Mathias Walzer

---

## mzQC software libraries manuscript

Rather than synchronizing publication of the manuscript with the main manuscript describing the mzQC format itself, we decided to submit it to the [JASMS special issue on computational mass spectrometry](https://axial.acs.org/analytical-chemistry/jasms-call-for-papers-special-issue-on-computational-mass-spectrometry) (submission deadline April 30, 2024).

Technical aspects related to finalizing the manuscript:
- Move away from the Nextflow analysis workflow because it just added complexity and compatibility issues. It is also overkill for this limited example.
- Discussion on the necessity of proper FDR control, including lowering the FDR threshold to 1% to account for E. coli's small proteome, and the need for protein-level FDR control, as we report protein results.
- All software libraries and the analysis repository on GitHub need to be brought up to date matching the results in the manuscript. Repository cleanup can still happen after the manuscript submission as well.
- Discussion on how to represent the results of the QC analysis in the manuscript. Tim had some suggestions and will try to clarify the figure, after Mathias shares the latest updated version.
- Requests for missing CV terms were submitted to the PSI-MS-CV repository.

Related discussions:
- mzQC file merging is a relevant task that should be easily supported by the software libraries. For the current manuscript a somewhat hacked-together solution initially suffices.
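
As a rough illustration of what a simple merge could involve, the sketch below concatenates the `runQualities` and `setQualities` lists of several mzQC documents that have already been parsed from JSON, and de-duplicates `controlledVocabularies` by name. It is an assumption-laden sketch, not the approach used for the manuscript and not a pymzqc function: `merge_mzqc` is a hypothetical helper, it takes all other top-level fields from the first document, and it does not reconcile differing versions or metadata.

```python
def merge_mzqc(docs):
    """Merge parsed mzQC documents (dicts with a top-level 'mzQC' key).

    Hypothetical helper: top-level fields other than the quality lists and
    the CV list are taken from the first document without reconciliation.
    """
    merged = {"mzQC": dict(docs[0]["mzQC"])}

    # Concatenate run and set qualities across all documents.
    for key in ("runQualities", "setQualities"):
        combined = []
        for doc in docs:
            combined.extend(doc["mzQC"].get(key, []))
        if combined:
            merged["mzQC"][key] = combined
        else:
            merged["mzQC"].pop(key, None)

    # Keep the first occurrence of each controlled vocabulary, keyed by name.
    vocabularies = {}
    for doc in docs:
        for cv in doc["mzQC"].get("controlledVocabularies", []):
            vocabularies.setdefault(cv["name"], cv)
    merged["mzQC"]["controlledVocabularies"] = list(vocabularies.values())
    return merged
```

A library-level implementation would additionally need to check that the inputs share a compatible mzQC version and CV release before merging.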
30 changes: 30 additions & 0 deletions meeting_notes/2024/20240417_telco.md
@@ -0,0 +1,30 @@
# QC working group teleconference 17 April 2024

- Chris Bielow
- Wout Bittremieux
- Nils Hoffmann
- Dave Tabb
- Tim Van Den Bossche
- Mathias Walzer

---

## Working group charter

We went over the [working group charter](https://www.psidev.info/quality-control-working-group-charter) with an eye on its annual update. Some minor updates (affiliation update), but other than that, no major changes are needed.

Our key effort during the next year will be to demonstrate the benefits and applications of mzQC in practice, through the development of use cases and support of QC software.

Wout will update the charter and send it for consultation to Slack.

## CV

- Precursor mass deviation ([#254](https://github.com/HUPO-PSI/psi-ms-CV/pull/254)): Good to be merged. Note that FragPipe reports the absolute mass deviation (although not explicitly as a QC metric).
- Number of missed cleavages ([#255](https://github.com/HUPO-PSI/psi-ms-CV/pull/255)): Discussion on how to represent this metric and its definition. Mathias will apply some refinements.
- identified MS2 vs RT ([#255](https://github.com/HUPO-PSI/psi-ms-CV/pull/255)): Some confusion around what this metric precisely represents, indicating that its definition should be clarified. Mathias will update it in analogy to some existing related terms.

## mzQC software libraries manuscript

- Mathias reported on the updated data analysis workflow (including protein-level FDR control, which necessitated switching search engines).
- Mathias will share the updated data analysis figure with Tim for some visual edits.
- The [JASMS special issue on computational mass spectrometry](https://axial.acs.org/analytical-chemistry/jasms-call-for-papers-special-issue-on-computational-mass-spectrometry) submission deadline is April 30, 2024. We will have an additional meeting next week to finalize the manuscript by this deadline.