proteomicsLFQ with new SVM results in UPS1 dataset #301

ypriverol · 2023-10-06T15:46:38Z

Description of feature

@timosachsenberg @jpfeuffer @daichengxin we have also run it with the parameters:

feature_with_id_min_score = 0.25
feature_without_id_min_score = 0.75

Dataset PXD001819, Files: http://ftp.pride.ebi.ac.uk/pub/databases/pride/resources/proteomes/quantms-benchmark/PXD001819NEWSAGE/

Results Table:

Previous UPS detected:

Current UPS detected:

This issue is for discussing the default parameters for MBR. It is related to #287.
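For readers unfamiliar with the two parameters, here is a minimal sketch of how such a pair of thresholds could act on features. It assumes both values are minimum SVM probabilities, one applied to features that carry a peptide ID and one to features without an ID; the `Feature` class and `keep_feature` function are hypothetical and are not the proteomicsLFQ implementation.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    score: float   # SVM probability of the feature (assumed to lie in 0..1)
    has_id: bool   # True if the feature carries a peptide identification


def keep_feature(feature, with_id_min=0.25, without_id_min=0.75):
    """Keep a feature if its score reaches the threshold for its class.

    Assumption: identified features use the looser threshold and
    unidentified (transferred) features use the stricter one.
    """
    threshold = with_id_min if feature.has_id else without_id_min
    return feature.score >= threshold


# Made-up example features, for illustration only.
features = [
    Feature(0.30, True),   # identified, above 0.25: kept
    Feature(0.30, False),  # unidentified, below 0.75: dropped
    Feature(0.80, False),  # unidentified, above 0.75: kept
    Feature(0.20, True),   # identified, below 0.25: dropped
]
kept = [f for f in features if keep_feature(f)]
print(len(kept))
```

Lowering `with_id_min` or `without_id_min` lets more features through, which is the trade-off discussed in the comments below.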

ypriverol · 2023-10-06T16:04:59Z

@daichengxin can you provide a plot similar to the last one for MaxQuant?

timosachsenberg · 2023-10-06T16:19:13Z

I think the numbers don't tell you that much here. Can we somehow see how well the quantities match? e.g., if we only picked up noise in the old version the new one would be better. (and the other way around)

ypriverol · 2023-10-06T17:12:25Z

You are right, this is why I asked for the MQ version of the last plot. Our first version of the plot had a lot of noise quantities, and we have solved that, but now we have to find the right parameters for both thresholds. Judging by the results in #287, the current 0.25 and 0.75 look too stringent.

I'm now running both datasets with 0.10 and 0.90.

jpfeuffer · 2023-10-06T17:58:33Z

I agree. Can we do an automated evaluation maybe? We could even do a little step in nextflow and then launch the process multiple times with different parameters.
How long is the runtime with -resume in the last step?
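A rough sketch of what such a sweep could look like, generating one Nextflow command per threshold combination. The pipeline name, the idea that both thresholds are exposed as `--feature_with_id_min_score` / `--feature_without_id_min_score` pipeline parameters, and the `--outdir` naming are assumptions for illustration; only `-resume` is taken from the discussion. The commands are printed, not executed.

```python
from itertools import product


def sweep_commands(with_id_scores, without_id_scores, pipeline="bigbio/quantms"):
    """Build one command per (with_id, without_id) threshold pair.

    Assumptions: `pipeline` is a placeholder name, and the two thresholds
    and the output directory can be passed as `--name value` parameters.
    """
    commands = []
    for with_id, without_id in product(with_id_scores, without_id_scores):
        outdir = f"results_{with_id:.2f}_{without_id:.2f}"
        commands.append([
            "nextflow", "run", pipeline, "-resume",
            "--feature_with_id_min_score", str(with_id),
            "--feature_without_id_min_score", str(without_id),
            "--outdir", outdir,
        ])
    return commands


# Threshold values mentioned in this thread.
commands = sweep_commands([0.10, 0.25], [0.75, 0.90])
for cmd in commands:
    print(" ".join(cmd))
```

With `-resume`, steps whose inputs and parameters are unchanged should be served from the cache, so ideally only the quantification step reruns for each combination.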

jpfeuffer · 2023-10-06T18:49:40Z

But from my memory this actually looks a bit like the MQ results. 2500 amol was the turning point.

Except for the lowest concentration of course. In theory it does not make sense that you find more features in lower concentrations, if you didn't even find them in higher ones.
Unless the higher number of MS2 IDs in the higher concentrations, and the features subsequently extracted from them, leads to increased pressure to link untargeted features in the lowest concentration. And the more higher concentrations (relative to the concentration you are currently looking at) you have, the more features you try to link. Therefore the lowest concentration gets more "chances". Does this make sense?
I think in this case an FDR approach vs the same probability cutoff for every run would bring additional value.
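To illustrate the difference, here is a minimal sketch that estimates the FDR of an accepted set as the mean of `1 - probability` and accepts, per run, as many top-scoring features as fit under a target FDR. This is one common way to derive an FDR from posterior probabilities, not necessarily what an MBR FDR in proteomicsLFQ would do; the function name and the example scores are made up.

```python
def accepted_at_fdr(probabilities, target_fdr=0.05):
    """Return how many top-scoring features can be accepted in one run.

    The expected FDR of the top-n features is estimated as the mean
    posterior error probability, i.e. mean(1 - p) over those n features.
    """
    ranked = sorted(probabilities, reverse=True)
    error_sum = 0.0
    accepted = 0
    for n, p in enumerate(ranked, start=1):
        error_sum += 1.0 - p
        if error_sum / n > target_fdr:
            break  # the running mean only grows from here on
        accepted = n
    return accepted


# Made-up scores: a run with mostly confident features and a run
# (e.g. a low concentration) with many borderline ones.
run_high = [0.99, 0.98, 0.97, 0.90, 0.60]
run_low = [0.99, 0.80, 0.78, 0.76, 0.76]

fixed_cutoff = 0.75
print(sum(p >= fixed_cutoff for p in run_high), accepted_at_fdr(run_high))
print(sum(p >= fixed_cutoff for p in run_low), accepted_at_fdr(run_low))
```

In this toy example the fixed 0.75 cutoff accepts all five borderline features in the second run, while the FDR criterion accepts only one, so a run that merely gets more "chances" would not automatically report more features.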

I also agree with @timosachsenberg that we should do plots that look at the relative differences between each concentration, instead of just found features/proteins.
E.g. check the number of significantly deregulated proteins between samples.
(Although the higher number of found features in the lowest concentration remains a bit weird to me).
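A minimal sketch of the relative-difference check: since the spiked-in UPS1 amounts are known, the expected log2 fold change between two samples is just the log2 ratio of their concentrations, and the deviation of the observed ratio from it shows how well the quantities match. The function names, the concentration pairs other than 2500 amol, and all intensities are illustrative assumptions.

```python
import math


def expected_log2fc(conc_a, conc_b):
    """Expected log2 fold change from the spiked-in amounts (same unit)."""
    return math.log2(conc_a / conc_b)


def log2fc_error(intensity_a, intensity_b, conc_a, conc_b):
    """Observed minus expected log2 fold change for one protein.

    Negative values mean the observed ratio is compressed towards 1.
    """
    observed = math.log2(intensity_a / intensity_b)
    return observed - expected_log2fc(conc_a, conc_b)


# Made-up intensities for one UPS1 protein in two comparisons.
print(expected_log2fc(5000, 2500))              # 2-fold difference
print(log2fc_error(4.0e6, 2.0e6, 5000, 2500))   # ratio matches the expectation
print(log2fc_error(2.2e6, 2.0e6, 500, 250))     # ratio compressed at low amounts
```

Collecting this error per protein and per concentration pair would give a plot that separates real gains from additional noise features.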

daichengxin · 2023-10-07T13:18:24Z

MaxQuant results provided by the original paper:
PalombaA_pubmed_34038140_2.xlsx

edit: replaced P1-P9 with concentrations. @jpfeuffer

daichengxin · 2023-10-07T14:32:46Z

The same plot as given by the authors: https://europepmc.org/articles/PMC8280745/figure/fig2/. 2500 amol was the turning point. And our results look better than MQ at low concentrations?

daichengxin · 2023-10-07T16:00:10Z

Test results. Results folder: http://ftp.pride.ebi.ac.uk/pub/databases/pride/resources/proteomes/quantms-benchmark/PXD001819NEWSAGE/proteomicslfq/

jpfeuffer · 2023-10-07T16:32:54Z

Can you replace P1 and P9 with something reasonable that includes the amol concentration in the header of the MQ xlsx?

ypriverol · 2023-10-11T12:43:08Z

Test results. Results folder: http://ftp.pride.ebi.ac.uk/pub/databases/pride/resources/proteomes/quantms-benchmark/PXD001819NEWSAGE/proteomicslfq/

@timosachsenberg @jpfeuffer for 0.10 and 0.75 I used an intensity threshold of 1000. It actually looks much better than 0.10 and 0.90 with an intensity threshold of 10'000. What do you think?

jpfeuffer · 2023-10-11T12:59:00Z

Yes looks better imo. But we should really check expected fold changes.

ypriverol · 2023-10-11T13:00:42Z

Do you need the MSstats output? I think it is in the project folder: http://ftp.pride.ebi.ac.uk/pub/databases/pride/resources/proteomes/quantms-benchmark/PXD001819NEWSAGE-0.10-0.75/msstats/

jpfeuffer · 2023-10-11T13:04:10Z

upsReval.zip

Here is my old script. In the beginning it includes some fixes for MSstats loading and plotting which are probably not necessary anymore.
I am not sure when I will get to it. Maybe Friday.

ypriverol · 2023-10-11T13:05:44Z

I will try to make it run today and let you know. Do you have the original figures from MQ?

jpfeuffer · 2023-10-11T13:10:14Z

It can read MQ files and PD files, too. I use the results from https://github.com/wombat-p

ypriverol · 2023-10-11T13:11:51Z

Can you give me a direct link to the PD and MQ outputs from wombat-p? I'm now configuring your script.

ypriverol · 2023-10-11T14:33:00Z

@jpfeuffer here are the results of your script with this data:

jpfeuffer · 2023-10-11T14:56:34Z

Looks good, but I don't think it is a large improvement in fold changes for comparisons with <=500 amol.

ypriverol · 2023-10-11T14:59:03Z

We are fine with that. I have the feeling we have greatly reduced the false-positive signals in MBR, and that is the major advantage. But as you said, nothing has happened in terms of the feature detection itself in the low concentrations.

jpfeuffer · 2023-10-11T19:08:27Z

I agree that it might be enough, since the fold changes did not get worse but the number of found features looks a bit better. I think MQ identified far fewer in the lower concentrations (which doesn't mean much, since they might actually be close to unquantifiable and, looking at the quants, maybe should stay unreported). We might actually be able to set a higher min threshold for unidentified features.
I am not sure if my MQ files are the best to use (old version and I don't know the settings used) but I can send them tomorrow.

It would be great if we could have some debug output on all traces of all features across a consensusFeature. Or the extracted areas.
@timosachsenberg we could also check the interpolation of quantities for traces that could not be fit (in the OpenSwath algos). I think you fixed something there but maybe this approach in general is not so great for our purpose?

ypriverol · 2023-10-13T08:54:45Z

I will close this issue in favor of #303. Please move the discussion about future improvements to MBR LFQ to that issue.

jpfeuffer · 2023-10-15T12:11:07Z

@ypriverol @timosachsenberg This was my old plot for MQ. I don't remember the version. But you can clearly see that if MQ finds a feature, it is usually correct in rel. quants. We have more proteins basically everywhere, but our quants are very off in the low concentrations. My interpretation is a significant overestimation of quants in features in lower concentrations. Basically, from 2500 and below there is no difference in quants for a linked feature anymore.
FDR might help, but maybe a more sensitive quantification is also necessary in those low-concentration features (since, as you can see, MQ is able to recover 3-4 proteins with sometimes close to correct rel. quants).

timosachsenberg · 2023-10-15T13:51:27Z

@jpfeuffer do you think the interpolation you mentioned could be an issue? could be easy to check...

ypriverol added the enhancement label Oct 6, 2023

ypriverol assigned ypriverol, timosachsenberg, jpfeuffer and daichengxin Oct 6, 2023

ypriverol added high-priority release 1.3 documentation labels Oct 6, 2023

ypriverol mentioned this issue Oct 13, 2023: LFQ MBR FDR algorithm needed. #303 (Open, 3 tasks)

ypriverol closed this as completed Oct 13, 2023

timosachsenberg mentioned this issue Oct 16, 2023: [FFID][tweak] thoughts on settings/scores OpenMS/OpenMS#7130 (Draft, 5 tasks)
