proteomicsLFQ with new SVM results in UPS1 dataset #301
Comments
@daichengxin can you provide a plot similar to the last one, but for MaxQuant?
I think the numbers don't tell you that much here. Can we somehow see how well the quantities match? E.g., if we only picked up noise in the old version, the new one would be better (and the other way around).
You are right, this is why I asked for the same plot from MQ for the last plot. Our first version of the plot had a lot of noise quantities, and we have solved that, but now we have to find the right parameters for both thresholds. Judging by the results in #287, the current 0.25 and 0.75 look too stringent. I'm now running both datasets with
I agree. Can we do an automated evaluation, maybe? We could even add a small step in Nextflow and then launch the process multiple times with different parameters.
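A minimal sketch of how such a parameter sweep could be enumerated before handing each combination to a separate pipeline run. The dictionary keys (`lower`, `upper`, `min_intensity`) are hypothetical labels for illustration, not actual pipeline parameter names; the values are the threshold and intensity settings mentioned in this thread.

```python
from itertools import product


def parameter_grid(lower_thresholds, upper_thresholds, min_intensities):
    """Return every combination as a dict, skipping pairs where lower >= upper."""
    grid = []
    for low, high, intensity in product(
        lower_thresholds, upper_thresholds, min_intensities
    ):
        if low < high:
            # Key names are hypothetical; map them to the real pipeline
            # parameters when launching a run.
            grid.append({"lower": low, "upper": high, "min_intensity": intensity})
    return grid


# Values taken from the settings discussed in this thread.
grid = parameter_grid([0.10, 0.25], [0.75, 0.90], [1000, 10000])
for combo in grid:
    print(combo)
```

Each dict could then be turned into one launch of the workflow, so that all runs are evaluated with the same script afterwards.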
But from my memory this actually looks a bit like the MQ results. 2500 amol was the turning point, except for the lowest concentration of course. In theory it does not make sense that you find more features in lower concentrations if you didn't even find them in higher ones. I also agree with @timosachsenberg that we should do plots that look at the relative differences between each concentration, instead of just found features/proteins.
MaxQuant results provided by the original paper. Edit: replaced P1-P9 with concentrations. @jpfeuffer
The same plot given by the authors: https://europepmc.org/articles/PMC8280745/figure/fig2/. 2500 amol was the turning point. And our results look better than MQ at low concentrations?
Test results. Results folder: http://ftp.pride.ebi.ac.uk/pub/databases/pride/resources/proteomes/quantms-benchmark/PXD001819NEWSAGE/proteomicslfq/
Can you replace P1 and P9 with something reasonable that includes the amol concentration in the header of the MQ xlsx?
@timosachsenberg @jpfeuffer for the 0.10 and 0.75 run I used an intensity threshold of 1000. It actually looks much better than 0.1 and 0.9 with an intensity threshold of 10'000. What do you think?
Yes, looks better imo. But we should really check expected fold changes.
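A rough sketch of what checking expected fold changes could look like for a single spiked protein: the expected log2 fold change follows from the two spike-in concentrations, and the error is the observed log2 intensity ratio minus that value. The function names and the intensities below are made up for illustration; this is not the evaluation script shared later in the thread.

```python
import math


def expected_log2_fc(conc_a_amol, conc_b_amol):
    """Expected log2 fold change between two spike-in concentrations."""
    return math.log2(conc_a_amol / conc_b_amol)


def log2_fc_error(intensity_a, intensity_b, conc_a_amol, conc_b_amol):
    """Observed minus expected log2 fold change (0 means a perfect match)."""
    observed = math.log2(intensity_a / intensity_b)
    return observed - expected_log2_fc(conc_a_amol, conc_b_amol)


# Illustrative intensities only: identical quantities in both conditions,
# although the spiked amounts differ five-fold (2500 vs 500 amol).
error = log2_fc_error(1.0e6, 1.0e6, 2500, 500)
print(f"expected log2 FC: {expected_log2_fc(2500, 500):.2f}, error: {error:.2f}")
```

A negative error for a high-versus-low comparison means the observed ratio is compressed, which is what you would see if quantities at the lower concentration are overestimated.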
Do you need the MSstats output? I think it is in the project folder: http://ftp.pride.ebi.ac.uk/pub/databases/pride/resources/proteomes/quantms-benchmark/PXD001819NEWSAGE-0.10-0.75/msstats/
Here is my old script. In the beginning it includes some fixes for MSstats loading and plotting which are probably not necessary anymore.
I will try to make it run today and let you know. Do you have the original figures from MQ?
It can read MQ files and PD files, too. I use the results from https://github.com/wombat-p
Can you provide me a direct link to those PD and MQ outputs from wombat-p? I'm now configuring your script.
Here are the results of your script with this data, @jpfeuffer:
Looks good, but I think there is no large improvement in fold changes for comparisons with <=500 amol.
We are fine with that. I have the feeling we have greatly reduced the false positive signals in MBR, and that is the major advantage. But as you said, nothing has happened in terms of feature detection itself at the low concentrations.
I agree that it might be enough, since the fold changes did not get worse but the number of found features looks a bit better. I think MQ identified way fewer in the lower concentrations (which doesn't mean much, since they might actually be close to unquantifiability and, looking at the quants, maybe should stay unreported). We might actually be able to set a higher min threshold for unidentified features. It would be great if we could have some debug output on all traces of all features across a consensusFeature, or on the extracted areas.
I will close the following issue in favor of #303. Please move the discussions about future improvements to MBR LFQ to that issue.
@ypriverol @timosachsenberg This was my old plot for MQ. I don't remember the version. But you can clearly see that if MQ finds a feature, it is usually correct in rel. quants. We have more proteins basically everywhere, but our quants are very off in the low concentrations. My interpretation is a significant overestimation of quants for features in lower concentrations. Basically, from 2500 amol and below there is no longer any difference in quants for a linked feature.
@jpfeuffer do you think the interpolation you mentioned could be an issue? It could be easy to check...
Description of feature
@timosachsenberg @jpfeuffer @daichengxin we have also run it with parameters:
Dataset PXD001819, Files: http://ftp.pride.ebi.ac.uk/pub/databases/pride/resources/proteomes/quantms-benchmark/PXD001819NEWSAGE/
Results Table:
Previous UPS detected:
Current UPS detected:
This issue is for discussing the default parameters for MBR. It is related to #287.