PXD000279 #5

Open
ypriverol opened this issue Nov 29, 2021 · 3 comments

Comments

ypriverol commented Nov 29, 2021

Hi @daichengxin:

This dataset (issue #1) is missing the RAW file for one fraction, which breaks the quantms pipeline. It cannot be run with the current version of quantms.

@ypriverol ypriverol self-assigned this Jan 8, 2022
@ypriverol ypriverol added documentation Improvements or additions to documentation help wanted Extra attention is needed urgent labels Jan 8, 2022
ypriverol commented Jan 8, 2022

Results of the dataset PXD000279 can be seen here: http://ftp.pride.ebi.ac.uk/pride/data/proteomes/proteogenomics/benchmakrs/

The current results are worse than the original quant results from MQ. Some issues found:

  • L peptides in the database are not identified. Instead, the MQ results report a lot of hits with I as a replacement.
  • Lots of hits in the MQ results are not even identified in the non-filtered list from MS-GF+ and Comet. Example peptide: TSEFQAEVQK. Interestingly, the scan number for this peptide in the MQ output is 864524. However, in the mzML converted with ThermoRawFileParser, the highest scan number is 16168.

I have reviewed the parameters and the database in detail, and everything seems to be the same as in the original MQ searches.

Here are the peptides from MQ:

modificationSpecificPeptides-results.txt.gz

@jpfeuffer

As long as the L peptides map to proteins that have both I and L, everything is fine. Of course, this needs to be considered during comparison, but only when comparing peptides.
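The I/L point above can be handled at comparison time. A minimal sketch, assuming the peptide sequences from both tools are already available as plain strings: it collapses I and L (which have identical mass) before comparing the two sets. The helper names `il_key` and `compare_peptides` and the `ELVISK`/`EIVLSK` sequences are hypothetical; only `TSEFQAEVQK` comes from this thread.

```python
def il_key(peptide: str) -> str:
    """Collapse isoleucine (I) into leucine (L); the two residues are isobaric."""
    return peptide.upper().replace("I", "L")


def compare_peptides(ours, reference):
    """Compare two peptide collections while ignoring I/L differences."""
    ours_keys = {il_key(p) for p in ours}
    ref_keys = {il_key(p) for p in reference}
    return {
        "shared": ours_keys & ref_keys,
        "only_ours": ours_keys - ref_keys,
        "only_reference": ref_keys - ours_keys,
    }


# Hypothetical input: the second pair differs only by I/L placement.
result = compare_peptides(
    ["TSEFQAEVQK", "ELVISK"],
    ["TSEFQAEVQK", "EIVLSK"],
)
print(sorted(result["shared"]))
```

This only applies to peptide-level comparison; at the protein level the mapping already covers both residues.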

A scan number of 864524 seems unlikely. MQ is probably counting all spectra across all files, which is a pity for the comparison.
The only thing we could do is compare against an MSConvert conversion, but I guess such a big error would have been found by now.
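One way to test the running-count guess is to map a global number back to a file and a local scan. This is a sketch only: whether MQ actually numbers spectra this way is unconfirmed, and the function name `locate_global_scan` and the per-file counts below are hypothetical, not taken from this dataset.

```python
from bisect import bisect_right
from itertools import accumulate


def locate_global_scan(global_scan, spectra_per_file):
    """Map a 1-based running spectrum number to (file_index, local_scan).

    Assumes spectra are numbered consecutively across files in list order.
    """
    ends = list(accumulate(spectra_per_file))  # last global number in each file
    if not ends or not 1 <= global_scan <= ends[-1]:
        raise ValueError("global scan number is outside the combined range")
    file_index = bisect_right(ends, global_scan - 1)
    start = ends[file_index - 1] if file_index else 0
    return file_index, global_scan - start


# Hypothetical counts for three files with 10, 5 and 8 spectra.
print(locate_global_scan(11, [10, 5, 8]))
```

If the local scan recovered this way does not point at a matching spectrum in the mzML, the running-count explanation can be ruled out.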

Can you describe "worse"?


ypriverol commented Jan 9, 2022

We have decided to re-run all the searches with MQ and see if the results are different. We have seen two completely different peptide spaces between the original searches in the paper and our results, which is strange. We will use the same database and perform the searches again.

ypriverol pushed a commit that referenced this issue Jan 16, 2023