Cross link input #3

Open
roivant-matts opened this issue Jun 27, 2023 · 7 comments

Comments

@roivant-matts

Hello, thanks for this excellent project. For reproducibility, could you share the RpoA-RpoC cross-links as a CSV or pkl dictionary? I wasn't able to find them in the paper or its supplements.

@lhatsk
Collaborator

lhatsk commented Jun 27, 2023

The crosslinks are from this paper (Data Availability): https://www.embopress.org/doi/full/10.15252/msb.202311544

RPOA-RPOC.pkl.gz

@roivant-matts
Author

Thanks. I gather the CSV format is 1-based but the dictionary is 0-based? That is what I expected, but I also needed to add FDR values to the file above to get a prediction.

@lhatsk
Collaborator

lhatsk commented Jun 28, 2023

Yes, the CSV format is 1-based and the dictionary is 0-based. Sorry about the missing FDR; my internal set-up is a little different and uses a fixed FDR.
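
For readers following along, the index shift and the fixed FDR can be sketched as below. This is only an illustration: the nested layout `{chain_a: {chain_b: [(i, j, fdr), ...]}}`, the helper names `rows_to_links`, `save_links` and `load_links`, and the example rows are all assumptions made for this sketch, so check the repository README for the layout the code actually expects.

```python
import gzip
import os
import pickle
import tempfile


def rows_to_links(rows, fdr=0.05):
    """Convert 1-based (res_a, chain_a, res_b, chain_b) rows to a 0-based nested dict.

    The layout {chain_a: {chain_b: [(i, j, fdr), ...]}} is an assumption made
    for illustration, not a confirmed description of the project's format.
    """
    links = {}
    for res_a, chain_a, res_b, chain_b in rows:
        if res_a < 1 or res_b < 1:
            raise ValueError("CSV residue numbers are expected to be 1-based")
        # Shift both residue numbers from 1-based to 0-based and attach the FDR.
        links.setdefault(chain_a, {}).setdefault(chain_b, []).append(
            (res_a - 1, res_b - 1, fdr)
        )
    return links


def save_links(links, path):
    """Write the dictionary as a gzip-compressed pickle."""
    with gzip.open(path, "wb") as handle:
        pickle.dump(links, handle)


def load_links(path):
    """Read a gzip-compressed pickle back into a dictionary."""
    with gzip.open(path, "rb") as handle:
        return pickle.load(handle)


# Hypothetical example rows, not real cross-link data.
example_rows = [(1, "A", 50, "B"), (12, "A", 7, "B")]
example_links = rows_to_links(example_rows, fdr=0.05)

# Round trip through a temporary file to confirm the pickle is readable.
example_path = os.path.join(tempfile.mkdtemp(), "example.pkl.gz")
save_links(example_links, example_path)
reloaded_links = load_links(example_path)
```

The `fdr` argument applies one value to every link, which mirrors the "fixed FDR" described above; per-link values would need an extra column in the input rows.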

@roivant-matts
Author

roivant-matts commented Jun 29, 2023

Thanks. I am rerunning to be sure I didn't have a mix-up, but using the v2 params with and without cross-links (the latter as an empty {} pkl.gz), I get the same structure. Both match the reference PDB closely, e.g. when superimposing al2 chain B (RpoC) on chain D of the reference. Is using an empty dictionary as a baseline appropriate in your view? Edit: I am using an FDR of 0.20 on all the links you shared.

@lhatsk
Collaborator

lhatsk commented Jun 30, 2023

Yes, that works as a baseline. We used an FDR of 0.05 but it doesn't matter here.
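
A minimal sketch of such a baseline file, assuming only that the input is a gzip-compressed pickle of a dictionary as discussed in this thread (the helper name `write_empty_links` and the file name `empty.pkl.gz` are hypothetical):

```python
import gzip
import os
import pickle
import tempfile


def write_empty_links(path):
    """Write an empty dictionary as a gzip-compressed pickle (no cross-links)."""
    with gzip.open(path, "wb") as handle:
        pickle.dump({}, handle)


# Hypothetical file name inside a temporary directory.
baseline_path = os.path.join(tempfile.mkdtemp(), "empty.pkl.gz")
write_empty_links(baseline_path)

# Read the file back to confirm it holds an empty dictionary.
with gzip.open(baseline_path, "rb") as handle:
    baseline_links = pickle.load(handle)
```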

We noticed the same thing last week. Increasing the crop size during fine-tuning already seems to improve the RpoA-RpoC prediction in some runs. In our runs, 5/10 failed without crosslinks, whereas 10/10 succeeded with crosslinks. So, as in the other experiments, the crosslinks allow us to focus sampling on the interesting regions.

On the CASP data, it didn't seem to have a big effect (see extended data figure 3 in the v2 paper supplement).

@roivant-matts
Author

For the Figure 3 data, do you recall whether the v2.2.4 or v2.3.0 weights were used for the AlphaFold predictions? I noticed that the v2.3.0 release notes also mention an increased crop size of 640 AA. Apologies for the lag in coming back to this thread; some other testing brought it back to mind.

@lhatsk
Collaborator

lhatsk commented Oct 16, 2023

Sorry for the late response!

For Figure 3 (the Cullin4 data), we switched to v2.3.0 for both AlphaFold and AlphaLink because the other networks were not able to produce meaningful predictions. Essentially, the structures were just floating in space (disconnected). v2.3.0 performs much better for larger complexes.
