foldseek cluster crashes with --cluster-reassign 1 #399
Comments
Yes, documentation is indeed a weak spot. What do you need better documented?
Amazing! Thank you, Martin. I missed documentation on the following areas (though maybe I wasn't looking in the right places):
Your team has done an excellent job of writing a piece of software that is versatile, computationally efficient and, not least, revolutionary. The field craves such tools. Currently, however, the lack of documentation is the bottleneck for widespread adoption. Everybody wants to use this! Thank you again for this.
Thank you for the feedback. We will add this documentation.
@shiraz-shah we now have a static GPU binary that works with ProstT5. We also reworked the documentation. Could you please give it a try? You need to redownload the weights though.
Martin, GPU inference worked great for generating a 3Di database directly from aa sequences! Amazing!!
What about `search` though?
When I did:
`foldseek search DB DB aln tmp --gpu 1 --prefilter-mode 1`
I got:
`Database vOTUs_ss is not a valid GPU database`
Great that ProstT5 works smoothly. Also, thank you for pointing out the problem. We’ve updated the documentation to clarify that padding databases are needed ( Our
OK, Martin, I just tested this with my data set, and here's what it says:
It writes a file Any ideas?
Please update the binary again. We fixed this bug in b2e41c1, but use the latest commit anyway. That one should be close to release candidate status for the next release.
OK, that looks better. But now it says:
FYI, I can see that the input vOTUs and vOTUs_ss are the exact same length (number of lines). Also, it seems the above command succeeds in generating vOTUs_pad and vOTU_ss_pad. They also appear to have the correct size (same number of MBs as vOTUs and vOTUs_ss). However, when I run
It stops at the step where it attempts to rename the Calpha database. Does your vOTUs dataset include Calphas, or were they predicted using ProstT5? Could you please share the preceding step that generated the vOTUs database?
It's ProstT5 only, no Calpha. I generated it like this as per your new instructions:
I just tested it with the latest version with my db and it worked, see log below. I noticed that your database includes a
Amazing software, guys! More documentation would be helpful, though!!
Expected Behavior
That clustering works with cluster reassignment enabled. Clustering works fine with it disabled.
Current Behavior
foldseek cluster DB C tmp --cluster-reassign 1
Crashes with error:
`awk: fatal: cannot open file 'tmp/9215817526405491371/seq_seeds_ca.index' for reading: No such file or directory`
Steps to Reproduce (for bugs)
Make a Foldseek database composed of only amino acid and 3Di sequences, i.e.:
Foldseek Output (for bugs)
awk: fatal: cannot open file 'tmp/9215817526405491371/seq_seeds_ca.index' for reading: No such file or directory
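The missing `seq_seeds_ca.index` suggests the reassignment step expects C-alpha data that a sequence-plus-3Di database does not have. A minimal Python sketch of a pre-flight check follows. It assumes (inferred from the error message, not confirmed by the maintainers) that a database named `DB` keeps its C-alpha coordinates in `DB_ca` and `DB_ca.index`. The helper names `has_calpha` and `cluster_command` are hypothetical, and dropping `--cluster-reassign 1` is only a workaround, not a fix.

```python
from pathlib import Path


def has_calpha(db_prefix: str) -> bool:
    """Return True if both the C-alpha data file and its index exist.

    Assumption: a database named DB stores C-alpha coordinates in
    DB_ca with an index in DB_ca.index.
    """
    prefix = Path(db_prefix)
    ca_data = prefix.with_name(prefix.name + "_ca")
    ca_index = prefix.with_name(prefix.name + "_ca.index")
    return ca_data.is_file() and ca_index.is_file()


def cluster_command(db_prefix: str, out_prefix: str, tmp_dir: str) -> list[str]:
    """Build a `foldseek cluster` argument list.

    --cluster-reassign 1 is appended only when the C-alpha files are
    present; otherwise the plain command from the report is returned.
    """
    cmd = ["foldseek", "cluster", db_prefix, out_prefix, tmp_dir]
    if has_calpha(db_prefix):
        cmd += ["--cluster-reassign", "1"]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. For a database without C-alpha files it falls back to clustering without reassignment, which the report says works.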
Context
generate_foldseek_db.py
Your Environment
Include as many relevant details about the environment you experienced the bug in.