Memory error when creating a k-mer library #20

Open
senaj opened this issue Oct 20, 2015 · 3 comments

Comments

@senaj

senaj commented Oct 20, 2015

I receive a MemoryError when attempting to create a 25-mer library from a very large FASTA file (9 GB). Please see the error message below.

Traceback (most recent call last):
  File "/sw/Cookiecutter/1.0/cookiecutter", line 728, in <module>
    cookiecutter()
  File "/sw/Cookiecutter/1.0/cookiecutter", line 677, in cookiecutter
    create_kmer_file(args.input, args.output, args.length)
  File "/sw/Cookiecutter/1.0/cookiecutter", line 406, in create_kmer_file
    kmers[rkmer] += 1
MemoryError

Does this error mean that I am running out of memory on my server?

However, I am able to create 25-mer libraries from smaller FASTA files (174 MB and 56 KB).

Should I split my large FASTA file (9 GB) into several smaller FASTA files when creating 25-mer libraries?
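
On the splitting question: splitting the FASTA file by itself may not lower peak memory if the per-file counts are later merged in one in-memory table. An alternative is to partition by k-mer rather than by file: reread the input several times and, on each pass, count only the k-mers that fall into one hash bucket. The sketch below illustrates that idea. It is not Cookiecutter's code; the function names, the CRC32 bucketing, the skipping of k-mers that contain N, and the tab-separated `kmer<TAB>count` output are all assumptions made for illustration.

```python
import zlib
from collections import Counter


def iter_fasta_sequences(path):
    """Yield each record of a FASTA file as one uppercase string."""
    chunks = []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                # A new header closes the previous record, if any.
                if chunks:
                    yield "".join(chunks).upper()
                chunks = []
            elif line:
                chunks.append(line)
    if chunks:
        yield "".join(chunks).upper()


def count_kmers_in_passes(fasta_path, output_path, k=25, passes=8):
    """Count k-mers in `passes` sweeps over the input (hypothetical helper).

    Each sweep keeps only the k-mers whose hash lands in the current bucket,
    so roughly 1/passes of the distinct k-mers are in memory at any time.
    """
    with open(output_path, "w") as out:
        for bucket in range(passes):
            counts = Counter()
            for seq in iter_fasta_sequences(fasta_path):
                for i in range(len(seq) - k + 1):
                    kmer = seq[i:i + k]
                    if "N" in kmer:
                        continue  # assumption: ambiguous bases are skipped
                    if zlib.crc32(kmer.encode()) % passes == bucket:
                        counts[kmer] += 1
            # Every k-mer belongs to exactly one bucket, so no merging is needed.
            for kmer, n in counts.items():
                out.write(f"{kmer}\t{n}\n")
```

The trade-off is time for memory: the input is read `passes` times, and pure Python will be slow on a 9 GB file, so this only shows the principle that dedicated k-mer counters apply far more efficiently.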

10/20/15
I am re-making the libraries for my large FASTA file (9 GB) on a server with more memory, and the job is currently using about 220 GB of memory. Is this normal? Is there a way to reduce memory usage?
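
For a rough sense of why the numbers get this large: the traceback shows counts being kept in an in-memory table keyed by k-mer (`kmers[rkmer] += 1`). In CPython, each distinct string key carries object overhead on top of the dict slot, so a table with a very large number of distinct 25-mers can plausibly reach hundreds of gigabytes. The sketch below is a way to measure that per-entry cost on your own machine, and it shows one common mitigation: packing each k-mer into an integer at 2 bits per base. The helper names are hypothetical and this is not how Cookiecutter stores k-mers.

```python
import sys

_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
_BASES = "ACGT"


def encode_kmer(kmer):
    """Pack an A/C/G/T string into an int, 2 bits per base."""
    value = 0
    for base in kmer:
        value = (value << 2) | _CODE[base]
    return value


def decode_kmer(value, k):
    """Inverse of encode_kmer; k is needed to restore leading 'A's."""
    bases = []
    for _ in range(k):
        bases.append(_BASES[value & 3])
        value >>= 2
    return "".join(reversed(bases))


def approx_bytes_per_entry(keys):
    """Rough cost per entry of a dict {key: 1} built from a non-empty iterable.

    Counts the dict container plus the key objects; the shared value 1 is
    ignored, so real count tables will be somewhat larger.
    """
    table = {key: 1 for key in keys}
    total = sys.getsizeof(table) + sum(sys.getsizeof(key) for key in table)
    return total / len(table)
```

Calling `approx_bytes_per_entry` on a sample of string k-mers and on their `encode_kmer` values, then multiplying by the expected number of distinct k-mers, gives a ballpark for the full table. Integer keys help, but a plain Python dict remains costly per entry, which is one reason dedicated k-mer counters are recommended for large inputs.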

@ad3002
Owner

ad3002 commented Nov 2, 2015

Dear @senaj, thank you for your report. We have added memory error handling for such cases. For large datasets you should use Jellyfish (see the updated documentation at https://github.com/ad3002/Cookiecutter#creating-a-library-of-k-mers). Please let us know whether it works for your dataset.

@senaj
Author

senaj commented Nov 2, 2015

Hi ad3002,

I was able to create a k-mer library using Jellyfish and use this file to initiate cookiecutter. However, when I ran cookiecutter separate on my reads (single-end fastq, 26 million reads), I maxed out a server with 528 GB of RAM and crashed the system. Cookiecutter seems to work well with small k-mer libraries.

@fxbabin

fxbabin commented Jan 25, 2016

Hi,

To generate a k-mer library without memory problems, I used KMC2 (it took a few seconds for 450 thousand reads, and memory usage was not a concern). The resulting library had about half as many entries as the one generated with "make_library", yet cookiecutter recruited nearly the same number of reads. Maybe this can be a solution for the moment.
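
One possible explanation for the roughly twofold difference in library size, offered here as an assumption and not confirmed anywhere in this thread, is strand handling. A library that stores each k-mer together with its reverse complement has two entries where a library of canonical k-mers (the lexicographically smaller of the two orientations) has one. The sketch below demonstrates that effect with hypothetical helper names; it does not reproduce what either tool actually does.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer):
    """Return the reverse complement of an A/C/G/T string."""
    return kmer.translate(_COMPLEMENT)[::-1]


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    rc = reverse_complement(kmer)
    return kmer if kmer <= rc else rc


def library_sizes(sequence, k):
    """Return (entries when storing both strands, entries when storing canonical k-mers)."""
    both_strands, canonical_only = set(), set()
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        both_strands.add(kmer)
        both_strands.add(reverse_complement(kmer))
        canonical_only.add(canonical(kmer))
    return len(both_strands), len(canonical_only)
```

For odd k, no k-mer equals its own reverse complement, so the first number is exactly twice the second. Both representations describe the same set of sequences, which would be consistent with nearly the same reads being recruited.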
