Error for single reads during FASTQ linting preprocessing #1483

Open
cchapus opened this issue Jan 8, 2025 · 3 comments
Labels
bug Something isn't working

Comments

@cchapus

cchapus commented Jan 8, 2025

Description of the bug

Hi,

I tried the new 3.18.0 release with my single-read RNA-seq data. I was happy to test the option of using pre-prepared SortMeRNA files.

But it failed each time at the step process > NFCORE_RNASEQ:RNASEQ:FASTQ_QC_TRIM_FILTER_SETSTRANDEDNESS:FQ_LINT.

I looked at the corresponding .command.sh file:

fq lint \
    --disable-validator P001 \
    XXXXX.fastq.gz > XXXXX.fq_lint.txt

But fq lint is for validating a FASTQ file pair, and I have single-read files.

I worked around it by adding skip_linting: true to the params file.

Command used and terminal output

command:
nextflow run \
/path/Nextflow/nf-core-rnaseq/3_18_0/ \
-profile singularity \
-params-file params_rnaseq_final.yaml

output:
-[nf-core/rnaseq] Pipeline completed with errors-
WARN: Killing running tasks (6)
ERROR ~ Error executing process > 'NFCORE_RNASEQ:RNASEQ:FASTQ_QC_TRIM_FILTER_SETSTRANDEDNESS:FQ_LINT (XXXXX)'

Caused by:
  Process `NFCORE_RNASEQ:RNASEQ:FASTQ_QC_TRIM_FILTER_SETSTRANDEDNESS:FQ_LINT (006_J0)` terminated with an error exit status (1)


Command executed:

  fq lint \
      --disable-validator P001 \
      XXXXX.fastq.gz > XXXXX.fq_lint.txt
  
  cat <<-END_VERSIONS > versions.yml
  "NFCORE_RNASEQ:RNASEQ:FASTQ_QC_TRIM_FILTER_SETSTRANDEDNESS:FQ_LINT":
      fq: $(echo $(fq lint --version | sed 's/fq-lint //g'))
  END_VERSIONS

Command exit status:
  1

Command output:
  (empty)

Command error:
  XXXXX.fastq.gz:2037434:1: [S004] CompleteValidator: empty sequence

Work dir:
  /path/work/9c/d3a299bb8f82445c63c652011ceaed

Container:
  /path/Nextflow/cache/depot.galaxyproject.org-singularity-fq-0.12.0--h9ee0642_0.img

Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`

 -- Check '.nextflow.log' file for details
ERROR ~ Pipeline failed. Please refer to troubleshooting docs: https://nf-co.re/docs/usage/troubleshooting

 -- Check '.nextflow.log' file for details
ERROR ~ Pipeline failed. Please refer to troubleshooting docs: https://nf-co.re/docs/usage/troubleshooting

 -- Check '.nextflow.log' file for details

Relevant files

No response

System information

N E X T F L O W ~ version 24.10.3
Linux Ubuntu workstation
container: singularity
nf-core/rnaseq: 3.18.0

cchapus added the bug label on Jan 8, 2025
@pinin4fjords
Member

The error doesn't suggest anything related to pairing to me. Can you explain why you think that is the case?

Did you check the file for missing records (which is what S004 flags)? Are you able to share the FASTQ in question?
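
One way to check for the condition that S004 reports is to scan the sequence lines directly. The following is a minimal sketch, assuming strict four-line FASTQ records and gzip on the PATH. The function name empty_seq_lines is hypothetical and not part of fq.

```shell
# Print the line number of every empty sequence line in a gzipped FASTQ.
# Assumption: strict four-line records (header, sequence, "+", quality),
# so the sequence is always line 2 of each group of four.
empty_seq_lines() {
    gzip -dc "$1" | awk 'NR % 4 == 2 && length($0) == 0 { print NR }'
}
```

Called as empty_seq_lines XXXXX.fastq.gz, it prints nothing when every record has a non-empty sequence line.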

@cchapus
Author

cchapus commented Jan 16, 2025

I'm going to ask for permission to share the file. I hope to have an answer by next week.

The pipeline works as intended if I add skip_linting: true. I had issues only with my single-read datasets (two examples). With my paired-read datasets, there were no issues at all.

That's why I thought it could be pairing related.

With the single-read dataset:
I changed into the work directory and ran the command from .command.sh with my own fq conda environment. The output is always empty.

fq lint \
    --disable-validator P001 \
    006_J0.fastq.gz
2025-01-16T15:22:05.971320Z  INFO fq::commands::lint: fq-lint start
2025-01-16T15:22:06.049930Z  INFO fq::commands::lint: validating single end read
2025-01-16T15:22:06.049954Z  INFO fq::validators: disabled validators: ["P001"]
2025-01-16T15:22:06.049975Z  INFO fq::validators: enabled single read validators: ["[S003] NameValidator", "[S004] CompleteValidator", "[S002] AlphabetValidator", "[S001] PlusLineValidator", "[S005] ConsistentSeqQualValidator", "[S006] QualityStringValidator"]
2025-01-16T15:22:06.049994Z  INFO fq::validators: enabled paired read validators: []
2025-01-16T15:22:06.050004Z  INFO fq::commands::lint: starting validation
006_J0.fastq.gz:2037434:1: [S004] CompleteValidator: empty sequence

I've manually checked the file (at least the first 10 records), and the FASTQ file seems fine: the quality and sequence strings have the same length, and the read names are OK. For example:

@NB501163:190:HK3FNBGXV:1:11101:20919:1040 1:N:0:TGTAACCACT+NGGAGCGATT
TCATANTCTCGTTTGTTTTCCTGATAAAGCTGTGCTGCCTGGCTATTGGCTGGACTGTTAGG
+
AAAAA#EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEEEEE
@NB501163:190:HK3FNBGXV:1:11101:3784:1041 1:N:0:TGTAACCACT+NGGAGCGATT
CACGANTATTACATATAGTACAGTTCCCCAAAATGATGCACACTAGCCTTCCATATCTCCCT
+
AAAAA#EEEEAEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEEAAE
@NB501163:190:HK3FNBGXV:1:11101:3059:1041 1:N:0:TGTAACCACT+NGGAGCGATT
AAAGTNGGACGCTCATCTGCTTTCTAAAACCAAAGAAAAAGTAAAGTGTTAGAGTGGCT
+
AAAA6#EEEEEEEEEEEEEEAAEEEEEEEEEEEEEEEEEEE<EEEE<EAEEEEEE/EEE
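
Since fq reports the problem at line 2037434, well beyond the first 10 records, it may help to print that region directly. With four lines per record, 2037434 = 4 × 509358 + 2, so the reported line would be the sequence line of record 509359. A minimal sketch follows; show_lines is a hypothetical helper, not an fq command.

```shell
# Print lines START..END of a gzipped file, each prefixed with its line number.
# Usage: show_lines FILE START END
show_lines() {
    gzip -dc "$1" | awk -v s="$2" -v e="$3" \
        'NR > e + 0 { exit } NR >= s + 0 { print NR ": " $0 }'
}
```

For the file above, show_lines 006_J0.fastq.gz 2037429 2037440 would show the reported line together with the records on either side of it.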

@pinin4fjords
Member

OK, I don't think it's pairing related. I suspect there may be a genuine issue with the FASTQ file. You can try taking progressively larger chunks of the file (e.g. zcat foo.fastq.gz | head -n 4000 | gzip > sub.fastq.gz to get the first 1000 records), and running the command on each chunk.
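
The chunking approach could be scripted roughly as follows. This is a sketch under assumptions: the function name first_failing_prefix and the LINT_CMD override are hypothetical, the default lint command is copied from the report above, and each subset is re-compressed so that its contents match the .gz file name.

```shell
# Lint progressively larger prefixes of a gzipped FASTQ and print the first
# record count at which the lint command fails.
# Usage: first_failing_prefix FILE N1 N2 ...   (record counts, ascending)
# LINT_CMD (hypothetical override) defaults to the command from the report.
first_failing_prefix() {
    local file=$1 n sub tmpdir
    shift
    tmpdir=$(mktemp -d) || return 2
    for n in "$@"; do
        sub="$tmpdir/sub_${n}.fastq.gz"
        # Four lines per record; re-compress the subset.
        gzip -dc "$file" | head -n $((n * 4)) | gzip > "$sub"
        if ! ${LINT_CMD:-fq lint --disable-validator P001} "$sub" >/dev/null 2>&1; then
            echo "$n"
            rm -rf "$tmpdir"
            return 0
        fi
    done
    rm -rf "$tmpdir"
    return 1   # no prefix failed
}
```

For example, first_failing_prefix foo.fastq.gz 1000 10000 100000 1000000 would print the smallest listed record count whose subset fails, which narrows down where the offending record sits.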
