Error for single reads during FASTQ linting preprocessing #1483

cchapus · 2025-01-08T15:32:44Z

Description of the bug

Hi,

I've tried the new 3.18.0 release with my single-read RNA-seq data. I was happy to be able to test the option of using pre-prepared sortmerna files.

But it failed each time at the step process > NFCORE_RNASEQ:RNASEQ:FASTQ_QC_TRIM_FILTER_SETSTRANDEDNESS:FQ_LINT.

I looked at the .command.sh corresponding file:

fq lint \
    --disable-validator P001 \
    XXXXX.fastq.gz > XXXXX.fq_lint.txt

But fq lint is for validating a FASTQ file pair, and I have single-read files.

I worked around it by adding skip_linting: true to the params file.

Command used and terminal output

command:
nextflow run \
/path/Nextflow/nf-core-rnaseq/3_18_0/ \
-profile singularity \
-params-file params_rnaseq_final.yaml

output:
-[nf-core/rnaseq] Pipeline completed with errors-
WARN: Killing running tasks (6)
ERROR ~ Error executing process > 'NFCORE_RNASEQ:RNASEQ:FASTQ_QC_TRIM_FILTER_SETSTRANDEDNESS:FQ_LINT (XXXXX)'

Caused by:
  Process `NFCORE_RNASEQ:RNASEQ:FASTQ_QC_TRIM_FILTER_SETSTRANDEDNESS:FQ_LINT (006_J0)` terminated with an error exit status (1)


Command executed:

  fq lint \
      --disable-validator P001 \
      XXXXX.fastq.gz > XXXXX.fq_lint.txt
  
  cat <<-END_VERSIONS > versions.yml
  "NFCORE_RNASEQ:RNASEQ:FASTQ_QC_TRIM_FILTER_SETSTRANDEDNESS:FQ_LINT":
      fq: $(echo $(fq lint --version | sed 's/fq-lint //g'))
  END_VERSIONS

Command exit status:
  1

Command output:
  (empty)

Command error:
  XXXXX.fastq.gz:2037434:1: [S004] CompleteValidator: empty sequence

Work dir:
  /path/work/9c/d3a299bb8f82445c63c652011ceaed

Container:
  /path/Nextflow/cache/depot.galaxyproject.org-singularity-fq-0.12.0--h9ee0642_0.img

Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`

 -- Check '.nextflow.log' file for details
ERROR ~ Pipeline failed. Please refer to troubleshooting docs: https://nf-co.re/docs/usage/troubleshooting

 -- Check '.nextflow.log' file for details
ERROR ~ Pipeline failed. Please refer to troubleshooting docs: https://nf-co.re/docs/usage/troubleshooting

 -- Check '.nextflow.log' file for details

Relevant files

No response

System information

N E X T F L O W ~ version 24.10.3
Linux Ubuntu workstation
container: singularity
nf-core/rnaseq: 3.18.0

pinin4fjords · 2025-01-16T14:07:50Z

The error doesn't suggest anything related to pairing to me. Can you explain why you feel that is the case?

Did you check the file for missing records (which is what S004 flags)? Are you able to share the FASTQ in question?
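
One way to run that check without fq is a quick scan for empty sequence lines. The following is a minimal sketch, assuming the file is gzip-compressed and every record is exactly four lines (no wrapped sequences). The function name find_empty_sequences is a hypothetical helper, not part of fq or the pipeline.

```shell
# Report every record whose sequence line (line 2 of each 4-line record) is empty.
# Assumption: gzip-compressed FASTQ with exactly four lines per record.
find_empty_sequences() {
    gzip -cd "$1" | awk 'NR % 4 == 2 && length($0) == 0 {
        # (NR + 2) / 4 converts the sequence line number into a 1-based record number
        printf "record %d (line %d): empty sequence\n", (NR + 2) / 4, NR
    }'
}
```

Calling find_empty_sequences 006_J0.fastq.gz prints one line per affected record and nothing if every sequence line has content.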

cchapus · 2025-01-16T15:29:30Z

I'm going to ask for permission to share the file. I hope to have an answer by next week.

The pipeline works as intended if I add skip_linting: true. I had issues only with my single-read datasets (two examples). With my paired-end datasets, there were no issues at all.

That's why I thought it could be pairing related.

With the single-read dataset:
I changed into the correct folder and ran the command from .command.sh with my own fq conda environment. The output is always empty.

fq lint \
    --disable-validator P001 \
    006_J0.fastq.gz
2025-01-16T15:22:05.971320Z  INFO fq::commands::lint: fq-lint start
2025-01-16T15:22:06.049930Z  INFO fq::commands::lint: validating single end read
2025-01-16T15:22:06.049954Z  INFO fq::validators: disabled validators: ["P001"]
2025-01-16T15:22:06.049975Z  INFO fq::validators: enabled single read validators: ["[S003] NameValidator", "[S004] CompleteValidator", "[S002] AlphabetValidator", "[S001] PlusLineValidator", "[S005] ConsistentSeqQualValidator", "[S006] QualityStringValidator"]
2025-01-16T15:22:06.049994Z  INFO fq::validators: enabled paired read validators: []
2025-01-16T15:22:06.050004Z  INFO fq::commands::lint: starting validation
006_J0.fastq.gz:2037434:1: [S004] CompleteValidator: empty sequence

I've manually checked the file (at least the first 10 records). The FASTQ file seems fine: the QUAL and SEQ lines have the same length, and the names are OK. For example:

@NB501163:190:HK3FNBGXV:1:11101:20919:1040 1:N:0:TGTAACCACT+NGGAGCGATT
TCATANTCTCGTTTGTTTTCCTGATAAAGCTGTGCTGCCTGGCTATTGGCTGGACTGTTAGG
+
AAAAA#EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEEEEE
@NB501163:190:HK3FNBGXV:1:11101:3784:1041 1:N:0:TGTAACCACT+NGGAGCGATT
CACGANTATTACATATAGTACAGTTCCCCAAAATGATGCACACTAGCCTTCCATATCTCCCT
+
AAAAA#EEEEAEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEEAAE
@NB501163:190:HK3FNBGXV:1:11101:3059:1041 1:N:0:TGTAACCACT+NGGAGCGATT
AAAGTNGGACGCTCATCTGCTTTCTAAAACCAAAGAAAAAGTAAAGTGTTAGAGTGGCT
+
AAAA6#EEEEEEEEEEEEEEAAEEEEEEEEEEEEEEEEEEE<EEEE<EAEEEEEE/EEE
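
The first records may not be representative, though, since the message points much further into the file. The sketch below prints the region around a given line. It assumes the number after the file name in the fq message (2037434) is a 1-based line number in the decompressed file, which the thread does not confirm. The function name show_lines_around is a hypothetical helper.

```shell
# Print the lines around LINE of a gzip-compressed file, prefixed with line numbers.
# Usage: show_lines_around FILE LINE [CONTEXT]   (CONTEXT defaults to 4 lines each side)
show_lines_around() (
    file=$1
    line=$2
    context=${3:-4}
    start=$((line - context))
    if [ "$start" -lt 1 ]; then
        start=1
    fi
    end=$((line + context))
    # Reads the whole stream and prints only the requested window
    gzip -cd "$file" | awk -v s="$start" -v e="$end" \
        'NR >= s && NR <= e { printf "%d\t%s\n", NR, $0 }'
)
```

For this file the call would be show_lines_around 006_J0.fastq.gz 2037434. If every record is four lines, 2037434 = 4 × 509358 + 2, so that line would be the sequence line of record 509359.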

pinin4fjords · 2025-01-16T16:33:11Z

OK, I don't think it's pairing related. I suspect there may be a genuine issue with the FASTQ file. You can try taking progressively larger chunks of the file (e.g. zcat foo.fastq.gz | head -n 4000 | gzip > sub.fastq.gz to get the first 1000 records), and running the command on each chunk.
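
The chunking step can be wrapped in a small helper. This is a minimal sketch, assuming four lines per record. first_records is a hypothetical name, and the output is re-compressed so that its .gz extension matches its contents.

```shell
# Write the first N records of a gzip-compressed FASTQ to a new gzip file.
# Usage: first_records IN.fastq.gz N OUT.fastq.gz
# Assumption: exactly four lines per record.
first_records() {
    gzip -cd "$1" | awk -v n="$(( $2 * 4 ))" 'NR <= n' | gzip -c > "$3"
}
```

For example, first_records 006_J0.fastq.gz 1000 sub.fastq.gz followed by fq lint --disable-validator P001 sub.fastq.gz. Increasing N until the lint fails narrows down where the problem record sits.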

cchapus added the bug Something isn't working label Jan 8, 2025
