Unable to access jarfile PE #290

Open
sunta3iouxos opened this issue Oct 7, 2024 · 7 comments

@sunta3iouxos

sunta3iouxos commented Oct 7, 2024

Hi there,
Could you please help me with this? The pipeline seems to get stuck at the trimming step, while calling java.
I activated the /scratch/tgeorgom/bulker/bulker_crates/databio/pepatac/1.0.12 image.

This is the log file:

### Pipeline run code and environment:

*          Command: `pipelines/pepatac.py --single-or-paired paired --prealignment-index rCRSd=/scratch/tgeorgom/refgenie/alias/rCRSd/bowtie2_index/default/. --genome hg38 --genome-index /scratch/tgeorgom/refgenie/alias/hg38/bowtie2_index/default/. --chrom-sizes /scratch/tgeorgom/refgenie/alias/hg38/fasta/default/hg38.chrom.sizes --sample-name test1 --input examples/data/test1_r1.fastq.gz --input2 examples/data/test1_r2.fastq.gz --genome-size hs --trimmer trimmomatic -O /scratch/tgeorgom/pepatac_test`
*     Compute host: `cheops1`
*      Working dir: `/home/tgeorgom/pepatac`
*        Outfolder: `/scratch/tgeorgom/pepatac_test/test1/`
*         Log file: `/scratch/tgeorgom/pepatac_test/test1/PEPATAC_log.md`
*       Start time:  (10-07 11:29:19) elapsed: 1.0 _TIME_

### Version log:

*   Python version: `3.10.14`
*      Pypiper dir: `/home/tgeorgom/miniforge3/lib/python3.10/site-packages/pypiper`
*  Pypiper version: `0.14.2`
*     Pipeline dir: `/home/tgeorgom/pepatac/pipelines`
* Pipeline version: `0.11.3`
*    Pipeline hash: `82f0685e4d98d71d6d2fc5acfc9b995877c91648`
*  Pipeline branch: `* master`
*    Pipeline date: `2024-06-05 14:59:51 -0400`
*    Pipeline diff: `1 file changed, 21 insertions(+), 21 deletions(-)`

### Arguments passed to pipeline:

*           `TSS_name`:  `None`
*            `aligner`:  `bowtie2`
*          `anno_name`:  `None`
*          `blacklist`:  `None`
*        `chrom_sizes`:  `/scratch/tgeorgom/refgenie/alias/hg38/fasta/default/hg38.chrom.sizes`
*        `config_file`:  `pepatac.yaml`
*              `cores`:  `1`
*       `deduplicator`:  `samblaster`
*              `dirty`:  `False`
*             `extend`:  `250`
*              `fasta`:  `None`
*       `force_follow`:  `False`
*     `frip_ref_peaks`:  `None`
*    `genome_assembly`:  `hg38`
*       `genome_index`:  `/scratch/tgeorgom/refgenie/alias/hg38/bowtie2_index/default/.`
*        `genome_size`:  `hs`
*              `input`:  `['examples/data/test1_r1.fastq.gz']`
*             `input2`:  `['examples/data/test1_r2.fastq.gz']`
*               `keep`:  `False`
*               `lite`:  `False`
*             `logdev`:  `False`
*                `mem`:  `4000`
*              `motif`:  `False`
*          `new_start`:  `False`
*            `no_fifo`:  `False`
*           `no_scale`:  `False`
*      `output_parent`:  `/scratch/tgeorgom/pepatac_test`
*         `paired_end`:  `True`
*        `peak_caller`:  `macs3`
*          `peak_type`:  `fixed`
*      `pipeline_name`:  `None`
* `prealignment_index`:  `['rCRSd=/scratch/tgeorgom/refgenie/alias/rCRSd/bowtie2_index/default/.']`
* `prealignment_names`:  `[]`
*         `prioritize`:  `False`
*            `recover`:  `False`
*        `sample_name`:  `test1`
*        `search_file`:  `None`
*             `silent`:  `False`
*   `single_or_paired`:  `paired`
*             `skipqc`:  `False`
*                `sob`:  `False`
*           `testmode`:  `False`
*            `trimmer`:  `trimmomatic`
*          `verbosity`:  `None`

### Initialized Pipestat Object:

* PipestatManager (default_pipeline_name)
* Backend: File
*  - results: /scratch/tgeorgom/pepatac_test/test1/stats.yaml
*  - status: /scratch/tgeorgom/pepatac_test/test1
* Multiple Pipelines Allowed: False
* Pipeline name: default_pipeline_name
* Pipeline type: sample
* Status Schema key: None
* Results formatter: default_formatter
* Results schema source: None
* Status schema source: None
* Records count: 2
* Sample name: DEFAULT_SAMPLE_NAME


----------------------------------------

Local input file: examples/data/test1_r1.fastq.gz
Local input file: examples/data/test1_r2.fastq.gz

> `Read_type`   paired  _RES_

> `Genome`      hg38    _RES_
### Merge/link and fastq conversion:  (10-07 11:29:19) elapsed: 0.0 _TIME_

Number of input file sets: 2
Target to produce: `/scratch/tgeorgom/pepatac_test/test1/raw/test1_R1.fastq.gz`

> `ln -sf /home/tgeorgom/pepatac/examples/data/test1_r1.fastq.gz /scratch/tgeorgom/pepatac_test/test1/raw/test1_R1.fastq.gz` (12730)
<pre>
</pre>
Command completed. Elapsed time: 0:00:00. Running peak memory: 0.003GB.
  PID: 12730;   Command: ln;    Return code: 0; Memory used: 0.003GB

Local input file: '/scratch/tgeorgom/pepatac_test/test1/raw/test1_R1.fastq.gz'
Target to produce: `/scratch/tgeorgom/pepatac_test/test1/raw/test1_R2.fastq.gz`

> `ln -sf /home/tgeorgom/pepatac/examples/data/test1_r2.fastq.gz /scratch/tgeorgom/pepatac_test/test1/raw/test1_R2.fastq.gz` (12797)
<pre>
</pre>
Command completed. Elapsed time: 0:00:00. Running peak memory: 0.009GB.
  PID: 12797;   Command: ln;    Return code: 0; Memory used: 0.009GB

Local input file: '/scratch/tgeorgom/pepatac_test/test1/raw/test1_R2.fastq.gz'
Found .fastq.gz file
Found .fq.gz file; no conversion necessary
Found .fastq.gz file
Found .fq.gz file; no conversion necessary
Target to produce: `/scratch/tgeorgom/pepatac_test/test1/fastq/test1_R1.fastq.gz`,`/scratch/tgeorgom/pepatac_test/test1/fastq/test1_R2.fastq.gz`

> `ln -sf /scratch/tgeorgom/pepatac_test/test1/raw/test1_R1.fastq.gz /scratch/tgeorgom/pepatac_test/test1/fastq/test1_R1.fastq.gz` (12853)
<pre>
</pre>
Command completed. Elapsed time: 0:00:00. Running peak memory: 0.009GB.
  PID: 12853;   Command: ln;    Return code: 0; Memory used: 0.009GB


> `ln -sf /scratch/tgeorgom/pepatac_test/test1/raw/test1_R2.fastq.gz /scratch/tgeorgom/pepatac_test/test1/fastq/test1_R2.fastq.gz` (12902)
<pre>
</pre>
Command completed. Elapsed time: 0:00:00. Running peak memory: 0.029GB.
  PID: 12902;   Command: ln;    Return code: 0; Memory used: 0.029GB


### Adapter trimming:  (10-07 11:29:21) elapsed: 2.0 _TIME_

trimmomatic local_input_files: ['/scratch/tgeorgom/pepatac_test/test1/raw/test1_R1.fastq.gz', '/scratch/tgeorgom/pepatac_test/test1/raw/test1_R2.fastq.gz']
Target to produce: `/scratch/tgeorgom/pepatac_test/test1/fastq/test1_R1_trim.fastq`

> `java -Xmx4000M -jar ${TRIMMOMATIC} PE -threads 1 /scratch/tgeorgom/pepatac_test/test1/fastq/test1_R1.fastq.gz /scratch/tgeorgom/pepatac_test/test1/fastq/test1_R2.fastq.gz /scratch/tgeorgom/pepatac_test/test1/fastq/test1_R1_trim.fastq /scratch/tgeorgom/pepatac_test/test1/fastq/test1_R1_unpaired.fq /scratch/tgeorgom/pepatac_test/test1/fastq/test1_R2_trim.fastq /scratch/tgeorgom/pepatac_test/test1/fastq/test1_R2_unpaired.fq ILLUMINACLIP:/home/tgeorgom/pepatac/tools/NexteraPE-PE.fa:2:30:10` (13461)
<pre>
Error: Unable to access jarfile PE
</pre>
Command completed. Elapsed time: 0:00:01. Running peak memory: 0.031GB.
  PID: 13461;   Command: java;  Return code: 1; Memory used: 0.031GB

Starting cleanup: 0 files; 1 conditional files for cleanup

Conditional flag found: []

These conditional files were left in place:

- /scratch/tgeorgom/pepatac_test/test1/fastq/test1*.fastq

### Pipeline failed at:  (10-07 11:29:22) elapsed: 1.0 _TIME_

Total time: 0:00:04
Failure reason: Subprocess returned nonzero result. Check above output for details
Traceback (most recent call last):
  File "/home/tgeorgom/pepatac/pipelines/pepatac.py", line 2779, in <module>
    sys.exit(main())
  File "/home/tgeorgom/pepatac/pipelines/pepatac.py", line 914, in main
    pm.run(trim_cmd, trimmed_fastq, follow=check_trim)
  File "/home/tgeorgom/miniforge3/lib/python3.10/site-packages/pypiper/manager.py", line 1049, in run
    process_return_code, local_maxmem = self.callprint(
  File "/home/tgeorgom/miniforge3/lib/python3.10/site-packages/pypiper/manager.py", line 1316, in callprint
    self._triage_error(SubprocessError(msg), nofail)
  File "/home/tgeorgom/miniforge3/lib/python3.10/site-packages/pypiper/manager.py", line 2539, in _triage_error
    self.fail_pipeline(e)
  File "/home/tgeorgom/miniforge3/lib/python3.10/site-packages/pypiper/manager.py", line 2009, in fail_pipeline
    raise exc
pypiper.exceptions.SubprocessError: Subprocess returned nonzero result. Check above output for details

For more context, this is my /bulker_config.yaml file:

bulker:
  volumes:
   - $HOME
   - /scratch/tgeorgom/
  envvars:
   - DISPLAY
  registry_url: http://hub.bulker.io/
  shell_path: ${SHELL}
  shell_rc: $HOME/.bashrc
  rcfile: templates/start.sh
  rcfile_strict: templates/start_strict.sh
  default_crate_folder: /scratch/tgeorgom/bulker/bulker_crates
  singularity_image_folder: /scratch/tgeorgom/bulker/simages
  container_engine: singularity
  default_namespace: bulker
  executable_template: templates/singularity_executable.jinja2
  shell_template: templates/singularity_shell.jinja2
  build_template: templates/singularity_build.jinja2
  crates:
    bulker:
      demo:
        default: /scratch/tgeorgom/bulker/bulker_crates/bulker/demo/default
      pi:
        default: /scratch/tgeorgom/bulker/bulker_crates/bulker/pi/default
      alpine:
        default: /scratch/tgeorgom/bulker/bulker_crates/bulker/alpine/default
      coreutils:
        default: /scratch/tgeorgom/bulker/bulker_crates/bulker/coreutils/default
    databio:
      pepatac:
        1.0.7: /scratch/tgeorgom/bulker/bulker_crates/databio/pepatac/1.0.7
        1.0.10: /scratch/tgeorgom/bulker/bulker_crates/databio/pepatac/1.0.10
        1.0.12: /scratch/tgeorgom/bulker/bulker_crates/databio/pepatac/1.0.12

@donaldcampbelljr
Member

Hi,

The command `java -Xmx4000M -jar ${TRIMMOMATIC} PE` may offer some clues.

Does setting the environment variable TRIMMOMATIC to the path of the Trimmomatic jar file solve the issue?

Could you also make sure the jar file is executable?
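
Why the error names `PE` rather than a missing jar: if `TRIMMOMATIC` is unset or empty, the shell expands `${TRIMMOMATIC}` to nothing, so `PE` ends up as the argument to `-jar`. The sketch below only emulates that expansion in Python. The helper `expand_like_shell` is hypothetical (it is not part of PEPATAC or pypiper), it handles only the unquoted `${NAME}` form, and the jar path is made up for illustration.

```python
import re
import shlex


def expand_like_shell(command, env):
    """Emulate unquoted ${NAME} expansion: unset names become empty strings.

    Simplified illustration only; a real shell supports many more forms.
    """
    expanded = re.sub(r"\$\{(\w+)\}", lambda m: env.get(m.group(1), ""), command)
    # shlex.split drops the resulting empty gap, as shell word splitting would.
    return shlex.split(expanded)


template = "java -Xmx4000M -jar ${TRIMMOMATIC} PE -threads 1"

# Variable unset: the token after -jar is "PE", matching the reported error.
unset_argv = expand_like_shell(template, {})
print(unset_argv[unset_argv.index("-jar") + 1])

# Variable set (hypothetical path): the jar follows -jar and PE is the mode.
set_argv = expand_like_shell(template, {"TRIMMOMATIC": "/opt/trimmomatic/trimmomatic.jar"})
print(set_argv[set_argv.index("-jar") + 1])
```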

@sunta3iouxos
Author

Thank you for the quick response.
After activating the pepatac environment, running `echo ${TRIMMOMATIC}` gives an empty response. Maybe it is not properly set in bulker?
Anyway, this is what I did. I activated the environment and set:

TRIMMOMATIC=/scratch/tgeorgom/bulker/bulker_crates/databio/pepatac/1.0.12/trimmomatic

and got the same error:

> `java -Xmx4000M -jar ${TRIMMOMATIC} PE -threads 1 /scratch/tgeorgom/pepatac_test/test1/fastq/test1_R1.fastq.gz /scratch/tgeorgom/pepatac_test/test1/fastq/test1_R2.fastq.gz /scratch/tgeorgom/pepatac_test/test1/fastq/test1_R1_trim.fastq /scratch/tgeorgom/pepatac_test/test1/fastq/test1_R1_unpaired.fq /scratch/tgeorgom/pepatac_test/test1/fastq/test1_R2_trim.fastq /scratch/tgeorgom/pepatac_test/test1/fastq/test1_R2_unpaired.fq ILLUMINACLIP:/home/tgeorgom/pepatac/tools/NexteraPE-PE.fa:2:30:10` (25453)
<pre>
Error: Unable to access jarfile PE
</pre>
Command completed. Elapsed time: 0:00:01. Running peak memory: 0.031GB.
  PID: 25453;   Command: java;  Return code: 1; Memory used: 0.031GB

@donaldcampbelljr
Member

OK, I see. As a short-term solution, you'll need to download the Trimmomatic jar file and set your environment variable to the location of that jar file, not to the directory within the bulker crate as you have above. Apologies that this is confusing.
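
A quick way to confirm that the variable points at a usable jar before rerunning the pipeline is sketched below. The function `jar_problem` is a hypothetical helper, not something PEPATAC provides, and the `.jar` suffix check is only a heuristic assumed here. It tests readability because `java -jar` has to read the file.

```python
import os


def jar_problem(path):
    """Return a short description of what is wrong with `path`, or None if it looks usable."""
    if not path:
        return "TRIMMOMATIC is unset or empty"
    if os.path.isdir(path):
        return "path is a directory, not a jar file"
    if not os.path.isfile(path):
        return "path does not exist"
    if not os.access(path, os.R_OK):
        return "file is not readable"
    if not path.endswith(".jar"):
        return "file does not end in .jar"
    return None


# Check whatever the current environment provides.
print(jar_problem(os.environ.get("TRIMMOMATIC", "")))
```

A return value of `None` means none of these checks failed; it does not prove the jar itself is valid.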

If trimmomatic continues to give you issues, you could also attempt to use skewer. I believe it is faster than trimmomatic.

@donaldcampbelljr
Member

I just realized that it's not populating the variable in your second attempt, so the solution above may not work.
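
One possible reason, which this thread does not confirm, is that the variable was assigned without `export`, so it stayed local to the shell and was never placed in the environment that child processes inherit. The bulker config above also lists only `DISPLAY` under `envvars`, which may matter if that list governs what containers receive. The sketch below uses a Python child process to stand in for the pipeline and shows that a child sees only what is in its environment; the jar path is hypothetical.

```python
import os
import subprocess
import sys

# Child program: report the TRIMMOMATIC value it inherited (empty if none).
CHILD = "import os; print(os.environ.get('TRIMMOMATIC', ''))"


def child_sees(env):
    """Run a child process with exactly `env` and return the value it reports."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


base = {k: v for k, v in os.environ.items() if k != "TRIMMOMATIC"}

# Not in the environment (like a plain `TRIMMOMATIC=...` without `export`).
print(repr(child_sees(base)))

# In the environment (like `export TRIMMOMATIC=...`); the path is hypothetical.
print(repr(child_sees({**base, "TRIMMOMATIC": "/opt/trimmomatic/trimmomatic.jar"})))
```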

@sunta3iouxos
Author

Skewer works.
Could you please state the options you are using for skewer?
I had some issues with skewer in the past that had to do with quality trimming and a couple of other things. Overall, it was not that reliable, although very fast. I use fastp without issues, with these options: `--trim_poly_g --trim_poly_x -Q -L --correction`.

@donaldcampbelljr
Member

Yes, and I believe it should default to skewer without adjusting the pipeline if no other trimmer is set.

Checking the pipeline interface:

{% if sample.trimmer is defined %} --trimmer { sample.trimmer } {% else %} --trimmer "skewer" {% endif %}
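
Read as plain logic, the template above passes the sample's own trimmer when one is defined and otherwise falls back to skewer. A minimal Python sketch of that branch follows; the function `trimmer_flag` and the dict-based sample are hypothetical stand-ins, not the pipeline interface's actual rendering code.

```python
def trimmer_flag(sample):
    """Mirror the template: use sample['trimmer'] if defined, else default to skewer."""
    if "trimmer" in sample:
        return ["--trimmer", str(sample["trimmer"])]
    return ["--trimmer", "skewer"]


# No trimmer attribute on the sample: the default applies.
print(trimmer_flag({"sample_name": "test1"}))

# Trimmer attribute present: it is passed through unchanged.
print(trimmer_flag({"sample_name": "test1", "trimmer": "trimmomatic"}))
```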

In my recent tutorial run using the native install (which defaults to skewer; see https://pepatac.databio.org/en/latest/detailed-install/), I see this skewer command in my PEPATAC_log.md:

### Adapter trimming:  (10-07 11:55:35) elapsed: 0.0 _TIME_

Target to produce: `/home/drc/PEPATAC_OCT_2024/processed/results_pipeline/tutorial1/fastq/tutorial1_R1_trim.fastq`  

> `skewer -f sanger -t 4 -m pe -x /home/drc/PEPATAC_OCT_2024//tools/pepatac/tools/NexteraPE-PE.fa --quiet -o /home/drc/PEPATAC_OCT_2024/processed/results_pipeline/tutorial1/fastq/tutorial1 /home/drc/PEPATAC_OCT_2024/processed/results_pipeline/tutorial1/fastq/tutorial1_R1.fastq.gz /home/drc/PEPATAC_OCT_2024/processed/results_pipeline/tutorial1/fastq/tutorial1_R2.fastq.gz` (595646)
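
For readers who want the options spelled out, the sketch below rebuilds the same skewer call as an argument list with one comment per flag. The function `skewer_pe_command` and the file names are hypothetical, and the flag descriptions reflect my understanding of skewer's options, so check them against skewer's own help output before relying on them.

```python
import shlex


def skewer_pe_command(adapters, out_prefix, read1, read2, threads=4):
    """Assemble the paired-end skewer call shown in the log above as an argv list."""
    return [
        "skewer",
        "-f", "sanger",        # quality-score format of the FASTQ input
        "-t", str(threads),    # number of threads
        "-m", "pe",            # paired-end trimming mode
        "-x", adapters,        # adapter sequences (here a FASTA file)
        "--quiet",             # suppress progress output
        "-o", out_prefix,      # prefix for output files
        read1,
        read2,
    ]


# Hypothetical file names, for illustration only.
cmd = skewer_pe_command("NexteraPE-PE.fa", "fastq/sample1",
                        "fastq/sample1_R1.fastq.gz", "fastq/sample1_R2.fastq.gz")
print(shlex.join(cmd))
```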

Is this helpful?

@nsheff
Member

nsheff commented Oct 9, 2024

Isn't skewer the default trimmer?
