Error in 'interactions count' #281

Open
martino-ugolini opened this issue Mar 13, 2024 · 1 comment


@martino-ugolini

Describe the bug
Hi,
I have installed your software to analyze a Capture-C experiment. The pipeline runs without problems until almost the end, when the error below occurs.
My BED file is as follows:
4 28693255 29709534 GRCz11Cluster

Note: I work with the zebrafish genome, which in Ensembl has chromosomes that are only numbered, without a 'chr' prefix.

Have you encountered this problem before, or do you know what could be causing it?

Thank you for your help,
Martino

[Wed Mar 13 15:00:58 2024]
rule regenerate_fastq:
input: capcruncher_output/interim/fastq/Oblong_A_1.fastq.gz, capcruncher_output/interim/fastq/Oblong_A_2.fastq.gz, capcruncher_output/results/Oblong_A/Oblong_A.parquet
output: capcruncher_output/results/Oblong_A/Oblong_A_1.fastq.gz, capcruncher_output/results/Oblong_A/Oblong_A_2.fastq.gz
jobid: 145
reason: Missing output files: capcruncher_output/results/Oblong_A/Oblong_A_1.fastq.gz, capcruncher_output/results/Oblong_A/Oblong_A_2.fastq.gz; Input files updated by another job: capcruncher_output/interim/fastq/Oblong_A_1.fastq.gz, capcruncher_output/interim/fastq/Oblong_A_2.fastq.gz, capcruncher_output/results/Oblong_A/Oblong_A.parquet
wildcards: sample=Oblong_A
resources: tmpdir=/tmp/40194850

2024-03-13 15:01:00.524 | INFO | capcruncher.cli.cli_utilities:regenerate_fastq:376 - Extracting reads info from capcruncher_output/results/Oblong_A/Oblong_A.parquet/*.parquet
[Wed Mar 13 15:01:01 2024]
Finished job 76.
1266 of 1341 steps (94%) done
Select jobs to execute...
2024-03-13 15:01:04.357 | INFO | capcruncher_tools.api:count_interactions:178 - Extracting viewpoint names and sizes
2024-03-13 15:01:04.639 | INFO | capcruncher_tools.api:count_interactions:191 - Number of viewpoints: 1
2024-03-13 15:01:04.639 | INFO | capcruncher_tools.api:count_interactions:192 - Number of slices per viewpoint:
+---------------+-------------+
|               |    n_slices |
|---------------+-------------|
| GRCz11Cluster | 1.36555e+06 |
+---------------+-------------+
2024-03-13 15:01:09,606 INFO worker.py:1724 -- Started a local Ray instance.
/users/mugolini/capcruncher_env/lib/python3.10/site-packages/pyranges/readers.py:87: DtypeWarning: Columns (0) have mixed types. Specify dtype option on import or set low_memory=False.
df = pd.read_csv(

0%| | 0/1 [00:00<?, ?it/s]
2024-03-13 15:01:11.953 | INFO | capcruncher.cli.cli_utilities:regenerate_fastq:386 - Writing reads to capcruncher_output/results/Oblong_A/Oblong_A
(make_cooler pid=3692677) /users/mugolini/capcruncher_env/lib/python3.10/site-packages/pyranges/methods/__init__.py:45: FutureWarning: The default of observed=False is deprecated and will be changed to True in a future version of pandas. Pass observed=False to retain current behavior or observed=True to adopt the future default and silence this warning.
(make_cooler pid=3692677) return {k: v for k, v in df.groupby(grpby_key)}
(make_cooler pid=3692677) /users/mugolini/capcruncher_env/lib/python3.10/site-packages/pyranges/methods/__init__.py:45: FutureWarning: The default of observed=False is deprecated and will be changed to True in a future version of pandas. Pass observed=False to retain current behavior or observed=True to adopt the future default and silence this warning.
(make_cooler pid=3692677) return {k: v for k, v in df.groupby(grpby_key)}
(make_cooler pid=3692677) /users/mugolini/capcruncher_env/lib/python3.10/site-packages/pyranges/methods/__init__.py:45: FutureWarning: The default of observed=False is deprecated and will be changed to True in a future version of pandas. Pass observed=False to retain current behavior or observed=True to adopt the future default and silence this warning.
(make_cooler pid=3692677) return {k: v for k, v in df.groupby(grpby_key)}

0%| | 0/1 [00:06<?, ?it/s]
Traceback (most recent call last):
File "/users/mugolini/capcruncher_env/bin/capcruncher", line 8, in <module>
sys.exit(cli())
File "/users/mugolini/capcruncher_env/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
return self.main(*args, **kwargs)
File "/users/mugolini/capcruncher_env/lib/python3.10/site-packages/click/core.py", line 1078, in main
rv = self.invoke(ctx)
File "/users/mugolini/capcruncher_env/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/users/mugolini/capcruncher_env/lib/python3.10/site-packages/capcruncher/cli/__init__.py", line 38, in invoke
return self._impl.invoke(ctx)
File "/users/mugolini/capcruncher_env/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/users/mugolini/capcruncher_env/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/users/mugolini/capcruncher_env/lib/python3.10/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
File "/users/mugolini/capcruncher_env/lib/python3.10/site-packages/capcruncher/cli/cli_interactions.py", line 167, in count
count(*args, **kwargs)
File "/users/mugolini/capcruncher_env/lib/python3.10/site-packages/capcruncher/cli/interactions_count.py", line 40, in count
clr = count_interactions(
File "/users/mugolini/capcruncher_env/lib/python3.10/site-packages/capcruncher_tools/api.py", line 263, in count_interactions
clr = ray.get(clr)
File "/users/mugolini/capcruncher_env/lib/python3.10/site-packages/ray/_private/auto_init_hook.py", line 22, in auto_init_wrapper
return fn(*args, **kwargs)
File "/users/mugolini/capcruncher_env/lib/python3.10/site-packages/ray/_private/client_mode_hook.py", line 103, in wrapper
return func(*args, **kwargs)
File "/users/mugolini/capcruncher_env/lib/python3.10/site-packages/ray/_private/worker.py", line 2624, in get
raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(TypeError): ray::make_cooler() (pid=3692677, ip=10.203.101.126)
File "/users/mugolini/capcruncher_env/lib/python3.10/site-packages/capcruncher_tools/count.py", line 115, in make_cooler
return cc.storage.create_cooler_cc(
File "/users/mugolini/capcruncher_env/lib/python3.10/site-packages/capcruncher/api/storage.py", line 187, in create_cooler_cc
cooler.create_cooler(
File "/users/mugolini/capcruncher_env/lib/python3.10/site-packages/cooler/create/_create.py", line 1020, in create_cooler
create(
File "/users/mugolini/capcruncher_env/lib/python3.10/site-packages/cooler/create/_create.py", line 623, in create
write_bins(grp, bins, chroms["name"], h5opts)
File "/users/mugolini/capcruncher_env/lib/python3.10/site-packages/cooler/create/_create.py", line 111, in write_bins
chrom_dset = grp.create_dataset(
File "/users/mugolini/capcruncher_env/lib/python3.10/site-packages/h5py/_hl/group.py", line 183, in create_dataset
dsid = dataset.make_new_dset(group, shape, dtype, data, name, **kwds)
File "/users/mugolini/capcruncher_env/lib/python3.10/site-packages/h5py/_hl/dataset.py", line 86, in make_new_dset
tid = h5t.py_create(dtype, logical=1)
File "h5py/h5t.pyx", line 1658, in h5py.h5t.py_create
File "h5py/h5t.pyx", line 1682, in h5py.h5t.py_create
File "h5py/h5t.pyx", line 1698, in h5py.h5t.py_create
File "h5py/h5t.pyx", line 1468, in h5py.h5t._c_enum
TypeError: '<' not supported between instances of 'str' and 'int'
[Wed Mar 13 15:01:19 2024]
Error in rule count:
jobid: 116
input: capcruncher_output/results/Oblong_A/Oblong_A.parquet, capcruncher_output/resources/restriction_fragments/genome.digest.bed.gz, /scratch/mugolini/CAPTURE_C/viewpoint.bed
output: capcruncher_output/interim/pileups/counts_by_restriction_fragment/Oblong_A.hdf5
log: capcruncher_output/logs/counts/Oblong_A.log (check log file(s) for error details)
shell:

    mkdir -p capcruncher_output/interim/pileups/counts_by_restriction_fragment && capcruncher interactions count capcruncher_output/results/Oblong_A/Oblong_A.parquet -o capcruncher_output/interim/pileups/counts_by_restriction_fragment/Oblong_A.hdf5 -f capcruncher_output/resources/restriction_fragments/genome.digest.bed.gz -v /scratch/mugolini/CAPTURE_C/viewpoint.bed -p 8 --assay capture > capcruncher_output/logs/counts/Oblong_A.log 2>&1
    
    (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Logfile capcruncher_output/logs/counts/Oblong_A.log not found.

[W::bgzf_read_block] EOF marker is absent. The input may be truncated
[W::bgzf_read_block] EOF marker is absent. The input may be truncated
2024-03-13 15:06:02.859 | INFO | capcruncher.cli.cli_utilities:regenerate_fastq:396 - Done
[Wed Mar 13 15:06:03 2024]
Finished job 145.
1267 of 1341 steps (94%) done
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2024-03-13T125457.709739.snakemake.log
An error occurred. Please check the log file capcruncher_error.log for more information.

@alsmith151
Collaborator

Hi,

I think this may be related to #234; it seems to be an issue with numbered chromosomes.
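
To illustrate how numbered chromosomes could trigger this, here is a minimal sketch. It assumes (this is not confirmed from the code) that the chromosome names end up as a mix of integers and strings, which the `DtypeWarning` for column 0 in the log hints at. The labels below are made up. Sorting such a mix fails in plain Python with the same kind of `TypeError`:

```python
# Hypothetical chromosome labels: numbered chromosomes inferred as integers,
# a named one (for example the mitochondrial genome) left as a string.
chroms = [4, 25, "MT"]

try:
    sorted(chroms)  # needs to compare a str with an int
    error = None
except TypeError as exc:
    error = str(exc)

print(error)

# Casting every label to str makes the labels comparable again.
as_text = sorted(str(c) for c in chroms)
print(as_text)
```

With string labels only, the comparison is well defined, which is consistent with prefixed names such as `chr4` avoiding the problem.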

As a quick fix, you could add "chr" to the start of your chromosome names. I will look into this in more detail, although it appears to be an issue in a dependency.
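
A rough sketch of that workaround for a BED file is below. `add_chr_prefix` is a hypothetical helper, not part of capcruncher, and it assumes whitespace-separated BED lines with the chromosome in the first field. The reference genome would presumably need the same renaming so that the names still match.

```python
import re


def add_chr_prefix(line: str, prefix: str = "chr") -> str:
    """Return one BED line with `prefix` added to its chromosome field."""
    # Leave blank lines and header/comment lines untouched.
    if not line.strip() or line.startswith(("#", "track", "browser")):
        return line
    # Split into leading whitespace, the first field, and everything after it.
    lead, chrom, rest = re.match(r"(\s*)(\S+)(.*)", line, flags=re.DOTALL).groups()
    if chrom.startswith(prefix):
        return line  # already prefixed, nothing to do
    return f"{lead}{prefix}{chrom}{rest}"


# The viewpoint line from the report above, assumed to be tab-separated.
viewpoint = "4\t28693255\t29709534\tGRCz11Cluster\n"
renamed = add_chr_prefix(viewpoint)
print(renamed, end="")
```

Applying the helper to every line of the viewpoint file and writing the result to a new file keeps the original untouched.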
