createArrowFiles has encountered an error, checking if any ArrowFiles completed #2268

Open
LoRner22 opened this issue Feb 20, 2025 · 1 comment
Labels
bug Something isn't working


LoRner22 commented Feb 20, 2025

Hi, when I ran the code below, an error occurred: `createArrowFiles has encountered an error, checking if any ArrowFiles completed`. I have tried many solutions.

    inputFiles <- getInputFiles(path = "./rawdata")
    ArrowFiles <- createArrowFiles(
      inputFiles = inputFiles,
      sampleNames = names(inputFiles),
      minTSS = 4,
      minFrags = 1000,
      addTileMat = TRUE,
      addGeneScoreMat = TRUE
    )

I'm not sure whether something is wrong with my tsv.gz files, because I converted them from CSV files; the conversion is described below. Alternatively, is there another way to analyze CSV files with ArchR?
Part of the CSV files:
```
> data1[1:4,1:4]
                   X AAACATGCAAGCGATG.1 AAACATGCAGGCGAGT.1 AAACCAACAATTGAAG.1
1    chr1:9821-10707                  0                  0                  0
2   chr1:15796-16689                  0                  0                  0
3 chr1:115263-116163                  0                  0                  0
4 chr1:181102-181823                  0                  0                  0
```

I used the `pivot_longer()` function to convert them to long format, filtered out the zero values, and split the first column into chromosome, start, and end. The final TSV files:
```
head S1_fragments.tsv

chr1 9821 10707 AACGACAAGGTGAGAC-1 2
chr1 9821 10707 AAGCGCTGTGGGAACA-1 2
chr1 9821 10707 AAGCTATGTAATCGTG-1 2
chr1 9821 10707 AATTTCCTCTTAGGAC-1 2
chr1 9821 10707 ACGAAGTCATGGAGGC-1 2
chr1 9821 10707 ACTAACCAGGCCTAAT-1 2
chr1 9821 10707 ACTCACCTCACTAGGT-1 4
chr1 9821 10707 AGCTAACTCCGTAAAC-1 2
chr1 9821 10707 AGGATATAGCAGGCCT-1 2
chr1 9821 10707 AGTGCGGAGGTCCACA-1 2
```
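
For readers trying to reproduce this, the conversion described above (pivot to long format, drop zeros, split the region column) might look roughly like the sketch below. This is not the poster's actual code, which was not included. The toy `wide` data frame, the `long` and `out` names, the `.1` to `-1` barcode rewrite, and the output location are all assumptions made for illustration.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for the wide CSV: one row per region, one column per barcode.
# (Hypothetical values; the real files are much larger.)
wide <- data.frame(
  X = c("chr1:9821-10707", "chr1:15796-16689"),
  AAACATGCAAGCGATG.1 = c(2, 0),
  AAACATGCAGGCGAGT.1 = c(0, 4),
  check.names = FALSE
)

long <- wide %>%
  # One row per (region, barcode) pair.
  pivot_longer(cols = -X, names_to = "barcode", values_to = "count") %>%
  # Drop the zero entries.
  filter(count > 0) %>%
  # Split "chr1:9821-10707" into chromosome, start, and end.
  separate(X, into = c("chr", "start", "end"), sep = "[:-]", convert = TRUE) %>%
  # Assumption: barcodes should end in "-1" rather than ".1".
  mutate(barcode = sub("\\.1$", "-1", barcode)) %>%
  select(chr, start, end, barcode, count) %>%
  arrange(chr, start, end)

# Write a headerless, tab-separated, gzip-compressed file.
out <- file.path(tempdir(), "S1_fragments.tsv.gz")
con <- gzfile(out, "w")
write.table(long, con, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)
close(con)
```

Note that each row produced this way is a peak region with a count, not an individual sequenced fragment, and `gzfile()` writes ordinary gzip output. The sketch does not check whether such a file is acceptable input for `createArrowFiles`.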

Looking forward to your reply. Best wishes.

My log files:

       ___      .______        ______  __    __  .______      
      /   \     |   _  \      /      ||  |  |  | |   _  \     
     /  ^  \    |  |_)  |    |  ,----'|  |__|  | |  |_)  |    
    /  /_\  \   |      /     |  |     |   __   | |      /     
   /  _____  \  |  |\  \\___ |  `----.|  |  |  | |  |\  \\___.
  /__/     \__\ | _| `._____| \______||__|  |__| | _| `._____|

Logging With ArchR!

Start Time : 2025-02-20 20:17:10.621667

------- ArchR Info

ArchRThreads = 40
ArchRGenome = Hg38

------- System Info

Computer OS = unix
Total Cores = 72

------- Session Info

R version 4.4.1 (2024-06-14)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 22.04.5 LTS

Matrix products: default
BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0

Random number generation:
RNG: L'Ecuyer-CMRG
Normal: Inversion
Sample: Rejection

locale:
[1] LC_CTYPE=zh_CN.UTF-8 LC_NUMERIC=C
[3] LC_TIME=zh_CN.UTF-8 LC_COLLATE=zh_CN.UTF-8
[5] LC_MONETARY=zh_CN.UTF-8 LC_MESSAGES=zh_CN.UTF-8
[7] LC_PAPER=zh_CN.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=zh_CN.UTF-8 LC_IDENTIFICATION=C

time zone: Asia/Shanghai
tzcode source: system (glibc)

attached base packages:
[1] parallel stats4 grid stats graphics grDevices
[7] utils datasets methods base

other attached packages:
[1] magick_2.8.3
[2] BSgenome.Hsapiens.UCSC.hg38_1.4.5
[3] BSgenome_1.72.0
[4] rtracklayer_1.64.0
[5] BiocIO_1.14.0
[6] Biostrings_2.72.1
[7] XVector_0.44.0
[8] lubridate_1.9.3
[9] forcats_1.0.0
[10] purrr_1.0.2
[11] readr_2.1.5
[12] tidyr_1.3.1
[13] tibble_3.2.1
[14] tidyverse_2.0.0
[15] rhdf5_2.48.0
[16] SummarizedExperiment_1.34.0
[17] RcppArmadillo_0.12.8.4.0
[18] Rcpp_1.0.12
[19] Matrix_1.7-0
[20] GenomicRanges_1.56.1
[21] GenomeInfoDb_1.40.1
[22] IRanges_2.38.0
[23] S4Vectors_0.42.0
[24] sparseMatrixStats_1.16.0
[25] MatrixGenerics_1.16.0
[26] matrixStats_1.3.0
[27] plyr_1.8.9
[28] magrittr_2.0.3
[29] gtable_0.3.5
[30] gtools_3.9.5
[31] gridExtra_2.3
[32] devtools_2.4.5
[33] usethis_2.2.3
[34] ArchR_1.0.3
[35] dplyr_1.1.4
[36] data.table_1.15.4
[37] readxl_1.4.3
[38] factoextra_1.0.7
[39] ggplot2_3.5.1
[40] FactoMineR_2.11
[41] stringr_1.5.1
[42] GEOquery_2.72.0
[43] Biobase_2.64.0
[44] BiocGenerics_0.50.0

loaded via a namespace (and not attached):
[1] fs_1.6.4 spatstat.sparse_3.1-0
[3] bitops_1.0-7 httr_1.4.7
[5] RColorBrewer_1.1-3 profvis_0.3.8
[7] tools_4.4.1 sctransform_0.4.1
[9] utf8_1.2.4 R6_2.5.1
[11] DT_0.33 lazyeval_0.2.2
[13] uwot_0.2.2 rhdf5filters_1.16.0
[15] urlchecker_1.0.1 withr_3.0.0
[17] sp_2.1-4 progressr_0.14.0
[19] cli_3.6.3 Cairo_1.6-2
[21] spatstat.explore_3.2-7 fastDummies_1.7.3
[23] flashClust_1.01-2 sandwich_3.1-0
[25] Seurat_5.1.0 mvtnorm_1.2-5
[27] spatstat.data_3.1-2 ggridges_0.5.6
[29] pbapply_1.7-2 Rsamtools_2.20.0
[31] harmony_1.2.0 parallelly_1.37.1
[33] sessioninfo_1.2.2 limma_3.60.3
[35] rstudioapi_0.16.0 generics_0.1.3
[37] vroom_1.6.5 ica_1.0-3
[39] spatstat.random_3.2-3 leaps_3.2
[41] fansi_1.0.6 abind_1.4-5
[43] lifecycle_1.0.4 scatterplot3d_0.3-44
[45] multcomp_1.4-25 yaml_2.3.8
[47] SparseArray_1.4.8 Rtsne_0.17
[49] promises_1.3.0 crayon_1.5.3
[51] miniUI_0.1.1.1 lattice_0.22-6
[53] cowplot_1.1.3 pillar_1.9.0
[55] knitr_1.47 rjson_0.2.22
[57] estimability_1.5.1 future.apply_1.11.2
[59] codetools_0.2-20 leiden_0.4.3.1
[61] glue_1.7.0 remotes_2.5.0
[63] vctrs_0.6.5 png_0.1-8
[65] spam_2.10-0 cellranger_1.1.0
[67] cachem_1.1.0 xfun_0.45
[69] S4Arrays_1.4.1 mime_0.12
[71] coda_0.19-4.1 survival_3.6-4
[73] shinythemes_1.2.0 statmod_1.5.0
[75] ellipsis_0.3.2 fitdistrplus_1.1-11
[77] TH.data_1.1-2 ROCR_1.0-11
[79] nlme_3.1-164 bit64_4.0.5
[81] RcppAnnoy_0.0.22 irlba_2.3.5.1
[83] KernSmooth_2.23-24 colorspace_2.1-0
[85] tidyselect_1.2.1 emmeans_1.10.2
[87] bit_4.0.5 compiler_4.4.1
[89] curl_5.2.1 xml2_1.3.6
[91] rhandsontable_0.3.8 DelayedArray_0.30.1
[93] plotly_4.10.4 scales_1.3.0
[95] lmtest_0.9-40 multcompView_0.1-10
[97] digest_0.6.36 goftest_1.2-3
[99] spatstat.utils_3.0-5 presto_1.0.0
[101] rmarkdown_2.27 htmltools_0.5.8.1
[103] pkgconfig_2.0.3 fastmap_1.2.0
[105] rlang_1.1.4 htmlwidgets_1.6.4
[107] UCSC.utils_1.0.0 shiny_1.8.1.1
[109] zoo_1.8-12 jsonlite_1.8.8
[111] BiocParallel_1.38.0 RCurl_1.98-1.14
[113] GenomeInfoDbData_1.2.12 dotCall64_1.1-1
[115] patchwork_1.2.0 Rhdf5lib_1.26.0
[117] munsell_0.5.1 reticulate_1.38.0
[119] stringi_1.8.4 zlibbioc_1.50.0
[121] MASS_7.3-60.2 pkgbuild_1.4.4
[123] listenv_0.9.1 ggrepel_0.9.5
[125] deldir_2.0-4 splines_4.4.1
[127] tensor_1.5 hms_1.1.3
[129] igraph_2.0.3 spatstat.geom_3.2-9
[131] RcppHNSW_0.6.0 reshape2_1.4.4
[133] pkgload_1.3.4 XML_3.99-0.16.1
[135] evaluate_0.24.0 SeuratObject_5.0.2
[137] tzdb_0.4.0 httpuv_1.6.15
[139] RANN_2.6.1 polyclip_1.10-6
[141] future_1.33.2 scattermore_1.2
[143] xtable_1.8-4 restfulr_0.0.15
[145] RSpectra_0.16-1 later_1.3.2
[147] viridisLite_0.4.2 memoise_2.0.1
[149] GenomicAlignments_1.40.0 cluster_2.1.6
[151] timechange_0.3.0 globals_0.16.3

------- Log Info

2025-02-20 20:17:10.711106 : createArrowFiles Input-Parameters, Class = list

createArrowFiles Input-Parameters$inputFiles: length = 2
S1 S2
"./rawdata/S1_fragments.tsv.gz" "./rawdata/S2_fragments.tsv.gz"

createArrowFiles Input-Parameters$sampleNames: length = 2
[1] "S1" "S2"

createArrowFiles Input-Parameters$outputNames: length = 2
[1] "S1" "S2"

createArrowFiles Input-Parameters$validBarcodes: length = 0
NULL

createArrowFiles Input-Parameters$minTSS: length = 1
[1] 4

createArrowFiles Input-Parameters$minFrags: length = 1
[1] 1000

createArrowFiles Input-Parameters$maxFrags: length = 1
[1] 1e+05

createArrowFiles Input-Parameters$minFragSize: length = 1
[1] 10

createArrowFiles Input-Parameters$maxFragSize: length = 1
[1] 2000

createArrowFiles Input-Parameters$QCDir: length = 1
[1] "QualityControl"

createArrowFiles Input-Parameters$nucLength: length = 1
[1] 147

createArrowFiles Input-Parameters$promoterRegion: length = 2
[1] 2000 100

createArrowFiles Input-Parameters$excludeChr: length = 2
[1] "chrM" "chrY"

createArrowFiles Input-Parameters$nChunk: length = 1
[1] 5

createArrowFiles Input-Parameters$bcTag: length = 1
[1] "qname"

createArrowFiles Input-Parameters$gsubExpression: length = 0
NULL

createArrowFiles Input-Parameters$bamFlag: length = 0
NULL

createArrowFiles Input-Parameters$offsetPlus: length = 1
[1] 4

createArrowFiles Input-Parameters$offsetMinus: length = 1
[1] -5

createArrowFiles Input-Parameters$addTileMat: length = 1
[1] TRUE

createArrowFiles Input-Parameters$addGeneScoreMat: length = 1
[1] TRUE

createArrowFiles Input-Parameters$force: length = 1
[1] FALSE

createArrowFiles Input-Parameters$threads: length = 1
[1] 40

createArrowFiles Input-Parameters$parallelParam: length = 0
NULL

createArrowFiles Input-Parameters$subThreading: length = 1
[1] TRUE

createArrowFiles Input-Parameters$verbose: length = 1
[1] TRUE

createArrowFiles Input-Parameters$cleanTmp: length = 1
[1] TRUE

createArrowFiles Input-Parameters$logFile: length = 1
[1] "ArchRLogs/ArchR-createArrows-30dd3f1522a482-Date-2025-02-20_Time-20-17-10.618419.log"

createArrowFiles Input-Parameters$filterFrags: length = 1
[1] 1000

createArrowFiles Input-Parameters$filterTSS: length = 1
[1] 4

2025-02-20 20:17:10.735497 :
2025-02-20 20:17:10.739421 : Batch Execution w/ safelapply!, 0 mins elapsed.

###########
2025-02-20 20:17:11.55902 : Creating Arrow File S2.arrow (S2 : 1 of 2)
###########

2025-02-20 20:17:11.561097 : validBC, Class = NULL

validBC: length = 0
NULL

2025-02-20 20:17:11.684675 :

###########
2025-02-20 20:17:12.231882 : Creating Arrow File S1.arrow (S1 : 2 of 2)
###########

2025-02-20 20:17:12.233667 : validBC, Class = NULL

validBC: length = 0
NULL

2025-02-20 20:17:12.339596 :

2025-02-20 20:17:12.363655 :

------- Completed

End Time : 2025-02-20 20:17:12.368532
Elapsed Time Minutes = 0.0290901939074198
Elapsed Time Hours = 0.000485041075282627


@LoRner22 LoRner22 added the bug Something isn't working label Feb 20, 2025
@immanuelazn
Collaborator

Hi @LoRner22! Thanks for using ArchR! Please make sure that your post belongs in the Issues section. Only bugs and error reports belong in the Issues section. Usage questions and feature requests should be posted in the Discussions section, not in Issues.

If you are getting an error, it is likely due to something specific to your dataset, usage, or computational environment, all of which are extremely challenging to troubleshoot. As such, we require reproducible examples (preferably using the tutorial dataset) from users who want assistance. If you cannot reproduce your error, we will not be able to help.
Before going through the work of making a reproducible example, search the previous Issues, Discussions, function definitions, or the ArchR manual and you will likely find the answers you are looking for.
If your post does not contain a reproducible example, it is unlikely to receive a response.

In addition to a reproducible example, you must do the following things before we help you, unless your original post already contained this information:
1. If you've encountered an error, have you already searched previous Issues to make sure that this hasn't already been solved?
2. Did you post your log file? If not, add it now.
3. Remove any screenshots that contain text and instead copy and paste the text using markdown's codeblock syntax (three consecutive backticks). You can do this by editing your original post.
