diff --git a/dev/articles/interval-stats.html b/dev/articles/interval-stats.html
index c730ad19..760f31fc 100644
--- a/dev/articles/interval-stats.html
+++ b/dev/articles/interval-stats.html
@@ -79,7 +79,7 @@
diff --git a/dev/search.json b/dev/search.json
index 2c73648a..999cfc36 100644
--- a/dev/search.json
+++ b/dev/search.json
@@ -1 +1 @@
-[{"path":[]},{"path":"https://rnabioco.github.io/valr/dev/CODE_OF_CONDUCT.html","id":"our-pledge","dir":"","previous_headings":"","what":"Our Pledge","title":"Contributor Covenant Code of Conduct","text":"members, contributors, leaders pledge make participation community harassment-free experience everyone, regardless age, body size, visible invisible disability, ethnicity, sex characteristics, gender identity expression, level experience, education, socio-economic status, nationality, personal appearance, race, religion, sexual identity orientation. pledge act interact ways contribute open, welcoming, diverse, inclusive, healthy community.","code":""},{"path":"https://rnabioco.github.io/valr/dev/CODE_OF_CONDUCT.html","id":"our-standards","dir":"","previous_headings":"","what":"Our Standards","title":"Contributor Covenant Code of Conduct","text":"Examples behavior contributes positive environment community include: Demonstrating empathy kindness toward people respectful differing opinions, viewpoints, experiences Giving gracefully accepting constructive feedback Accepting responsibility apologizing affected mistakes, learning experience Focusing best just us individuals, overall community Examples unacceptable behavior include: use sexualized language imagery, sexual attention advances kind Trolling, insulting derogatory comments, personal political attacks Public private harassment Publishing others’ private information, physical email address, without explicit permission conduct reasonably considered inappropriate professional setting","code":""},{"path":"https://rnabioco.github.io/valr/dev/CODE_OF_CONDUCT.html","id":"enforcement-responsibilities","dir":"","previous_headings":"","what":"Enforcement Responsibilities","title":"Contributor Covenant Code of Conduct","text":"Community leaders responsible clarifying enforcing standards acceptable behavior take appropriate fair corrective action response behavior deem inappropriate, threatening, offensive, harmful. Community leaders right responsibility remove, edit, reject comments, commits, code, wiki edits, issues, contributions aligned Code Conduct, communicate reasons moderation decisions appropriate.","code":""},{"path":"https://rnabioco.github.io/valr/dev/CODE_OF_CONDUCT.html","id":"scope","dir":"","previous_headings":"","what":"Scope","title":"Contributor Covenant Code of Conduct","text":"Code Conduct applies within community spaces, also applies individual officially representing community public spaces. Examples representing community include using official e-mail address, posting via official social media account, acting appointed representative online offline event.","code":""},{"path":"https://rnabioco.github.io/valr/dev/CODE_OF_CONDUCT.html","id":"enforcement","dir":"","previous_headings":"","what":"Enforcement","title":"Contributor Covenant Code of Conduct","text":"Instances abusive, harassing, otherwise unacceptable behavior may reported community leaders responsible enforcement [INSERT CONTACT METHOD]. complaints reviewed investigated promptly fairly. community leaders obligated respect privacy security reporter incident.","code":""},{"path":"https://rnabioco.github.io/valr/dev/CODE_OF_CONDUCT.html","id":"enforcement-guidelines","dir":"","previous_headings":"","what":"Enforcement Guidelines","title":"Contributor Covenant Code of Conduct","text":"Community leaders follow Community Impact Guidelines determining consequences action deem violation Code Conduct:","code":""},{"path":"https://rnabioco.github.io/valr/dev/CODE_OF_CONDUCT.html","id":"id_1-correction","dir":"","previous_headings":"Enforcement Guidelines","what":"1. Correction","title":"Contributor Covenant Code of Conduct","text":"Community Impact: Use inappropriate language behavior deemed unprofessional unwelcome community. Consequence: private, written warning community leaders, providing clarity around nature violation explanation behavior inappropriate. public apology may requested.","code":""},{"path":"https://rnabioco.github.io/valr/dev/CODE_OF_CONDUCT.html","id":"id_2-warning","dir":"","previous_headings":"Enforcement Guidelines","what":"2. Warning","title":"Contributor Covenant Code of Conduct","text":"Community Impact: violation single incident series actions. Consequence: warning consequences continued behavior. interaction people involved, including unsolicited interaction enforcing Code Conduct, specified period time. includes avoiding interactions community spaces well external channels like social media. Violating terms may lead temporary permanent ban.","code":""},{"path":"https://rnabioco.github.io/valr/dev/CODE_OF_CONDUCT.html","id":"id_3-temporary-ban","dir":"","previous_headings":"Enforcement Guidelines","what":"3. Temporary Ban","title":"Contributor Covenant Code of Conduct","text":"Community Impact: serious violation community standards, including sustained inappropriate behavior. Consequence: temporary ban sort interaction public communication community specified period time. public private interaction people involved, including unsolicited interaction enforcing Code Conduct, allowed period. Violating terms may lead permanent ban.","code":""},{"path":"https://rnabioco.github.io/valr/dev/CODE_OF_CONDUCT.html","id":"id_4-permanent-ban","dir":"","previous_headings":"Enforcement Guidelines","what":"4. Permanent Ban","title":"Contributor Covenant Code of Conduct","text":"Community Impact: Demonstrating pattern violation community standards, including sustained inappropriate behavior, harassment individual, aggression toward disparagement classes individuals. Consequence: permanent ban sort public interaction within community.","code":""},{"path":"https://rnabioco.github.io/valr/dev/CODE_OF_CONDUCT.html","id":"attribution","dir":"","previous_headings":"","what":"Attribution","title":"Contributor Covenant Code of Conduct","text":"Code Conduct adapted Contributor Covenant, version 2.0, available https://www.contributor-covenant.org/version/2/0/ code_of_conduct.html. Community Impact Guidelines inspired Mozilla’s code conduct enforcement ladder. answers common questions code conduct, see FAQ https://www.contributor-covenant.org/faq. Translations available https:// www.contributor-covenant.org/translations.","code":""},{"path":"https://rnabioco.github.io/valr/dev/CONTRIBUTING.html","id":null,"dir":"","previous_headings":"","what":"Contributing to valr","title":"Contributing to valr","text":"outlines propose change valr. detailed info contributing , tidyverse packages, please see development contributing guide.","code":""},{"path":"https://rnabioco.github.io/valr/dev/CONTRIBUTING.html","id":"fixing-typos","dir":"","previous_headings":"","what":"Fixing typos","title":"Contributing to valr","text":"can fix typos, spelling mistakes, grammatical errors documentation directly using GitHub web interface, long changes made source file. generally means ’ll need edit roxygen2 comments .R, .Rd file. can find .R file generates .Rd reading comment first line.","code":""},{"path":"https://rnabioco.github.io/valr/dev/CONTRIBUTING.html","id":"bigger-changes","dir":"","previous_headings":"","what":"Bigger changes","title":"Contributing to valr","text":"want make bigger change, ’s good idea first file issue make sure someone team agrees ’s needed. ’ve found bug, please file issue illustrates bug minimal reprex (also help write unit test, needed).","code":""},{"path":"https://rnabioco.github.io/valr/dev/CONTRIBUTING.html","id":"pull-request-process","dir":"","previous_headings":"Bigger changes","what":"Pull request process","title":"Contributing to valr","text":"Fork package clone onto computer. haven’t done , recommend using usethis::create_from_github(\"rnabioco/valr\", fork = TRUE). Install development dependences devtools::install_dev_deps(), make sure package passes R CMD check running devtools::check(). R CMD check doesn’t pass cleanly, ’s good idea ask help continuing. Create Git branch pull request (PR). recommend using usethis::pr_init(\"brief-description--change\"). Make changes, commit git, create PR running usethis::pr_push(), following prompts browser. title PR briefly describe change. body PR contain Fixes #issue-number. user-facing changes, add bullet top NEWS.md (.e. just first header). Follow style described https://style.tidyverse.org/news.html.","code":""},{"path":"https://rnabioco.github.io/valr/dev/CONTRIBUTING.html","id":"code-style","dir":"","previous_headings":"Bigger changes","what":"Code style","title":"Contributing to valr","text":"New code follow tidyverse style guide. can use styler package apply styles, please don’t restyle code nothing PR. use roxygen2, Markdown syntax, documentation. use testthat unit tests. Contributions test cases included easier accept.","code":""},{"path":"https://rnabioco.github.io/valr/dev/CONTRIBUTING.html","id":"code-of-conduct","dir":"","previous_headings":"","what":"Code of Conduct","title":"Contributing to valr","text":"Please note valr project released Contributor Code Conduct. contributing project agree abide terms.","code":""},{"path":"https://rnabioco.github.io/valr/dev/LICENSE.html","id":null,"dir":"","previous_headings":"","what":"MIT License","title":"MIT License","text":"Copyright (c) 2016-2018 Jay R Hesselberth Kent Riemondy Permission hereby granted, free charge, person obtaining copy software associated documentation files (“Software”), deal Software without restriction, including without limitation rights use, copy, modify, merge, publish, distribute, sublicense, /sell copies Software, permit persons Software furnished , subject following conditions: copyright notice permission notice shall included copies substantial portions Software. SOFTWARE PROVIDED “”, WITHOUT WARRANTY KIND, EXPRESS IMPLIED, INCLUDING LIMITED WARRANTIES MERCHANTABILITY, FITNESS PARTICULAR PURPOSE NONINFRINGEMENT. EVENT SHALL AUTHORS COPYRIGHT HOLDERS LIABLE CLAIM, DAMAGES LIABILITY, WHETHER ACTION CONTRACT, TORT OTHERWISE, ARISING , CONNECTION SOFTWARE USE DEALINGS SOFTWARE.","code":""},{"path":"https://rnabioco.github.io/valr/dev/articles/interval-stats.html","id":"overview","dir":"Articles","previous_headings":"","what":"Overview","title":"Interval statistics","text":"valr includes several functions exploring statistical relationships sets intervals. Calculate significance overlaps sets intervals bed_fisher() bed_projection(). Quantify relative absolute distances sets intervals bed_reldist() bed_absdist(). Quantify extent overlap sets intervals bed_jaccard(). vignette explore relationship transcription start sites repetitive elements human genome.","code":"library(valr) #> Error in get(paste0(generic, \".\", class), envir = get_method_env()) : #> object 'type_sum.accel' not found library(dplyr) library(ggplot2) library(cowplot) library(tidyr) # load repeats and genes. Data in the valr package is restricted to chr22; the entire # files can be downloaded from UCSC. rpts <- read_bed(valr_example(\"hg19.rmsk.chr22.bed.gz\")) genes <- read_bed12(valr_example(\"hg19.refGene.chr22.bed.gz\")) # load chrom sizes genome <- read_genome(valr_example(\"hg19.chrom.sizes.gz\")) # create 1 bp intervals representing transcription start sites tss <- create_tss(genes) tss #> # A tibble: 1,267 × 6 #> chrom start end name score strand #> #> 1 chr22 16193008 16193009 NR_122113 0 - #> 2 chr22 16157078 16157079 NR_133911 0 + #> 3 chr22 16162065 16162066 NR_073459 0 + #> 4 chr22 16162065 16162066 NR_073460 0 + #> 5 chr22 16231288 16231289 NR_132385 0 - #> 6 chr22 16287936 16287937 NM_001136213 0 - #> 7 chr22 16274608 16274609 NR_046571 0 + #> 8 chr22 16449803 16449804 NM_001005239 0 - #> 9 chr22 17073699 17073700 NM_014406 0 - #> 10 chr22 17082800 17082801 NR_001591 0 + #> # ℹ 1,257 more rows"},{"path":"https://rnabioco.github.io/valr/dev/articles/interval-stats.html","id":"distance-metrics","dir":"Articles","previous_headings":"","what":"Distance metrics","title":"Interval statistics","text":"First define function takes x y intervals computes distance statistics (using bed_reldist() bed_absdist()) specified groups. value statistic assigned .value column. use distance_stats() function apply bed_absdist() function group data. done set shuffled group data. bed_shuffle() used shuffle coordinates repeats within chromosome (.e., coordinates change, chromosome stays .) Now can bind observed shuffled data together, tidying put data format appropriate statistical test. involves: unnest()ing data frames creating groups repeat (name), stat (reldist absdist) type (obs shf) adding unique surrogate row numbers group using tidyr::pivot_wider() create two new obs shuf columns removing rows NA values. Now data formatted, can use non-parametric ks.test() determine whether significant differences observed shuffled data group. broom::tidy() used reformat results test tibble, results test pivoted type column test type. Histgrams different stats help visualize distribution p.values. can also assess false discovery rates (q.values) using p.adjust(). Finally can visualize results using stat_ecdf().","code":"distance_stats <- function(x, y, genome, group_var, type = NA) { group_by(x, !!rlang::sym(group_var)) |> do( reldist = bed_reldist(., y, detail = TRUE) |> select(.value = .reldist), absdist = bed_absdist(., y, genome) |> select(.value = .absdist) ) |> tidyr::pivot_longer( cols = -name, names_to = \"stat\", values_to = \"value\" ) |> mutate(type = type) } obs_stats <- distance_stats(rpts, tss, genome, \"name\", \"obs\") obs_stats #> # A tibble: 2,106 × 4 #> name stat value type #> #> 1 (A)n reldist obs #> 2 (A)n absdist obs #> 3 (AAAAACA)n reldist obs #> 4 (AAAAACA)n absdist obs #> 5 (AAAAC)n reldist obs #> 6 (AAAAC)n absdist obs #> 7 (AAAAG)n reldist obs #> 8 (AAAAG)n absdist obs #> 9 (AAAAT)n reldist obs #> 10 (AAAAT)n absdist obs #> # ℹ 2,096 more rows shfs <- bed_shuffle(rpts, genome, within = TRUE) shf_stats <- distance_stats(shfs, tss, genome, \"name\", \"shuf\") res <- bind_rows(obs_stats, shf_stats) |> tidyr::unnest(value) |> group_by(name, stat, type) |> mutate(.id = row_number()) |> tidyr::pivot_wider( names_from = \"type\", values_from = \".value\" ) |> na.omit() res #> # A tibble: 16,785 × 5 #> # Groups: name, stat [1,904] #> name stat .id obs shuf #> #> 1 (A)n reldist 1 0.363 0.177 #> 2 (A)n reldist 2 0.429 0.404 #> 3 (A)n reldist 3 0.246 0.119 #> 4 (A)n reldist 4 0.478 0.157 #> 5 (A)n reldist 5 0.260 0.176 #> 6 (A)n reldist 6 0.286 0.225 #> 7 (A)n reldist 7 0.498 0.128 #> 8 (A)n reldist 8 0.237 0.385 #> 9 (A)n reldist 9 0.314 0.413 #> 10 (A)n reldist 10 0.149 0.234 #> # ℹ 16,775 more rows library(broom) pvals <- res |> do( twosided = tidy(ks.test(.$obs, .$shuf)), less = tidy(ks.test(.$obs, .$shuf, alternative = \"less\")), greater = tidy(ks.test(.$obs, .$shuf, alternative = \"greater\")) ) |> tidyr::pivot_longer(cols = -c(name, stat), names_to = \"alt\", values_to = \"type\") |> unnest(type) |> select(name:p.value) |> arrange(p.value) ggplot(pvals, aes(p.value)) + geom_histogram(binwidth = 0.05) + facet_grid(stat ~ alt) + theme_cowplot() pvals <- group_by(pvals, stat, alt) |> mutate(q.value = p.adjust(p.value)) |> ungroup() |> arrange(q.value) res_gather <- tidyr::pivot_longer(res, cols = -c(name, stat, .id), names_to = \"type\", values_to = \"value\" ) signif <- head(pvals, 5) res_signif <- signif |> left_join(res_gather, by = c(\"name\", \"stat\")) #> Warning in left_join(signif, res_gather, by = c(\"name\", \"stat\")): Detected an unexpected many-to-many relationship between `x` and `y`. #> ℹ Row 1 of `x` matches multiple rows in `y`. #> ℹ Row 29037 of `y` matches multiple rows in `x`. #> ℹ If a many-to-many relationship is expected, set `relationship = #> \"many-to-many\"` to silence this warning. ggplot(res_signif, aes(x = value, color = type)) + stat_ecdf() + facet_grid(stat ~ name) + theme_cowplot() + scale_x_log10() + scale_color_brewer(palette = \"Set1\")"},{"path":"https://rnabioco.github.io/valr/dev/articles/interval-stats.html","id":"projection-test","dir":"Articles","previous_headings":"","what":"Projection test","title":"Interval statistics","text":"bed_projection() statistical approach assess relationship two intervals based binomial distribution. , examine distribution repetitive elements within promoters coding non-coding genes. First ’ll extract 5 kb regions upstream transcription start sites represent promoter regions coding non-coding genes. Next ’ll apply bed_projection() test repeat class coding non-coding regions. projection test two-tailed statistical test. significant p-value indicates either enrichment depletion query intervals compared reference interval sets. value lower_tail = TRUE column indicates query intervals depleted, whereas lower_tail = FALSE indicates query intervals enriched.","code":"# create intervals 5kb upstream of tss representing promoters promoters <- bed_flank(genes, genome, left = 5000, strand = TRUE) |> mutate(name = ifelse(grepl(\"NR_\", name), \"non-coding\", \"coding\")) |> select(chrom:strand) # select coding and non-coding promoters promoters_coding <- filter(promoters, name == \"coding\") promoters_ncoding <- filter(promoters, name == \"non-coding\") promoters_coding #> # A tibble: 973 × 6 #> chrom start end name score strand #> #> 1 chr22 16287937 16292937 coding 0 - #> 2 chr22 16449804 16454804 coding 0 - #> 3 chr22 17073700 17078700 coding 0 - #> 4 chr22 17302589 17307589 coding 0 - #> 5 chr22 17302589 17307589 coding 0 - #> 6 chr22 17489112 17494112 coding 0 - #> 7 chr22 17560848 17565848 coding 0 + #> 8 chr22 17560848 17565848 coding 0 + #> 9 chr22 17602213 17607213 coding 0 - #> 10 chr22 17602257 17607257 coding 0 - #> # ℹ 963 more rows promoters_ncoding #> # A tibble: 294 × 6 #> chrom start end name score strand #> #> 1 chr22 16152078 16157078 non-coding 0 + #> 2 chr22 16157065 16162065 non-coding 0 + #> 3 chr22 16157065 16162065 non-coding 0 + #> 4 chr22 16193009 16198009 non-coding 0 - #> 5 chr22 16231289 16236289 non-coding 0 - #> 6 chr22 16269608 16274608 non-coding 0 + #> 7 chr22 17077800 17082800 non-coding 0 + #> 8 chr22 17156430 17161430 non-coding 0 - #> 9 chr22 17229328 17234328 non-coding 0 - #> 10 chr22 17303363 17308363 non-coding 0 + #> # ℹ 284 more rows # function to apply bed_projection to groups projection_stats <- function(x, y, genome, group_var, type = NA) { group_by(x, !!rlang::sym(group_var)) |> do( n_repeats = nrow(.), projection = bed_projection(., y, genome) ) |> mutate(type = type) } pvals_coding <- projection_stats(rpts, promoters_coding, genome, \"name\", \"coding\") pvals_ncoding <- projection_stats(rpts, promoters_ncoding, genome, \"name\", \"non_coding\") pvals <- bind_rows(pvals_ncoding, pvals_coding) |> ungroup() |> tidyr::unnest(cols = c(n_repeats, projection)) |> select(-chrom) # filter for repeat classes with at least 10 intervals pvals <- filter( pvals, n_repeats > 10, obs_exp_ratio != 0 ) # adjust pvalues pvals <- mutate(pvals, q.value = p.adjust(p.value)) pvals #> # A tibble: 179 × 7 #> name n_repeats p.value obs_exp_ratio lower_tail type q.value #> #> 1 (A)n 28 0.00353 4.72 FALSE non_coding 0.558 #> 2 (AT)n 48 0.298 0.917 FALSE non_coding 1 #> 3 (CA)n 31 0.156 1.42 FALSE non_coding 1 #> 4 (GT)n 42 0.247 1.05 FALSE non_coding 1 #> 5 (T)n 61 0.405 0.721 FALSE non_coding 1 #> 6 (TG)n 40 0.0622 2.20 FALSE non_coding 1 #> 7 A-rich 54 0.348 0.815 FALSE non_coding 1 #> 8 Alu 15 0.0446 2.93 FALSE non_coding 1 #> 9 AluJb 271 0.0225 1.79 FALSE non_coding 1 #> 10 AluJo 208 0.0216 1.90 FALSE non_coding 1 #> # ℹ 169 more rows library(DT) # find and show top 5 most significant repeats signif_tests <- pvals |> arrange(q.value) |> group_by(type) |> top_n(-5, q.value) |> arrange(type) DT::datatable(signif_tests)"},{"path":"https://rnabioco.github.io/valr/dev/articles/valr.html","id":"familiar-tools-natively-in-r","dir":"Articles","previous_headings":"","what":"Familiar tools, natively in R","title":"valr overview","text":"functions valr similar names BEDtools counterparts, familiar users coming BEDtools suite. Similar pybedtools, valr terse syntax:","code":"library(valr) library(dplyr) snps <- read_bed(valr_example(\"hg19.snps147.chr22.bed.gz\")) genes <- read_bed(valr_example(\"genes.hg19.chr22.bed.gz\")) # find snps in intergenic regions intergenic <- bed_subtract(snps, genes) # distance from intergenic snps to nearest gene nearby <- bed_closest(intergenic, genes) nearby |> select(starts_with(\"name\"), .overlap, .dist) |> filter(abs(.dist) < 1000) #> # A tibble: 285 × 4 #> name.x name.y .overlap .dist #> #> 1 rs2261631 P704P 0 -268 #> 2 rs570770556 POTEH 0 -913 #> 3 rs538163832 POTEH 0 -953 #> 4 rs9606135 TPTEP1 0 -422 #> 5 rs11912392 ANKRD62P1-PARP4P3 0 105 #> 6 rs8136454 BC038197 0 356 #> 7 rs5992556 XKR3 0 -456 #> 8 rs114101676 GAB4 0 474 #> 9 rs62236167 CECR7 0 262 #> 10 rs5747023 CECR1 0 -387 #> # ℹ 275 more rows"},{"path":"https://rnabioco.github.io/valr/dev/articles/valr.html","id":"input-data","dir":"Articles","previous_headings":"","what":"Input data","title":"valr overview","text":"valr assigns common column names facilitate comparisons tbls. tbls chrom, start, end columns, tbls multi-column formats additional pre-determined column names. See read_bed() documentation details. valr can also operate BED-like data.frames already constructed R, provided columns named chrom, start end present. New tbls can also constructed either tibbles base R data.frames.","code":"bed_file <- valr_example(\"3fields.bed.gz\") read_bed(bed_file) # accepts filepaths or URLs #> # A tibble: 10 × 3 #> chrom start end #> #> 1 chr1 11873 14409 #> 2 chr1 14361 19759 #> 3 chr1 14406 29370 #> 4 chr1 34610 36081 #> 5 chr1 69090 70008 #> 6 chr1 134772 140566 #> 7 chr1 321083 321115 #> 8 chr1 321145 321207 #> 9 chr1 322036 326938 #> 10 chr1 327545 328439 bed <- tribble( ~chrom, ~start, ~end, \"chr1\", 1657492, 2657492, \"chr2\", 2501324, 3094650 ) bed #> # A tibble: 2 × 3 #> chrom start end #> #> 1 chr1 1657492 2657492 #> 2 chr2 2501324 3094650"},{"path":"https://rnabioco.github.io/valr/dev/articles/valr.html","id":"interval-coordinates","dir":"Articles","previous_headings":"","what":"Interval coordinates","title":"valr overview","text":"valr adheres BED format specifies start position interval zero based end position one-based. first position chromosome 0. end position chromosome one position passed last base, included interval. example:","code":"# a chromosome 100 basepairs in length chrom <- tribble( ~chrom, ~start, ~end, \"chr1\", 0, 100 ) chrom #> # A tibble: 1 × 3 #> chrom start end #> #> 1 chr1 0 100 # single base-pair intervals bases <- tribble( ~chrom, ~start, ~end, \"chr1\", 0, 1, # first base of chromosome \"chr1\", 1, 2, # second base of chromosome \"chr1\", 99, 100 # last base of chromosome ) bases #> # A tibble: 3 × 3 #> chrom start end #> #> 1 chr1 0 1 #> 2 chr1 1 2 #> 3 chr1 99 100"},{"path":"https://rnabioco.github.io/valr/dev/articles/valr.html","id":"remote-databases","dir":"Articles","previous_headings":"","what":"Remote databases","title":"valr overview","text":"Remote databases can accessed db_ucsc() (access UCSC Browser) db_ensembl() (access Ensembl databases).","code":"# access the `refGene` tbl on the `hg38` assembly. if (require(RMariaDB)) { ucsc <- db_ucsc(\"hg38\") tbl(ucsc, \"refGene\") }"},{"path":"https://rnabioco.github.io/valr/dev/articles/valr.html","id":"visual-documentation","dir":"Articles","previous_headings":"","what":"Visual documentation","title":"valr overview","text":"bed_glyph() tool illustrates results operations valr, similar found BEDtools documentation. glyph shows result intersecting x y intervals bed_intersect(): glyph illustrates bed_merge():","code":"x <- tribble( ~chrom, ~start, ~end, \"chr1\", 25, 50, \"chr1\", 100, 125 ) y <- tribble( ~chrom, ~start, ~end, \"chr1\", 30, 75 ) bed_glyph(bed_intersect(x, y)) x <- tribble( ~chrom, ~start, ~end, \"chr1\", 1, 50, \"chr1\", 10, 75, \"chr1\", 100, 120 ) bed_glyph(bed_merge(x))"},{"path":"https://rnabioco.github.io/valr/dev/articles/valr.html","id":"grouping-data","dir":"Articles","previous_headings":"","what":"Grouping data","title":"valr overview","text":"group_by function dplyr can used perform functions subsets single multiple data_frames. Functions valr leverage grouping enable variety comparisons. example, intervals can grouped strand perform comparisons among intervals strand. Comparisons intervals opposite strands done using flip_strands() function: single set (e.g. bed_merge()) multi set operations respect groupings input intervals.","code":"x <- tribble( ~chrom, ~start, ~end, ~strand, \"chr1\", 1, 100, \"+\", \"chr1\", 50, 150, \"+\", \"chr2\", 100, 200, \"-\" ) y <- tribble( ~chrom, ~start, ~end, ~strand, \"chr1\", 50, 125, \"+\", \"chr1\", 50, 150, \"-\", \"chr2\", 50, 150, \"+\" ) # intersect tbls by strand x <- group_by(x, strand) y <- group_by(y, strand) bed_intersect(x, y) #> # A tibble: 2 × 8 #> chrom start.x end.x strand.x start.y end.y strand.y .overlap #> #> 1 chr1 1 100 + 50 125 + 50 #> 2 chr1 50 150 + 50 125 + 75 x <- group_by(x, strand) y <- flip_strands(y) y <- group_by(y, strand) bed_intersect(x, y) #> # A tibble: 3 × 8 #> chrom start.x end.x strand.x start.y end.y strand.y .overlap #> #> 1 chr1 1 100 + 50 150 + 50 #> 2 chr1 50 150 + 50 150 + 100 #> 3 chr2 100 200 - 50 150 - 50"},{"path":"https://rnabioco.github.io/valr/dev/articles/valr.html","id":"column-specification","dir":"Articles","previous_headings":"","what":"Column specification","title":"valr overview","text":"Columns BEDtools referred position: valr, columns referred name can used multiple name/value expressions summaries.","code":"# calculate the mean of column 6 for intervals in `b` that overlap with `a` bedtools map -a a.bed -b b.bed -c 6 -o mean # calculate the mean and variance for a `value` column bed_map(a, b, .mean = mean(value), .var = var(value)) # report concatenated and max values for merged intervals bed_merge(a, .concat = concat(value), .max = max(value))"},{"path":[]},{"path":"https://rnabioco.github.io/valr/dev/articles/valr.html","id":"meta-analysis","dir":"Articles","previous_headings":"Getting started","what":"Meta-analysis","title":"valr overview","text":"demonstration illustrates use valr tools perform “meta-analysis” signals relative genomic features. analyze distribution histone marks surrounding transcription start sites. First load libraries relevant data. generate 1 bp intervals represent transcription start sites (TSSs). focus + strand genes, - genes easily accommodated filtering using bed_makewindows() reversed window numbers. Now use .win_id group bed_map() calculate sum mapping y signals onto intervals x. data regrouped .win_id summary mean sd values calculated. Finally, summary statistics used construct plot illustrates histone density surrounding TSSs.","code":"# `valr_example()` identifies the path of example files bedfile <- valr_example(\"genes.hg19.chr22.bed.gz\") genomefile <- valr_example(\"hg19.chrom.sizes.gz\") bgfile <- valr_example(\"hela.h3k4.chip.bg.gz\") genes <- read_bed(bedfile) genome <- read_genome(genomefile) y <- read_bedgraph(bgfile) # generate 1 bp TSS intervals, `+` strand only tss <- genes |> filter(strand == \"+\") |> mutate(end = start + 1) # 1000 bp up and downstream region_size <- 1000 # 50 bp windows win_size <- 50 # add slop to the TSS, break into windows and add a group x <- tss |> bed_slop(genome, both = region_size) |> bed_makewindows(win_size) x #> # A tibble: 13,530 × 7 #> chrom start end name score strand .win_id #> #> 1 chr22 16161065 16161115 LINC00516 3 + 1 #> 2 chr22 16161115 16161165 LINC00516 3 + 2 #> 3 chr22 16161165 16161215 LINC00516 3 + 3 #> 4 chr22 16161215 16161265 LINC00516 3 + 4 #> 5 chr22 16161265 16161315 LINC00516 3 + 5 #> 6 chr22 16161315 16161365 LINC00516 3 + 6 #> 7 chr22 16161365 16161415 LINC00516 3 + 7 #> 8 chr22 16161415 16161465 LINC00516 3 + 8 #> 9 chr22 16161465 16161515 LINC00516 3 + 9 #> 10 chr22 16161515 16161565 LINC00516 3 + 10 #> # ℹ 13,520 more rows # map signals to TSS regions and calculate summary statistics. res <- bed_map(x, y, win_sum = sum(value, na.rm = TRUE)) |> group_by(.win_id) |> summarize( win_mean = mean(win_sum, na.rm = TRUE), win_sd = sd(win_sum, na.rm = TRUE) ) res #> # A tibble: 41 × 3 #> .win_id win_mean win_sd #> #> 1 1 101. 85.8 #> 2 2 111. 81.1 #> 3 3 123. 99.1 #> 4 4 116. 96.3 #> 5 5 116. 102. #> 6 6 125. 95.1 #> 7 7 123. 94.4 #> 8 8 128. 91.5 #> 9 9 130. 95.7 #> 10 10 130. 88.8 #> # ℹ 31 more rows x_labels <- seq( -region_size, region_size, by = win_size * 5 ) x_breaks <- seq(1, 41, by = 5) sd_limits <- aes( ymax = win_mean + win_sd, ymin = win_mean - win_sd ) ggplot( res, aes( x = .win_id, y = win_mean ) ) + geom_point() + geom_pointrange(sd_limits) + scale_x_continuous( labels = x_labels, breaks = x_breaks ) + labs( x = \"Position (bp from TSS)\", y = \"Signal\", title = \"Human H3K4me3 signal near transcription start sites\" ) + theme_classic()"},{"path":"https://rnabioco.github.io/valr/dev/articles/valr.html","id":"related-work","dir":"Articles","previous_headings":"","what":"Related work","title":"valr overview","text":"Command-line tools BEDtools bedops. Python library pybedtools wraps BEDtools. R packages GenomicRanges, bedr, IRanges GenometriCorr provide similar capability different philosophy.","code":""},{"path":"https://rnabioco.github.io/valr/dev/authors.html","id":null,"dir":"","previous_headings":"","what":"Authors","title":"Authors and Citation","text":"Jay Hesselberth. Author. Kent Riemondy. Author, maintainer. RNA Bioscience Initiative. Funder, copyright holder.","code":""},{"path":"https://rnabioco.github.io/valr/dev/authors.html","id":"citation","dir":"","previous_headings":"","what":"Citation","title":"Authors and Citation","text":"Riemondy KA, Sheridan RM, Gillen , Yu Y, Bennett CG, Hesselberth JR (2017). “valr: Reproducible Genome Interval Arithmetic R.” F1000Research. doi:10.12688/f1000research.11997.1.","code":"@Article{, title = {valr: Reproducible Genome Interval Arithmetic in R}, year = {2017}, author = {Kent A. Riemondy and Ryan M. Sheridan and Austin Gillen and Yinni Yu and Christopher G. Bennett and Jay R. Hesselberth}, journal = {F1000Research}, doi = {10.12688/f1000research.11997.1}, }"},{"path":"https://rnabioco.github.io/valr/dev/index.html","id":"valr-","dir":"","previous_headings":"","what":"Genome Interval Arithmetic","title":"Genome Interval Arithmetic","text":"valr provides tools read manipulate genome intervals signals, similar BEDtools suite.","code":""},{"path":"https://rnabioco.github.io/valr/dev/index.html","id":"installation","dir":"","previous_headings":"","what":"Installation","title":"Genome Interval Arithmetic","text":"","code":"# Install development version from GitHub # install.packages(\"pak\") pak::pak(\"rnabioco/valr\")"},{"path":"https://rnabioco.github.io/valr/dev/index.html","id":"valr-example","dir":"","previous_headings":"","what":"valr Example","title":"Genome Interval Arithmetic","text":"Functions valr similar names BEDtools counterparts, familiar users coming BEDtools suite. Unlike tools wrap BEDtools write temporary files disk, valr tools run natively memory. Similar pybedtools, valr terse syntax:","code":"library(valr) library(dplyr) snps <- read_bed(valr_example(\"hg19.snps147.chr22.bed.gz\")) genes <- read_bed(valr_example(\"genes.hg19.chr22.bed.gz\")) # find snps in intergenic regions intergenic <- bed_subtract(snps, genes) # find distance from intergenic snps to nearest gene nearby <- bed_closest(intergenic, genes) nearby |> select(starts_with(\"name\"), .overlap, .dist) |> filter(abs(.dist) < 5000) #> # A tibble: 1,047 × 4 #> name.x name.y .overlap .dist #> #> 1 rs530458610 P704P 0 2579 #> 2 rs2261631 P704P 0 -268 #> 3 rs570770556 POTEH 0 -913 #> 4 rs538163832 POTEH 0 -953 #> 5 rs190224195 POTEH 0 -1399 #> 6 rs2379966 DQ571479 0 4750 #> 7 rs142687051 DQ571479 0 3558 #> 8 rs528403095 DQ571479 0 3309 #> 9 rs555126291 DQ571479 0 2745 #> 10 rs5747567 DQ571479 0 -1778 #> # ℹ 1,037 more rows"},{"path":"https://rnabioco.github.io/valr/dev/reference/bed12_to_exons.html","id":null,"dir":"Reference","previous_headings":"","what":"Convert BED12 to individual exons in BED6. — bed12_to_exons","title":"Convert BED12 to individual exons in BED6. — bed12_to_exons","text":"conversion BED6 format, score column contains exon number, respect strand (.e., first exon - strand genes larger start end coordinates).","code":""},{"path":"https://rnabioco.github.io/valr/dev/reference/bed12_to_exons.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Convert BED12 to individual exons in BED6. — bed12_to_exons","text":"","code":"bed12_to_exons(x)"},{"path":"https://rnabioco.github.io/valr/dev/reference/bed12_to_exons.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Convert BED12 to individual exons in BED6. — bed12_to_exons","text":"x ivl_df","code":""},{"path":[]},{"path":"https://rnabioco.github.io/valr/dev/reference/bed12_to_exons.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Convert BED12 to individual exons in BED6. — bed12_to_exons","text":"","code":"x <- read_bed12(valr_example(\"mm9.refGene.bed.gz\")) bed12_to_exons(x) #> # A tibble: 1,683 × 6 #> chrom start end name score strand #> #> 1 chr1 3204562 3207049 NM_001011874 3 - #> 2 chr1 3411782 3411982 NM_001011874 2 - #> 3 chr1 3660632 3661579 NM_001011874 1 - #> 4 chr1 4280926 4283093 NM_001195662 4 - #> 5 chr1 4341990 4342162 NM_001195662 3 - #> 6 chr1 4342282 4342918 NM_001195662 2 - #> 7 chr1 4399250 4399322 NM_001195662 1 - #> 8 chr1 4847774 4848057 NM_001159750 1 + #> 9 chr1 4847774 4848057 NM_011541 1 + #> 10 chr1 4848408 4848584 NM_001159751 1 + #> # ℹ 1,673 more rows"},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_absdist.html","id":null,"dir":"Reference","previous_headings":"","what":"Compute absolute distances between intervals. — bed_absdist","title":"Compute absolute distances between intervals. — bed_absdist","text":"Computes absolute distance midpoint x interval midpoints closest y interval.","code":""},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_absdist.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Compute absolute distances between intervals. — bed_absdist","text":"","code":"bed_absdist(x, y, genome)"},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_absdist.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Compute absolute distances between intervals. — bed_absdist","text":"x ivl_df y ivl_df genome genome_df","code":""},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_absdist.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Compute absolute distances between intervals. — bed_absdist","text":"ivl_df .absdist .absdist_scaled columns.","code":""},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_absdist.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Compute absolute distances between intervals. — bed_absdist","text":"Absolute distances scaled inter-reference gap chromosome follows. Q query points R reference points chromosome, scale distance query point closest reference point inter-reference gap chromosome. x interval matching y chromosome, .absdist NA. $$d_i(x,y) = min_k(|q_i - r_k|)\\frac{R}{Length\\ \\ chromosome}$$ absolute scaled distances reported .absdist .absdist_scaled. Interval statistics can used combination dplyr::group_by() dplyr::() calculate statistics subsets data. See vignette('interval-stats') examples.","code":""},{"path":[]},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_absdist.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Compute absolute distances between intervals. — bed_absdist","text":"","code":"genome <- read_genome(valr_example(\"hg19.chrom.sizes.gz\")) x <- bed_random(genome, seed = 1010486) y <- bed_random(genome, seed = 9203911) bed_absdist(x, y, genome) #> # A tibble: 1,000,000 × 5 #> chrom start end .absdist .absdist_scaled #> #> 1 chr1 5184 6184 1392 0.448 #> 2 chr1 7663 8663 1087 0.350 #> 3 chr1 9858 10858 1526 0.491 #> 4 chr1 13805 14805 2421 0.779 #> 5 chr1 14081 15081 2697 0.868 #> 6 chr1 16398 17398 1700 0.547 #> 7 chr1 17486 18486 612 0.197 #> 8 chr1 22063 23063 466 0.150 #> 9 chr1 22494 23494 897 0.289 #> 10 chr1 29351 30351 1143 0.368 #> # ℹ 999,990 more rows"},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_closest.html","id":null,"dir":"Reference","previous_headings":"","what":"Identify closest intervals. — bed_closest","title":"Identify closest intervals. — bed_closest","text":"Identify closest intervals.","code":""},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_closest.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Identify closest intervals. — bed_closest","text":"","code":"bed_closest(x, y, overlap = TRUE, suffix = c(\".x\", \".y\"))"},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_closest.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Identify closest intervals. — bed_closest","text":"x ivl_df y ivl_df overlap report overlapping intervals suffix colname suffixes output","code":""},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_closest.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Identify closest intervals. — bed_closest","text":"ivl_df additional columns: .overlap amount overlap overlapping interval. Non-overlapping adjacent intervals overlap 0. .overlap included output overlap = FALSE. .dist distance closest interval. Negative distances denote upstream intervals. Book-ended intervals distance 1.","code":""},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_closest.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Identify closest intervals. — bed_closest","text":"input tbls grouped chrom default, additional groups can added using dplyr::group_by(). example, grouping strand constrain analyses strand. compare opposing strands across two tbls, strands y tbl can first inverted using flip_strands().","code":""},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_closest.html","id":"note","dir":"Reference","previous_headings":"","what":"Note","title":"Identify closest intervals. — bed_closest","text":"interval x bed_closest() returns overlapping intervals y closest non-intersecting y interval. Setting overlap = FALSE report closest non-intersecting y intervals, ignoring overlapping y intervals.","code":""},{"path":[]},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_closest.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Identify closest intervals. — bed_closest","text":"","code":"x <- tibble::tribble( ~chrom, ~start, ~end, \"chr1\", 100, 125 ) y <- tibble::tribble( ~chrom, ~start, ~end, \"chr1\", 25, 50, \"chr1\", 140, 175 ) bed_glyph(bed_closest(x, y)) x <- tibble::tribble( ~chrom, ~start, ~end, \"chr1\", 500, 600, \"chr2\", 5000, 6000 ) y <- tibble::tribble( ~chrom, ~start, ~end, \"chr1\", 100, 200, \"chr1\", 150, 200, \"chr1\", 550, 580, \"chr2\", 7000, 8500 ) bed_closest(x, y) #> # A tibble: 4 × 7 #> chrom start.x end.x start.y end.y .overlap .dist #> #> 1 chr1 500 600 550 580 30 0 #> 2 chr1 500 600 100 200 0 -301 #> 3 chr1 500 600 150 200 0 -301 #> 4 chr2 5000 6000 7000 8500 0 1001 bed_closest(x, y, overlap = FALSE) #> # A tibble: 3 × 6 #> chrom start.x end.x start.y end.y .dist #> #> 1 chr1 500 600 100 200 -301 #> 2 chr1 500 600 150 200 -301 #> 3 chr2 5000 6000 7000 8500 1001 # Report distance based on strand x <- tibble::tribble( ~chrom, ~start, ~end, ~name, ~score, ~strand, \"chr1\", 10, 20, \"a\", 1, \"-\" ) y <- tibble::tribble( ~chrom, ~start, ~end, ~name, ~score, ~strand, \"chr1\", 8, 9, \"b\", 1, \"+\", \"chr1\", 21, 22, \"b\", 1, \"-\" ) res <- bed_closest(x, y) # convert distance based on strand res$.dist_strand <- ifelse(res$strand.x == \"+\", res$.dist, -(res$.dist)) res #> # A tibble: 2 × 14 #> chrom start.x end.x name.x score.x strand.x start.y end.y name.y score.y #> #> 1 chr1 10 20 a 1 - 21 22 b 1 #> 2 chr1 10 20 a 1 - 8 9 b 1 #> # ℹ 4 more variables: strand.y , .overlap , .dist , #> # .dist_strand # report absolute distances res$.abs_dist <- abs(res$.dist) res #> # A tibble: 2 × 15 #> chrom start.x end.x name.x score.x strand.x start.y end.y name.y score.y #> #> 1 chr1 10 20 a 1 - 21 22 b 1 #> 2 chr1 10 20 a 1 - 8 9 b 1 #> # ℹ 5 more variables: strand.y , .overlap , .dist , #> # .dist_strand , .abs_dist "},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_cluster.html","id":null,"dir":"Reference","previous_headings":"","what":"Cluster neighboring intervals. — bed_cluster","title":"Cluster neighboring intervals. — bed_cluster","text":"output .id column can used downstream grouping operations. Default max_dist = 0 means overlapping book-ended intervals clustered.","code":""},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_cluster.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Cluster neighboring intervals. — bed_cluster","text":"","code":"bed_cluster(x, max_dist = 0)"},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_cluster.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Cluster neighboring intervals. — bed_cluster","text":"x ivl_df max_dist maximum distance clustered intervals.","code":""},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_cluster.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Cluster neighboring intervals. — bed_cluster","text":"ivl_df .id column specifying sets clustered intervals.","code":""},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_cluster.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Cluster neighboring intervals. — bed_cluster","text":"input tbls grouped chrom default, additional groups can added using dplyr::group_by(). example, grouping strand constrain analyses strand. compare opposing strands across two tbls, strands y tbl can first inverted using flip_strands().","code":""},{"path":[]},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_cluster.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Cluster neighboring intervals. — bed_cluster","text":"","code":"x <- tibble::tribble( ~chrom, ~start, ~end, \"chr1\", 100, 200, \"chr1\", 180, 250, \"chr1\", 250, 500, \"chr1\", 501, 1000, \"chr2\", 1, 100, \"chr2\", 150, 200 ) bed_cluster(x) #> # A tibble: 6 × 4 #> chrom start end .id #> #> 1 chr1 100 200 1 #> 2 chr1 180 250 1 #> 3 chr1 250 500 1 #> 4 chr1 501 1000 2 #> 5 chr2 1 100 3 #> 6 chr2 150 200 4 # glyph illustrating clustering of overlapping and book-ended intervals x <- tibble::tribble( ~chrom, ~start, ~end, \"chr1\", 1, 10, \"chr1\", 5, 20, \"chr1\", 30, 40, \"chr1\", 40, 50, \"chr1\", 80, 90 ) bed_glyph(bed_cluster(x), label = \".id\")"},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_complement.html","id":null,"dir":"Reference","previous_headings":"","what":"Identify intervals in a genome not covered by a query. — bed_complement","title":"Identify intervals in a genome not covered by a query. — bed_complement","text":"Identify intervals genome covered query.","code":""},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_complement.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Identify intervals in a genome not covered by a query. — bed_complement","text":"","code":"bed_complement(x, genome)"},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_complement.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Identify intervals in a genome not covered by a query. — bed_complement","text":"x ivl_df genome ivl_df","code":""},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_complement.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Identify intervals in a genome not covered by a query. — bed_complement","text":"ivl_df","code":""},{"path":[]},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_complement.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Identify intervals in a genome not covered by a query. — bed_complement","text":"","code":"x <- tibble::tribble( ~chrom, ~start, ~end, \"chr1\", 0, 10, \"chr1\", 75, 100 ) genome <- tibble::tribble( ~chrom, ~size, \"chr1\", 200 ) bed_glyph(bed_complement(x, genome)) genome <- tibble::tribble( ~chrom, ~size, \"chr1\", 500, \"chr2\", 600, \"chr3\", 800 ) x <- tibble::tribble( ~chrom, ~start, ~end, \"chr1\", 100, 300, \"chr1\", 200, 400, \"chr2\", 0, 100, \"chr2\", 200, 400, \"chr3\", 500, 600 ) # intervals not covered by x bed_complement(x, genome) #> # A tibble: 6 × 3 #> chrom start end #> #> 1 chr1 0 100 #> 2 chr1 400 500 #> 3 chr2 100 200 #> 4 chr2 400 600 #> 5 chr3 0 500 #> 6 chr3 600 800"},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_coverage.html","id":null,"dir":"Reference","previous_headings":"","what":"Compute coverage of intervals. — bed_coverage","title":"Compute coverage of intervals. — bed_coverage","text":"Compute coverage intervals.","code":""},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_coverage.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Compute coverage of intervals. — bed_coverage","text":"","code":"bed_coverage(x, y, ...)"},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_coverage.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Compute coverage of intervals. — bed_coverage","text":"x ivl_df y ivl_df ... extra arguments (used)","code":""},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_coverage.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Compute coverage of intervals. — bed_coverage","text":"ivl_df following additional columns: .ints number x intersections .cov per-base coverage x intervals .len total length y intervals covered x intervals .frac .len scaled number y intervals","code":""},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_coverage.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Compute coverage of intervals. — bed_coverage","text":"input tbls grouped chrom default, additional groups can added using dplyr::group_by(). example, grouping strand constrain analyses strand. compare opposing strands across two tbls, strands y tbl can first inverted using flip_strands().","code":""},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_coverage.html","id":"note","dir":"Reference","previous_headings":"","what":"Note","title":"Compute coverage of intervals. — bed_coverage","text":"Book-ended intervals included coverage calculations.","code":""},{"path":[]},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_coverage.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Compute coverage of intervals. — bed_coverage","text":"","code":"x <- tibble::tribble( ~chrom, ~start, ~end, ~strand, \"chr1\", 100, 500, \"+\", \"chr2\", 200, 400, \"+\", \"chr2\", 300, 500, \"-\", \"chr2\", 800, 900, \"-\" ) y <- tibble::tribble( ~chrom, ~start, ~end, ~value, ~strand, \"chr1\", 150, 400, 100, \"+\", \"chr1\", 500, 550, 100, \"+\", \"chr2\", 230, 430, 200, \"-\", \"chr2\", 350, 430, 300, \"-\" ) bed_coverage(x, y) #> # A tibble: 4 × 8 #> chrom start end strand .ints .cov .len .frac #> #> 1 chr1 100 500 + 2 250 400 0.625 #> 2 chr2 200 400 + 2 170 200 0.85 #> 3 chr2 300 500 - 2 130 200 0.65 #> 4 chr2 800 900 - 0 0 100 0"},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_fisher.html","id":null,"dir":"Reference","previous_headings":"","what":"Fisher's test to measure overlap between two sets of intervals. — bed_fisher","title":"Fisher's test to measure overlap between two sets of intervals. — bed_fisher","text":"Calculate Fisher's test number intervals shared unique two sets x y intervals.","code":""},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_fisher.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Fisher's test to measure overlap between two sets of intervals. — bed_fisher","text":"","code":"bed_fisher(x, y, genome)"},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_fisher.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Fisher's test to measure overlap between two sets of intervals. — bed_fisher","text":"x ivl_df y ivl_df genome genome_df","code":""},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_fisher.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Fisher's test to measure overlap between two sets of intervals. — bed_fisher","text":"ivl_df","code":""},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_fisher.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Fisher's test to measure overlap between two sets of intervals. — bed_fisher","text":"Interval statistics can used combination dplyr::group_by() dplyr::() calculate statistics subsets data. See vignette('interval-stats') examples.","code":""},{"path":[]},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_fisher.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Fisher's test to measure overlap between two sets of intervals. — bed_fisher","text":"","code":"genome <- read_genome(valr_example(\"hg19.chrom.sizes.gz\")) x <- bed_random(genome, n = 1e4, seed = 1010486) y <- bed_random(genome, n = 1e4, seed = 9203911) bed_fisher(x, y, genome) #> # A tibble: 1 × 6 #> estimate p.value conf.low conf.high method alternative #> #> 1 0.945 0.707 0.722 1.22 Fisher's Exact Test for Count… two.sided"},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_flank.html","id":null,"dir":"Reference","previous_headings":"","what":"Create flanking intervals from input intervals. — bed_flank","title":"Create flanking intervals from input intervals. — bed_flank","text":"Create flanking intervals input intervals.","code":""},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_flank.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Create flanking intervals from input intervals. — bed_flank","text":"","code":"bed_flank( x, genome, both = 0, left = 0, right = 0, fraction = FALSE, strand = FALSE, trim = FALSE, ... )"},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_flank.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Create flanking intervals from input intervals. — bed_flank","text":"x ivl_df genome genome_df number bases sizes left number bases left side right number bases right side fraction define flanks based fraction interval length strand define left right based strand trim adjust coordinates --bounds intervals ... extra arguments (used)","code":""},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_flank.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Create flanking intervals from input intervals. — bed_flank","text":"ivl_df","code":""},{"path":[]},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_flank.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Create flanking intervals from input intervals. — bed_flank","text":"","code":"x <- tibble::tribble( ~chrom, ~start, ~end, \"chr1\", 25, 50, \"chr1\", 100, 125 ) genome <- tibble::tribble( ~chrom, ~size, \"chr1\", 130 ) bed_glyph(bed_flank(x, genome, both = 20)) x <- tibble::tribble( ~chrom, ~start, ~end, ~name, ~score, ~strand, \"chr1\", 500, 1000, \".\", \".\", \"+\", \"chr1\", 1000, 1500, \".\", \".\", \"-\" ) genome <- tibble::tribble( ~chrom, ~size, \"chr1\", 5000 ) bed_flank(x, genome, left = 100) #> # A tibble: 2 × 6 #> chrom start end name score strand #> #> 1 chr1 400 500 . . + #> 2 chr1 900 1000 . . - bed_flank(x, genome, right = 100) #> # A tibble: 2 × 6 #> chrom start end name score strand #> #> 1 chr1 1000 1100 . . + #> 2 chr1 1500 1600 . . - bed_flank(x, genome, both = 100) #> # A tibble: 4 × 6 #> chrom start end name score strand #> #> 1 chr1 400 500 . . + #> 2 chr1 900 1000 . . - #> 3 chr1 1000 1100 . . + #> 4 chr1 1500 1600 . . - bed_flank(x, genome, both = 0.5, fraction = TRUE) #> # A tibble: 4 × 6 #> chrom start end name score strand #> #> 1 chr1 250 500 . . + #> 2 chr1 750 1000 . . - #> 3 chr1 1000 1250 . . + #> 4 chr1 1500 1750 . . -"},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_genomecov.html","id":null,"dir":"Reference","previous_headings":"","what":"Calculate coverage across a genome — bed_genomecov","title":"Calculate coverage across a genome — bed_genomecov","text":"function useful calculating interval coverage across entire genome.","code":""},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_genomecov.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Calculate coverage across a genome — bed_genomecov","text":"","code":"bed_genomecov(x, genome, zero_depth = FALSE)"},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_genomecov.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Calculate coverage across a genome — bed_genomecov","text":"x ivl_df genome genome_df zero_depth TRUE, report intervals zero depth. Zero depth intervals reported respect groups.","code":""},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_genomecov.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Calculate coverage across a genome — bed_genomecov","text":"ivl_df additional column: .depth depth interval coverage","code":""},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_genomecov.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Calculate coverage across a genome — bed_genomecov","text":"input tbls grouped chrom default, additional groups can added using dplyr::group_by(). example, grouping strand constrain analyses strand. compare opposing strands across two tbls, strands y tbl can first inverted using flip_strands().","code":""},{"path":[]},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_genomecov.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Calculate coverage across a genome — bed_genomecov","text":"","code":"x <- tibble::tribble( ~chrom, ~start, ~end, ~strand, \"chr1\", 20, 70, \"+\", \"chr1\", 50, 100, \"-\", \"chr1\", 200, 250, \"+\", \"chr1\", 220, 250, \"+\" ) genome <- tibble::tribble( ~chrom, ~size, \"chr1\", 500, \"chr2\", 1000 ) bed_genomecov(x, genome) #> # A tibble: 5 × 4 #> chrom start end .depth #> #> 1 chr1 20 50 1 #> 2 chr1 50 70 2 #> 3 chr1 70 100 1 #> 4 chr1 200 220 1 #> 5 chr1 220 250 2 bed_genomecov(dplyr::group_by(x, strand), genome) #> # A tibble: 4 × 5 #> chrom start end strand .depth #> #> 1 chr1 20 70 + 1 #> 2 chr1 200 220 + 1 #> 3 chr1 220 250 + 2 #> 4 chr1 50 100 - 1 bed_genomecov(dplyr::group_by(x, strand), genome, zero_depth = TRUE) #> # A tibble: 11 × 5 #> chrom start end strand .depth #> #> 1 chr1 0 20 + 0 #> 2 chr1 0 50 - 0 #> 3 chr1 20 70 + 1 #> 4 chr1 50 100 - 1 #> 5 chr1 70 200 + 0 #> 6 chr1 100 500 - 0 #> 7 chr1 200 220 + 1 #> 8 chr1 220 250 + 2 #> 9 chr1 250 500 + 0 #> 10 chr2 0 1000 + 0 #> 11 chr2 0 1000 - 0"},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_glyph.html","id":null,"dir":"Reference","previous_headings":"","what":"Create example glyphs for valr functions. — bed_glyph","title":"Create example glyphs for valr functions. — bed_glyph","text":"Used illustrate output valr functions small examples.","code":""},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_glyph.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Create example glyphs for valr functions. — bed_glyph","text":"","code":"bed_glyph(expr, label = NULL)"},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_glyph.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Create example glyphs for valr functions. — bed_glyph","text":"expr expression evaluate label column name use label values. present result call.","code":""},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_glyph.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Create example glyphs for valr functions. — bed_glyph","text":"ggplot2::ggplot()","code":""},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_glyph.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Create example glyphs for valr functions. — bed_glyph","text":"","code":"x <- tibble::tribble( ~chrom, ~start, ~end, \"chr1\", 25, 50, \"chr1\", 100, 125 ) y <- tibble::tribble( ~chrom, ~start, ~end, ~value, \"chr1\", 30, 75, 50 ) bed_glyph(bed_intersect(x, y)) x <- tibble::tribble( ~chrom, ~start, ~end, \"chr1\", 30, 75, \"chr1\", 50, 90, \"chr1\", 91, 120 ) bed_glyph(bed_merge(x)) bed_glyph(bed_cluster(x), label = \".id\")"},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_intersect.html","id":null,"dir":"Reference","previous_headings":"","what":"Identify intersecting intervals. — bed_intersect","title":"Identify intersecting intervals. — bed_intersect","text":"Report intersecting intervals x y tbls. Book-ended intervals .overlap values 0 output.","code":""},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_intersect.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Identify intersecting intervals. — bed_intersect","text":"","code":"bed_intersect(x, ..., invert = FALSE, suffix = c(\".x\", \".y\"))"},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_intersect.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Identify intersecting intervals. — bed_intersect","text":"x ivl_df ... one (e.g. list ) y ivl_df()s invert report x intervals y suffix colname suffixes output","code":""},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_intersect.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Identify intersecting intervals. — bed_intersect","text":"ivl_df original columns x y suffixed .x .y, new .overlap column extent overlap intersecting intervals. multiple y tbls supplied, .source contains variable names associated interval. original columns y suffixed .y output. ... contains named inputs (.e = y, b = z list(= y, b = z)), .source contain supplied names (see examples).","code":""},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_intersect.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Identify intersecting intervals. — bed_intersect","text":"input tbls grouped chrom default, additional groups can added using dplyr::group_by(). example, grouping strand constrain analyses strand. compare opposing strands across two tbls, strands y tbl can first inverted using flip_strands().","code":""},{"path":[]},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_intersect.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Identify intersecting intervals. — bed_intersect","text":"","code":"x <- tibble::tribble( ~chrom, ~start, ~end, \"chr1\", 25, 50, \"chr1\", 100, 125 ) y <- tibble::tribble( ~chrom, ~start, ~end, \"chr1\", 30, 75 ) bed_glyph(bed_intersect(x, y)) bed_glyph(bed_intersect(x, y, invert = TRUE)) x <- tibble::tribble( ~chrom, ~start, ~end, \"chr1\", 100, 500, \"chr2\", 200, 400, \"chr2\", 300, 500, \"chr2\", 800, 900 ) y <- tibble::tribble( ~chrom, ~start, ~end, ~value, \"chr1\", 150, 400, 100, \"chr1\", 500, 550, 100, \"chr2\", 230, 430, 200, \"chr2\", 350, 430, 300 ) bed_intersect(x, y) #> # A tibble: 6 × 7 #> chrom start.x end.x start.y end.y value.y .overlap #> #> 1 chr1 100 500 150 400 100 250 #> 2 chr1 100 500 500 550 100 0 #> 3 chr2 200 400 230 430 200 170 #> 4 chr2 200 400 350 430 300 50 #> 5 chr2 300 500 230 430 200 130 #> 6 chr2 300 500 350 430 300 80 bed_intersect(x, y, invert = TRUE) #> # A tibble: 1 × 3 #> chrom start end #> #> 1 chr2 800 900 # start and end of each overlapping interval res <- bed_intersect(x, y) dplyr::mutate(res, start = pmax(start.x, start.y), end = pmin(end.x, end.y) ) #> # A tibble: 6 × 9 #> chrom start.x end.x start.y end.y value.y .overlap start end #> #> 1 chr1 100 500 150 400 100 250 150 400 #> 2 chr1 100 500 500 550 100 0 500 500 #> 3 chr2 200 400 230 430 200 170 230 400 #> 4 chr2 200 400 350 430 300 50 350 400 #> 5 chr2 300 500 230 430 200 130 300 430 #> 6 chr2 300 500 350 430 300 80 350 430 z <- tibble::tribble( ~chrom, ~start, ~end, ~value, \"chr1\", 150, 400, 100, \"chr1\", 500, 550, 100, \"chr2\", 230, 430, 200, \"chr2\", 750, 900, 400 ) bed_intersect(x, y, z) #> # A tibble: 11 × 8 #> chrom start.x end.x start.y end.y value.y .source .overlap #> #> 1 chr1 100 500 150 400 100 y 250 #> 2 chr1 100 500 150 400 100 z 250 #> 3 chr1 100 500 500 550 100 y 0 #> 4 chr1 100 500 500 550 100 z 0 #> 5 chr2 200 400 230 430 200 y 170 #> 6 chr2 200 400 230 430 200 z 170 #> 7 chr2 200 400 350 430 300 y 50 #> 8 chr2 300 500 230 430 200 y 130 #> 9 chr2 300 500 230 430 200 z 130 #> 10 chr2 300 500 350 430 300 y 80 #> 11 chr2 800 900 750 900 400 z 100 bed_intersect(x, exons = y, introns = z) #> # A tibble: 11 × 8 #> chrom start.x end.x start.y end.y value.y .source .overlap #> #> 1 chr1 100 500 150 400 100 exons 250 #> 2 chr1 100 500 150 400 100 introns 250 #> 3 chr1 100 500 500 550 100 exons 0 #> 4 chr1 100 500 500 550 100 introns 0 #> 5 chr2 200 400 230 430 200 exons 170 #> 6 chr2 200 400 230 430 200 introns 170 #> 7 chr2 200 400 350 430 300 exons 50 #> 8 chr2 300 500 230 430 200 exons 130 #> 9 chr2 300 500 230 430 200 introns 130 #> 10 chr2 300 500 350 430 300 exons 80 #> 11 chr2 800 900 750 900 400 introns 100 # a list of tbl_intervals can also be passed bed_intersect(x, list(exons = y, introns = z)) #> # A tibble: 11 × 8 #> chrom start.x end.x start.y end.y value.y .source .overlap #> #> 1 chr1 100 500 150 400 100 exons 250 #> 2 chr1 100 500 150 400 100 introns 250 #> 3 chr1 100 500 500 550 100 exons 0 #> 4 chr1 100 500 500 550 100 introns 0 #> 5 chr2 200 400 230 430 200 exons 170 #> 6 chr2 200 400 230 430 200 introns 170 #> 7 chr2 200 400 350 430 300 exons 50 #> 8 chr2 300 500 230 430 200 exons 130 #> 9 chr2 300 500 230 430 200 introns 130 #> 10 chr2 300 500 350 430 300 exons 80 #> 11 chr2 800 900 750 900 400 introns 100"},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_jaccard.html","id":null,"dir":"Reference","previous_headings":"","what":"Calculate the Jaccard statistic for two sets of intervals. — bed_jaccard","title":"Calculate the Jaccard statistic for two sets of intervals. — bed_jaccard","text":"Quantifies extent overlap sets intervals terms base-pairs. Groups shared input used calculate statistic subsets data.","code":""},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_jaccard.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Calculate the Jaccard statistic for two sets of intervals. — bed_jaccard","text":"","code":"bed_jaccard(x, y)"},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_jaccard.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Calculate the Jaccard statistic for two sets of intervals. — bed_jaccard","text":"x ivl_df y ivl_df","code":""},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_jaccard.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Calculate the Jaccard statistic for two sets of intervals. — bed_jaccard","text":"tibble following columns: len_i length intersection base-pairs len_u length union base-pairs jaccard value jaccard statistic n_int number intersecting intervals x y inputs grouped, return value contain one set values per group.","code":""},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_jaccard.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Calculate the Jaccard statistic for two sets of intervals. — bed_jaccard","text":"Jaccard statistic takes values [0,1] measured : $$ J(x,y) = \\frac{\\mid x \\bigcap y \\mid} {\\mid x \\bigcup y \\mid} = \\frac{\\mid x \\bigcap y \\mid} {\\mid x \\mid + \\mid y \\mid - \\mid x \\bigcap y \\mid} $$ Interval statistics can used combination dplyr::group_by() dplyr::() calculate statistics subsets data. See vignette('interval-stats') examples.","code":""},{"path":[]},{"path":"https://rnabioco.github.io/valr/dev/reference/bed_jaccard.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Calculate the Jaccard statistic for two sets of intervals. — bed_jaccard","text":"","code":"genome <- read_genome(valr_example(\"hg19.chrom.sizes.gz\")) x <- bed_random(genome, seed = 1010486) y <- bed_random(genome, seed = 9203911) bed_jaccard(x, y) #> # A tibble: 1 × 4 #> len_i len_u jaccard n #> #> 1 236184699 1708774142 0.160 399981 # calculate jaccard per chromosome bed_jaccard( dplyr::group_by(x, chrom), dplyr::group_by(y, chrom) ) #> # A tibble: 25 × 5 #> chrom len_i len_u jaccard n #>