Commit

Fixed warnings
ThomasBrazier committed Sep 25, 2024
1 parent cdc32e3 commit 3b3f3b3
Showing 13 changed files with 95 additions and 56 deletions.
6 changes: 3 additions & 3 deletions .github/workflows/r.yml
@@ -19,10 +19,10 @@ permissions:

jobs:
build:
runs-on: macos-latest
runs-on: [macos-latest, linux]
strategy:
matrix:
r-version: ['3.6.3', '4.1.1']
r-version: ['4.4.1']

steps:
- uses: actions/checkout@v4
@@ -32,7 +32,7 @@ jobs:
r-version: ${{ matrix.r-version }}
- name: Install dependencies
run: |
install.packages(c("remotes", "rcmdcheck", "gatepoints", "ggplot2", "pbmcapply", "paralle", "BiocManager", "dineq", "npreg", "reshape2", "segmented", "knitr"))
install.packages(c("remotes", "rcmdcheck", "gatepoints", "ggplot2", "pbmcapply", "parallel", "BiocManager", "dineq", "npreg", "reshape2", "segmented", "knitr"))
remotes::install_deps(dependencies = TRUE)
BiocManager::install("GenomicRanges")
shell: Rscript {0}
18 changes: 12 additions & 6 deletions R/broken_stick.R
@@ -7,7 +7,7 @@
#' Diversity and determinants of recombination landscapes in flowering plants.
#' PLOS Genetics, 18(8), Article 8. https://doi.org/10.1371/journal.pgen.1010141
#'
#' @param marey a `comparative_marey_map` object
#' @param x a `comparative_marey_map` object
#' @param k the number of segments (default = 10)
#' @param method the method to infer the breakpoint of segments, either `strict` to cut a marker position, or `segmented` to interpolate segments breakpoints
#' @param plot (logical) whether to plot the broken stick directly or return a data frame (default = `TRUE` will plot the figure)
@@ -25,19 +25,19 @@
#'
#' @export
#'
brokenstick = function(marey, k = 10, method = "strict", plot = TRUE) {
brokenstick = function(x, k = 10, method = "strict", plot = TRUE) {

# the list of set and names to process
s = marey$set
n = marey$map
s = x$set
n = x$map

list_bs = list()

for (i in 1:length(s)) {
idx = (s == s[i] & n == n[i])
cat(s[i], n[i], "\n")

subs = subset_comparative_marey(marey, subset = idx)
subs = subset_comparative_marey(x, subset = idx)
subs = comparative_marey_to_dataframe(subs)

bs = list(brokenstick_one_map(subs, k = k, method = method))
@@ -67,7 +67,7 @@ brokenstick = function(marey, k = 10, method = "strict", plot = TRUE) {
# Besides, estimates the ratio expected/observed (longer than expected will have lower relative recombination rate)
brokenstick$ratio = brokenstick$proportion.length/(1/k)

p = ggplot(data = brokenstick, aes(x=sample, y=proportion.length, fill = log10(ratio)))+
p = ggplot(data = brokenstick, aes(x=.data$sample, y=.data$proportion.length, fill = log10(.data$ratio)))+
geom_bar(stat='identity', width = 1) +
# scale_fill_manual(values = color) +
scale_fill_viridis_c(breaks = c(-1, 0, 1), labels = c("-1", "0", "1"), direction = -1, limits = c(-1, 1),
@@ -109,6 +109,12 @@ brokenstick = function(marey, k = 10, method = "strict", plot = TRUE) {
#' @description
#' Estimate the proportions of a broken stick model for a `comparative_marey_map` object
#' i.e. proportion of relative genetic length (cM) in k segments of equal genomic size (bp) along the chromosome
#'
#' @param marey a single `mareyMap` object
#' @param k the number of segments (default = 10)
#' @param method the method used to infer segments breakpoints
#'
#'
brokenstick_one_map = function(marey, k = 10, method = "strict") {

marey$map = as.character(marey$map)
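
The roxygen description in this file defines the broken stick as the proportion of relative genetic length (cM) in k segments of equal genomic size (bp), and the plotting code divides that proportion by the uniform expectation `1/k`. A minimal sketch of that arithmetic follows. The toy data, the `phys` and `gen` column names, and the use of linear interpolation with `approx()` are assumptions made for illustration; this is not the package's `brokenstick_one_map()` implementation.

```r
# Toy Marey map: physical positions (bp) and cumulative genetic positions (cM).
# All values are made up for illustration.
marey <- data.frame(
  phys = c(0, 1e6, 2e6, 3e6, 4e6, 5e6, 6e6, 7e6, 8e6),
  gen  = c(0, 10, 18, 22, 24, 25, 27, 33, 40)
)

k <- 4  # number of segments of equal genomic size

# Segment boundaries in bp, and the genetic position interpolated at each one.
breaks <- seq(min(marey$phys), max(marey$phys), length.out = k + 1)
gen_at_breaks <- approx(marey$phys, marey$gen, xout = breaks)$y

# Share of the total genetic length carried by each segment.
proportion.length <- diff(gen_at_breaks) / (max(marey$gen) - min(marey$gen))

# Observed / expected ratio, where a uniform landscape gives 1/k per segment.
ratio <- proportion.length / (1 / k)

data.frame(segment = seq_len(k), proportion.length, ratio)
```

The proportions sum to 1 by construction, so a ratio above 1 marks a segment that carries more genetic length than a uniform landscape would give it.
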
68 changes: 35 additions & 33 deletions R/comparative_marey_map.R
@@ -50,7 +50,7 @@ comparative_marey_map = function(x = data.frame(),
#' @param x a `comparative_marey_map` object
#' @param method the interpolation method to apply
#' @param verbose whether to print messages and progress bar
#'
#' @param ... additional arguments
#'
#' @return a `comparative_marey_map` object with recombination maps updated
#' @export
@@ -71,7 +71,7 @@ comparative_recombination_maps = function(x,
#'
#' @param x a `comparative_marey_map` object
#' @param statistics a vector of statistics to compute
#' @param verbose whether to print messages and progress bar
#' @param ... additional arguments
#'
#' @slot set a vector of dataset names
#' @slot map a vector of chromosome names
@@ -85,7 +85,8 @@ comparative_recombination_maps = function(x,
#' @return a list of summary statistics
#' @export
#'
compute_stats_marey = function(x, statistics = c('mean', 'median'), ...) {
compute_stats_marey = function(x,
statistics = c('mean', 'median'), ...) {
list_stats = list()
list_stats$set = as.character(x$set)
list_stats$map = as.character(x$map)
@@ -160,44 +161,44 @@ compute_stats_marey = function(x, statistics = c('mean', 'median'), ...) {
#' @return a `ggplot2` object of Marey maps
#' @export
#'
plot_comparative_marey = function(x, group = 'set + map') {
plot_comparative_marey = function(x, group = 'set + map', ...) {

df = comparative_marey_to_dataframe(x)

# Add the Marey interpolated function
marey = comparative_interpolation_to_dataframe(x)

if (group == 'set') {
grouping = as.formula(~as.factor(set))
grouping = as.formula(~as.factor(.data$set))
facet = facet_wrap(grouping, scales = "free")
point_rec = geom_point(aes(colour = as.factor(map), fill = as.factor(map)), alpha = 0.2)
line_rec = geom_line(data = marey, aes(x = physicalPosition/10^6, y = geneticPositioncM, group = as.factor(map)), fill = "black")
ribbon_rec = geom_ribbon(data = marey, aes(x = physicalPosition/10^6, y = geneticPositioncM, ymin = lowerGeneticPositioncM, ymax = upperGeneticPositioncM, group = as.factor(map)),
point_rec = geom_point(aes(colour = as.factor(.data$map), fill = as.factor(.data$map)), alpha = 0.2)
line_rec = geom_line(data = marey, aes(x = .data$physicalPosition/10^6, y = .data$geneticPositioncM, group = as.factor(.data$map)), fill = "black")
ribbon_rec = geom_ribbon(data = marey, aes(x = .data$physicalPosition/10^6, y = .data$geneticPositioncM, ymin = .data$lowerGeneticPositioncM, ymax = .data$upperGeneticPositioncM, group = as.factor(.data$map)),
alpha = 0.4)

}
if (group == 'map') {
grouping = as.formula(~ as.factor(map))
grouping = as.formula(~ as.factor(.data$map))
facet = facet_wrap(grouping, scales = "free")
point_rec = geom_point(aes(colour = as.factor(set), fill = as.factor(set)), alpha = 0.2)
line_rec = geom_line(data = marey, aes(x = physicalPosition/10^6, y = geneticPositioncM, group = as.factor(set)), fill = "black")
ribbon_rec = geom_ribbon(data = marey, aes(x = physicalPosition/10^6, y = geneticPositioncM, ymin = lowerGeneticPositioncM, ymax = upperGeneticPositioncM, group = as.factor(set)),
point_rec = geom_point(aes(colour = as.factor(.data$set), fill = as.factor(.data$set)), alpha = 0.2)
line_rec = geom_line(data = marey, aes(x = .data$physicalPosition/10^6, y = .data$geneticPositioncM, group = as.factor(.data$set)), fill = "black")
ribbon_rec = geom_ribbon(data = marey, aes(x = .data$physicalPosition/10^6, y = .data$geneticPositioncM, ymin = .data$lowerGeneticPositioncM, ymax = .data$upperGeneticPositioncM, group = as.factor(.data$set)),
alpha = 0.4)
}
if (group == 'set + map') {
grouping = as.formula(~as.factor(map) + as.factor(set))
grouping = as.formula(~as.factor(.data$map) + as.factor(.data$set))
facet = facet_grid(grouping, scales = "free")
point_rec = geom_point(alpha = 0.2)
line_rec = geom_line(data = marey, aes(x = physicalPosition/10^6, y = geneticPositioncM), colour = "black")
ribbon_rec = geom_ribbon(data = marey, aes(x = physicalPosition/10^6, y = geneticPositioncM, ymin = lowerGeneticPositioncM, ymax = upperGeneticPositioncM),
line_rec = geom_line(data = marey, aes(x = .data$physicalPosition/10^6, y = .data$geneticPositioncM), colour = "black")
ribbon_rec = geom_ribbon(data = marey, aes(x = .data$physicalPosition/10^6, y = .data$geneticPositioncM, ymin = .data$lowerGeneticPositioncM, ymax = .data$upperGeneticPositioncM),
fill = "darkorange", colour = "darkorange", alpha = 0.3)
}


if (nrow(marey) > 0) {
marey$vld = TRUE

p = ggplot2::ggplot(data = df, aes(x = phys/10^6, y = gen)) +
p = ggplot2::ggplot(data = df, aes(x = .data$phys/10^6, y = .data$gen)) +
point_rec +
line_rec +
ribbon_rec +
@@ -215,7 +216,7 @@ plot_comparative_marey = function(x, group = 'set + map') {
axis.text=element_text(colour="black"),
legend.position = "right")
} else {
p = ggplot2::ggplot(data = df, aes(x = phys/10^6, y = gen)) +
p = ggplot2::ggplot(data = df, aes(x = .data$phys/10^6, y = .data$gen)) +
point_rec +
facet +
labs(x = "Genomic position (Mb)", y = "Genetic distance (cM)", colour = "dataset") +
@@ -258,25 +259,25 @@ plot_comparative_recmap = function(x, group = 'set + map') {
x$lowerRecRate = x$lowerRecRate * 10^6

if (group == 'set') {
grouping = as.formula(~as.factor(set))
grouping = as.formula(~as.factor(.data$set))
facet = facet_wrap(grouping, scales = "free")
line_rec = geom_line(aes(group = map, color = map))
ribbon_rec = geom_ribbon(aes(x = point/10^6, ymin = lowerRecRate, ymax = upperRecRate, fill = map), alpha = 0.2)
line_rec = geom_line(aes(group = .data$map, color = .data$map))
ribbon_rec = geom_ribbon(aes(x = .data$point/10^6, ymin = .data$lowerRecRate, ymax = .data$upperRecRate, fill = .data$map), alpha = 0.2)
}
if (group == 'map') {
grouping = as.formula(~as.factor(map))
grouping = as.formula(~as.factor(.data$map))
facet = facet_wrap(grouping, scales = "free")
line_rec = geom_line(aes(group = set, color = set))
ribbon_rec = geom_ribbon(aes(x = point/10^6, ymin = lowerRecRate, ymax = upperRecRate, fill = set), alpha = 0.2)
line_rec = geom_line(aes(group = .data$set, color = .data$set))
ribbon_rec = geom_ribbon(aes(x = .data$point/10^6, ymin = .data$lowerRecRate, ymax = .data$upperRecRate, fill = .data$set), alpha = 0.2)
}
if (group == 'set + map') {
grouping = as.formula(~as.factor(map) + as.factor(set))
grouping = as.formula(~as.factor(.data$map) + as.factor(.data$set))
facet = facet_grid(grouping, scales = "free")
line_rec = geom_line()
ribbon_rec = geom_ribbon(aes(x = point/10^6, ymin = lowerRecRate, ymax = upperRecRate), fill = 'darkgray', alpha = 0.2)
ribbon_rec = geom_ribbon(aes(x = .data$point/10^6, ymin = .data$lowerRecRate, ymax = .data$upperRecRate), fill = 'darkgray', alpha = 0.2)
}

p = ggplot2::ggplot(data = x, aes(x = point/10^6, y = recRate)) +
p = ggplot2::ggplot(data = x, aes(x = .data$point/10^6, y = .data$recRate)) +
line_rec +
ribbon_rec +
# facet_grid(~as.factor(set)) +
@@ -352,20 +353,21 @@ subset_comparative_marey = function(x,

#' Summary of a `comparative_marey_map` object
#'
#' @param x a `comparative_marey_map` object to summarize
#' @param object a `comparative_marey_map` object to summarize
#' @param ... additional arguments
#'
#' @return a summary
#'
#' @method summary comparative_marey_map
#' @export
#'
summary.comparative_marey_map = function(x, ...) {
dataset = paste0(c(as.character(head(x$set, 3)), "..."))
n_maps = length(x$set)
length_linkage_map = unlist(lapply(x$data, function(x) max(x[[1]]$gen)))
length_genome_Mb = unlist(lapply(x$data, function(x) max(x[[1]]$phys)))
summary.comparative_marey_map = function(object, ...) {
dataset = paste0(c(as.character(head(object$set, 3)), "..."))
n_maps = length(object$set)
length_linkage_map = unlist(lapply(object$data, function(x) max(x[[1]]$gen)))
length_genome_Mb = unlist(lapply(object$data, function(x) max(x[[1]]$phys)))
length_genome_Mb = length_genome_Mb / 1000000
n_markers = unlist(lapply(x$data, function(x) length(x[[1]]$gen)))
n_markers = unlist(lapply(object$data, function(x) length(x[[1]]$gen)))
density_markers = n_markers / length_genome_Mb

cat("============== Summary of the comparative marey map ==============\n",
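
Most changes in this file replace bare column names inside `aes()` with the `.data` pronoun. A likely motivation, given the commit title, is the `R CMD check` note about "no visible binding for global variable" that bare names trigger inside package functions. The sketch below uses a made-up data frame to show that both spellings map the same values; inside a package, `.data` additionally needs `@importFrom ggplot2 .data`, as `R/lorenz.R` already declares.

```r
library(ggplot2)

# Made-up data, for illustration only.
toy <- data.frame(
  phys = c(1e6, 2e6, 3e6, 4e6),
  gen  = c(5, 12, 14, 20),
  set  = c("a", "a", "b", "b")
)

# Bare column names work interactively, but static code checks cannot
# tell that phys, gen and set are columns rather than global variables.
p_bare <- ggplot(toy, aes(x = phys / 10^6, y = gen, colour = set)) +
  geom_point()

# The .data pronoun makes the data-frame lookup explicit.
p_pronoun <- ggplot(toy, aes(x = .data$phys / 10^6, y = .data$gen,
                             colour = .data$set)) +
  geom_point()

# Both plots are built from the same x and y values.
same_x <- identical(layer_data(p_bare)$x, layer_data(p_pronoun)$x)
same_y <- identical(layer_data(p_bare)$y, layer_data(p_pronoun)$y)
```
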
8 changes: 4 additions & 4 deletions R/lorenz.R
@@ -15,10 +15,10 @@
#' @importFrom ggplot2 .data
#'
lorenz = function(x, return.plot = TRUE) {
if (class(x) == 'mareyMap') {
if (is(x, 'mareyMap')) {
df = x$recMap
} else {
if (class(x) == 'comparative_marey_map') {
if (is(x, 'comparative_marey_map')) {
df = comparative_recmap_to_dataframe(x)
}
}
@@ -62,10 +62,10 @@ lorenz = function(x, return.plot = TRUE) {
diagonal = data.frame(x = seq(0, 1, by = 0.01),
y = seq(0, 1, by = 0.01))

p = ggplot(data = out, aes(x = relativePhys, y = relativeGen, fill = as.factor(map), colour = as.factor(set))) +
p = ggplot(data = out, aes(x = .data$relativePhys, y = .data$relativeGen, fill = as.factor(.data$map), colour = as.factor(.data$set))) +
geom_line(alpha = 0.3) +
# facet_wrap(~ as.factor(set)) +
geom_line(data = diagonal, aes(x = x, y = y, fill = NA, colour = NA), color = "black") +
geom_line(data = diagonal, aes(x = .data$x, y = .data$y, fill = NA, colour = NA), color = "black") +
xlim(0, 1) +
ylim(0, 1) +
xlab("Proportion of genomic distance") +
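
The first hunk of this file swaps `class(x) == '...'` for `is(x, '...')`. The sketch below illustrates why the comparison form is fragile: `class()` can return several strings, so `==` yields a vector rather than a single `TRUE`/`FALSE`. The two-class toy object is an assumption for illustration; the actual class vector of the package's objects is not shown in this diff.

```r
# A toy S3 object carrying two classes (hypothetical, for illustration).
x <- structure(list(), class = c("comparative_marey_map", "list"))

# Element-wise comparison: one logical per class, so length 2 here.
# Passing this to if() is an error since R 4.2.0.
cmp <- class(x) == "comparative_marey_map"

# inherits() (base R) and is() (methods package) return a single logical.
by_inherits <- inherits(x, "comparative_marey_map")
by_is <- methods::is(x, "comparative_marey_map")
```

For plain S3 classes, `inherits()` is the base-R equivalent and avoids a dependency on the `methods` package.
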
1 change: 1 addition & 0 deletions R/summary.R
@@ -8,6 +8,7 @@
#'
#' @method summary mareyMap
#' @export
#'
summary.mareyMap = function(object, ...) {
dataset = levels(object$mareyMap$set)
nameChromosome = object$chromosomeName
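
Both `summary` methods in this commit take `object` as their first argument, which matches the base generic `summary(object, ...)`; `R CMD check` warns when a method's argument names differ from its generic's. A minimal sketch with a hypothetical `toy_map` class (not part of the package):

```r
# The method mirrors the generic's argument names: object, then `...`.
summary.toy_map <- function(object, ...) {
  cat("Number of maps:", length(object$set), "\n")
  invisible(object)
}

# Hypothetical object with a `set` slot, loosely modelled on the diff above.
toy <- structure(list(set = c("a", "b", "c")), class = "toy_map")
res <- summary(toy)
```
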
6 changes: 6 additions & 0 deletions README.md
@@ -1,5 +1,11 @@
# EasyMareyMap

![license](https://badgen.net/badge/license/GPL3.0/blue)
![release](https://badgen.net/badge/release/0.2.0/green?icon=github)
![r-workflow](https://github.com/github/docs/actions/workflows/r.yml/badge.svg)



Estimate local recombination rates with the Marey map method as described in [Brazier and Glémin 2022](https://doi.org/10.1371/journal.pgen.1010141).

This R package provides an easy command line solution for the Marey map method, which is described in more detail in the original MareyMap package. The original MareyMap package is designed for interactive graphical use only and does not provide scripting capabilities. It does not allow to process many maps at a time. Hence we re-implemented the method in a more machine-friendly approach, in order to use it for batch processing and reproducible scripts. See the original MareyMap package here [Rezvoy et al. 2007](https://academic.oup.com/bioinformatics/article-lookup/doi/10.1093/bioinformatics/btm315) and [Siberchicot et al. 2017](https://CRAN.R-project.org/package=MareyMap).
4 changes: 2 additions & 2 deletions man/brokenstick.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

7 changes: 7 additions & 0 deletions man/brokenstick_one_map.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 2 additions & 0 deletions man/comparative_recombination_maps.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion man/compute_stats_marey.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion man/plot_comparative_marey.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

6 changes: 4 additions & 2 deletions man/summary.comparative_marey_map.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

21 changes: 17 additions & 4 deletions vignettes/RecombinationMap.Rmd
@@ -13,22 +13,35 @@ knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
```


The `EasyMareyMap` package for R has been implemented to estimate recombination maps from genetic maps (or linkage maps) using the Marey map approach [@rezvoy_mareymap:_2007]. It is designed to provide easy access to the data (stored in slots within objects) and the functions via the command line, in order to maximize scriptability, reproducibility and scalability to large datasets. It offers a statistical framework to estimate recombination maps with the automatic calibration of hyperparameters and a bootstrap procedure to measure the uncertainty of our estimates. The method has been used and validated in [@brazierDiversityDeterminantsRecombination2022].
The `EasyMareyMap` package for R has been implemented to estimate recombination maps from genetic maps (or linkage maps) using the Marey map approach [@rezvoy_mareymap:_2007]. The original MareyMap package was designed for interactive graphical use only and does not provide scripting capabilities. It does not allow to process many maps at a time. See the original MareyMap package here [Rezvoy et al. 2007](https://academic.oup.com/bioinformatics/article-lookup/doi/10.1093/bioinformatics/btm315) and [Siberchicot et al. 2017](https://CRAN.R-project.org/package=MareyMap).


Our `EasyMareyMap` package is designed to provide easy access to the data (stored in slots within objects) and the functions via the command line, hence a more machine-friendly approach, in order to maximize scriptability, reproducibility and scalability to large datasets. It offers a statistical framework to estimate recombination maps with the automatic calibration of hyperparameters and a bootstrap procedure to measure the uncertainty of our estimates. The method has been used and validated in [@brazierDiversityDeterminantsRecombination2022].



Moreover, it offers a statistical framework to estimate recombination maps with the automatic calibration of hyperparameters and a bootstrap procedure to measure the uncertainty of our estimates, which is an advantage for robust data analyses. The method has been used and validated in [@brazierDiversityDeterminantsRecombination2022]. Our `EasyMareyMap` package adds these new features:
* automatic calibration of the smoothing parameter
* bootstraps to estimate a 95% confidence interval
* graphical zone selection to exclude a bunch of outlier markers
* a graphical and statistical framework for comparative genomics (compare Marey maps between species, populations or genomes)





The package is loaded with the following command.

```{r setup, message=FALSE, warning=FALSE}
library(ggplot2)
library(EasyMareyMap)
library(ggplot2)
```


# Create a MareyMap object
# Create a `MareyMap` object

You can import the example dataset based on linkage maps for the five chromosomes of the plant *Arabidopsis thaliana* [@serin_construction_2017].
