Merge pull request #25 from marbl/develop
v0.8.3 Release
alexsweeten authored Jun 11, 2024
2 parents 4e4a492 + 32e9d64 commit cc2e3a1
Showing 6 changed files with 135 additions and 72 deletions.
41 changes: 22 additions & 19 deletions README.md
@@ -1,5 +1,6 @@
![](images/logo.png)

- [Cite](#cite)
- [About](#about)
- [Installation](#installation)
- [Usage](#usage)
@@ -14,11 +15,20 @@
- [Sample run - comparing two sequences](#sample-run---comparing-two-sequences)
- [Questions](#questions)
- [Known Issues](#known-issues)
- [Cite](#cite)


## Cite

Alexander P. Sweeten, Michael C. Schatz, Adam M. Phillippy, ModDotPlot - Rapid and interactive visualization of complex repeats
bioRxiv 2024.04.15.589623; doi: https://doi.org/10.1101/2024.04.15.589623

If you use ModDotPlot for your research, please cite our software!

---

## About

ModDotPlot is a novel dot plot visualization tool, similar to [StainedGlass](https://mrvollger.github.io/StainedGlass/). ModDotPlot utilizes modimizers to compute the [Containment Index](https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1841-x) between pairwise combinations of genomic intervals, and rapidly approximates their Average Nucleotide Identity. This significantly reduces the computational time required to produce these plots, enough to view multiple layers of resolution in real time!
ModDotPlot is a dot plot visualization tool designed for large sequences and whole genomes. ModDotPlot outputs an identity heatmap similar to [StainedGlass](https://mrvollger.github.io/StainedGlass/) by rapidly approximating the Average Nucleotide Identity between pairwise combinations of genomic intervals. This significantly reduces the computational time required to produce these plots, enough to view multiple layers of resolution in real time!

![](images/demo.gif)

@@ -53,8 +63,6 @@ Finally, confirm that the installation was installed correctly by running `moddo
| | | | (_) | (_| | | |__| | (_) | |_ | | | | (_) | |_
|_| |_|\___/ \__,_| |_____/ \___/ \__| |_| |_|\___/ \__|
v0.8.2
usage: moddotplot [-h] {interactive,static} ...
ModDotPlot: Visualization of Complex Repeat Structures
@@ -79,7 +87,7 @@ ModDotPlot must be run either in `interactive` mode, or `static` mode:
moddotplot interactive <ARGS>
```

This will launch a [Dash application](https://plotly.com/dash/) on your machine's localhost. Open any web browser and go to `http://127.0.0.1:<PORT_NUMBER>` to view the interactive plot. Running `Ctrl+C` on the command line will exit the Dash application. The default port number used by Dash is `8050`, but this can be customized using the `--port` command (see [interactive mode commands](#interactive-mode-commands) for further info).
This will launch a [Dash application](https://plotly.com/dash/) on your machine's localhost. Open any web browser and go to `http://127.0.0.1:<PORT_NUMBER>` to view the interactive plot (this should happen automatically, but depending on your environment you might need to copy and paste this URL into your web browser). Running `Ctrl+C` on the command line will exit the Dash application. The default port number used by Dash is `8050`, but this can be customized using the `--port` command (see [interactive mode commands](#interactive-mode-commands) for further info).

### Static Mode

@@ -217,6 +225,14 @@ List of custom colors in hexcode format can be entered sequentially, mapped from

Add custom identity threshold breakpoints. Note that the number of breakpoints must be equal to the number of colors + 1.

`-t / --axes-ticks <list of ints>`

Custom tickmarks for x and y axis. Values outside of the axes-limits will not be shown.

`-a / --axes-limits <int>`

Change axis limits for x and y axis. Useful for comparing multiple plots, allowing them to stay in scale.

`--bin-freq <bool>`

By default, histograms are evenly spaced based on the number of colors and the identity threshold. Select this argument to bin based on the frequency of observed identity values.
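The two new static-mode options above can be illustrated with a minimal, standalone `argparse` sketch. This is not ModDotPlot's own parser: `build_parser` and `visible_ticks` are hypothetical names, parsing ticks as `int` follows the README's `<list of ints>` wording rather than the parser code, and the filter is only one plausible reading of "values outside of the axes-limits will not be shown".

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for the static-mode parser; only the two new flags.
    parser = argparse.ArgumentParser(prog="axes-sketch")
    parser.add_argument(
        "-a", "--axes-limits", default=None, type=float,
        help="Upper limit shared by the x and y axes.",
    )
    parser.add_argument(
        "-t", "--axes-ticks", default=None, nargs="+", type=int,
        help="Tick positions for the x and y axes.",
    )
    return parser


def visible_ticks(ticks, limit):
    # Assumption: a tick is shown only if it lies within [0, limit].
    if ticks is None:
        return []
    if limit is None:
        return list(ticks)
    return [t for t in ticks if 0 <= t <= limit]


args = build_parser().parse_args(
    ["-a", "3000000", "-t", "0", "1000000", "2000000", "4000000"]
)
shown = visible_ticks(args.axes_ticks, args.axes_limits)
print(shown)
```

Passing the same `--axes-limits` value to several runs is what keeps their plots on a common scale, which is the use case the option description names.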
@@ -235,8 +251,6 @@ $ moddotplot interactive -f sequences/Chr1_cen.fa
| | | | (_) | (_| | | |__| | (_) | |_ | | | | (_) | |_
|_| |_|\___/ \__,_| |_____/ \___/ \__| |_| |_|\___/ \__|
v0.8.2
Running ModDotPlot in interactive mode
Retrieving k-mers from Chr1:14000000-18000000....
@@ -323,8 +337,6 @@ $ moddotplot static -c config/config.json
| | | | (_) | (_| | | |__| | (_) | |_ | | | | (_) | |_
|_| |_|\___/ \__,_| |_____/ \___/ \__| |_| |_|\___/ \__|
v0.8.2
Running ModDotPlot in static mode
Retrieving k-mers from Chr1:14M-18M....
@@ -362,7 +374,7 @@ Chr1_cen_plots/Chr1:14M-18M_TRI.png, Chr1_cen_plots/Chr1:14M-18M_TRI.pdf, Chr1_c
ModDotPlot can produce an a vs. b style dotplot for each pairwise combination of input sequences. Use the `--compare` command line argument to include these plots. When running `--compare` in interactive mode, a dropdown menu will appear, allowing the user to switch between self-identity and pairwise plots. Note that a maximum of two sequences are allowed in interactive mode. If you want to skip the creation of self-identity plots, you can use `--compare-only`:

```
moddotplot interactive -f sequences/chr14_segment.fa sequences/chr21_segment.fa --compare-only
moddotplot interactive -f sequences/chr15_segment.fa sequences/chr21_segment.fa --compare-only
```

---
@@ -382,12 +394,3 @@ For bug reports or general usage questions, please raise a GitHub issue, or emai
- If you encounter an error with the following traceback: `rv = reductor(4) TypeError: cannot pickle 'generator' object`, this means that you have a newer version of Plotnine that is incompatible with ModDotPlot. Please uninstall plotnine and reinstall version 0.12.4: `pip install plotnine==0.12.4`.

- In interactive mode, comparing sequences of two sizes will lead to errors in zooming for the larger sequence. I plan to fix this in v0.9.0.

---


## Cite

Alexander P. Sweeten, Michael C. Schatz, Adam M. Phillippy, ModDotPlot - Rapid and interactive visualization of complex repeats
bioRxiv 2024.04.15.589623; doi: https://doi.org/10.1101/2024.04.15.589623

2 changes: 1 addition & 1 deletion pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"

[project]
name = "ModDotPlot"
version = "0.8.2"
version = "0.8.3"
requires-python = ">= 3.7"
dependencies = [
"pysam",
2 changes: 1 addition & 1 deletion src/moddotplot/const.py
@@ -1,4 +1,4 @@
VERSION = "0.8.2"
VERSION = "0.8.3"
COLS = [
"#query_name",
"query_start",
32 changes: 18 additions & 14 deletions src/moddotplot/estimate_identity.py
@@ -128,6 +128,7 @@ def partitionOverlaps(
try:
kmer_list.append(lst[delta_start_index:delta_end_index])
except Exception as e:
print("test")
print(e)
kmer_list.append(lst[delta_start_index:seq_len])
counter += win
@@ -369,8 +370,7 @@ def pairwiseContainmentMatrix(
Returns:
np.ndarray: An identity matrix containing containment values.
"""
# X is the larger, y is the smaller
n = len(mod_set_x)
n = max(len(mod_set_y), len(mod_set_x))
progress_thresholds = round(n / 77)

if not supress_progress:
@@ -382,20 +382,24 @@ def pairwiseContainmentMatrix(
if w % progress_thresholds == 0:
printProgressBar(w, n, prefix="Progress:", suffix="Complete", length=40)
for q in range(n):
containment_matrix[w, q] = (
binomial_distance(
containment_neighbors(
mod_set_x[q],
mod_set_y[w],
mod_set_x_neighbors[q],
mod_set_y_neighbors[w],
identity,
try:
containment_matrix[w, q] = (
binomial_distance(
containment_neighbors(
mod_set_x[q],
mod_set_y[w],
mod_set_x_neighbors[q],
mod_set_y_neighbors[w],
identity,
k,
),
k,
),
k,
)
* 100.0
)
* 100.0
)
# Band-aid solution for sequences that are too small.
except IndexError as e:
pass

if not supress_progress:
printProgressBar(
52 changes: 34 additions & 18 deletions src/moddotplot/moddotplot.py
@@ -276,14 +276,6 @@ def get_parser():
"--width", default=9, type=float, nargs="+", help="Plot width."
)

static_parser.add_argument(
"--xaxis",
default=None,
type=float,
nargs="+",
help="Change x axis for self identity plots. Default is length of the sequence, in mbp.",
)

static_parser.add_argument("--dpi", default=600, type=int, help="Plot dpi.")

# TODO: Create list of accepted colors.
@@ -316,6 +308,22 @@ def get_parser():
help="Introduce custom color thresholds. Must be between identity threshold and 100.",
)

static_parser.add_argument(
"-a",
"--axes-limits",
default=None,
type=float,
help="Change x and y axis limits for self identity plots. Default is length of the sequence. Can't be shorter than length of sequence.",
)

static_parser.add_argument(
"-t",
"--axes-ticks",
default=None,
nargs="+",
help="Tick labels to include in x and y axis for custom plots.",
)

static_parser.add_argument(
"--bin-freq",
action="store_true",
@@ -402,13 +410,14 @@ def main():
args.no_plot = config.get("no_plot", args.no_plot)
args.no_hist = config.get("no_hist", args.no_hist)
args.width = config.get("width", args.width)
args.xaxis = config.get("xaxis", args.xaxis)
args.axes_limits = config.get("axes_limits", args.axes_limits)
args.dpi = config.get("dpi", args.dpi)
args.palette = config.get("palette", args.palette)
args.palette_orientation = config.get(
"palette_orientation", args.palette_orientation
)
args.colors = config.get("color", args.colors)
args.axes_ticks = config.get("axes_ticks", args.axes_ticks)
args.breakpoints = config.get("breakpoints", args.breakpoints)
args.bin_freq = config.get("bin_freq", args.bin_freq)

@@ -463,11 +472,12 @@ def main():
width=args.width,
dpi=args.dpi,
is_freq=args.bin_freq,
xlim=None, # TODO: Get xlim working
xlim=args.axes_limits,
custom_colors=args.colors,
custom_breakpoints=args.breakpoints,
from_file=df,
is_pairwise=False,
axes_labels=args.axes_ticks,
)
# Case 2: Pairwise bed file
if len(pairwise_id_scores) > 1:
@@ -482,11 +492,12 @@ def main():
width=args.width,
dpi=args.dpi,
is_freq=args.bin_freq,
xlim=None, # TODO: Get xlim working
xlim=args.axes_limits, # TODO: Get xlim working
custom_colors=args.colors,
custom_breakpoints=args.breakpoints,
from_file=df,
is_pairwise=True,
axes_labels=args.axes_ticks,
)
# Exit once all bed files have been iterated through
sys.exit(0)
@@ -818,9 +829,12 @@ def main():

if win < args.modimizer:
args.modimizer = win
"""raise ValueError(
"Window size must be greater than or equal to the modimizer sketch size"
)"""
if win < 10:
print(f"Error: sequence too small for analysis.\n")
print(
f"ModDotPlot requires a minimum window size of 10. Sequences less than 10Kbp will not work with ModDotPlot under normal resolution. We recommend rerunning ModDotPlot with --r {math.ceil(seq_length / 10)}.\n"
)
sys.exit(0)

seq_sparsity = round(win / args.modimizer)
if seq_sparsity <= args.modimizer:
@@ -830,11 +844,11 @@ def main():
expectation = round(win / seq_sparsity)
xaxis = 0
width = 0
if isinstance(args.xaxis, int) or args.xaxis == None:
"""if isinstance(args.xaxis, int) or args.xaxis == None:
xaxis = args.xaxis
else:
assert len(args.xaxis) == len(sequences)
xaxis = args.xaxis[i]
xaxis = args.xaxis[i]"""
if isinstance(args.width, int):
width = args.width
else:
@@ -888,11 +902,12 @@ def main():
width=width,
dpi=args.dpi,
is_freq=args.bin_freq,
xlim=xaxis,
xlim=args.axes_limits,
custom_colors=args.colors,
custom_breakpoints=args.breakpoints,
from_file=None,
is_pairwise=False,
axes_labels=args.axes_ticks,
)

# -----------COMPUTE COMAPRATIVE PLOTS-----------
@@ -991,11 +1006,12 @@ def main():
width=None,
dpi=args.dpi,
is_freq=args.bin_freq,
xlim=None,
xlim=args.axes_limits,
custom_colors=args.colors,
custom_breakpoints=args.breakpoints,
from_file=None,
is_pairwise=True,
axes_labels=args.axes_ticks,
)


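The minimum-window check added to `main()` in this commit can be sketched in isolation as follows. This is a rough illustration, not the project's code: `check_window` is a hypothetical helper, and the assumption that the window size equals the sequence length integer-divided by a resolution of 1000 is inferred from the "10Kbp under normal resolution" wording in the error message, not confirmed by the diff. The suggested resolution reuses the `math.ceil(seq_length / 10)` expression from the commit.

```python
import math

MIN_WINDOW = 10  # minimum window size named in the commit's error message


def check_window(seq_length: int, resolution: int = 1000):
    """Return (window, suggested_resolution); the suggestion is None if the window is large enough."""
    # Assumption: window size is the sequence length integer-divided by the resolution.
    win = seq_length // resolution
    if win < MIN_WINDOW:
        # Same expression the commit prints as its recommended resolution value.
        return win, math.ceil(seq_length / 10)
    return win, None


print(check_window(5_000))
print(check_window(4_000_000))
```

Returning the suggestion instead of printing and exiting keeps the sketch easy to test; the commit itself prints the message and calls `sys.exit(0)`.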