VSEARCH 2.4.2: Improved paired-end merging
torognes committed Mar 10, 2017
1 parent e5ef8a2 commit 31b6e7d
Showing 4 changed files with 21 additions and 27 deletions.
24 changes: 12 additions & 12 deletions README.md
@@ -35,9 +35,9 @@ In the example below, VSEARCH will identify sequences in the file database.fsa t
**Source distribution** To download the source distribution from a [release](https://github.com/torognes/vsearch/releases) and build the executable and the documentation, use the following commands:

```
-wget https://github.com/torognes/vsearch/archive/v2.4.1.tar.gz
-tar xzf v2.4.1.tar.gz
-cd vsearch-2.4.1
+wget https://github.com/torognes/vsearch/archive/v2.4.2.tar.gz
+tar xzf v2.4.2.tar.gz
+cd vsearch-2.4.2
./autogen.sh
./configure
make
@@ -68,33 +68,33 @@ Binary distributions are provided for x86-64 systems running GNU/Linux, macOS (v
Download the appropriate executable for your system using the following commands if you are using a Linux x86_64 system:

```sh
-wget https://github.com/torognes/vsearch/releases/download/v2.4.1/vsearch-2.4.1-linux-x86_64.tar.gz
-tar xzf vsearch-2.4.1-linux-x86_64.tar.gz
+wget https://github.com/torognes/vsearch/releases/download/v2.4.2/vsearch-2.4.2-linux-x86_64.tar.gz
+tar xzf vsearch-2.4.2-linux-x86_64.tar.gz
```

Or these commands if you are using a Linux ppc64le system:

```sh
-wget https://github.com/torognes/vsearch/releases/download/v2.4.1/vsearch-2.4.1-linux-ppc64le.tar.gz
-tar xzf vsearch-2.4.1-linux-ppc64le.tar.gz
+wget https://github.com/torognes/vsearch/releases/download/v2.4.2/vsearch-2.4.2-linux-ppc64le.tar.gz
+tar xzf vsearch-2.4.2-linux-ppc64le.tar.gz
```

Or these commands if you are using a Mac:

```sh
-wget https://github.com/torognes/vsearch/releases/download/v2.4.1/vsearch-2.4.1-macos-x86_64.tar.gz
-tar xzf vsearch-2.4.1-macos-x86_64.tar.gz
+wget https://github.com/torognes/vsearch/releases/download/v2.4.2/vsearch-2.4.2-macos-x86_64.tar.gz
+tar xzf vsearch-2.4.2-macos-x86_64.tar.gz
```

Or if you are using Windows, download and extract (unzip) the contents of this file:

```
-https://github.com/torognes/vsearch/releases/download/v2.4.1/vsearch-2.4.1-win-x86_64.zip
+https://github.com/torognes/vsearch/releases/download/v2.4.2/vsearch-2.4.2-win-x86_64.zip
```

-Linux and Mac: You will now have the binary distribution in a folder called `vsearch-2.4.1-linux-x86_64` or `vsearch-2.4.1-macos-x86_64` in which you will find three subfolders `bin`, `man` and `doc`. We recommend making a copy or a symbolic link to the vsearch binary `bin/vsearch` in a folder included in your `$PATH`, and a copy or a symbolic link to the vsearch man page `man/vsearch.1` in a folder included in your `$MANPATH`. The PDF version of the manual is available in `doc/vsearch_manual.pdf`.
+Linux and Mac: You will now have the binary distribution in a folder called `vsearch-2.4.2-linux-x86_64` or `vsearch-2.4.2-macos-x86_64` in which you will find three subfolders `bin`, `man` and `doc`. We recommend making a copy or a symbolic link to the vsearch binary `bin/vsearch` in a folder included in your `$PATH`, and a copy or a symbolic link to the vsearch man page `man/vsearch.1` in a folder included in your `$MANPATH`. The PDF version of the manual is available in `doc/vsearch_manual.pdf`.

-Windows: You will now have the binary distribution in a folder called `vsearch-2.4.1-win-x86_64`. The vsearch executable is called `vsearch.exe`. The manual in PDF format is called `vsearch_manual.pdf`.
+Windows: You will now have the binary distribution in a folder called `vsearch-2.4.2-win-x86_64`. The vsearch executable is called `vsearch.exe`. The manual in PDF format is called `vsearch_manual.pdf`.

**Documentation** The VSEARCH user's manual is available in the `man` folder in the form of a [man page](https://github.com/torognes/vsearch/blob/master/doc/vsearch.1). A pdf version (vsearch_manual.pdf) will be generated by `make`. To install the manpage manually, copy the `vsearch.1` file or a create a symbolic link to `vsearch.1` in a folder included in your `$MANPATH`. The manual in both formats is also available with the binary distribution. The manual in PDF form (vsearch_manual.pdf) is also attached to the latest [release](https://github.com/torognes/vsearch/releases).

2 changes: 1 addition & 1 deletion configure.ac
@@ -2,7 +2,7 @@
# Process this file with autoconf to produce a configure script.

AC_PREREQ([2.63])
-AC_INIT([vsearch], [2.4.1], [[email protected]])
+AC_INIT([vsearch], [2.4.2], [[email protected]])
AC_CANONICAL_TARGET
AM_INIT_AUTOMAKE([subdir-objects])
AC_LANG([C++])
9 changes: 7 additions & 2 deletions man/vsearch.1
@@ -1,5 +1,5 @@
.\" ============================================================================
-.TH vsearch 1 "March 1, 2017" "version 2.4.1" "USER COMMANDS"
+.TH vsearch 1 "March 10, 2017" "version 2.4.2" "USER COMMANDS"
.\" ============================================================================
.SH NAME
vsearch \(em chimera detection, clustering, dereplication and
@@ -1054,7 +1054,7 @@ with the \-\-fastq_maxns are also discarded (no limit by
default). Staggered reads are not merged unless the
\-\-fastq_allowmergestagger option is specified. The minimum length of
the overlap region between the reads may be specified with the
-\-\-minovlen option (default 10), and the overlap region may not
+\-\-minovlen option (default 16), and the overlap region may not
include more mismatches than specified with the \-\-maxdiffs option (5
by default), otherwise the read pair is discarded. The mimimum and
maximum length of the merged sequence may be specified with the
@@ -3058,6 +3058,11 @@ command in help text.
Fixed an overflow bug in fastq_stats and fastq_eestats affecting
analysis of very large FASTQ files. Fixed maximum memory usage
reporting on Windows.
+.TP
+.BR v2.4.2\~ "released March 10th, 2017"
+Default value for fastq_minovlen increased to 16 in accordance with
+help text and for compatibility with usearch. Minor changes for
+improved accuracy of paired-end read merging.
.RE
.LP
.\" ============================================================================
13 changes: 1 addition & 12 deletions src/mergepairs.cc
@@ -61,12 +61,11 @@
#include "vsearch.h"

#define INPUTCHUNKSIZE 10000
-#define SCOREMETHOD 2

/* scores */

const double alpha = 4.0;
-const double beta = -5.0;
+const double beta = -22.0;

/* static variables */

@@ -207,11 +206,7 @@ void precompute_qual()

p = 1.0 - px - py + px * py * 4.0 / 3.0;

-#if SCOREMETHOD == 2
match_score[x][y] = alpha * p + beta * (1.0 - p);
-#else
-match_score[x][y] = alpha * p;
-#endif

/* Mismatch */

@@ -221,11 +216,7 @@ void precompute_qual()

p = 1.0 - (px + py) / 3.0 + px * py * 4.0 / 9.0;

-#if SCOREMETHOD == 2
mism_score[x][y] = alpha * (1.0 - p) + beta * p;
-#else
-mism_score[x][y] = beta * p;
-#endif

}
}
@@ -462,7 +453,6 @@ double overlap_score(merge_data_t * ip,

int64_t optimize(merge_data_t * ip)
{
-// int64_t i1 = opt_fastq_minovlen;
int64_t i1 = 1;

i1 = MAX(i1, ip->fwd_trunc + ip->rev_trunc - opt_fastq_maxmergelen);
@@ -582,7 +572,6 @@ void process(merge_data_t * ip)
if (!skip)
ip->offset = optimize(ip);

-// if (ip->offset)
if (ip->offset >= opt_fastq_minovlen)
merge(ip);
}
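
To see what the change of `beta` from -5.0 to -22.0 does to the scoring tables, the two formulas kept in `precompute_qual()` can be evaluated on their own. The following is a minimal standalone sketch, not the vsearch source. The names `kAlpha`, `kBeta`, `error_prob`, `match_score_for` and `mism_score_for` are hypothetical, and the Phred conversion 10^(-q/10) is an assumption, since the hunks in this commit do not show how `px` and `py` are derived.

```cpp
#include <cmath>

// Constants as they stand after this commit (named alpha and beta in the diff).
const double kAlpha = 4.0;
const double kBeta = -22.0;

// ASSUMPTION: quality values are Phred-scaled, so a quality q corresponds
// to an error probability of 10^(-q/10). This conversion is not part of
// the diff and is used here only for illustration.
double error_prob(int q)
{
  return std::pow(10.0, -q / 10.0);
}

// Score for a position where the two reads show the same base.
// p follows the "match" formula kept in precompute_qual().
double match_score_for(double px, double py)
{
  double p = 1.0 - px - py + px * py * 4.0 / 3.0;
  return kAlpha * p + kBeta * (1.0 - p);
}

// Score for a position where the two reads show different bases.
// p follows the "mismatch" formula kept in precompute_qual().
double mism_score_for(double px, double py)
{
  double p = 1.0 - (px + py) / 3.0 + px * py * 4.0 / 9.0;
  return kAlpha * (1.0 - p) + kBeta * p;
}
```

With error-free bases (`px = py = 0`) the match score equals `kAlpha` (4.0) and the mismatch score equals `kBeta` (-22.0). A call such as `match_score_for(error_prob(30), error_prob(30))` gives the score for a matching pair of Q30 bases under the assumed conversion.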
