
Releases: HKU-BAL/Clair3

v1.0.0

06 Mar 14:52
  1. Added the Clair3 version number to the VCF header (#141).
  2. Fixed the numpy.int issue when using newer numpy versions (#165, PR contributor @Aaron Tyler).
  3. The new version converts all IUPAC bases to 'N' in both VCF and GVCF output; use --keep_iupac_bases to keep the IUPAC bases (#153).
  4. Added the options --use_longphase_for_intermediate_phasing, --use_whatshap_for_final_output_phasing, --use_longphase_for_final_output_phasing, and --use_whatshap_for_final_output_haplotagging to disambiguate between intermediate phasing and final-output phasing with either WhatsHap or LongPhase; the old options remain usable (#164).
  5. Fixed a shell-script interpreter selection problem when running Clair3 as a host user within a Docker container (#175).
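The new phasing options in item 4 can be combined in an ordinary run. A minimal sketch, assuming placeholder input paths and model directory (not taken from the release notes), that picks LongPhase for the intermediate phasing step and WhatsHap for phasing the final output:

```shell
# Sketch: selecting phasers explicitly with the v1.0.0 options.
# All paths below are placeholders; adjust to your data and model install.
CLAIR3_V1_CMD="run_clair3.sh \
  --bam_fn=sample.bam \
  --ref_fn=ref.fa \
  --threads=8 \
  --platform=ont \
  --model_path=/opt/models/ont \
  --output=./clair3_out \
  --use_longphase_for_intermediate_phasing \
  --use_whatshap_for_final_output_phasing"
# Print the assembled command; run it directly once Clair3 is installed.
echo "${CLAIR3_V1_CMD}"
```

The two steps are independent, so you can mix phasers freely, e.g. LongPhase for both steps by swapping in --use_longphase_for_final_output_phasing.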

v0.1-r12

20 Aug 03:21
2dd8b44
  1. CRAM input is supported (#117).
  2. Bumped dependency versions to Python 3.9 (#96), TensorFlow 2.8, Samtools 1.15.1, and WhatsHap 1.4.
  3. The VCF DP tag now shows raw coverage for both pileup and full-alignment calls (before r12, sub-sampled coverage was shown for pileup calls if average DP > 144) (#128).
  4. Fixed Illumina representation unification out-of-range error in training (#110).
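The CRAM support in item 1 goes through the same --bam_fn option as BAM input. A hedged sketch with placeholder paths; note that CRAM stores reads relative to a reference, so --ref_fn must be the same reference the CRAM was created with:

```shell
# Sketch: calling variants from a CRAM file (placeholder paths).
# CRAM decoding requires the reference the file was written against,
# so --ref_fn must match the CRAM's reference.
CLAIR3_CRAM_CMD="run_clair3.sh \
  --bam_fn=sample.cram \
  --ref_fn=ref.fa \
  --threads=8 \
  --platform=hifi \
  --model_path=/opt/models/hifi \
  --output=./clair3_cram_out"
echo "${CLAIR3_CRAM_CMD}"
```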

v0.1-r11.1

13 Jun 03:28
Pre-release

Users, please ignore this pre-release. It exists only so that Zenodo could pull and archive Clair3 for the first time.

v0.1-r11

04 Apr 10:16
e8c2e50
  1. Variant calling is ~2.5x faster than v0.1-r10, tested with ONT Q20 data, with feature generation for both pileup and full-alignment calling now implemented in C (co-contributors @cjw85, @ftostevin-ont, @EpiSlim).
  2. Added the lightning-fast LongPhase as an option for phasing; enable it with --longphase_for_phasing. The new option is disabled by default to align with the default behavior of previous versions, but we recommend enabling it when calling human variants with ≥20x long reads.
  3. Added the --min_coverage and --min_mq options (#83).
  4. Added the --min_contig_size option to skip calling variants in short contigs when using a genome assembly as input.
  5. Read haplotagging after phasing and before full-alignment calling is now integrated into full-alignment calling, avoiding an intermediate BAM file.
  6. Supported the .csi BAM index for large references (#90). For more speedup details, please check the Notes on r11.
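Items 2 and 6 can be sketched together: index the BAM with a .csi index (which `samtools index -c` produces and which r11 accepts for large references) and enable LongPhase. Paths are placeholders, not from the release notes:

```shell
# Sketch: .csi index (r11, #90) plus LongPhase phasing (r11).
# samtools index -c writes a .csi index instead of the default .bai,
# which is required for references with contigs longer than ~512 Mb.
INDEX_CMD="samtools index -c sample.bam"
CLAIR3_R11_CMD="run_clair3.sh \
  --bam_fn=sample.bam \
  --ref_fn=ref.fa \
  --threads=8 \
  --platform=ont \
  --model_path=/opt/models/ont \
  --output=./clair3_out \
  --longphase_for_phasing"
# Print the two-step pipeline; run it once samtools and Clair3 are installed.
echo "${INDEX_CMD} && ${CLAIR3_R11_CMD}"
```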

The two minor patches of v0.1-r11 are included in all installation options.

v0.1-r10

13 Jan 12:43
  1. Added a new ONT Guppy5 model (r941_prom_sup_g5014); benchmarking results are linked from the release page. This sup model is also applicable to reads basecalled with the hac and fast modes. The old r941_prom_sup_g506 model, which was fine-tuned from the Guppy3/4 model, is now obsolete.

  2. Added the --var_pct_phasing option to control the percentage of top-ranked heterozygous pileup variants used for WhatsHap phasing.

v0.1-r9

01 Dec 12:05

Added the --enable_long_indel option to output indel variant calls >50bp (#64). Benchmarking results are linked from the release page.
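The flag is additive, so it can simply be appended to an existing command line. A sketch with placeholder paths:

```shell
# Sketch: appending --enable_long_indel (r9) so indel calls >50bp are
# emitted. All paths are placeholders.
BASE_CMD="run_clair3.sh --bam_fn=sample.bam --ref_fn=ref.fa \
  --threads=8 --platform=ont --model_path=/opt/models/ont --output=./out"
LONG_INDEL_CMD="${BASE_CMD} --enable_long_indel"
echo "${LONG_INDEL_CMD}"
```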

v0.1-r8

11 Nov 13:59
  1. Added the --enable_phasing option, which adds a step after Clair3 calling to output variants phased by WhatsHap (#63).
  2. Fixed unexpected program termination on successful runs.
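The option in item 1 is a single extra flag on a normal run; the phased VCF is produced as an additional output after calling finishes. A sketch, with placeholder paths:

```shell
# Sketch: --enable_phasing (r8) runs WhatsHap on the called variants as a
# post-processing step. Paths are placeholders.
PHASED_CMD="run_clair3.sh --bam_fn=sample.bam --ref_fn=ref.fa \
  --threads=8 --platform=ont --model_path=/opt/models/ont \
  --output=./out --enable_phasing"
echo "${PHASED_CMD}"
```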

v0.1-r7

19 Oct 09:10
6cd8994
  1. Increased var_pct_full in ONT mode from 0.3 to 0.7. The indel F1-score increased by ~0.2%, but calling a ~50x ONT dataset took ~30 minutes longer.
  2. Expanded fall-through to the next most likely variant when the network prediction has insufficient read coverage (#53, commit 09a7d18, contributor @ftostevin-ont); accuracy improved on complex indels.
  3. Streamlined the pileup and full-alignment training workflows, reducing disk-space demand in model training (#55, commit 09a7d18, contributor @ftostevin-ont).
  4. Added a mini_epochs option to Train.py; performance improved slightly when training a model for ONT Q20 data using mini-epochs (#60, contributor @ftostevin-ont).
  5. Massively reduced disk-space demand when outputting GVCF. GVCF intermediate files are now compressed with lz4, making them five times smaller with little speed penalty.
  6. Added --remove_intermediate_dir to remove intermediate files as soon as they are no longer needed (#48).
  7. Renamed ONT pre-trained models with Medaka's naming convention.
  8. Fixed training data spilling over to validation data (#57).

v0.1-r6

04 Sep 13:47
ab47f45
  1. Reduced the memory footprint at the SortVcf stage (#45).
  2. Reduced the ulimit -n (number of files simultaneously opened) requirement (#45, #47).
  3. Added a Clair3-Illumina package to bioconda (#42).
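Item 2 concerns the per-process open-file limit, which you can inspect before a run. A sketch for checking it; the 4096 threshold below is illustrative, not a documented Clair3 requirement:

```shell
# Check the open-file limit that the ulimit -n requirement (r6) refers to.
# Clair3 opens many intermediate files in parallel; r6 reduced how many
# it needs open at once.
OPEN_FILE_LIMIT=$(ulimit -n)
echo "current open-file limit: ${OPEN_FILE_LIMIT}"
# 4096 is an illustrative threshold, not an official minimum.
if [ "${OPEN_FILE_LIMIT}" != "unlimited" ] && [ "${OPEN_FILE_LIMIT}" -lt 4096 ]; then
  echo "consider raising the limit for this session: ulimit -n 4096"
fi
```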

v0.1-r5

19 Jul 15:11
  1. Modified the data generator in model training to avoid memory exhaustion and unexpected segmentation faults in TensorFlow (contributor @ftostevin-ont).
  2. Simplified the Dockerfile workflow to reuse container caching (contributor @amblina).
  3. Fixed ALT output for reference calls (contributor @wdecoster).
  4. Fixed a bug in multi-allelic AF computation (AF of [ACGT]Del variants was wrong before r5).
  5. Added AD tag to the GVCF output.
  6. Added the --call_snp_only option to call SNPs only (#40).
  7. Added pileup and full-alignment output validity checks to avoid workflow crashes (#32, #38).
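Item 6's flag restricts the output to SNPs. A sketch with placeholder paths, using the Illumina platform as an arbitrary example:

```shell
# Sketch: restricting output to SNPs with --call_snp_only (r5).
# Paths are placeholders; run the printed command once Clair3 is installed.
SNP_ONLY_CMD="run_clair3.sh --bam_fn=sample.bam --ref_fn=ref.fa \
  --threads=8 --platform=ilmn --model_path=/opt/models/ilmn \
  --output=./out --call_snp_only"
echo "${SNP_ONLY_CMD}"
```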