Releases: poseidon-framework/poseidon-hs
Release v1.6.2.1
This is a bigger release with various new features and improvements. It is technically breaking, because a minor, redundant argument of genoconvert was removed.
Writing support for gzipped genotype data
Reading support for gzipped data was already added in v1.5.7.0. This release now introduces the complementary writing feature for EIGENSTRAT and PLINK files in genoconvert and forge. Both commands get a new option -z, which creates gzipped output.
-z,--zip Should the resulting genotype- and snp-files be
gzipped?
Note that this feature does not overwrite files that are already available, but still considers them when updating a package's POSEIDON.yml file. -z is also usable with unpackaged genotype data (-p, --onlyGeno).
Future versions of the Poseidon package schema will formally specify this feature.
Bibliography information in list and the Web-API
The list subcommand now supports a new view (next to --packages, --groups and --individuals): --bibliography provides a tabular overview of the publications in a package repository.
$ trident list -d 2010_RasmussenNature --bibliography
...
.---------------------.--------------------------------------------------------------.-----------------------.------.---------------------------.---------------.
| BibKey | Title | Author | Year | DOI | Nr of samples |
:=====================:==============================================================:=======================:======:===========================:===============:
| AADR | The Allen Ancient DNA Resource (AADR): A curated compendium… | Swapan Mallick et al. | 2023 | 10.1101/2023.04.06.535797 | 1 |
| AADRv424 | The Allen Ancient DNA Resource (AADR): A curated compendium… | S Mallick and D Reich | 2023 | 10.7910/DVN/FFIDCW | 1 |
| RasmussenNature2010 | Ancient human genome sequence of an extinct Palaeo-Eskimo | M Rasmussen et al. | 2010 | 10.1038/nature08835 | 1 |
'---------------------'--------------------------------------------------------------'-----------------------'------'---------------------------'---------------'
Additional fields from the .bib file can be added to this table with -b|--bibField ... (just as -j|--jannoColumn ... for --individuals). --fullBib adds everything that is available (just as --fullJanno). As usual, tab-separated output can be requested with --raw for derived analyses on the command line.
Correspondingly, the Web-API supports a new endpoint /bibliography to serve bibliography information via HTTP in JSON format. The optional query argument additionalJannoColumns=... can be used to request extra fields here.
Remove empty .janno columns with rectify
The rectify subcommand was upgraded with a first option to manipulate .janno files in one or multiple packages: --jannoRemoveEmpty. It removes empty columns from .janno files, that is, columns that only feature empty strings or n/a values.
--jannoRemoveEmpty Reorder the .janno file and remove empty colums.
Remember to pair this option with --checksumJanno to
also update the checksum.
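The column filter described above can be pictured with a minimal Python sketch. This is not trident's Haskell implementation: the function name remove_empty_columns is hypothetical, the table is assumed to be tab-separated with equally long rows, and the reordering mentioned in the help text is left out.

```python
import csv
import io

# Values that count as "empty" according to the description above.
EMPTY = {"", "n/a"}


def remove_empty_columns(janno_text: str) -> str:
    """Drop columns of a tab-separated table whose values are all empty or n/a."""
    rows = list(csv.reader(io.StringIO(janno_text), delimiter="\t"))
    header, body = rows[0], rows[1:]
    # Keep a column if at least one row holds a real value in it.
    keep = [
        i for i in range(len(header))
        if any(row[i].strip() not in EMPTY for row in body)
    ]
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for row in rows:
        writer.writerow([row[i] for i in keep])
    return out.getvalue()
```

In this sketch a column survives as soon as a single sample has a value in it, so only columns that carry no information at all are dropped.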
With this change came a rewrite of the way trident fills empty fields with n/a when writing .janno and .ssf files. This behaviour now also affects the output of list!
Removed redundant --onlyGeno from genoconvert
We realized that --onlyGeno in genoconvert had the same effect as -o if a different output directory is chosen. We therefore decided to remove this argument and improve the documentation of -o:
-o,--outPackagePath DIR Path for the converted genotype files to be written
to. If a path is provided, only the converted
genotype files are written out, with no change of the
original package. If no path is provided, genotype
files will be converted in-place, including a change
in the POSEIDON.yml file to yield an updated valid
package (default: Nothing)
Bug fixes and technical changes
We fixed two bugs that broke the long-form genotype data input option (with --genoFile + --snpFile + ...). They were accidentally introduced with the recent interface changes for v1.5.7.0. This input interface should now be fully functional again.
We finally switched to a new compiler version (GHC 9.6.6) and a new stackage resolver version (lts-22.43). This required some minor adjustments in the server code, but should not have any user-facing consequences.
Release v1.5.7.3
This patch release fixes three minor bugs, some of which were accidentally introduced with the big changes in v1.5.7.0.
- Fixed a bug in the .janno reading triggered by trailing à characters.
- Reverted unspecified behaviour: 0 is again allowed in the Nr_SNPs .janno column.
- Fixed a bug introduced in v1.5.5.0, where command line input using the -p option would not behave correctly if the input files have multiple file endings, separated by dots.
Release v1.5.7.0
Warning
On 2024/11/06 we realized that this release includes a breaking change that is not documented below.
The command line input interface for unpackaged genotype data was modified from previously --inFormat EIGENSTRAT|PLINK + --genoFile + --snpFile + --indFile to now --genoFile + --snpFile + --indFile and --bedFile + --bimFile + --famFile. So the format selection with the --inFormat argument was removed and replaced with separate file selectors for EIGENSTRAT and PLINK data.
This affects all trident subcommands that allow reading of unpackaged genotype data, namely init, forge, genoconvert and validate.
This release further improves .janno parsing error messages and adds reading support for gzipped PLINK (.bed and .bim) and EIGENSTRAT (.geno and .snp) files. We also added (experimental) support for reading VCF files.
Better .janno error messages
Working with Poseidon packages generally involves reading and validation of .janno files. trident parses them carefully and reports structural issues that compromise their machine-readability. So far the error reports generally only included the line and type of an offending entry. This sometimes made it hard to determine exactly which column is broken. For this release we introduced individual data types for all specified .janno columns, which allows more precise error messages.
To demonstrate this we modified an existing .janno file in the Poseidon community archive (2012_MeyerScience) and broke some of its columns. We added non-UTF8 encoded characters in the Relation_Note column of line 2, a trailing ; in the Coverage_on_Target_SNPs column of line 3, and a leading x to the Latitude column of line 7.
Here is how these issues were previously reported and how they are shown now:
[Error] Can't read sample in 2012_MeyerScience/2012_MeyerScience2.csv in line 2:
-parse error (Failed reading: conversion error: Cannot decode byte '\x80': Data.Text.Encoding: Invalid UTF-8 stream)
+parse error (Failed reading: conversion error: Cannot decode byte '\x80': Data.Text.Encoding: Invalid UTF-8 stream in column Relation_Note)
[Error] Can't read sample in 2012_MeyerScience/2012_MeyerScience2.csv in line 3:
-parse error in one column (expected data type: Double, broken value: "32.12;", problematic characters: ";")
+parse error (Failed reading: conversion error: Coverage_on_Target_SNPs can not be converted to Double, because of a trailing ";")
[Error] Can't read sample in 2012_MeyerScience/2012_MeyerScience2.csv in line 7:
-parse error (Failed reading: conversion error: expected Double, got "x18.93726" (Failed reading: takeWhile1))
+parse error (Failed reading: conversion error: Latitude can not be converted to Double because input does not start with a digit)
The error messages now include the relevant column name and are more concrete and easy to understand.
Reading support for gzipped genotype data
Although not yet part of the Poseidon 2.7.1 standard, Poseidon packages can now contain gzipped genotype files. Specifically, for EIGENSTRAT-formatted genotype data, the genotype matrix file (.geno) and the snp-list file (.snp) can now also be zipped. This strictly requires file endings with .gz, so .geno.gz and .snp.gz, respectively. Similarly, for PLINK-formatted genotype data, we now also accept .bed.gz and .bim.gz. Any such files with the gz file ending are assumed to be gzipped, and are decoded on the fly using stream-processing. Gzipped and unzipped files can also be mixed within the same package.
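As a rough illustration of the "file ending decides, decode on the fly" rule, here is a minimal Python sketch. It is not how trident (written in Haskell) implements this. The helper name read_lines is hypothetical, and the sketch only covers text files such as .snp or .snp.gz.

```python
import gzip
from pathlib import Path
from typing import Iterator


def read_lines(path: str) -> Iterator[str]:
    """Yield lines of a text file, decompressing on the fly if it ends in .gz."""
    # Only the file ending decides; the file content itself is not inspected.
    opener = gzip.open if Path(path).suffix == ".gz" else open
    with opener(path, "rt", encoding="utf-8") as handle:
        # Iterating over the handle streams the file line by line
        # instead of loading it into memory as a whole.
        for line in handle:
            yield line.rstrip("\n")
```

Because callers only see an iterator of lines, zipped and unzipped files can be mixed freely, which mirrors the behaviour described above.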
For commands that support the --genoOne option (init, forge and genoconvert), note that we make some assumptions, which are summarised in the help text for the option:
-p,--genoOne FILE One of the input genotype data files. Expects .bed,
.bed.gz, .bim, .bim.gz or .fam for PLINK, or .geno,
.geno.gz, .snp, .snp.gz or .ind for EIGENSTRAT. The
other files must be in the same directory and must
have the same base name. If a gzipped file is given,
it is assumed that the file pairs (.geno.gz, .snp.gz)
or (.bim.gz, .bed.gz) are both zipped, but not the
.fam or .ind file. If a .ind or .fam file is given,
it is assumed that none of the file triples is
zipped. For VCF please see option --vcfFile
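The assumptions in this help text can be spelled out in a small Python sketch. The function companion_files and the FORMATS table are hypothetical names, and this is not trident's actual code. One case the help text leaves open, an unzipped .geno, .snp, .bed or .bim file, is treated here as "nothing is zipped", which is an assumption.

```python
from pathlib import Path

# Hypothetical lookup table: format -> (genotype, snp, individual) file endings.
FORMATS = {
    "EIGENSTRAT": (".geno", ".snp", ".ind"),
    "PLINK": (".bed", ".bim", ".fam"),
}


def companion_files(geno_one: str) -> tuple[str, str, str, str]:
    """Return (format, genotype file, snp file, individual file) for one input file."""
    path = Path(geno_one)
    name = path.name
    zipped = name.endswith(".gz")
    if zipped:
        name = name[: -len(".gz")]
    for fmt, (geno, snp, ind) in FORMATS.items():
        for ending in (geno, snp, ind):
            if name.endswith(ending):
                if zipped and ending == ind:
                    raise ValueError(f"{ending} files are expected to be unzipped")
                # Strip only the last ending, so base names with dots stay intact.
                base = str(path.parent / name[: -len(ending)])
                # A zipped input implies a zipped genotype/snp pair,
                # while the individual file always stays unzipped.
                gz = ".gz" if zipped else ""
                return fmt, base + geno + gz, base + snp + gz, base + ind
    raise ValueError(f"unrecognised genotype file ending: {geno_one}")
```

For example, companion_files("my.data.geno.gz") resolves to my.data.geno.gz, my.data.snp.gz and my.data.ind, keeping the dot inside the base name.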
At this point, genoconvert and forge do not support writing of gzipped files. This will be added in the future.
VCF support for genotype data
Although not yet part of the Poseidon 2.7.1 standard, Poseidon packages can now contain VCF (Variant Call Format) files as genotype data, optionally gzipped. In contrast to EIGENSTRAT and PLINK format, which require triples of files, the VCF format requires just one file with ending .vcf or .vcf.gz. VCF files contain sample names, but no information about genetic sex or group names. This information is usually provided in .janno files, so there is no loss of information in Poseidon packages. For trident init, which constructs a minimal .janno file from the genotype file, we set the Genetic_Sex column to "U", and the Group_Name column to "unknown".
The VCF file format is very flexible and can encode a large amount of information (see https://samtools.github.io/hts-specs/VCFv4.2.pdf). We do not consider our parsing of VCF files to be complete. The feature is for now experimental, since future users may encounter valid VCF files that cause parsing errors in edge cases. Do not hesitate to file an issue in such a case: https://github.com/poseidon-framework/poseidon-hs/issues.
At this point, genoconvert and forge do not support writing of VCF files. This will be added in the future.
Release v1.5.4.0
This bigger release adds a number of useful features to trident, some of them long requested. The highlights are ordered output for forge, a way to preserve key information if forge is applied to a singular source package, a new Web-API option to return the content of all available .janno columns, and better error messages for common trident issues.
Order forge output with --ordered
The order of samples in a Poseidon package created with trident forge depends on the order in which the relevant source packages are discovered by trident (e.g. when it crawls for packages in the -d base directories) and then the sample order within these packages. This mechanism did not allow for any convenient way to manually set the output order.
v1.5.4.0 adds a new option --ordered, which causes trident to output the resulting package with samples ordered according to the selection in -f or --forgeFile. This works through an alternative, slower sample selection algorithm that loops through the list of entities and checks for each entity which samples it adds or removes respectively from the final selection.
For simple, positive selection, packages, groups and samples are added as expected. Negative selection removes samples from the list again. If an entity is selected twice via positive selection, then its first occurrence is considered for the ordering.
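The loop described above can be sketched in a few lines of Python. This is an illustration of the idea, not trident's Haskell implementation. Sample, Step and ordered_selection are hypothetical names, and entities are assumed to be already parsed into an include flag, a field name and a value.

```python
from typing import NamedTuple


class Sample(NamedTuple):
    package: str
    group: str
    name: str


class Step(NamedTuple):
    include: bool  # True for positive, False for negative selection
    field: str     # "package", "group" or "name" (a field of Sample)
    value: str


def ordered_selection(samples: list[Sample], steps: list[Step]) -> list[Sample]:
    """Walk through the entities in order and add or remove matching samples."""
    selected: list[Sample] = []
    for step in steps:
        matches = [s for s in samples if getattr(s, step.field) == step.value]
        if step.include:
            for sample in matches:
                # A sample that is already selected keeps its first position.
                if sample not in selected:
                    selected.append(sample)
        else:
            selected = [s for s in selected if s not in matches]
    return selected
```

Each entity triggers a scan over all samples, which is why an approach like this is slower than a single filtering pass, but it makes the output order follow the selection directly.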
Preserve the source package in forge with --preservePyml
For the specific task of subsetting a singular, existing Poseidon package it can be useful to preserve some fields of the POSEIDON.yml file of the source package, as well as supplementary information in the README.md and the CHANGELOG.md file. These are typically discarded by forge, but can now be copied over to the output package with the new --preservePyml output mode. Naturally this only works with a single source package!
--preservePyml specifically preserves the following POSEIDON.yml fields:
description
contributor
packageVersion
lastModified
readmeFile
changelogFile
Note that this does not include the package title, which can be easily set to be identical to the source with -n or -o if it is desired. The poseidonVersion field is also not copied, because trident can only ever produce output packages with the latest Poseidon schema version.
While implementing this we clearly separated the different forge output modes (--onlyGeno, --minimal, --preservePyml and the default) and made them mutually exclusive. We did so to avoid an increasingly complex set of interactions between them for the future.
One particular application of --preservePyml is the reordering of samples in an existing Poseidon package MyPac with the new --ordered flag. We suggest the following workflow for this application:
- Generate a --forgeFile with the desired order of the samples in MyPac. This can be done manually or with any suitable tool. Here is an example, where we employ qjanno to generate a forge selection so that the samples are ordered alphabetically by their Poseidon_ID:
qjanno "SELECT '<'||Poseidon_ID||'>' FROM d(MyPac) ORDER BY Poseidon_ID" --raw --noOutHeader > myOrder.txt
- Use trident forge with --ordered and --preservePyml to create the package with the specified order:
trident forge -d MyPac --forgeFile myOrder.txt -o MyPac2 --ordered --preservePyml
- Apply trident rectify to increment the package version number and document the reordering:
trident rectify -d MyPac2 --packageVersion Minor --logText "reordered the samples alphabetically by Poseidon_ID"
MyPac2 then acts as a stand-in replacement for MyPac that only differs in the order of samples (and maybe the order of variables/fields in the POSEIDON.yml, .janno, .ssf or .bib files). This workflow is not as convenient as in-place reordering would be -- but much safer.
Request all .janno columns in list and the Web-API
trident list --individuals gives access to per-sample information for Poseidon packages on the command line. With the -j option arbitrary additional columns from the .janno files can be appended to the output. Here, for example, the Country and the Genetic_Sex columns:
trident list -d 2010_RasmussenNature --individuals -j "Country" -j "Genetic_Sex"
.------------.---------------------.----------------------.----------------.-----------.-----------.-------------.
| Individual | Group | Package | PackageVersion | Is Latest | Country | Genetic_Sex |
:============:=====================:======================:================:===========:===========:=============:
| Inuk.SG | Greenland_Saqqaq.SG | 2010_RasmussenNature | 2.1.1 | True | Greenland | M |
'------------'---------------------'----------------------'----------------'-----------'-----------'-------------'
v1.5.4.0 adds a --fullJanno flag to request all columns at once, without having to list them individually with many -j arguments.
This convenience feature was also added to the Web-API, where it can be triggered with ?additionalJannoColumns=ALL on the /individuals endpoint:
https://server.poseidon-adna.org/individuals?additionalJannoColumns=ALL
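For scripted access, the request URL above can be assembled programmatically. The following Python sketch only builds the URL and sends no request. The helper api_url is a hypothetical name, and nothing is assumed here about the structure of the JSON response.

```python
from urllib.parse import urlencode

# Server address as it appears in the example URL above.
BASE_URL = "https://server.poseidon-adna.org"


def api_url(endpoint: str, **query: str) -> str:
    """Build a Web-API URL from an endpoint name and optional query arguments."""
    url = f"{BASE_URL}/{endpoint.lstrip('/')}"
    # urlencode takes care of escaping special characters in the values.
    return f"{url}?{urlencode(query)}" if query else url


print(api_url("individuals", additionalJannoColumns="ALL"))
```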
Better error messages
In previous trident versions some common error messages were not well rendered on the command line. This concerned particularly errors when parsing command line input, the POSEIDON.yml file or genotype data. We applied multiple changes here to improve the cli output.
The behaviour of the global trident option --errLength was also changed. It now only truncates genotype data-related messages, but does so as well if these are raised on the [Warning] log level. This should make the previously often illegible trident output upon broken genotype data more readable.
Release v1.5.0.1
This very minor release only affects the static trident executables produced for every release.
It introduces a distinction between pre-built X64 and ARM64 executables for macOS, where changes in the main processor architecture have recently rendered old builds invalid for new systems and vice versa.
That means the executable trident-macOS will henceforward no longer exist, but instead the executables trident-macOS-X64 and trident-macOS-ARM64.
In the past we have not explicitly documented changes in the compilation pipeline - v1.5.0.0, for example, came with a major overhaul of the pipeline - but in this case a small version bump seems to be in order to announce the split in available artefacts.
Release v1.5.0.0
This is a minor, but technically breaking release. It removes the example contributor Josiah Carberry from new packages created by trident init and trident forge.
Previously every package created by init or forge included an example entry in the contributor field of the POSEIDON.yml file:
- name: Josiah Carberry
email: [email protected]
orcid: 0000-0002-1825-0097
This served the purpose of reminding users to actually set a contributor and giving an example how to do so. To simplify scripting with Poseidon packages we now remove this slightly gimmicky default.
To encourage setting the contributor field we instead introduce a reading/validation warning in case the contributor field is empty:
[Warning] Contributor missing in POSEIDON.yml file of package 2010_RasmussenNature-2.1.1
Release v1.4.1.0
This release adds an entirely new subcommand to merge two .janno files (jannocoalesce) and improves the error messages for broken .janno files.
Merging .janno files with jannocoalesce
The need for a tool to combine the information of two .janno files arose in the Poseidon ecosystem as we started to conceptualize the Poseidon Minotaur Archive. This archive will be populated by paper-wise Poseidon packages for which the genotype data was regenerated through the Minotaur workflow (work in progress). We plan to reprocess various packages that are already in the Poseidon Community Archive and for these packages we want to copy e.g. spatiotemporal information from the already available .janno files. jannocoalesce is the answer to this specific need, but can also be useful for various other applications.
It generally works by reading a source .janno file with -s|--sourceFile (or all .janno files in a -d|--baseDir) and a target .janno file with -t|--targetFile. It then merges these files by a key column, which can be selected with --sourceKey and --targetKey. The default for both of these key columns is the Poseidon_ID. In case the entries in the key columns slightly and systematically differ, e.g. because the Poseidon_IDs in either have a special suffix (for example _SG), the --stripIdRegex option can strip these with a regular expression, so that the keys match.
jannocoalesce generally attempts to fill all empty cells in the target .janno file with information from the source. --includeColumns and --excludeColumns restrict this to specific columns. In some cases it may be desirable to not just fill empty fields in the target, but overwrite the information already there with the -f|--force option. If the target file should be preserved, then the output can be directed to a new output .janno file with -o|--outFile.
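The merge logic described in this section can be summarised in a short Python sketch. It is a simplified model, not the trident implementation. The function coalesce and its parameters are hypothetical stand-ins for the command line options, rows are plain dictionaries, and two points are assumptions: key columns are never overwritten, and empty source values never replace target values, even with force.

```python
import re

EMPTY = ("", "n/a")


def coalesce(source_rows, target_rows, source_key="Poseidon_ID",
             target_key="Poseidon_ID", strip_id_regex=None,
             include=None, exclude=(), force=False):
    """Fill cells of the target rows with values from key-matched source rows."""
    def norm(key):
        # Remove e.g. a systematic suffix before comparing keys.
        return re.sub(strip_id_regex, "", key) if strip_id_regex else key

    source_by_key = {norm(row[source_key]): row for row in source_rows}
    merged = []
    for target in target_rows:
        result = dict(target)
        source = source_by_key.get(norm(target[target_key]), {})
        for column, value in source.items():
            if column in (source_key, target_key) or column in exclude:
                continue
            if include is not None and column not in include:
                continue
            if value in EMPTY:
                continue
            # Without force only empty target cells are filled.
            if force or result.get(column, "") in EMPTY:
                result[column] = value
        merged.append(result)
    return merged


# Toy rows with made-up values, for illustration only.
source = [{"Poseidon_ID": "Ind1_SG", "Country": "Greenland", "Latitude": "69.1"}]
target = [{"Poseidon_ID": "Ind1", "Country": "n/a", "Latitude": "70.0"}]
print(coalesce(source, target, strip_id_regex=r"_SG$"))
```

With the default settings only the empty Country cell is filled in this toy example. Passing force=True would also replace the existing Latitude value.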
Better error messages for broken .janno files
.janno file validation is a core feature of trident. With this release we try to improve the error messages for two common situations:
- Broken number fields. This can happen if some text or wrong character ends up in a number field.
So far the error messages for this case have been pretty technical. Here, for example, is the message if an integer field is filled with 430;, where the integer number 430 is accidentally written with a trailing ;:
parse error (Failed reading: conversion error: expected Int, got "430;" (incomplete field parse, leftover: [59]))
The new error message is clearer:
parse error in one column (expected data type: Int, broken value: "430;", problematic characters: ";")
- Inconsistent Date_*, Contamination_* and Relation_* columns. These sets of columns have to be cross-consistent, following a logic that is especially complex for the Date_* fields (see here).
So far any inconsistency was reported with this generic error message:
The Date_* columns are not consistent
Now we include far more precise messages, like e.g.:
Date_Type is not "C14", but either Date_C14_Uncal_BP or Date_C14_Uncal_BP_Err are not empty.
This should simplify tedious .janno file debugging in the future.
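To illustrate how a message like the new one for broken number fields can be produced, here is a minimal Python sketch. The function parse_int_field is hypothetical and only mimics the wording of the message shown above. trident itself is written in Haskell and parses fields differently.

```python
def parse_int_field(value: str) -> int:
    """Parse an integer field or raise an error naming the offending characters."""
    try:
        return int(value)
    except ValueError:
        # Collect every distinct character that is not a plain digit.
        bad = "".join(sorted({c for c in value if c not in "0123456789"}))
        raise ValueError(
            "parse error in one column (expected data type: Int, "
            f'broken value: "{value}", problematic characters: "{bad}")'
        ) from None
```

Reporting the broken value together with the offending characters points directly at what needs to be fixed in the file.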
Release v1.4.0.3
This small release fixes a performance issue related to finding the latest version of all packages. The bug had severe detrimental effects on forge and fetch, which are now resolved.
We used this opportunity to switch to a new GHC version and new versions of a lot of dependencies for building trident.
Release v1.4.0.2
This release finally fully enables handling multiple Poseidon package versions with trident. It includes a significant overhaul of the selection language in forge and fetch with major changes in its implementation and, as a consequence, multiple (subtle, but strictly breaking) changes in its semantics.
Handling multiple package versions
The trident subcommands fetch, forge, genoconvert, list, rectify, survey and validate now by default consider all versions of each Poseidon package in the given base directories. Previously all of them only considered the latest versions. If this old behaviour is desired now, it can be enabled with the flag --onlyLatest for the subcommands genoconvert, list, rectify, survey and validate. fetch and forge now generally consider all package versions (if we ignore the selection language semantics) and summarize continues to consider only the latest ones.
Changes to the selection language
In the forge- and fetch-selection language it is now possible to specify which version of a Poseidon package should be forged/fetched by appending the version number after a minus: e.g. *2010_RasmussenNature-2.1.1*. This also works for the more verbose and precise syntax to describe individuals, e.g. <2010_RasmussenNature-2.1.1:Greenland_Saqqaq.SG:Inuk.SG>.
While implementing this change, we also reworked the entity selection logic. It now adheres to the following rules:
Inclusion queries
*Pac1*: Select all individuals in the latest version of package "Pac1"
*Pac1-1.0.1*: Select all individuals in package "Pac1" with version "1.0.1"
Group1: Select all individuals associated with "Group1" in all latest versions of all packages
<Ind1>: Select the individual named "Ind1", searching in all latest packages.
<Pac1:Group1:Ind1>: Select the individual named "Ind1" associated with "Group1" in the latest version of package "Pac1"
<Pac1-1.0.1:Group1:Ind1>: Select the individual named "Ind1" associated with "Group1" in the package "Pac1" with version "1.0.1"
Exclusion queries
-*Pac1*: Remove all individuals in all versions of package "Pac1"
-*Pac1-1.0.1*: Remove only individuals in package "Pac1" with version "1.0.1" (but leave other versions in)
-Group1: Remove all individuals associated with "Group1" in all versions of all packages (not just the latest)
-<Ind1>: Remove all individuals named "Ind1" in all versions of all packages (not just the latest).
-<Pac1:Group1:Ind1>: Remove the individual named "Ind1" associated with "Group1", searching in all versions of package "Pac1"
-<Pac1-1.0.1:Group1:Ind1>: Remove the individual named "Ind1" associated with "Group1", but only if they are in "Pac1" with version "1.0.1"
Missing (or mis-spelled) entities in a selection-set lead to errors now.
If the forge entity list starts with a negative entity, or if the entity list is empty, forge will still implicitly assume you want to include all individuals from only the latest packages found in the baseDirs.
Likewise, trident fetch --downloadAll considers only latest versions now.
The specific individual selection syntax (with -<Pac1-1.0.1:Group1:Ind1>) no longer performs automatic duplicate resolution. If there is another <Ind1> somewhere within the selected entities, then this will cause an error that has to be resolved manually by adjusting the selection.
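The entity syntax listed in this section is regular enough to be parsed with a few string checks. The following Python sketch shows one possible reading of it. Entity, split_version and parse_entity are hypothetical names, the three-part version pattern is an assumption based on the examples above, and trident's real parser may differ in edge cases (for instance group names that start with a minus).

```python
import re
from typing import NamedTuple, Optional


class Entity(NamedTuple):
    negative: bool            # True for exclusion queries (leading "-")
    kind: str                 # "package", "group" or "individual"
    package: Optional[str]
    version: Optional[str]
    group: Optional[str]
    individual: Optional[str]


# Assumption: versions look like 1.0.1 and follow the last minus in the name.
VERSIONED = re.compile(r"^(?P<name>.+)-(?P<version>\d+\.\d+\.\d+)$")


def split_version(package: str):
    match = VERSIONED.match(package)
    return (match["name"], match["version"]) if match else (package, None)


def parse_entity(text: str) -> Entity:
    """Parse one selection entry such as *Pac1-1.0.1* or -<Pac1:Group1:Ind1>."""
    negative = text.startswith("-")
    body = text[1:] if negative else text
    if len(body) > 2 and body.startswith("*") and body.endswith("*"):
        name, version = split_version(body[1:-1])
        return Entity(negative, "package", name, version, None, None)
    if body.startswith("<") and body.endswith(">"):
        parts = body[1:-1].split(":")
        if len(parts) == 1:
            return Entity(negative, "individual", None, None, None, parts[0])
        if len(parts) == 3:
            name, version = split_version(parts[0])
            return Entity(negative, "individual", name, version, parts[1], parts[2])
        raise ValueError(f"malformed individual entity: {text}")
    # Everything else is read as a group name.
    return Entity(negative, "group", None, None, body, None)
```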
Minor additional changes
- The Web API and the list subcommand now return an extra, boolean field/column isLatest to point out if an entity (individual, group, package) is from the latest package version.
- list now also returns column headers with the --raw flag. If they are not desired, then they have to be filtered out manually on the command line (e.g. with trident list ... | tail -n+2).
- Duplicate individuals in a package collection no longer lead to errors on their own. Instead, only a selection for forge (and externally also in xerxes) that results in multiple individuals in the selection will lead to errors. validate will also fail by default, unless --ignoreDuplicates is set.
Release v1.3.0.4
This is a significant release with a breaking change, multiple new features and a number of minor fixes and improvements.
rectify replaces update
The subcommand update was renamed to rectify to better express its purpose. The name update suggested that this command could effectively migrate packages from one Poseidon version to the next, which was never the case. Structural and semantic changes to the package always have to be performed by the user through other means, usually by manually editing the respective package files. rectify only helps to adjust a number of parameters (mostly in the POSEIDON.yml file) after the changes have been applied: It updates checksums, iterates version numbers, adds contributors and appends logging information to CHANGELOG files. Despite this limitation it is still a valuable tool, especially for the management of large package archives, where structural changes are often applied to many packages at once, all requiring "rectification" in the end.
rectify doesn't only introduce a different name, it also features a different interface. While update was a catch-all procedure with an opinionated, default behaviour that could partially be adjusted with various flags, rectify follows a much more transparent opt-in philosophy. The new interface lets users choose precisely which aspects of a package should be updated.
rectify
. They have to be requested explicitly!
Here is the new command line documentation of rectify:
Usage: trident rectify (-d|--baseDir DIR) [--ignorePoseidonVersion]
[--poseidonVersion ?.?.?]
[--packageVersion VPART [--logText STRING]]
[--checksumAll | [--checksumGeno] [--checksumJanno]
[--checksumSSF] [--checksumBib]]
[--newContributors DSL]
Adjust POSEIDON.yml files automatically to package changes
Available options:
-h,--help Show this help text
-d,--baseDir DIR A base directory to search for Poseidon packages.
--ignorePoseidonVersion Read packages even if their poseidonVersion is not
compatible with trident.
--poseidonVersion ?.?.? Poseidon version the packages should be updated to:
e.g. "2.5.3".
--packageVersion VPART Part of the package version number in the
POSEIDON.yml file that should be updated: Major,
Minor or Patch (see https://semver.org).
--logText STRING Log text for this version in the CHANGELOG file.
--checksumAll Update all checksums.
--checksumGeno Update genotype data checksums.
--checksumJanno Update .janno file checksum.
--checksumSSF Update .ssf file checksum
--checksumBib Update .bib file checksum.
--newContributors DSL Contributors to add to the POSEIDON.yml file in the
form "[Firstname Lastname](Email address);...".
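The --packageVersion option takes the part of the version number to update. Assuming the usual semantic versioning rules referenced in the help text (lower parts reset to zero when a higher part is incremented), the effect can be sketched in Python as follows. bump_version is a hypothetical helper, not part of trident.

```python
def bump_version(version: str, part: str) -> str:
    """Increment the Major, Minor or Patch part of a version like 2.1.1."""
    major, minor, patch = (int(x) for x in version.split("."))
    if part == "Major":
        return f"{major + 1}.0.0"
    if part == "Minor":
        return f"{major}.{minor + 1}.0"
    if part == "Patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown version part: {part}")


print(bump_version("2.1.1", "Minor"))  # prints 2.2.0
```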
serve now provides different package archives and list and fetch can access them
trident serve, the subcommand behind the server providing the Poseidon Web API, can now host packages from multiple named archives in parallel. This works through a modified -d interface on the command line and a new option ?archive=... in the Web API. The client commands fetch and list can request information and package download from these different archives with a new option --archive. If --archive (or ?archive=... in the http request) is not given, then the server falls back to a default archive (the first in -d).
See the Poseidon public archive and Web API documentation for the concrete consequences of this new feature.
validate can now check individual files
The validate subcommand is no longer confined to validating entire poseidon packages. It can still very much do so with -d, where -- just as before -- a number of optional flags can be used to control the exact behaviour. This release, in fact, adds the new options --ignorePoseidonVersion and --ignoreChecksums here. But validate can now also read, parse and thus check individual files: POSEIDON.yml files, genotype data files, .janno files, .ssf files or .bib files. This is tremendously useful for building packages step-by-step, e.g. in automatic pipelines.
Here is the new command line documentation of validate:
Usage: trident validate ((-d|--baseDir DIR) [--ignoreGeno] [--fullGeno]
[--ignoreDuplicates] [-c|--ignoreChecksums]
[--ignorePoseidonVersion] |
--pyml FILE | (-p|--genoOne FILE) | --inFormat FORMAT
--genoFile FILE --snpFile FILE --indFile FILE |
--janno FILE | --ssf FILE | --bib FILE) [--noExitCode]
Check Poseidon packages or package components for structural correctness
Available options:
-h,--help Show this help text
-d,--baseDir DIR A base directory to search for Poseidon packages.
--ignoreGeno Ignore snp and geno file.
--fullGeno Test parsing of all SNPs (by default only the first
100 SNPs are probed).
--ignoreDuplicates Do not stop on duplicated individual names in the
package collection.
-c,--ignoreChecksums Whether to ignore checksums. Useful for speedup in
debugging.
--ignorePoseidonVersion Read packages even if their poseidonVersion is not
compatible with trident.
--pyml FILE Path to a POSEIDON.yml file.
-p,--genoOne FILE One of the input genotype data files. Expects .bed,
.bim or .fam for PLINK and .geno, .snp or .ind for
EIGENSTRAT. The other files must be in the same
directory and must have the same base name.
--inFormat FORMAT The format of the input genotype data: EIGENSTRAT or
PLINK. Only necessary for data input with --genoFile
+ --snpFile + --indFile.
--genoFile FILE Path to the input geno file.
--snpFile FILE Path to the input snp file.
--indFile FILE Path to the input ind file.
--janno FILE Path to a .janno file.
--ssf FILE Path to a .ssf file.
--bib FILE Path to a .bib file.
--noExitCode Do not produce an explicit exit code.
Other, minor changes
- Fixed the behaviour of forge when combining .bib files. Publication duplicates are now properly removed upon merging and the output is alphabetically sorted.
- Added a global option --debug, which is short for --logMode VerboseLog.
- Refactored summarise to make the resulting counts more accurate. Some variables in the output table have been renamed as well.
- Fixed the behaviour of chronicle when updating a chronicle file (with -u): The lastModified field is now only touched if there is actually a change in the package list.
- Some cleaning of the general trident command line documentation: Added meaningful meta variables to all arguments.
- Shortened the default command line output of fetch to make it more readable.
- Slightly better error handling for failed http requests in fetch and list --remote.