Merge pull request #32 from pepkit/dev
v0.1.6
nleroy917 authored May 16, 2022
2 parents 564b730 + e7489f3 commit 0d141f3
Showing 77 changed files with 13,840 additions and 440 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/black.yml
@@ -1,6 +1,6 @@
name: Lint

on: [push, pull_request]
on: [pull_request]

jobs:
lint:
21 changes: 21 additions & 0 deletions .github/workflows/run-codecov.yml
@@ -0,0 +1,21 @@
name: Run codecov

on:
pull_request:
branches: [master]

jobs:
pytest:
runs-on: ${{ matrix.os }}
strategy:
matrix:
python-version: [3.9]
os: [ubuntu-latest]

steps:
- uses: actions/checkout@v2
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v2
with:
file: ./coverage.xml
name: py-${{ matrix.python-version }}-${{ matrix.os }}
12 changes: 2 additions & 10 deletions .github/workflows/run-pytest.yml
@@ -1,8 +1,6 @@
name: Run pytests

on:
push:
branches: [master, dev]
pull_request:
branches: [master, dev]

@@ -11,8 +9,8 @@ jobs:
runs-on: ${{ matrix.os }}
strategy:
matrix:
python-version: [3.6, 3.7, 3.8, 3.9]
os: [ubuntu-latest, macos-latest]
python-version: ["3.6", "3.9", "3.10"]
os: [ubuntu-latest]

steps:
- uses: actions/checkout@v2
@@ -33,9 +31,3 @@ jobs:

- name: Run pytest tests
run: pytest tests -x -vv --cov=./ --cov-report=xml

- name: Upload coverage to Codecov
uses: codecov/codecov-action@v1
with:
file: ./coverage.xml
name: py-${{ matrix.python-version }}-${{ matrix.os }}
25 changes: 17 additions & 8 deletions .gitignore
@@ -14,16 +14,16 @@ doc/build/*
# generic ignore list:
*.lst

# Compiled source
# Compiled source
*.com
*.class
*.dll
*.exe
*.o
*.so
*.pyc
# Packages

# Packages
# it's better to unpack these files and commit the raw source
# git has its own built in compression methods
*.7z
@@ -34,13 +34,13 @@ doc/build/*
*.rar
*.tar
*.zip
# Logs and databases

# Logs and databases
*.log
*.sql
*.sqlite
# OS generated files

# OS generated files
.DS_Store
.DS_Store?
._*
@@ -49,7 +49,7 @@ doc/build/*
ehthumbs.db
Thumbs.db

# Gedit temporary files
# Gedit temporary files
*~

# libreoffice lock files:
@@ -61,6 +61,7 @@ open_pipelines/

# IDE-specific items
.idea/
.vscode/

# pytest-related
.cache/
@@ -81,3 +82,11 @@ peppy.egg-info/

# mkdocs website
site


webeido/webeido/uploads

# virtual env's
.env
env
venv
6 changes: 3 additions & 3 deletions .pre-commit-config.yaml
@@ -1,6 +1,6 @@
repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v3.4.0
rev: v4.0.1
hooks:
- id: trailing-whitespace
- id: check-yaml
@@ -9,12 +9,12 @@ repos:
- id: trailing-whitespace

- repo: https://github.com/PyCQA/isort
rev: 5.7.0
rev: 5.9.1
hooks:
- id: isort
args: ["--profile", "black"]

- repo: https://github.com/psf/black
rev: 20.8b1
rev: 21.6b0
hooks:
- id: black
2 changes: 1 addition & 1 deletion MANIFEST.in
@@ -1,3 +1,3 @@
include requirements/*
include README.md
include LICENSE.txt
include LICENSE.txt
1 change: 0 additions & 1 deletion README.md
@@ -6,4 +6,3 @@
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)

[PEP](http://pepkit.github.io) validation tool based on [jsonschema](https://github.com/Julian/jsonschema). See [documentation](http://eido.databio.org) for usage.

2 changes: 1 addition & 1 deletion codecov.yml
@@ -2,4 +2,4 @@ ignore:
- "*/argparser.py"
- "*/cli.py"
- "*/__main__.py"
- "setup.py"
- "setup.py"
18 changes: 11 additions & 7 deletions docs/README.md
@@ -6,15 +6,21 @@

## Introduction

Eido is a validation and format conversion tool for [PEPs](http://pepkit.github.io). It provides validation based on [JSON Schema](https://github.com/Julian/jsonschema). The PEP specification defines a formal structure for organizing project and sample metadata, and eido provides a way to validate whether data complies with that specification. Eido extends the JSON Schema vocabulary with PEP-specific features, like required input files. Eido also provides a command-line interface to convert a PEP input into a variety of outputs using [eido filters](filters.md), and includes the ability to write [custom filters](writing-a-filter.md).
Eido is used to 1) validate or 2) convert the format of sample metadata. Sample metadata is stored according to the standard [PEP specification](https://pep.databio.org). For validation, eido is based on [JSON Schema](https://json-schema.org) and extends it with new features, like required input files. You can [write your own schema](writing-a-schema.md) for your pipeline and use eido to validate sample metadata. For conversion, [eido filters](filters.md) convert sample metadata into any output format; you can also write [custom filters](writing-a-filter.md).

## Why do we need eido?

A PEP consists of metadata describing a set of items called *samples*. The metadata is divided into two components: 1) sample-specific attributes; and 2) project attributes, which apply to all samples. A PEP follows a [formal specification](http://pep.databio.org) for formatting and organizing this data. Projects that follow the PEP specification can be read by a variety of PEP-compatible tools, which may require specific sample or project attributes. Eido is used to validate these specific required attributes.
Data-intensive bioinformatics projects often include metadata describing a set of samples. When it comes to handling such sample metadata, there are two common challenges that eido solves:

[JSON Schema](https://json-schema.org/) is a vocabulary that allows you to annotate and validate JSON documents. It's great for validating JSON documents, but alone it cannot validate a PEP, which has powerful portability features that go beyond a simple JSON document, so we require additional capability to validate it. Eido extends JSON Schema to add this capability, along with other features for validating sample metadata listed below.
<img src="img/validation.svg" style="float:right; width:220px; margin-left:50px">

## PEP-specific validation features
- **Validation**. Tool authors use eido to specify and describe required input sample attributes. Input sample attributes are described with a schema, and eido validates the sample metadata to ensure it satisfies the tool's needs. Eido uses [JSON Schema](https://json-schema.org/), which annotates and validates JSON. JSON Schema alone is great for validating JSON, but bioinformatics sample metadata is more complicated, so eido provides additional capabilities and features tailored to bioinformatics projects, listed below.

<img src="img/conversion.svg" style="float:right; width:220px; margin-left:50px">

- **Format conversion**. Tools often require sample metadata in a specific format. Eido filters take metadata in standard PEP format and convert it to any desired output format. Filters can be either built-in or custom. This allows a single sample metadata source to be used for multiple downstream analyses.
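
To make the validation idea concrete, here is a minimal sketch in plain Python of checking samples for required attributes. It is not eido's API: the function name `find_missing_attributes`, the attribute names, and the sample data are all hypothetical, and eido itself describes requirements in a schema rather than in a Python list.

```python
from typing import Any, Dict, List


def find_missing_attributes(
    samples: List[Dict[str, Any]], required: List[str]
) -> Dict[str, List[str]]:
    """Map each sample name to the required attributes it lacks."""
    problems: Dict[str, List[str]] = {}
    for sample in samples:
        # Treat absent keys and empty values alike as missing.
        missing = [attr for attr in required if sample.get(attr) in (None, "")]
        if missing:
            problems[sample.get("sample_name", "<unnamed>")] = missing
    return problems


# Hypothetical sample metadata; the second sample lacks an input file.
samples = [
    {"sample_name": "s1", "genome": "hg38", "read1": "s1_R1.fastq.gz"},
    {"sample_name": "s2", "genome": "hg38"},
]
report = find_missing_attributes(samples, required=["genome", "read1"])
print(report)
```

A schema-based validator generalizes this check: the list of required attributes lives in a shareable schema file, so a tool author can publish it once and any project can be checked against it.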

## Eido validation features

An eido schema is written using the JSON Schema vocabulary, plus a few additional features:

@@ -32,8 +38,6 @@ An eido schema is written using the JSON Schema vocabulary, plus a few additiona

---

## What does 'eido' mean?
## Why the name 'eido'?

*Eidos* is a Greek term meaning *form*, *essence*, or *type* (see Plato's [Theory of Forms](https://en.wikipedia.org/wiki/Theory_of_forms)). Schemas are analogous to *forms*, and eido tests claims that an instance is of a particular form. Eido also helps *change* forms using filters.


94 changes: 3 additions & 91 deletions docs/autodoc_build/eido.md
@@ -7,7 +7,7 @@ document.addEventListener('DOMContentLoaded', (event) => {
</script>

<style>
h3 .content {
h3 .content {
padding-left: 22px;
text-indent: -15px;
}
@@ -18,7 +18,7 @@ h3 .hljs .content {
margin-bottom: 0px;
}
h4 .content, table .content, p .content, li .content { margin-left: 30px; }
h4 .content {
h4 .content {
font-style: italic;
font-size: 1em;
margin-bottom: 0px;
@@ -110,95 +110,7 @@ Print inspection info: Project or, if sample_names argument is provided, matched



```python
def get_available_pep_filters()
```

Get a list of available target formats
#### Returns:

- `List[str]`: a list of available formats




```python
def convert_project(prj, target_format, plugin_kwargs=None)
```

Convert a `peppy.Project` object to a selected format
#### Parameters:

- `prj` (`peppy.Project`): a Project object to convert
- `plugin_kwargs` (`dict`): kwargs to pass to the plugin function
- `target_format` (`str`): the format to convert the Project object to


#### Raises:

- `EidoFilterError`: if the requested filter is not defined
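
The lookup-and-dispatch behavior documented above (list the available formats, run the requested one, raise if it is not defined) can be sketched with a plain dictionary registry. This is an illustration under stated assumptions, not eido's implementation: `FILTERS`, `register_filter`, `available_filters`, `convert`, and `FilterNotFoundError` are hypothetical names, the project is a plain dict rather than a `peppy.Project`, and eido may discover its plugins differently.

```python
from typing import Any, Callable, Dict, List, Optional


class FilterNotFoundError(Exception):
    """Hypothetical stand-in for the EidoFilterError named above."""


# Hypothetical in-memory registry: format name -> filter function.
FILTERS: Dict[str, Callable[..., Any]] = {}


def register_filter(name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Decorator that records a filter function under a format name."""

    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        FILTERS[name] = func
        return func

    return decorator


@register_filter("sample_names")
def sample_names_filter(project: Dict[str, Any], **kwargs: Any) -> str:
    """Toy filter: join sample names with a separator passed via kwargs."""
    separator = kwargs.get("separator", "\n")
    return separator.join(s["sample_name"] for s in project["samples"])


def available_filters() -> List[str]:
    """List registered format names, sorted for stable output."""
    return sorted(FILTERS)


def convert(
    project: Dict[str, Any],
    target_format: str,
    plugin_kwargs: Optional[Dict[str, Any]] = None,
) -> Any:
    """Run the filter registered for target_format, or raise if none exists."""
    if target_format not in FILTERS:
        raise FilterNotFoundError(f"Unknown filter: {target_format!r}")
    return FILTERS[target_format](project, **(plugin_kwargs or {}))


project = {"samples": [{"sample_name": "s1"}, {"sample_name": "s2"}]}
result = convert(project, "sample_names", {"separator": ","})
print(available_filters(), result)
```

Passing `plugin_kwargs` through as keyword arguments keeps the dispatcher generic: each filter decides which options it understands, and the dispatcher only has to know the format name.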




```python
def basic_pep_filter(p, **kwargs)
```

Basic PEP filter, that does not convert the Project object.

This filter can save the PEP representation to file, if kwargs include `path`.
#### Parameters:

- `p` (`peppy.Project`): a Project to run filter on




```python
def yaml_pep_filter(p, **kwargs)
```

YAML PEP filter, that returns Project object representation.

This filter can save the YAML to file, if kwargs include `path`.
#### Parameters:

- `p` (`peppy.Project`): a Project to run filter on




```python
def csv_pep_filter(p, **kwargs)
```

CSV PEP filter, that returns Sample object representations

This filter can save the CSVs to files, if kwargs include
`sample_table_path` and/or `subsample_table_path`.
#### Parameters:

- `p` (`peppy.Project`): a Project to run filter on




```python
def yaml_samples_pep_filter(p, **kwargs)
```

YAML samples PEP filter, that returns only Sample object representations.

This filter can save the YAML to file, if kwargs include `path`.
#### Parameters:

- `p` (`peppy.Project`): a Project to run filter on







*Version Information: `eido` v0.1.5-dev, generated by `lucidoc` v0.4.2*
*Version Information: `eido` v0.1.0, generated by `lucidoc` v0.4.2*
22 changes: 17 additions & 5 deletions docs/changelog.md
@@ -2,14 +2,26 @@

This project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html) and [Keep a Changelog](https://keepachangelog.com/en/1.0.0/) format.

## [0.1.5] - 2021-04-15
## [0.1.6] - 2022-05-16
### Added
- a possibility to set a custom sample table index with `-s/--st-index` option
- an option to see filters docs via CLI: `eido filters -f <filter_name>`
- PEP filters now return their conversion result for programmatic use.
- PEP filters can write to files.
- A filter can write multiple outputs to multiple files using the `paths` keyword arg.
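
As a rough sketch of what these filter changes could look like in practice, the toy filter below returns its outputs and, when given `paths`, also writes them to files. The shape of `paths` (a dict mapping output name to file path), the function name `table_filter`, and the output keys are assumptions for illustration, not eido's documented interface.

```python
import csv
import io
import os
import tempfile
from typing import Dict, List, Optional


def table_filter(
    samples: List[Dict[str, str]], paths: Optional[Dict[str, str]] = None
) -> Dict[str, str]:
    """Build two text outputs, optionally write them, and return them."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(samples[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(samples)
    outputs = {
        "samples": buffer.getvalue(),
        "names": "\n".join(s["sample_name"] for s in samples) + "\n",
    }
    # Only outputs named in `paths` are written; all are returned regardless.
    for key, path in (paths or {}).items():
        with open(path, "w", newline="") as handle:
            handle.write(outputs[key])
    return outputs


samples = [
    {"sample_name": "s1", "genome": "hg38"},
    {"sample_name": "s2", "genome": "mm10"},
]
with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "samples.csv")
    result = table_filter(samples, paths={"samples": csv_path})
    with open(csv_path, newline="") as handle:
        written = handle.read()
print(result["samples"])
```

Returning the outputs even when files are written lets the same filter serve both command-line use (write to disk) and use from other Python code (consume the returned values).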

### Fixed
- Some error messages with incorrectly defined schemas.
- 'required' attribute is no longer required in schema

**The PEP filters are an experimental feature and may change in future versions of `eido`.**
### Changed
- Moved all `eido filter` functionality into the `eido convert` command for simplicity. This way, a single top-level command namespace holds all related functionality. Filters are still EXPERIMENTAL.

- `eido convert` command that converts the provided PEP to a specified format
- `eido filter` command that lists available filters in current environment
- built-in plugins:
## [0.1.5] - 2021-04-15
### Added
- `eido convert` converts the provided PEP to a specified format (EXPERIMENTAL! may change in future versions)
- `eido filter` lists available filters in current environment (EXPERIMENTAL! may change in future versions)
- built-in plugins (EXPERIMENTAL! may change in future versions):
- `basic_pep_filter`
- `yaml_pep_filter`
- `csv_pep_filter`
3 changes: 1 addition & 2 deletions docs/contributing.md
@@ -1,6 +1,5 @@
## Contributing

Pull requests or issues are welcome. After adding a new feature, please add tests in the `tests` folder and run the test suite. The only additional dependencies needed beyond those for the package can be installed with: `pip install -r requirements/requirements-dev.txt`.

Once those are installed, run the tests with `pytest` or `python setup.py test`.

Once those are installed, run the tests with `pytest` or `python setup.py test`.
2 changes: 1 addition & 1 deletion docs/example-schemas.md
@@ -5,4 +5,4 @@ With `eido` you can create your own schema to describe the kind of projects your
- [Generic PEP2.0.0 schema](http://schema.databio.org/pep/2.0.0.yaml) -- all PEPs should validate against this schema
- [PEPPRO pipeline schema](http://schema.databio.org/pipelines/ProseqPEP.yaml) -- describes PEPs compatible with the [PEPPRO](http://peppro.databio.org) pipeline
- [PEPATAC pipeline schema](http://schema.databio.org/pipelines/pepatac.yaml) -- describes PEPs compatible with the [PEPATAC](http://pepatac.databio.org) pipeline
- [refgenie databio build schema](https://schema.databio.org/refgenie/refgenie_build.yaml) -- describes PEPs compatible with building [refgenie](http://refgenie.databio.org) assets
- [refgenie databio build schema](https://schema.databio.org/refgenie/refgenie_build.yaml) -- describes PEPs compatible with building [refgenie](http://refgenie.databio.org) assets