The pipestat schema should be a JSON Schema. #85

Closed · Tracked by #82 · Fixed by #87
donaldcampbelljr opened this issue Sep 28, 2023 · 18 comments
Comments

donaldcampbelljr commented Sep 28, 2023

The pipestat schema should be a JSON Schema. This will make it more familiar to people.

donaldcampbelljr changed the title from "the pipestat schema should be a JSON Schema. This will make it more familiar to people." to "The pipestat schema should be a JSON Schema." Sep 28, 2023
@donaldcampbelljr donaldcampbelljr self-assigned this Sep 28, 2023
donaldcampbelljr commented:

Some notes on this:

  1. Right now, pipestat takes a config_dict or config file and validates it using a JSON schema validator:
    self[CONFIG_KEY] = YAMLConfigManager(entries=config_dict, filepath=self._config_path)
    _, cfg_schema = read_yaml_data(CFG_SCHEMA, "config schema")
    validate(self[CONFIG_KEY].exp, cfg_schema)

However, it appears that we currently do not perform JSON schema validation on the output_schema. This makes sense, since it is not currently a JSON Schema. Should we perform this validation during schema processing (in parsed_schema.py)?

  2. I assume we will still use a .yaml file for the output schema; it will simply need to conform to JSON Schema requirements.

  3. From some reading, there are some subtle differences between the current .yaml schemas and a JSON Schema written in YAML. For example:

Current example of an output schema as a .yaml:

pipeline_name: default_pipeline_name
samples:
  collection_of_images:
    description: "This stores a collection of values or objects"
    type: array
    items:
      properties:
          prop1:
            description: "This is an example file"
            type: file
  output_file_in_object:
    type: object
    properties:
      prop1:
        description: "This is an example file"
        type: file
      prop2:
        description: "This is an example image"
        type: image
    description: "Object output"
  output_file:
    type: file
    description: "This is a path to the output file"
  output_image:
    type: image
    description: "This is a path to the output image"

The same schema in proper JSON schema format (as a .yaml):

$schema: https://json-schema.org/draft/2020-12/schema
$id: https://example.com/output_schema.json
type: object
title: Default Pipeline
description: A pipeline that collects images and outputs a file and an image.
properties:
  pipeline_name:
    type: string
    default: default_pipeline_name
  samples:
    type: object
    properties:
      collection_of_images:
        type: array
        description: A collection of images.
        items:
          type: object
          properties:
            prop1:
              type: file
              description: An example file.
      output_file_in_object:
        type: object
        description: An object containing output files.
        properties:
          prop1:
            type: file
            description: An example file.
          prop2:
            type: image
            description: An example image.
  output_file:
    type: file
    description: The path to the output file.
  output_image:
    type: image
    description: The path to the output image.

My understanding regarding $id and $schema:

  • if $schema is not declared, the latest draft will be used (which one should we use?)
  • if $id is not declared, the schema can only be used locally (this seems fine).

Next steps:

  1. Change schemas to conform to JSON Schema.
  2. Update schema parsing functions (including recursive functions for complex objects) to handle JSON schemas.
  3. Add validation to confirm a given output_schema is a valid JSON Schema (see the sketch below).
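
For step 3, here is a minimal sketch of what that check could look like, using jsonschema's meta-schema validation (output_schema.yaml is a placeholder path; note that custom types like file and image are not standard JSON Schema types, so a strict check would flag them unless we map them to standard types first):

import yaml
from jsonschema import Draft202012Validator
from jsonschema.exceptions import SchemaError

# Sketch: confirm that a given output schema is itself a valid JSON Schema.
# Custom types such as "file"/"image" are not standard JSON Schema types and
# would fail this strict meta-schema check unless handled beforehand.
with open("output_schema.yaml") as f:
    output_schema = yaml.safe_load(f)

try:
    Draft202012Validator.check_schema(output_schema)
except SchemaError as e:
    print(f"Invalid output schema: {e.message}")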

Do you have any other thoughts on this, @nsheff?

nsheff commented Sep 28, 2023

I think I'm on the same page as you.

Can we use the same thing we already defined for an eido schema?

https://eido.databio.org/en/latest/writing-a-schema/

donaldcampbelljr commented:

Regarding the imports section of the eido schema, I believe you can achieve something similar with $ref:
https://json-schema.org/understanding-json-schema/structuring.html#ref

Are you also asking if we should just make the pipestat output_schema an Eido schema? Specifically, to take advantage of checking for files' existence, which is outside the scope of JSON schema validation? Do we even need (or desire) to do that for a pipestat output_schema?
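
Note that the file-existence piece would have to live outside JSON Schema validation either way; something like this hypothetical post-validation helper (illustration only, not existing pipestat or eido code):

import os

def check_files_exist(record: dict, file_keys: list) -> list:
    """Hypothetical helper: JSON Schema can validate structure, but checking
    that reported paths actually exist on disk has to happen as a separate
    step, similar to what eido does for required files."""
    return [key for key in file_keys if not os.path.exists(str(record.get(key, "")))]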

donaldcampbelljr commented Oct 2, 2023

Basically, we do want to be able to use a JSON schema that is extendable, similar to Eido.

Example:
Eido validates sample tables and has an expected input schema (which are PEPs).
It uses JSON schema but extends it, e.g. with imports and required files. During JSON validation, it ignores items that it does not recognize, such as imports. Eido knows what to do with these items and processes them appropriately after JSON validation.

We want to have pipestat use JSON schema that is extendable.
To consider:

  • Can we replace imports with $ref?
  • Make project attributes parallel to samples. Eido currently only validates samples.
  • Make pipeline_name optional?

donaldcampbelljr commented:

I left pipeline_name as required in the output schema.
Pipestat can now accept output schemas that are JSON Schemas OR continue using the original output schema format for backwards compatibility.
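
Roughly, the two formats can be distinguished with a heuristic like this sketch (looks_like_json_schema is an illustrative name, not the actual function in the code):

def looks_like_json_schema(schema: dict) -> bool:
    # Heuristic sketch (illustrative, not the actual pipestat check):
    # a JSON-Schema-style output schema declares a top-level "properties"
    # mapping, while the legacy format nests results directly under
    # top-level "samples"/"project" keys.
    return "properties" in schema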

donaldcampbelljr added a commit that referenced this issue Oct 4, 2023
* first pass to allow for output_schema as a true JSON schema #85

* remove unused import

* simplify logic, extend _safe_pop_one_mapping to handle multiple keys

* disambiguation key vs keys in parsed schema

* make samples and project objects instead of arrays, add more test assertions, clean up docstrings.

* add status data to string representation of ParsedSchema
@donaldcampbelljr donaldcampbelljr added this to the v0.6.0 milestone Oct 5, 2023
nsheff commented Oct 5, 2023

I noticed you used project as the attribute name for the project-level info. In eido, we used config as that name. So, these will not be compatible.

I think we should use the same standard. But I actually like project better than config. So... what do you think?

donaldcampbelljr commented:

In the context of pipestat, project makes more sense to me than config, because we also have a separate pipestat config file (which could become confusing to the user).

donaldcampbelljr commented:

Per discussion, we would like eido to use the same attribute names as pipestat (project vs config) while still maintaining backwards compatibility.

nsheff commented Oct 27, 2023

Can you provide an example of a true JSON schema pipestat schema? I think the above examples are all not quite right... they lack the project section.

nsheff commented Oct 27, 2023

Here's an idea for that, now using $defs and $ref instead of our custom replacements.

$schema: https://json-schema.org/draft/2020-12/schema
$id: https://example.com/output_schema.json
$defs:
  image:
    type: object
    object_type: image
    properties:
      path:
        type: string
      thumbnail_path:
        type: string
      title:
        type: string
    required:
      - path
      - thumbnail_path
      - title
  file:
    type: object
    object_type: file
    properties:
      path:
        type: string
      title:
        type: string
    required:
      - path
      - title
type: object
title: Default Pipeline
description: A pipeline that collects images and outputs a file and an image.
properties:
  pipeline_name:
    type: string
    default: default_pipeline_name
  samples:
    type: object
    properties:
      right_file:
        $ref: "#/$defs/file"
        label: bigBed
        description: bigBed file
      collection_of_images:
        type: array
        description: A collection of images.
        items:
          type: object
          properties:
            prop1:
              type: file
              description: An example file.
      output_file_in_object:
        type: object
        description: An object containing output files.
        properties:
          prop1:
            type: file
            description: An example file.
          prop2:
            type: image
            description: An example image.
  project:
    type: object
    properties: 
      output_file:
        type: file
        description: The path to the output file.
      output_image:
        type: image
        description: The path to the output image.

donaldcampbelljr commented Oct 30, 2023

Implemented the above suggestion using $ref and $defs: b71d387

nsheff commented Oct 31, 2023

Excellent! Can you try creating new versions of the bbconf schemas following this approach, and see if they will work with bbconf/bedhost?

donaldcampbelljr commented:

Yes, we will need to merge these changes to dev via PR #109 first.

donaldcampbelljr commented:

Per Tuesday's discussion, I attempted to use the built-in referencing library to resolve internal references (written with $ref and $defs) within a JSON schema written in YAML.

This is not working; I cannot get a resolved version of the schema.

Reading material:
https://python-jsonschema.readthedocs.io/en/stable/referencing/#resolving-references-to-schemas-written-in-yaml
https://readthedocs.org/projects/python-jsonschema/downloads/pdf/latest/

MCVE:

Given a simple schema:

$schema": "https://json-schema.org/draft/2020-12/schema"
$id: "#/$defs/"
title: An example Pipestat output schema
description: A pipeline that uses pipestat to report sample and project level results.
type: object
properties:
  pipeline_name: "default_pipeline_name"
  samples:
    type: object
    properties:
        output_image:
          $ref: "#/$defs/image"
          description: "This an output image"
$defs:
  image:
    type: object
    object_type: image
    properties:
      path:
        type: string
      thumbnail_path:
        type: string
      title:
        type: string
    required:
      - path
      - thumbnail_path
      - title

main.py:

from pathlib import Path

import yaml
from referencing import Registry, Resource
from referencing.jsonschema import DRAFT202012
from jsonschema import Draft202012Validator

# load the YAML-format schema
path_yaml = Path("/home/drc/GITHUB/pythonpractice/jsonschema_practice/sample.yaml")
contents = yaml.safe_load(path_yaml.read_text())

# wrap the schema as a referencing Resource, defaulting to draft 2020-12
resource = Resource.from_contents(contents, default_specification=DRAFT202012)

# register the resource so its internal $defs can (in theory) be resolved
registry = Registry().with_resources([("#/$defs", resource)])

# the registry is lazy; these entries have not been crawled yet
for ref in registry._uncrawled:
    print(ref)

# validation itself runs without complaint
validator = Draft202012Validator(contents, registry=registry)
validator.validate({})
The above code loads the YAML file, and validation runs without issues. However, the registry continues to have entries in registry._uncrawled, and the validator object does not seem to resolve them.
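
For what it's worth, per-reference lookup does seem to work through the referencing API; a sketch (using the contents loaded above; urn:example:output-schema is a placeholder URI). It resolves one $ref at a time rather than returning a fully dereferenced schema:

from referencing import Registry
from referencing.jsonschema import DRAFT202012

# Register the loaded schema under an explicit URI, then resolve a single
# reference. This returns the contents of one $def at a time; it does not
# produce a fully dereferenced copy of the whole schema.
resource = DRAFT202012.create_resource(contents)
registry = Registry().with_resource("urn:example:output-schema", resource)
resolved = registry.resolver().lookup("urn:example:output-schema#/$defs/image")
print(resolved.contents)  # the image definition from $defs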

nsheff commented Nov 3, 2023

Yeah, I can confirm. I've been poking around here for a few hours tonight and also cannot figure out how to get a resolved schema. You can type resource.crawl(), but it doesn't help.

The schema that I get out is called a Resolved object, but it still includes $ref items. Validation works fine with these. So, I guess we may have to roll our own after all :(

donaldcampbelljr commented:

Ok, I will merge PR #109 to dev and proceed with implementing these changes in bbconf.

I will keep this issue open to circle back around for:

nsheff commented Nov 3, 2023

Just a few thoughts on this I had this morning:

  • The reason we needed the resolving is so that, for things like an image, we make the correct sqlmodel object.
  • This happens right now with a 'type lookup', here:
    curr_type_spec = CANONICAL_TYPES[curr_type_name]
    -- if the type is image, then it gets replaced by a built-in image model.

We could solve this easily by just having it do this instead:

  • When determining the type of the element in that function, we first check if the particular element has a $ref property.
  • If there's no $ref property, we just proceed as normal, no change.
  • If there is a $ref property, we follow that ref, and then use .update to update the dict describing the type.
  • If the $ref starts with #/$defs, then we look it up in the $defs entry of the current schema.
  • If it starts with http, then we retrieve it from the URL.
  • Otherwise, we throw an error that we can't follow the $ref.

The function below (not tested pseudocode) should be all we need:

import httpx

def resolve_ref(subschema: dict, schema: dict) -> dict:
    """
    Given a subschema and its parent schema, resolve the $ref entry (if any)
    and return an equivalent subschema with the $ref resolved.

    @param subschema: the schema for a specific property
    @param schema: the parent schema, which may include the $defs for lookup
    """
    if "$ref" in subschema:
        uri = subschema["$ref"]
        if uri.startswith("#/$defs/"):
            print("Found ref to local $def:", uri)
            def_name = uri.removeprefix("#/$defs/")  # e.g. "image"
            resolved_ref = schema["$defs"][def_name]  # look up in local $defs
            subschema.update(resolved_ref)  # merge the resolved definition in
        elif uri.startswith("http"):
            print("Found ref to remote schema:", uri)
            response = httpx.get(uri)  # fetch the remote schema
            subschema.update(response.json())  # merge the resolved definition in
        else:
            raise NotImplementedError(f"Can't resolve ref: {uri}")
        del subschema["$ref"]  # remove the now-resolved ref
    return subschema
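
Usage would look something like this (hypothetical snippet, applying it to the output_image property of the MCVE schema above):

# Resolve the $ref on the output_image property in place.
output_image = schema["properties"]["samples"]["properties"]["output_image"]
resolve_ref(output_image, schema)
assert output_image["type"] == "object"  # fields merged in from #/$defs/image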

@donaldcampbelljr donaldcampbelljr modified the milestones: v0.6.0, v0.7.0 Nov 6, 2023
donaldcampbelljr added a commit that referenced this issue Dec 22, 2023
* Add get_records to pipestat_manager and add related test #75

* clean up signatures #72

* clean up docstrings

* remove superfluous get_table_name function

* Reduce get_orm and get_model into one function. #71

* Fix CLI to use record-identifier instead of sample-name

* Fix CLI to use pipeline_type #37

* modify get_records to return new structure #75

* fix html report generation using get_records

* add simple CLI test for reporting and retrieving #81

* Pipestat reader (#80)

* update version in prep for pipestat table

* add table function

* add initial stats function from looper, remove counter

* fixing _create_stats_summary and get_file_for_project

* fix sample_level stats reporting

* add object reporting

* remove redundancies

* lint

* add assertion for table generation to pre-existing test

* use pipeline_type when retrieving samples

* Fix stats table generation output to look better

* fix return types

* fix list vs List

* fix typo for stats table

* fix doc strings

* update func names for disambiguation

* fix key error issue in html_reports

* clean up

* adjust docstrings

* update docstrings

Co-authored-by: Nathan Sheffield <[email protected]>

* adjust docstrings, rename html_reports_pipestat.py to reports.py

* adjust LOGGER.info string

* simplify objs and stats generation

* remove old todos

* remove sample_name, add pipeline_name to project field definitions

* work towards using project_name instead of sample_name from project-level pipelines

* fix check record bug, add better error message for pipeline type.

* add checks for pipeline types, lint

* fix get_samples with pipeline type

* fix fetch_pipeline_results

* add basic tests for project_level, fix associated bugs

* allow table creation to use project_name if applicable

* add passing project name to file_backend

* refactor to use r_id as input

* update doc strings

* more refactoring of records vs samples to be more general

* move table generation to reports.py

* allow status file dir path to be associated with config path OR the file path to ensure looper compatibility.

* add output_dir as a parameter for placing reports and stat files. Otherwise defaults to results_file dir or config_file dir

* add obtaining schema_path from pipestat config for Looper compatibility

* added select function to main pipestat class

* added select_txt and select_distinct function to main pipestat class

* added get_one_record function to main pipestat class #70

* Removing CLI req that one must supply config and schema because the user can supply schema within the config. For Looper compatibility.

* Revert "added get_one_record function to main pipestat class #70"

This reverts commit b477060.

* Revert "added select_txt and select_distinct function to main pipestat class"

This reverts commit 4588f4f.

* Revert "added select function to main pipestat class"

This reverts commit ee98145.

* Add basic glossary page in the form of a table. Addresses: pepkit/looper#290

* remove redundant tag

* add RecordNotFoundError across both backends #74

* Add PipelineTypeNotSuppliedError to hide dbbackend information #74

* Initial work towards a fastapi implementation of pipestat reader
#22

* Initial work to separate classes #78

* working implementation of child classes for reporting using DBBACKEND. Manual testing confirmed. pytest Tests broken. #78

* implement report for filebackend

* fix inheritance issue with pipeline_type

* typo

* major refactoring, many tests still broken #78

* Fix signatures for report and explicitly state input arguments.

* remove pipeline type check

* add projectpipestamanager and refactor project_name to record_identifier

* fix get_status

* fix tests

* fix all tests #78

* add todo

* code cleanup and remove getting_table_name

* clean up

* partial removal of pipeline_type logic from DB backend

* remove pipeline_type input arguments from File Backend

* clean up

* add simple class to wrap SamplePipestatManager and ProjectPipestatManager

* change pipestatmanager to mutable mapping #78

* lowercase attributes #78

* polish PipestatBoss #78

* env_vars for pipeline_type

* update docs

* working proof of concept to retrieve results #22

* update readme and add getting output schema

* add else catch for output schema and fix typo

* add ability to run with uvicorn if calling reader.py directly.

* add __init__.py and main func

* add more endpoints #22

* update endpoints, add Query

* allow for more complex filtering using post and pydantic models #22

* Add image and file endpoints #22

* lint

* Add basic arg parser to pass absolute path to config file

* Add endpoint for get_records()

* clean up

* modify get_records to return new structure #75

* fix html report generation using get_records

* attempt global creation of psm, does not work

* fix creation of global psm

* change fetching by filetype

* add cli option for pipestat serve

* simplify and fix pipestat serve #22

* add host and port arguments #22

* add config error

* update readme

---------

Co-authored-by: nsheff <[email protected]>

* list to List

* fixed typing in python3.8

* update changelog

* fix table_name_bug

* "0.5.4a1" for alpha release

* "0.6.0a1" for alpha release

* Dev json schema (#87)

* first pass to allow for output_schema as a true JSON schema #85

* remove unused import

* simplify logic, extend _safe_pop_one_mapping to handle multiple keys

* disambiguation key vs keys in parsed schema

* make samples and project objects instead of arrays, add more test assertions, clean up docstrings.

* add status data to string representation of ParsedSchema

* add retrieve_distinct function for #70 and databio/bedhost#75

* add `pipestat link` functionality to create symlinks (#90)

* begin work on file linking via pipestat

* continue implementing link for an output_dir OR the results_file.yaml
#89

* implemented symlinks for filebackend given output_dir
#89

* implemented symlinks for filebackend using results.yaml, polish tests
#89

* fix typo, clean up

* more clean up

* begin rework to be backend agnostic

* refactor to place in folders specific to a result_identifier

* add more complex types, add temp directory for better testing

* fix recursive finding of paths

* clean up, move to abstract class, confirm works for both backends

* remove unused test files

* add jinja2 to requirements-all.txt #91

* complex example with collision

* add warning

* add cli option

* remove unused functions

* update changelog

* clean up docstrings, remove unused variables, refactor link_dir

* clean up more docstrings

* remove unnecessary print statement to reduce verbosity

* Begin work on adding created and modified datetime, DateTime objects are not parsing correctly, tests broken #93

* fix datetime objects #93

* fix pipeline_name field beginning with underscore #94

* timestamp continuation, list_recent_results does NOT work

* fix doc string for schema

* add filtering by created and modified time #93

* update version for alpha release v0.6.0a2

* fix list reversing #93

* polish tests #93

* add filebackend list_recent_results, remove records broken #93

* fix remove records bug #93

* update doc strings #93

* update change log for list_recent_results

* begin work on retrieve_multiple #96

* add db backend, test broken #96

* fix db backend logic #96

* implement file backend #96

* fix docstrings

* some minor polishing

* update readme

* move pipestat reader per #22

* fix pipestat reader setup and manifest per #22

* Add optional dependencies via setup #22

* pipestat can now run without DB specific dependencies #22

* Some refactoring #64

* add other requirements to test

* refactor to have ParsedSchemaDB inherit from ParsedSchema for code reduction #64

* Clean up

* return total record count #75

* make initialization of file and DB backend into separate functions

* add decorator for checking dependencies

* polishing decorator for checking dependencies

* consolidate decorator and ancillary function into one

* first pass #99, some tests broken

* fix all tests #99

* refactor store to cfg #99

* fix bug with table path

* setitem is now a wrapper for report

* fix getitem implementation, add test #99

* Begin work on select_records, retrieve_one, retrieve_many #103

* Implement select_records, and cursor pagination #103

* clarify init message

* Update dbbackend.py

* add loading JSON and column.contains for filtering #103

* testing differences

* add or_ and and_ sqlmodel operators #103

* some comments

* steps toward smarter pytesting

* #103 -- added multi key filter

* typo

* polish changes to select_records and add tests. #103

* fix typo

* remove Union for python 3.8 compat

* move get_nested_column and define_sqlalchemy_type outside of select_records

* update doc strings

* Modify to be true JSON schema using $ref and $defs for custom objects (image, file) #85

* Update pipestat/backends/abstract.py

Co-authored-by: Oleksandr <[email protected]>

* Update pipestat/backends/db_backend/dbbackend.py

Co-authored-by: Oleksandr <[email protected]>

* Update pipestat/backends/db_backend/dbbackend.py

Co-authored-by: Oleksandr <[email protected]>

* Update pipestat/pipestat.py

Co-authored-by: Oleksandr <[email protected]>

* Update pipestat/pipestat.py

Co-authored-by: Oleksandr <[email protected]>

* Update tests/data/sample_output_schema.yaml

Co-authored-by: Oleksandr <[email protected]>

* Polish PR, fix docstrings, add wrapper for remove

* remove unnecessary mutablemapping inheritance

* lint

* cleaning

* fixed failing test

* remove old, unused validate_schema function

* change ALL output_schemas to reflect JSON schema

* lint

* per request, remove select function as well as dynamic_filter

* add to do

* revert PipestatManager to a MutableMapping

* remove select_txt and select_distinct

* revert select_distinct for now

* remove unused backend funcs, tests broken

* implement list_recent_results wrapper (DBBackend only)

* implement list_recent_results wrapper (DBBackend only)

* some polish, begin implementing select_records

* more work towards select_records

* change operator func, add list comprehension for sets

* implement AND OR Logic and fix tests, add sorting

* fix time stamp filtering if NOT nested

* lint and fix db test

* attempt to pare down results based on columns, does not work.

* add column filtering for filebackend, add tests, add no filter condition retrieving all records on filebackend

* fix complex object filtering

* implement select_distinct for filebackend

* select_distinct now works on record identifiers for filebackend, polish tests

* fix retrieve_one and retrieve_many for filebackend

* fix timestamp test for filebackend

* use isinstance for type checking

* refactored for efficiency

* fix recursive schema after merge

* remove get_records, fix return type for select_records if given columns

* begin remove of get_one_record

* deleted get_one_record function

* fix get_status on dbbackend using select_records

* remove retrieve from dbbackend and reports

* replace retrieve with retrieve_one and begin replacing retrieve with select_records

* lint

* fix select_records filebackend bug, remove retrieve from tests

* finish removing retrieve function

* attempt re-implementing record not found error for retrieve_one and retrieve_many

* fix bugs with select_records and RecordNotFound, implement ColumnNotFound exception

* lint

* update changelog and version for 0.6.0a3 release

* refactoring based on ruff

* update cli and usage and api docs, lint

* update documentation focused on schemas

* add result_identifier as parameter to retrieve_one

* add result_identifier as parameter to retrieve_many

* __version__ = "0.6.0a4"

* Finish adding pytest.mark.skipif

* lint

* move general tests to test_pipestat, remove some redundancy

* implement clear_status pytest for filebackend #110

* polish test set_item, report_overwrite

* add pytest fixture val_dict

* add pytest fixture values_sample, values_project

* add pytest fixture values_complex_linking, move fixtures to conftest.py

* deleted unnecessary print

* add test_select_no_filter_limit

* black formatting

* add pytest fixture for ranges

* lint

* fix record_identifier being returned twice if it is an input column

* fix typo

* change retrieve_one output to no longer be wrapped by select_records output

* fix schema property to return psm.cfg["_schema"].to_dict()

* fix schema property to return psm.cfg["_schema"]

* add psm.schema.original_schema to parsed schema

* add r_id = record_identifier or self.record_identifier to retrieve_one

* fix majority of tests due to retrieve_one

* lint

* fix table creation bug with retrieve_one

* remove RecordNotFoundError because it breaks pypiper

* fix parsed_schema json schema parsing to be correct hierarchy

* update jupyter api docs

* update cli docs

* fix property access #113

* add retrieving multiple results using retrieve_one

* list to List

* add quickstart guide

* add psm.schema.resolved_schema

* remove unused function

* version bump to v0.6.0a5

* fix config_path property

* re-implement RecordNotFoundError

* be more selective about which exceptions to pass

* potential solution for #117

* version bump 0.6.0a6

* remove redundant pydantic

* fix pipestat reader dependency checking

* fix cli bug regarding env variables

* retrieve_one record_identifier defaults to None to allow for use with env variables, clean README.md

* further documentation polishing

* fix obtaining record_identifier from cfg file

* correct config.md

* correct pipestat_specification.md

* 0.6.0a7 prerelease

* Clarify return on select_records

* ensure requirements list yacman>= 0.9.2

* potential fix for #119

* change warning to debug for creating status table

* fix pipestat creating subdirectories before creating results file.

* First proof of concept for #120

* simply do replacement at beginning

* Fixed #106

* now functional, create aggregate_results.yaml which replaces self._data before reporting #120

* replace spaces in names with %20 during html link generation

* fix reported_objs bug

* expandpath for link_dir during pipestat link

* add tests for new aggregate_results functionality

* refactor db-backend to dbbackend

* fix stats and objects buttons for summary page #130

* fix stats table on summary page #130

* fix broken status if multi_results_files #130

* fix log and profile files #130

* fix tsv columns #130

* Add check for pipeline type during report generation #130

* refactor Sample to Record #130

* fix refactor for sample page

* fix stats.tsv and objects.yaml generation to incorporate pipeline types

* add project_objects page #130

* minor polishing

* remove temp test

* lint

* add passing looper's samples to pipestat summarize to populate table

* remove unused code

* version bump for pre-release v0.6.0a9

* use "object_type" for image and file types or fall back on using "type", add try except

* ensure project level objects are accurate on summary/index page

* remove project name from stats and object file names

* update changelog

* 0.6.0a10 bump for pre-release

* allow project-level figures to display as columns instead of rows

* 0.6.0a11 pre-release

* v0.6.0 release prep

* lint

---------

Co-authored-by: nsheff <[email protected]>
Co-authored-by: Khoroshevskyi <[email protected]>