enhancement!: CLI overhaul #299

Merged
merged 27 commits into from
Dec 13, 2023

Conversation

jsstevenson
Member

@jsstevenson jsstevenson commented Nov 21, 2023

close #293
close #217
progress on #298

stuff I did:

  • Reorganize the CLI to more closely match Unix tool and click patterns. Instead of several separate entry points, there is one common entry point (gene-normalizer), and the different actions are commands (e.g. gene-normalizer update).
  • For the update command, sources are now arguments, not an option. Since the number of sources can be anywhere from 0 to n, I think an argument makes more sense than a parameter passed to an option (it avoids the quoting and spacing issues).
  • Add a --silent option (false by default) to suppress console feedback, for cases like pipelines that process whatever goes to stdout. Addresses Database endpoint usage feedback #217 with a very stringent check (the value must be True, not just truthy, so there's no accidental silencing; hence the noqa).
  • Some cleanup of the CLI descriptions. Also add a CLI autosummary to the docs instead of maintaining a separate description; there's some fun (sort of janky) stuff done to make it look okay in both.
  • Refactor many of the internal CLI update methods into a public module, gene.etl.update, so that they can easily be used programmatically (this is something Thera-Py does with the Disease Normalizer, for example).
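
To illustrate the "must be True, can't just be truthy" check described for --silent, here is a minimal sketch. The helper name emit_feedback and its return value are hypothetical and are not taken from the project's code; the point is only the identity comparison against True.

```python
def emit_feedback(message: str, silent: bool = False) -> bool:
    """Print ``message`` unless ``silent`` is exactly ``True``.

    Hypothetical helper for illustration only. Returns True if the
    message was printed, False if it was suppressed.
    """
    # Identity check on purpose: truthy values such as 1 or "yes" do not
    # silence output, so a mistyped argument cannot hide feedback.
    if silent is True:
        return False
    print(message)
    return True
```

With this shape, emit_feedback("Loading...", silent=1) still prints, while only silent=True suppresses the message. Linters commonly flag explicit comparisons against True, which is presumably why the description mentions a noqa.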

The updated docs page looks like this: https://gene-normalizer--299.org.readthedocs.build/en/299/managing_data/loading_and_updating_data.html

And this is what the help looks like for the update command:

Usage: gene-normalizer update [OPTIONS] [SOURCES]...

  Update provided normalizer SOURCES in the gene database.

  Valid SOURCES are "HGNC", "NCBI", and "Ensembl" (case is irrelevant).
  SOURCES are optional, but if not provided, either --all or --normalize must
  be used.

  For example, the following command will update NCBI and HGNC source records:

      $ gene-normalizer update HGNC NCBI

  To completely reload all source records and construct normalized concepts,
  use the --all and --normalize options:

      $ gene-normalizer update --all --normalize

  The Gene Normalizer will fetch the latest available data from all sources if
  local data is out-of-date. To suppress this and force usage of local files
  only, use the --use_existing flag:

      $ gene-normalizer update --all --use_existing

Options:
  --all           Update records for all sources.
  --normalize     Create normalized records.
  --db_url TEXT   URL endpoint for the application database. Can either be a
                  URL to a local DynamoDB server (e.g.
                  "http://localhost:8001") or a libpq-compliant PostgreSQL
                  connection description (e.g. "postgresql://postgres:password
                  @localhost:5432/gene_normalizer").
  --aws_instance  Use cloud DynamoDB instance.
  --use_existing  Use most recent local source data instead of fetching latest
                  version.
  -s, --silent    Suppress console output.
  --help          Show this message and exit.
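
The interface shown in the help text (one entry point, an update command, zero or more SOURCES as arguments) can be sketched roughly as follows. This is not the PR's implementation: it uses argparse from the standard library instead of click so that it runs without dependencies, and build_parser, resolve_sources, and VALID_SOURCES are hypothetical names. Letting --all take precedence over explicitly listed sources is also an assumption; the help text does not say how that combination is handled.

```python
import argparse

# Canonical source names keyed by lowercase form ("case is irrelevant").
VALID_SOURCES = {"hgnc": "HGNC", "ncbi": "NCBI", "ensembl": "Ensembl"}


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with a single entry point and an ``update`` command."""
    parser = argparse.ArgumentParser(prog="gene-normalizer")
    subparsers = parser.add_subparsers(dest="command", required=True)
    update = subparsers.add_parser("update", help="Update normalizer sources.")
    # Sources are positional arguments, so 0..n can be given without quoting.
    update.add_argument("sources", nargs="*", metavar="SOURCES")
    update.add_argument("--all", action="store_true", dest="update_all")
    update.add_argument("--normalize", action="store_true")
    update.add_argument("--use_existing", action="store_true")
    update.add_argument("-s", "--silent", action="store_true")
    return parser


def resolve_sources(sources, update_all, normalize):
    """Return canonical source names, enforcing the rule from the help text."""
    if update_all:
        # Assumption: --all wins over any explicitly listed sources.
        return sorted(VALID_SOURCES.values())
    if not sources and not normalize:
        raise ValueError("Provide SOURCES, or use --all or --normalize.")
    resolved = []
    for name in sources:
        key = name.lower()
        if key not in VALID_SOURCES:
            raise ValueError(f"Unrecognized source: {name}")
        resolved.append(VALID_SOURCES[key])
    return resolved


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(resolve_sources(args.sources, args.update_all, args.normalize))
```

Parsing ["update", "hgnc", "NCBI"] yields the canonical names HGNC and NCBI, while ["update"] with no sources and neither flag raises an error, mirroring the "SOURCES are optional, but..." rule above.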

@jsstevenson jsstevenson changed the base branch from main to interface-updates-epic November 21, 2023 01:24
@jsstevenson jsstevenson changed the base branch from interface-updates-epic to main November 21, 2023 12:52
@jsstevenson jsstevenson changed the base branch from main to interface-updates-epic November 21, 2023 12:52
@jsstevenson jsstevenson changed the title enhancement!: break CLI update methods into reusable module enhancement!: CLI overhaul Nov 21, 2023
@jsstevenson jsstevenson added the priority:low Low priority label Nov 21, 2023
@jsstevenson jsstevenson marked this pull request as ready for review November 21, 2023 15:32
@jsstevenson
Member Author

@korikuzma what would you think about shortening the options to --all and --normalized? eg

gene-normalizer update --all and gene-normalizer update --normalized HGNC NCBI ensembl

@jsstevenson jsstevenson marked this pull request as draft November 21, 2023 17:55
@korikuzma
Member

@korikuzma what would you think about shortening the options to --all and --normalized? eg

gene-normalizer update --all and gene-normalizer update --normalized HGNC NCBI ensembl

@jsstevenson to confirm, --all would cover all sources and concept group creation? I'm not sure about the name --normalized for the sources.

@jsstevenson jsstevenson marked this pull request as ready for review December 13, 2023 15:33
@jsstevenson jsstevenson removed the request for review from korikuzma December 13, 2023 15:35
@jsstevenson jsstevenson marked this pull request as draft December 13, 2023 15:35
@jsstevenson jsstevenson marked this pull request as ready for review December 13, 2023 18:48
src/gene/cli.py (outdated, resolved)
src/gene/database/database.py (outdated, resolved)
src/gene/cli.py (outdated, resolved)
Co-authored-by: Kori Kuzma <[email protected]>
Member

@korikuzma korikuzma left a comment

🚀

@jsstevenson jsstevenson merged commit 0e4f7d9 into interface-updates-epic Dec 13, 2023
21 checks passed
@jsstevenson jsstevenson deleted the cli-refactor branch December 13, 2023 20:06
@jsstevenson jsstevenson mentioned this pull request Jan 8, 2024
Labels
priority:low Low priority
2 participants