Data exports #101

Merged
merged 107 commits into from
Mar 4, 2020

Contributor

@nciemniak nciemniak commented Jan 22, 2020

Data Exports functionality is responsible for:

  • Generating transcription data files and saving them to Azure Blob Storage upon transcription approval. These include a raw data file (.json), a consensus text file (.txt), a metadata file (.csv), and a line metadata file (.csv)
  • Removing transcription data files from storage if a transcription is unapproved
  • Downloading all transcription files pertaining to a requested project, workflow, group, or single transcription, zipping the files, and sending them to the user
  • Generating a single CSV file containing the metadata for all transcriptions included in the collection (handled by the AggregateMetadataFileGenerator class), which is included in the zip file
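
The download path described in the last two bullets could be sketched roughly as follows. This is an illustration in Python, not the code in this PR: the function names, the archive layout, and the `transcriptions_metadata.csv` filename are all hypothetical, and the input is assumed to be already-downloaded file contents rather than live Azure Blob Storage reads.

```python
import csv
import io
import zipfile


def build_aggregate_metadata_csv(metadata_rows):
    """Combine per-transcription metadata dicts into one CSV string."""
    # Collect every column name, preserving first-seen order.
    fieldnames = []
    for row in metadata_rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(metadata_rows)
    return buffer.getvalue()


def build_export_zip(transcriptions):
    """Return zip bytes holding each transcription's files plus the aggregate CSV.

    `transcriptions` maps a transcription id to a dict with two keys
    (a hypothetical shape): "files" (filename -> bytes) and "metadata"
    (a flat dict of metadata fields).
    """
    archive = io.BytesIO()
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        # One folder per transcription, holding its individual data files.
        for transcription_id, entry in transcriptions.items():
            for filename, content in entry["files"].items():
                zf.writestr(f"transcription_{transcription_id}/{filename}", content)
        # A single aggregate metadata file at the root of the archive.
        rows = [entry["metadata"] for entry in transcriptions.values()]
        zf.writestr("transcriptions_metadata.csv", build_aggregate_metadata_csv(rows))
    return archive.getvalue()
```

Building the archive in memory keeps the sketch independent of where the bytes are eventually sent; a real implementation might stream to a temporary file instead if exports are large.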

@nciemniak nciemniak requested review from zwolf and camallen January 22, 2020 18:06
@nciemniak nciemniak requested a review from camallen February 21, 2020 20:01
@nciemniak nciemniak requested review from zwolf and camallen March 3, 2020 20:29
Member

@zwolf zwolf left a comment

The very last thing I'll say on this is that I think it would be better form to mock the file system interactions rather than actually building them. This is potentially premature optimization, but I want to flag it and maybe create an issue to explore alternatives to disk I/O for tests.

Not blocking, nice work. :shipit:
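
For reference, the suggestion above about mocking file system interactions might look something like this. It is a minimal sketch in Python using the standard library's `unittest.mock`, not this project's test suite; `write_consensus_text` is a hypothetical helper standing in for whichever generator writes files to disk.

```python
from unittest import mock


def write_consensus_text(path, lines):
    """Hypothetical helper: write consensus text lines to a file on disk."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(lines))


def test_write_consensus_text_without_touching_disk():
    # Replace the built-in open() so no real file is created.
    fake_open = mock.mock_open()
    with mock.patch("builtins.open", fake_open):
        write_consensus_text("consensus_text.txt", ["first line", "second line"])

    # Check the interaction instead of reading a file back from disk.
    fake_open.assert_called_once_with("consensus_text.txt", "w", encoding="utf-8")
    fake_open().write.assert_called_once_with("first line\nsecond line")
```

The trade-off is the usual one: the test runs without disk I/O, but it asserts on how the file API was called rather than on what actually ends up in a file.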

@nciemniak nciemniak merged commit 2b991e4 into master Mar 4, 2020
@nciemniak nciemniak deleted the data-exports branch March 4, 2020 18:01

3 participants