A command-line tool for analyzing and visualizing BioImage Archive (BIA) study statistics.
Clone this repository and install dependencies:
git clone https://github.com/bioimage-archive/bia-study-stats.git
cd bia-study-stats
pip install -e .
Each command takes the path to a stats JSON file as its first argument:
- print_accessions: Display a table of accession IDs and their sizes.
  bia-study-stats print_accessions stats.json
- summarize: Show summary statistics, including total accessions and storage usage.
  bia-study-stats summarize stats.json
- merge_df_sizes: Merge size information from a df command output file.
  bia-study-stats merge_df_sizes stats.json df_output.txt
- merge_s3_cache: Update sizes using an S3 cache file.
  bia-study-stats merge_s3_cache stats.json s3_cache.json
- update_from_fire: Fetch sizes directly from S3/FIRE storage for studies with zero size.
  bia-study-stats update_from_fire stats.json --failed-log errors.log
- data_added_after: Calculate the total data volume added after a specific date.
  bia-study-stats data_added_after stats.json 2023-01-01
- plot_cumulative_size: Generate a bar chart showing cumulative data size by quarter.
  bia-study-stats plot_cumulative_size stats.json
- plot_cumulative_entries: Create a bar chart of cumulative study count by quarter.
  bia-study-stats plot_cumulative_entries stats.json
- print_ebi_stats: Output monthly cumulative size statistics in EBI format.
  bia-study-stats print_ebi_stats stats.json
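To illustrate what data_added_after and the quarterly cumulative charts compute, here is a minimal Python sketch. The record layout (accession, size_bytes, release_date), the example values, and the function names are assumptions made for illustration. The actual stats.json schema and the tool's implementation are not documented here and may differ, for example in whether the cutoff date itself is included.

```python
from collections import defaultdict
from datetime import date

# Hypothetical records: field names and values are invented for this sketch
# and are not taken from a real stats.json file.
RECORDS = [
    {"accession": "EXAMPLE-1", "size_bytes": 1000, "release_date": "2022-11-03"},
    {"accession": "EXAMPLE-2", "size_bytes": 2500, "release_date": "2023-01-15"},
    {"accession": "EXAMPLE-3", "size_bytes": 500, "release_date": "2023-02-20"},
    {"accession": "EXAMPLE-4", "size_bytes": 4000, "release_date": "2023-07-01"},
]


def quarter_label(iso_date):
    """Map an ISO date string such as '2023-02-20' to a label such as '2023-Q1'."""
    d = date.fromisoformat(iso_date)
    return f"{d.year}-Q{(d.month - 1) // 3 + 1}"


def cumulative_by_quarter(records):
    """Return (quarter, running total of bytes) pairs in chronological order."""
    per_quarter = defaultdict(int)
    for record in records:
        per_quarter[quarter_label(record["release_date"])] += record["size_bytes"]

    running_total = 0
    result = []
    # Labels sort chronologically because the four-digit year comes first.
    for quarter in sorted(per_quarter):
        running_total += per_quarter[quarter]
        result.append((quarter, running_total))
    return result


def data_added_after(records, cutoff):
    """Sum the sizes of records dated strictly after the cutoff (an assumption)."""
    cutoff_date = date.fromisoformat(cutoff)
    return sum(
        record["size_bytes"]
        for record in records
        if date.fromisoformat(record["release_date"]) > cutoff_date
    )


if __name__ == "__main__":
    for quarter, total in cumulative_by_quarter(RECORDS):
        print(quarter, total)
    print(data_added_after(RECORDS, "2023-01-01"))
```

Quarters with no new records are simply absent from the result. A chart that needs one bar per quarter would have to carry the previous total forward for those gaps.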
For commands that interact with S3/FIRE storage, create a .env file with:
S3_BUCKET=your-bucket-name
S3_ENDPOINT=https://your-endpoint.com # Optional
AWS_PROFILE=your-profile # Optional
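The following standard-library sketch shows one way such a file could be read and validated. It is not the tool's actual loader. The function names are hypothetical, and treating S3_BUCKET as required is an assumption based on it being the only variable not marked optional above.

```python
SAMPLE_ENV = """\
S3_BUCKET=your-bucket-name
S3_ENDPOINT=https://your-endpoint.com # Optional
AWS_PROFILE=your-profile # Optional
"""


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blanks and comments (hypothetical helper)."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing inline comment such as " # Optional".
        value = value.split(" #", 1)[0].strip()
        values[key.strip()] = value
    return values


def storage_settings(env):
    """Build a settings dict; optional keys become None when absent or empty."""
    if not env.get("S3_BUCKET"):
        # Assumption: the bucket name is the only mandatory setting.
        raise ValueError("S3_BUCKET must be set for S3/FIRE commands")
    return {
        "bucket": env["S3_BUCKET"],
        "endpoint": env.get("S3_ENDPOINT") or None,
        "profile": env.get("AWS_PROFILE") or None,
    }


if __name__ == "__main__":
    print(storage_settings(parse_env(SAMPLE_ENV)))
```

In practice a library such as python-dotenv is a common choice for loading .env files. The hand-written parser here only keeps the example self-contained.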
Output files:
- quarterly_cumulative_size.png: Generated by plot_cumulative_size
- quarterly_cumulative_entries.png: Generated by plot_cumulative_entries