Commit
* Updated synthetic image mirror generation script, created helper function for generating images in SyntheticImageGenerator class, moved notebooks to new notebooks dir.
* Restored ensure_save_path func back to annotation_utils.py
* Added latency tracking, max images generated field for synthetic image mirror pipeline.
* Added mean synthetic image gen latency print statement
* Update example arg inputs
* Fix imports
* Fixed and reformatted args
* Suppressed TensorFlow warnings, fixed image gen from annotation
* Index and loop bugfixes
* Index, looping, args logic fixes.
* Add load diffuser function call
* Clear gpu after using synthetic image generator
* Always load from Hugging Face.
* Batch processing for memory optimization. Added optional name field to generate_image in SyntheticImageGenerator for future customization.
* Memory optimizations (saving annotation .jsons to disk), added args for chunking, pm2 examples
* Fix save as json on disk, ensure no hanging reference when gpu is cleared in SyntheticImageGenerator
* Replaced generic DiffusionPipeline with StableDiffusionPipeline that inherits from it. Specified generated image dimensions in diffuser call params.
* convert diffuser to float32 before moving onto cpu, fixed duplicate image count logs
* Added a testing function to save images from real image dataset, changed annotations 'index' field to 'id' for consistency, various data loading and parameter fixes
* Added pipeline for diffusion models to constants.py, and dynamic pipeline loading and image size customization to generation.
* Fixed Hugging Face authentication errors. Added instruction to authenticate with huggingface-cli login
* Fixed all annotations being used to generate mirrors regardless of start and end indices
* Added a new load_and_sort_dataset function to handle Hugging Face dataset rows being ordered by filename string-wise instead of numerically. Added generate_synthetic_images arg and updated dataset naming conventions for parallelization-friendliness. Disabled diffusion pipeline progress bars. Added const for progress updates in terminal.
* Removed extra disable progress bar call. Added ceil import for progress calculation.
* Adjust Hugging Face annotations dataset name
* Reverted annotations dataset name to have data range, now requiring start_index and end_index args.
* Re-removed data range from annotations
* Update 'index' to 'id'
* Fixed loading annotations from Hugging Face and saving specified indices to disk.
* Utils refactored, smaller functions. Added resize arg. Added combine_datasets script to put together all generated splits into one Hugging Face dataset.
* Replace hardcoded name
* Fix f-string
* Fix args
* Fixed typos
* Updated combine_datasets.py to match Hugging Face dataset nomenclature.
* removing unused files
* initial validator forward pytest
* initial ci.yml
* new mock classes for ci workflow
* temporarily removing old version of generate_synthetic_data.py
* rename get_mock_image() -> create_random_image()
* adding test_mock.py
* renaming build -> test step in ci.yml
* test_rewards.py
* parameterizing fake_prob to allow intentionally testing real/synth image flows in vali fwd
* forcing vali fwd through real and synth image flows
* fake_prob -> _fake_prob
* using dot operator to read config in mock vali until I replace namespace cfg with bt.config
* allowing mock code to skip force_register_neuron in the case that the neuron was already registered in previous test instance
* removing unused circleci dir from template repo
* image transforms tests
* fixing setting of mock dendrite process_time
* adding test_mock.py
* reset mock chain state in between test cases
* cleaning up state management for MockSubtensor
* __init__.py
* replacing hardcoded string with random image b64
* Fixed saving synthetic images after resizing.
* new auto update implementation from sn19
* initial self heal script from sn19
* Flag for downloading annotations from HuggingFace
* fixing reference to self.config
* Enforcing no watermarking in all cases
* self heal in autoupdate script
* making autoupdate scripts executable
* self heal restart 6 -> 6.5
* typo
* allowing --no-auto-update and --no-self-heal for validators
* combining run scripts into run_neuron.py
* replacing neuron type with --validator and --miner
* documentation updates for new run script
* docs update
* adding wandb to docs
* Arg for skipping annotation generation
* Prompt truncation for annotations longer than max token length
* Suppress token max exceeded warning, cleaned up error logging
* Removed all tqdm loading bars, cleaned imports, updated fake dataset paths to parquet versions.
* Improved annotation cleanliness with inter-prompt spacing and stripped endings.
* removing fixtures reference from mock.py
* read btcli args from .env
* docs update
* Formatting
* fixing fixtures import
* adding .env file creation to install script
* moving network (test/finney) into .env, reducing script redundancy
* missing netuid arg for MockSubtensor/MockMetagraph inits in test
* adding .env to .gitignore
* AXON_PORT -> MINER_AXON_PORT env var rename
* docs updates to reflect latest run_neuron.py updates
* updating .env paths
* small docs update
* Fixed annotation json filenames not starting with start_idx arg
* locking down version numbers
* Added docstrings and comments
* fixing image_index field for wandb logging
* try except for wandb init
* adding retries for nan images
* fixing image isnan check by adding np.any
* rename wandb fields *image_id -> *image_name
* Updated failure case for generating annotations.
* Adjusted TF logging level to include error messages. Cleaned up unnecessary imports. Simplified clear_gpu to not move the tensor to CPU.
* Reverted deletion of necessary diffusion pipeline imports. Adjusted TF logging level in dataset generation script to be consistent with synthetic generation classes.
* adding a sleep to reduce metagraph resync freq
* fixing edge case that occurs when only 1 miner has nonzero weight
* bump version to 1.0.2
* fixing download_data extension
* Update fake dataset paths
* replacing conda activate with /home/user/mambaforge/envs/tensorml

---------

Co-authored-by: Benjamin <[email protected]>
Co-authored-by: aliang322 <[email protected]>
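
The load_and_sort_dataset entry above refers to rows ordered by filename string-wise instead of numerically. The snippet below is a minimal sketch of that ordering problem on a plain list of filenames. It is not the function from this commit: the helper name numeric_key and the sample filenames are hypothetical, and it assumes each filename contains a numeric id.

```python
import re


def numeric_key(filename: str) -> int:
    """Hypothetical helper: return the first run of digits in a filename as an int."""
    match = re.search(r"\d+", filename)
    if match is None:
        raise ValueError(f"no number found in {filename!r}")
    return int(match.group())


# Hypothetical filenames, standing in for dataset rows keyed by filename.
filenames = ["10.json", "2.json", "1.json", "100.json"]

# Plain string sort compares character by character, so "10" sorts before "2".
string_sorted = sorted(filenames)

# Sorting on the parsed integer restores the numeric order.
numeric_sorted = sorted(filenames, key=numeric_key)

print(string_sorted)   # ['1.json', '10.json', '100.json', '2.json']
print(numeric_sorted)  # ['1.json', '2.json', '10.json', '100.json']
```

Sorting on a parsed integer key, rather than zero-padding the filenames, leaves the stored names untouched; which approach the commit itself takes is not stated in the message.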