Fixing ensemble #508
Merged
peteryang1 added a commit that referenced this pull request on Jan 17, 2025
* refine ds modal for more cases: eval and es * update model template * prompts for model and ensemble * fix a bug * fix a bug * init: ds workflow evovingstrategy * Adding ensemble (#505) * Initial Draft * Updating logic for init * Revising * Successful Testing * Updating to use the latest & right class * bug: bug-fixing for testing * data science loop changes * data science loop base * ds loop feedback * fix * remove measure_time because it's duplicated (in LoopBase) * add the knowledge query for data_loader & feature * edit ds workflow evaluator * data_loader bug fix * stop evolving when all tasks completed * llm app change * fix break all complete strategy * Adding queried knowledge (#508) Co-authored-by: XianBW <[email protected]> * fix loop bug * ds workflow evaluator; test; refine prompts * workflow spec * fix ci * feature task changes * ds loop change * fix a bug in feat * add query knowledge for model and workflow * llm_debug info(for show) using pickle instead of json * remove NextLoopException * loop change * coder raise CoderError when all sub_tasks failed * rename code_dict to file_dict in FBWorkspace * add CoSTEER unittest * now show self.version in Task.get_task_information(), simplify CoSTEER sub tasks definition * remove some properties in ModelTask, add model_type in it. 
* fix llm app bug * llm web app bug fix * ds loop bug fix * fix: give component code to feature&ens eval * loop catch error bug * rename load_from_raw_data to load_data * feat: Add debug data creation functionality for data science scenarios * support local folder (#511) * support local folder * remove unnecessary random * KaggleScen Subclass * small fix * use template for style description * update default scen to kaggle * update sample data script * make sure frac < 1 * fix a bug * feature spec changes * fix * changeimport order * clear unnecessary std outputs * fix a typo * create sample folder after unzip kaggle data * feature/model test script update * Align the data types across modules. * fix a bug in model eval * show line number * move sample entry point to app * spec & model prompt changes * Refine the competition specification to address the data type problem and the coherence issue. * fix some bugs * add file filter in FBworkspace.code property * support non-binary prediction * avoid too much warnings * fix a bug in ensemble module * filtered the knowledge query in all modules * delete RAG in idea proposal * refine the code in ensemble * show exp workspace in llm_st * exp_gen bug fix * feedback bug fix * use `feature` instead of `feat01` * Trace & method of judging if exp is completed change * fix a bug in package calling and execute ci * fix code * bug fix * bug fix * fix a bug * fix some bugs * fix a bug * refactor: Enhance error handling and feedback in data science loop * support different use_azure on chat and embedding models * multi-model proposal logic * fix a small syntax error * loopBase and some changes * ensemble scores change * fbworkspace.code -> .all_codes * use all model codes in workflow coder * check scores.csv's keys(model_names) * model name changes * add a todo in ensemble test * sota_exp changes * give model info in exp gen * add runner time limit * config using debug data or not in evals * exp to feedback base * add feature code 
when writing model task * small problem * copying during sampling * update * refactor: Simplify code handling and improve workspace management * model part output fix * print model's execution time * bug fix * ensemble test fix * ens small change * ens_test bug fix * Refine partial expansion logic to display only a few subfolders when their structure is uniform, improving readability in nested directories. * several update on prompts * sample subfolders * Filter the stdout after code execution to remove irrelevant information e.g. progress bars, whitespace characters, excessive line breaks. * Add some more prompts and comments * several update on the first init rounds * model timeout as error * fix pattern of getting model codes in workspace * small bux fix on model prompts * remove get_code_with_key since we have regex pattern * fix: Correct tqdm progress bar update logic in LoopBase class * feat: Add diff generation and enhance feedback mechanism in data science loop * update some fix to model and workflow prompts * refine the logic of progress bar filter * add last_successful_exp in exp_gen * fix a one line bug * add a hint in prompt * fix data sample for bms * fix data sample for bms * hypothesis small fix * crawler readme update * fix component gen * fix bug * annotation change * load description.md if it exists * refactor: Simplify SOTA description handling in feedback and prompts * refactor: Use shared templates for feedback and experiment descriptions * change webapp for model codes changes * update proposal * add timeout message for docker run output * fix * refine the code in docker time processing * use .shape instead of len() when do shape eval * won't change size during iteration * support bson sample * sample support jsonl and bson * add former_code to coder prompts * a little speed us in debug data creating * filter progress bar when eval ens and main * avoid costeer makes no change to former code * fix several log error * add timeout judge threshold 
* fix some bugs in the evaluation of component output shapes * File structure for supporting litellm (#517) Co-authored-by: Young <[email protected]> * ignore submission and show processing * ignore submission and show processing * add efficiency notice * refactor: Enhance error message with detailed feedback summary * refactor: Simplify component handling in DSExpGen class * refactor: Update code structure and add docstring for clarity * reserve one sample to each label in data sampling * add Evaluation info * refine costeer code to avoid giving same code twice * use raw_description as plain text * add a prompt hint to avoid same dict key * model task name bug in first model exp gen * fix a typo * add some debug info in costeer tests * task init change * enhance data sampling * refine the code in data_loader * more reasonable loop * fix a bug in data folder description * add error msg & traceback to execution feedback * fix llm error msg detection * add task information to costeer eval & add cache to docker run(use zipfile to store the whole workspace) * fix CI first round * fix CI second round * use txt to store test script to avoid pytest * remove zipfile in requirements * add azure.identity to requirements * ignore debug web page * component test changes * remove redundent task_desc in model coder * feat: Add APE module and prompts for automated prompt engineering * fix: Update .gitignore and improve text formatting in eval.py * refactor: Update print output and improve code comments and imports * style: Fix string formatting and import order in ape.py and fmt.py * exclude ape * add a data folder notice * reduce unnecessary output to stdout * refine the code of describe_data_folder * fix ci * style: streamlit style update (#522) * streamlit style update * fix import * fix format * fix llm_st loop progress bar * debugapp small change * fix model str * refine some prompts * fix model str * fix CI * refine the logic associated with the data_folder * fix ci * small 
change * set filter_progress_bar as default in execute * model proposal with workflow * add submission check in workflow eval * fix bug * small change * fix CI * fix CI * refactor: Move generate_diff to utils and update DSExpGen logic * more reasonable prompt describing metric direction * fix a minor jinja2 bug * quick fix exp_gen bugs * fix the following bug * fix * fix some bugs * remove workflow from model * add pending_tasks_list in data science to enable coding model and workflow * refine the code for handling JSON-formatted data descriptions * assert with information * ensure correct csv file name * add logging to help record the output * log competition * add log tag for debug llm app * test: Test ds refactor ll (#523) * fix bugs to former scenario * fix a bug because coding in rdloop changed * fix the bug when feedback gets no hypothesis * fix trace structure * change all trace hist when merging hypothesis to experiments * ignore some error in ruff * fix kaggle scenario bugs * refine one line * another bug * another small bug * fix ui bugs * chage kaggle train.py path --------- Co-authored-by: Xu Yang <[email protected]> * fix CI * Update rdagent/app/data_science/loop.py Co-authored-by: Copilot <[email protected]> * add samplecsv into spec prompts * fix CI --------- Co-authored-by: TPLin22 <[email protected]> Co-authored-by: yuanteli <[email protected]> Co-authored-by: Xisen Wang <[email protected]> Co-authored-by: Bowen Xian <[email protected]> Co-authored-by: Xu Yang <[email protected]> Co-authored-by: XianBW <[email protected]> Co-authored-by: Tim <[email protected]> Co-authored-by: 炼金术师华华 <[email protected]> Co-authored-by: Linlang <[email protected]> Co-authored-by: Copilot <[email protected]>
Description
Motivation and Context
How Has This Been Tested?
Screenshots of Test Results (if appropriate):
Types of changes
📚 Documentation preview 📚: https://RDAgent--508.org.readthedocs.build/en/508/