Fix some minor bugs caused by version changes. #128

Merged 63 commits on Jul 29, 2024
63 commits
49541c6
Init todo
you-n-g Jul 17, 2024
f3b097e
update all code
WinstonLiyt Jul 18, 2024
61dc8ce
update
WinstonLiyt Jul 18, 2024
6481278
Extract factors from financial reports loop finished
WinstonLiyt Jul 19, 2024
16e3b3a
Merge branch 'main' of https://github.com/microsoft/RD-Agent into fix…
WinstonLiyt Jul 19, 2024
f2d031e
Merge branch 'main' of https://github.com/microsoft/RD-Agent into fix…
WinstonLiyt Jul 19, 2024
ce30c04
Fix two small bugs.
WinstonLiyt Jul 19, 2024
aa2ffac
Delete rdagent/app/qlib_rd_loop/run_script.sh
WinstonLiyt Jul 19, 2024
cecb4c5
Minor mod
you-n-g Jul 19, 2024
61d352f
Delete rdagent/app/qlib_rd_loop/nohup.out
you-n-g Jul 19, 2024
367a1ce
Fix a small bug in file reading.
WinstonLiyt Jul 22, 2024
7887905
some updates
WinstonLiyt Jul 22, 2024
d5f36d9
Update the detailed process and prompt of factor loop.
WinstonLiyt Jul 22, 2024
b4594ef
Merge branch 'main' into fix_some_errors_when_debug_factor
WinstonLiyt Jul 22, 2024
aa4c7e5
Evaluation & dataset
taozhiwang Jul 23, 2024
6d022b8
Optimize the prompt for generating hypotheses and feedback in the fac…
WinstonLiyt Jul 23, 2024
c51a6f0
Generate new data
taozhiwang Jul 23, 2024
90bd7e3
dataset generation
taozhiwang Jul 24, 2024
4fd9733
Performed further optimizations on the factor loop and report extract…
WinstonLiyt Jul 24, 2024
1da2635
Merge branch 'main' into fix_some_errors_when_debug_factor
WinstonLiyt Jul 24, 2024
1d66f16
Update rdagent/components/coder/factor_coder/CoSTEER/evaluators.py
you-n-g Jul 24, 2024
b1bdfdd
Update package.txt for fitz.
WinstonLiyt Jul 24, 2024
50a8ff0
Merge branch 'fix_some_errors_when_debug_factor' of https://github.co…
WinstonLiyt Jul 24, 2024
864f5a0
add the result
taozhiwang Jul 24, 2024
048c6fe
Performed further optimizations on the factor loop and report extract…
WinstonLiyt Jul 24, 2024
f9b57b9
Analysis
taozhiwang Jul 24, 2024
b9d9194
Optimized log output.
WinstonLiyt Jul 24, 2024
9218e5f
Merge branch 'fix_some_errors_when_debug_factor' of https://github.co…
WinstonLiyt Jul 24, 2024
ec5cc64
Merge branch 'fix_some_errors_when_debug_factor' into main
WinstonLiyt Jul 24, 2024
db82b67
Factor update
taozhiwang Jul 24, 2024
dcb7e07
Optimized log output.
WinstonLiyt Jul 24, 2024
265b6b3
A draft of the "Quick Start" section for README
WinstonLiyt Jul 24, 2024
39282eb
Merge branch 'main' of https://github.com/microsoft/RD-Agent into doc…
WinstonLiyt Jul 24, 2024
68f0a75
Add scenario descriptions.
WinstonLiyt Jul 24, 2024
52dc938
Updates
taozhiwang Jul 25, 2024
11980dc
Adjust content
you-n-g Jul 25, 2024
12c0eba
Merge branch 'main' of https://github.com/microsoft/RD-Agent into main
WinstonLiyt Jul 25, 2024
c9809f2
Merge branch 'main' into docs_and_demo
WinstonLiyt Jul 25, 2024
98906af
Enable logging of backtesting in Qlib and store rich-text description…
WinstonLiyt Jul 25, 2024
b97f24f
Merge branch 'main' of https://github.com/microsoft/RD-Agent into main
WinstonLiyt Jul 25, 2024
b7a04c2
Merge branch 'main' into docs_and_demo
WinstonLiyt Jul 25, 2024
702c830
Reformat analysis.py
taozhiwang Jul 25, 2024
ac80c93
CI fix
taozhiwang Jul 25, 2024
eb1c04e
Refactor
you-n-g Jul 25, 2024
f9295e0
remove useless code
you-n-g Jul 25, 2024
cab4f46
Merge branch 'benchmark'
taozhiwang Jul 25, 2024
d2770c6
fix bugs (#111)
SH-Src Jul 25, 2024
f4d553a
Merge branch 'main' into docs_and_demo
WinstonLiyt Jul 25, 2024
22b176b
Fix two small bugs.
WinstonLiyt Jul 25, 2024
26f2f74
Merge branch 'main' of https://github.com/microsoft/RD-Agent into main
WinstonLiyt Jul 25, 2024
f44e4ae
Merge branch 'main' into docs_and_demo
WinstonLiyt Jul 25, 2024
fb1478e
Fix a merge bug.
WinstonLiyt Jul 25, 2024
09e2d88
Fix two small bugs.
WinstonLiyt Jul 26, 2024
33b70e2
Merge branch 'main' of https://github.com/microsoft/RD-Agent into main
WinstonLiyt Jul 26, 2024
cf568a5
Merge branch 'main' into docs_and_demo
WinstonLiyt Jul 26, 2024
05869ce
Merge branch 'main' of https://github.com/microsoft/RD-Agent into main
WinstonLiyt Jul 26, 2024
9c64f14
Merge branch 'main' into docs_and_demo
WinstonLiyt Jul 26, 2024
787450c
fix some bugs.
WinstonLiyt Jul 29, 2024
b36e1cf
Fix some format bugs.
WinstonLiyt Jul 29, 2024
3e42a7b
Restore a file.
WinstonLiyt Jul 29, 2024
87dba2d
Merge branch 'main' of https://github.com/microsoft/RD-Agent into main
WinstonLiyt Jul 29, 2024
fb28226
Merge branch 'main' into docs_and_demo
WinstonLiyt Jul 29, 2024
ad7d18d
Fix a format bug.
WinstonLiyt Jul 29, 2024
29 changes: 16 additions & 13 deletions rdagent/app/qlib_rd_loop/factor_from_report_sh.py
@@ -7,7 +7,7 @@
 from dotenv import load_dotenv
 from jinja2 import Environment, StrictUndefined

-from rdagent.app.qlib_rd_loop.conf import PROP_SETTING
+from rdagent.app.qlib_rd_loop.conf import FACTOR_PROP_SETTING
 from rdagent.components.document_reader.document_reader import (
     extract_first_page_screenshot_from_pdf,
     load_and_process_pdfs_by_langchain,
@@ -37,33 +37,33 @@

 assert load_dotenv()

-scen: Scenario = import_class(PROP_SETTING.factor_scen)()
+scen: Scenario = import_class(FACTOR_PROP_SETTING.scen)()

-hypothesis_gen: HypothesisGen = import_class(PROP_SETTING.factor_hypothesis_gen)(scen)
+hypothesis_gen: HypothesisGen = import_class(FACTOR_PROP_SETTING.hypothesis_gen)(scen)

-hypothesis2experiment: Hypothesis2Experiment = import_class(PROP_SETTING.factor_hypothesis2experiment)()
+hypothesis2experiment: Hypothesis2Experiment = import_class(FACTOR_PROP_SETTING.hypothesis2experiment)()

-qlib_factor_coder: Developer = import_class(PROP_SETTING.factor_coder)(scen)
+qlib_factor_coder: Developer = import_class(FACTOR_PROP_SETTING.coder)(scen)

-qlib_factor_runner: Developer = import_class(PROP_SETTING.factor_runner)(scen)
+qlib_factor_runner: Developer = import_class(FACTOR_PROP_SETTING.runner)(scen)

-qlib_factor_summarizer: HypothesisExperiment2Feedback = import_class(PROP_SETTING.factor_summarizer)(scen)
+qlib_factor_summarizer: HypothesisExperiment2Feedback = import_class(FACTOR_PROP_SETTING.summarizer)(scen)

-with open(PROP_SETTING.report_result_json_file_path, "r") as f:
+with open(FACTOR_PROP_SETTING.report_result_json_file_path, "r") as f:
     judge_pdf_data = json.load(f)

 prompts_path = Path(__file__).parent / "prompts.yaml"
 prompts = Prompts(file_path=prompts_path)


 def save_progress(trace, current_index):
-    with open(PROP_SETTING.progress_file_path, "wb") as f:
+    with open(FACTOR_PROP_SETTING.progress_file_path, "wb") as f:
         pickle.dump((trace, current_index), f)


 def load_progress():
-    if Path(PROP_SETTING.progress_file_path).exists():
-        with open(PROP_SETTING.progress_file_path, "rb") as f:
+    if Path(FACTOR_PROP_SETTING.progress_file_path).exists():
+        with open(FACTOR_PROP_SETTING.progress_file_path, "rb") as f:
             return pickle.load(f)
     return Trace(scen=scen), 0
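
Note on the hunk above: `save_progress` and `load_progress` checkpoint a `(trace, current_index)` tuple with `pickle` so that an interrupted run can resume. A minimal, self-contained sketch of that pattern is shown here. `PROGRESS_FILE` and the plain list used as the trace are assumptions made for illustration; the real code reads the path from `FACTOR_PROP_SETTING.progress_file_path` and falls back to `Trace(scen=scen)`.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical checkpoint location; stands in for FACTOR_PROP_SETTING.progress_file_path.
PROGRESS_FILE = Path(tempfile.gettempdir()) / "rd_loop_progress_sketch.pkl"


def save_progress(trace, current_index, path=PROGRESS_FILE):
    """Persist the loop state so an interrupted run can resume."""
    with open(path, "wb") as f:
        pickle.dump((trace, current_index), f)


def load_progress(path=PROGRESS_FILE):
    """Return the saved (trace, index), or a fresh state if no checkpoint exists."""
    if Path(path).exists():
        with open(path, "rb") as f:
            return pickle.load(f)
    return [], 0  # a plain list stands in for Trace(scen=scen)


if __name__ == "__main__":
    PROGRESS_FILE.unlink(missing_ok=True)  # start from a clean state
    print(load_progress())                 # ([], 0)
    save_progress(["report_a.pdf"], 1)
    print(load_progress())                 # (['report_a.pdf'], 1)
```

Because `pickle.load` can execute arbitrary code while deserializing, a checkpoint like this should only be read from a trusted location.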

@@ -87,8 +87,9 @@ def generate_hypothesis(factor_result: dict, report_content: str) -> str:
     response_json = json.loads(response)
     hypothesis_text = response_json.get("hypothesis", "No hypothesis generated.")
     reason_text = response_json.get("reason", "No reason provided.")
+    concise_reason_text = response_json.get("concise_reason", "No concise reason provided.")

-    return Hypothesis(hypothesis=hypothesis_text, reason=reason_text)
+    return Hypothesis(hypothesis=hypothesis_text, reason=reason_text, concise_reason=concise_reason_text)


 def extract_factors_and_implement(report_file_path: str) -> tuple:
@@ -131,7 +132,9 @@ def extract_factors_and_implement(report_file_path: str) -> tuple:
             break
         file_path, attributes = judge_pdf_data_items[index]
         if attributes["class"] == 1:
-            report_file_path = Path(file_path.replace(PROP_SETTING.origin_report_path, PROP_SETTING.local_report_path))
+            report_file_path = Path(
+                file_path.replace(FACTOR_PROP_SETTING.origin_report_path, FACTOR_PROP_SETTING.local_report_path)
+            )
             if report_file_path.exists():
                 logger.info(f"Processing {report_file_path}")
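
Note on this file's changes: every component is now looked up on a factor-specific settings object with unprefixed fields (`FACTOR_PROP_SETTING.scen` instead of `PROP_SETTING.factor_scen`) and then resolved with `import_class`. The settings module itself is not part of this diff, so the sketch below is only an illustration under stated assumptions: `FactorPropSettingSketch` is a hypothetical stand-in, its defaults are stdlib classes chosen so the example runs, and this `import_class` is a simplified re-implementation rather than RD-Agent's own helper.

```python
import importlib
from dataclasses import dataclass


@dataclass(frozen=True)
class FactorPropSettingSketch:
    """Hypothetical stand-in for a factor-specific settings object.

    Field names mirror attributes used in the diff (scen, coder); the
    default values are stdlib classes chosen only so the sketch runs.
    """

    scen: str = "collections.OrderedDict"
    coder: str = "collections.Counter"


def import_class(class_path: str) -> type:
    """Resolve a dotted 'package.module.ClassName' string to the class object."""
    module_path, class_name = class_path.rsplit(".", 1)
    module = importlib.import_module(module_path)
    return getattr(module, class_name)


settings = FactorPropSettingSketch()
scen = import_class(settings.scen)()      # instantiate the configured class
coder_cls = import_class(settings.coder)  # or keep the class for later use
print(type(scen).__name__, coder_cls.__name__)  # OrderedDict Counter
```

Keeping one settings object per scenario lets the field names stay short and identical across scenarios, which appears to be the intent of the rename.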

8 changes: 4 additions & 4 deletions rdagent/app/qlib_rd_loop/factor_w_sc.py
100644 → 100755
@@ -15,13 +15,13 @@
 class FactorRDLoop(RDLoop):
     skip_loop_error = (FactorEmptyError,)

-    def exp_gen(self, prev_out: dict[str, Any]):
-        with logger.tag("r"):  # research
-            exp = self.hypothesis2experiment.convert(prev_out["propose"], self.trace)
+    def running(self, prev_out: dict[str, Any]):
+        with logger.tag("ef"):  # evaluate and feedback
+            exp = self.runner.develop(prev_out["coding"])
             if exp is None:
                 logger.error(f"Factor extraction failed.")
                 raise FactorEmptyError("Factor extraction failed.")
-            logger.log_object(exp.sub_tasks, tag="experiment generation")
+            logger.log_object(exp, tag="runner result")
         return exp
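
Note on the hunk above: the overridden step is now `running`, and it raises `FactorEmptyError` when the runner returns nothing. The base `RDLoop` is not shown in this diff, so the sketch below assumes that `skip_loop_error` lists exceptions the loop catches in order to skip the current iteration. `LoopSketch` and `run_once` are hypothetical names, and the runner is reduced to a plain callable.

```python
class FactorEmptyError(Exception):
    """Raised when a step yields no experiment (name taken from the diff)."""


class LoopSketch:
    """Hypothetical, simplified loop; not RD-Agent's RDLoop."""

    # Assumed meaning: errors that skip the iteration instead of aborting the run.
    skip_loop_error = (FactorEmptyError,)

    def __init__(self, runner):
        self.runner = runner  # any callable taking the coding output

    def running(self, prev_out):
        exp = self.runner(prev_out["coding"])
        if exp is None:
            raise FactorEmptyError("Factor extraction failed.")
        return exp

    def run_once(self, prev_out):
        """Run one iteration; return None if a skippable error occurred."""
        try:
            return self.running(prev_out)
        except self.skip_loop_error as err:
            print(f"skipping iteration: {err}")
            return None
```

With this shape, a runner that returns `None` makes `run_once` return `None` and move on, while any exception not listed in `skip_loop_error` still propagates.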


3 changes: 2 additions & 1 deletion rdagent/app/qlib_rd_loop/prompts.yaml
@@ -4,7 +4,8 @@ hypothesis_generation:
     Please ensure your response is in JSON format as shown below:
     {
       "hypothesis": "A clear and concise hypothesis based on the provided information.",
-      "reason": "A detailed explanation supporting the generated hypothesis."
+      "reason": "A detailed explanation supporting the generated hypothesis.",
+      "concise_reason": One line summary that focuses on the justification for the change that leads to the hypothesis (like a part of a knowledge that we are building)
     }

   user: |-
4 changes: 2 additions & 2 deletions rdagent/components/benchmark/eval_method.py
@@ -19,14 +19,14 @@
 from rdagent.components.coder.factor_coder.factor import FactorFBWorkspace
 from rdagent.core.conf import RD_AGENT_SETTINGS
 from rdagent.core.developer import Developer
-from rdagent.core.exception import CoderException, RunnerException
+from rdagent.core.exception import CoderError
 from rdagent.core.experiment import Task, Workspace
 from rdagent.core.scenario import Scenario
 from rdagent.core.utils import multiprocessing_wrapper

 EVAL_RES = Dict[
     str,
-    List[Tuple[FactorEvaluator, Union[object, RunnerException]]],
+    List[Tuple[FactorEvaluator, Union[object, CoderError]]],
 ]

6 changes: 5 additions & 1 deletion rdagent/scenarios/data_mining/proposal/model_proposal.py
@@ -50,7 +50,11 @@ def prepare_context(self, trace: Trace) -> Tuple[dict, bool]:

     def convert_response(self, response: str) -> ModelHypothesis:
         response_dict = json.loads(response)
-        hypothesis = DMModelHypothesis(hypothesis=response_dict["hypothesis"], reason=response_dict["reason"], concise_reason=response_dict["concise_reason"])
+        hypothesis = DMModelHypothesis(
+            hypothesis=response_dict["hypothesis"],
+            reason=response_dict["reason"],
+            concise_reason=response_dict["concise_reason"],
+        )
         return hypothesis
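
Note on the hunk above: `convert_response` indexes `response_dict["concise_reason"]` directly, whereas `generate_hypothesis` in `factor_from_report_sh.py` uses `.get()` with a default string. The two behave differently when the model omits a key, as the sketch below illustrates. `HypothesisSketch`, `convert_strict`, and `convert_lenient` are hypothetical names introduced for this example; only the key names and default strings come from the diff.

```python
import json
from dataclasses import dataclass


@dataclass
class HypothesisSketch:
    """Hypothetical stand-in for the Hypothesis classes used in the diff."""

    hypothesis: str
    reason: str
    concise_reason: str


def convert_strict(response: str) -> HypothesisSketch:
    """Index keys directly: a missing key raises KeyError."""
    d = json.loads(response)
    return HypothesisSketch(d["hypothesis"], d["reason"], d["concise_reason"])


def convert_lenient(response: str) -> HypothesisSketch:
    """Use .get() with defaults: a missing key falls back to a placeholder."""
    d = json.loads(response)
    return HypothesisSketch(
        hypothesis=d.get("hypothesis", "No hypothesis generated."),
        reason=d.get("reason", "No reason provided."),
        concise_reason=d.get("concise_reason", "No concise reason provided."),
    )
```

The strict form fails fast when a response lacks `concise_reason`; the lenient form keeps the loop running but may carry placeholder text into later steps.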


15 changes: 0 additions & 15 deletions rdagent/scenarios/qlib/prompts.yaml
@@ -43,21 +43,6 @@ model_hypothesis_specification: |-

   6th Round Hypothesis (If fourth round didn't work): The model should be a CNN. The CNN should have 5 convolutional layers. Use Leaky ReLU activation for all layers. Use dropout regularization with a rate of 0.3. (Reasoning: As regularisation rate of 0.5 didn't work, we only change a new regularisation and keep the other elements that worked. This means making changes in the current level.)

-factor_hypothesis_specification: |-
-  Additional Specifications:
-  Hypotheses should grow and evolve based on the previous hypothesis. If there is no previous hypothesis, start with something simple. Gradually build up upon previous hypotheses and feedback.
-  Ensure that the hypothesis focuses on the creation and selection of factors in quantitative finance. Each hypothesis should address specific factor characteristics such as type (momentum, value, quality), calculation methods, or inclusion criteria. Avoid hypotheses related to model architecture or optimization processes.
-
-  Sample Hypotheses (Only learn from the format as these are not the knowledge):
-  - "Include a momentum factor based on the last 12 months' returns."
-  - "Add a value factor calculated as the book-to-market ratio."
-  - "Incorporate a quality factor derived from return on equity (ROE)."
-  - "Use a volatility factor based on the standard deviation of returns over the past 6 months."
-  - "Include a sentiment factor derived from news sentiment scores."
-  - "The momentum factor should be calculated using a 6-month look-back period."
-  - "Combine value and momentum factors using a weighted average approach."
-  - "Filter stocks by market capitalization before calculating the factors."
-
 factor_hypothesis_specification: |-
   Specifications:
   - Hypotheses should grow and evolve based on the previous hypothesis. If there is no previous hypothesis, start with something simple.