  logger.info(f"error symbol nums: {len(error_symbol)}")
  logger.info(f"current get symbol nums: {len(instrument_list)}")
@@ -365,7 +358,7 @@ def download_data(
  start=None,
  end=None,
  interval="1d",
- check_data_length=False,
+ check_data_length: int=None,
  limit_nums=None,
  ):
  """download data from Internet
@@ -382,8 +375,8 @@ def download_data(
  start datetime, default "2000-01-01"
  end: str
  end datetime, default ``pd.Timestamp(datetime.datetime.now() + pd.Timedelta(days=1))``
- check_data_length: bool
- check data length, by default False
+ check_data_length: int
+ check data length, if not None and greater than 0, each symbol will be considered complete if its data length is greater than or equal to this value, otherwise it will be fetched again, the maximum number of fetches being (max_collector_count). By default None.
- * *source_dir*: The directory where the raw data collected from the Internet is saved, default "Path(__file__).parent/source"
- * *normalize_dir*: Directory for normalize data, default "Path(__file__).parent/normalize"
- * *qlib_data_1d_dir*: the qlib data to be updated for yahoo, usually from: [download qlib data](https://github.com/microsoft/qlib/tree/main/scripts#download-cn-data)
- * *trading_date*: trading days to be updated, by default ``datetime.datetime.now().strftime("%Y-%m-%d")``
- * *end_date*: end datetime, default ``pd.Timestamp(trading_date + pd.Timedelta(days=1))``; open interval(excluding end)
- * *region*: region, value from ["CN", "US"], default "CN"
+ * `trading_date`: start of trading day
+ * `end_date`: end of trading day(not included)
+ * `check_data_length`: check the number of rows per *symbol*, by default `None`
+ > if `len(symbol_df) < check_data_length`, it will be re-fetched, with the number of re-fetches coming from the `max_collector_count` parameter
+ * `source_dir`: The directory where the raw data collected from the Internet is saved, default "Path(__file__).parent/source"
+ * `normalize_dir`: Directory for normalize data, default "Path(__file__).parent/normalize"
+ * `qlib_data_1d_dir`: the qlib data to be updated for yahoo, usually from: [download qlib data](https://github.com/microsoft/qlib/tree/main/scripts#download-cn-data)
+ * `trading_date`: trading days to be updated, by default ``datetime.datetime.now().strftime("%Y-%m-%d")``
+ * `end_date`: end datetime, default ``pd.Timestamp(trading_date + pd.Timedelta(days=1))``; open interval(excluding end)
+ * `region`: region, value from ["CN", "US"], default "CN"
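The re-fetch rule described for `check_data_length` can be illustrated with a small standalone sketch. This is not the collector's actual code: `is_complete`, `collect_with_refetch`, and the `fetch` callable are hypothetical names introduced here, and plain sequences stand in for the per-symbol DataFrame. It only assumes what the text states: the check is active when the value is not None and greater than 0, a symbol is complete when its length is greater than or equal to the threshold, and the number of fetch rounds is capped by `max_collector_count`.

```python
from typing import Callable, Dict, Iterable, List, Optional, Sequence, Tuple


def is_complete(rows: Sequence, check_data_length: Optional[int]) -> bool:
    """Return True when a symbol's data passes the length check."""
    # The check is disabled when the threshold is None or not positive.
    if check_data_length is None or check_data_length <= 0:
        return True
    return len(rows) >= check_data_length


def collect_with_refetch(
    symbols: Iterable[str],
    fetch: Callable[[str], Sequence],  # hypothetical downloader, one call per symbol
    check_data_length: Optional[int] = None,
    max_collector_count: int = 2,
) -> Tuple[Dict[str, Sequence], List[str]]:
    """Fetch every symbol, re-fetching short ones for up to max_collector_count rounds."""
    results: Dict[str, Sequence] = {}
    pending = list(symbols)
    for _ in range(max_collector_count):
        if not pending:
            break
        still_short = []
        for symbol in pending:
            rows = fetch(symbol)
            results[symbol] = rows  # keep the latest attempt
            if not is_complete(rows, check_data_length):
                still_short.append(symbol)
        pending = still_short
    # Symbols left in `pending` never reached the required length.
    return results, pending
```

The second return value lists the symbols that were still too short after the last round, which is the kind of information the `error symbol nums` log line reports.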
check data length, if not None and greater than 0, each symbol will be considered complete if its data length is greater than or equal to this value, otherwise it will be fetched again, the maximum number of fetches being (max_collector_count). By default None.
  trading days to be updated, by default ``datetime.datetime.now().strftime("%Y-%m-%d")``
  end_date: str
  end datetime, default ``pd.Timestamp(trading_date + pd.Timedelta(days=1))``; open interval(excluding end)
-
+ check_data_length: int
+ check data length, if not None and greater than 0, each symbol will be considered complete if its data length is greater than or equal to this value, otherwise it will be fetched again, the maximum number of fetches being (max_collector_count). By default None.
+ delay: float
+ time.sleep(delay), default 1
  Notes
  -----
  If the data in qlib_data_dir is incomplete, np.nan will be populated to trading_date for the previous trading day
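The `end_date` default and its open interval can be sketched as follows. This is an illustration only: it uses the standard-library `datetime` module instead of the `pd.Timestamp` / `pd.Timedelta` calls named in the docstring, and `default_end_date` and `in_update_window` are hypothetical helpers, not functions from the collector.

```python
import datetime
from typing import Optional

DATE_FORMAT = "%Y-%m-%d"


def default_end_date(trading_date: str) -> str:
    """Return the day after trading_date, mirroring the documented default."""
    day = datetime.datetime.strptime(trading_date, DATE_FORMAT)
    return (day + datetime.timedelta(days=1)).strftime(DATE_FORMAT)


def in_update_window(
    ts: datetime.datetime, trading_date: str, end_date: Optional[str] = None
) -> bool:
    """Check whether ts falls in [trading_date, end_date); the end is excluded."""
    start = datetime.datetime.strptime(trading_date, DATE_FORMAT)
    end = datetime.datetime.strptime(end_date or default_end_date(trading_date), DATE_FORMAT)
    return start <= ts < end
```

With the default, a timestamp at midnight of the following day is already outside the window, which is what "excluding end" means here.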